Introduction to Data Analysis with R for Social Science
UCSB Library, rm 1312
Mondays, Jan. 23, 30 & Feb. 6
10:00 am - 11:50 am
Instructors:
Jon Jablonski, Renata Curty, Seth Erickson
Helpers:
Kristi Liu, Amber Budden
Registration for this workshop begins on January 2, 2023 at 8:00 am PST
General Information
Software Carpentry
aims to help researchers get their work done
in less time and with less pain
by teaching them basic research computing skills.
This hands-on workshop will cover basic concepts and tools,
including program design, version control, data management,
and task automation.
Participants will be encouraged to help one another
and to apply what they have learned to their own research problems.
Who:
The course is aimed at students who want to use R for data analysis and
for creating charts and graphs, and who want a basic introduction to
scientific programming.
You don't need to have any previous programming experience.
If you are comfortable downloading and installing software, you have the skills
to complete this workshop.
Requirements:
Participants must bring a laptop with a
Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on.
They should have a few specific software packages installed (listed below).
Accessibility:
We are committed to making this workshop
accessible to everybody. For workshops at a physical location, the workshop organizers have checked that:
The room is wheelchair / scooter accessible.
Accessible and gender-neutral restrooms are available.
Materials will be provided in advance of the workshop and
large-print handouts are available if needed by notifying the
organizers in advance. If we can help make learning easier for
you (e.g. sign-language interpreters, lactation facilities), please
get in touch (using the contact details below) and we will
attempt to provide them.
Roles:
To learn more about the roles at the workshop (who will be doing what),
refer to our Workshop FAQ.
Code of Conduct
Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.
Surveys
Please be sure to complete these surveys before and after the workshop.
Schedule
How can I write software that other people can use?
11:15 Finish
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
Setup
To participate in a
Software Carpentry
workshop,
you will need access to software as described below.
In addition, you will need an up-to-date web browser.
For this workshop, you have the option of using RStudio through the Jupyter Hub Instance. If you would prefer to install the software on your own device, you
will need the software and data listed below:
Gapminder Dataset
R (version 4.2.1 or greater)
RStudio (version 2022.7 or greater)
Tidyverse Package
Scroll past the Jupyter Hub section to view the data download and R/RStudio installation instructions.
Logging into the RStudio Jupyter Hub Instance
For this workshop, we will be using a Jupyter Hub Instance that LSIT has graciously set up for us with the software and packages preinstalled.
Please use your UCSB NETID to sign into the Jupyter Hub at: https://carpentryworkshop.lsit.ucsb.edu/
Once you have signed in, click the RStudio Launcher button.
You do not need to follow the setup instructions below if you plan to use the Jupyter Hub Instance rather than RStudio on your own computer.
Data Download
Download the data from https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder_data.csv
Save this file as gapminder_data.csv in an easily accessible place, such as your Desktop.
To download it directly into your RStudio environment, you can use R's download.file() function with the URL above.
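The direct download could be sketched as follows. This is a minimal example under stated assumptions, not the workshop's official snippet: fetch_gapminder is a hypothetical helper name, and the destination file name simply follows the "Save this file as" step. Only base R functions (file.exists, download.file, read.csv) are used.

```r
# URL taken from the "Data Download" step above.
gapminder_url <- "https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder_data.csv"

# Hypothetical helper (not part of the workshop materials):
# download the CSV only if it is not already present, then read it.
fetch_gapminder <- function(url = gapminder_url,
                            dest_file = "gapminder_data.csv") {
  if (!file.exists(dest_file)) {
    download.file(url, destfile = dest_file)
  }
  # Return the contents as a data frame.
  read.csv(dest_file)
}

# Example call (saves into the current working directory):
# gapminder <- fetch_gapminder()
```

Running the commented-out call saves the file into your current working directory and loads it as a data frame; you can check that directory with getwd().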
R and RStudio are two separate pieces of software:
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis
RStudio is an integrated development environment (IDE) that makes using R easier. In this course we use RStudio to interact with R.
If you don’t already have R and RStudio installed, follow the instructions for your operating system below. You have to install R before you install RStudio.
Update R and RStudio
If you already have R and RStudio installed, first check if your R version is up to date:
When you open RStudio, your R version will be printed in the console on the bottom left. Alternatively, you can type sessionInfo() into the console. If your R version is 4.2.1 or later, you don’t need to update R for this lesson. If your version of R is older than that, download and install the latest version of R from the R project website for Windows, for macOS, or for Linux.
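If you prefer to let R do the comparison, the version check can be sketched in the console as below. The variable names are illustrative assumptions; getRversion() is a base R function, and the 4.2.1 threshold comes from the requirement stated above.

```r
# Minimum R version named in the setup instructions.
required_version <- "4.2.1"

# getRversion() returns the running R version in a form that can be
# compared directly against a version string.
current_version <- getRversion()
r_is_current <- current_version >= required_version

if (r_is_current) {
  message("R ", as.character(current_version),
          " is recent enough for this workshop.")
} else {
  message("R ", as.character(current_version), " is older than ",
          required_version, "; please update R.")
}
```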
It is not necessary to remove old versions of R from your system, but if you wish to do so you can check How do I uninstall R?
Note: The changes introduced by new R versions are usually backwards-compatible. That is, your old code should still work after updating your R version. However, if breaking changes happen, it is useful to know that you can have multiple versions of R installed in parallel and that you can switch between them in RStudio by going to Tools > Global Options > General > Basic.
After installing a new version of R, you will have to reinstall all your packages with the new version. For Windows, there is a package called installr that can help you upgrade your R version and migrate your package library.
To update RStudio to the latest version, open RStudio and click on Help > Check for Updates. If a new version is available, follow the instructions on screen. By default, RStudio will also automatically notify you of new versions every once in a while.
Windows
Install R by downloading and running this .exe file from CRAN.
Also, please install the RStudio IDE.
Note that if you have separate user and admin accounts, you should run the
installers as administrator (right-click on the .exe file and select "Run as
administrator" instead of double-clicking). Otherwise problems may occur later,
for example when installing R packages.
Linux
Instructions for R installation on various Linux platforms (Debian,
Fedora, Red Hat, and Ubuntu) can be found at
<https://cran.r-project.org/bin/linux/>. These will instruct you to
use your package manager (e.g. for Fedora, run
sudo dnf install R and for Debian/Ubuntu, add a PPA
repository and then run sudo apt-get install r-base).
Also, please install the RStudio IDE.
Install required R packages
During the course we will need a number of R packages. Packages contain useful R code written by other people. For this workshop we will use the tidyverse collection of packages.
To install the tidyverse, open RStudio and copy and paste the following command into the console window (look for a blinking cursor on the bottom left), then press Enter (Windows and Linux) or Return (macOS) to execute the command.
install.packages("tidyverse")
Alternatively, you can install the packages using RStudio’s graphical user interface by going to Tools > Install Packages and typing the package names, separated by commas.
R tries to download and install the packages on your machine. When the installation has finished, you can try to load the packages by pasting the following code into the console:
library(tidyverse)
If you do not see an error like there is no package called ‘…’, you are good to go!
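To check for missing packages without triggering that error, a small sketch using base R's requireNamespace() may help. The required_packages vector and the other variable names are assumptions for illustration; extend the vector if your instructors name additional packages. The code only reports what is missing and does not install anything.

```r
# Packages this workshop asks for (assumed list; adjust as needed).
required_packages <- c("tidyverse")

# requireNamespace() returns TRUE if a package can be loaded and FALSE
# otherwise, without attaching it or stopping with an error.
is_installed <- vapply(required_packages, requireNamespace,
                       logical(1), quietly = TRUE)
missing_packages <- required_packages[!is_installed]

if (length(missing_packages) > 0) {
  message("Still missing: ", paste(missing_packages, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Any package reported as missing can then be installed with install.packages() as described above.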