UCSB Library Library Carpentry

University of California, Santa Barbara (Online)

June 3rd, 10th, and 17th, 2020

12:00 pm - 5:00 pm

Instructors: Torin White, Sanjeev Kolli, Renata Curty, Kristi Liu, Jon Jablonski, Greg Janee

Helpers: Kristi Liu, Torin White

*Registration*

Registration and the waitlist are at capacity for this event. Please subscribe to the UCSB Carpentry workshops listserv to be alerted when new workshops are published. We plan to hold online workshops throughout the summer quarter. Announcements will be sent to the listserv and posted on https://ucsbcarpentry.github.io/ as soon as the events are published.

General Information

Library Carpentry is made by people working in library- and information-related roles to help you:

Library Carpentry introduces you to the fundamentals of computing and provides you with a platform for further self-directed learning. For more information on what we teach and why, please see our paper "Library Carpentry: software skills training for library professionals".

Who: The course is for people working in library- and information-related roles. You don't need to have any previous knowledge of the tools that will be presented at the workshop.

Where: Zoom.

When: June 3rd, 10th, and 17th, 2020. Add to your Google Calendar.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).

Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:

Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide them.

Contact: Please email collaboratory@library.ucsb.edu for more information.


Code of Conduct

Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.


Collaborative Notes

We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.


Surveys

Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey


Schedule

Day 2

12:00 Review Day 1
12:20 More with Regular Expressions
12:50 Working with Free Text
1:20 Break
1:35 Setup for webscraping & HTML 101
2:05 What is web scraping?
2:15 Selecting content on a web page with XPath
3:00 Break
3:15 Manually scrape data using browser extensions
4:20 End

Day 3

12:00 Review Day 2
12:30 Introduction to JupyterLab
12:50 Break
1:05 Web scraping using Python and Scrapy
2:05 Break
2:20 More Scraping with Scrapy
3:20 Break
3:35 Ethics & Legality of Webscraping
4:15 Post-workshop survey
4:20 End

Syllabus

A Computational Approach

The Unix Shell

  • Files and Directories
  • History and Tab Completion
  • Counting and Sorting Contents in Files
  • Pipes and Redirection
  • Mining or Searching in Files
  • Reference...

Web Scraping with Python

  • What is Web Scraping?
  • Selecting content with XPath
  • Using browser extensions to scrape
  • Scrape with Python and Scrapy

Setup

To participate in a Library Carpentry workshop, you will need access to the software described below. In addition, you will need an up-to-date version of the Chrome web browser.

The Carpentries maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but it may be useful to you as well.

The Bash Shell

Bash is a commonly used shell that gives you the power to do simple tasks more quickly. Please find setup instructions in the lesson.

Scraper Chrome extension

Scraper is a data mining extension for extracting data from a web page. It will help us to teach you XPath. You can download it here, or from the web scraping setup page.
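To give a feel for what an XPath expression does before the workshop, here is a minimal sketch in Python. It is not part of the lesson material: the HTML snippet, the names PAGE and select_links, and the class values are all hypothetical, chosen only for illustration. It uses Python's built-in xml.etree.ElementTree module, which supports only a limited subset of XPath and requires well-formed markup, so it is a stand-in for the fuller XPath support you will use with Scraper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed HTML snippet used only for illustration.
PAGE = """
<html>
  <body>
    <h1>Staff directory</h1>
    <ul>
      <li class="person"><a href="/staff/a">Ada</a></li>
      <li class="person"><a href="/staff/b">Grace</a></li>
      <li class="note">Updated weekly</li>
    </ul>
  </body>
</html>
"""


def select_links(page):
    """Return (link text, href) pairs for every person entry in the page."""
    root = ET.fromstring(page.strip())
    # XPath: every <a> directly inside an <li> whose class is "person",
    # found anywhere below the root element.
    anchors = root.findall(".//li[@class='person']/a")
    return [(a.text, a.get("href")) for a in anchors]


print(select_links(PAGE))
```

The expression skips the "note" list item because its class attribute does not match, which is the kind of targeted selection XPath is used for when scraping.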

Python with Jupyter Notebooks

We will be using an online tool to teach you to use Python.

Python is a popular language for research computing, and great for general-purpose programming as well. Installing all of its research packages individually can be a bit difficult, so we will be using an online version of Python to make life easier during the workshop. If you are interested in continuing your Python learning, we recommend Anaconda, an all-in-one installer that includes Jupyter Notebooks.