Updated on: 7/31/19

This syllabus is subject to change. The latest version on this website is the binding syllabus.

Syllabus

 

COURSE

My office is 455 South College (top floor, north-east corner). Phone is 413-545-6598. Office hours are any time by appointment. Please feel free to contact me if you have any questions or concerns about this course.

Please read this warning about the course.

We meet Th, 2:30 – 3:20 pm, in Elm 210.
(Campus Map.)

English 391AH (Honors College seminar) introduces you to data science as part of the DaSH initiative.

Outcomes. You will be introduced to 1) the basics of the Python programming language, 2) simple algorithms, and 3) applying data science to the humanities.

UMass offers a number of introductions to data science, but this course focuses on practical applications in literature, language, history, art, architecture, film, music, dance, society, and politics.

We start from scratch: you do not need to know how to program, and high-school-level math is sufficient. (No calculus!) Grades are based on basic proficiency in Python, a good grasp of simple algorithms, and participation.

SCHEDULE is here.

 

BOOKS and MATERIALS:

BOOKS. We will be using:

  • Nick Montfort, Exploratory Programming for the Arts & Humanities. MIT Press, 2016. (Amazon)
  • Jodie Archer & Matthew Jockers, The Bestseller Code. St. Martin's Press, 2016. (Amazon)
  • Bird, Klein & Loper, Natural Language Processing with Python (the NLTK Book; free)
  • Tim Hall et al., Python 3 for Absolute Beginners (free here)
  • Sarah Boslaugh, Statistics in a Nutshell (free here)
  • Jurafsky & Martin, Speech and Language Processing (free here)

Here are some very useful cheat sheets: Python basics, Python import.

Recommended Books. Python in a Nutshell.

COMPUTERS. If you do not have a computer, you can check one out from the DuBois Library for free or use one there. If you want an extremely cheap, reliable, and functional computer, try a Raspberry Pi 3 (B+) and a cheap screen, keyboard, and mouse. (UMass IT may have some equipment it is throwing out; go and check!)

You also have access to my office computer through ssh, if you want it. Let me know and I'll set up an account for you.

 

STRUCTURE:

We begin each class with Python. The second half of each class focuses on application.

At the end of each class, we will discuss strategies and methods for gathering, cleaning, structuring, and analyzing data. An important component of this will be heuristic biases: the false assumptions that all people make when relying on intuition (following Daniel Kahneman, Thinking, Fast and Slow).
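To give a rough sense of what cleaning, structuring, and analyzing data can look like in Python, here is a minimal sketch. It is only an illustration, not the method we will necessarily use in class: the sample text and the function names (clean, structure, analyze) are made up for this example, and the "gathering" step is stood in for by a hard-coded string.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for gathered data
# (an assumption for illustration, not taken from the course materials).
RAW_TEXT = """
  It was the best of times,   it was the worst of times,
  it was the age of wisdom, it was the age of foolishness...
"""


def clean(text):
    """Collapse runs of whitespace, trim the ends, and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def structure(text):
    """Split cleaned text into a list of word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text)


def analyze(tokens, n=3):
    """Return the n most frequent tokens together with their counts."""
    return Counter(tokens).most_common(n)


cleaned = clean(RAW_TEXT)      # cleaning
tokens = structure(cleaned)    # structuring
top_words = analyze(tokens)    # analyzing
print(top_words)
```

Counting the words, rather than guessing which ones are most common, is one small way of checking intuition against the data.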

 

ASSIGNMENTS:

Attendance and participation are 100% of your grade. Show up, take part, and you'll do fine.

 

POLICIES:

Attendance is required.

Late Assignments are not accepted. Make provisions beforehand and speak with me if you anticipate obstacles to punctuality. I will accept officially excused absences.

Electronic devices are unwelcome, unless used for class work. Please don't distract people during class.

No recording devices. Your notes from class are fair use only for your personal reading; you cannot legally share them with or sell them to an outside vendor or entity without my written permission. This pertains to in-class recordings as well. Distributing notes or in-class recordings without my permission is a violation of faculty copyright protection. This policy also pertains to notes taken by students with accommodations under the Americans with Disabilities Act (ADA).

Conferences. I encourage you to meet with me at least once during the semester, if only to verify that the grades that you have correspond to the ones in my gradebook.

Office Hours. In 20 years of teaching, only about eight students have ever shown up to office hours. Finally realizing my folly, I don't keep office hours. But I am usually to be found in my office. (Please let me know beforehand if you want to meet; I may be busy when you pop by, or teaching, or at a meeting.) Otherwise, please make an appointment to meet with me at a time convenient to you and I will certainly try to oblige.

 

ACADEMIC HONESTY:

The University requires you to act and write with the highest degree of integrity. Ignorance of the standards of academic integrity is not normally sufficient evidence of lack of intent. For more information, consult the website of the Dean of Students.

See below (Note 4) for more information.

 

NOTES:

NOTE 1: Please make and keep a copy of all your assignments. In case any difficulties arise with respect to misplaced assignments or with respect to discrepancies between your records and my own, I will accept the evidence of your computer system's dating function. For your own peace of mind, I suggest that you lock any document on the day it is due. That will prevent your system from associating your document with a later date.

NOTE 2: The schedule of this course is subject to change. It is not to be construed as a substitute for your attendance or as a catalogue of all the information for which you are responsible. All changes will be announced beforehand. This syllabus and the accompanying schedule constitute a binding contract between a student and professor. If you do not agree with any of the provisions set herein and as of this moment, then you are free to drop this class within the time allotted by the administration.

NOTE 3: All material pertaining to this course is copyrighted material and is subject to international and US laws of copyright. No recording devices, please.

NOTE 4: Since the integrity of the academic enterprise of any institution of higher education requires honesty in scholarship and research, academic honesty is required of all students at the University of Massachusetts Amherst.

Academic dishonesty is prohibited in all programs of the University. Academic dishonesty includes but is not limited to: cheating, fabrication, plagiarism, and facilitating dishonesty. Appropriate sanctions may be imposed on any student who has committed an act of academic dishonesty. Instructors should take reasonable steps to address academic misconduct. Any person who has reason to believe that a student has committed academic dishonesty should bring such information to the attention of the appropriate course instructor as soon as possible. Instances of academic dishonesty not related to a specific course should be brought to the attention of the appropriate department Head or Chair. The procedures outlined below are intended to provide an efficient and orderly process by which action may be taken if it appears that academic dishonesty has occurred and by which students may appeal such actions.

Since students are expected to be familiar with this policy and the commonly accepted standards of academic integrity, ignorance of such standards is not normally sufficient evidence of lack of intent. For more information about what constitutes academic dishonesty, please see the Dean of Students’ website at: http://umass.edu/dean_students/codeofconduct/acadhonesty/

NOTE 5: Disability Statement. [Text from CTFD] The University of Massachusetts Amherst is committed to making reasonable, effective and appropriate accommodations to meet the needs of students with disabilities and help create a barrier-free campus. If you are in need of accommodation for a documented disability, register with Disability Services to have an accommodation letter sent to your faculty. It is your responsibility to initiate these services and to communicate with faculty ahead of time to manage accommodations in a timely manner. For more information, consult the Disability Services website at http://www.umass.edu/disability/.

LINKS.

Academic Schedule

Sites

Python.org
Python Packages.

Data Science in general

UMass Library
NLTK
NLTK Book

Topic Modelling (LDA)

Videos

VIDEO: Corey Schafer Python
VIDEO: Socratica: Python
VIDEO: Khan Academy Statistics

Corpora

Oxford Text Archive
Project Gutenberg
Corpus of Western Lit

Kaggle Data Sets
US Government data
Mass Data sets
MA Attorney General Data
Boston Data
National Weather and more
US Census
UMass Amherst Data
Amherst MA Data
Books & Publishing Data

A million headlines (AUS)