Programming for Data Science at URI Fall 2020
Welcome to Programming for Data Science
Syllabus
About
Tools and Resources
Grading
Learning Objective, Schedule, and Rubric
Support
Policies
Class Notes
Class 2: Intro to Notebooks and Python
Class 3: Welcome to Week 2
Class 4: Pandas
Class 5: Accessing Data, continued
Class 6: Exploratory Data Analysis
Class 7: Visualization for EDA
Class 8: Visualization and Starting to Clean Data
Class 9: Preparing Data For Analysis
Class 10: Cleaning review and Ray Summit Keynotes
Class 11: Cleaning Data
Class 12: Constructing Datasets from Multiple Sources
Class 13: Data from multiple sources and Databases
Class 15: Intro to ML & Modeling
Class 16: Naive Bayes Classification
Class 17: Evaluating Classification and Midsemester Feedback
Class 18: Mid Semester Checkin, Git, & How GNB makes decisions
Class 19: Decision Trees
Class 20: Decision Trees and Cross Validation
Class 21: Regression
Class 22: More Regression, More Evaluation and LASSO
Class 23: Interpreting Regression Evaluations
Class 24: Clustering
Class 25: Evaluating Clustering
Class 26: More Clustering Models
Class 27: Model Optimization - Choosing K
Class 28: SVM & Model Optimization
Class 29: Choosing a Model
Class 30: Learning Curves, Validation Curves
Class 31: Confidence Intervals
Class 32: Intro to NLP
Class 33: Tools, Workflow & more NLP
Class : More Representations of Text
Assignments
Assignment 1: Portfolio Setup, Data Science, and Python
Assignment 2: Practicing Python and Accessing Data
Assignment 3: Exploratory Data Analysis
Assignment 4: Preparing Data for Analysis
Assignment 5: Constructing Datasets and Using Databases
Assignment 6: Naive Bayes
Assignment 7: Decision Trees
Assignment 8: Linear Regression
Assignment 9
Assignment 10: Optimizing Models
Assignment 11: Model Comparison
Assignment 12: Fake News
Portfolio
Formatting Tips
Reflective Prompts
Analysis Prompts
FAQ
Syllabus FAQ
GitHub FAQ
Common Debugging Issues
Resources
General Tips and Resources
References on Python
Data Sources
Index