Index

A | B | C | D | G | I | K | L | P | R | S | T | W

A

  • aggregate
  • anonymous function

B

  • BeautifulSoup

C

  • conditional
  • corpus

D

  • DataFrame
  • dictionary
  • document

G

  • gh
  • git
  • GitHub

I

  • index
  • interpreter
  • iterable
  • iterate

K

  • kernel

L

  • lambda

P

  • PEP 8

R

  • repository

S

  • Series
  • Split Apply Combine
  • stop words
  • suffix

T

  • test accuracy
  • Tidy Data Format
  • token
  • TraceBack
  • training accuracy

W

  • Web Scraping

By Professor Sarah M Brown

© Copyright 2022.