
Index

A | B | C | D | E | G | H | I | K | L | N | P | R | S | T | W

A

  • aggregate
  • anonymous function

B

  • BeautifulSoup

C

  • conditional
  • corpus

D

  • data leakage
  • DataFrame
  • dictionary
  • discriminative
  • document

E

  • error bars

G

  • generative
  • gh
  • git
  • GitHub

H

  • hyperparameter

I

  • index
  • interpreter
  • iterable
  • iterate

K

  • kernel

L

  • lambda

N

  • numpy array

P

  • PEP 8

R

  • repository

S

  • Series
  • shape
  • Split Apply Combine
  • stop words
  • suffix

T

  • test accuracy
  • Tidy Data Format
  • token
  • TraceBack
  • training accuracy
  • transpose

W

  • Web Scraping

By Professor Sarah M Brown

© Copyright 2022.