Models Due: 2024-11-06 (all models must be posted to Hugging Face)
Evaluations Due: 2024-11-10 (notebooks due for grading)
You can work in a team of 3, where each of you trains 1 model and evaluates 2 others; in a team of 4 (two pairs), where each of you trains 1 model and evaluates 2 others; or alone, where you train 2 models and evaluate 1 trained by another student.
## Dataset and EDA
**If you work in a team of 3:** each of you needs to train a different type of model: classification, regression, or clustering, so you need to choose data appropriate to that task. The two doing classification and clustering may use the same data, but do not have to. You will most likely need to choose at least two different datasets: one for regression and one for classification + clustering.

**If you work in a team of 4:** coordinate so that among the four of you at least one person trains a model of each type: classification, regression, and clustering. You need to choose data appropriate to your task. The students doing classification and clustering may use the same data, but do not have to.
- Choose a dataset that is well suited for a machine learning task.
- Work in a notebook named `training_<task>.ipynb`, where `<task>` is either classification, regression, or clustering, depending on which you will use this dataset for.
- Include a basic description of the data (what the features are).
- Describe the modeling task in your own words.
- Use EDA to determine if you expect the model to work well or not. What types of mistakes do you think will happen most (think about the confusion matrix)?
- Split the data 80-20 and set aside the 20% test set (a sketch follows this list). This can be uploaded to the course Hugging Face org as a dataset or uploaded with your model.
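A minimal sketch of the split, assuming your data is in a pandas DataFrame `df` with a label column named `target` (both are placeholder names for your own dataset):

```python
# minimal sketch of the 80-20 split; `df` and `target` are placeholder
# names for your own DataFrame and label column
from sklearn.model_selection import train_test_split

train_df, test_df = train_test_split(
    df,
    test_size=0.2,           # set aside 20% as the test set
    random_state=42,         # make the split reproducible
    # stratify=df["target"]  # optional: preserve class balance (classification)
)

# save the held-out test set so it can be uploaded to the course
# Hugging Face org as a dataset or alongside your model
test_df.to_csv("test_data.csv", index=False)
```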
## Basic Modeling
**Team of 3:** each of you needs to train a different type of model: classification, regression, or clustering.

**Working alone:** you should complete two types of training.

**Team of 4:** coordinate so that among the four of you at least one person trains a model of each type: classification, regression, and clustering.

For all of the modeling, work in your `training_<task>.ipynb` notebook. You will train and evaluate one model in this notebook.
## Classification
- Hypothesize which classifier from the notes will do better and why you think that. Does the data meet the assumptions of Naive Bayes? What is important about this classifier for this application?
- Fit your chosen classifier with the default parameters on 75% of the training data (60% of the whole dataset); see the sketch at the end of this section.
- Inspect the model to answer the questions appropriate to your model:
  - Does this model make sense?
  - (if DT) Are there any leaves that are very small?
  - (if DT) Is this an interpretable number of levels?
  - (if GNB) Do the parameters fit the data well, or do the parameters generate similar synthetic data? (You can answer statistically only, or with synthetic data and a plot.)
- Evaluate on the test data. Generate and interpret a classification report.
- Interpret the model and its performance in terms of the application context in order to give a recommendation: "would you deploy this model?" Example questions to consider in your response include:
  - do you think this model is good enough to use for real?
  - is this a model you would trust?
  - do you think that a more complex model should be used?
  - do you think that maybe this task cannot be done with machine learning?
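A minimal sketch with GaussianNB, assuming `train_df` and `test_df` from the earlier split, a `feature_cols` list, and a `target` column (all placeholder names; swap in `DecisionTreeClassifier` if that is your hypothesis):

```python
# sketch: fit with default parameters on 75% of the training data
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import classification_report

X, y = train_df[feature_cols], train_df["target"]
X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.25, random_state=42)

clf = GaussianNB().fit(X_fit, y_fit)

# inspect the fitted GNB parameters: class-conditional means and variances
print(clf.theta_)  # means, one row per class
print(clf.var_)    # variances (named sigma_ in older scikit-learn)

# draw synthetic data from the fitted parameters for one class, to compare
# against the real data statistically or in a plot
rng = np.random.default_rng(0)
synthetic = rng.normal(clf.theta_[0], np.sqrt(clf.var_[0]), size=(100, len(feature_cols)))

# evaluate on the held-out test set and interpret the report
y_pred = clf.predict(test_df[feature_cols])
print(classification_report(test_df["target"], y_pred))
```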
## Clustering
- Describe what question you would be asking in applying clustering to this dataset. What does it mean if clustering does not work well?
- How does this task compare to the classification task on this dataset?
- Apply KMeans using the known, correct number of clusters.
- Evaluate how well clustering worked on the data (see the sketch at the end of this section):
  - using a true clustering metric (one that does not use the labels, e.g., silhouette score),
  - using visualization, and
  - using a clustering metric that uses the ground truth labels (e.g., adjusted Rand index).
- Include a discussion of your results that addresses the following:
  - what the clustering means,
  - what the metrics show, and
  - whether this clustering works better or worse than expected based on the classification performance (if you didn't complete assignment 7, also apply a classifier).
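A minimal sketch, assuming a feature DataFrame `X`, true labels `y`, and the known number of classes `n_classes` (all placeholder names):

```python
# sketch: KMeans with the known number of clusters, plus one label-free
# metric, one label-based metric, and a simple visualization
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, adjusted_rand_score

km = KMeans(n_clusters=n_classes, random_state=42)
cluster_labels = km.fit_predict(X)

# true clustering metric: no ground truth labels needed
print("silhouette:", silhouette_score(X, cluster_labels))

# metric that uses the ground truth labels
print("adjusted Rand index:", adjusted_rand_score(y, cluster_labels))

# visualize on two columns; swap in your most informative features
plt.scatter(X.iloc[:, 0], X.iloc[:, 1], c=cluster_labels)
plt.xlabel(X.columns[0])
plt.ylabel(X.columns[1])
plt.show()
```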
## Regression
TLDR: Fit a linear regression model, measure the fit with two metrics, and make a plot that helps visualize the result.

- Include a basic description of the data (what the features are, units, size of dataset, etc.).
- Write your own description of what the prediction task is and why regression is appropriate.
- Fit a linear model on all numerical features with 75% of your training data (60% of the whole dataset).
- Test it on the 25% held-out validation data and measure the fit with two metrics and one plot.
- Inspect the model to answer:
  - What do the coefficients tell you?
  - What do the residuals tell you?
- Interpret the model and its performance in terms of the application. Some questions you might want to answer in order to do this include:
  - do you think this model is good enough to use for real?
  - is this a model you would trust?
  - do you think that a more complex model should be used?
  - do you think that maybe this task cannot be done with machine learning?
- Try fitting the model only on one feature. Justify your choice of feature based on the results above. Plot this result. (A sketch covering all of the regression steps follows.)
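A minimal sketch of the regression steps, assuming `train_df` from the earlier split with a numeric `target` column, and using R² and MAE as the two metrics; `best_feature` is a placeholder for the feature you justify:

```python
# sketch: linear regression on all numerical features, then on one feature
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error

X = train_df.select_dtypes("number").drop(columns=["target"])
y = train_df["target"]
# 75% of the training data for fitting, 25% held out for validation
X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.25, random_state=42)

reg = LinearRegression().fit(X_fit, y_fit)
y_pred = reg.predict(X_val)

# two metrics
print("R^2:", r2_score(y_val, y_pred))
print("MAE:", mean_absolute_error(y_val, y_pred))

# one plot: predicted vs. actual
plt.scatter(y_val, y_pred)
plt.xlabel("actual")
plt.ylabel("predicted")
plt.show()

# coefficients and residuals, for the inspection questions
print(dict(zip(X.columns, reg.coef_)))
residuals = y_val - y_pred
plt.scatter(y_pred, residuals)
plt.axhline(0, color="gray")
plt.xlabel("predicted")
plt.ylabel("residual")
plt.show()

# refit on a single (justified) feature and plot the fitted line
X1 = X_fit[["best_feature"]].values
reg1 = LinearRegression().fit(X1, y_fit)
xs = np.linspace(X1.min(), X1.max(), 100).reshape(-1, 1)
plt.scatter(X1, y_fit, alpha=0.5)
plt.plot(xs, reg1.predict(xs), color="red")
plt.xlabel("best_feature")
plt.ylabel("target")
plt.show()
```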
## Share Your Model
**Team of 3:** each of you should share the model you trained, so as a team you share 3 models.

**Working alone:** you can share any single model.

**Team of 4:** each of you should share the model you trained, so as a team you share 4 models; there must be at least one of each of the 3 tasks.
- Work in your `training_<task>.ipynb` notebook.
- Create a model card for your model, including performance, plots, and limitations.
- Upload your model and model card to the course Hugging Face organization by following the model sharing tutorial; a sketch of one possible way follows.
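One possible way to serialize and upload, sketched with `skops` and `huggingface_hub`; the repo and file names are placeholders, and the course model sharing tutorial is the authoritative reference:

```python
# sketch: save the fitted model with skops and upload it; follow the
# course tutorial for the canonical steps. Names below are placeholders.
from skops.io import dump
from huggingface_hub import HfApi

dump(clf, "model.skops")  # serialize the fitted model

api = HfApi()
api.create_repo("course-org/your-model-name", exist_ok=True)
api.upload_file(
    path_or_fileobj="model.skops",
    path_in_repo="model.skops",
    repo_id="course-org/your-model-name",
)
# the model card is the repo's README.md
api.upload_file(
    path_or_fileobj="README.md",
    path_in_repo="README.md",
    repo_id="course-org/your-model-name",
)
```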
## Evaluate your team's models
**Team of 3:** evaluate both of your teammates' models (the two tasks you did not train for).

**Working alone:** evaluate a model of the type that you did not choose to train.

**Team of 4:** evaluate your teammate's model and one model from the other pair (the two tasks you did not train for).
- Create an `audit_<task_name>.ipynb`.
- Include an introduction that answers the Model Card Peer Review questions below. [1]
- Download the model using `hf_hub_download` (see example).
- Download the appropriate test data using `hf_hub_download` (see the same example, but change it to a data file).
- Load the model using `skio` (see example).
- Test and evaluate the performance. Use multiple metrics and interpret all of them. (A sketch of this workflow follows the list.)
- Answer the audit questions below in your `audit` notebook for grading.

[1]: Share your feedback on the model card as a discussion on the Community tab of your classmate's model repository, so your classmate can make their model card better for innovative if they want.
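A minimal sketch of the audit workflow, assuming the model was saved with skops; the repo ids, filenames, and `target` column are placeholders:

```python
# sketch: download the model and test data, load, and evaluate
import pandas as pd
from huggingface_hub import hf_hub_download
from skops.io import load, get_untrusted_types
from sklearn.metrics import classification_report

model_path = hf_hub_download(
    repo_id="course-org/their-model", filename="model.skops"
)
data_path = hf_hub_download(
    repo_id="course-org/their-test-data",
    filename="test_data.csv",
    repo_type="dataset",  # needed when the file lives in a dataset repo
)

# skops makes you approve unfamiliar types before loading; inspect the
# list before trusting it
unknown_types = get_untrusted_types(file=model_path)
model = load(model_path, trusted=unknown_types)

test_df = pd.read_csv(data_path)
y_pred = model.predict(test_df.drop(columns=["target"]))
print(classification_report(test_df["target"], y_pred))
```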
## Model Card Peer Review

1. How well did the model card prepare you to use the model?
1. What additional information might have been helpful if you were deciding between two models that can do the same thing?

## Model Audit

1. How does the model performance you saw compare to the model card?
1. Do you think this model works well? What are the weaknesses or strengths?

## Submission
You will all submit to your own portfolio repo whether you work alone or in a group.
- Export notebooks as MyST markdown (by installing jupytext, which should include frontend features); a sketch follows.
- Upload (or push) to a branch called `assignment5` in your individual portfolio.
- Open a PR.
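A small sketch of the export using jupytext's Python API; the filename is an example, and the command-line equivalent is noted in a comment:

```python
# convert a notebook to MyST markdown; equivalently, from a terminal:
#   jupytext --to myst training_classification.ipynb
import jupytext

nb = jupytext.read("training_classification.ipynb")
jupytext.write(nb, "training_classification.md", fmt="md:myst")
```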
**If you work in a team,** each person submits:

- one `training_<task>.md`, where `<task>` is one of {classification, clustering, regression}
- two `audit_<task>.md`, where `<task>` is one of {classification, clustering, regression}

e.g., you might submit training_classification.md, audit_clustering.md, and audit_regression.md

**If you work alone,** you submit:

- two `training_<task>.md`, where `<task>` is one of {classification, clustering, regression}
- one `audit_<task>.md`, where `<task>` is one of {classification, clustering, regression}

e.g., you might submit training_classification.md, training_clustering.md, and audit_regression.md