Models Due: 2024-11-06 (all models must be posted to Hugging Face)
Evaluations Due: 2024-11-10 (notebooks due for grading)
You can work in a team of 3, where each of you trains 1 model and evaluates 2 others; in a team of 4 (two pairs), where each of you trains 1 model and evaluates 2 others; or alone, where you train 2 models and evaluate 1 trained by another student.
## Dataset and EDA
**If you work in a team of 3:** each of you needs to train a different type of model: classification, regression, or clustering, so you need to choose data appropriate to that task. The two doing classification and clustering may use the same data, but do not have to. You will most likely need to choose at least two different datasets: one for regression and one for classification + clustering.

**If you work in a team of 4:** coordinate so that among the four of you at least one person trains a model of each type: classification, regression, and clustering. You need to choose data appropriate to your task. The students doing classification and clustering may use the same data, but do not have to.
- Choose a dataset that is well suited for a machine learning task.
- Work in a notebook named `training_<task>.ipynb`, where `<task>` is either classification, regression, or clustering, depending on which you will use this dataset for.
- Include a basic description of the data (what the features are).
- Describe the modeling task in your own words.
- Use EDA to determine if you expect the model to work well or not. What types of mistakes do you think will happen most (think about the confusion matrix)?
- Split the data 80-20 and set aside the 20% test set (a sketch follows this list). This can be uploaded to the course Hugging Face org as a dataset or uploaded with your model.
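A minimal sketch of the split, assuming your data is in a pandas DataFrame `df` with a label column named `target` (both are placeholder names for your own dataset):

```python
# minimal sketch of the 80-20 split; `df` and `target` are placeholder
# names for your own DataFrame and label column
from sklearn.model_selection import train_test_split

train_df, test_df = train_test_split(
    df,
    test_size=0.2,           # set aside 20% as the test set
    random_state=42,         # make the split reproducible
    # stratify=df["target"]  # optional: preserve class balance (classification)
)

# save the held-out test set so it can be uploaded to the course
# Hugging Face org as a dataset or alongside your model
test_df.to_csv("test_data.csv", index=False)
```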
## Basic Modeling
**Team of 3:** each of you needs to train a different type of model: classification, regression, or clustering.

**Working alone:** you should complete two types of training.

**Team of 4:** coordinate so that among the four of you at least one person trains a model of each type: classification, regression, and clustering.

For all of the modeling, work in your `training_<task>.ipynb` notebook. You will train and evaluate one model in this notebook.
## Classification
- Hypothesize which classifier from the notes will do better and why you think that. Does the data meet the assumptions of Naive Bayes? What is important about this classifier for this application?
- Fit your chosen classifier with the default parameters on 75% of the training data (60% of the whole dataset); see the sketch at the end of this section.
- Inspect the model to answer the questions appropriate to your model:
  - Does this model make sense?
  - (if DT) Are there any leaves that are very small?
  - (if DT) Is this an interpretable number of levels?
  - (if GNB) Do the parameters fit the data well, or do the parameters generate similar synthetic data? (You can answer statistically only, or with synthetic data and a plot.)
- Evaluate on the test data. Generate and interpret a classification report.
- Interpret the model and its performance in terms of the application context in order to give a recommendation: "would you deploy this model?" Example questions to consider in your response include:
  - do you think this model is good enough to use for real?
  - is this a model you would trust?
  - do you think that a more complex model should be used?
  - do you think that maybe this task cannot be done with machine learning?
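A minimal sketch with GaussianNB, assuming `train_df` and `test_df` from the earlier split, a `feature_cols` list, and a `target` column (all placeholder names; swap in `DecisionTreeClassifier` if that is your hypothesis):

```python
# sketch: fit with default parameters on 75% of the training data
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import classification_report

X, y = train_df[feature_cols], train_df["target"]
X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.25, random_state=42)

clf = GaussianNB().fit(X_fit, y_fit)

# inspect the fitted GNB parameters: class-conditional means and variances
print(clf.theta_)  # means, one row per class
print(clf.var_)    # variances (named sigma_ in older scikit-learn)

# draw synthetic data from the fitted parameters for one class, to compare
# against the real data statistically or in a plot
rng = np.random.default_rng(0)
synthetic = rng.normal(clf.theta_[0], np.sqrt(clf.var_[0]), size=(100, len(feature_cols)))

# evaluate on the held-out test set and interpret the report
y_pred = clf.predict(test_df[feature_cols])
print(classification_report(test_df["target"], y_pred))
```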
## Clustering
- Describe what question you would be asking in applying clustering to this dataset. What does it mean if clustering does not work well?
- How does this task compare to the classification task on this dataset?
- Apply KMeans using the known, correct number of clusters.
- Evaluate how well clustering worked on the data (see the sketch at the end of this section):
  - using a true clustering metric (one that does not use the labels, e.g., silhouette score),
  - using visualization, and
  - using a clustering metric that uses the ground truth labels (e.g., adjusted Rand index).
- Include a discussion of your results that addresses the following:
  - what the clustering means,
  - what the metrics show, and
  - whether this clustering works better or worse than expected based on the classification performance (if you didn't complete assignment 7, also apply a classifier).
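A minimal sketch, assuming a feature DataFrame `X`, true labels `y`, and the known number of classes `n_classes` (all placeholder names):

```python
# sketch: KMeans with the known number of clusters, plus one label-free
# metric, one label-based metric, and a simple visualization
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, adjusted_rand_score

km = KMeans(n_clusters=n_classes, random_state=42)
cluster_labels = km.fit_predict(X)

# true clustering metric: no ground truth labels needed
print("silhouette:", silhouette_score(X, cluster_labels))

# metric that uses the ground truth labels
print("adjusted Rand index:", adjusted_rand_score(y, cluster_labels))

# visualize on two columns; swap in your most informative features
plt.scatter(X.iloc[:, 0], X.iloc[:, 1], c=cluster_labels)
plt.xlabel(X.columns[0])
plt.ylabel(X.columns[1])
plt.show()
```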
## Regression
TLDR: Fit a linear regression model, measure the fit with two metrics, and make a plot that helps visualize the result.

- Include a basic description of the data (what the features are, units, size of dataset, etc.).
- Write your own description of what the prediction task is and why regression is appropriate.
- Fit a linear model on all numerical features with 75% of your training data (60% of the whole dataset).
- Test it on the 25% held-out validation data and measure the fit with two metrics and one plot.
- Inspect the model to answer:
  - What do the coefficients tell you?
  - What do the residuals tell you?
- Interpret the model and its performance in terms of the application. Some questions you might want to answer in order to do this include:
  - do you think this model is good enough to use for real?
  - is this a model you would trust?
  - do you think that a more complex model should be used?
  - do you think that maybe this task cannot be done with machine learning?
- Try fitting the model only on one feature. Justify your choice of feature based on the results above. Plot this result. (A sketch covering all of the regression steps follows.)
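A minimal sketch of the regression steps, assuming `train_df` from the earlier split with a numeric `target` column, and using R² and MAE as the two metrics; `best_feature` is a placeholder for the feature you justify:

```python
# sketch: linear regression on all numerical features, then on one feature
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error

X = train_df.select_dtypes("number").drop(columns=["target"])
y = train_df["target"]
# 75% of the training data for fitting, 25% held out for validation
X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.25, random_state=42)

reg = LinearRegression().fit(X_fit, y_fit)
y_pred = reg.predict(X_val)

# two metrics
print("R^2:", r2_score(y_val, y_pred))
print("MAE:", mean_absolute_error(y_val, y_pred))

# one plot: predicted vs. actual
plt.scatter(y_val, y_pred)
plt.xlabel("actual")
plt.ylabel("predicted")
plt.show()

# coefficients and residuals, for the inspection questions
print(dict(zip(X.columns, reg.coef_)))
residuals = y_val - y_pred
plt.scatter(y_pred, residuals)
plt.axhline(0, color="gray")
plt.xlabel("predicted")
plt.ylabel("residual")
plt.show()

# refit on a single (justified) feature and plot the fitted line
X1 = X_fit[["best_feature"]].values
reg1 = LinearRegression().fit(X1, y_fit)
xs = np.linspace(X1.min(), X1.max(), 100).reshape(-1, 1)
plt.scatter(X1, y_fit, alpha=0.5)
plt.plot(xs, reg1.predict(xs), color="red")
plt.xlabel("best_feature")
plt.ylabel("target")
plt.show()
```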
## Share Your Model
**Team of 3:** each of you should share the model you trained, so as a team you share 3 models.

**Working alone:** you can share any single model.

**Team of 4:** each of you should share the model you trained, so as a team you share 4 models; there must be at least one of each of the 3 tasks.
- Work in your `training_<task>.ipynb` notebook.
- Create a model card for your model, including performance, plots, and limitations.
- Upload your model and model card to the course Hugging Face organization by following the model sharing tutorial; a sketch of one possible way follows.
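One possible way to serialize and upload, sketched with `skops` and `huggingface_hub`; the repo and file names are placeholders, and the course model sharing tutorial is the authoritative reference:

```python
# sketch: save the fitted model with skops and upload it; follow the
# course tutorial for the canonical steps. Names below are placeholders.
from skops.io import dump
from huggingface_hub import HfApi

dump(clf, "model.skops")  # serialize the fitted model

api = HfApi()
api.create_repo("course-org/your-model-name", exist_ok=True)
api.upload_file(
    path_or_fileobj="model.skops",
    path_in_repo="model.skops",
    repo_id="course-org/your-model-name",
)
# the model card is the repo's README.md
api.upload_file(
    path_or_fileobj="README.md",
    path_in_repo="README.md",
    repo_id="course-org/your-model-name",
)
```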
## Evaluate your team's models
**Team of 3:** evaluate both of your teammates' models (the two tasks you did not train for).

**Working alone:** evaluate a model of the type that you did not choose to train.

**Team of 4:** evaluate your teammate's model and one model from the other pair (the two tasks you did not train for).
- Create an `audit_<task_name>.ipynb`.
- Include an introduction that answers the Model Card Peer Review questions below. [1]
- Download the model using `hf_hub_download` (see example).
- Download the appropriate test data using `hf_hub_download` (see the same example, but change it to a data file).
- Load the model using `skio` (see example).
- Test and evaluate the performance. Use multiple metrics and interpret all of them. (A sketch of this workflow follows the list.)
- Answer the audit questions below in your `audit` notebook for grading.

[1]: Share your feedback on the model card as a discussion on the Community tab of your classmate's model repository, so your classmate can make their model card better for innovative if they want.
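A minimal sketch of the audit workflow, assuming the model was saved with skops; the repo ids, filenames, and `target` column are placeholders:

```python
# sketch: download the model and test data, load, and evaluate
import pandas as pd
from huggingface_hub import hf_hub_download
from skops.io import load, get_untrusted_types
from sklearn.metrics import classification_report

model_path = hf_hub_download(
    repo_id="course-org/their-model", filename="model.skops"
)
data_path = hf_hub_download(
    repo_id="course-org/their-test-data",
    filename="test_data.csv",
    repo_type="dataset",  # needed when the file lives in a dataset repo
)

# skops makes you approve unfamiliar types before loading; inspect the
# list before trusting it
unknown_types = get_untrusted_types(file=model_path)
model = load(model_path, trusted=unknown_types)

test_df = pd.read_csv(data_path)
y_pred = model.predict(test_df.drop(columns=["target"]))
print(classification_report(test_df["target"], y_pred))
```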
## Model Card Peer Review

1. How well did the model card prepare you to use the model?
1. What additional information might have been helpful if you were deciding between two models that can do the same thing?

## Model Audit

1. How does the model performance you saw compare to the model card?
1. Do you think this model works well? What are the weaknesses or strengths?

## Submission
You will all submit to your own portfolio repo whether you work alone or in a group.
- Export notebooks as MyST markdown (by installing jupytext, which should include frontend features); a sketch follows.
- Upload (or push) to a branch called `assignment5` in your individual portfolio.
- Open a PR.
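A small sketch of the export using jupytext's Python API; the filename is an example, and the command-line equivalent is noted in a comment:

```python
# convert a notebook to MyST markdown; equivalently, from a terminal:
#   jupytext --to myst training_classification.ipynb
import jupytext

nb = jupytext.read("training_classification.ipynb")
jupytext.write(nb, "training_classification.md", fmt="md:myst")
```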
**If you work in a team,** each person submits:

- one `training_<task>.md`, where `<task>` is one of {classification, clustering, regression}
- two `audit_<task>.md`, where `<task>` is one of {classification, clustering, regression}

e.g., you might submit training_classification.md, audit_clustering.md, and audit_regression.md

**If you work alone,** you submit:

- two `training_<task>.md`, where `<task>` is one of {classification, clustering, regression}
- one `audit_<task>.md`, where `<task>` is one of {classification, clustering, regression}

e.g., you might submit training_classification.md, training_clustering.md, and audit_regression.md