6. Assignment 6: Auditing Algorithms

Accept the assignment. Due: 2023-10-18

Eligible skills: (links to checklists)

  • first chance evaluate 1 and 2

  • construct 1 and 2

  • summarize 1 and 2

  • visualize 1 and 2

  • python 1 and 2

6.2. About the data

We have provided a reconstructed version of the Adult dataset, a popular benchmark for training machine learning models. The reconstruction comes from a recent paper about the risks of that dataset. The classic Adult dataset is used to predict whether a person makes more or less than $50k.

Researchers reconstructed the Adult dataset with the actual value of the income. We trained models to predict income>=$10k, income>=$20k, etc. For each target (>10k, >20k, …, >90k), we used three different learning algorithms, nicknamed ‘LR’, ‘GPR’, and ‘RPR’.

  • adult_models_only.csv has the models’ predictions

  • adult_reconstruction_bin.csv has the data.

Both have a unique identifier column included.
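Because both files share an identifier column, they can be joined with a merge. The following is a minimal sketch that uses small toy DataFrames in place of the real files; the column names (`id`, `gender`, `income_over_50k`, `LR_50k`) are hypothetical, so check the actual column names after loading the CSVs.

```python
import pandas as pd

# Toy stand-ins for the two files. With the real data these would come from
# pd.read_csv('adult_reconstruction_bin.csv') and pd.read_csv('adult_models_only.csv').
# All column names below are hypothetical placeholders.
data_df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'gender': ['Male', 'Female', 'Female', 'Male'],
    'income_over_50k': [1, 0, 1, 0],
})
preds_df = pd.DataFrame({
    'id': [1, 2, 4],
    'LR_50k': [1, 0, 1],
})

# An inner merge keeps only the rows whose identifier appears in both tables
audit_df = pd.merge(data_df, preds_df, on='id', how='inner')
print(audit_df.shape)
```

Comparing the shapes before and after the merge is a quick way to see how many samples have predictions.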

Think Ahead

Why might the data file have more samples in it than the model predictions file?

6.3. Complete an audit

Thoroughly audit any one model. In your audit, use at least three different performance metrics. Compare and contrast performance in those metrics across racial or gender groups.

Include easy-to-read tables with your performance metrics, along with interpretations of the model’s overall performance and any disparities, written so that a general audience could understand them.
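One possible way to compute several metrics overall and per group is sketched below. It uses a toy table with hypothetical column names (`group`, `y_true`, `y_pred`) and computes accuracy, true positive rate, and false positive rate by hand; these three are only one possible choice of metrics, and library functions such as those in `sklearn.metrics` could be used instead.

```python
import pandas as pd

# Toy merged table; the column names are hypothetical placeholders
toy = pd.DataFrame({
    'group':  ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'y_true': [1, 1, 0, 0, 1, 1, 0, 0],
    'y_pred': [1, 0, 0, 0, 1, 1, 1, 0],
})

def performance(df, true_col='y_true', pred_col='y_pred'):
    """Return accuracy, true positive rate, and false positive rate for one subset."""
    actual = df[true_col] == 1
    predicted = df[pred_col] == 1
    tp = (actual & predicted).sum()      # true positives
    tn = (~actual & ~predicted).sum()    # true negatives
    fp = (~actual & predicted).sum()     # false positives
    fn = (actual & ~predicted).sum()     # false negatives
    return pd.Series({
        'accuracy': (tp + tn) / len(df),
        'true_pos_rate': tp / (tp + fn),
        'false_pos_rate': fp / (fp + tn),
    })

# Metrics on all rows, then one row of metrics per group
overall = performance(toy)
by_group = pd.DataFrame(
    {name: performance(subset) for name, subset in toy.groupby('group')}
).T
print(by_group)
```

In this toy example both groups have the same accuracy but different error rates, which is the kind of disparity a single metric can hide.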

If the model you chose were used for a real-world decision, what might the risks be?

6.4. Extend your Audit

Note

Optional (for more achievements, deeper understanding, or more practice)

Use functions and loops to build a dataset about the performance of the different models so that you can answer the following questions:

  1. Which model (target and learning algorithm) has the best accuracy?

  2. Which target value has the least average disparity by race? by gender?

  3. Which learning algorithm has the least average disparity by race? by gender?

  4. Which model (target and learning algorithm) do you think is overall the best?

Table 6.1 Example table format

y      model  score           value  subset
>=10k  LR     accuracy        .873   overall
>=20k  RPR    false_pos_rate  .873   men

This table is not real data, just headers with example values to help illustrate what each column name means.
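A table in the format of Table 6.1 could be built with nested loops that append one row per combination of target, model, subset, and score. The sketch below runs on a toy DataFrame; the column naming pattern (`true_10k`, `LR_10k`, `RPR_10k`) and the `gender` values are assumptions for illustration and will likely differ from the real files.

```python
import pandas as pd

# Toy merged data; every column name here is a hypothetical placeholder
toy = pd.DataFrame({
    'gender':   ['men', 'men', 'women', 'women'],
    'true_10k': [1, 0, 1, 0],
    'LR_10k':   [1, 0, 1, 1],
    'RPR_10k':  [1, 1, 0, 0],
})

def accuracy(df, true_col, pred_col):
    """Fraction of rows where the prediction matches the true label."""
    return (df[true_col] == df[pred_col]).mean()

def false_pos_rate(df, true_col, pred_col):
    """Fraction of true negatives that were predicted positive."""
    negatives = df[df[true_col] == 0]
    return (negatives[pred_col] == 1).mean()

scores = {'accuracy': accuracy, 'false_pos_rate': false_pos_rate}

rows = []
for target in ['10k']:
    for model in ['LR', 'RPR']:
        # Evaluate on all rows and on each gender group
        subsets = {'overall': toy}
        subsets.update({name: sub for name, sub in toy.groupby('gender')})
        for subset_name, subset_df in subsets.items():
            for score_name, score_fn in scores.items():
                rows.append({
                    'y': target,
                    'model': model,
                    'score': score_name,
                    'value': score_fn(subset_df, 'true_' + target, model + '_' + target),
                    'subset': subset_name,
                })

results = pd.DataFrame(rows)
print(results)
```

Keeping one score per row (a long format) makes it straightforward to filter, group, and plot the results later.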

Hint

For this step, you should make separate data frames and then merge them together for construct. If you don’t need construct, you can build it as one. For visualize, you should use appropriate groupings.
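As a rough illustration of the merge-then-group pattern, the sketch below merges a per-group results table with an overall results table and then averages a gap per learning algorithm. All values are made up, the column names are hypothetical, and measuring disparity as the absolute difference between a group's accuracy and the overall accuracy is only one assumed definition; the assignment leaves the choice of disparity measure to you.

```python
import pandas as pd

# Two separately built tables with made-up values (not real results)
overall_df = pd.DataFrame({
    'y': ['10k', '10k'],
    'model': ['LR', 'RPR'],
    'accuracy_overall': [0.75, 0.625],
})
group_df = pd.DataFrame({
    'y': ['10k', '10k', '10k', '10k'],
    'model': ['LR', 'LR', 'RPR', 'RPR'],
    'subset': ['men', 'women', 'men', 'women'],
    'accuracy': [1.0, 0.5, 0.75, 0.5],
})

# Attach the overall accuracy to every per-group row with matching target and model
merged = pd.merge(group_df, overall_df, on=['y', 'model'], how='left')

# Assumed disparity measure: absolute gap between group and overall accuracy
merged['gap'] = (merged['accuracy'] - merged['accuracy_overall']).abs()

# Average gap per learning algorithm
avg_gap = merged.groupby('model')['gap'].mean()
print(avg_gap)
```

Grouping by `y` instead of `model` would give the average gap per target value.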