# Portfolio

In [1]:

import yaml as yml
import pandas as pd
import os
pd.set_option('display.max_colwidth', None)


def yml_df(file):
    with open(file, 'r') as f:
        file_unparsed = f.read()

    file_dict = yml.safe_load(file_unparsed)
    return pd.DataFrame(file_dict)

outcomes_df = yml_df('../_data/learning_outcomes.yml')
outcomes_df.set_index('keyword',inplace=True)
schedule_df = yml_df('../_data/schedule.yml')
schedule_df.set_index('week', inplace=True)
# schedule_df = pd.merge(schedule_df,outcomes_df,right_on='keyword',  left_on= 'clo')
rubric_df = yml_df('../_data/rubric.yml')
rubric_df.set_index('keyword', inplace=True)
rubric_df.replace({None:'TBD'},inplace=True)
rubric_df.rename(columns={'mastery':'Level 3',
              'compentent':'Level 2',
              'aware':'Level 1'}, inplace=True)

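# Series.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
assignment_dummies = pd.get_dummies(rubric_df['assignments'].apply(pd.Series).stack()).groupby(level=0).sum()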
assignment_dummies['# Assignments'] = assignment_dummies.sum(axis=1)
col_rename = {float(i):'A' + str(i) for i in range(1,14)}
assignment_dummies.rename(columns =col_rename,inplace=True)

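# Series.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
portfolio_dummies = pd.get_dummies(rubric_df['portfolios'].apply(pd.Series).stack()).groupby(level=0).sum()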
col_rename = {float(i):'P' + str(i) for i in range(1,5)}
portfolio_dummies.rename(columns =col_rename,inplace=True)


rubric_df = pd.concat([rubric_df,
                      assignment_dummies,
                      portfolio_dummies],axis=1)

assignment_cols =  ['A'+ str(i) for i in range(1,14)] + ['# Assignments']

portfolio_cols = [ 'Level 3'] + ['P' + str(i) for i in range(1,5)]
portfolio_df = rubric_df[portfolio_cols]
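
The cell above turns the list of portfolio checks attached to each skill into one indicator column per check (`P1` to `P4`). As a minimal sketch of that idea, the toy example below uses made-up rubric rows (the skill-to-check assignments are hypothetical, not the course rubric) and swaps `.apply(pd.Series).stack()` for `Series.explode`, which should give the same long-form result under these assumptions.

```python
import pandas as pd

# Hypothetical toy rubric: each skill lists the portfolio checks that assess it.
toy_rubric = pd.DataFrame(
    {
        "keyword": ["python", "process", "optimize"],
        "portfolios": [[1, 2], [2, 3], [3, 4]],
    }
).set_index("keyword")

# One row per (skill, check) pair; the index repeats the skill keyword.
long_form = toy_rubric["portfolios"].explode().astype(int)

# One indicator column per check, summed back to one row per skill.
indicators = pd.get_dummies(long_form).groupby(level=0).sum()
indicators = indicators.rename(columns={i: f"P{i}" for i in range(1, 5)})

# groupby sorts the keywords, so restore the original rubric order.
indicators = indicators.reindex(toy_rubric.index)

# Select the skills assessed in the third check, as the later cells do with P3.
p3_skills = indicators.index[indicators["P3"] == 1].tolist()
print(p3_skills)
```

With these toy rows, only `process` and `optimize` list check 3, so those are the two keywords selected.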

This section of the site has a set of portfolio prompts and this page has instructions for portfolio submissions.  

Starting in week 3, I recommend that you spend some time each week working on items for your portfolio. That way, when it's time to submit, you will only have a little to add.

The portfolio is your only chance to earn Level 3 achievements. However, if you have not yet earned level 2 for any of the skills in a given check, you can earn level 2 for them in that check instead.
The prompts provide a starting point, but remember that to earn achievements, you'll be evaluated by the rubric.
You can see the full rubric for all portfolios in the [syllabus](portfolioskills).
Your portfolio is also an opportunity to be creative, explore things, and answer your own questions that we haven't answered in class to dig deeper on the topics we're covering.
Use the feedback you get on assignments to inspire your portfolio.

Each submission should include an introduction and a number of 'chapters'. Your grade will be based on two things: whether your chapters, inspired by the prompts, demonstrate the skills, and whether your summary demonstrates that you *know* you learned the skills. See the [formatting tips](formatting) for advice on how to structure files.




The third submission is due on December 4 and will be graded on the following criteria:

In [2]:
portfolio_df['Level 3'][portfolio_df['P3']==1].reset_index().set_index('keyword')

| keyword | Level 3 |
|---|---|
| process | Compare different ways that data science can facilitate decision making |
| classification | fit and apply classification models and select appropriate classification models for different contexts |
| regression | can fit and explain regularized or nonlinear regression |
| clustering | apply multiple clustering techniques, and interpret results |
| evaluate | Evaluate a model with multiple metrics and cross validation |
| optimize | Select optimal parameters based on multiple quantitative criteria and automate parameter tuning |
| compare | Evaluate tradeoffs between different model comparison types |
| unstructured | apply multiple representations and compare and contrast them for different end results |
| workflow | Scope, choose an appropriate tool pipeline and solve data science problems, describe strengths and weaknesses of common tools |


In each chapter (that is, each file) of your portfolio, you should identify, by keyword, which skills you are applying.

You can view a (fake) example [in this repository](https://github.com/rhodyprog4ds/portfolio-brownsarahm) as a [pdf](https://github.com/rhodyprog4ds/portfolio-brownsarahm/blob/gh-pages/portfolio.pdf) or as a [rendered website](https://rhodyprog4ds.github.io/portfolio-brownsarahm/intro.html).

## Upcoming Checks

### Portfolio 4

For the fourth submission, due December 19, you may earn level 1 for *any skill*, with no limit on the number of skills, as long as your submission is clear and concise. I recommend that you make a plan early (by updating your `submission_4_intro.md` file or opening an issue on your repo) and ask for feedback in office hours.

It will be graded primarily on the following criteria:

In [3]:
portfolio_df['Level 3'][portfolio_df['P4']==1].reset_index().set_index('keyword')

| keyword | Level 3 |
|---|---|
| optimize | Select optimal parameters based on multiple quantitative criteria and automate parameter tuning |
| compare | Evaluate tradeoffs between different model comparison types |
| unstructured | apply multiple representations and compare and contrast them for different end results |
| workflow | Scope, choose an appropriate tool pipeline and solve data science problems, describe strengths and weaknesses of common tools |


You may also earn level 2 for any of those skills and for *one additional skill*. For a single skill of your choice that you have attempted at least twice (across assignments and portfolios), you may get an additional attempt in check 4. In this case, you must link to your two previous attempts and describe how you approached working on understanding this skill.

Linking in markdown uses `[]` for the display text and `()` for the URL, like:

```
[previous attempt here](https://github.com/rhodyprog4ds/portfolio-brownsarahm/blob/main/check1/loading_data.md)
```
That would render as: [previous attempt here](https://github.com/rhodyprog4ds/portfolio-brownsarahm/blob/main/check1/loading_data.md)

You can also use relative paths as in the [example intro file](https://raw.githubusercontent.com/rhodyprog4ds/portfolio-brownsarahm/main/submission_1_intro.md). 
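
If you have several previous attempts to link, a small helper can keep the syntax consistent. This is only a sketch: `markdown_link` is a hypothetical helper, and the relative paths below are made-up examples (only the `check1/loading_data.md` name is borrowed from the URL above). Relative paths are typically resolved from the file that contains the link, so adjust them to your own repository layout.

```python
def markdown_link(text: str, target: str) -> str:
    """Return a markdown link: display text in [], target (URL or path) in ()."""
    return f"[{text}]({target})"


# Hypothetical relative paths to two earlier attempts, for illustration only.
previous_attempts = {
    "first attempt": "check1/loading_data.md",
    "second attempt": "check2/loading_data.md",
}

# Build one markdown link per attempt, ready to paste into an intro file.
links = [markdown_link(label, path) for label, path in previous_attempts.items()]
print("\n".join(links))
```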

## Past Checks

### Portfolio 2

The second submission was due on November 13 and is graded on the following criteria:

In [4]:
portfolio_df['Level 3'][portfolio_df['P2']==1].reset_index().set_index('keyword')

| keyword | Level 3 |
|---|---|
| python | reliable, efficient, pythonic code that consistently adheres to pep8 |
| process | Compare different ways that data science can facilitate decision making |
| access | access data from both common and uncommon formats and identify best practices for formats in different contexts |
| construct | merge data that is not automatically aligned |
| summarize | Compute and interpret various summary statistics of subsets of data |
| visualize | generate complex plots with pandas and plotting libraries and customize with matplotlib or additional parameters |
| prepare | apply data reshaping, cleaning, and filtering manipulations reliably and correctly by assessing data as received |
| classification | fit and apply classification models and select appropriate classification models for different contexts |
| regression | can fit and explain regularized or nonlinear regression |
| clustering | apply multiple clustering techniques, and interpret results |


For this portfolio check, I encourage you to either dig into data from one domain and think about how clustering, regression, and classification apply in that domain, or inspect these techniques more carefully from a performance perspective, taking a more CS-basics approach.