{ "cells": [ { "cell_type": "markdown", "id": "f7d2d806", "metadata": {}, "source": [ "## Learning Objective, Schedule, and Rubric" ] }, { "cell_type": "code", "execution_count": 1, "id": "fb42db8a", "metadata": { "tags": [ "remove-input" ] }, "outputs": [], "source": [ "\n", "import yaml as yml\n", "import pandas as pd\n", "import os\n", "from IPython.display import display, Markdown\n", "pd.set_option('display.max_colwidth', None)\n", "\n", "\n", "def yml_df(file):\n", " with open(file, 'r') as f:\n", " file_unparsed = f.read()\n", "\n", " file_dict = yml.safe_load(file_unparsed)\n", " return pd.DataFrame(file_dict)\n", "\n", "outcomes_df = yml_df('../_data/learning_outcomes.yml')\n", "# outcomes_df.set_index('keyword',inplace=True)\n", "schedule_df = yml_df('../_data/schedule.yml')\n", "schedule_df.set_index('week', inplace=True)\n", "# schedule_df = pd.merge(schedule_df,outcomes_df,right_on='keyword', left_on= 'clo')\n", "rubric_df = yml_df('../_data/rubric.yml')\n", "rubric_df.set_index('keyword', inplace=True)" ] }, { "cell_type": "markdown", "id": "12257fb3", "metadata": {}, "source": [ "### Learning Outcomes\n", "\n", "There are five learning outcomes for this course." ] }, { "cell_type": "code", "execution_count": 2, "id": "a80627be", "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/markdown": [ "1. (process) Describe the process of data science, define each phase, and identify standard tools \n", "2. (data) Access and combine data in multiple formats for analysis \n", "3. (exploratory) Perform exploratory data analyses including descriptive statistics and visualization \n", "4. (modeling) Select models for data by applying and evaluating mutiple models to a single dataset \n", "5. (communicate) Communicate solutions to problems with data in common industry formats" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "outcome_list = [ str(i+1) + '. ' + ' (' + k + ') ' + o for i,(o,k) in enumerate(zip(outcomes_df['outcome'], outcomes_df['keyword']))]\n", "\n", "display(Markdown(' \\n'.join(outcome_list)))\n", "#outcomes_df[['keyword','outcome']]" ] }, { "cell_type": "markdown", "id": "8bf2055f", "metadata": {}, "source": [ "We will build your skill in the `process` and `communicate` outcomes over the whole semester. The middle three skills will correspond roughly to the content taught for each of the first three portfolio checks. \n", "\n", "(schedule)=\n", "### Schedule\n", "\n", "````{margin}\n", "```{note}\n", "On the [BrightSpace calendar](https://brightspace.uri.edu/d2l/le/calendar/101136) page you can get a feed link to add to the calendar of your choice by clicking on the subscribe (star) button on the top right of the page. Class is for 1 hour there because of Brightspace/zoom integration limitations, but that calendar includes the zoom link.\n", "```\n", "````\n", "\n", "The course will meet MWF 1-1:50pm on Zoom. Every class will include participatory live coding (instructor types, students follow along)) instruction and small exercises for you to progress toward level 1 achievements of the new skills introduced in class that day.\n", "\n", "Programming assignments that will be due each week Tuesday by 11:59pm.\n", "_until week 5 they were due Sundays_" ] }, { "cell_type": "code", "execution_count": 3, "id": "c67db653", "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
topicsskills
week
1[admin, python review]process
2Loading data, Python review[access, prepare, summarize]
3Exploratory Data Analysis[summarize, visualize]
4Data Cleaning[prepare, summarize, visualize]
5Databases, Merging DataFrames[access, construct, summarize]
6Modeling, Naive Bayes, classification performance metrics[classification, evaluate]
7decision trees, cross validation[classification, evaluate]
8Regression[regression, evaluate]
9Clustering[clustering, evaluate]
10SVM, parameter tuning[optimize, tools]
11KNN, Model comparison[compare, tools]
12Text Analysis[unstructured]
13Topic Modeling[unstructured, tools]
14Deep Learning[tools, compare]
\n", "
" ], "text/plain": [ " topics \\\n", "week \n", "1 [admin, python review] \n", "2 Loading data, Python review \n", "3 Exploratory Data Analysis \n", "4 Data Cleaning \n", "5 Databases, Merging DataFrames \n", "6 Modeling, Naive Bayes, classification performance metrics \n", "7 decision trees, cross validation \n", "8 Regression \n", "9 Clustering \n", "10 SVM, parameter tuning \n", "11 KNN, Model comparison \n", "12 Text Analysis \n", "13 Topic Modeling \n", "14 Deep Learning \n", "\n", " skills \n", "week \n", "1 process \n", "2 [access, prepare, summarize] \n", "3 [summarize, visualize] \n", "4 [prepare, summarize, visualize] \n", "5 [access, construct, summarize] \n", "6 [classification, evaluate] \n", "7 [classification, evaluate] \n", "8 [regression, evaluate] \n", "9 [clustering, evaluate] \n", "10 [optimize, tools] \n", "11 [compare, tools] \n", "12 [unstructured] \n", "13 [unstructured, tools] \n", "14 [tools, compare] " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "schedule_df.replace({None:'TBD'})\n", "schedule_df[['topics','skills']]" ] }, { "cell_type": "markdown", "id": "c180ac7b", "metadata": {}, "source": [ "(skill-rubric)=\n", "### Skill Rubric\n", "\n", "\n", "The skill rubric describes how your participation, assignments, and portfolios will be assessed to earn each achievement. The keyword for each skill is a short name that will be used to refer to skills throughout the course materials; the full description of the skill is in this table." ] }, { "cell_type": "code", "execution_count": 4, "id": "5114bd06", "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
skillLevel 1Level 2Level 3
keyword
pythonpythonic code writingpython code that mostly runs, occasional pep8 adherencepython code that reliably runs, frequent pep8 adherencereliable, efficient, pythonic code that consistently adheres to pep8
processdescribe data science as a processIdentify basic components of data scienceDescribe and define each stage of the data science processCompare different ways that data science can facilitate decision making
accessaccess data in multiple formatsload data from at least one format; identify the most common data formatsLoad data for processing from the most common formats; Compare and contrast the most common formatsaccess data from both common and uncommon formats and identify best practices for formats in different contexts
constructconstruct datasets from multiple sourcesidentify what should happen to merge datasets or when they can be mergedapply basic mergesmerge data that is not automatically aligned
summarizeSummarize and describe dataDescribe the shape and structure of a dataset in basic termscompute standard summary statistics of a whole dataset and grouped dataCompute and interpret various summary statistics of subsets of data
visualizeVisualize dataidentify plot types, generate basic plots from pandasgenerate multiple plot types with complete labeling with pandas and seaborngenerate complex plots with pandas and plotting libraries and customize with matplotlib or additional parameters
prepareprepare data for analysisidentify if data is or is not ready for analysis and potential problems with the dataapply data reshaping, cleaning, and filtering as directedapply data reshaping, cleaning, and filtering manipulations reliably and correctly by assessing data as received
classificationApply classificationidentify and describe what classification is, apply pre-fit classification modelsfit a preselected classification model to a datasetfit and apply classification models and select appropriate classification models for different contexts
regressionApply Regressionidentify what data that can be used for regression looks likecan fit linear regression modelscan fit and explain regularized or nonlinear regression
clusteringClusteringdescribe what clustering isapply basic clusteringapply multiple clustering techniques, and interpret results
evaluateEvaluate model performanceExplain basic performance metrics for different data science tasksApply basic model evaluation metrics to a held out test setEvaluate a model with multiple metrics and cross validation
optimizeOptimize model parametersIdentify when model parameters need to be optimizedManually optimize basic model parameters such as model orderSelect optimal parameters based on multiple quantitative criteria and automate parameter tuning
comparecompare modelsQualitatively compare model classesCompare model classes in specific terms and compare fitted models in terms of traditional model performance metricsEvaluate tradeoffs between different model comparison types
unstructuredmodel unstructured dataIdentify options for representing text data and use them once data is transformedApply at least one representation to transform unstructured data for model fitting or summarizingapply multiple representations and compare and contrast them for different end results
workflowuse industry standard data science tools and workflows to solve data science problemsSolve well-structured problems with a single tool pipelineSolve semi-structured, completely specified problems, apply common structure to learn new features of standard toolsScope, choose an appropriate tool pipeline and solve data science problems, describe strengths and weaknesses of common tools
\n", "
" ], "text/plain": [ " skill \\\n", "keyword \n", "python pythonic code writing \n", "process describe data science as a process \n", "access access data in multiple formats \n", "construct construct datasets from multiple sources \n", "summarize Summarize and describe data \n", "visualize Visualize data \n", "prepare prepare data for analysis \n", "classification Apply classification \n", "regression Apply Regression \n", "clustering Clustering \n", "evaluate Evaluate model performance \n", "optimize Optimize model parameters \n", "compare compare models \n", "unstructured model unstructured data \n", "workflow use industry standard data science tools and workflows to solve data science problems \n", "\n", " Level 1 \\\n", "keyword \n", "python python code that mostly runs, occasional pep8 adherance \n", "process Identify basic components of data science \n", "access load data from at least one format; identify the most common data formats \n", "construct identify what should happen to merge datasets or when they can be merged \n", "summarize Describe the shape and structure of a dataset in basic terms \n", "visualize identify plot types, generate basic plots from pandas \n", "prepare identify if data is or is not ready for analysis, potential problems with data \n", "classification identify and describe what classification is, apply pre-fit classification models \n", "regression identify what data that can be used for regression looks like \n", "clustering describe what clustering is \n", "evaluate Explain basic performance metrics for different data science tasks \n", "optimize Identify when model parameters need to be optimized \n", "compare Qualitatively compare model classes \n", "unstructured Identify options for representing text data and use them once data is tranformed \n", "workflow Solve well strucutred problems with a single tool pipeline \n", "\n", " Level 2 \\\n", "keyword \n", "python python code that reliably runs, frequent pep8 adherance \n", "process Describe and define each stage of the data science process \n", "access Load data for processing from the most common formats; Compare and constrast most common formats \n", "construct apply basic merges \n", "summarize compute summary statndard statistics of a whole dataset and grouped data \n", "visualize generate multiple plot types with complete labeling with pandas and seaborn \n", "prepare apply data reshaping, cleaning, and filtering as directed \n", "classification fit preselected classification model to a dataset \n", "regression can fit linear regression models \n", "clustering apply basic clustering \n", "evaluate Apply basic model evaluation metrics to a held out test set \n", "optimize Manually optimize basic model parameters such as model order \n", "compare Compare model classes in specific terms and fit models in terms of traditional model performance metrics \n", "unstructured Apply at least one representation to transform unstructured data for model fitting or summarizing \n", "workflow Solve semi-strucutred, completely specified problems, apply common structure to learn new features of standard tools \n", "\n", " Level 3 \n", "keyword \n", "python reliable, efficient, pythonic code that consistently adheres to pep8 \n", "process Compare different ways that data science can facilitate decision making \n", "access access data from both common and uncommon formats and identify best practices for formats in different contexts \n", "construct merge data that is not automatically aligned \n", "summarize Compute 
and interpret various summary statistics of subsets of data \n", "visualize generate complex plots with pandas and plotting libraries and customize with matplotlib or additional parameters \n", "prepare apply data reshaping, cleaning, and filtering manipulations reliably and correctly by assessing data as received \n", "classification fit and apply classification models and select appropriate classification models for different contexts \n", "regression can fit and explain regrularized or nonlinear regression \n", "clustering apply multiple clustering techniques, and interpret results \n", "evaluate Evaluate a model with multiple metrics and cross validation \n", "optimize Select optimal parameters based of mutiple quanttiateve criteria and automate parameter tuning \n", "compare Evaluate tradeoffs between different model comparison types \n", "unstructured apply multiple representations and compare and contrast them for different end results \n", "workflow Scope, choose an appropriate tool pipeline and solve data science problems, describe strengths and weakensses of common tools " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "rubric_df.replace({None:'TBD'},inplace=True)\n", "rubric_df.rename(columns={'mastery':'Level 3',\n", " 'compentent':'Level 2',\n", " 'aware':'Level 1'}, inplace=True)\n", "\n", "rubric_df[['skill','Level 1','Level 2','Level 3']]" ] }, { "cell_type": "code", "execution_count": 5, "id": "d8b480b2", "metadata": { "tags": [ "remove-input" ] }, "outputs": [], "source": [ "\n", "assignment_dummies = pd.get_dummies(rubric_df['assignments'].apply(pd.Series).stack()).sum(level=0)\n", "assignment_dummies['# Assignments'] = assignment_dummies.sum(axis=1)\n", "col_rename = {float(i):'A' + str(i) for i in range(1,14)}\n", "assignment_dummies.rename(columns =col_rename,inplace=True)\n", "\n", "portfolio_dummies = pd.get_dummies(rubric_df['portfolios'].apply(pd.Series).stack()).sum(level=0)\n", "col_rename = {float(i):'P' + str(i) for i in range(1,5)}\n", "portfolio_dummies.rename(columns =col_rename,inplace=True)\n", "\n", "\n", "rubric_df = pd.concat([rubric_df,assignment_dummies, portfolio_dummies],axis=1)\n", "\n", "assignment_cols = ['A'+ str(i) for i in range(1,14)] + ['# Assignments']\n", "\n", "portfolio_cols = [ 'Level 3'] + ['P' + str(i) for i in range(1,5)]" ] }, { "cell_type": "markdown", "id": "435f72ae", "metadata": {}, "source": [ "(assignment-skills)=\n", "### Assignments and Skills\n", "\n", "Using the keywords from the table above, this table shows which assignments you will be able to demonstrate which skills and the total number of assignments that assess each skill. This is the number of opportunities you have to earn Level 2 and still preserve 2 chances to earn Level 3 for each skill." ] }, { "cell_type": "code", "execution_count": 6, "id": "290fe58b", "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
A1A2A3A4A5A6A7A8A9A10A11A12A13# Assignments
keyword
python11110000000004
process11000000000002
access01110000000003
construct00001100000002
summarize001111111111111
visualize001101111111110
prepare00011000000002
classification00000110010003
regression00000001001002
clustering00000000101002
evaluate00000000011002
optimize00000000011002
compare00000000001012
unstructured00000000000112
workflow00000000011114
\n", "
" ], "text/plain": [ " A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 \\\n", "keyword \n", "python 1 1 1 1 0 0 0 0 0 0 0 0 0 \n", "process 1 1 0 0 0 0 0 0 0 0 0 0 0 \n", "access 0 1 1 1 0 0 0 0 0 0 0 0 0 \n", "construct 0 0 0 0 1 1 0 0 0 0 0 0 0 \n", "summarize 0 0 1 1 1 1 1 1 1 1 1 1 1 \n", "visualize 0 0 1 1 0 1 1 1 1 1 1 1 1 \n", "prepare 0 0 0 1 1 0 0 0 0 0 0 0 0 \n", "classification 0 0 0 0 0 1 1 0 0 1 0 0 0 \n", "regression 0 0 0 0 0 0 0 1 0 0 1 0 0 \n", "clustering 0 0 0 0 0 0 0 0 1 0 1 0 0 \n", "evaluate 0 0 0 0 0 0 0 0 0 1 1 0 0 \n", "optimize 0 0 0 0 0 0 0 0 0 1 1 0 0 \n", "compare 0 0 0 0 0 0 0 0 0 0 1 0 1 \n", "unstructured 0 0 0 0 0 0 0 0 0 0 0 1 1 \n", "workflow 0 0 0 0 0 0 0 0 0 1 1 1 1 \n", "\n", " # Assignments \n", "keyword \n", "python 4 \n", "process 2 \n", "access 3 \n", "construct 2 \n", "summarize 11 \n", "visualize 10 \n", "prepare 2 \n", "classification 3 \n", "regression 2 \n", "clustering 2 \n", "evaluate 2 \n", "optimize 2 \n", "compare 2 \n", "unstructured 2 \n", "workflow 4 " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rubric_df[assignment_cols]" ] }, { "cell_type": "markdown", "id": "82024c27", "metadata": {}, "source": [ "(portfolioskills)=\n", "### Portfolios and Skills\n", "\n", "The objective of your portfolio submissions is to earn Level 3 achievements. The following table shows what Level 3 looks like for each skill and identifies which portfolio submissions you can earn that Level 3 in that skill." ] }, { "cell_type": "code", "execution_count": 7, "id": "2218366e", "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Level 3P1P2P3P4
keyword
pythonreliable, efficient, pythonic code that consistently adheres to pep81100
processCompare different ways that data science can facilitate decision making0110
accessaccess data from both common and uncommon formats and identify best practices for formats in different contexts1100
constructmerge data that is not automatically aligned1100
summarizeCompute and interpret various summary statistics of subsets of data1100
visualizegenerate complex plots with pandas and plotting libraries and customize with matplotlib or additional parameters1100
prepareapply data reshaping, cleaning, and filtering manipulations reliably and correctly by assessing data as received1100
classificationfit and apply classification models and select appropriate classification models for different contexts0110
regressioncan fit and explain regularized or nonlinear regression0110
clusteringapply multiple clustering techniques, and interpret results0110
evaluateEvaluate a model with multiple metrics and cross validation0110
optimizeSelect optimal parameters based on multiple quantitative criteria and automate parameter tuning0011
compareEvaluate tradeoffs between different model comparison types0011
unstructuredapply multiple representations and compare and contrast them for different end results0011
workflowScope, choose an appropriate tool pipeline and solve data science problems, describe strengths and weaknesses of common tools0011
\n", "
" ], "text/plain": [ " Level 3 \\\n", "keyword \n", "python reliable, efficient, pythonic code that consistently adheres to pep8 \n", "process Compare different ways that data science can facilitate decision making \n", "access access data from both common and uncommon formats and identify best practices for formats in different contexts \n", "construct merge data that is not automatically aligned \n", "summarize Compute and interpret various summary statistics of subsets of data \n", "visualize generate complex plots with pandas and plotting libraries and customize with matplotlib or additional parameters \n", "prepare apply data reshaping, cleaning, and filtering manipulations reliably and correctly by assessing data as received \n", "classification fit and apply classification models and select appropriate classification models for different contexts \n", "regression can fit and explain regrularized or nonlinear regression \n", "clustering apply multiple clustering techniques, and interpret results \n", "evaluate Evaluate a model with multiple metrics and cross validation \n", "optimize Select optimal parameters based of mutiple quanttiateve criteria and automate parameter tuning \n", "compare Evaluate tradeoffs between different model comparison types \n", "unstructured apply multiple representations and compare and contrast them for different end results \n", "workflow Scope, choose an appropriate tool pipeline and solve data science problems, describe strengths and weakensses of common tools \n", "\n", " P1 P2 P3 P4 \n", "keyword \n", "python 1 1 0 0 \n", "process 0 1 1 0 \n", "access 1 1 0 0 \n", "construct 1 1 0 0 \n", "summarize 1 1 0 0 \n", "visualize 1 1 0 0 \n", "prepare 1 1 0 0 \n", "classification 0 1 1 0 \n", "regression 0 1 1 0 \n", "clustering 0 1 1 0 \n", "evaluate 0 1 1 0 \n", "optimize 0 0 1 1 \n", "compare 0 0 1 1 \n", "unstructured 0 0 1 1 \n", "workflow 0 0 1 1 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rubric_df[portfolio_cols]" ] } ], "metadata": { "jupytext": { "text_representation": { "extension": ".md", "format_name": "myst", "format_version": 0.12, "jupytext_version": "1.6.0" } }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" }, "source_map": [ 12, 16, 41, 49, 56, 76, 82, 90, 103, 122, 129, 133, 141 ] }, "nbformat": 4, "nbformat_minor": 5 }