{ "cells": [ { "cell_type": "markdown", "id": "f14e95ab", "metadata": {}, "source": [ "# Class 12: Constructing Datasets from Multiple Sources" ] }, { "cell_type": "code", "execution_count": 1, "id": "bbca4d7c", "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "id": "94f89de0", "metadata": {}, "source": [ "## Introduction\n", "Sometimes we need data from multiple sources. For example, the data may be spread across several tables that we want to put together, or we may want to build our own dataset to answer the questions we want to ask.\n", "For example, suppose we have a dataset of drivers who were pulled over by the police, and we want to compare how the stops are distributed between day and night. If we only have the time of each stop, with no day/night label, we can use a second dataset that gives the sunset time for each day and combine the two to find the answer.\n", "Later we will also learn how to get data out of databases, using a few small database operations through SQLite and some Python libraries.\n", "\n", "\n", "To use relative paths as in class (`data/2018-games.csv`) instead of a full URL (`https://raw.githubusercontent.com/rhodyprog4ds/inclass-data/main/2018-games.csv`), download the data from [this GitHub repo](https://github.com/rhodyprog4ds/inclass-data) by clicking on the green code button and choosing .zip. Then unzip the data and save it in a `data` folder in the same folder as the notebook. The class notes use the URLs so that the notebook can run without the data being stored anywhere else." 
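, "\n", "\n", "A minimal sketch of switching between the two options, assuming the files were unzipped into a local `data` folder as described above. The helper name `data_location` is hypothetical and not part of the class code; it falls back to the URL when no local copy is found.\n", "\n", "```python\n", "from pathlib import Path\n", "\n", "# Base URL used throughout these notes\n", "BASE_URL = 'https://raw.githubusercontent.com/rhodyprog4ds/inclass-data/main/'\n", "\n", "def data_location(filename, local_dir='data'):\n", "    # Hypothetical helper: prefer the local copy, otherwise use the URL\n", "    local_path = Path(local_dir) / filename\n", "    return str(local_path) if local_path.exists() else BASE_URL + filename\n", "\n", "# pd.read_csv accepts either a local path or a URL\n", "print(data_location('2018-games.csv'))\n", "```"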
] }, { "cell_type": "code", "execution_count": 2, "id": "d5646ba0", "metadata": {}, "outputs": [], "source": [ "games_df18 = pd.read_csv('https://raw.githubusercontent.com/rhodyprog4ds/inclass-data/main/2018-games.csv')\n", "games_df19 = pd.read_csv('https://raw.githubusercontent.com/rhodyprog4ds/inclass-data/main/2019-games.csv')" ] }, { "cell_type": "markdown", "id": "70effb06", "metadata": {}, "source": [ "\n", "## Stacking DataFrames\n", "Let's look at the first couple of rows of each dataframe to see how the data is laid out." ] }, { "cell_type": "code", "execution_count": 3, "id": "ac6fe4f4", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Unnamed: 0 GAME_DATE_EST GAME_ID GAME_STATUS_TEXT HOME_TEAM_ID \\\n", "0 16196 2019-06-13 41800406 Final 1610612744 \n", "1 16197 2019-06-10 41800405 Final 1610612761 \n", "\n", " VISITOR_TEAM_ID SEASON TEAM_ID_home PTS_home FG_PCT_home ... \\\n", "0 1610612761 2018 1610612744 110.0 0.488 ... \n", "1 1610612744 2018 1610612761 105.0 0.447 ... \n", "\n", " AST_home REB_home TEAM_ID_away PTS_away FG_PCT_away FT_PCT_away \\\n", "0 28.0 42.0 1610612761 114.0 0.476 0.793 \n", "1 19.0 43.0 1610612744 106.0 0.463 0.714 \n", "\n", " FG3_PCT_away AST_away REB_away HOME_TEAM_WINS \n", "0 0.394 25.0 39.0 0 \n", "1 0.476 27.0 37.0 0 \n", "\n", "[2 rows x 22 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "games_df18.head(2)" ] }, { "cell_type": "code", "execution_count": 4, "id": "47ccc58e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Unnamed: 0 GAME_DATE_EST GAME_ID GAME_STATUS_TEXT HOME_TEAM_ID \\\n", "0 0 2020-03-01 21900895 Final 1610612766 \n", "1 1 2020-03-01 21900896 Final 1610612750 \n", "\n", " VISITOR_TEAM_ID SEASON TEAM_ID_home PTS_home FG_PCT_home ... \\\n", "0 1610612749 2019 1610612766 85.0 0.354 ... \n", "1 1610612742 2019 1610612750 91.0 0.364 ... \n", "\n", " AST_home REB_home TEAM_ID_away PTS_away FG_PCT_away FT_PCT_away \\\n", "0 22.0 47.0 1610612749 93.0 0.402 0.762 \n", "1 19.0 57.0 1610612742 111.0 0.468 0.632 \n", "\n", " FG3_PCT_away AST_away REB_away HOME_TEAM_WINS \n", "0 0.226 20.0 61.0 0 \n", "1 0.275 28.0 56.0 0 \n", "\n", "[2 rows x 22 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "games_df19.head(2)" ] }, { "cell_type": "markdown", "id": "f0d22345", "metadata": {}, "source": [ "As we can see, both datasets have the same columns, and just for two different years." ] }, { "cell_type": "code", "execution_count": 5, "id": "f4801a18", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1378, 22)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "games_df18.shape" ] }, { "cell_type": "code", "execution_count": 6, "id": "a9a126b0", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(965, 22)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "games_df19.shape" ] }, { "cell_type": "markdown", "id": "5b1dbc92", "metadata": {}, "source": [ "Let's concatenate two dataframes to make one dataframe out of them." 
] }, { "cell_type": "code", "execution_count": 7, "id": "a758bdb7", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2343, 22)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "games_df = pd.concat([games_df18, games_df19])\n", "games_df.shape" ] }, { "cell_type": "markdown", "id": "0bfc9139", "metadata": {}, "source": [ "As we can see, the row counts add up (1378 + 965 = 2343), but the number of columns stays the same." ] }, { "cell_type": "code", "execution_count": 8, "id": "c1e2463d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['Unnamed: 0', 'GAME_DATE_EST', 'GAME_ID', 'GAME_STATUS_TEXT',\n", " 'HOME_TEAM_ID', 'VISITOR_TEAM_ID', 'SEASON', 'TEAM_ID_home', 'PTS_home',\n", " 'FG_PCT_home', 'FT_PCT_home', 'FG3_PCT_home', 'AST_home', 'REB_home',\n", " 'TEAM_ID_away', 'PTS_away', 'FG_PCT_away', 'FT_PCT_away',\n", " 'FG3_PCT_away', 'AST_away', 'REB_away', 'HOME_TEAM_WINS'],\n", " dtype='object')" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "games_df.columns" ] }, { "cell_type": "markdown", "id": "8df40c57", "metadata": {}, "source": [ "Let's drop the `Unnamed: 0` column, which we do not need." ] }, { "cell_type": "code", "execution_count": 9, "id": "872952bb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " GAME_DATE_EST GAME_ID GAME_STATUS_TEXT HOME_TEAM_ID VISITOR_TEAM_ID \\\n", "0 2019-06-13 41800406 Final 1610612744 1610612761 \n", "1 2019-06-10 41800405 Final 1610612761 1610612744 \n", "\n", " SEASON TEAM_ID_home PTS_home FG_PCT_home FT_PCT_home ... AST_home \\\n", "0 2018 1610612744 110.0 0.488 0.700 ... 28.0 \n", "1 2018 1610612761 105.0 0.447 0.778 ... 19.0 \n", "\n", " REB_home TEAM_ID_away PTS_away FG_PCT_away FT_PCT_away FG3_PCT_away \\\n", "0 42.0 1610612761 114.0 0.476 0.793 0.394 \n", "1 43.0 1610612744 106.0 0.463 0.714 0.476 \n", "\n", " AST_away REB_away HOME_TEAM_WINS \n", "0 25.0 39.0 0 \n", "1 27.0 37.0 0 \n", "\n", "[2 rows x 21 columns]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "games_df.drop(columns= 'Unnamed: 0',inplace=True,)\n", "games_df.head(2)" ] }, { "cell_type": "markdown", "id": "8e243ed3", "metadata": {}, "source": [ "\n", "## Merging Data Frames\n", "Now we read another dataset which some of its columns are the same as dataframe \"games_df\" and some are different." ] }, { "cell_type": "code", "execution_count": 10, "id": "2df04016", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " LEAGUE_ID TEAM_ID MIN_YEAR MAX_YEAR ABBREVIATION NICKNAME \\\n", "0 0 1610612737 1949 2019 ATL Hawks \n", "1 0 1610612738 1946 2019 BOS Celtics \n", "\n", " YEARFOUNDED CITY ARENA ARENACAPACITY OWNER \\\n", "0 1949 Atlanta State Farm Arena 18729.0 Tony Ressler \n", "1 1946 Boston TD Garden 18624.0 Wyc Grousbeck \n", "\n", " GENERALMANAGER HEADCOACH DLEAGUEAFFILIATION \n", "0 Travis Schlenk Lloyd Pierce Erie Bayhawks \n", "1 Danny Ainge Brad Stevens Maine Red Claws " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "teams_df = pd.read_csv('https://raw.githubusercontent.com/rhodyprog4ds/inclass-data/main/teams.csv')\n", "teams_df.head(2)" ] }, { "cell_type": "markdown", "id": "bdaefef6", "metadata": {}, "source": [ "We use left_on='TEAM_ID' and right_on = 'HOME_TEAM_ID' to match two dataframes." ] }, { "cell_type": "code", "execution_count": 11, "id": "f0d13ec3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " LEAGUE_ID TEAM_ID MIN_YEAR MAX_YEAR ABBREVIATION NICKNAME \\\n", "0 0 1610612737 1949 2019 ATL Hawks \n", "1 0 1610612737 1949 2019 ATL Hawks \n", "\n", " YEARFOUNDED CITY ARENA ARENACAPACITY ... AST_home \\\n", "0 1949 Atlanta State Farm Arena 18729.0 ... 29.0 \n", "1 1949 Atlanta State Farm Arena 18729.0 ... 29.0 \n", "\n", " REB_home TEAM_ID_away PTS_away FG_PCT_away FT_PCT_away FG3_PCT_away \\\n", "0 61.0 1610612754 135.0 0.459 0.846 0.400 \n", "1 44.0 1610612755 122.0 0.459 0.579 0.323 \n", "\n", " AST_away REB_away HOME_TEAM_WINS \n", "0 22.0 43.0 0 \n", "1 27.0 57.0 1 \n", "\n", "[2 rows x 35 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "merge1_df = pd.merge(teams_df,games_df,left_on='TEAM_ID', right_on = 'HOME_TEAM_ID')\n", "merge1_df.head(2)" ] }, { "cell_type": "markdown", "id": "793c5964", "metadata": {}, "source": [ "We want information for each game and append the team info onto that. So, lets try another settings in our mergging." ] }, { "cell_type": "code", "execution_count": 12, "id": "b53b7b7f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2343, 35)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "merge1_df.shape" ] }, { "cell_type": "code", "execution_count": 13, "id": "3bf49b69", "metadata": {}, "outputs": [], "source": [ "merge2_df = pd.merge(teams_df, games_df,left_on='TEAM_ID', right_on = 'HOME_TEAM_ID', how='outer')" ] }, { "cell_type": "code", "execution_count": 14, "id": "2986c10c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " LEAGUE_ID TEAM_ID MIN_YEAR MAX_YEAR ABBREVIATION NICKNAME \\\n", "0 0 1610612737 1949 2019 ATL Hawks \n", "1 0 1610612737 1949 2019 ATL Hawks \n", "\n", " YEARFOUNDED CITY ARENA ARENACAPACITY ... AST_home \\\n", "0 1949 Atlanta State Farm Arena 18729.0 ... 29.0 \n", "1 1949 Atlanta State Farm Arena 18729.0 ... 29.0 \n", "\n", " REB_home TEAM_ID_away PTS_away FG_PCT_away FT_PCT_away FG3_PCT_away \\\n", "0 61.0 1610612754 135.0 0.459 0.846 0.400 \n", "1 44.0 1610612755 122.0 0.459 0.579 0.323 \n", "\n", " AST_away REB_away HOME_TEAM_WINS \n", "0 22.0 43.0 0 \n", "1 27.0 57.0 1 \n", "\n", "[2 rows x 35 columns]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "merge2_df.head(2)" ] }, { "cell_type": "code", "execution_count": 15, "id": "8217507f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2343, 35)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "merge2_df.shape" ] }, { "cell_type": "markdown", "id": "fc6b7830", "metadata": {}, "source": [ "We can group by \"ARENA\" and then look at the \"mean\" statistics." ] }, { "cell_type": "code", "execution_count": 16, "id": "2cf706c7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " LEAGUE_ID TEAM_ID MIN_YEAR MAX_YEAR \\\n", "ARENA \n", "AT&T Center 0.0 1.610613e+09 1976.000000 2019.0 \n", "American Airlines Center 0.0 1.610613e+09 1980.000000 2019.0 \n", "AmericanAirlines Arena 0.0 1.610613e+09 1988.000000 2019.0 \n", "Amway Center 0.0 1.610613e+09 1989.000000 2019.0 \n", "Bankers Life Fieldhouse 0.0 1.610613e+09 1976.000000 2019.0 \n", "Barclays Center 0.0 1.610613e+09 1976.000000 2019.0 \n", "Capital One Arena 0.0 1.610613e+09 1961.000000 2019.0 \n", "Chase Center 0.0 1.610613e+09 1946.000000 2019.0 \n", "Chesapeake Energy Arena 0.0 1.610613e+09 1967.000000 2019.0 \n", "FedExForum 0.0 1.610613e+09 1995.000000 2019.0 \n", "Fiserv Forum 0.0 1.610613e+09 1968.000000 2019.0 \n", "Golden 1 Center 0.0 1.610613e+09 1948.000000 2019.0 \n", "Little Caesars Arena 0.0 1.610613e+09 1948.000000 2019.0 \n", "Madison Square Garden 0.0 1.610613e+09 1946.000000 2019.0 \n", "Moda Center 0.0 1.610613e+09 1970.000000 2019.0 \n", "Pepsi Center 0.0 1.610613e+09 1976.000000 2019.0 \n", "Quicken Loans Arena 0.0 1.610613e+09 1970.000000 2019.0 \n", "Scotiabank Arena 0.0 1.610613e+09 1995.000000 2019.0 \n", "Smoothie King Center 0.0 1.610613e+09 2002.000000 2019.0 \n", "Spectrum Center 0.0 1.610613e+09 1988.000000 2019.0 \n", "Staples Center 0.0 1.610613e+09 1959.210191 2019.0 \n", "State Farm Arena 0.0 1.610613e+09 1949.000000 2019.0 \n", "TD Garden 0.0 1.610613e+09 1946.000000 2019.0 \n", "Talking Stick Resort Arena 0.0 1.610613e+09 1968.000000 2019.0 \n", "Target Center 0.0 1.610613e+09 1989.000000 2019.0 \n", "Toyota Center 0.0 1.610613e+09 1967.000000 2019.0 \n", "United Center 0.0 1.610613e+09 1966.000000 2019.0 \n", "Vivint Smart Home Arena 0.0 1.610613e+09 1974.000000 2019.0 \n", "Wells Fargo Center 0.0 1.610613e+09 1949.000000 2019.0 \n", "\n", " YEARFOUNDED ARENACAPACITY GAME_ID \\\n", "ARENA \n", "AT&T Center 1976.000000 18694.0 2.184021e+07 \n", "American Airlines Center 1980.000000 19200.0 2.130985e+07 \n", "AmericanAirlines Arena 
1988.000000 19600.0 2.105314e+07 \n", "Amway Center 1989.000000 0.0 2.171458e+07 \n", "Bankers Life Fieldhouse 1976.000000 18345.0 2.197454e+07 \n", "Barclays Center 1976.000000 NaN 2.197516e+07 \n", "Capital One Arena 1961.000000 20647.0 2.130190e+07 \n", "Chase Center 1946.000000 19596.0 2.350492e+07 \n", "Chesapeake Energy Arena 1967.000000 19163.0 2.171802e+07 \n", "FedExForum 1995.000000 18119.0 2.131630e+07 \n", "Fiserv Forum 1968.000000 17500.0 2.328482e+07 \n", "Golden 1 Center 1948.000000 17500.0 2.142553e+07 \n", "Little Caesars Arena 1948.000000 21000.0 2.171695e+07 \n", "Madison Square Garden 1946.000000 19763.0 2.105316e+07 \n", "Moda Center 1970.000000 19980.0 2.316313e+07 \n", "Pepsi Center 1976.000000 19099.0 2.369186e+07 \n", "Quicken Loans Arena 1970.000000 20562.0 2.132390e+07 \n", "Scotiabank Arena 1995.000000 19800.0 2.428270e+07 \n", "Smoothie King Center 2002.000000 NaN 2.156903e+07 \n", "Spectrum Center 1988.000000 19026.0 2.104182e+07 \n", "Staples Center 1959.210191 19060.0 2.139604e+07 \n", "State Farm Arena 1949.000000 18729.0 2.131766e+07 \n", "TD Garden 1946.000000 18624.0 2.235308e+07 \n", "Talking Stick Resort Arena 1968.000000 NaN 2.132518e+07 \n", "Target Center 1989.000000 19356.0 2.156304e+07 \n", "Toyota Center 1967.000000 18104.0 2.283925e+07 \n", "United Center 1966.000000 21711.0 2.107486e+07 \n", "Vivint Smart Home Arena 1974.000000 20148.0 2.197424e+07 \n", "Wells Fargo Center 1949.000000 NaN 2.282764e+07 \n", "\n", " HOME_TEAM_ID VISITOR_TEAM_ID SEASON ... \\\n", "ARENA ... \n", "AT&T Center 1.610613e+09 1.610613e+09 2018.397436 ... \n", "American Airlines Center 1.610613e+09 1.610613e+09 2018.426667 ... \n", "AmericanAirlines Arena 1.610613e+09 1.610613e+09 2018.421053 ... \n", "Amway Center 1.610613e+09 1.610613e+09 2018.423077 ... \n", "Bankers Life Fieldhouse 1.610613e+09 1.610613e+09 2018.441558 ... \n", "Barclays Center 1.610613e+09 1.610613e+09 2018.413333 ... 
\n", "Capital One Arena 1.610613e+09 1.610613e+09 2018.418919 ... \n", "Chase Center 1.610613e+09 1.610613e+09 2018.377778 ... \n", "Chesapeake Energy Arena 1.610613e+09 1.610613e+09 2018.425000 ... \n", "FedExForum 1.610613e+09 1.610613e+09 2018.421053 ... \n", "Fiserv Forum 1.610613e+09 1.610613e+09 2018.385542 ... \n", "Golden 1 Center 1.610613e+09 1.610613e+09 2018.416667 ... \n", "Little Caesars Arena 1.610613e+09 1.610613e+09 2018.430380 ... \n", "Madison Square Garden 1.610613e+09 1.610613e+09 2018.421053 ... \n", "Moda Center 1.610613e+09 1.610613e+09 2018.373494 ... \n", "Pepsi Center 1.610613e+09 1.610613e+09 2018.395062 ... \n", "Quicken Loans Arena 1.610613e+09 1.610613e+09 2018.428571 ... \n", "Scotiabank Arena 1.610613e+09 1.610613e+09 2018.377778 ... \n", "Smoothie King Center 1.610613e+09 1.610613e+09 2018.424658 ... \n", "Spectrum Center 1.610613e+09 1.610613e+09 2018.413333 ... \n", "Staples Center 1.610613e+09 1.610613e+09 2018.414013 ... \n", "State Farm Arena 1.610613e+09 1.610613e+09 2018.434211 ... \n", "TD Garden 1.610613e+09 1.610613e+09 2018.397436 ... \n", "Talking Stick Resort Arena 1.610613e+09 1.610613e+09 2018.441558 ... \n", "Target Center 1.610613e+09 1.610613e+09 2018.402778 ... \n", "Toyota Center 1.610613e+09 1.610613e+09 2018.387500 ... \n", "United Center 1.610613e+09 1.610613e+09 2018.435897 ... \n", "Vivint Smart Home Arena 1.610613e+09 1.610613e+09 2018.421053 ... \n", "Wells Fargo Center 1.610613e+09 1.610613e+09 2018.395062 ... 
\n", "\n", " AST_home REB_home TEAM_ID_away PTS_away \\\n", "ARENA \n", "AT&T Center 25.141026 45.782051 1.610613e+09 108.448718 \n", "American Airlines Center 23.920000 46.093333 1.610613e+09 108.960000 \n", "AmericanAirlines Arena 25.039474 46.276316 1.610613e+09 106.618421 \n", "Amway Center 24.525641 45.935897 1.610613e+09 105.538462 \n", "Bankers Life Fieldhouse 26.363636 44.597403 1.610613e+09 103.051948 \n", "Barclays Center 23.880000 48.000000 1.610613e+09 110.293333 \n", "Capital One Arena 25.891892 43.932432 1.610613e+09 115.310811 \n", "Chase Center 28.444444 44.722222 1.610613e+09 112.833333 \n", "Chesapeake Energy Arena 23.450000 45.387500 1.610613e+09 109.250000 \n", "FedExForum 25.394737 44.578947 1.610613e+09 108.236842 \n", "Fiserv Forum 26.819277 51.590361 1.610613e+09 106.421687 \n", "Golden 1 Center 24.125000 45.208333 1.610613e+09 112.361111 \n", "Little Caesars Arena 23.974684 44.670886 1.610613e+09 108.797468 \n", "Madison Square Garden 21.500000 45.447368 1.610613e+09 112.105263 \n", "Moda Center 22.361446 47.831325 1.610613e+09 112.060241 \n", "Pepsi Center 27.419753 46.419753 1.610613e+09 103.777778 \n", "Quicken Loans Arena 22.038961 44.025974 1.610613e+09 113.584416 \n", "Scotiabank Arena 25.433333 45.100000 1.610613e+09 106.144444 \n", "Smoothie King Center 27.232877 46.945205 1.610613e+09 116.808219 \n", "Spectrum Center 23.746667 44.293333 1.610613e+09 109.386667 \n", "Staples Center 25.942675 47.286624 1.610613e+09 110.477707 \n", "State Farm Arena 25.736842 45.960526 1.610613e+09 117.894737 \n", "TD Garden 25.717949 45.974359 1.610613e+09 106.358974 \n", "Talking Stick Resort Arena 25.298701 41.519481 1.610613e+09 113.064935 \n", "Target Center 24.458333 46.583333 1.610613e+09 113.208333 \n", "Toyota Center 21.712500 44.850000 1.610613e+09 110.275000 \n", "United Center 22.987179 43.512821 1.610613e+09 110.705128 \n", "Vivint Smart Home Arena 24.526316 47.592105 1.610613e+09 105.723684 \n", "Wells Fargo Center 27.395062 47.617284 
1.610613e+09 106.753086 \n", "\n", " FG_PCT_away FT_PCT_away FG3_PCT_away AST_away \\\n", "ARENA \n", "AT&T Center 0.449333 0.770808 0.348231 23.987179 \n", "American Airlines Center 0.456093 0.777320 0.347707 22.960000 \n", "AmericanAirlines Arena 0.442684 0.771987 0.344250 23.157895 \n", "Amway Center 0.454372 0.772487 0.359962 22.987179 \n", "Bankers Life Fieldhouse 0.435494 0.761805 0.329442 23.272727 \n", "Barclays Center 0.448133 0.773533 0.338547 22.640000 \n", "Capital One Arena 0.472041 0.765959 0.353324 24.554054 \n", "Chase Center 0.461144 0.782122 0.374233 25.377778 \n", "Chesapeake Energy Arena 0.455088 0.772437 0.351600 23.175000 \n", "FedExForum 0.445092 0.786355 0.362816 23.473684 \n", "Fiserv Forum 0.417289 0.758819 0.347325 23.253012 \n", "Golden 1 Center 0.459417 0.772444 0.349514 23.805556 \n", "Little Caesars Arena 0.469127 0.767203 0.348127 24.151899 \n", "Madison Square Garden 0.463842 0.756461 0.365237 24.263158 \n", "Moda Center 0.448145 0.793880 0.371578 22.939759 \n", "Pepsi Center 0.448531 0.742173 0.323988 23.320988 \n", "Quicken Loans Arena 0.489766 0.776896 0.384247 25.584416 \n", "Scotiabank Arena 0.432111 0.770211 0.347544 24.155556 \n", "Smoothie King Center 0.474096 0.771562 0.362822 25.109589 \n", "Spectrum Center 0.463920 0.761573 0.350853 25.066667 \n", "Staples Center 0.443783 0.775064 0.338796 24.095541 \n", "State Farm Arena 0.463447 0.767184 0.343553 25.500000 \n", "TD Garden 0.442769 0.743064 0.335808 23.282051 \n", "Talking Stick Resort Arena 0.474922 0.762377 0.363143 24.285714 \n", "Target Center 0.462306 0.763181 0.363625 24.402778 \n", "Toyota Center 0.454438 0.759188 0.339250 24.200000 \n", "United Center 0.475372 0.750628 0.355808 25.602564 \n", "Vivint Smart Home Arena 0.443618 0.767803 0.351961 20.539474 \n", "Wells Fargo Center 0.445346 0.757247 0.343272 22.086420 \n", "\n", " REB_away HOME_TEAM_WINS \n", "ARENA \n", "AT&T Center 43.833333 0.666667 \n", "American Airlines Center 43.720000 0.546667 \n", 
"AmericanAirlines Arena 42.131579 0.644737 \n", "Amway Center 44.769231 0.538462 \n", "Bankers Life Fieldhouse 44.662338 0.675325 \n", "Barclays Center 45.000000 0.533333 \n", "Capital One Arena 46.121622 0.500000 \n", "Chase Center 43.900000 0.511111 \n", "Chesapeake Energy Arena 44.700000 0.637500 \n", "FedExForum 44.986842 0.526316 \n", "Fiserv Forum 44.385542 0.843373 \n", "Golden 1 Center 46.708333 0.527778 \n", "Little Caesars Arena 43.417722 0.493671 \n", "Madison Square Garden 45.263158 0.250000 \n", "Moda Center 44.048193 0.662651 \n", "Pepsi Center 42.172840 0.790123 \n", "Quicken Loans Arena 42.597403 0.298701 \n", "Scotiabank Arena 45.300000 0.744444 \n", "Smoothie King Center 44.561644 0.438356 \n", "Spectrum Center 45.266667 0.493333 \n", "Staples Center 45.133758 0.636943 \n", "State Farm Arena 46.276316 0.421053 \n", "TD Garden 44.153846 0.717949 \n", "Talking Stick Resort Arena 45.000000 0.311688 \n", "Target Center 46.500000 0.444444 \n", "Toyota Center 45.700000 0.737500 \n", "United Center 46.192308 0.307692 \n", "Vivint Smart Home Arena 42.276316 0.684211 \n", "Wells Fargo Center 41.160494 0.802469 \n", "\n", "[29 rows x 25 columns]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "merge1_df.groupby('ARENA').mean()" ] }, { "cell_type": "markdown", "id": "742c3e08", "metadata": {}, "source": [ "## How to combine the same data for different outcomes\n", "\n", "We can combine datasets in different ways to learn different things about the data.\n", "Let's now read info about the players." ] }, { "cell_type": "code", "execution_count": 17, "id": "20436050", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0PLAYER_NAMETEAM_IDPLAYER_IDSEASON
0626Kawhi Leonard16106127612026952018
1627Pascal Siakam161061276116277832018
2628Marc Gasol16106127612011882018
3629Danny Green16106127612019802018
4630Kyle Lowry16106127612007682018
\n", "
" ], "text/plain": [ " Unnamed: 0 PLAYER_NAME TEAM_ID PLAYER_ID SEASON\n", "0 626 Kawhi Leonard 1610612761 202695 2018\n", "1 627 Pascal Siakam 1610612761 1627783 2018\n", "2 628 Marc Gasol 1610612761 201188 2018\n", "3 629 Danny Green 1610612761 201980 2018\n", "4 630 Kyle Lowry 1610612761 200768 2018" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "players18 = pd.read_csv('https://raw.githubusercontent.com/rhodyprog4ds/inclass-data/main/2018-players.csv')\n", "players19 = pd.read_csv('https://raw.githubusercontent.com/rhodyprog4ds/inclass-data/main/2019-players.csv')\n", "players18.head()" ] }, { "cell_type": "markdown", "id": "f1cd8f53", "metadata": {}, "source": [ "First let's look at the shape of each of them to have a reference for what happens when we try different merges." ] }, { "cell_type": "code", "execution_count": 18, "id": "aef18b85", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((748, 5), (626, 5))" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "players18.shape, players19.shape" ] }, { "cell_type": "markdown", "id": "9cbed9bc", "metadata": {}, "source": [ "One thing we might want to do is to put all of the information into one long DataFrame. We can do this with `concat`." ] }, { "cell_type": "code", "execution_count": 19, "id": "01e95e00", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0PLAYER_NAMETEAM_IDPLAYER_IDSEASON
0626Kawhi Leonard16106127612026952018
1627Pascal Siakam161061276116277832018
2628Marc Gasol16106127612011882018
3629Danny Green16106127612019802018
4630Kyle Lowry16106127612007682018
..................
621621Anthony Bennett16106127452034612019
622622Ray Spalding161061273716290342019
623623Devyn Marble16106127442039062019
624624Hassani Gravett161061275316297552019
625625JaKeenan Gant161061275416297212019
\n", "

1374 rows × 5 columns

\n", "
" ], "text/plain": [ " Unnamed: 0 PLAYER_NAME TEAM_ID PLAYER_ID SEASON\n", "0 626 Kawhi Leonard 1610612761 202695 2018\n", "1 627 Pascal Siakam 1610612761 1627783 2018\n", "2 628 Marc Gasol 1610612761 201188 2018\n", "3 629 Danny Green 1610612761 201980 2018\n", "4 630 Kyle Lowry 1610612761 200768 2018\n", ".. ... ... ... ... ...\n", "621 621 Anthony Bennett 1610612745 203461 2019\n", "622 622 Ray Spalding 1610612737 1629034 2019\n", "623 623 Devyn Marble 1610612744 203906 2019\n", "624 624 Hassani Gravett 1610612753 1629755 2019\n", "625 625 JaKeenan Gant 1610612754 1629721 2019\n", "\n", "[1374 rows x 5 columns]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.concat([players18,players19])" ] }, { "cell_type": "markdown", "id": "7357d24d", "metadata": {}, "source": [ "This lets us see all of the players for each year. For example, we could use `groupby` and `count` to see how many players played each year.\n", "\n", "We can check that this is the size we expected: 748 + 626 = 1374 rows." ] }, { "cell_type": "code", "execution_count": 20, "id": "9c658050", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1374, 5)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.concat([players18,players19]).shape" ] }, { "cell_type": "markdown", "id": "bf57de1e", "metadata": {}, "source": [ "With the default merge settings we get an empty result. The two DataFrames have the same columns, so pandas tries to merge `on` all of them, but no row has the same value in every column, so nothing is left." ] }, { "cell_type": "code", "execution_count": 21, "id": "05ea2718", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0PLAYER_NAMETEAM_IDPLAYER_IDSEASON
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [Unnamed: 0, PLAYER_NAME, TEAM_ID, PLAYER_ID, SEASON]\n", "Index: []" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.merge(players18,players19)" ] }, { "cell_type": "markdown", "id": "3b591650", "metadata": {}, "source": [ "If we merge with `on='PLAYER_ID'`, only that one column has to match for rows from the two DataFrames to be paired. With the default value for `how`, or with `how='inner'` set explicitly, we get the info for players who played in both seasons." ] }, { "cell_type": "code", "execution_count": 22, "id": "264de540", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0_xPLAYER_NAME_xTEAM_ID_xPLAYER_IDSEASON_xUnnamed: 0_yPLAYER_NAME_yTEAM_ID_ySEASON_y
0626Kawhi Leonard16106127612026952018299Kawhi Leonard16106127462019
1627Pascal Siakam161061276116277832018275Pascal Siakam16106127612019
2628Marc Gasol16106127612011882018276Marc Gasol16106127612019
31157Marc Gasol16106127632011882018276Marc Gasol16106127612019
4629Danny Green16106127612019802018202Danny Green16106127472019
..............................
5331339Abdul Gaddy16106127602035832018561Abdul Gaddy16106127602019
5341342Andre Roberson16106127602034602018566Andre Roberson16106127602019
5351348Norvel Pelle1610612755203658201824Norvel Pelle16106127552019
5361350Denzel Valentine16106127411627756201858Denzel Valentine16106127412019
5371353C.J. Wilcox16106127542039122018573C.J. Wilcox16106127542019
\n", "

538 rows × 9 columns

\n", "
" ], "text/plain": [ " Unnamed: 0_x PLAYER_NAME_x TEAM_ID_x PLAYER_ID SEASON_x \\\n", "0 626 Kawhi Leonard 1610612761 202695 2018 \n", "1 627 Pascal Siakam 1610612761 1627783 2018 \n", "2 628 Marc Gasol 1610612761 201188 2018 \n", "3 1157 Marc Gasol 1610612763 201188 2018 \n", "4 629 Danny Green 1610612761 201980 2018 \n", ".. ... ... ... ... ... \n", "533 1339 Abdul Gaddy 1610612760 203583 2018 \n", "534 1342 Andre Roberson 1610612760 203460 2018 \n", "535 1348 Norvel Pelle 1610612755 203658 2018 \n", "536 1350 Denzel Valentine 1610612741 1627756 2018 \n", "537 1353 C.J. Wilcox 1610612754 203912 2018 \n", "\n", " Unnamed: 0_y PLAYER_NAME_y TEAM_ID_y SEASON_y \n", "0 299 Kawhi Leonard 1610612746 2019 \n", "1 275 Pascal Siakam 1610612761 2019 \n", "2 276 Marc Gasol 1610612761 2019 \n", "3 276 Marc Gasol 1610612761 2019 \n", "4 202 Danny Green 1610612747 2019 \n", ".. ... ... ... ... \n", "533 561 Abdul Gaddy 1610612760 2019 \n", "534 566 Andre Roberson 1610612760 2019 \n", "535 24 Norvel Pelle 1610612755 2019 \n", "536 58 Denzel Valentine 1610612741 2019 \n", "537 573 C.J. Wilcox 1610612754 2019 \n", "\n", "[538 rows x 9 columns]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.merge(players18,players19,on='PLAYER_ID', how='inner')" ] }, { "cell_type": "markdown", "id": "13c0b233", "metadata": {}, "source": [ "With `how='outer'` we get every player who played in either season or in both. A player listed with more than one team in a season, such as Marc Gasol in 2018, appears on more than one row. From this we can see, for example, who changed teams, who the rookies were in 2019, and who retired or was unsigned in 2019." ] }, { "cell_type": "code", "execution_count": 23, "id": "8b3bd2b6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0_xPLAYER_NAME_xTEAM_ID_xPLAYER_IDSEASON_xUnnamed: 0_yPLAYER_NAME_yTEAM_ID_ySEASON_y
0626.0Kawhi Leonard1.610613e+092026952018.0299.0Kawhi Leonard1.610613e+092019.0
1627.0Pascal Siakam1.610613e+0916277832018.0275.0Pascal Siakam1.610613e+092019.0
2628.0Marc Gasol1.610613e+092011882018.0276.0Marc Gasol1.610613e+092019.0
31157.0Marc Gasol1.610613e+092011882018.0276.0Marc Gasol1.610613e+092019.0
4629.0Danny Green1.610613e+092019802018.0202.0Danny Green1.610613e+092019.0
..............................
922NaNNaNNaN1629097NaN619.0Terry Larrier1.610613e+092019.0
923NaNNaNNaN203461NaN621.0Anthony Bennett1.610613e+092019.0
924NaNNaNNaN203906NaN623.0Devyn Marble1.610613e+092019.0
925NaNNaNNaN1629755NaN624.0Hassani Gravett1.610613e+092019.0
926NaNNaNNaN1629721NaN625.0JaKeenan Gant1.610613e+092019.0
\n", "

927 rows × 9 columns

\n", "
" ], "text/plain": [ " Unnamed: 0_x PLAYER_NAME_x TEAM_ID_x PLAYER_ID SEASON_x \\\n", "0 626.0 Kawhi Leonard 1.610613e+09 202695 2018.0 \n", "1 627.0 Pascal Siakam 1.610613e+09 1627783 2018.0 \n", "2 628.0 Marc Gasol 1.610613e+09 201188 2018.0 \n", "3 1157.0 Marc Gasol 1.610613e+09 201188 2018.0 \n", "4 629.0 Danny Green 1.610613e+09 201980 2018.0 \n", ".. ... ... ... ... ... \n", "922 NaN NaN NaN 1629097 NaN \n", "923 NaN NaN NaN 203461 NaN \n", "924 NaN NaN NaN 203906 NaN \n", "925 NaN NaN NaN 1629755 NaN \n", "926 NaN NaN NaN 1629721 NaN \n", "\n", " Unnamed: 0_y PLAYER_NAME_y TEAM_ID_y SEASON_y \n", "0 299.0 Kawhi Leonard 1.610613e+09 2019.0 \n", "1 275.0 Pascal Siakam 1.610613e+09 2019.0 \n", "2 276.0 Marc Gasol 1.610613e+09 2019.0 \n", "3 276.0 Marc Gasol 1.610613e+09 2019.0 \n", "4 202.0 Danny Green 1.610613e+09 2019.0 \n", ".. ... ... ... ... \n", "922 619.0 Terry Larrier 1.610613e+09 2019.0 \n", "923 621.0 Anthony Bennett 1.610613e+09 2019.0 \n", "924 623.0 Devyn Marble 1.610613e+09 2019.0 \n", "925 624.0 Hassani Gravett 1.610613e+09 2019.0 \n", "926 625.0 JaKeenan Gant 1.610613e+09 2019.0 \n", "\n", "[927 rows x 9 columns]" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.merge(players18,players19,on='PLAYER_ID', how='outer')" ] }, { "cell_type": "markdown", "id": "4ef712ed", "metadata": {}, "source": [ "Using `left` gives us every `PLAYER_ID` that is in the left (`players18`) DataFrame, including those that are in both DataFrames.\n", "`right` would give the players in the `players19` DataFrame, including those that are in both DataFrames. With the `left` result, we can see who retired, but not the 2019 rookies." ] }, { "cell_type": "code", "execution_count": 24, "id": "9e47539c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0_xPLAYER_NAME_xTEAM_ID_xPLAYER_IDSEASON_xUnnamed: 0_yPLAYER_NAME_yTEAM_ID_ySEASON_y
0626Kawhi Leonard16106127612026952018299.0Kawhi Leonard1.610613e+092019.0
1627Pascal Siakam161061276116277832018275.0Pascal Siakam1.610613e+092019.0
2628Marc Gasol16106127612011882018276.0Marc Gasol1.610613e+092019.0
3629Danny Green16106127612019802018202.0Danny Green1.610613e+092019.0
4630Kyle Lowry16106127612007682018451.0Kyle Lowry1.610613e+092019.0
..............................
7491369Tyrius Walker161061275216292462018NaNNaNNaNNaN
7501370Marcus Lee161061274816291592018NaNNaNNaNNaN
7511371Trey Lewis161061276216291632018NaNNaNNaNNaN
7521372Emanuel Terry161061274316291502018NaNNaNNaNNaN
7531373Justin Bibbs161061273816291672018NaNNaNNaNNaN
\n", "

754 rows × 9 columns

\n", "
" ], "text/plain": [ " Unnamed: 0_x PLAYER_NAME_x TEAM_ID_x PLAYER_ID SEASON_x \\\n", "0 626 Kawhi Leonard 1610612761 202695 2018 \n", "1 627 Pascal Siakam 1610612761 1627783 2018 \n", "2 628 Marc Gasol 1610612761 201188 2018 \n", "3 629 Danny Green 1610612761 201980 2018 \n", "4 630 Kyle Lowry 1610612761 200768 2018 \n", ".. ... ... ... ... ... \n", "749 1369 Tyrius Walker 1610612752 1629246 2018 \n", "750 1370 Marcus Lee 1610612748 1629159 2018 \n", "751 1371 Trey Lewis 1610612762 1629163 2018 \n", "752 1372 Emanuel Terry 1610612743 1629150 2018 \n", "753 1373 Justin Bibbs 1610612738 1629167 2018 \n", "\n", " Unnamed: 0_y PLAYER_NAME_y TEAM_ID_y SEASON_y \n", "0 299.0 Kawhi Leonard 1.610613e+09 2019.0 \n", "1 275.0 Pascal Siakam 1.610613e+09 2019.0 \n", "2 276.0 Marc Gasol 1.610613e+09 2019.0 \n", "3 202.0 Danny Green 1.610613e+09 2019.0 \n", "4 451.0 Kyle Lowry 1.610613e+09 2019.0 \n", ".. ... ... ... ... \n", "749 NaN NaN NaN NaN \n", "750 NaN NaN NaN NaN \n", "751 NaN NaN NaN NaN \n", "752 NaN NaN NaN NaN \n", "753 NaN NaN NaN NaN \n", "\n", "[754 rows x 9 columns]" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.merge(players18,players19,on='PLAYER_ID', how='left')" ] }, { "cell_type": "markdown", "id": "0fa8fb6e", "metadata": {}, "source": [ "## Try it yourself\n", "\n", "Try different merges and inspect them:\n", "- How many rows and columns are there?\n", "- Where are NaN values inserted?\n", "- Which rows from the original datasets are not included?\n", "- Describe each type of merge in your own words.\n", "\n", "\n", "Split a DataFrame into separate DataFrames by subsetting the columns and indexing the rows with `loc`, then use `concat` to put it back together. Programmatically check that it's back together correctly." 
] } ], "metadata": { "jupytext": { "text_representation": { "extension": ".md", "format_name": "myst", "format_version": 0.12, "jupytext_version": "1.6.0" } }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" }, "source_map": [ 12, 16, 18, 28, 31, 37, 41, 43, 45, 49, 51, 53, 56, 59, 61, 63, 66, 71, 74, 76, 79, 82, 86, 90, 94, 96, 98, 100, 107, 111, 115, 117, 121, 123, 128, 130, 133, 135, 140, 142, 148, 150, 156, 158 ] }, "nbformat": 4, "nbformat_minor": 5 }