Skip to article frontmatterSkip to article content

Pandas Data Frames

Environment notes

See the new, more detailed python instructions. Even if you had things working the updated class environment file: rhodyds_environment.yml has some additional libraries lists.

Then use:

conda activate rhodyds

to have everything available.

If you had a certificate problem, this script should help from the python install.

Tabular Data

Structured data is easier to work with than other data.

We’re going to focus on tabular data for now. At the end of the course, we’ll examine images, which are structured, but more complex and text, which is much less structured.

coffee_data_url = 'https://raw.githubusercontent.com/jldbc/coffee-quality-database/refs/heads/master/data/robusta_data_cleaned.csv'

Pandas

We will use data with a library called pandas. By convention, we import it like:

import pandas as pd

the as keywored creates an alias for the library name.

We can use read_csv to load the data

pd.read_csv(coffee_data_url)
Loading...

We note that one of the columns is named Unnamed 0: because it does not have a name in the file, for that column.

We can use shift + tab to see help. We see that the index_col parameter to treat that column as the index.

coffee_df = pd.read_csv(coffee_data_url,index_col=0)

DataFrame parts

first we will check the type

type(coffee_df)
pandas.core.frame.DataFrame

we saw that we can check the first 5 rows with head

coffee_df.head()
Loading...

Now it looks neater and tht columns is the index.

coffee_df.head(2)
Loading...
coffee_df.tail(1)
Loading...

We can look at all of the column names:

coffee_df.columns
Index(['Species', 'Owner', 'Country.of.Origin', 'Farm.Name', 'Lot.Number', 'Mill', 'ICO.Number', 'Company', 'Altitude', 'Region', 'Producer', 'Number.of.Bags', 'Bag.Weight', 'In.Country.Partner', 'Harvest.Year', 'Grading.Date', 'Owner.1', 'Variety', 'Processing.Method', 'Fragrance...Aroma', 'Flavor', 'Aftertaste', 'Salt...Acid', 'Bitter...Sweet', 'Mouthfeel', 'Uniform.Cup', 'Clean.Cup', 'Balance', 'Cupper.Points', 'Total.Cup.Points', 'Moisture', 'Category.One.Defects', 'Quakers', 'Color', 'Category.Two.Defects', 'Expiration', 'Certification.Body', 'Certification.Address', 'Certification.Contact', 'unit_of_measurement', 'altitude_low_meters', 'altitude_high_meters', 'altitude_mean_meters'], dtype='object')

The rows are named by the index attribute

coffee_df.index
Index([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], dtype='int64')

Both of these are pandas Index objects

We can also take out only the values without the index or the column names.

coffee_df.values
array([['Robusta', 'ankole coffee producers coop', 'Uganda', ..., 1488.0, 1488.0, 1488.0], ['Robusta', 'nishant gurjer', 'India', ..., 3170.0, 3170.0, 3170.0], ['Robusta', 'andrew hetzel', 'India', ..., 1000.0, 1000.0, 1000.0], ..., ['Robusta', 'james moore', 'United States', ..., 795.0, 795.0, 795.0], ['Robusta', 'cafe politico', 'India', ..., nan, nan, nan], ['Robusta', 'cafe politico', 'Vietnam', ..., nan, nan, nan]], shape=(28, 43), dtype=object)

We can also look at the shape

coffee_df.shape
(28, 43)
coffee_df.head(1)
Loading...

Segmenting rows or columns

We can pick out a column using it’s name and []

coffee_df['Species']
1 Robusta 2 Robusta 3 Robusta 4 Robusta 5 Robusta 6 Robusta 7 Robusta 8 Robusta 9 Robusta 10 Robusta 11 Robusta 12 Robusta 13 Robusta 14 Robusta 15 Robusta 16 Robusta 17 Robusta 18 Robusta 19 Robusta 20 Robusta 21 Robusta 22 Robusta 23 Robusta 24 Robusta 25 Robusta 26 Robusta 27 Robusta 28 Robusta Name: Species, dtype: object

this is a pandas Series

type(coffee_df['Species'])
pandas.core.series.Series

Or we can pick a row by label with loc

coffee_df.loc[1]
Species Robusta Owner ankole coffee producers coop Country.of.Origin Uganda Farm.Name kyangundu cooperative society Lot.Number NaN Mill ankole coffee producers ICO.Number 0 Company ankole coffee producers coop Altitude 1488 Region sheema south western Producer Ankole coffee producers coop Number.of.Bags 300 Bag.Weight 60 kg In.Country.Partner Uganda Coffee Development Authority Harvest.Year 2013 Grading.Date June 26th, 2014 Owner.1 Ankole coffee producers coop Variety NaN Processing.Method NaN Fragrance...Aroma 7.83 Flavor 8.08 Aftertaste 7.75 Salt...Acid 7.92 Bitter...Sweet 8.0 Mouthfeel 8.25 Uniform.Cup 10.0 Clean.Cup 10.0 Balance 7.92 Cupper.Points 8.0 Total.Cup.Points 83.75 Moisture 0.12 Category.One.Defects 0 Quakers 0 Color Green Category.Two.Defects 2 Expiration June 26th, 2015 Certification.Body Uganda Coffee Development Authority Certification.Address e36d0270932c3b657e96b7b0278dfd85dc0fe743 Certification.Contact 03077a1c6bac60e6f514691634a7f6eb5c85aae8 unit_of_measurement m altitude_low_meters 1488.0 altitude_high_meters 1488.0 altitude_mean_meters 1488.0 Name: 1, dtype: object

or by number with iloc

coffee_df.iloc[1:5:2]
Loading...

Masking to Select Subsets

We can use a boolean mask or Series with boolean values, to pick out a subset of rows.

To build up to that, we will first evaluate a boolean expression on a column.

See a visualization of this on a smaller dataset

coffee_df['Flavor']>7.8
1 True 2 False 3 True 4 True 5 True 6 True 7 False 8 False 9 False 10 True 11 True 12 True 13 False 14 False 15 False 16 False 17 False 18 False 19 False 20 False 21 False 22 False 23 False 24 False 25 False 26 False 27 False 28 False Name: Flavor, dtype: bool

Now, we can save it to a variable, and then use it to mask, by treating it as an index.

high_flavor = coffee_df['Flavor']>7.8
coffee_df[high_flavor]
Loading...

this segments out only the selected rows.

General df information

We can get an over view

coffee_df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 28 entries, 1 to 28
Data columns (total 43 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   Species                28 non-null     object 
 1   Owner                  28 non-null     object 
 2   Country.of.Origin      28 non-null     object 
 3   Farm.Name              25 non-null     object 
 4   Lot.Number             6 non-null      object 
 5   Mill                   20 non-null     object 
 6   ICO.Number             17 non-null     object 
 7   Company                28 non-null     object 
 8   Altitude               25 non-null     object 
 9   Region                 26 non-null     object 
 10  Producer               26 non-null     object 
 11  Number.of.Bags         28 non-null     int64  
 12  Bag.Weight             28 non-null     object 
 13  In.Country.Partner     28 non-null     object 
 14  Harvest.Year           28 non-null     int64  
 15  Grading.Date           28 non-null     object 
 16  Owner.1                28 non-null     object 
 17  Variety                3 non-null      object 
 18  Processing.Method      10 non-null     object 
 19  Fragrance...Aroma      28 non-null     float64
 20  Flavor                 28 non-null     float64
 21  Aftertaste             28 non-null     float64
 22  Salt...Acid            28 non-null     float64
 23  Bitter...Sweet         28 non-null     float64
 24  Mouthfeel              28 non-null     float64
 25  Uniform.Cup            28 non-null     float64
 26  Clean.Cup              28 non-null     float64
 27  Balance                28 non-null     float64
 28  Cupper.Points          28 non-null     float64
 29  Total.Cup.Points       28 non-null     float64
 30  Moisture               28 non-null     float64
 31  Category.One.Defects   28 non-null     int64  
 32  Quakers                28 non-null     int64  
 33  Color                  25 non-null     object 
 34  Category.Two.Defects   28 non-null     int64  
 35  Expiration             28 non-null     object 
 36  Certification.Body     28 non-null     object 
 37  Certification.Address  28 non-null     object 
 38  Certification.Contact  28 non-null     object 
 39  unit_of_measurement    28 non-null     object 
 40  altitude_low_meters    25 non-null     float64
 41  altitude_high_meters   25 non-null     float64
 42  altitude_mean_meters   25 non-null     float64
dtypes: float64(15), int64(5), object(23)
memory usage: 10.7+ KB

We can view the the pandas dtypes. This is close to the regular data type of the content, but with some differences.

coffee_df.dtypes
Species object Owner object Country.of.Origin object Farm.Name object Lot.Number object Mill object ICO.Number object Company object Altitude object Region object Producer object Number.of.Bags int64 Bag.Weight object In.Country.Partner object Harvest.Year int64 Grading.Date object Owner.1 object Variety object Processing.Method object Fragrance...Aroma float64 Flavor float64 Aftertaste float64 Salt...Acid float64 Bitter...Sweet float64 Mouthfeel float64 Uniform.Cup float64 Clean.Cup float64 Balance float64 Cupper.Points float64 Total.Cup.Points float64 Moisture float64 Category.One.Defects int64 Quakers int64 Color object Category.Two.Defects int64 Expiration object Certification.Body object Certification.Address object Certification.Contact object unit_of_measurement object altitude_low_meters float64 altitude_high_meters float64 altitude_mean_meters float64 dtype: object
coffee_df.head().tail(1)
Loading...
print(coffee_df)
    Species                              Owner Country.of.Origin  \
1   Robusta       ankole coffee producers coop            Uganda   
2   Robusta                     nishant gurjer             India   
3   Robusta                      andrew hetzel             India   
4   Robusta                             ugacof            Uganda   
5   Robusta       katuka development trust ltd            Uganda   
6   Robusta                      andrew hetzel             India   
7   Robusta                      andrew hetzel             India   
8   Robusta                     nishant gurjer             India   
9   Robusta                     nishant gurjer             India   
10  Robusta                             ugacof            Uganda   
11  Robusta                             ugacof            Uganda   
12  Robusta                     nishant gurjer             India   
13  Robusta                      andrew hetzel             India   
14  Robusta  kasozi coffee farmers association            Uganda   
15  Robusta       ankole coffee producers coop            Uganda   
16  Robusta                      andrew hetzel             India   
17  Robusta                      andrew hetzel             India   
18  Robusta                 kawacom uganda ltd            Uganda   
19  Robusta                      nitubaasa ltd            Uganda   
20  Robusta              mannya coffee project            Uganda   
21  Robusta                      andrew hetzel             India   
22  Robusta                      andrew hetzel             India   
23  Robusta                      andrew hetzel     United States   
24  Robusta                        luis robles           Ecuador   
25  Robusta                        luis robles           Ecuador   
26  Robusta                        james moore     United States   
27  Robusta                      cafe politico             India   
28  Robusta                      cafe politico           Vietnam   

                             Farm.Name Lot.Number  \
1        kyangundu cooperative society        NaN   
2       sethuraman estate kaapi royale         25   
3                    sethuraman estate        NaN   
4                  ugacof project area        NaN   
5   katikamu capca farmers association        NaN   
6                                  NaN        NaN   
7                   sethuraman estates        NaN   
8       sethuraman estate kaapi royale          7   
9                    sethuraman estate        RKR   
10                              ishaka        NaN   
11                 ugacof project area        NaN   
12      sethuraman estate kaapi royale      RC AB   
13                  sethuraman estates        NaN   
14               kasozi coffee farmers        NaN   
15              kyangundu coop society        NaN   
16                   sethuraman estate        NaN   
17                  sethuraman estates        NaN   
18                            bushenyi        NaN   
19   kigezi coffee farmers association        NaN   
20               mannya coffee project        NaN   
21                  sethuraman estates        NaN   
22                  sethuraman estates        NaN   
23                  sethuraman estates        NaN   
24                           robustasa   Lavado 1   
25                           robustasa   Lavado 3   
26                     fazenda cazengo        NaN   
27                                 NaN        NaN   
28                                 NaN        NaN   

                                      Mill         ICO.Number  \
1                  ankole coffee producers                  0   
2                        sethuraman estate    14/1148/2017/21   
3                                      NaN               0000   
4                                   ugacof                  0   
5                 katuka development trust                  0   
6                                   (self)                NaN   
7                                      NaN                NaN   
8                        sethuraman estate    14/1148/2017/18   
9                        sethuraman estate    14/1148/2016/17   
10                            nsubuga umar                  0   
11                                  ugacof                  0   
12                       sethuraman estate    14/1148/2016/12   
13                                     NaN                NaN   
14                                     NaN                  0   
15  ankole coffee producers coop union ltd                  0   
16                                     NaN               0000   
17                      sethuraman estates                NaN   
18                                 kawacom                  0   
19                               nitubaasa                  0   
20                   mannya coffee project                  0   
21                                     NaN                NaN   
22                      sethuraman estates                NaN   
23                      sethuraman estates                NaN   
24                             our own lab                NaN   
25                          own laboratory                NaN   
26                            cafe cazengo                NaN   
27                                     NaN  14-1118-2014-0087   
28                                     NaN                NaN   

                              Company    Altitude  \
1        ankole coffee producers coop        1488   
2                        kaapi royale        3170   
3                   sethuraman estate       1000m   
4                          ugacof ltd        1212   
5        katuka development trust ltd   1200-1300   
6                     cafemakers, llc       3000'   
7                          cafemakers        750m   
8                        kaapi royale        3140   
9                        kaapi royale        1000   
10                         ugacof ltd    900-1300   
11                         ugacof ltd        1095   
12                       kaapi royale        1000   
13                         cafemakers        750m   
14  kasozi coffee farmers association        1367   
15       ankole coffee producers coop        1488   
16                  sethuraman estate       1000m   
17                    cafemakers, llc        750m   
18                 kawacom uganda ltd        1600   
19                      nitubaasa ltd        1745   
20              mannya coffee project        1200   
21                         cafemakers        750m   
22                    cafemakers, llc        750m   
23                    cafemakers, llc       3000'   
24                          robustasa         NaN   
25                          robustasa          40   
26            global opportunity fund  795 meters   
27                      cafe politico         NaN   
28                      cafe politico         NaN   

                           Region  ...         Color  Category.Two.Defects  \
1            sheema south western  ...         Green                     2   
2     chikmagalur karnataka indua  ...           NaN                     2   
3                     chikmagalur  ...         Green                     0   
4                         central  ...         Green                     7   
5           luwero central region  ...         Green                     3   
6                     chikmagalur  ...         Green                     0   
7                     chikmagalur  ...         Green                     0   
8     chikmagalur karnataka india  ...  Bluish-Green                     0   
9           chikmagalur karnataka  ...         Green                     0   
10                        western  ...         Green                     6   
11       iganga namadrope eastern  ...         Green                     1   
12          chikmagalur karnataka  ...         Green                     0   
13                    chikmagalur  ...         Green                     1   
14                        eastern  ...         Green                     7   
15                  south western  ...         Green                     2   
16                    chikmagalur  ...         Green                     0   
17                    chikmagalur  ...    Blue-Green                     0   
18                        western  ...         Green                     1   
19                        western  ...         Green                     2   
20                       southern  ...         Green                     1   
21                    chikmagalur  ...  Bluish-Green                     1   
22                    chikmagalur  ...         Green                     0   
23                    chikmagalur  ...         Green                     0   
24               san juan, playas  ...    Blue-Green                     1   
25               san juan, playas  ...    Blue-Green                     0   
26  kwanza norte province, angola  ...           NaN                     6   
27                            NaN  ...         Green                     1   
28                            NaN  ...           NaN                     9   

             Expiration                   Certification.Body  \
1       June 26th, 2015  Uganda Coffee Development Authority   
2    October 31st, 2018         Specialty Coffee Association   
3      April 29th, 2016         Specialty Coffee Association   
4       July 14th, 2015  Uganda Coffee Development Authority   
5       June 26th, 2015  Uganda Coffee Development Authority   
6   February 28th, 2013         Specialty Coffee Association   
7        May 15th, 2015         Specialty Coffee Association   
8    October 25th, 2018         Specialty Coffee Association   
9     August 17th, 2017         Specialty Coffee Association   
10     August 5th, 2015  Uganda Coffee Development Authority   
11      June 26th, 2015  Uganda Coffee Development Authority   
12    August 23rd, 2017         Specialty Coffee Association   
13       May 19th, 2015         Specialty Coffee Association   
14      July 14th, 2015  Uganda Coffee Development Authority   
15      July 14th, 2015  Uganda Coffee Development Authority   
16     April 29th, 2016         Specialty Coffee Association   
17       June 3rd, 2014         Specialty Coffee Association   
18      June 27th, 2015  Uganda Coffee Development Authority   
19      June 27th, 2015  Uganda Coffee Development Authority   
20      June 27th, 2015  Uganda Coffee Development Authority   
21       May 19th, 2015         Specialty Coffee Association   
22      June 20th, 2014         Specialty Coffee Association   
23  February 28th, 2013         Specialty Coffee Association   
24   January 18th, 2017         Specialty Coffee Association   
25   January 18th, 2017         Specialty Coffee Association   
26  December 23rd, 2015         Specialty Coffee Association   
27    August 25th, 2015         Specialty Coffee Association   
28    August 25th, 2015         Specialty Coffee Association   

                       Certification.Address  \
1   e36d0270932c3b657e96b7b0278dfd85dc0fe743   
2   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
3   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
4   e36d0270932c3b657e96b7b0278dfd85dc0fe743   
5   e36d0270932c3b657e96b7b0278dfd85dc0fe743   
6   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
7   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
8   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
9   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
10  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
11  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
12  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
13  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
14  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
15  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
16  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
17  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
18  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
19  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
20  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
21  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
22  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
23  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
24  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
25  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
26  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
27  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
28  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   

                       Certification.Contact unit_of_measurement  \
1   03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
2   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
3   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
4   03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
5   03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
6   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
7   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
8   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
9   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
10  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
11  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
12  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
13  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
14  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
15  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
16  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
17  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
18  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
19  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
20  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
21  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
22  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
23  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
24  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
25  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
26  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
27  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
28  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   

   altitude_low_meters altitude_high_meters  altitude_mean_meters  
1               1488.0               1488.0                1488.0  
2               3170.0               3170.0                3170.0  
3               1000.0               1000.0                1000.0  
4               1212.0               1212.0                1212.0  
5               1200.0               1300.0                1250.0  
6               3000.0               3000.0                3000.0  
7                750.0                750.0                 750.0  
8               3140.0               3140.0                3140.0  
9               1000.0               1000.0                1000.0  
10               900.0               1300.0                1100.0  
11              1095.0               1095.0                1095.0  
12              1000.0               1000.0                1000.0  
13               750.0                750.0                 750.0  
14              1367.0               1367.0                1367.0  
15              1488.0               1488.0                1488.0  
16              1000.0               1000.0                1000.0  
17               750.0                750.0                 750.0  
18              1600.0               1600.0                1600.0  
19              1745.0               1745.0                1745.0  
20              1200.0               1200.0                1200.0  
21               750.0                750.0                 750.0  
22               750.0                750.0                 750.0  
23              3000.0               3000.0                3000.0  
24                 NaN                  NaN                   NaN  
25                40.0                 40.0                  40.0  
26               795.0                795.0                 795.0  
27                 NaN                  NaN                   NaN  
28                 NaN                  NaN                   NaN  

[28 rows x 43 columns]

Creating a Data Frame

We can use the constructor to make a DataFrame from many different input formats.

For example with a list of lists

tiny_data_list = [[1,3,4],[2,6,1]]
pd.DataFrame(data =tiny_data_list, columns=['a','b','c'])
Loading...

or a dictionary of lists

tiny_data_dict = {'a':[1,2],'b':[3,6],'c':[4,1]}
pd.DataFrame(tiny_data_dict)
Loading...

Reading from a website

We saw in the help that pandas has a lot of read_ methods. You can also see a list on the IO page of the docs

grading_page = 'https://rhodyprog4ds.github.io/BrownFall25/grading/'

We can read from html files too! (websites)

This has a long output so I have hidden it by default, but you can view it.

pd.read_html(grading_page)
Output
[ outcome A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 0 process 🟨 🟨 🟨 🟨 ✅ ✅ ✅ ✅ ✅ ✅ 1 communication ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅ 2 data ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅ 3 modeling NaN NaN NaN NaN NaN NaN ✅ ✅ ✅ ✅ 4 exploratory 🟨 ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅, outcome level A A- B+ B B- C+ C C- D+ D F 0 experience active 15 15 15 15 10 10 10 5 5 5 0 1 experience present 22 22 22 22 22 22 22 22 22 22 0 2 process innovative 5 4 3 2 1 0 0 0 0 0 0 3 process complete 5 5 5 5 4 4 3 2 2 1 0 4 process partial 5 5 5 5 5 5 5 5 5 3 0 5 communication innovative 3 2 1 0 0 0 0 0 0 0 0 6 communication complete 8 8 8 8 0 0 0 0 0 0 0 7 communication partial 5 5 5 5 5 5 5 5 5 3 0 8 data innovative 3 2 1 0 0 0 0 0 0 0 0 9 data complete 8 8 8 8 7 6 5 4 3 2 0 10 modeling innovative 3 2 1 0 0 0 0 0 0 0 0 11 modeling complete 4 4 4 4 3 3 2 2 1 1 0 12 exploratory innovative 3 2 1 0 0 0 0 0 0 0 0 13 exploratory complete 6 6 6 6 5 4 3 2 1 1 0]

this does not look like the others, it is a list of the DataFrames for all of the tables on the page. Even if there is only one, it will still be in a list.

Now that we know it is a list, we will save it in a variable and then, pick out one DataFrame.

If you go to the page you can see that there are two tables there. Let’s take the second one.

grading_tables_list = pd.read_html(grading_page)
min_earnings_df = grading_tables_list[1]
min_earnings_df
Loading...

This table is from this website so in the notes we can also view it as a table to see that it is the same.

We can then verify that it is a DataFrame

type(min_earnings_df)
pandas.core.frame.DataFrame

Questions