3. Data Frames and other iterables

Today, we’re going to explore DataFrames in greater detail. We’ll continue using that same coffee dataset.

import pandas as pd
coffee_data_url = 'https://raw.githubusercontent.com/jldbc/coffee-quality-database/master/data/robusta_data_cleaned.csv'
coffee_df = pd.read_csv(coffee_data_url)

Important

A reason to use Jupyter is that it formats the output to be more readable. Compare the view of the DataFrame as Jupyter displays it with the view that print produces.

Jupyter uses the object’s _repr_html_ method if it exists (for a DataFrame, this produces the same kind of HTML table as to_html), whereas the print function casts the object to a string.

coffee_df
Unnamed: 0 Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
0 1 Robusta ankole coffee producers coop Uganda kyangundu cooperative society NaN ankole coffee producers 0 ankole coffee producers coop 1488 ... Green 2 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
1 2 Robusta nishant gurjer India sethuraman estate kaapi royale 25 sethuraman estate 14/1148/2017/21 kaapi royale 3170 ... NaN 2 October 31st, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3170.0 3170.0 3170.0
2 3 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
3 4 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1212 ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1212.0 1212.0 1212.0
4 5 Robusta katuka development trust ltd Uganda katikamu capca farmers association NaN katuka development trust 0 katuka development trust ltd 1200-1300 ... Green 3 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1300.0 1250.0
5 6 Robusta andrew hetzel India NaN NaN (self) NaN cafemakers, llc 3000' ... Green 0 February 28th, 2013 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3000.0 3000.0 3000.0
6 7 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m ... Green 0 May 15th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
7 8 Robusta nishant gurjer India sethuraman estate kaapi royale 7 sethuraman estate 14/1148/2017/18 kaapi royale 3140 ... Bluish-Green 0 October 25th, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3140.0 3140.0 3140.0
8 9 Robusta nishant gurjer India sethuraman estate RKR sethuraman estate 14/1148/2016/17 kaapi royale 1000 ... Green 0 August 17th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
9 10 Robusta ugacof Uganda ishaka NaN nsubuga umar 0 ugacof ltd 900-1300 ... Green 6 August 5th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 900.0 1300.0 1100.0
10 11 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1095 ... Green 1 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1095.0 1095.0 1095.0
11 12 Robusta nishant gurjer India sethuraman estate kaapi royale RC AB sethuraman estate 14/1148/2016/12 kaapi royale 1000 ... Green 0 August 23rd, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
12 13 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m ... Green 1 May 19th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
13 14 Robusta kasozi coffee farmers association Uganda kasozi coffee farmers NaN NaN 0 kasozi coffee farmers association 1367 ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1367.0 1367.0 1367.0
14 15 Robusta ankole coffee producers coop Uganda kyangundu coop society NaN ankole coffee producers coop union ltd 0 ankole coffee producers coop 1488 ... Green 2 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
15 16 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
16 17 Robusta andrew hetzel India sethuraman estates NaN sethuraman estates NaN cafemakers, llc 750m ... Blue-Green 0 June 3rd, 2014 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
17 18 Robusta kawacom uganda ltd Uganda bushenyi NaN kawacom 0 kawacom uganda ltd 1600 ... Green 1 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1600.0 1600.0 1600.0
18 19 Robusta nitubaasa ltd Uganda kigezi coffee farmers association NaN nitubaasa 0 nitubaasa ltd 1745 ... Green 2 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1745.0 1745.0 1745.0
19 20 Robusta mannya coffee project Uganda mannya coffee project NaN mannya coffee project 0 mannya coffee project 1200 ... Green 1 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1200.0 1200.0
20 21 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m ... Bluish-Green 1 May 19th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
21 22 Robusta andrew hetzel India sethuraman estates NaN sethuraman estates NaN cafemakers, llc 750m ... Green 0 June 20th, 2014 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
22 23 Robusta andrew hetzel United States sethuraman estates NaN sethuraman estates NaN cafemakers, llc 3000' ... Green 0 February 28th, 2013 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3000.0 3000.0 3000.0
23 24 Robusta luis robles Ecuador robustasa Lavado 1 our own lab NaN robustasa NaN ... Blue-Green 1 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
24 25 Robusta luis robles Ecuador robustasa Lavado 3 own laboratory NaN robustasa 40 ... Blue-Green 0 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 40.0 40.0 40.0
25 26 Robusta james moore United States fazenda cazengo NaN cafe cazengo NaN global opportunity fund 795 meters ... NaN 6 December 23rd, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 795.0 795.0 795.0
26 27 Robusta cafe politico India NaN NaN NaN 14-1118-2014-0087 cafe politico NaN ... Green 1 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
27 28 Robusta cafe politico Vietnam NaN NaN NaN NaN cafe politico NaN ... None 9 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN

28 rows × 44 columns

print(coffee_df)
    Unnamed: 0  Species                              Owner Country.of.Origin  \
0            1  Robusta       ankole coffee producers coop            Uganda   
1            2  Robusta                     nishant gurjer             India   
2            3  Robusta                      andrew hetzel             India   
3            4  Robusta                             ugacof            Uganda   
4            5  Robusta       katuka development trust ltd            Uganda   
5            6  Robusta                      andrew hetzel             India   
6            7  Robusta                      andrew hetzel             India   
7            8  Robusta                     nishant gurjer             India   
8            9  Robusta                     nishant gurjer             India   
9           10  Robusta                             ugacof            Uganda   
10          11  Robusta                             ugacof            Uganda   
11          12  Robusta                     nishant gurjer             India   
12          13  Robusta                      andrew hetzel             India   
13          14  Robusta  kasozi coffee farmers association            Uganda   
14          15  Robusta       ankole coffee producers coop            Uganda   
15          16  Robusta                      andrew hetzel             India   
16          17  Robusta                      andrew hetzel             India   
17          18  Robusta                 kawacom uganda ltd            Uganda   
18          19  Robusta                      nitubaasa ltd            Uganda   
19          20  Robusta              mannya coffee project            Uganda   
20          21  Robusta                      andrew hetzel             India   
21          22  Robusta                      andrew hetzel             India   
22          23  Robusta                      andrew hetzel     United States   
23          24  Robusta                        luis robles           Ecuador   
24          25  Robusta                        luis robles           Ecuador   
25          26  Robusta                        james moore     United States   
26          27  Robusta                      cafe politico             India   
27          28  Robusta                      cafe politico           Vietnam   

                             Farm.Name Lot.Number  \
0        kyangundu cooperative society        NaN   
1       sethuraman estate kaapi royale         25   
2                    sethuraman estate        NaN   
3                  ugacof project area        NaN   
4   katikamu capca farmers association        NaN   
5                                  NaN        NaN   
6                   sethuraman estates        NaN   
7       sethuraman estate kaapi royale          7   
8                    sethuraman estate        RKR   
9                               ishaka        NaN   
10                 ugacof project area        NaN   
11      sethuraman estate kaapi royale      RC AB   
12                  sethuraman estates        NaN   
13               kasozi coffee farmers        NaN   
14              kyangundu coop society        NaN   
15                   sethuraman estate        NaN   
16                  sethuraman estates        NaN   
17                            bushenyi        NaN   
18   kigezi coffee farmers association        NaN   
19               mannya coffee project        NaN   
20                  sethuraman estates        NaN   
21                  sethuraman estates        NaN   
22                  sethuraman estates        NaN   
23                           robustasa   Lavado 1   
24                           robustasa   Lavado 3   
25                     fazenda cazengo        NaN   
26                                 NaN        NaN   
27                                 NaN        NaN   

                                      Mill         ICO.Number  \
0                  ankole coffee producers                  0   
1                        sethuraman estate    14/1148/2017/21   
2                                      NaN               0000   
3                                   ugacof                  0   
4                 katuka development trust                  0   
5                                   (self)                NaN   
6                                      NaN                NaN   
7                        sethuraman estate    14/1148/2017/18   
8                        sethuraman estate    14/1148/2016/17   
9                             nsubuga umar                  0   
10                                  ugacof                  0   
11                       sethuraman estate    14/1148/2016/12   
12                                     NaN                NaN   
13                                     NaN                  0   
14  ankole coffee producers coop union ltd                  0   
15                                     NaN               0000   
16                      sethuraman estates                NaN   
17                                 kawacom                  0   
18                               nitubaasa                  0   
19                   mannya coffee project                  0   
20                                     NaN                NaN   
21                      sethuraman estates                NaN   
22                      sethuraman estates                NaN   
23                             our own lab                NaN   
24                          own laboratory                NaN   
25                            cafe cazengo                NaN   
26                                     NaN  14-1118-2014-0087   
27                                     NaN                NaN   

                              Company    Altitude  ...         Color  \
0        ankole coffee producers coop        1488  ...         Green   
1                        kaapi royale        3170  ...           NaN   
2                   sethuraman estate       1000m  ...         Green   
3                          ugacof ltd        1212  ...         Green   
4        katuka development trust ltd   1200-1300  ...         Green   
5                     cafemakers, llc       3000'  ...         Green   
6                          cafemakers        750m  ...         Green   
7                        kaapi royale        3140  ...  Bluish-Green   
8                        kaapi royale        1000  ...         Green   
9                          ugacof ltd    900-1300  ...         Green   
10                         ugacof ltd        1095  ...         Green   
11                       kaapi royale        1000  ...         Green   
12                         cafemakers        750m  ...         Green   
13  kasozi coffee farmers association        1367  ...         Green   
14       ankole coffee producers coop        1488  ...         Green   
15                  sethuraman estate       1000m  ...         Green   
16                    cafemakers, llc        750m  ...    Blue-Green   
17                 kawacom uganda ltd        1600  ...         Green   
18                      nitubaasa ltd        1745  ...         Green   
19              mannya coffee project        1200  ...         Green   
20                         cafemakers        750m  ...  Bluish-Green   
21                    cafemakers, llc        750m  ...         Green   
22                    cafemakers, llc       3000'  ...         Green   
23                          robustasa         NaN  ...    Blue-Green   
24                          robustasa          40  ...    Blue-Green   
25            global opportunity fund  795 meters  ...           NaN   
26                      cafe politico         NaN  ...         Green   
27                      cafe politico         NaN  ...          None   

   Category.Two.Defects           Expiration  \
0                     2      June 26th, 2015   
1                     2   October 31st, 2018   
2                     0     April 29th, 2016   
3                     7      July 14th, 2015   
4                     3      June 26th, 2015   
5                     0  February 28th, 2013   
6                     0       May 15th, 2015   
7                     0   October 25th, 2018   
8                     0    August 17th, 2017   
9                     6     August 5th, 2015   
10                    1      June 26th, 2015   
11                    0    August 23rd, 2017   
12                    1       May 19th, 2015   
13                    7      July 14th, 2015   
14                    2      July 14th, 2015   
15                    0     April 29th, 2016   
16                    0       June 3rd, 2014   
17                    1      June 27th, 2015   
18                    2      June 27th, 2015   
19                    1      June 27th, 2015   
20                    1       May 19th, 2015   
21                    0      June 20th, 2014   
22                    0  February 28th, 2013   
23                    1   January 18th, 2017   
24                    0   January 18th, 2017   
25                    6  December 23rd, 2015   
26                    1    August 25th, 2015   
27                    9    August 25th, 2015   

                     Certification.Body  \
0   Uganda Coffee Development Authority   
1          Specialty Coffee Association   
2          Specialty Coffee Association   
3   Uganda Coffee Development Authority   
4   Uganda Coffee Development Authority   
5          Specialty Coffee Association   
6          Specialty Coffee Association   
7          Specialty Coffee Association   
8          Specialty Coffee Association   
9   Uganda Coffee Development Authority   
10  Uganda Coffee Development Authority   
11         Specialty Coffee Association   
12         Specialty Coffee Association   
13  Uganda Coffee Development Authority   
14  Uganda Coffee Development Authority   
15         Specialty Coffee Association   
16         Specialty Coffee Association   
17  Uganda Coffee Development Authority   
18  Uganda Coffee Development Authority   
19  Uganda Coffee Development Authority   
20         Specialty Coffee Association   
21         Specialty Coffee Association   
22         Specialty Coffee Association   
23         Specialty Coffee Association   
24         Specialty Coffee Association   
25         Specialty Coffee Association   
26         Specialty Coffee Association   
27         Specialty Coffee Association   

                       Certification.Address  \
0   e36d0270932c3b657e96b7b0278dfd85dc0fe743   
1   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
2   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
3   e36d0270932c3b657e96b7b0278dfd85dc0fe743   
4   e36d0270932c3b657e96b7b0278dfd85dc0fe743   
5   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
6   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
7   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
8   ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
9   e36d0270932c3b657e96b7b0278dfd85dc0fe743   
10  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
11  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
12  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
13  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
14  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
15  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
16  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
17  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
18  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
19  e36d0270932c3b657e96b7b0278dfd85dc0fe743   
20  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
21  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
22  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
23  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
24  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
25  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
26  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   
27  ff7c18ad303d4b603ac3f8cff7e611ffc735e720   

                       Certification.Contact unit_of_measurement  \
0   03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
1   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
2   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
3   03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
4   03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
5   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
6   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
7   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
8   352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
9   03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
10  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
11  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
12  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
13  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
14  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
15  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
16  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
17  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
18  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
19  03077a1c6bac60e6f514691634a7f6eb5c85aae8                   m   
20  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
21  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
22  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
23  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
24  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
25  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
26  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   
27  352d0cf7f3e9be14dad7df644ad65efc27605ae2                   m   

   altitude_low_meters altitude_high_meters altitude_mean_meters  
0               1488.0               1488.0               1488.0  
1               3170.0               3170.0               3170.0  
2               1000.0               1000.0               1000.0  
3               1212.0               1212.0               1212.0  
4               1200.0               1300.0               1250.0  
5               3000.0               3000.0               3000.0  
6                750.0                750.0                750.0  
7               3140.0               3140.0               3140.0  
8               1000.0               1000.0               1000.0  
9                900.0               1300.0               1100.0  
10              1095.0               1095.0               1095.0  
11              1000.0               1000.0               1000.0  
12               750.0                750.0                750.0  
13              1367.0               1367.0               1367.0  
14              1488.0               1488.0               1488.0  
15              1000.0               1000.0               1000.0  
16               750.0                750.0                750.0  
17              1600.0               1600.0               1600.0  
18              1745.0               1745.0               1745.0  
19              1200.0               1200.0               1200.0  
20               750.0                750.0                750.0  
21               750.0                750.0                750.0  
22              3000.0               3000.0               3000.0  
23                 NaN                  NaN                  NaN  
24                40.0                 40.0                 40.0  
25               795.0                795.0                795.0  
26                 NaN                  NaN                  NaN  
27                 NaN                  NaN                  NaN  

[28 rows x 44 columns]
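
To see the same difference on a smaller scale, here is a minimal sketch using a tiny hand-made DataFrame. The name `small_df` and its values are made up for illustration and are not part of the coffee dataset, and the sketch assumes pandas' default display options. Jupyter looks for a `_repr_html_` method on the object, while `print` relies on the plain string form.

```python
import pandas as pd

# A tiny stand-in DataFrame (hypothetical values, not the coffee data)
small_df = pd.DataFrame({"Species": ["Robusta", "Robusta"],
                         "Country": ["Uganda", "India"]})

# What Jupyter asks for when a DataFrame is the last line of a cell
html_view = small_df._repr_html_()

# What print() uses: the plain string form
text_view = str(small_df)

print(html_view[:60])   # the start of an HTML snippet
print(text_view)        # an aligned plain-text table
```

The HTML version is what lets the notebook draw borders, truncate wide tables with `...`, and scroll, while the string version has to wrap long rows with `\` as in the output above.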

3.1. Examining the Structure of a Data Frame

I told you this was a DataFrame, but we can check with type.

type(coffee_df)
pandas.core.frame.DataFrame

We can also see that the DataFrame type comes from the pandas library; without the library loaded, this type does not exist.
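
As a small sketch of the same check in code (again with a made-up `small_df` rather than the coffee data), `isinstance` tests the type directly, and the `__module__` attribute of the type records where it was defined. The exact module path can differ between pandas versions, so the example only prints it.

```python
import pandas as pd

# Hypothetical one-row DataFrame for illustration
small_df = pd.DataFrame({"Species": ["Robusta"], "Country": ["Uganda"]})

# isinstance is the usual way to test a type in code
is_frame = isinstance(small_df, pd.DataFrame)

# The module a type was defined in is recorded on the type itself
home_module = type(small_df).__module__

print(is_frame)
print(home_module)
```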

We can also examine its parts. It consists of several; first, the column headings:

coffee_df.columns
Index(['Unnamed: 0', 'Species', 'Owner', 'Country.of.Origin', 'Farm.Name',
       'Lot.Number', 'Mill', 'ICO.Number', 'Company', 'Altitude', 'Region',
       'Producer', 'Number.of.Bags', 'Bag.Weight', 'In.Country.Partner',
       'Harvest.Year', 'Grading.Date', 'Owner.1', 'Variety',
       'Processing.Method', 'Fragrance...Aroma', 'Flavor', 'Aftertaste',
       'Salt...Acid', 'Bitter...Sweet', 'Mouthfeel', 'Uniform.Cup',
       'Clean.Cup', 'Balance', 'Cupper.Points', 'Total.Cup.Points', 'Moisture',
       'Category.One.Defects', 'Quakers', 'Color', 'Category.Two.Defects',
       'Expiration', 'Certification.Body', 'Certification.Address',
       'Certification.Contact', 'unit_of_measurement', 'altitude_low_meters',
       'altitude_high_meters', 'altitude_mean_meters'],
      dtype='object')

These are a special type called Index.

type(coffee_df.columns)
pandas.core.indexes.base.Index

These are still iterable, much like Python lists.
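
A short sketch of what "iterable" means here, using a made-up `small_df` with three columns (the values are illustrative only): the Index can be used in a `for` loop, in a list comprehension, and with positional access and `len`, much as a list can.

```python
import pandas as pd

# Hypothetical DataFrame; only the column names matter for this example
small_df = pd.DataFrame({"Species": ["Robusta", "Robusta"],
                         "Owner": ["ugacof", "andrew hetzel"],
                         "Altitude": ["1212", "750m"]})

# Loop over the Index one column name at a time
for col_name in small_df.columns:
    print(col_name)

# Because it is iterable, it also works in a comprehension
upper_names = [col_name.upper() for col_name in small_df.columns]

# It supports positional access and len(), like a list
first_name = small_df.columns[0]
n_columns = len(small_df.columns)
```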

The DataFrame also stores the data itself, which is available as a NumPy array through the values attribute.

coffee_df.values
array([[1, 'Robusta', 'ankole coffee producers coop', ..., 1488.0,
        1488.0, 1488.0],
       [2, 'Robusta', 'nishant gurjer', ..., 3170.0, 3170.0, 3170.0],
       [3, 'Robusta', 'andrew hetzel', ..., 1000.0, 1000.0, 1000.0],
       ...,
       [26, 'Robusta', 'james moore', ..., 795.0, 795.0, 795.0],
       [27, 'Robusta', 'cafe politico', ..., nan, nan, nan],
       [28, 'Robusta', 'cafe politico', ..., nan, nan, nan]], dtype=object)
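
The `dtype=object` at the end of that output appears because the columns hold a mix of text and numbers. The following sketch, with a made-up two-column `small_df`, shows the shape and dtype of the array and how a single numeric column keeps its numeric dtype.

```python
import pandas as pd

# Hypothetical DataFrame mixing a text column and a numeric column
small_df = pd.DataFrame({"Species": ["Robusta", "Robusta"],
                         "altitude_mean_meters": [1488.0, 3170.0]})

data_array = small_df.values

print(type(data_array))   # a NumPy ndarray
print(data_array.shape)   # (rows, columns)
print(data_array.dtype)   # object, since text and numbers are mixed

# A single numeric column keeps its numeric dtype
altitude_array = small_df["altitude_mean_meters"].values
print(altitude_array.dtype)
```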

The DataFrame also has an index, which appears visually as the first column. It is special because its values are the row labels used to select rows of the data.

coffee_df.index
RangeIndex(start=0, stop=28, step=1)
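
As a minimal sketch of using the index to select data (with a made-up three-row `small_df`), `.loc` looks rows up by their index label. With an autogenerated index, those labels are simply 0, 1, 2, and so on.

```python
import pandas as pd

# Hypothetical DataFrame created without specifying an index
small_df = pd.DataFrame({"Owner": ["ugacof", "andrew hetzel", "luis robles"],
                         "Country": ["Uganda", "India", "Ecuador"]})

# With no index specified, pandas generates a RangeIndex
print(small_df.index)

# .loc selects rows by index label; here the labels are 0, 1, 2
second_row = small_df.loc[1]
print(second_row["Owner"])
```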

Right now this is an autogenerated index, but we can also use the index_col parameter of read_csv to choose a column of the file as the index when we load the data.

coffee_df = pd.read_csv(coffee_data_url, index_col=0)
coffee_df
Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude Region ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
1 Robusta ankole coffee producers coop Uganda kyangundu cooperative society NaN ankole coffee producers 0 ankole coffee producers coop 1488 sheema south western ... Green 2 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
2 Robusta nishant gurjer India sethuraman estate kaapi royale 25 sethuraman estate 14/1148/2017/21 kaapi royale 3170 chikmagalur karnataka indua ... NaN 2 October 31st, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3170.0 3170.0 3170.0
3 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m chikmagalur ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
4 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1212 central ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1212.0 1212.0 1212.0
5 Robusta katuka development trust ltd Uganda katikamu capca farmers association NaN katuka development trust 0 katuka development trust ltd 1200-1300 luwero central region ... Green 3 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1300.0 1250.0
6 Robusta andrew hetzel India NaN NaN (self) NaN cafemakers, llc 3000' chikmagalur ... Green 0 February 28th, 2013 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3000.0 3000.0 3000.0
7 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m chikmagalur ... Green 0 May 15th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
8 Robusta nishant gurjer India sethuraman estate kaapi royale 7 sethuraman estate 14/1148/2017/18 kaapi royale 3140 chikmagalur karnataka india ... Bluish-Green 0 October 25th, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3140.0 3140.0 3140.0
9 Robusta nishant gurjer India sethuraman estate RKR sethuraman estate 14/1148/2016/17 kaapi royale 1000 chikmagalur karnataka ... Green 0 August 17th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
10 Robusta ugacof Uganda ishaka NaN nsubuga umar 0 ugacof ltd 900-1300 western ... Green 6 August 5th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 900.0 1300.0 1100.0
11 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1095 iganga namadrope eastern ... Green 1 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1095.0 1095.0 1095.0
12 Robusta nishant gurjer India sethuraman estate kaapi royale RC AB sethuraman estate 14/1148/2016/12 kaapi royale 1000 chikmagalur karnataka ... Green 0 August 23rd, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
13 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m chikmagalur ... Green 1 May 19th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
14 Robusta kasozi coffee farmers association Uganda kasozi coffee farmers NaN NaN 0 kasozi coffee farmers association 1367 eastern ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1367.0 1367.0 1367.0
15 Robusta ankole coffee producers coop Uganda kyangundu coop society NaN ankole coffee producers coop union ltd 0 ankole coffee producers coop 1488 south western ... Green 2 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
16 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m chikmagalur ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
17 Robusta andrew hetzel India sethuraman estates NaN sethuraman estates NaN cafemakers, llc 750m chikmagalur ... Blue-Green 0 June 3rd, 2014 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
18 Robusta kawacom uganda ltd Uganda bushenyi NaN kawacom 0 kawacom uganda ltd 1600 western ... Green 1 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1600.0 1600.0 1600.0
19 Robusta nitubaasa ltd Uganda kigezi coffee farmers association NaN nitubaasa 0 nitubaasa ltd 1745 western ... Green 2 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1745.0 1745.0 1745.0
20 Robusta mannya coffee project Uganda mannya coffee project NaN mannya coffee project 0 mannya coffee project 1200 southern ... Green 1 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1200.0 1200.0
21 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m chikmagalur ... Bluish-Green 1 May 19th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
22 Robusta andrew hetzel India sethuraman estates NaN sethuraman estates NaN cafemakers, llc 750m chikmagalur ... Green 0 June 20th, 2014 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
23 Robusta andrew hetzel United States sethuraman estates NaN sethuraman estates NaN cafemakers, llc 3000' chikmagalur ... Green 0 February 28th, 2013 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3000.0 3000.0 3000.0
24 Robusta luis robles Ecuador robustasa Lavado 1 our own lab NaN robustasa NaN san juan, playas ... Blue-Green 1 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
25 Robusta luis robles Ecuador robustasa Lavado 3 own laboratory NaN robustasa 40 san juan, playas ... Blue-Green 0 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 40.0 40.0 40.0
26 Robusta james moore United States fazenda cazengo NaN cafe cazengo NaN global opportunity fund 795 meters kwanza norte province, angola ... NaN 6 December 23rd, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 795.0 795.0 795.0
27 Robusta cafe politico India NaN NaN NaN 14-1118-2014-0087 cafe politico NaN NaN ... Green 1 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
28 Robusta cafe politico Vietnam NaN NaN NaN NaN cafe politico NaN NaN ... None 9 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN

28 rows × 43 columns

coffee_df.index
Int64Index([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
            18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28],
           dtype='int64')

Now it’s neater

3.2. Extracting Parts of Data Frames#

We can look at the first 5 rows with head

coffee_df.head()
Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude Region ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
1 Robusta ankole coffee producers coop Uganda kyangundu cooperative society NaN ankole coffee producers 0 ankole coffee producers coop 1488 sheema south western ... Green 2 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
2 Robusta nishant gurjer India sethuraman estate kaapi royale 25 sethuraman estate 14/1148/2017/21 kaapi royale 3170 chikmagalur karnataka indua ... NaN 2 October 31st, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3170.0 3170.0 3170.0
3 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m chikmagalur ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
4 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1212 central ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1212.0 1212.0 1212.0
5 Robusta katuka development trust ltd Uganda katikamu capca farmers association NaN katuka development trust 0 katuka development trust ltd 1200-1300 luwero central region ... Green 3 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1300.0 1250.0

5 rows × 43 columns

and the last 5 with tail

coffee_df.tail()
Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude Region ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
24 Robusta luis robles Ecuador robustasa Lavado 1 our own lab NaN robustasa NaN san juan, playas ... Blue-Green 1 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
25 Robusta luis robles Ecuador robustasa Lavado 3 own laboratory NaN robustasa 40 san juan, playas ... Blue-Green 0 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 40.0 40.0 40.0
26 Robusta james moore United States fazenda cazengo NaN cafe cazengo NaN global opportunity fund 795 meters kwanza norte province, angola ... NaN 6 December 23rd, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 795.0 795.0 795.0
27 Robusta cafe politico India NaN NaN NaN 14-1118-2014-0087 cafe politico NaN NaN ... Green 1 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
28 Robusta cafe politico Vietnam NaN NaN NaN NaN cafe politico NaN NaN ... None 9 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN

5 rows × 43 columns

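The shape of a DataFrame is an attribute, so we access it without parentheses. It is a tuple of (rows, columns), and len gives the number of rows.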

coffee_df.shape
(28, 43)
len(coffee_df)
28
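
As a minimal sketch of how shape and len relate, here is a small made-up DataFrame (toy_df and its columns are hypothetical, not part of the coffee data):

```python
import pandas as pd

# hypothetical toy data, not the coffee dataset
toy_df = pd.DataFrame({'country': ['Uganda', 'India', 'Ecuador'],
                       'altitude': [1488, 3170, 40]})

# shape is an attribute (no parentheses): a tuple of (rows, columns)
n_rows, n_cols = toy_df.shape

# len is a function and counts rows only
row_count = len(toy_df)

print(n_rows, n_cols, row_count)
```

Unpacking the tuple is handy when only one of the two numbers is needed later.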

We can pick out columns by name.

coffee_df['Species']
1     Robusta
2     Robusta
3     Robusta
4     Robusta
5     Robusta
6     Robusta
7     Robusta
8     Robusta
9     Robusta
10    Robusta
11    Robusta
12    Robusta
13    Robusta
14    Robusta
15    Robusta
16    Robusta
17    Robusta
18    Robusta
19    Robusta
20    Robusta
21    Robusta
22    Robusta
23    Robusta
24    Robusta
25    Robusta
26    Robusta
27    Robusta
28    Robusta
Name: Species, dtype: object
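
A short sketch of what comes back when picking columns, again on a made-up DataFrame (toy_df is hypothetical): a single name in [] gives a Series, while a list of names gives a DataFrame.

```python
import pandas as pd

# hypothetical toy data, not the coffee dataset
toy_df = pd.DataFrame({'species': ['Robusta', 'Robusta'],
                       'country': ['Uganda', 'India'],
                       'altitude': [1488, 3170]})

# one column name in [] gives a Series
one_col = toy_df['country']

# a list of column names in [] gives a DataFrame
two_cols = toy_df[['country', 'altitude']]

print(type(one_col).__name__)
print(type(two_cols).__name__)
```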

Important

We did not do this step in class

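We can pick out rows with loc, which looks rows up by their index label. Our index labels run from 1 to 28, so asking for label 0 raises a KeyError: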

coffee_df.loc[0]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexes/base.py:3803, in Index.get_loc(self, key, method, tolerance)
   3802 try:
-> 3803     return self._engine.get_loc(casted_key)
   3804 except KeyError as err:

File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/_libs/index.pyx:138, in pandas._libs.index.IndexEngine.get_loc()

File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/_libs/index.pyx:165, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:2263, in pandas._libs.hashtable.Int64HashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:2273, in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 0

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[18], line 1
----> 1 coffee_df.loc[0]

File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexing.py:1073, in _LocationIndexer.__getitem__(self, key)
   1070 axis = self.axis or 0
   1072 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1073 return self._getitem_axis(maybe_callable, axis=axis)

File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexing.py:1312, in _LocIndexer._getitem_axis(self, key, axis)
   1310 # fall thru to straight lookup
   1311 self._validate_key(key, axis)
-> 1312 return self._get_label(key, axis=axis)

File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexing.py:1260, in _LocIndexer._get_label(self, label, axis)
   1258 def _get_label(self, label, axis: int):
   1259     # GH#5567 this will fail if the label is not present in the axis.
-> 1260     return self.obj.xs(label, axis=axis)

File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/generic.py:4056, in NDFrame.xs(self, key, axis, level, drop_level)
   4054             new_index = index[loc]
   4055 else:
-> 4056     loc = index.get_loc(key)
   4058     if isinstance(loc, np.ndarray):
   4059         if loc.dtype == np.bool_:

File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key, method, tolerance)
   3803     return self._engine.get_loc(casted_key)
   3804 except KeyError as err:
-> 3805     raise KeyError(key) from err
   3806 except TypeError:
   3807     # If we have a listlike key, _check_indexing_error will raise
   3808     #  InvalidIndexError. Otherwise we fall through and re-raise
   3809     #  the TypeError.
   3810     self._check_indexing_error(key)

KeyError: 0
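
The KeyError above happens because loc looks rows up by index label, and the index here starts at 1. A minimal sketch of the difference between label-based loc and position-based iloc, using a made-up DataFrame (toy_df is hypothetical) whose labels also start at 1:

```python
import pandas as pd

# hypothetical toy data whose index labels start at 1, like coffee_df
toy_df = pd.DataFrame({'country': ['Uganda', 'India', 'Ecuador']},
                      index=[1, 2, 3])

# loc looks up by label: 1 is the first label here
first_by_label = toy_df.loc[1]

# iloc looks up by integer position: 0 is always the first row
first_by_position = toy_df.iloc[0]

print(first_by_label['country'], first_by_position['country'])

# label 0 does not exist in the index, so loc raises KeyError
try:
    toy_df.loc[0]
except KeyError as err:
    print('KeyError:', err)
```

With the coffee data, the same reasoning suggests coffee_df.loc[1] or coffee_df.iloc[0] for the first row.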

3.3. Reading data from websites#

We’ll first read from the course website.

Note

This is our first bit of web scraping! We will do more, but for very structured data it can be this easy.

comm_url = 'https://rhodyprog4ds.github.io/BrownFall22/syllabus/communication.html#'

So far, we’ve read data in from a .csv file with pd.read_csv and created a DataFrame with the constructor pd.DataFrame using a dictionary. Pandas provides many interfaces for reading in data. They’re described on the Pandas IO page.

We can use the pd.read_html function to read from this page. We know that the page has multiple tables, and from the help we know that read_html returns a list of DataFrames.

df_list = pd.read_html(comm_url)

We can also verify what it returns

type(df_list)
list

We can index with [] to pick one item from the list and verify that it is a DataFrame.

type(df_list[0])
pandas.core.frame.DataFrame

3.4. Pythonic Loops#

In Python, for loops do not require a counter (index) variable. A loop has an iterable object and a loop variable.

for loop_variable in iterable_object:
    # loop body

The loop_variable takes on the value of each item in the iterable_object, in order, one per pass through the loop. Writing loops this way makes them more compact and more readable, since they read more like English. For example:

name = 'sarah'
for letter in name:
    print(letter.upper())
S
A
R
A
H

It is best to name variables so that the loop variable makes sense as an item from the iterable. For example, names have letters in them, and an item in df_list makes sense as df.

for df in df_list:
    print(df.shape)
(6, 4)
(6, 4)
(1, 3)
(3, 3)
(2, 3)

3.5. Types Solution#

Warning

I am using bad variable names here (a, b, …) because these are only options for a question and we will not use them again.

a = [char for char in 'abcde']
a
['a', 'b', 'c', 'd', 'e']
type(a)
list
b = {char:i for i, char in enumerate('abcde')}
b
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}
type(b)
dict
c = ('a','b','c','d','e')
c
('a', 'b', 'c', 'd', 'e')
type(c)
tuple
d = 'a b c d e'.split(' ')
d
['a', 'b', 'c', 'd', 'e']
type(d)
list

3.6. Questions After Class#

3.6.1. What is a dictionary in Python?#

A dictionary is a datatype from base Python that stores key, value pairs.

For example

prof_info = {'first':'Sarah', 'last':'Brown', 'title':'Dr.'}
prof_info
{'first': 'Sarah', 'last': 'Brown', 'title': 'Dr.'}

We can use the keys to index in and get the values out

prof_info['title']
'Dr.'

Even though we will mostly use DataFrames, dictionaries and other base Python types are important. Dictionaries are very powerful: their values can even be whole functions. For example, before version 3.10 added the match statement, the Python language did not have a switch case (which can be used for handling many if/else cases), and dictionaries can be used for that instead.
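
A minimal sketch of a dictionary standing in for a switch case. All of the names here (to_upper, to_title, formatters, format_name) are made up for illustration:

```python
def to_upper(text):
    return text.upper()


def to_title(text):
    return text.title()


# hypothetical mapping from an option name to the function that handles it
formatters = {'upper': to_upper, 'title': to_title}


def format_name(text, style):
    # look up the function by key; unknown styles leave the text unchanged
    formatter = formatters.get(style, lambda t: t)
    return formatter(text)


print(format_name('sarah brown', 'title'))
print(format_name('sarah brown', 'upper'))
```

Adding a new case means adding one key to the dictionary rather than another elif branch.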

Further Reading

You can read more about the details of data types in Pandas in the documentation

3.6.2. How to see unique values in a column#

We will get to this soon! We have the first part, picking out a single column to look at. We will see the method for finding its unique values probably on Monday, but maybe on Friday.
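
As a preview only: the notes do not say which method the course will use, but one that pandas provides is the Series method unique, with value_counts as a related option. A sketch on made-up data (toy_df is hypothetical):

```python
import pandas as pd

# hypothetical toy data, not the coffee dataset
toy_df = pd.DataFrame(
    {'country': ['Uganda', 'India', 'India', 'Ecuador', 'Uganda']})

# pick out one column, then ask for its distinct values
unique_countries = toy_df['country'].unique()

# value_counts also reports how often each value appears
counts = toy_df['country'].value_counts()

print(unique_countries)
print(counts)
```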