4. Pandas DataFrames

Today, we’re going to explore DataFrames in greater detail. We’ll continue using that same coffee dataset.

coffee_data_url = 'https://raw.githubusercontent.com/jldbc/coffee-quality-database/master/data/robusta_data_cleaned.csv'

4.1. More about loading libraries

We can import pandas without the alias pd if we want, but then we have to type the full name, pandas, every time we use it.

import pandas
pandas.read_csv(coffee_data_url)
Unnamed: 0 Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
0 1 Robusta ankole coffee producers coop Uganda kyangundu cooperative society NaN ankole coffee producers 0 ankole coffee producers coop 1488 ... Green 2 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
1 2 Robusta nishant gurjer India sethuraman estate kaapi royale 25 sethuraman estate 14/1148/2017/21 kaapi royale 3170 ... NaN 2 October 31st, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3170.0 3170.0 3170.0
2 3 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
3 4 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1212 ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1212.0 1212.0 1212.0
4 5 Robusta katuka development trust ltd Uganda katikamu capca farmers association NaN katuka development trust 0 katuka development trust ltd 1200-1300 ... Green 3 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1300.0 1250.0
5 6 Robusta andrew hetzel India NaN NaN (self) NaN cafemakers, llc 3000' ... Green 0 February 28th, 2013 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3000.0 3000.0 3000.0
6 7 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m ... Green 0 May 15th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
7 8 Robusta nishant gurjer India sethuraman estate kaapi royale 7 sethuraman estate 14/1148/2017/18 kaapi royale 3140 ... Bluish-Green 0 October 25th, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3140.0 3140.0 3140.0
8 9 Robusta nishant gurjer India sethuraman estate RKR sethuraman estate 14/1148/2016/17 kaapi royale 1000 ... Green 0 August 17th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
9 10 Robusta ugacof Uganda ishaka NaN nsubuga umar 0 ugacof ltd 900-1300 ... Green 6 August 5th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 900.0 1300.0 1100.0
10 11 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1095 ... Green 1 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1095.0 1095.0 1095.0
11 12 Robusta nishant gurjer India sethuraman estate kaapi royale RC AB sethuraman estate 14/1148/2016/12 kaapi royale 1000 ... Green 0 August 23rd, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
12 13 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m ... Green 1 May 19th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
13 14 Robusta kasozi coffee farmers association Uganda kasozi coffee farmers NaN NaN 0 kasozi coffee farmers association 1367 ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1367.0 1367.0 1367.0
14 15 Robusta ankole coffee producers coop Uganda kyangundu coop society NaN ankole coffee producers coop union ltd 0 ankole coffee producers coop 1488 ... Green 2 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
15 16 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
16 17 Robusta andrew hetzel India sethuraman estates NaN sethuraman estates NaN cafemakers, llc 750m ... Blue-Green 0 June 3rd, 2014 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
17 18 Robusta kawacom uganda ltd Uganda bushenyi NaN kawacom 0 kawacom uganda ltd 1600 ... Green 1 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1600.0 1600.0 1600.0
18 19 Robusta nitubaasa ltd Uganda kigezi coffee farmers association NaN nitubaasa 0 nitubaasa ltd 1745 ... Green 2 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1745.0 1745.0 1745.0
19 20 Robusta mannya coffee project Uganda mannya coffee project NaN mannya coffee project 0 mannya coffee project 1200 ... Green 1 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1200.0 1200.0
20 21 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m ... Bluish-Green 1 May 19th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
21 22 Robusta andrew hetzel India sethuraman estates NaN sethuraman estates NaN cafemakers, llc 750m ... Green 0 June 20th, 2014 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
22 23 Robusta andrew hetzel United States sethuraman estates NaN sethuraman estates NaN cafemakers, llc 3000' ... Green 0 February 28th, 2013 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3000.0 3000.0 3000.0
23 24 Robusta luis robles Ecuador robustasa Lavado 1 our own lab NaN robustasa NaN ... Blue-Green 1 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
24 25 Robusta luis robles Ecuador robustasa Lavado 3 own laboratory NaN robustasa 40 ... Blue-Green 0 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 40.0 40.0 40.0
25 26 Robusta james moore United States fazenda cazengo NaN cafe cazengo NaN global opportunity fund 795 meters ... NaN 6 December 23rd, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 795.0 795.0 795.0
26 27 Robusta cafe politico India NaN NaN NaN 14-1118-2014-0087 cafe politico NaN ... Green 1 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
27 28 Robusta cafe politico Vietnam NaN NaN NaN NaN cafe politico NaN ... None 9 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN

28 rows × 44 columns

We’ll use pd because it is the more common convention and it lets us type fewer characters throughout our code.

import pandas as pd
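The alias is only a second name for the same library, not a separate copy. A minimal sketch to confirm that, assuming pandas is installed:

```python
import pandas
import pandas as pd

# both names point at the same module object
print(pd is pandas)
# so every function is reachable under either name
print(pd.read_csv is pandas.read_csv)
```

Both lines print True, so code written with pd and code written with pandas behave identically.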

4.2. Examining DataFrames

df = pd.read_csv(coffee_data_url,index_col=0)

We can look at the first 5 rows with head.

df.head()
Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude Region ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
1 Robusta ankole coffee producers coop Uganda kyangundu cooperative society NaN ankole coffee producers 0 ankole coffee producers coop 1488 sheema south western ... Green 2 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
2 Robusta nishant gurjer India sethuraman estate kaapi royale 25 sethuraman estate 14/1148/2017/21 kaapi royale 3170 chikmagalur karnataka indua ... NaN 2 October 31st, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3170.0 3170.0 3170.0
3 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m chikmagalur ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
4 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1212 central ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1212.0 1212.0 1212.0
5 Robusta katuka development trust ltd Uganda katikamu capca farmers association NaN katuka development trust 0 katuka development trust ltd 1200-1300 luwero central region ... Green 3 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1300.0 1250.0

5 rows × 43 columns

Using help, we can see that head takes one parameter, n, with a default value of 5, which is why we got 5 rows. We can ask for 2 instead:

df.head(2)
Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude Region ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
1 Robusta ankole coffee producers coop Uganda kyangundu cooperative society NaN ankole coffee producers 0 ankole coffee producers coop 1488 sheema south western ... Green 2 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
2 Robusta nishant gurjer India sethuraman estate kaapi royale 25 sethuraman estate 14/1148/2017/21 kaapi royale 3170 chikmagalur karnataka indua ... NaN 2 October 31st, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3170.0 3170.0 3170.0

2 rows × 43 columns

We can look at the last rows with tail, which takes the same kind of argument.

df.tail(3)
Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude Region ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
26 Robusta james moore United States fazenda cazengo NaN cafe cazengo NaN global opportunity fund 795 meters kwanza norte province, angola ... NaN 6 December 23rd, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 795.0 795.0 795.0
27 Robusta cafe politico India NaN NaN NaN 14-1118-2014-0087 cafe politico NaN NaN ... Green 1 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
28 Robusta cafe politico Vietnam NaN NaN NaN NaN cafe politico NaN NaN ... None 9 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN

3 rows × 43 columns
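To see the default argument at work without downloading anything, here is a small sketch on a made-up frame. The name toy_df and its values are invented for this example and are not part of the coffee data.

```python
import pandas as pd

# toy data invented for illustration; not the coffee dataset
toy_df = pd.DataFrame({'sample': range(10),
                       'score': [x * 1.5 for x in range(10)]})

# no argument: head falls back to its default, n=5
default_rows = toy_df.head()
# explicit argument: ask for 2 rows instead
two_rows = toy_df.head(2)
# tail counts from the end of the frame
last_three = toy_df.tail(3)

print(len(default_rows), len(two_rows), len(last_three))
```

This prints 5 2 3, matching the row counts reported under each coffee output above.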

I told you this was a DataFrame, but we can check with type.

type(df)
pandas.core.frame.DataFrame

We can also examine its parts. It consists of several; first, the column headings:

df.columns
Index(['Species', 'Owner', 'Country.of.Origin', 'Farm.Name', 'Lot.Number',
       'Mill', 'ICO.Number', 'Company', 'Altitude', 'Region', 'Producer',
       'Number.of.Bags', 'Bag.Weight', 'In.Country.Partner', 'Harvest.Year',
       'Grading.Date', 'Owner.1', 'Variety', 'Processing.Method',
       'Fragrance...Aroma', 'Flavor', 'Aftertaste', 'Salt...Acid',
       'Bitter...Sweet', 'Mouthfeel', 'Uniform.Cup', 'Clean.Cup', 'Balance',
       'Cupper.Points', 'Total.Cup.Points', 'Moisture', 'Category.One.Defects',
       'Quakers', 'Color', 'Category.Two.Defects', 'Expiration',
       'Certification.Body', 'Certification.Address', 'Certification.Contact',
       'unit_of_measurement', 'altitude_low_meters', 'altitude_high_meters',
       'altitude_mean_meters'],
      dtype='object')

These are a special type called Index.

type(df.columns)
pandas.core.indexes.base.Index

It also has an index, which holds the row labels:

df.index
Int64Index([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
            18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28],
           dtype='int64')

and values:

df.values
array([['Robusta', 'ankole coffee producers coop', 'Uganda', ..., 1488.0,
        1488.0, 1488.0],
       ['Robusta', 'nishant gurjer', 'India', ..., 3170.0, 3170.0,
        3170.0],
       ['Robusta', 'andrew hetzel', 'India', ..., 1000.0, 1000.0, 1000.0],
       ...,
       ['Robusta', 'james moore', 'United States', ..., 795.0, 795.0,
        795.0],
       ['Robusta', 'cafe politico', 'India', ..., nan, nan, nan],
       ['Robusta', 'cafe politico', 'Vietnam', ..., nan, nan, nan]],
      dtype=object)

It also knows its own shape, as (rows, columns):

df.shape
(28, 43)

We can use built-in functions on our DataFrame too, not just its own methods and attributes.

len(df)
28
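The parts fit together in a predictable way. The sketch below shows this on a tiny made-up frame; toy_df and its values are invented for illustration, with the index starting at 1 to mimic the coffee data.

```python
import pandas as pd

# toy data invented for illustration
toy_df = pd.DataFrame({'owner': ['a', 'b', 'c'],
                       'altitude': [1488.0, 3170.0, 1000.0]},
                      index=[1, 2, 3])

print(list(toy_df.columns))   # column headings
print(list(toy_df.index))     # row labels
print(toy_df.values.shape)    # the underlying array has the same shape as the frame

n_rows, n_cols = toy_df.shape
print(n_rows == len(toy_df))  # len counts rows, the first entry of shape
```

So len(df) returning 28 agrees with the first number in df.shape for the coffee data.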

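Why does len turn green? It’s a Python built-in function, and the notebook’s syntax highlighter colors built-in names. Strictly speaking it is not a reserved word; that term covers keywords such as for and if.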
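To check how Python itself classifies a name, we can ask the standard library's keyword and builtins modules. A minimal sketch:

```python
import builtins
import keyword

print(keyword.iskeyword('len'))   # False: len is not a reserved word
print(keyword.iskeyword('for'))   # True: for is a reserved word
print(hasattr(builtins, 'len'))   # True: len is defined in the builtins module
```

The practical difference is that a reserved word can never be used as a variable name, while a built-in name like len can be reassigned (which is rarely a good idea, since it hides the function).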

4.3. Building a Data Frame programmatically

One way to build a data frame is from a dictionary:

people = {'names':['Sarah','Connor','Kenza'],
         'username':['brownsarahm','sudoPsych','kbdlh']}
people
{'names': ['Sarah', 'Connor', 'Kenza'],
 'username': ['brownsarahm', 'sudoPsych', 'kbdlh']}
type(people)
dict
people_df = pd.DataFrame(people)
people_df
names username
0 Sarah brownsarahm
1 Connor sudoPsych
2 Kenza kbdlh
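In the dictionary above, each key becomes a column and each list holds that column's values. As a sketch of an alternative layout, the same table can also be built from a list with one dictionary per row. The names people_rows, rows_df, and cols_df are made up for this example.

```python
import pandas as pd

# dictionary of lists, as in the notes: one key per column
people = {'names': ['Sarah', 'Connor', 'Kenza'],
          'username': ['brownsarahm', 'sudoPsych', 'kbdlh']}
cols_df = pd.DataFrame(people)

# same content, organized as one dictionary per person (one per row)
people_rows = [{'names': 'Sarah', 'username': 'brownsarahm'},
               {'names': 'Connor', 'username': 'sudoPsych'},
               {'names': 'Kenza', 'username': 'kbdlh'}]
rows_df = pd.DataFrame(people_rows)

# compare the two results
print(list(cols_df.columns))
print(rows_df.equals(cols_df))
```

Which layout is more convenient usually depends on how the data arrives: column by column, or record by record.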
type(people['names'])
list
type(people)
dict
type({4,5,5})
set
{4,5,5}
{4, 5}
people['names']
['Sarah', 'Connor', 'Kenza']
type(set(people['names']))
set
unique_people = set(people['names'])
type(unique_people)
set
df.columns
Index(['Species', 'Owner', 'Country.of.Origin', 'Farm.Name', 'Lot.Number',
       'Mill', 'ICO.Number', 'Company', 'Altitude', 'Region', 'Producer',
       'Number.of.Bags', 'Bag.Weight', 'In.Country.Partner', 'Harvest.Year',
       'Grading.Date', 'Owner.1', 'Variety', 'Processing.Method',
       'Fragrance...Aroma', 'Flavor', 'Aftertaste', 'Salt...Acid',
       'Bitter...Sweet', 'Mouthfeel', 'Uniform.Cup', 'Clean.Cup', 'Balance',
       'Cupper.Points', 'Total.Cup.Points', 'Moisture', 'Category.One.Defects',
       'Quakers', 'Color', 'Category.Two.Defects', 'Expiration',
       'Certification.Body', 'Certification.Address', 'Certification.Contact',
       'unit_of_measurement', 'altitude_low_meters', 'altitude_high_meters',
       'altitude_mean_meters'],
      dtype='object')
for col in df.columns:
    print(col.split('.'))
['Species']
['Owner']
['Country', 'of', 'Origin']
['Farm', 'Name']
['Lot', 'Number']
['Mill']
['ICO', 'Number']
['Company']
['Altitude']
['Region']
['Producer']
['Number', 'of', 'Bags']
['Bag', 'Weight']
['In', 'Country', 'Partner']
['Harvest', 'Year']
['Grading', 'Date']
['Owner', '1']
['Variety']
['Processing', 'Method']
['Fragrance', '', '', 'Aroma']
['Flavor']
['Aftertaste']
['Salt', '', '', 'Acid']
['Bitter', '', '', 'Sweet']
['Mouthfeel']
['Uniform', 'Cup']
['Clean', 'Cup']
['Balance']
['Cupper', 'Points']
['Total', 'Cup', 'Points']
['Moisture']
['Category', 'One', 'Defects']
['Quakers']
['Color']
['Category', 'Two', 'Defects']
['Expiration']
['Certification', 'Body']
['Certification', 'Address']
['Certification', 'Contact']
['unit_of_measurement']
['altitude_low_meters']
['altitude_high_meters']
['altitude_mean_meters']
for key,value in people.items():
    print(key,':',value)
names : ['Sarah', 'Connor', 'Kenza']
username : ['brownsarahm', 'sudoPsych', 'kbdlh']
df['Owner']
1          ankole coffee producers coop
2                        nishant gurjer
3                         andrew hetzel
4                                ugacof
5          katuka development trust ltd
6                         andrew hetzel
7                         andrew hetzel
8                        nishant gurjer
9                        nishant gurjer
10                               ugacof
11                               ugacof
12                       nishant gurjer
13                        andrew hetzel
14    kasozi coffee farmers association
15         ankole coffee producers coop
16                        andrew hetzel
17                        andrew hetzel
18                   kawacom uganda ltd
19                        nitubaasa ltd
20                mannya coffee project
21                        andrew hetzel
22                        andrew hetzel
23                        andrew hetzel
24                          luis robles
25                          luis robles
26                          james moore
27                        cafe politico
28                        cafe politico
Name: Owner, dtype: object
df.Owner
1          ankole coffee producers coop
2                        nishant gurjer
3                         andrew hetzel
4                                ugacof
5          katuka development trust ltd
6                         andrew hetzel
7                         andrew hetzel
8                        nishant gurjer
9                        nishant gurjer
10                               ugacof
11                               ugacof
12                       nishant gurjer
13                        andrew hetzel
14    kasozi coffee farmers association
15         ankole coffee producers coop
16                        andrew hetzel
17                        andrew hetzel
18                   kawacom uganda ltd
19                        nitubaasa ltd
20                mannya coffee project
21                        andrew hetzel
22                        andrew hetzel
23                        andrew hetzel
24                          luis robles
25                          luis robles
26                          james moore
27                        cafe politico
28                        cafe politico
Name: Owner, dtype: object
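The two cells above, df['Owner'] and df.Owner, pick out the same column. The attribute spelling only works for simple names, though. A small sketch on a made-up frame that reuses two column names from the coffee data (the values and the name toy_df are invented):

```python
import pandas as pd

# toy frame reusing two column names from the coffee data; values invented
toy_df = pd.DataFrame({'Owner': ['ugacof', 'luis robles'],
                       'Country.of.Origin': ['Uganda', 'Ecuador']})

# both spellings return the same Series for a simple name
same = toy_df['Owner'].equals(toy_df.Owner)

# a name containing dots is only reachable with brackets:
# toy_df.Country.of.Origin would look for an attribute called Country
countries = toy_df['Country.of.Origin']

print(same, list(countries))
```

Bracket access works for every column name, so it is the safer habit when names contain dots or spaces.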
df
Species Owner Country.of.Origin Farm.Name Lot.Number Mill ICO.Number Company Altitude Region ... Color Category.Two.Defects Expiration Certification.Body Certification.Address Certification.Contact unit_of_measurement altitude_low_meters altitude_high_meters altitude_mean_meters
1 Robusta ankole coffee producers coop Uganda kyangundu cooperative society NaN ankole coffee producers 0 ankole coffee producers coop 1488 sheema south western ... Green 2 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
2 Robusta nishant gurjer India sethuraman estate kaapi royale 25 sethuraman estate 14/1148/2017/21 kaapi royale 3170 chikmagalur karnataka indua ... NaN 2 October 31st, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3170.0 3170.0 3170.0
3 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m chikmagalur ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
4 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1212 central ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1212.0 1212.0 1212.0
5 Robusta katuka development trust ltd Uganda katikamu capca farmers association NaN katuka development trust 0 katuka development trust ltd 1200-1300 luwero central region ... Green 3 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1300.0 1250.0
6 Robusta andrew hetzel India NaN NaN (self) NaN cafemakers, llc 3000' chikmagalur ... Green 0 February 28th, 2013 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3000.0 3000.0 3000.0
7 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m chikmagalur ... Green 0 May 15th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
8 Robusta nishant gurjer India sethuraman estate kaapi royale 7 sethuraman estate 14/1148/2017/18 kaapi royale 3140 chikmagalur karnataka india ... Bluish-Green 0 October 25th, 2018 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3140.0 3140.0 3140.0
9 Robusta nishant gurjer India sethuraman estate RKR sethuraman estate 14/1148/2016/17 kaapi royale 1000 chikmagalur karnataka ... Green 0 August 17th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
10 Robusta ugacof Uganda ishaka NaN nsubuga umar 0 ugacof ltd 900-1300 western ... Green 6 August 5th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 900.0 1300.0 1100.0
11 Robusta ugacof Uganda ugacof project area NaN ugacof 0 ugacof ltd 1095 iganga namadrope eastern ... Green 1 June 26th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1095.0 1095.0 1095.0
12 Robusta nishant gurjer India sethuraman estate kaapi royale RC AB sethuraman estate 14/1148/2016/12 kaapi royale 1000 chikmagalur karnataka ... Green 0 August 23rd, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
13 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m chikmagalur ... Green 1 May 19th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
14 Robusta kasozi coffee farmers association Uganda kasozi coffee farmers NaN NaN 0 kasozi coffee farmers association 1367 eastern ... Green 7 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1367.0 1367.0 1367.0
15 Robusta ankole coffee producers coop Uganda kyangundu coop society NaN ankole coffee producers coop union ltd 0 ankole coffee producers coop 1488 south western ... Green 2 July 14th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1488.0 1488.0 1488.0
16 Robusta andrew hetzel India sethuraman estate NaN NaN 0000 sethuraman estate 1000m chikmagalur ... Green 0 April 29th, 2016 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 1000.0 1000.0 1000.0
17 Robusta andrew hetzel India sethuraman estates NaN sethuraman estates NaN cafemakers, llc 750m chikmagalur ... Blue-Green 0 June 3rd, 2014 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
18 Robusta kawacom uganda ltd Uganda bushenyi NaN kawacom 0 kawacom uganda ltd 1600 western ... Green 1 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1600.0 1600.0 1600.0
19 Robusta nitubaasa ltd Uganda kigezi coffee farmers association NaN nitubaasa 0 nitubaasa ltd 1745 western ... Green 2 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1745.0 1745.0 1745.0
20 Robusta mannya coffee project Uganda mannya coffee project NaN mannya coffee project 0 mannya coffee project 1200 southern ... Green 1 June 27th, 2015 Uganda Coffee Development Authority e36d0270932c3b657e96b7b0278dfd85dc0fe743 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m 1200.0 1200.0 1200.0
21 Robusta andrew hetzel India sethuraman estates NaN NaN NaN cafemakers 750m chikmagalur ... Bluish-Green 1 May 19th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
22 Robusta andrew hetzel India sethuraman estates NaN sethuraman estates NaN cafemakers, llc 750m chikmagalur ... Green 0 June 20th, 2014 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 750.0 750.0 750.0
23 Robusta andrew hetzel United States sethuraman estates NaN sethuraman estates NaN cafemakers, llc 3000' chikmagalur ... Green 0 February 28th, 2013 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 3000.0 3000.0 3000.0
24 Robusta luis robles Ecuador robustasa Lavado 1 our own lab NaN robustasa NaN san juan, playas ... Blue-Green 1 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
25 Robusta luis robles Ecuador robustasa Lavado 3 own laboratory NaN robustasa 40 san juan, playas ... Blue-Green 0 January 18th, 2017 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 40.0 40.0 40.0
26 Robusta james moore United States fazenda cazengo NaN cafe cazengo NaN global opportunity fund 795 meters kwanza norte province, angola ... NaN 6 December 23rd, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m 795.0 795.0 795.0
27 Robusta cafe politico India NaN NaN NaN 14-1118-2014-0087 cafe politico NaN NaN ... Green 1 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN
28 Robusta cafe politico Vietnam NaN NaN NaN NaN cafe politico NaN NaN ... None 9 August 25th, 2015 Specialty Coffee Association ff7c18ad303d4b603ac3f8cff7e611ffc735e720 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m NaN NaN NaN

28 rows × 43 columns

Key points:

Write three things to remember from today’s class.

4.4. Questions After Classroom

There were many overlapping questions today.

4.5. General

How do you know which function to use in a given problem or situation?

This is knowledge you build up slowly. Sometimes you have a general idea but have to look up the specifics. Having domain expertise about the dataset, or a collaborator who does, will help you.

4.6. Clarifying

Is there a way to have a set show the duplicates that get discarded?

No. Casting to a set changes the data type and keeps only one copy of each value, so the information about duplicates is lost.
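If you do want to know which values were repeated, one option is to count them before casting. This is a sketch using Counter from the standard library; the extra 'Sarah' is added here only so that there is a duplicate to find.

```python
from collections import Counter

# the names from the notes, with one repeat added for the example
names = ['Sarah', 'Connor', 'Kenza', 'Sarah']

counts = Counter(names)  # maps each value to how many times it appears
duplicates = [name for name, n in counts.items() if n > 1]

print(set(names))   # the set keeps one copy of each name
print(duplicates)   # the names that appeared more than once
```

The set itself never stores this information, so the counting has to happen on the original list.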

Being able to access the code somewhere without asking to scroll would be nice.
  • I will work on adding most code to Prismia, but if I miss some, always ask.

4.7. Course Admin

When will homeworks be posted/due typically?
  • Posted Wednesday

  • Due the following Tuesday

4.8. Questions we’ll answer later

Can you cast a pandas DataFrame into a set?
  • there are better ways to find unique values and remove duplicates in a dataframe
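As a preview, and only a sketch on made-up data, pandas has methods for both tasks. The frame toy_df and its values are invented for illustration; unique, nunique, and drop_duplicates are existing pandas methods.

```python
import pandas as pd

# toy frame with one repeated row; values invented for illustration
toy_df = pd.DataFrame({'Owner': ['ugacof', 'andrew hetzel', 'ugacof'],
                       'Country.of.Origin': ['Uganda', 'India', 'Uganda']})

unique_owners = toy_df['Owner'].unique()   # distinct values in one column
n_owners = toy_df['Owner'].nunique()       # how many distinct values there are
deduplicated = toy_df.drop_duplicates()    # drops rows that repeat an earlier row

print(list(unique_owners), n_owners, len(deduplicated))
```

Unlike a set, these keep the order in which values first appear, and drop_duplicates returns a DataFrame that still has its columns and index.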

4.9. Try it yourself

  • Create variables of three different types with facts about yourself. Use descriptive variable names relative to the contents, not their types.

title = 'dr' #string
office_number = 134 # int
courses_taught = ['Programming for Data Science',
  'Machine Learning for Science & Society']
  • Create a list, again with a descriptive name, and print out the types

about_prof_brown_list = [title, office_number, courses_taught]

# regular for loop
for fact in about_prof_brown_list:
    print(type(fact))
<class 'str'>
<class 'int'>
<class 'list'>
  • Write a function, type_extractor, that takes a list and a type and returns the item of that type from the list.

  • Test your function on all three items from your list.

  • Use one type of Jupyter help on your function. What does it display? If it doesn’t display anything, modify your function so that help will work.

  • Make yourself notes in the most memorable way for you about what a DataFrame is.

Ram Token Opportunity

Contribute possible practice questions to the notes using the suggest an edit button behind the GitHub menu at the top of the page.