4. Pandas DataFrames
Today, we’re going to explore DataFrames in greater detail. We’ll continue using that same coffee dataset.
coffee_data_url = 'https://raw.githubusercontent.com/jldbc/coffee-quality-database/master/data/robusta_data_cleaned.csv'
4.1. More about loading libraries
We can import pandas without the alias pd if we want, but then we have to use the full name everywhere:
import pandas
pandas.read_csv(coffee_data_url)
Unnamed: 0 | Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
1 | 2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
2 | 3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
3 | 4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
4 | 5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
5 | 6 | Robusta | andrew hetzel | India | NaN | NaN | (self) | NaN | cafemakers, llc | 3000' | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
6 | 7 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Green | 0 | May 15th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
7 | 8 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 7 | sethuraman estate | 14/1148/2017/18 | kaapi royale | 3140 | ... | Bluish-Green | 0 | October 25th, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3140.0 | 3140.0 | 3140.0 |
8 | 9 | Robusta | nishant gurjer | India | sethuraman estate | RKR | sethuraman estate | 14/1148/2016/17 | kaapi royale | 1000 | ... | Green | 0 | August 17th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
9 | 10 | Robusta | ugacof | Uganda | ishaka | NaN | nsubuga umar | 0 | ugacof ltd | 900-1300 | ... | Green | 6 | August 5th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 900.0 | 1300.0 | 1100.0 |
10 | 11 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1095 | ... | Green | 1 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1095.0 | 1095.0 | 1095.0 |
11 | 12 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | RC AB | sethuraman estate | 14/1148/2016/12 | kaapi royale | 1000 | ... | Green | 0 | August 23rd, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
12 | 13 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
13 | 14 | Robusta | kasozi coffee farmers association | Uganda | kasozi coffee farmers | NaN | NaN | 0 | kasozi coffee farmers association | 1367 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1367.0 | 1367.0 | 1367.0 |
14 | 15 | Robusta | ankole coffee producers coop | Uganda | kyangundu coop society | NaN | ankole coffee producers coop union ltd | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
15 | 16 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
16 | 17 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | ... | Blue-Green | 0 | June 3rd, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
17 | 18 | Robusta | kawacom uganda ltd | Uganda | bushenyi | NaN | kawacom | 0 | kawacom uganda ltd | 1600 | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1600.0 | 1600.0 | 1600.0 |
18 | 19 | Robusta | nitubaasa ltd | Uganda | kigezi coffee farmers association | NaN | nitubaasa | 0 | nitubaasa ltd | 1745 | ... | Green | 2 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1745.0 | 1745.0 | 1745.0 |
19 | 20 | Robusta | mannya coffee project | Uganda | mannya coffee project | NaN | mannya coffee project | 0 | mannya coffee project | 1200 | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1200.0 | 1200.0 |
20 | 21 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Bluish-Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
21 | 22 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | ... | Green | 0 | June 20th, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
22 | 23 | Robusta | andrew hetzel | United States | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 3000' | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
23 | 24 | Robusta | luis robles | Ecuador | robustasa | Lavado 1 | our own lab | NaN | robustasa | NaN | ... | Blue-Green | 1 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
24 | 25 | Robusta | luis robles | Ecuador | robustasa | Lavado 3 | own laboratory | NaN | robustasa | 40 | ... | Blue-Green | 0 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 40.0 | 40.0 | 40.0 |
25 | 26 | Robusta | james moore | United States | fazenda cazengo | NaN | cafe cazengo | NaN | global opportunity fund | 795 meters | ... | NaN | 6 | December 23rd, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 795.0 | 795.0 | 795.0 |
26 | 27 | Robusta | cafe politico | India | NaN | NaN | NaN | 14-1118-2014-0087 | cafe politico | NaN | ... | Green | 1 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
27 | 28 | Robusta | cafe politico | Vietnam | NaN | NaN | NaN | NaN | cafe politico | NaN | ... | None | 9 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 rows × 44 columns
We’ll use pd because that’s the more common convention and so that we can type fewer characters throughout our code:
import pandas as pd
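The alias is only a second name for the same library. A minimal check, assuming pandas is installed (the variable name `same_module` is just for illustration):

```python
import pandas
import pandas as pd

# Both names are bound to the very same module object,
# so pd.read_csv and pandas.read_csv are the same function.
same_module = pd is pandas
same_function = pd.read_csv is pandas.read_csv

print(same_module, same_function)
```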
4.2. Examining DataFrames
df = pd.read_csv(coffee_data_url,index_col=0)
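The index_col=0 argument tells pandas to use the first column of the file as the row labels instead of loading it as data. That is why the earlier load showed an "Unnamed: 0" column and 44 columns, while this one has 43. A small sketch of the difference, using a made-up two-row CSV held in memory (the text in `csv_text` is hypothetical, not the coffee data):

```python
import io

import pandas as pd

# Hypothetical CSV whose first column has no header and holds row labels.
csv_text = ",Species,Owner\n1,Robusta,ugacof\n2,Robusta,luis robles\n"

# Default: pandas invents a 0-based index and keeps the label column as data.
with_default_index = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the first column becomes the index instead of a data column.
with_label_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(with_default_index.columns.tolist())
print(with_label_index.columns.tolist())
print(with_label_index.index.tolist())
```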
We can look at the first 5 rows with head
df.head()
Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | Region | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | sheema south western | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | chikmagalur karnataka indua | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | chikmagalur | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | central | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | luwero central region | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
5 rows × 43 columns
Using help, we can see that head takes one parameter, n, with a default value of 5, which is why we got 5 rows. We can ask for 2 instead:
df.head(2)
Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | Region | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | sheema south western | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | chikmagalur karnataka indua | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
2 rows × 43 columns
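Besides reading the help text, the default can also be read from the method signature. The sketch below uses the standard library's inspect module and a small stand-in table (`toy_df` and its values are made up for illustration):

```python
import inspect

import pandas as pd

# Small stand-in table with more than five rows.
toy_df = pd.DataFrame({'row_number': range(1, 9)})

# Look up the default value of the parameter `n` in head's signature.
default_n = inspect.signature(pd.DataFrame.head).parameters['n'].default

print(default_n)
print(len(toy_df.head()), len(toy_df.head(2)))
```

Calling help(toy_df.head) in a notebook shows the same information in prose form.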
We can look at the last rows with tail
df.tail(3)
Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | Region | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
26 | Robusta | james moore | United States | fazenda cazengo | NaN | cafe cazengo | NaN | global opportunity fund | 795 meters | kwanza norte province, angola | ... | NaN | 6 | December 23rd, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 795.0 | 795.0 | 795.0 |
27 | Robusta | cafe politico | India | NaN | NaN | NaN | 14-1118-2014-0087 | cafe politico | NaN | NaN | ... | Green | 1 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 | Robusta | cafe politico | Vietnam | NaN | NaN | NaN | NaN | cafe politico | NaN | NaN | ... | None | 9 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
3 rows × 43 columns
I told you this was a DataFrame, but we can check with type.
type(df)
pandas.core.frame.DataFrame
We can also examine its parts. It consists of several; first, the column headings:
df.columns
Index(['Species', 'Owner', 'Country.of.Origin', 'Farm.Name', 'Lot.Number',
'Mill', 'ICO.Number', 'Company', 'Altitude', 'Region', 'Producer',
'Number.of.Bags', 'Bag.Weight', 'In.Country.Partner', 'Harvest.Year',
'Grading.Date', 'Owner.1', 'Variety', 'Processing.Method',
'Fragrance...Aroma', 'Flavor', 'Aftertaste', 'Salt...Acid',
'Bitter...Sweet', 'Mouthfeel', 'Uniform.Cup', 'Clean.Cup', 'Balance',
'Cupper.Points', 'Total.Cup.Points', 'Moisture', 'Category.One.Defects',
'Quakers', 'Color', 'Category.Two.Defects', 'Expiration',
'Certification.Body', 'Certification.Address', 'Certification.Contact',
'unit_of_measurement', 'altitude_low_meters', 'altitude_high_meters',
'altitude_mean_meters'],
dtype='object')
These are a special type called Index
type(df.columns)
pandas.core.indexes.base.Index
It also has an index
df.index
Int64Index([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28],
dtype='int64')
and values
df.values
array([['Robusta', 'ankole coffee producers coop', 'Uganda', ..., 1488.0,
1488.0, 1488.0],
['Robusta', 'nishant gurjer', 'India', ..., 3170.0, 3170.0,
3170.0],
['Robusta', 'andrew hetzel', 'India', ..., 1000.0, 1000.0, 1000.0],
...,
['Robusta', 'james moore', 'United States', ..., 795.0, 795.0,
795.0],
['Robusta', 'cafe politico', 'India', ..., nan, nan, nan],
['Robusta', 'cafe politico', 'Vietnam', ..., nan, nan, nan]],
dtype=object)
It also knows its own shape:
df.shape
(28, 43)
We can use built-in functions on our DataFrame too, not just its own methods and attributes.
len(df)
28
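The parts above fit together: the shape is the number of index labels by the number of column labels, and len counts rows. A quick sketch on a small stand-in table (`toy_df` and its values are hypothetical):

```python
import pandas as pd

toy_df = pd.DataFrame({
    'Owner': ['ugacof', 'luis robles', 'james moore'],
    'altitude_mean_meters': [1212.0, 40.0, 795.0],
})

# shape is a (rows, columns) tuple
n_rows, n_columns = toy_df.shape

rows_agree = n_rows == len(toy_df) == len(toy_df.index)
columns_agree = n_columns == len(toy_df.columns)
values_agree = toy_df.values.shape == toy_df.shape

print(toy_df.shape, rows_agree, columns_agree, values_agree)
```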
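Why does len turn green?
It is one of Python’s built-in functions, and the notebook’s syntax highlighting colors built-in names. Strictly speaking it is not a reserved word: reserved words (keywords) such as for and import cannot be used as variable names, while len can be, although doing so would hide the built-in.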
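Python's standard library can report how a name such as len is classified. A minimal check using the keyword and builtins modules (the variable names are just for illustration):

```python
import builtins
import keyword

# Reserved words (keywords) are listed by the keyword module.
len_is_keyword = keyword.iskeyword('len')
for_is_keyword = keyword.iskeyword('for')

# Built-in functions live in the builtins module.
len_is_builtin = hasattr(builtins, 'len')

print(len_is_keyword, for_is_keyword, len_is_builtin)
```

So len is a built-in function rather than a keyword like for.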
4.3. Building a DataFrame programmatically
One way to build a data frame is from a dictionary:
people = {'names':['Sarah','Connor','Kenza'],
'username':['brownsarahm','sudoPsych','kbdlh']}
people
{'names': ['Sarah', 'Connor', 'Kenza'],
'username': ['brownsarahm', 'sudoPsych', 'kbdlh']}
type(people)
dict
people_df = pd.DataFrame(people)
people_df
names | username | |
---|---|---|
0 | Sarah | brownsarahm |
1 | Connor | sudoPsych |
2 | Kenza | kbdlh |
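In the dictionary form, each key becomes a column name and each list becomes that column's values. As a sketch of another option, the same table can be built from a list with one dictionary per row; `people_rows` and `people_df_from_rows` are names made up for this example:

```python
import pandas as pd

people = {'names': ['Sarah', 'Connor', 'Kenza'],
          'username': ['brownsarahm', 'sudoPsych', 'kbdlh']}

# Column-oriented: keys are column names, lists are column values.
people_df = pd.DataFrame(people)

# Row-oriented: one small dictionary per row.
people_rows = [{'names': name, 'username': username}
               for name, username in zip(people['names'], people['username'])]
people_df_from_rows = pd.DataFrame(people_rows)

print(people_df.columns.tolist())
print(people_df.equals(people_df_from_rows))
```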
type(people['names'])
list
type(people)
dict
type({4,5,5})
set
{4,5,5}
{4, 5}
people['names']
['Sarah', 'Connor', 'Kenza']
type(set(people['names']))
set
unique_people = set(people['names'])
type(unique_people)
set
df.columns
Index(['Species', 'Owner', 'Country.of.Origin', 'Farm.Name', 'Lot.Number',
'Mill', 'ICO.Number', 'Company', 'Altitude', 'Region', 'Producer',
'Number.of.Bags', 'Bag.Weight', 'In.Country.Partner', 'Harvest.Year',
'Grading.Date', 'Owner.1', 'Variety', 'Processing.Method',
'Fragrance...Aroma', 'Flavor', 'Aftertaste', 'Salt...Acid',
'Bitter...Sweet', 'Mouthfeel', 'Uniform.Cup', 'Clean.Cup', 'Balance',
'Cupper.Points', 'Total.Cup.Points', 'Moisture', 'Category.One.Defects',
'Quakers', 'Color', 'Category.Two.Defects', 'Expiration',
'Certification.Body', 'Certification.Address', 'Certification.Contact',
'unit_of_measurement', 'altitude_low_meters', 'altitude_high_meters',
'altitude_mean_meters'],
dtype='object')
for col in df.columns:
    print(col.split('.'))
['Species']
['Owner']
['Country', 'of', 'Origin']
['Farm', 'Name']
['Lot', 'Number']
['Mill']
['ICO', 'Number']
['Company']
['Altitude']
['Region']
['Producer']
['Number', 'of', 'Bags']
['Bag', 'Weight']
['In', 'Country', 'Partner']
['Harvest', 'Year']
['Grading', 'Date']
['Owner', '1']
['Variety']
['Processing', 'Method']
['Fragrance', '', '', 'Aroma']
['Flavor']
['Aftertaste']
['Salt', '', '', 'Acid']
['Bitter', '', '', 'Sweet']
['Mouthfeel']
['Uniform', 'Cup']
['Clean', 'Cup']
['Balance']
['Cupper', 'Points']
['Total', 'Cup', 'Points']
['Moisture']
['Category', 'One', 'Defects']
['Quakers']
['Color']
['Category', 'Two', 'Defects']
['Expiration']
['Certification', 'Body']
['Certification', 'Address']
['Certification', 'Contact']
['unit_of_measurement']
['altitude_low_meters']
['altitude_high_meters']
['altitude_mean_meters']
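The empty strings in results like ['Fragrance', '', '', 'Aroma'] appear because split cuts at every '.', and three dots in a row leave nothing between them. A small sketch of that behavior, plus one possible way to drop the empty pieces (the variable names are hypothetical):

```python
column_name = 'Fragrance...Aroma'

# split cuts at each separator, so consecutive dots yield empty strings
parts = column_name.split('.')
print(parts)

# keep only the non-empty pieces
words = [part for part in parts if part]
print(words)
```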
for key, value in people.items():
    print(key, ':', value)
names : ['Sarah', 'Connor', 'Kenza']
username : ['brownsarahm', 'sudoPsych', 'kbdlh']
df['Owner']
1 ankole coffee producers coop
2 nishant gurjer
3 andrew hetzel
4 ugacof
5 katuka development trust ltd
6 andrew hetzel
7 andrew hetzel
8 nishant gurjer
9 nishant gurjer
10 ugacof
11 ugacof
12 nishant gurjer
13 andrew hetzel
14 kasozi coffee farmers association
15 ankole coffee producers coop
16 andrew hetzel
17 andrew hetzel
18 kawacom uganda ltd
19 nitubaasa ltd
20 mannya coffee project
21 andrew hetzel
22 andrew hetzel
23 andrew hetzel
24 luis robles
25 luis robles
26 james moore
27 cafe politico
28 cafe politico
Name: Owner, dtype: object
df.Owner
1 ankole coffee producers coop
2 nishant gurjer
3 andrew hetzel
4 ugacof
5 katuka development trust ltd
6 andrew hetzel
7 andrew hetzel
8 nishant gurjer
9 nishant gurjer
10 ugacof
11 ugacof
12 nishant gurjer
13 andrew hetzel
14 kasozi coffee farmers association
15 ankole coffee producers coop
16 andrew hetzel
17 andrew hetzel
18 kawacom uganda ltd
19 nitubaasa ltd
20 mannya coffee project
21 andrew hetzel
22 andrew hetzel
23 andrew hetzel
24 luis robles
25 luis robles
26 james moore
27 cafe politico
28 cafe politico
Name: Owner, dtype: object
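Both forms above return the same column. The sketch below, on a small stand-in table with made-up values, checks that and shows why the bracket form is the more general one: a column name containing dots, such as Country.of.Origin, cannot be written as an attribute.

```python
import pandas as pd

toy_df = pd.DataFrame({
    'Owner': ['ugacof', 'luis robles'],
    'Country.of.Origin': ['Uganda', 'Ecuador'],
})

# Bracket and attribute access give the same Series for a simple name.
same_column = toy_df['Owner'].equals(toy_df.Owner)

# toy_df.Country.of.Origin would first look for an attribute named
# Country, which does not exist, so only brackets work for this column.
has_country_attribute = hasattr(toy_df, 'Country')
origin = toy_df['Country.of.Origin']

print(same_column, has_country_attribute)
print(origin.tolist())
```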
df
Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | Region | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | sheema south western | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | chikmagalur karnataka indua | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | chikmagalur | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | central | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | luwero central region | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
6 | Robusta | andrew hetzel | India | NaN | NaN | (self) | NaN | cafemakers, llc | 3000' | chikmagalur | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
7 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | chikmagalur | ... | Green | 0 | May 15th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
8 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 7 | sethuraman estate | 14/1148/2017/18 | kaapi royale | 3140 | chikmagalur karnataka india | ... | Bluish-Green | 0 | October 25th, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3140.0 | 3140.0 | 3140.0 |
9 | Robusta | nishant gurjer | India | sethuraman estate | RKR | sethuraman estate | 14/1148/2016/17 | kaapi royale | 1000 | chikmagalur karnataka | ... | Green | 0 | August 17th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
10 | Robusta | ugacof | Uganda | ishaka | NaN | nsubuga umar | 0 | ugacof ltd | 900-1300 | western | ... | Green | 6 | August 5th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 900.0 | 1300.0 | 1100.0 |
11 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1095 | iganga namadrope eastern | ... | Green | 1 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1095.0 | 1095.0 | 1095.0 |
12 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | RC AB | sethuraman estate | 14/1148/2016/12 | kaapi royale | 1000 | chikmagalur karnataka | ... | Green | 0 | August 23rd, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
13 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | chikmagalur | ... | Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
14 | Robusta | kasozi coffee farmers association | Uganda | kasozi coffee farmers | NaN | NaN | 0 | kasozi coffee farmers association | 1367 | eastern | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1367.0 | 1367.0 | 1367.0 |
15 | Robusta | ankole coffee producers coop | Uganda | kyangundu coop society | NaN | ankole coffee producers coop union ltd | 0 | ankole coffee producers coop | 1488 | south western | ... | Green | 2 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
16 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | chikmagalur | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
17 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | chikmagalur | ... | Blue-Green | 0 | June 3rd, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
18 | Robusta | kawacom uganda ltd | Uganda | bushenyi | NaN | kawacom | 0 | kawacom uganda ltd | 1600 | western | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1600.0 | 1600.0 | 1600.0 |
19 | Robusta | nitubaasa ltd | Uganda | kigezi coffee farmers association | NaN | nitubaasa | 0 | nitubaasa ltd | 1745 | western | ... | Green | 2 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1745.0 | 1745.0 | 1745.0 |
20 | Robusta | mannya coffee project | Uganda | mannya coffee project | NaN | mannya coffee project | 0 | mannya coffee project | 1200 | southern | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1200.0 | 1200.0 |
21 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | chikmagalur | ... | Bluish-Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
22 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | chikmagalur | ... | Green | 0 | June 20th, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
23 | Robusta | andrew hetzel | United States | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 3000' | chikmagalur | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
24 | Robusta | luis robles | Ecuador | robustasa | Lavado 1 | our own lab | NaN | robustasa | NaN | san juan, playas | ... | Blue-Green | 1 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
25 | Robusta | luis robles | Ecuador | robustasa | Lavado 3 | own laboratory | NaN | robustasa | 40 | san juan, playas | ... | Blue-Green | 0 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 40.0 | 40.0 | 40.0 |
26 | Robusta | james moore | United States | fazenda cazengo | NaN | cafe cazengo | NaN | global opportunity fund | 795 meters | kwanza norte province, angola | ... | NaN | 6 | December 23rd, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 795.0 | 795.0 | 795.0 |
27 | Robusta | cafe politico | India | NaN | NaN | NaN | 14-1118-2014-0087 | cafe politico | NaN | NaN | ... | Green | 1 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 | Robusta | cafe politico | Vietnam | NaN | NaN | NaN | NaN | cafe politico | NaN | NaN | ... | None | 9 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 rows × 43 columns
Key points:
write three things to remember from today’s class
4.4. Questions After Classroom
many overlapping questions today
4.5. General
How to know which function to use in certain problems or situations
This is knowledge you build up slowly. Sometimes you will have a general idea of what you need but have to look up the specifics. Having domain expertise in the dataset, or a collaborator who does, will also help.
4.6. Clarifying
Is there a way to have a set show the duplicates that get discarded?
No. Casting to a set changes the data type, and a set keeps only one copy of each value, so the information about duplicates is lost.
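If the duplicates matter, they have to be counted before (or instead of) casting. One possible sketch uses Counter from the standard library's collections module; the `owners` list here is made up for illustration:

```python
from collections import Counter

owners = ['ugacof', 'andrew hetzel', 'ugacof',
          'luis robles', 'andrew hetzel', 'ugacof']

# The set keeps one copy of each value and forgets how many there were.
unique_owners = set(owners)

# Counter keeps the counts, so repeated values can still be reported.
counts = Counter(owners)
duplicates = {owner: count for owner, count in counts.items() if count > 1}

print(len(owners), len(unique_owners))
print(duplicates)
```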
Being able to access the code somewhere without asking to scroll would be nice.
I will work on adding most code to Prismia, but if I miss some, always ask.
4.7. Course Admin
When will homeworks be posted/due typically?
Posted Wednesday
Due the following Tuesday
4.8. Questions we’ll answer later
Can you cast a pandas DataFrame into a set?
there are better ways to find unique values and remove duplicates in a dataframe
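As a short preview, and only a sketch on a made-up table, two pandas methods that cover these cases are unique, for the distinct values in one column, and drop_duplicates, for removing repeated rows:

```python
import pandas as pd

toy_df = pd.DataFrame({
    'Owner': ['ugacof', 'luis robles', 'ugacof'],
    'Country': ['Uganda', 'Ecuador', 'Uganda'],
})

# Distinct values of one column, in order of first appearance.
unique_owners = toy_df['Owner'].unique()

# A new DataFrame without repeated rows; toy_df itself is unchanged.
deduplicated = toy_df.drop_duplicates()

print(list(unique_owners))
print(len(toy_df), len(deduplicated))
```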
4.9. Try it yourself
Create variables of three different types with facts about yourself. Use descriptive variable names relative to the contents, not their types.
title = 'dr' #string
office_number = 134 # int
courses_taught = ['Programming for Data Science',
'Machine Learning for Science & Society']
Create a list, again with a descriptive name, and print out the types
about_prof_brown_list = [title, office_number, courses_taught]
# regular for loop
for fact in about_prof_brown_list:
    print(type(fact))
<class 'str'>
<class 'int'>
<class 'list'>
Write a function, type_extractor, that takes a list and a type and returns the item of that type from the list. Test your function on all three items from your list.
Use one type of jupyter help on your function, what does it display? If it doesn’t display anything modify your function so that help will work.
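One possible sketch of such a function follows; it is not the only valid answer. It assumes that "the item of that type" means the first item whose type matches exactly, and that returning None is acceptable when nothing matches. The docstring is what help displays.

```python
def type_extractor(items, target_type):
    """Return the first item in `items` whose type is exactly `target_type`.

    Returns None if no item has that type (an assumption of this sketch).
    """
    for item in items:
        if type(item) is target_type:
            return item
    return None


about_prof_brown_list = ['dr', 134, ['Programming for Data Science']]

for wanted_type in [str, int, list]:
    print(wanted_type, type_extractor(about_prof_brown_list, wanted_type))

# help prints the signature and the docstring written above
help(type_extractor)
```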
Make yourself notes in the most memorable way for you about what a DataFrame is.
Ram Token Opportunity
Contribute possible practice questions to the notes using the suggest an edit button behind the GitHub menu at the top of the page.