3. Data Frames and other iterables
Today, we’re going to explore DataFrames in greater detail. We’ll continue using that same coffee dataset.
import pandas as pd
coffee_data_url = 'https://raw.githubusercontent.com/jldbc/coffee-quality-database/master/data/robusta_data_cleaned.csv'
coffee_df = pd.read_csv(coffee_data_url)
Important
One reason to use Jupyter is that it formats output to be more readable. Compare how the DataFrame looks when Jupyter displays it and when we print it.
Jupyter uses the object’s _repr_html_ method if it exists (for a DataFrame, this renders an HTML table, much as to_html does), whereas the print function converts the object to a string.
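To make the difference concrete, here is a minimal sketch of the two representations. The tiny demo_df below is made up for illustration (it is not the coffee data), and calling _repr_html_ directly is only a way to peek at what a notebook front end would receive.

```python
import pandas as pd

# A tiny, made-up DataFrame used only for illustration (not the coffee data).
demo_df = pd.DataFrame({"Species": ["Robusta", "Robusta"],
                        "Country": ["Uganda", "India"]})

# The rich representation a notebook front end asks for: an HTML string.
html_view = demo_df._repr_html_()

# The plain-text representation that print() falls back on.
text_view = str(demo_df)

print("<table" in html_view)   # the HTML form contains a table element
print(text_view)               # the text form is just aligned characters
```

Outside a notebook there is nothing to render the HTML, which is why a plain Python session always shows the text form.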
coffee_df
| | Unnamed: 0 | Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
1 | 2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
2 | 3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
3 | 4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
4 | 5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
5 | 6 | Robusta | andrew hetzel | India | NaN | NaN | (self) | NaN | cafemakers, llc | 3000' | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
6 | 7 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Green | 0 | May 15th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
7 | 8 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 7 | sethuraman estate | 14/1148/2017/18 | kaapi royale | 3140 | ... | Bluish-Green | 0 | October 25th, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3140.0 | 3140.0 | 3140.0 |
8 | 9 | Robusta | nishant gurjer | India | sethuraman estate | RKR | sethuraman estate | 14/1148/2016/17 | kaapi royale | 1000 | ... | Green | 0 | August 17th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
9 | 10 | Robusta | ugacof | Uganda | ishaka | NaN | nsubuga umar | 0 | ugacof ltd | 900-1300 | ... | Green | 6 | August 5th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 900.0 | 1300.0 | 1100.0 |
10 | 11 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1095 | ... | Green | 1 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1095.0 | 1095.0 | 1095.0 |
11 | 12 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | RC AB | sethuraman estate | 14/1148/2016/12 | kaapi royale | 1000 | ... | Green | 0 | August 23rd, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
12 | 13 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
13 | 14 | Robusta | kasozi coffee farmers association | Uganda | kasozi coffee farmers | NaN | NaN | 0 | kasozi coffee farmers association | 1367 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1367.0 | 1367.0 | 1367.0 |
14 | 15 | Robusta | ankole coffee producers coop | Uganda | kyangundu coop society | NaN | ankole coffee producers coop union ltd | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
15 | 16 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
16 | 17 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | ... | Blue-Green | 0 | June 3rd, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
17 | 18 | Robusta | kawacom uganda ltd | Uganda | bushenyi | NaN | kawacom | 0 | kawacom uganda ltd | 1600 | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1600.0 | 1600.0 | 1600.0 |
18 | 19 | Robusta | nitubaasa ltd | Uganda | kigezi coffee farmers association | NaN | nitubaasa | 0 | nitubaasa ltd | 1745 | ... | Green | 2 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1745.0 | 1745.0 | 1745.0 |
19 | 20 | Robusta | mannya coffee project | Uganda | mannya coffee project | NaN | mannya coffee project | 0 | mannya coffee project | 1200 | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1200.0 | 1200.0 |
20 | 21 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Bluish-Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
21 | 22 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | ... | Green | 0 | June 20th, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
22 | 23 | Robusta | andrew hetzel | United States | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 3000' | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
23 | 24 | Robusta | luis robles | Ecuador | robustasa | Lavado 1 | our own lab | NaN | robustasa | NaN | ... | Blue-Green | 1 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
24 | 25 | Robusta | luis robles | Ecuador | robustasa | Lavado 3 | own laboratory | NaN | robustasa | 40 | ... | Blue-Green | 0 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 40.0 | 40.0 | 40.0 |
25 | 26 | Robusta | james moore | United States | fazenda cazengo | NaN | cafe cazengo | NaN | global opportunity fund | 795 meters | ... | NaN | 6 | December 23rd, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 795.0 | 795.0 | 795.0 |
26 | 27 | Robusta | cafe politico | India | NaN | NaN | NaN | 14-1118-2014-0087 | cafe politico | NaN | ... | Green | 1 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
27 | 28 | Robusta | cafe politico | Vietnam | NaN | NaN | NaN | NaN | cafe politico | NaN | ... | None | 9 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 rows × 44 columns
print(coffee_df)
Unnamed: 0 Species Owner Country.of.Origin \
0 1 Robusta ankole coffee producers coop Uganda
1 2 Robusta nishant gurjer India
2 3 Robusta andrew hetzel India
3 4 Robusta ugacof Uganda
4 5 Robusta katuka development trust ltd Uganda
5 6 Robusta andrew hetzel India
6 7 Robusta andrew hetzel India
7 8 Robusta nishant gurjer India
8 9 Robusta nishant gurjer India
9 10 Robusta ugacof Uganda
10 11 Robusta ugacof Uganda
11 12 Robusta nishant gurjer India
12 13 Robusta andrew hetzel India
13 14 Robusta kasozi coffee farmers association Uganda
14 15 Robusta ankole coffee producers coop Uganda
15 16 Robusta andrew hetzel India
16 17 Robusta andrew hetzel India
17 18 Robusta kawacom uganda ltd Uganda
18 19 Robusta nitubaasa ltd Uganda
19 20 Robusta mannya coffee project Uganda
20 21 Robusta andrew hetzel India
21 22 Robusta andrew hetzel India
22 23 Robusta andrew hetzel United States
23 24 Robusta luis robles Ecuador
24 25 Robusta luis robles Ecuador
25 26 Robusta james moore United States
26 27 Robusta cafe politico India
27 28 Robusta cafe politico Vietnam
Farm.Name Lot.Number \
0 kyangundu cooperative society NaN
1 sethuraman estate kaapi royale 25
2 sethuraman estate NaN
3 ugacof project area NaN
4 katikamu capca farmers association NaN
5 NaN NaN
6 sethuraman estates NaN
7 sethuraman estate kaapi royale 7
8 sethuraman estate RKR
9 ishaka NaN
10 ugacof project area NaN
11 sethuraman estate kaapi royale RC AB
12 sethuraman estates NaN
13 kasozi coffee farmers NaN
14 kyangundu coop society NaN
15 sethuraman estate NaN
16 sethuraman estates NaN
17 bushenyi NaN
18 kigezi coffee farmers association NaN
19 mannya coffee project NaN
20 sethuraman estates NaN
21 sethuraman estates NaN
22 sethuraman estates NaN
23 robustasa Lavado 1
24 robustasa Lavado 3
25 fazenda cazengo NaN
26 NaN NaN
27 NaN NaN
Mill ICO.Number \
0 ankole coffee producers 0
1 sethuraman estate 14/1148/2017/21
2 NaN 0000
3 ugacof 0
4 katuka development trust 0
5 (self) NaN
6 NaN NaN
7 sethuraman estate 14/1148/2017/18
8 sethuraman estate 14/1148/2016/17
9 nsubuga umar 0
10 ugacof 0
11 sethuraman estate 14/1148/2016/12
12 NaN NaN
13 NaN 0
14 ankole coffee producers coop union ltd 0
15 NaN 0000
16 sethuraman estates NaN
17 kawacom 0
18 nitubaasa 0
19 mannya coffee project 0
20 NaN NaN
21 sethuraman estates NaN
22 sethuraman estates NaN
23 our own lab NaN
24 own laboratory NaN
25 cafe cazengo NaN
26 NaN 14-1118-2014-0087
27 NaN NaN
Company Altitude ... Color \
0 ankole coffee producers coop 1488 ... Green
1 kaapi royale 3170 ... NaN
2 sethuraman estate 1000m ... Green
3 ugacof ltd 1212 ... Green
4 katuka development trust ltd 1200-1300 ... Green
5 cafemakers, llc 3000' ... Green
6 cafemakers 750m ... Green
7 kaapi royale 3140 ... Bluish-Green
8 kaapi royale 1000 ... Green
9 ugacof ltd 900-1300 ... Green
10 ugacof ltd 1095 ... Green
11 kaapi royale 1000 ... Green
12 cafemakers 750m ... Green
13 kasozi coffee farmers association 1367 ... Green
14 ankole coffee producers coop 1488 ... Green
15 sethuraman estate 1000m ... Green
16 cafemakers, llc 750m ... Blue-Green
17 kawacom uganda ltd 1600 ... Green
18 nitubaasa ltd 1745 ... Green
19 mannya coffee project 1200 ... Green
20 cafemakers 750m ... Bluish-Green
21 cafemakers, llc 750m ... Green
22 cafemakers, llc 3000' ... Green
23 robustasa NaN ... Blue-Green
24 robustasa 40 ... Blue-Green
25 global opportunity fund 795 meters ... NaN
26 cafe politico NaN ... Green
27 cafe politico NaN ... None
Category.Two.Defects Expiration \
0 2 June 26th, 2015
1 2 October 31st, 2018
2 0 April 29th, 2016
3 7 July 14th, 2015
4 3 June 26th, 2015
5 0 February 28th, 2013
6 0 May 15th, 2015
7 0 October 25th, 2018
8 0 August 17th, 2017
9 6 August 5th, 2015
10 1 June 26th, 2015
11 0 August 23rd, 2017
12 1 May 19th, 2015
13 7 July 14th, 2015
14 2 July 14th, 2015
15 0 April 29th, 2016
16 0 June 3rd, 2014
17 1 June 27th, 2015
18 2 June 27th, 2015
19 1 June 27th, 2015
20 1 May 19th, 2015
21 0 June 20th, 2014
22 0 February 28th, 2013
23 1 January 18th, 2017
24 0 January 18th, 2017
25 6 December 23rd, 2015
26 1 August 25th, 2015
27 9 August 25th, 2015
Certification.Body \
0 Uganda Coffee Development Authority
1 Specialty Coffee Association
2 Specialty Coffee Association
3 Uganda Coffee Development Authority
4 Uganda Coffee Development Authority
5 Specialty Coffee Association
6 Specialty Coffee Association
7 Specialty Coffee Association
8 Specialty Coffee Association
9 Uganda Coffee Development Authority
10 Uganda Coffee Development Authority
11 Specialty Coffee Association
12 Specialty Coffee Association
13 Uganda Coffee Development Authority
14 Uganda Coffee Development Authority
15 Specialty Coffee Association
16 Specialty Coffee Association
17 Uganda Coffee Development Authority
18 Uganda Coffee Development Authority
19 Uganda Coffee Development Authority
20 Specialty Coffee Association
21 Specialty Coffee Association
22 Specialty Coffee Association
23 Specialty Coffee Association
24 Specialty Coffee Association
25 Specialty Coffee Association
26 Specialty Coffee Association
27 Specialty Coffee Association
Certification.Address \
0 e36d0270932c3b657e96b7b0278dfd85dc0fe743
1 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
2 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
3 e36d0270932c3b657e96b7b0278dfd85dc0fe743
4 e36d0270932c3b657e96b7b0278dfd85dc0fe743
5 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
6 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
7 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
8 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
9 e36d0270932c3b657e96b7b0278dfd85dc0fe743
10 e36d0270932c3b657e96b7b0278dfd85dc0fe743
11 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
12 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
13 e36d0270932c3b657e96b7b0278dfd85dc0fe743
14 e36d0270932c3b657e96b7b0278dfd85dc0fe743
15 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
16 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
17 e36d0270932c3b657e96b7b0278dfd85dc0fe743
18 e36d0270932c3b657e96b7b0278dfd85dc0fe743
19 e36d0270932c3b657e96b7b0278dfd85dc0fe743
20 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
21 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
22 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
23 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
24 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
25 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
26 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
27 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
Certification.Contact unit_of_measurement \
0 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
1 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
2 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
3 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
4 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
5 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
6 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
7 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
8 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
9 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
10 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
11 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
12 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
13 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
14 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
15 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
16 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
17 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
18 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
19 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
20 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
21 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
22 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
23 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
24 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
25 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
26 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
27 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
altitude_low_meters altitude_high_meters altitude_mean_meters
0 1488.0 1488.0 1488.0
1 3170.0 3170.0 3170.0
2 1000.0 1000.0 1000.0
3 1212.0 1212.0 1212.0
4 1200.0 1300.0 1250.0
5 3000.0 3000.0 3000.0
6 750.0 750.0 750.0
7 3140.0 3140.0 3140.0
8 1000.0 1000.0 1000.0
9 900.0 1300.0 1100.0
10 1095.0 1095.0 1095.0
11 1000.0 1000.0 1000.0
12 750.0 750.0 750.0
13 1367.0 1367.0 1367.0
14 1488.0 1488.0 1488.0
15 1000.0 1000.0 1000.0
16 750.0 750.0 750.0
17 1600.0 1600.0 1600.0
18 1745.0 1745.0 1745.0
19 1200.0 1200.0 1200.0
20 750.0 750.0 750.0
21 750.0 750.0 750.0
22 3000.0 3000.0 3000.0
23 NaN NaN NaN
24 40.0 40.0 40.0
25 795.0 795.0 795.0
26 NaN NaN NaN
27 NaN NaN NaN
[28 rows x 44 columns]
3.1. Examining the Structure of a Data Frame
I told you this was a DataFrame, but we can check with type.
type(coffee_df)
pandas.core.frame.DataFrame
We can also see that the DataFrame type comes from the pandas library. Without the library loaded, this type does not exist.
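As a small sketch of the same check in code, we can ask whether an object is a DataFrame with isinstance and look up which module defines its type. The demo_df here is a made-up stand-in, and the exact module string can differ between pandas versions, so we only check that it starts with pandas.

```python
import pandas as pd

# Made-up stand-in for coffee_df, used only for illustration.
demo_df = pd.DataFrame({"Species": ["Robusta"], "Altitude": [1488]})

# isinstance is the usual way to test a type inside a program.
is_frame = isinstance(demo_df, pd.DataFrame)

# The type records which library it was defined in.
module_name = type(demo_df).__module__

print(is_frame)
print(module_name.startswith("pandas"))
```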
We can also examine its parts. It consists of several; first, the column headings:
coffee_df.columns
Index(['Unnamed: 0', 'Species', 'Owner', 'Country.of.Origin', 'Farm.Name',
'Lot.Number', 'Mill', 'ICO.Number', 'Company', 'Altitude', 'Region',
'Producer', 'Number.of.Bags', 'Bag.Weight', 'In.Country.Partner',
'Harvest.Year', 'Grading.Date', 'Owner.1', 'Variety',
'Processing.Method', 'Fragrance...Aroma', 'Flavor', 'Aftertaste',
'Salt...Acid', 'Bitter...Sweet', 'Mouthfeel', 'Uniform.Cup',
'Clean.Cup', 'Balance', 'Cupper.Points', 'Total.Cup.Points', 'Moisture',
'Category.One.Defects', 'Quakers', 'Color', 'Category.Two.Defects',
'Expiration', 'Certification.Body', 'Certification.Address',
'Certification.Contact', 'unit_of_measurement', 'altitude_low_meters',
'altitude_high_meters', 'altitude_mean_meters'],
dtype='object')
The column headings are stored in a special type called Index:
type(coffee_df.columns)
pandas.core.indexes.base.Index
An Index is still iterable, much like a Python list.
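Because the column headings are iterable, the usual list tools work on them. A minimal sketch, using a made-up demo_df that borrows three column names from the coffee data:

```python
import pandas as pd

# Made-up one-row frame; only the column names matter here.
demo_df = pd.DataFrame({"Species": ["Robusta"],
                        "Country.of.Origin": ["Uganda"],
                        "Farm.Name": ["ishaka"]})

# Loop over the column headings, as we would over a list.
for name in demo_df.columns:
    print(name)

# Positional access and len work as well.
first_name = demo_df.columns[0]
n_columns = len(demo_df.columns)

# A list comprehension can filter the headings, here those containing a dot.
dotted_names = [name for name in demo_df.columns if "." in name]
```

Unlike a list, an Index cannot be changed in place; assigning to one of its positions raises an error.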
The DataFrame also stores the data itself, which we can see with values:
coffee_df.values
array([[1, 'Robusta', 'ankole coffee producers coop', ..., 1488.0,
1488.0, 1488.0],
[2, 'Robusta', 'nishant gurjer', ..., 3170.0, 3170.0, 3170.0],
[3, 'Robusta', 'andrew hetzel', ..., 1000.0, 1000.0, 1000.0],
...,
[26, 'Robusta', 'james moore', ..., 795.0, 795.0, 795.0],
[27, 'Robusta', 'cafe politico', ..., nan, nan, nan],
[28, 'Robusta', 'cafe politico', ..., nan, nan, nan]], dtype=object)
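The array above reports dtype=object because the columns mix text and numbers, so NumPy has to fall back on generic Python objects. A small sketch with a made-up two-column frame (demo_df is an assumption, not the coffee data) shows the contrast with a purely numeric selection:

```python
import pandas as pd

# Made-up frame with one text column and one numeric column.
demo_df = pd.DataFrame({"Species": ["Robusta", "Robusta"],
                        "altitude_mean_meters": [1488.0, 3170.0]})

# Mixed column types force a generic object array.
all_values = demo_df.values

# Selecting only the numeric column keeps a numeric dtype.
numeric_values = demo_df[["altitude_mean_meters"]].to_numpy()

print(all_values.dtype)
print(numeric_values.dtype)
```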
It also has an index (visually, the first column). The index is special because it holds the labels we use to select rows of the data.
coffee_df.index
RangeIndex(start=0, stop=28, step=1)
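Here is a minimal sketch of what "indexing the data" with the index looks like, assuming a made-up three-row frame. The loc accessor selects by index label, which for an autogenerated index is just 0, 1, 2, and so on.

```python
import pandas as pd

# Made-up frame with the default, autogenerated index.
demo_df = pd.DataFrame({"Owner": ["ugacof", "luis robles", "james moore"]})

print(demo_df.index)   # RangeIndex(start=0, stop=3, step=1)

# Select a whole row by its index label.
row = demo_df.loc[1]

# Select a single cell by index label and column name.
owner = demo_df.loc[1, "Owner"]
print(owner)
```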
Right now this is an autogenerated index, but we can use the index_col parameter of read_csv to set the index up front, when we load the data.
coffee_df = pd.read_csv(coffee_data_url,index_col=0)
coffee_df
| | Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | Region | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | sheema south western | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | chikmagalur karnataka indua | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | chikmagalur | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | central | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | luwero central region | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
6 | Robusta | andrew hetzel | India | NaN | NaN | (self) | NaN | cafemakers, llc | 3000' | chikmagalur | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
7 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | chikmagalur | ... | Green | 0 | May 15th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
8 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 7 | sethuraman estate | 14/1148/2017/18 | kaapi royale | 3140 | chikmagalur karnataka india | ... | Bluish-Green | 0 | October 25th, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3140.0 | 3140.0 | 3140.0 |
9 | Robusta | nishant gurjer | India | sethuraman estate | RKR | sethuraman estate | 14/1148/2016/17 | kaapi royale | 1000 | chikmagalur karnataka | ... | Green | 0 | August 17th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
10 | Robusta | ugacof | Uganda | ishaka | NaN | nsubuga umar | 0 | ugacof ltd | 900-1300 | western | ... | Green | 6 | August 5th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 900.0 | 1300.0 | 1100.0 |
11 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1095 | iganga namadrope eastern | ... | Green | 1 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1095.0 | 1095.0 | 1095.0 |
12 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | RC AB | sethuraman estate | 14/1148/2016/12 | kaapi royale | 1000 | chikmagalur karnataka | ... | Green | 0 | August 23rd, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
13 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | chikmagalur | ... | Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
14 | Robusta | kasozi coffee farmers association | Uganda | kasozi coffee farmers | NaN | NaN | 0 | kasozi coffee farmers association | 1367 | eastern | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1367.0 | 1367.0 | 1367.0 |
15 | Robusta | ankole coffee producers coop | Uganda | kyangundu coop society | NaN | ankole coffee producers coop union ltd | 0 | ankole coffee producers coop | 1488 | south western | ... | Green | 2 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
16 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | chikmagalur | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
17 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | chikmagalur | ... | Blue-Green | 0 | June 3rd, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
18 | Robusta | kawacom uganda ltd | Uganda | bushenyi | NaN | kawacom | 0 | kawacom uganda ltd | 1600 | western | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1600.0 | 1600.0 | 1600.0 |
19 | Robusta | nitubaasa ltd | Uganda | kigezi coffee farmers association | NaN | nitubaasa | 0 | nitubaasa ltd | 1745 | western | ... | Green | 2 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1745.0 | 1745.0 | 1745.0 |
20 | Robusta | mannya coffee project | Uganda | mannya coffee project | NaN | mannya coffee project | 0 | mannya coffee project | 1200 | southern | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1200.0 | 1200.0 |
21 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | chikmagalur | ... | Bluish-Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
22 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | chikmagalur | ... | Green | 0 | June 20th, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
23 | Robusta | andrew hetzel | United States | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 3000' | chikmagalur | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
24 | Robusta | luis robles | Ecuador | robustasa | Lavado 1 | our own lab | NaN | robustasa | NaN | san juan, playas | ... | Blue-Green | 1 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
25 | Robusta | luis robles | Ecuador | robustasa | Lavado 3 | own laboratory | NaN | robustasa | 40 | san juan, playas | ... | Blue-Green | 0 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 40.0 | 40.0 | 40.0 |
26 | Robusta | james moore | United States | fazenda cazengo | NaN | cafe cazengo | NaN | global opportunity fund | 795 meters | kwanza norte province, angola | ... | NaN | 6 | December 23rd, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 795.0 | 795.0 | 795.0 |
27 | Robusta | cafe politico | India | NaN | NaN | NaN | 14-1118-2014-0087 | cafe politico | NaN | NaN | ... | Green | 1 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 | Robusta | cafe politico | Vietnam | NaN | NaN | NaN | NaN | cafe politico | NaN | NaN | ... | None | 9 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 rows × 43 columns
coffee_df.index
Int64Index([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28],
dtype='int64')
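Now it’s neater: the index comes from the first column of the file, and the redundant Unnamed: 0 column is gone (43 columns instead of 44).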
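The same effect can be reproduced without downloading anything. This sketch reads a made-up, two-row CSV from a string (csv_text is an assumption for illustration) with and without index_col=0:

```python
import io
import pandas as pd

# Made-up CSV whose first column has no header, like the coffee file.
csv_text = ",Species,Owner\n1,Robusta,ugacof\n2,Robusta,luis robles\n"

# Default: pandas generates its own index and keeps the unnamed column.
default_df = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the first column of the file becomes the index.
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_df.columns))
print(list(indexed_df.columns))
print(list(indexed_df.index))
```

With the file's own numbering as the index, indexed_df.loc[2] refers to the row labeled 2 in the file rather than to the third row.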
3.2. Extracting Parts of Data Frames
We can look at the first 5 rows with head
coffee_df.head()
| | Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | Region | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | sheema south western | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | chikmagalur karnataka indua | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | chikmagalur | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | central | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | luwero central region | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
5 rows × 43 columns
We can look at the last 5 rows with tail
coffee_df.tail()
| | Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | Region | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
24 | Robusta | luis robles | Ecuador | robustasa | Lavado 1 | our own lab | NaN | robustasa | NaN | san juan, playas | ... | Blue-Green | 1 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
25 | Robusta | luis robles | Ecuador | robustasa | Lavado 3 | own laboratory | NaN | robustasa | 40 | san juan, playas | ... | Blue-Green | 0 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 40.0 | 40.0 | 40.0 |
26 | Robusta | james moore | United States | fazenda cazengo | NaN | cafe cazengo | NaN | global opportunity fund | 795 meters | kwanza norte province, angola | ... | NaN | 6 | December 23rd, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 795.0 | 795.0 | 795.0 |
27 | Robusta | cafe politico | India | NaN | NaN | NaN | 14-1118-2014-0087 | cafe politico | NaN | NaN | ... | Green | 1 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 | Robusta | cafe politico | Vietnam | NaN | NaN | NaN | NaN | cafe politico | NaN | NaN | ... | None | 9 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
5 rows × 43 columns
The shape of a DataFrame is an attribute, so we access it without parentheses. It is a tuple of (rows, columns).
coffee_df.shape
(28, 43)
len(coffee_df)
28
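To see how shape and len relate, here is a minimal sketch on a small made-up DataFrame. The name toy_df and its values are illustrative assumptions, not part of the coffee data.

```python
import pandas as pd

# a small, made-up DataFrame with 3 rows and 2 columns
toy_df = pd.DataFrame({'country': ['Uganda', 'India', 'Ecuador'],
                       'defects': [2, 0, 1]})

# shape is an attribute (no parentheses): a tuple of (rows, columns)
n_rows, n_cols = toy_df.shape

# len is a function and counts only the rows
print(toy_df.shape, len(toy_df))
```

The first entry of shape always matches len, so either one works when only the number of rows is needed.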
We can pick out columns by name.
coffee_df['Species']
1 Robusta
2 Robusta
3 Robusta
4 Robusta
5 Robusta
6 Robusta
7 Robusta
8 Robusta
9 Robusta
10 Robusta
11 Robusta
12 Robusta
13 Robusta
14 Robusta
15 Robusta
16 Robusta
17 Robusta
18 Robusta
19 Robusta
20 Robusta
21 Robusta
22 Robusta
23 Robusta
24 Robusta
25 Robusta
26 Robusta
27 Robusta
28 Robusta
Name: Species, dtype: object
Important
We did not do this step in class
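We can pick out rows with loc, which selects by index label. The index here starts at 1, so there is no row labeled 0 and this lookup raises a KeyError.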
coffee_df.loc[0]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexes/base.py:3803, in Index.get_loc(self, key, method, tolerance)
3802 try:
-> 3803 return self._engine.get_loc(casted_key)
3804 except KeyError as err:
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/_libs/index.pyx:138, in pandas._libs.index.IndexEngine.get_loc()
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/_libs/index.pyx:165, in pandas._libs.index.IndexEngine.get_loc()
File pandas/_libs/hashtable_class_helper.pxi:2263, in pandas._libs.hashtable.Int64HashTable.get_item()
File pandas/_libs/hashtable_class_helper.pxi:2273, in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
Cell In[18], line 1
----> 1 coffee_df.loc[0]
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexing.py:1073, in _LocationIndexer.__getitem__(self, key)
1070 axis = self.axis or 0
1072 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1073 return self._getitem_axis(maybe_callable, axis=axis)
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexing.py:1312, in _LocIndexer._getitem_axis(self, key, axis)
1310 # fall thru to straight lookup
1311 self._validate_key(key, axis)
-> 1312 return self._get_label(key, axis=axis)
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexing.py:1260, in _LocIndexer._get_label(self, label, axis)
1258 def _get_label(self, label, axis: int):
1259 # GH#5567 this will fail if the label is not present in the axis.
-> 1260 return self.obj.xs(label, axis=axis)
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/generic.py:4056, in NDFrame.xs(self, key, axis, level, drop_level)
4054 new_index = index[loc]
4055 else:
-> 4056 loc = index.get_loc(key)
4058 if isinstance(loc, np.ndarray):
4059 if loc.dtype == np.bool_:
File /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key, method, tolerance)
3803 return self._engine.get_loc(casted_key)
3804 except KeyError as err:
-> 3805 raise KeyError(key) from err
3806 except TypeError:
3807 # If we have a listlike key, _check_indexing_error will raise
3808 # InvalidIndexError. Otherwise we fall through and re-raise
3809 # the TypeError.
3810 self._check_indexing_error(key)
KeyError: 0
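A minimal sketch of the difference between label-based lookup (loc) and position-based lookup (iloc). It uses a small made-up DataFrame (toy_df and its values are illustrative assumptions) whose index starts at 1, like coffee_df.

```python
import pandas as pd

# made-up data whose index labels start at 1, like coffee_df
toy_df = pd.DataFrame({'Species': ['Robusta', 'Robusta', 'Robusta'],
                       'Country.of.Origin': ['Uganda', 'India', 'India']},
                      index=[1, 2, 3])

# loc looks up by label: the first row is labeled 1
first_by_label = toy_df.loc[1]

# iloc looks up by integer position: the first row is at position 0
first_by_position = toy_df.iloc[0]

# both lookups return the same row
print(first_by_label.equals(first_by_position))

# asking loc for a label that does not exist raises a KeyError
try:
    toy_df.loc[0]
except KeyError as err:
    print('KeyError:', err)
```

With coffee_df, the same reasoning suggests coffee_df.loc[1] or coffee_df.iloc[0] for the first row.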
3.3. Reading data from websites#
We’ll first read from the course website.
Note
This is our first bit of web scraping! We will do more, but for very structured data it can be this easy.
comm_url = 'https://rhodyprog4ds.github.io/BrownFall22/syllabus/communication.html#'
So far, we’ve read data in from a .csv file with pd.read_csv
and created a DataFrame with the constructor pd.DataFrame
using a dictionary. Pandas provides many interfaces for reading in data. They’re described on the Pandas IO page.
We can use the read_html function to read from this page. We know that the page has multiple tables, and from the help we know that read_html will return a list of DataFrames.
df_list = pd.read_html(comm_url)
We can also verify what it returns
type(df_list)
list
We can index with []
to pick one item from the list and verify that it is a DataFrame.
type(df_list[0])
pandas.core.frame.DataFrame
3.4. Pythonic Loops#
In Python, for loops do not require an index variable. Instead, a loop has an iterable object and a loop variable.
for loop_variable in iterable_object:
    # loop body
The loop_variable takes on the value of each item in the iterable_object, in order, one per pass through the loop. Writing loops this way makes them more compact and more readable because they read more like English. For example:
name = 'sarah'
for letter in name:
    print(letter.upper())
S
A
R
A
H
It is best to name variables so that the loop variable makes sense as an item from the iterable. For example, names have letters in them, and an item in df_list makes sense as df.
for df in df_list:
    print(df.shape)
(6, 4)
(6, 4)
(1, 3)
(3, 3)
(2, 3)
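As a rough comparison, the sketch below writes the same loop in the index-based style common in other languages and in the Pythonic style. The list colors and the other variable names are made up for illustration.

```python
colors = ['Green', 'Blue-Green', 'Green']

# index-based style: works, but the index i is only used to fetch the item
indexed = []
for i in range(len(colors)):
    indexed.append(colors[i].lower())

# Pythonic style: the loop variable is each item directly
direct = []
for color in colors:
    direct.append(color.lower())

# when the position is also needed, enumerate provides both
numbered = []
for position, color in enumerate(colors):
    numbered.append((position, color))

print(indexed == direct)
print(numbered)
```

Both styles build the same list, but the second says what it does without the reader having to track an index.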
3.5. Types Solution#
Warning
I am using bad variable names here (a, b, …) because these are only options for a question and we will not use them again.
a = [char for char in 'abcde']
a
['a', 'b', 'c', 'd', 'e']
type(a)
list
b = {char:i for i, char in enumerate('abcde')}
b
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}
type(b)
dict
c = ('a','b','c','d','e')
c
('a', 'b', 'c', 'd', 'e')
type(c)
tuple
d = 'a b c d e'.split(' ')
d
['a', 'b', 'c', 'd', 'e']
type(d)
list
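All of these types are iterable, so each can be used in a for loop. The sketch below, with made-up variable names, loops over a list, a dictionary, and a tuple and records the items each one yields.

```python
letters_list = ['a', 'b', 'c']
letters_dict = {'a': 0, 'b': 1, 'c': 2}
letters_tuple = ('a', 'b', 'c')

# record what each collection yields, keyed by the name of its type
seen = {}
for collection in [letters_list, letters_dict, letters_tuple]:
    # iterating over a dict yields its keys, not its values
    seen[type(collection).__name__] = [item for item in collection]

print(seen)
```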
3.6. Questions After Class#
3.6.1. What is a dictionary in Python?#
A dictionary is a data type from base Python that stores key-value pairs.
For example:
prof_info = {'first':'Sarah', 'last':'Brown', 'title':'Dr.'}
prof_info
{'first': 'Sarah', 'last': 'Brown', 'title': 'Dr.'}
We can use the keys to index in and get the values out:
prof_info['title']
'Dr.'
Even though we will mostly use DataFrames, dictionaries and other base Python types are important. Dictionaries are very powerful: their values can be any object, including whole functions. For example, before version 3.10 introduced the match statement, Python had no switch/case construct (which handles many if/else cases), and dictionaries are often used for that purpose instead.
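A minimal sketch of using a dictionary in place of a chain of if/else cases. All function and variable names here are hypothetical and chosen only for illustration.

```python
def describe_green(bean):
    return bean + ' is green'

def describe_blue_green(bean):
    return bean + ' is blue-green'

def describe_unknown(bean):
    return bean + ' has no recorded color'

# keys are the cases; values are the functions themselves
# (no parentheses, so they are stored rather than called)
color_handlers = {'Green': describe_green,
                  'Blue-Green': describe_blue_green}

def describe(bean, color):
    # .get returns the fallback function when the color is not a key
    handler = color_handlers.get(color, describe_unknown)
    return handler(bean)

print(describe('lot 25', 'Green'))
print(describe('lot 28', 'None'))
```

Adding a new case then only requires adding one entry to the dictionary, rather than another elif branch.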
Further Reading
You can read more about the details of data types in Pandas in the documentation.
3.6.2. How to see unique values in a column#
We will get to this soon! We covered the first part, picking out a single column to look at. We will see the method for finding its unique values probably on Monday, but maybe on Friday.