2. Iterables and Pandas Data Frames#
2.1. House Keeping#
2.2. Assignment 1#
You can revise it to fix anything you learned today before I give feedback.
2.3. Closing Jupyter server.#
In the terminal, use Ctrl+C (actually control, not command, on Mac).
It will ask you a question and give options; read and follow them,
or
press Ctrl+C a second time.
A Jupyter server typically runs at localhost:8888, but if you have multiple servers running, the port number counts up from there.
Once I saw a student in office hours working on localhost:8894, asking why their code kept crashing.
Important
Remember to close your jupyter server
2.4. Using Pandas#
We will work with data using a library called pandas. By convention, we import it like:
import pandas as pd
the import keyword is used for loading packages
pandas is the name of the package that is installed
the as keyword allows us to assign an alias (nickname)
pd is the typical alias for pandas
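As a small sketch of what the alias does: pd is simply another name bound to the same module object, so anything reachable through the full name is reachable through the alias. This assumes pandas is installed in the active environment.

```python
import pandas as pd  # load the package and bind it to the alias pd

# the alias refers to the module itself; its real name is still pandas
print(pd.__name__)

# importing under the full name gives the very same module object
import pandas
print(pd is pandas)
```

Both names point at one module, which is why notes and documentation that use pd work unchanged with this import line.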
2.5. Everything is Data#
Data we will see:
tabular data
websites as data
activity logs on websites
images
text
2.6. Why inspection in code?#
Some IDEs give you GUI based tools to inspect objects. We are going to do it programmatically inline with our analyses for two reasons.
(minor, logistical) it helps make for good notes
(most importantly) it helps build habits of data science
In data science, our code will be aiming to tell a story.
If you’re curious about something, try it out and see what happens. We’re going to use a lot of code inspection tools during class. These help you understand what’s going on, and knowing how to get this information programmatically, even when an IDE would give you inspection tools, helps you treat your code as data.
2.7. Everything is an Object#
Let’s examine the type of some variables:
a = 4
b = 'monday'
c = 5.3
d = print
type(a)
int
ints are a base Python type, like they appear in other languages.
Strings are an iterable type (more precisely, a sequence), meaning that they can be indexed into or have their elements iterated over. For a more technical definition, see the official Python glossary entry.
type(b)
str
we can select one element
b[0]
'm'
or multiple, this is called slicing.
b[0:3]
'mon'
negative numbers count from the right.
b[-1]
'y'
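The indexing rules above can be collected in one place. This sketch reuses b from the cells above; the slice with a negative start and the loop at the end are additions that were not run in class, and the loop shows the "iterated over" half of what makes a string iterable.

```python
b = 'monday'

first = b[0]          # one element, counting from 0
first_three = b[0:3]  # slice: start at 0, stop before 3
last = b[-1]          # negative indices count from the right
last_three = b[-3:]   # slices accept negative indices too

print(first, first_three, last, last_three)

# iterating visits each character in order
letters = [ch for ch in b]
print(letters)
```

Note that the stop index of a slice is excluded, so b[0:3] has exactly three characters.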
type(c)
float
a variable can hold a whole function.
type(d)
builtin_function_or_method
functions are also objects like any other type in python
we can use the variable just like the function itself
d('hello')
hello
print(b)
monday
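Because functions are objects, they can also be passed to other functions and stored in containers. A minimal sketch of that idea; shout and apply_twice are hypothetical helpers made up for this illustration, not part of the class notebook.

```python
def shout(text):
    """Return the text in upper case with an exclamation mark."""
    return text.upper() + '!'

def apply_twice(func, value):
    """Call func on value, then call func again on the result."""
    return func(func(value))

d = shout                    # d now refers to the same function object
print(d is shout)
print(d('hello'))
print(apply_twice(d, 'hi'))  # the function is passed in like any value

# functions can sit in a list like any other object
for f in [len, type, d]:
    print(f.__name__, f('monday'))
```

Assigning d = shout does not copy or call the function; it only gives the same object a second name, just as d = print did above.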
2.8. Tabular Data#
Structured data is easier to work with than other data.
We’re going to focus on tabular data for now. At the end of the course, we’ll examine images, which are structured but more complex, and text, which is much less structured.
2.9. Getting familiar with the dataset#
We’re going to use a dataset about coffee quality today.
How was this dataset collected?
reviews added to DB
then scraped
Where did it come from?
Coffee Quality Institute’s trained reviewers.
What format is it provided in?
csv (Comma Separated Values)
What other information is in this repository?
the code to scrape and clean the data
the data before cleaning
It’s important to always know where data came from and how it was collected.
This helps you know what it is useful for and what its limitations are.
Further Reading
An important research article on documenting datasets for machine learning is called Datasheets for Datasets. These researchers also did a follow-up study to better understand how practitioners use datasheets and decide how to use data.
If topics like this are interesting to you, let me know! My research is related to this, and I have a lot of students who complete 310 and then do research in my lab.
2.10. Loading the Coffee Data#
Get the raw URL for the dataset: click the Raw button on the CSV page, then copy the URL.
coffee_data_url = 'https://raw.githubusercontent.com/jldbc/coffee-quality-database/master/data/robusta_data_cleaned.csv'
Warning
This did not work in class, so I downloaded the data and dragged it to the same folder as my notebook
pd.read_csv(coffee_data_url)
| | Unnamed: 0 | Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
1 | 2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
2 | 3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
3 | 4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
4 | 5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
5 | 6 | Robusta | andrew hetzel | India | NaN | NaN | (self) | NaN | cafemakers, llc | 3000' | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
6 | 7 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Green | 0 | May 15th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
7 | 8 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 7 | sethuraman estate | 14/1148/2017/18 | kaapi royale | 3140 | ... | Bluish-Green | 0 | October 25th, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3140.0 | 3140.0 | 3140.0 |
8 | 9 | Robusta | nishant gurjer | India | sethuraman estate | RKR | sethuraman estate | 14/1148/2016/17 | kaapi royale | 1000 | ... | Green | 0 | August 17th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
9 | 10 | Robusta | ugacof | Uganda | ishaka | NaN | nsubuga umar | 0 | ugacof ltd | 900-1300 | ... | Green | 6 | August 5th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 900.0 | 1300.0 | 1100.0 |
10 | 11 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1095 | ... | Green | 1 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1095.0 | 1095.0 | 1095.0 |
11 | 12 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | RC AB | sethuraman estate | 14/1148/2016/12 | kaapi royale | 1000 | ... | Green | 0 | August 23rd, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
12 | 13 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
13 | 14 | Robusta | kasozi coffee farmers association | Uganda | kasozi coffee farmers | NaN | NaN | 0 | kasozi coffee farmers association | 1367 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1367.0 | 1367.0 | 1367.0 |
14 | 15 | Robusta | ankole coffee producers coop | Uganda | kyangundu coop society | NaN | ankole coffee producers coop union ltd | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
15 | 16 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
16 | 17 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | ... | Blue-Green | 0 | June 3rd, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
17 | 18 | Robusta | kawacom uganda ltd | Uganda | bushenyi | NaN | kawacom | 0 | kawacom uganda ltd | 1600 | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1600.0 | 1600.0 | 1600.0 |
18 | 19 | Robusta | nitubaasa ltd | Uganda | kigezi coffee farmers association | NaN | nitubaasa | 0 | nitubaasa ltd | 1745 | ... | Green | 2 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1745.0 | 1745.0 | 1745.0 |
19 | 20 | Robusta | mannya coffee project | Uganda | mannya coffee project | NaN | mannya coffee project | 0 | mannya coffee project | 1200 | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1200.0 | 1200.0 |
20 | 21 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Bluish-Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
21 | 22 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | ... | Green | 0 | June 20th, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
22 | 23 | Robusta | andrew hetzel | United States | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 3000' | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
23 | 24 | Robusta | luis robles | Ecuador | robustasa | Lavado 1 | our own lab | NaN | robustasa | NaN | ... | Blue-Green | 1 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
24 | 25 | Robusta | luis robles | Ecuador | robustasa | Lavado 3 | own laboratory | NaN | robustasa | 40 | ... | Blue-Green | 0 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 40.0 | 40.0 | 40.0 |
25 | 26 | Robusta | james moore | United States | fazenda cazengo | NaN | cafe cazengo | NaN | global opportunity fund | 795 meters | ... | NaN | 6 | December 23rd, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 795.0 | 795.0 | 795.0 |
26 | 27 | Robusta | cafe politico | India | NaN | NaN | NaN | 14-1118-2014-0087 | cafe politico | NaN | ... | Green | 1 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
27 | 28 | Robusta | cafe politico | Vietnam | NaN | NaN | NaN | NaN | cafe politico | NaN | ... | NaN | 9 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 rows × 44 columns
If the file is local, you can load it this way. The parameter of the function is the path to the dataset, which can be relative (like below), absolute (a full address on your computer), or a URL (like above).
pd.read_csv('robusta_data_cleaned.csv')
This read in the data and printed it out because it is the last line in the cell. If we do something else after it, the data is still read in, but not printed out.
In order to use it, we save the output to a variable.
coffee_df = pd.read_csv(coffee_data_url)
We choose this name so that related variables will all use coffee and then have other parts after _ to describe them in terms of type and content. In Python, the typical convention for variables is to use _ to join words, not CamelCase, which is used for classes, like DataFrame.
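To try read_csv without a network connection or a downloaded file, the first argument can also be a file-like object. The sketch below builds a tiny CSV in memory; the three rows are hand-typed for illustration and are not the real coffee file, and tiny_csv and tiny_df are hypothetical names chosen to follow the underscore convention.

```python
import io
import pandas as pd

# a tiny, hand-typed CSV held in a string (NOT the real coffee data)
tiny_csv = """Species,Country.of.Origin,altitude_mean_meters
Robusta,Uganda,1488.0
Robusta,India,1000.0
Robusta,Ecuador,
"""

# io.StringIO wraps the string so read_csv can treat it like a file
tiny_df = pd.read_csv(io.StringIO(tiny_csv))

print(tiny_df.shape)  # (number of rows, number of columns)
print(tiny_df)
```

The empty field at the end of the last row is read in as NaN, the same marker for missing values that appears in the coffee table.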
we can look at it again using the jupyter display
coffee_df
| | Unnamed: 0 | Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
1 | 2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
2 | 3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
3 | 4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
4 | 5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
5 | 6 | Robusta | andrew hetzel | India | NaN | NaN | (self) | NaN | cafemakers, llc | 3000' | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
6 | 7 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Green | 0 | May 15th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
7 | 8 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 7 | sethuraman estate | 14/1148/2017/18 | kaapi royale | 3140 | ... | Bluish-Green | 0 | October 25th, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3140.0 | 3140.0 | 3140.0 |
8 | 9 | Robusta | nishant gurjer | India | sethuraman estate | RKR | sethuraman estate | 14/1148/2016/17 | kaapi royale | 1000 | ... | Green | 0 | August 17th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
9 | 10 | Robusta | ugacof | Uganda | ishaka | NaN | nsubuga umar | 0 | ugacof ltd | 900-1300 | ... | Green | 6 | August 5th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 900.0 | 1300.0 | 1100.0 |
10 | 11 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1095 | ... | Green | 1 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1095.0 | 1095.0 | 1095.0 |
11 | 12 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | RC AB | sethuraman estate | 14/1148/2016/12 | kaapi royale | 1000 | ... | Green | 0 | August 23rd, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
12 | 13 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
13 | 14 | Robusta | kasozi coffee farmers association | Uganda | kasozi coffee farmers | NaN | NaN | 0 | kasozi coffee farmers association | 1367 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1367.0 | 1367.0 | 1367.0 |
14 | 15 | Robusta | ankole coffee producers coop | Uganda | kyangundu coop society | NaN | ankole coffee producers coop union ltd | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
15 | 16 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
16 | 17 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | ... | Blue-Green | 0 | June 3rd, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
17 | 18 | Robusta | kawacom uganda ltd | Uganda | bushenyi | NaN | kawacom | 0 | kawacom uganda ltd | 1600 | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1600.0 | 1600.0 | 1600.0 |
18 | 19 | Robusta | nitubaasa ltd | Uganda | kigezi coffee farmers association | NaN | nitubaasa | 0 | nitubaasa ltd | 1745 | ... | Green | 2 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1745.0 | 1745.0 | 1745.0 |
19 | 20 | Robusta | mannya coffee project | Uganda | mannya coffee project | NaN | mannya coffee project | 0 | mannya coffee project | 1200 | ... | Green | 1 | June 27th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1200.0 | 1200.0 |
20 | 21 | Robusta | andrew hetzel | India | sethuraman estates | NaN | NaN | NaN | cafemakers | 750m | ... | Bluish-Green | 1 | May 19th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
21 | 22 | Robusta | andrew hetzel | India | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 750m | ... | Green | 0 | June 20th, 2014 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 750.0 | 750.0 | 750.0 |
22 | 23 | Robusta | andrew hetzel | United States | sethuraman estates | NaN | sethuraman estates | NaN | cafemakers, llc | 3000' | ... | Green | 0 | February 28th, 2013 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3000.0 | 3000.0 | 3000.0 |
23 | 24 | Robusta | luis robles | Ecuador | robustasa | Lavado 1 | our own lab | NaN | robustasa | NaN | ... | Blue-Green | 1 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
24 | 25 | Robusta | luis robles | Ecuador | robustasa | Lavado 3 | own laboratory | NaN | robustasa | 40 | ... | Blue-Green | 0 | January 18th, 2017 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 40.0 | 40.0 | 40.0 |
25 | 26 | Robusta | james moore | United States | fazenda cazengo | NaN | cafe cazengo | NaN | global opportunity fund | 795 meters | ... | NaN | 6 | December 23rd, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 795.0 | 795.0 | 795.0 |
26 | 27 | Robusta | cafe politico | India | NaN | NaN | NaN | 14-1118-2014-0087 | cafe politico | NaN | ... | Green | 1 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
27 | 28 | Robusta | cafe politico | Vietnam | NaN | NaN | NaN | NaN | cafe politico | NaN | ... | NaN | 9 | August 25th, 2015 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | NaN | NaN | NaN |
28 rows × 44 columns
Next we examine the type
type(coffee_df)
pandas.core.frame.DataFrame
This is a new type provided by the pandas library, called a DataFrame.
We can also examine its parts. It consists of several; first, the column headings:
coffee_df.columns
Index(['Unnamed: 0', 'Species', 'Owner', 'Country.of.Origin', 'Farm.Name',
'Lot.Number', 'Mill', 'ICO.Number', 'Company', 'Altitude', 'Region',
'Producer', 'Number.of.Bags', 'Bag.Weight', 'In.Country.Partner',
'Harvest.Year', 'Grading.Date', 'Owner.1', 'Variety',
'Processing.Method', 'Fragrance...Aroma', 'Flavor', 'Aftertaste',
'Salt...Acid', 'Bitter...Sweet', 'Mouthfeel', 'Uniform.Cup',
'Clean.Cup', 'Balance', 'Cupper.Points', 'Total.Cup.Points', 'Moisture',
'Category.One.Defects', 'Quakers', 'Color', 'Category.Two.Defects',
'Expiration', 'Certification.Body', 'Certification.Address',
'Certification.Contact', 'unit_of_measurement', 'altitude_low_meters',
'altitude_high_meters', 'altitude_mean_meters'],
dtype='object')
These are a special type called Index that is also provided by pandas.
It also tells us that the actual headings are of dtype object. object is used for strings or columns with mixed types.
The dtype is slightly different from base Python types: it is how pandas classifies the values it stores, but it is roughly the same idea as a type.
type(coffee_df.columns)
pandas.core.indexes.base.Index
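To see how a dtype differs from a base Python type, the sketch below builds a small DataFrame by hand (the values are illustrative only, and small_df is a hypothetical name) and compares the type of a column object with the dtype of the values inside it. It also loops over columns, since an Index is iterable just like the string earlier.

```python
import pandas as pd

# small hand-made frame; the values are illustrative only
small_df = pd.DataFrame({
    'Country.of.Origin': ['Uganda', 'India'],
    'altitude_mean_meters': [1488.0, 1000.0],
})

altitude = small_df['altitude_mean_meters']
print(type(altitude))  # the Python type of the column object
print(altitude.dtype)  # how pandas classifies the values inside it

# dtypes lists the dtype of every column at once
print(small_df.dtypes)

# an Index is iterable, so we can loop over the column names
for name in small_df.columns:
    print(name, type(name))
```

The dtype reported for text columns can depend on the installed pandas version, so the printed output may not match the object shown above exactly.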
We can look at the first 5 rows with head
coffee_df.head()
| | Unnamed: 0 | Species | Owner | Country.of.Origin | Farm.Name | Lot.Number | Mill | ICO.Number | Company | Altitude | ... | Color | Category.Two.Defects | Expiration | Certification.Body | Certification.Address | Certification.Contact | unit_of_measurement | altitude_low_meters | altitude_high_meters | altitude_mean_meters |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Robusta | ankole coffee producers coop | Uganda | kyangundu cooperative society | NaN | ankole coffee producers | 0 | ankole coffee producers coop | 1488 | ... | Green | 2 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1488.0 | 1488.0 | 1488.0 |
1 | 2 | Robusta | nishant gurjer | India | sethuraman estate kaapi royale | 25 | sethuraman estate | 14/1148/2017/21 | kaapi royale | 3170 | ... | NaN | 2 | October 31st, 2018 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 3170.0 | 3170.0 | 3170.0 |
2 | 3 | Robusta | andrew hetzel | India | sethuraman estate | NaN | NaN | 0000 | sethuraman estate | 1000m | ... | Green | 0 | April 29th, 2016 | Specialty Coffee Association | ff7c18ad303d4b603ac3f8cff7e611ffc735e720 | 352d0cf7f3e9be14dad7df644ad65efc27605ae2 | m | 1000.0 | 1000.0 | 1000.0 |
3 | 4 | Robusta | ugacof | Uganda | ugacof project area | NaN | ugacof | 0 | ugacof ltd | 1212 | ... | Green | 7 | July 14th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1212.0 | 1212.0 | 1212.0 |
4 | 5 | Robusta | katuka development trust ltd | Uganda | katikamu capca farmers association | NaN | katuka development trust | 0 | katuka development trust ltd | 1200-1300 | ... | Green | 3 | June 26th, 2015 | Uganda Coffee Development Authority | e36d0270932c3b657e96b7b0278dfd85dc0fe743 | 03077a1c6bac60e6f514691634a7f6eb5c85aae8 | m | 1200.0 | 1300.0 | 1250.0 |
5 rows × 44 columns
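head shows five rows unless told otherwise. As a sketch of asking for a different number, the code below uses a small stand-in frame with made-up values so it runs without the coffee file; demo_df is a hypothetical name.

```python
import pandas as pd

# stand-in data: ten rows of made-up values
demo_df = pd.DataFrame({'row_id': range(10),
                        'score': [x * 1.5 for x in range(10)]})

print(len(demo_df.head()))     # default: 5 rows
print(len(demo_df.head(3)))    # first 3 rows
print(len(demo_df.head(n=8)))  # the parameter can also be named

# head needs parentheses because it is callable; columns does not
print(callable(demo_df.head))
print(callable(demo_df.columns))
```

Checking callable() is one programmatic way to tell whether a name on a DataFrame must be followed by parentheses.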
Some notes:
columns is an attribute, something that the DataFrame object stores directly, so we access it as is
head is a method, it does something to the content and can rely on parameters (here n=5 can be changed to show different numbers of rows)
If we forget the () on a method, it looks weird in the output:
coffee_df.head
<bound method NDFrame.head of Unnamed: 0 Species Owner Country.of.Origin \
0 1 Robusta ankole coffee producers coop Uganda
1 2 Robusta nishant gurjer India
2 3 Robusta andrew hetzel India
3 4 Robusta ugacof Uganda
4 5 Robusta katuka development trust ltd Uganda
5 6 Robusta andrew hetzel India
6 7 Robusta andrew hetzel India
7 8 Robusta nishant gurjer India
8 9 Robusta nishant gurjer India
9 10 Robusta ugacof Uganda
10 11 Robusta ugacof Uganda
11 12 Robusta nishant gurjer India
12 13 Robusta andrew hetzel India
13 14 Robusta kasozi coffee farmers association Uganda
14 15 Robusta ankole coffee producers coop Uganda
15 16 Robusta andrew hetzel India
16 17 Robusta andrew hetzel India
17 18 Robusta kawacom uganda ltd Uganda
18 19 Robusta nitubaasa ltd Uganda
19 20 Robusta mannya coffee project Uganda
20 21 Robusta andrew hetzel India
21 22 Robusta andrew hetzel India
22 23 Robusta andrew hetzel United States
23 24 Robusta luis robles Ecuador
24 25 Robusta luis robles Ecuador
25 26 Robusta james moore United States
26 27 Robusta cafe politico India
27 28 Robusta cafe politico Vietnam
Farm.Name Lot.Number \
0 kyangundu cooperative society NaN
1 sethuraman estate kaapi royale 25
2 sethuraman estate NaN
3 ugacof project area NaN
4 katikamu capca farmers association NaN
5 NaN NaN
6 sethuraman estates NaN
7 sethuraman estate kaapi royale 7
8 sethuraman estate RKR
9 ishaka NaN
10 ugacof project area NaN
11 sethuraman estate kaapi royale RC AB
12 sethuraman estates NaN
13 kasozi coffee farmers NaN
14 kyangundu coop society NaN
15 sethuraman estate NaN
16 sethuraman estates NaN
17 bushenyi NaN
18 kigezi coffee farmers association NaN
19 mannya coffee project NaN
20 sethuraman estates NaN
21 sethuraman estates NaN
22 sethuraman estates NaN
23 robustasa Lavado 1
24 robustasa Lavado 3
25 fazenda cazengo NaN
26 NaN NaN
27 NaN NaN
Mill ICO.Number \
0 ankole coffee producers 0
1 sethuraman estate 14/1148/2017/21
2 NaN 0000
3 ugacof 0
4 katuka development trust 0
5 (self) NaN
6 NaN NaN
7 sethuraman estate 14/1148/2017/18
8 sethuraman estate 14/1148/2016/17
9 nsubuga umar 0
10 ugacof 0
11 sethuraman estate 14/1148/2016/12
12 NaN NaN
13 NaN 0
14 ankole coffee producers coop union ltd 0
15 NaN 0000
16 sethuraman estates NaN
17 kawacom 0
18 nitubaasa 0
19 mannya coffee project 0
20 NaN NaN
21 sethuraman estates NaN
22 sethuraman estates NaN
23 our own lab NaN
24 own laboratory NaN
25 cafe cazengo NaN
26 NaN 14-1118-2014-0087
27 NaN NaN
Company Altitude ... Color \
0 ankole coffee producers coop 1488 ... Green
1 kaapi royale 3170 ... NaN
2 sethuraman estate 1000m ... Green
3 ugacof ltd 1212 ... Green
4 katuka development trust ltd 1200-1300 ... Green
5 cafemakers, llc 3000' ... Green
6 cafemakers 750m ... Green
7 kaapi royale 3140 ... Bluish-Green
8 kaapi royale 1000 ... Green
9 ugacof ltd 900-1300 ... Green
10 ugacof ltd 1095 ... Green
11 kaapi royale 1000 ... Green
12 cafemakers 750m ... Green
13 kasozi coffee farmers association 1367 ... Green
14 ankole coffee producers coop 1488 ... Green
15 sethuraman estate 1000m ... Green
16 cafemakers, llc 750m ... Blue-Green
17 kawacom uganda ltd 1600 ... Green
18 nitubaasa ltd 1745 ... Green
19 mannya coffee project 1200 ... Green
20 cafemakers 750m ... Bluish-Green
21 cafemakers, llc 750m ... Green
22 cafemakers, llc 3000' ... Green
23 robustasa NaN ... Blue-Green
24 robustasa 40 ... Blue-Green
25 global opportunity fund 795 meters ... NaN
26 cafe politico NaN ... Green
27 cafe politico NaN ... NaN
Category.Two.Defects Expiration \
0 2 June 26th, 2015
1 2 October 31st, 2018
2 0 April 29th, 2016
3 7 July 14th, 2015
4 3 June 26th, 2015
5 0 February 28th, 2013
6 0 May 15th, 2015
7 0 October 25th, 2018
8 0 August 17th, 2017
9 6 August 5th, 2015
10 1 June 26th, 2015
11 0 August 23rd, 2017
12 1 May 19th, 2015
13 7 July 14th, 2015
14 2 July 14th, 2015
15 0 April 29th, 2016
16 0 June 3rd, 2014
17 1 June 27th, 2015
18 2 June 27th, 2015
19 1 June 27th, 2015
20 1 May 19th, 2015
21 0 June 20th, 2014
22 0 February 28th, 2013
23 1 January 18th, 2017
24 0 January 18th, 2017
25 6 December 23rd, 2015
26 1 August 25th, 2015
27 9 August 25th, 2015
Certification.Body \
0 Uganda Coffee Development Authority
1 Specialty Coffee Association
2 Specialty Coffee Association
3 Uganda Coffee Development Authority
4 Uganda Coffee Development Authority
5 Specialty Coffee Association
6 Specialty Coffee Association
7 Specialty Coffee Association
8 Specialty Coffee Association
9 Uganda Coffee Development Authority
10 Uganda Coffee Development Authority
11 Specialty Coffee Association
12 Specialty Coffee Association
13 Uganda Coffee Development Authority
14 Uganda Coffee Development Authority
15 Specialty Coffee Association
16 Specialty Coffee Association
17 Uganda Coffee Development Authority
18 Uganda Coffee Development Authority
19 Uganda Coffee Development Authority
20 Specialty Coffee Association
21 Specialty Coffee Association
22 Specialty Coffee Association
23 Specialty Coffee Association
24 Specialty Coffee Association
25 Specialty Coffee Association
26 Specialty Coffee Association
27 Specialty Coffee Association
Certification.Address \
0 e36d0270932c3b657e96b7b0278dfd85dc0fe743
1 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
2 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
3 e36d0270932c3b657e96b7b0278dfd85dc0fe743
4 e36d0270932c3b657e96b7b0278dfd85dc0fe743
5 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
6 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
7 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
8 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
9 e36d0270932c3b657e96b7b0278dfd85dc0fe743
10 e36d0270932c3b657e96b7b0278dfd85dc0fe743
11 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
12 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
13 e36d0270932c3b657e96b7b0278dfd85dc0fe743
14 e36d0270932c3b657e96b7b0278dfd85dc0fe743
15 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
16 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
17 e36d0270932c3b657e96b7b0278dfd85dc0fe743
18 e36d0270932c3b657e96b7b0278dfd85dc0fe743
19 e36d0270932c3b657e96b7b0278dfd85dc0fe743
20 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
21 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
22 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
23 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
24 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
25 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
26 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
27 ff7c18ad303d4b603ac3f8cff7e611ffc735e720
Certification.Contact unit_of_measurement \
0 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
1 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
2 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
3 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
4 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
5 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
6 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
7 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
8 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
9 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
10 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
11 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
12 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
13 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
14 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
15 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
16 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
17 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
18 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
19 03077a1c6bac60e6f514691634a7f6eb5c85aae8 m
20 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
21 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
22 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
23 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
24 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
25 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
26 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
27 352d0cf7f3e9be14dad7df644ad65efc27605ae2 m
altitude_low_meters altitude_high_meters altitude_mean_meters
0 1488.0 1488.0 1488.0
1 3170.0 3170.0 3170.0
2 1000.0 1000.0 1000.0
3 1212.0 1212.0 1212.0
4 1200.0 1300.0 1250.0
5 3000.0 3000.0 3000.0
6 750.0 750.0 750.0
7 3140.0 3140.0 3140.0
8 1000.0 1000.0 1000.0
9 900.0 1300.0 1100.0
10 1095.0 1095.0 1095.0
11 1000.0 1000.0 1000.0
12 750.0 750.0 750.0
13 1367.0 1367.0 1367.0
14 1488.0 1488.0 1488.0
15 1000.0 1000.0 1000.0
16 750.0 750.0 750.0
17 1600.0 1600.0 1600.0
18 1745.0 1745.0 1745.0
19 1200.0 1200.0 1200.0
20 750.0 750.0 750.0
21 750.0 750.0 750.0
22 3000.0 3000.0 3000.0
23 NaN NaN NaN
24 40.0 40.0 40.0
25 795.0 795.0 795.0
26 NaN NaN NaN
27 NaN NaN NaN
[28 rows x 44 columns]>
We can see more about why this happens with type.
type(coffee_df.head)
method
Without the parentheses, it is the function object itself (here, a method of the DataFrame); nothing is run.
type(coffee_df.head())
pandas.core.frame.DataFrame
With the parentheses, it runs the function, and type examines what it returns: the DataFrame object.
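The same distinction holds for any method in Python, not only for pandas. Here is a minimal sketch using a plain string so it runs without the coffee data; the variable names are made up for illustration.

```python
# Hypothetical example value, not taken from the coffee data
day = 'monday'

# Without parentheses: we get the method object itself, nothing runs
method_obj = day.upper
print(type(method_obj))
print(callable(method_obj))

# With parentheses: the method runs and we get its return value
result = day.upper()
print(type(result))
print(result)
```

Inspecting with type before and after adding the parentheses is a quick way to check whether you have the function or its result.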
2.11. Assignment 1 tips#
2.11.1. Pythonic Loops#
In Python, we call good style ‘pythonic’. For loops, that means choosing a sensible loop variable. Let’s first make a list object we can iterate over:
coffee_cols_list = list(coffee_df.columns)
coffee_cols_list
['Unnamed: 0',
'Species',
'Owner',
'Country.of.Origin',
'Farm.Name',
'Lot.Number',
'Mill',
'ICO.Number',
'Company',
'Altitude',
'Region',
'Producer',
'Number.of.Bags',
'Bag.Weight',
'In.Country.Partner',
'Harvest.Year',
'Grading.Date',
'Owner.1',
'Variety',
'Processing.Method',
'Fragrance...Aroma',
'Flavor',
'Aftertaste',
'Salt...Acid',
'Bitter...Sweet',
'Mouthfeel',
'Uniform.Cup',
'Clean.Cup',
'Balance',
'Cupper.Points',
'Total.Cup.Points',
'Moisture',
'Category.One.Defects',
'Quakers',
'Color',
'Category.Two.Defects',
'Expiration',
'Certification.Body',
'Certification.Address',
'Certification.Contact',
'unit_of_measurement',
'altitude_low_meters',
'altitude_high_meters',
'altitude_mean_meters']
Now we will write a loop to clean up the . characters in the column names, replacing each one with _.
clean_cols = []
for col in coffee_cols_list:
    clean_cols.append(col.replace('.','_'))
clean_cols
['Unnamed: 0',
'Species',
'Owner',
'Country_of_Origin',
'Farm_Name',
'Lot_Number',
'Mill',
'ICO_Number',
'Company',
'Altitude',
'Region',
'Producer',
'Number_of_Bags',
'Bag_Weight',
'In_Country_Partner',
'Harvest_Year',
'Grading_Date',
'Owner_1',
'Variety',
'Processing_Method',
'Fragrance___Aroma',
'Flavor',
'Aftertaste',
'Salt___Acid',
'Bitter___Sweet',
'Mouthfeel',
'Uniform_Cup',
'Clean_Cup',
'Balance',
'Cupper_Points',
'Total_Cup_Points',
'Moisture',
'Category_One_Defects',
'Quakers',
'Color',
'Category_Two_Defects',
'Expiration',
'Certification_Body',
'Certification_Address',
'Certification_Contact',
'unit_of_measurement',
'altitude_low_meters',
'altitude_high_meters',
'altitude_mean_meters']
This is equivalent to, but easier to read than:
clean_cols = []
for i in range(len(coffee_cols_list)):
    clean_cols.append(coffee_cols_list[i].replace('.','_'))
clean_cols
['Unnamed: 0',
'Species',
'Owner',
'Country_of_Origin',
'Farm_Name',
'Lot_Number',
'Mill',
'ICO_Number',
'Company',
'Altitude',
'Region',
'Producer',
'Number_of_Bags',
'Bag_Weight',
'In_Country_Partner',
'Harvest_Year',
'Grading_Date',
'Owner_1',
'Variety',
'Processing_Method',
'Fragrance___Aroma',
'Flavor',
'Aftertaste',
'Salt___Acid',
'Bitter___Sweet',
'Mouthfeel',
'Uniform_Cup',
'Clean_Cup',
'Balance',
'Cupper_Points',
'Total_Cup_Points',
'Moisture',
'Category_One_Defects',
'Quakers',
'Color',
'Category_Two_Defects',
'Expiration',
'Certification_Body',
'Certification_Address',
'Certification_Contact',
'unit_of_measurement',
'altitude_low_meters',
'altitude_high_meters',
'altitude_mean_meters']
In this version, the loop variable i is a number that we have to use to access what we want, whereas in the first version the loop variable col is the thing we want. Simpler and easier to read is better by definition in Python.
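To see the two styles side by side on something small, here is a sketch that uses a short, made-up list in place of coffee_cols_list. The names cols, clean_direct and clean_indexed are assumptions for illustration only.

```python
# A short, made-up list standing in for coffee_cols_list
cols = ['Farm.Name', 'Lot.Number', 'Mill']

# Pythonic: the loop variable is the item itself
clean_direct = []
for col in cols:
    clean_direct.append(col.replace('.', '_'))

# Index-based: the loop variable is only a position we must look up
clean_indexed = []
for i in range(len(cols)):
    clean_indexed.append(cols[i].replace('.', '_'))

print(clean_direct)
print(clean_direct == clean_indexed)  # True
```

Both loops build the same list; the first one simply has less to read and fewer places to make an indexing mistake.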
To improve on what we did in class without a lot of extra logic, we can replace the ... sequences first and then the single . characters:
clean_cols = []
for col in coffee_cols_list:
    clean_cols.append(col.replace('...','_').replace('.','_'))
clean_cols
['Unnamed: 0',
'Species',
'Owner',
'Country_of_Origin',
'Farm_Name',
'Lot_Number',
'Mill',
'ICO_Number',
'Company',
'Altitude',
'Region',
'Producer',
'Number_of_Bags',
'Bag_Weight',
'In_Country_Partner',
'Harvest_Year',
'Grading_Date',
'Owner_1',
'Variety',
'Processing_Method',
'Fragrance_Aroma',
'Flavor',
'Aftertaste',
'Salt_Acid',
'Bitter_Sweet',
'Mouthfeel',
'Uniform_Cup',
'Clean_Cup',
'Balance',
'Cupper_Points',
'Total_Cup_Points',
'Moisture',
'Category_One_Defects',
'Quakers',
'Color',
'Category_Two_Defects',
'Expiration',
'Certification_Body',
'Certification_Address',
'Certification_Contact',
'unit_of_measurement',
'altitude_low_meters',
'altitude_high_meters',
'altitude_mean_meters']
This shows that we can chain string operations (this will come in handy at other times).
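The order of the chained calls matters. As a small sketch on a single column name from the list above (the variable names are made up for illustration):

```python
# One column name from the coffee data that contains '...'
name = 'Fragrance...Aroma'

# Single dots first: each of the three dots becomes its own underscore
dots_first = name.replace('.', '_')

# Triple dots first: the whole '...' collapses to one underscore,
# then any remaining single dots are replaced
triple_first = name.replace('...', '_').replace('.', '_')

print(dots_first)    # Fragrance___Aroma
print(triple_first)  # Fragrance_Aroma
```

Each replace returns a new string, so the next call in the chain works on the result of the previous one.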
The for loop above is a good form for all for loops in Python, but since it was specifically building a list with append, we could make it more concise with a list comprehension.
clean_cols_alt = [col.replace('...','_').replace('.','_') for col in coffee_cols_list]
These two ways produce the same list:
clean_cols_alt == clean_cols
True
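Here is a minimal sketch of the loop and the list comprehension on a short, made-up list, along with a common pitfall: list.append returns None, so calling it inside a comprehension collects None values instead of the cleaned strings. All names below are assumptions for illustration.

```python
# A short, made-up list standing in for coffee_cols_list
cols = ['Farm.Name', 'Salt...Acid', 'Mill']

# Loop with append
loop_result = []
for col in cols:
    loop_result.append(col.replace('...', '_').replace('.', '_'))

# List comprehension: the expression itself is what gets collected
comp_result = [col.replace('...', '_').replace('.', '_') for col in cols]
print(comp_result == loop_result)  # True

# Pitfall: append returns None, so this collects None for every item
scratch = []
append_result = [scratch.append(col) for col in cols]
print(append_result)  # [None, None, None]
```

In a comprehension, put the value you want in the list before the for; there is no need for append at all.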
2.12. Conditionals Evaluate in order#
Recall that we set this variable:
a
4
If we write conditions that can both be true but want the more specific (larger) one to act, the order matters. In the order below, Python never checks the second condition, because the first is already true.
if a >1:
    print('greater 1')
elif a >2:
    print('greater 2')
greater 1
This one works, because the more specific condition is checked first.
if a >2:
    print('greater 2')
elif a >1:
    print('greater 1')
greater 2
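To check the behavior for more than one value, the two orderings can be wrapped in functions. This is only a sketch; the function names and the 'neither' fallback are made up for illustration.

```python
def describe_wrong_order(a):
    # The broader test comes first, so the second branch is unreachable
    if a > 1:
        return 'greater 1'
    elif a > 2:
        return 'greater 2'
    return 'neither'


def describe_right_order(a):
    # The more specific test comes first, so both branches can be reached
    if a > 2:
        return 'greater 2'
    elif a > 1:
        return 'greater 1'
    return 'neither'


for value in [1, 2, 4]:
    print(value, describe_wrong_order(value), describe_right_order(value))
```

Only the value 4 is treated differently by the two versions, because it is the only one that satisfies both conditions.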
2.13. Questions After Class#
2.13.1. What is the name of the inline iteration/loop again in Python?#
list comprehension
2.13.2. I just want to know more about github In general as to me, although it’s new so it will take some time to get used to, it’s still pretty confusing to me#
I will hold an optional session with a bit more on GitHub. You can also take CSC311 for a lot more detail.
2.13.3. when will we learn about the portfolio?#
After A2 feedback, which will be the first time it makes sense for you to work on it.
2.13.4. How would you attain a level 3 on any given skill?#
There are example ideas in the Portfolio section of the website, but it will make more sense after Assignment 2 and then I’ll spend more time on it in class again.
2.13.5. Im confused on what this “pandas.core.frame.DataFrame” is#
It is the main data type provided by pandas; it represents a table of data. We will keep working with and inspecting them. For a technical description, see the API docs; for a high-level description, see the getting started tutorial.
2.13.6. would like to get feedback on my homework so I can fix any errors I have#
You will get feedback and have a chance to revise later.
2.13.7. Can you use any dataset from github using the raw URL and importing it? Can you use any dataset URL or only github?#
You can use any URL that has a compatible type of data.
2.13.8. if level 1 is determined by attendence and participation, how can I assure I am getting my lesson 1s fulfilled every class#
I am removing the prismia grading this semester, but it seems I missed one reference to that in the syllabus.
2.13.9. If we get an achievement are we gonna see those on github or do we have to keep track of all the achievements we have to see our grade?#
In your feedback, you will get a table with your current standing each time work is assessed.
2.13.10. Can you slow down a little, some times it gets hard to follow along#
I will try a little, but also please either message on prismia or raise your hand if you ever fall behind.