
Python Review

Closing Jupyter server.

In the terminal, press Ctrl+C (the actual Control key, not Command, on a Mac).

It will ask you a question and give options; read and follow them,

or

press Ctrl+C a second time.

A Jupyter server typically runs at localhost:8888, but if you have multiple servers running, the port number increases for each new one.

Once I saw a student in office hours working on localhost:8894 asking why their code kept crashing.

Functions

To define a function in Python we use the def keyword.

def greeting_initial(name):
    '''
    greet the person named
    '''
    print('hello', name)

The docstring is a string defined with ''' that works like a multiline comment and is used by help

help(greeting_initial)
Help on function greeting_initial in module __main__:

greeting_initial(name)
    greet the person named

greeting_initial('sarah')
hello sarah
Solution to Exercise 1

No, that function had only a side effect and it did not return a value.

This is a better function, because it returns a value

def greeting(name):
    '''
    greet the person named
    '''
    return 'hello ' + name

Calling the two functions does not look that different

greeting('sarah')
'hello sarah'
greeting_initial('sarah')
hello sarah

but if we check what they return:

type(greeting('sarah')), type(greeting_initial('sarah'))
hello sarah
(str, NoneType)
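One way to see why the returned value matters is to store each result in a variable. The sketch below redefines the two functions from above so that it runs on its own; the variable names returned, shout, and printed_only are made up for illustration.

```python
def greeting_initial(name):
    '''
    greet the person named (prints, returns nothing)
    '''
    print('hello', name)


def greeting(name):
    '''
    greet the person named (returns the greeting)
    '''
    return 'hello ' + name


# the returned string can be stored and reused later
returned = greeting('sarah')
shout = returned.upper()

# this call prints, but the variable only holds None
printed_only = greeting_initial('sarah')

print(shout)
print(printed_only is None)
```

The value from greeting can be passed on to other code, while the output of greeting_initial only ever appears on the screen.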

Remember, everything is an object, so we can even check properties of the function:

type(greeting)
function

We could look at the docstring manually

greeting.__doc__
'\n greet the person named\n '

Or even see what function we have:

greeting.__name__
'greeting'

Modules

We can also write code in a separate file and then import it

example.py
name = 'sarah'
major = 'ee'

def greeting(name, greet_word= 'hello'):
    '''
    greet the person named
    '''
    return greet_word +' '+ name
import example
example.greeting('sarah')
'hello sarah'
help(example.greeting)
Help on function greeting in module example:

greeting(name, greet_word='hello')
    greet the person named

We can also import only a part of it

from example import name
name
'sarah'
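The greeting function in example.py takes a second parameter, greet_word, with a default value. As a minimal sketch, the function is redefined inline here so that the cell runs without the file; the word 'welcome' and the variable names are only for illustration.

```python
def greeting(name, greet_word='hello'):
    '''
    greet the person named
    '''
    return greet_word + ' ' + name


# leaving greet_word out uses the default value
default_greeting = greeting('sarah')

# passing greet_word by keyword overrides the default
custom_greeting = greeting('sarah', greet_word='welcome')

print(default_greeting)
print(custom_greeting)
```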

Conditionals

Conditionals allow us to control the flow of a program.

The boolean expression below, len(name) < 5, could be used in all sorts of control flow.

if len(name) < 5:
    print('short')
else:
    print('long')
long
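The same boolean expression can be stored in a variable and reused, for example in an elif chain or a one-line conditional expression. A minimal sketch, where the second cutoff of 10 and the labels are made up for illustration:

```python
name = 'sarah'

# store the result of the comparison so it can be reused
is_short = len(name) < 5

# a chain of conditions: only the first true branch runs
if is_short:
    size = 'short'
elif len(name) < 10:
    size = 'medium'
else:
    size = 'long'

# a conditional expression picks one of two values in a single line
label = 'short' if is_short else 'not short'

print(is_short, size, label)
```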

Iterables

Strings are an iterable type, meaning that they can be indexed into, or their elements iterated over. For a more technical definition, see the official Python glossary entry

type(name)
str

we can select one element

name[0]
's'

negative numbers count from the right.

name[-1]
'h'

or multiple, this is called a slice or slicing.

name[1:3]
'ar'

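Notice that in the string sarah, the characters in positions 1, 2, 3 would be ara, but the slice above returned only ar. The slice stops just before its end index, so the a in position 3 is left out: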

name[3]
'a'
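A few more slices can help the start-included, stop-excluded rule sink in. This is only a sketch; the variable names are chosen for illustration.

```python
name = 'sarah'

# the start index is included, the stop index is not
middle = name[1:3]          # positions 1 and 2
through_three = name[1:4]   # positions 1, 2 and 3

# leaving out the start or the stop runs to that end of the string
first_two = name[:2]
last_two = name[-2:]

print(middle, through_three, first_two, last_two)
```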

lists

Lists are defined with []

ages = [25,34,64,24,56,23,45,48]

we can index them as well

a= ages[2]
a
64

We will transform these ages into ranges.

First, let's figure out how to calculate the lower number of the range.

str(int((a-5)/10)*10+5)
'55'

now the upper

str(int((a+5)/10)*10+4)
'64'
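It may help to break the lower-bound expression into its individual steps. The intermediate variable names below are not part of the original expression; they are only there to show each stage for a = 64.

```python
a = 64

shifted = a - 5               # 59
tens = shifted / 10           # 5.9
whole_tens = int(tens)        # 5, int() drops the fractional part
lower = whole_tens * 10 + 5   # 55

# the upper bound follows the same pattern, shifted the other way
upper = int((a + 5) / 10) * 10 + 4

print(str(lower), str(upper))
```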

We can make a small function to make this more concise. This is called a lambda or anonymous function.

It is a compact function

lb = lambda a: str(int((a-5)/10)*10+5)

It is equivalent to the following:

def lb(a):
    return str(int((a-5)/10)*10+5)

we can use it like any other function

lb(73)
'65'

we will make one for the upper bound as well

ub = lambda a: str(int((a+5)/10)*10+4)

Then we can apply it as a loop with a list comprehension. The comprehension form is more compact than, but equivalent to, a full for loop

age_bins = ['-'.join([lb(a),ub(a)]) for a in ages]
age_bins
['25-34', '25-34', '55-64', '15-24', '55-64', '15-24', '45-54', '45-54']

The above is equivalent to a longer loop.

age_bins = []

for a in ages: 
    age_bins.append('-'.join([lb(a),ub(a)]))

The comprehension has the advantage of building the list in a single expression, without an explicit append call on each pass, and of being visually more compact
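A quick way to convince yourself that the two forms agree is to build both lists and compare them. The names bins_from_comprehension and bins_from_loop are only for this sketch, and lb and ub are redefined so it runs on its own.

```python
lb = lambda a: str(int((a - 5) / 10) * 10 + 5)
ub = lambda a: str(int((a + 5) / 10) * 10 + 4)

ages = [25, 34, 64, 24, 56, 23, 45, 48]

# comprehension form
bins_from_comprehension = ['-'.join([lb(a), ub(a)]) for a in ages]

# explicit loop form
bins_from_loop = []
for a in ages:
    bins_from_loop.append('-'.join([lb(a), ub(a)]))

# both approaches produce the same list
print(bins_from_comprehension == bins_from_loop)
```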

What if we had a longer age list and we wanted to put the labels above only for ages 25-65 and otherwise put <25 or >65?

ages = [25,34,64,24,56,23,45,48,99,76,21, 23,56,37,40]

We can use conditionals to get the true or false value

a = 24
a<25, int(a<25)
(True, 1)

we can use the int function to cast the boolean to an integer

type(a<25), type(int(a<25))
(bool, int)
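Booleans also behave like the integers 1 and 0 in arithmetic, even without the explicit cast. As a small sketch using the longer ages list, summing the comparisons counts how many ages fall on each side of a cutoff; the names n_under_25 and n_over_65 are made up for illustration.

```python
ages = [25, 34, 64, 24, 56, 23, 45, 48, 99, 76, 21, 23, 56, 37, 40]

# bool is a subclass of int, so True counts as 1 and False as 0
print(isinstance(True, int))

# each comparison is True or False; sum adds them up as 1s and 0s
n_under_25 = sum(a < 25 for a in ages)
n_over_65 = sum(a > 65 for a in ages)

print(n_under_25, n_over_65)
```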

If we try different values, we can test that it works as expected

a = 38
a<25, int(a<25)
(False, 0)

and in another case

a>65
False

If we combine the two, it gives us an expression that gives -1,0, or 1 depending on where the value falls.

(a<25)*-1 + int(a>65)
0
age_bin_dict = {-1:lambda a:'<25',0:lambda a: '-'.join([lb(a),ub(a)]),
                1:lambda a:'65+'}
age_bin_dict
{-1: <function __main__.<lambda>(a)>, 0: <function __main__.<lambda>(a)>, 1: <function __main__.<lambda>(a)>}
[age_bin_dict[(a<25)*-1 + int(a>65)](a) for a in ages]
['25-34', '25-34', '55-64', '<25', '55-64', '<25', '45-54', '45-54', '65+', '65+', '<25', '<25', '55-64', '35-44', '35-44']
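For comparison, the same labeling can be written with plain conditionals. This is a sketch of an alternative, not the approach used above; the function name age_bin is made up here, and lb and ub are redefined so the cell runs on its own.

```python
def lb(a):
    return str(int((a - 5) / 10) * 10 + 5)


def ub(a):
    return str(int((a + 5) / 10) * 10 + 4)


def age_bin(a):
    '''
    label an age with its bin, using if/elif/else
    '''
    if a < 25:
        return '<25'
    elif a > 65:
        return '65+'
    else:
        return '-'.join([lb(a), ub(a)])


ages = [25, 34, 64, 24, 56, 23, 45, 48, 99, 76, 21, 23, 56, 37, 40]
labels = [age_bin(a) for a in ages]
print(labels)
```

The if/elif version is easier to read top to bottom, while the dictionary version keeps the cases in an object that can be passed around.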

Questions

How do I add words into the table of contents?

Create a heading in a markdown cell.

How else can lambdas be used?

They can be used for anything, including with any data type, as long as it can be defined simply.

Best practice is to only use them for small things.

For example if we had a list of short phrases and wanted to turn them all to camel case

phrases = ['hello friend', 'data frame', 'prismia chat']
camel_case = lambda s: ''.join([si.title() for si in s.split()])

[camel_case(p) for p in phrases]
['HelloFriend', 'DataFrame', 'PrismiaChat']
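Another common place for a small lambda is the key argument of built-in functions like sorted and max, which accept a function that says what to compare. A minimal sketch with the same phrases; the variable names are only for illustration.

```python
phrases = ['hello friend', 'data frame', 'prismia chat']

# sort the phrases by their second word instead of the whole string
by_second_word = sorted(phrases, key=lambda s: s.split()[1])

# pick the phrase whose first word is the longest
longest_first_word = max(phrases, key=lambda s: len(s.split()[0]))

print(by_second_word)
print(longest_first_word)
```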

In Python, isn't it possible to split code across multiple lines for readability's sake?

Yes, for example this cell

age_bin_dict = {-1:lambda a:'<25',
                 0:lambda a: '-'.join([lb(a),ub(a)]),
                 1:lambda a:'65+'}

and this one

age_bin_dict = {-1:lambda a:'<25', 0:lambda a: '-'.join([lb(a),ub(a)]), 1:lambda a:'65+'}

do the same thing, but the first is a lot easier to read.

In what cases should a switch be used instead of a dictionary?

Sometimes you need to pass something like the above into another function. Then it needs to be an object that can be passed, and that is what a dictionary is for.

For example, we will use dictionaries to tell pandas to compute different statistics on different columns of a dataset.

In a regular program flow, a switch-type structure could be better. The dictionary-based case handling was the original Pythonic way to do this, because Python does not have a regular switch. It has match, which was introduced in 3.10 (I misremembered in class, it’s older than I thought). Under the hood it works differently, but the details of the differences are beyond this course.

How can I practice or learn more about writing functions like what we did today?

For example with this piece:

['-'.join([lb(a), ub(a)]) for a in ages]
['25-34', '25-34', '55-64', '15-24', '55-64', '15-24', '45-54', '45-54', '95-104', '75-84', '15-24', '15-24', '55-64', '35-44', '35-44']

One way to understand it better is to run the separate parts of it one at a time.

So, for example by removing the '-'.join() we get a list of lists instead

[[lb(a), ub(a)] for a in ages]
[['25', '34'], ['25', '34'], ['55', '64'], ['15', '24'], ['55', '64'], ['15', '24'], ['45', '54'], ['45', '54'], ['95', '104'], ['75', '84'], ['15', '24'], ['15', '24'], ['55', '64'], ['35', '44'], ['35', '44']]

We can make a plain loop too:

[a for a in ages]
[25, 34, 64, 24, 56, 23, 45, 48, 99, 76, 21, 23, 56, 37, 40]
[a for a in ages] == ages
True

What would be the best way to get more comfortable writing in Jupyter?

Practice!

You can follow along in class and you can go back after class and add more detail to your own notes.

Solutions