Building a New Dataset

I would like to study whether x and y are related, but I couldn’t find a single dataset that covers both. I found a dataset about x that includes w and a dataset about y that includes z. Because I can compute v from w and also from z, v can serve as the shared column that links the two datasets.

Getting the data together

First let’s get all of the data loaded into Python.

Loading and cleaning x

Let’s start by examining the data about x.

# load data x
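A minimal sketch of this loading step, assuming the x data is a CSV file read with pandas. The inline text and the column names `x` and `w` are hypothetical stand-ins for the real file, chosen only so the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file about x; swap in the actual path.
raw_x = io.StringIO("x,w\n1.0,10\n2.5,20\n,30\n")

x_df = pd.read_csv(raw_x)

# First look: size, column types, and missing values per column.
print(x_df.shape)
print(x_df.dtypes)
print(x_df.isna().sum())
```

Checks like these are usually enough to surface the kinds of problems listed next, such as missing values or columns read with an unexpected type.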

It has some problems:

  • problem 1

  • problem 2

  • problem 3

First I’ll fix problem 2, because that will make fixing problems 1 and 3 easier.

The plan for the fix follows this high-level idea:

# comment on step 1

# second step comment

next

# more code

Loading and cleaning y

What I’m going to do

# more code

observation.

next plan

# more code

observation

Making them compatible

Computing v from w

Computing v from z
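As an illustration of making the two tables compatible, here is a sketch that derives the same column v in both. Everything specific is an assumption made for the example: the toy tables, the idea that w is a full date and z is a year-month string, and the choice of v as the calendar year. The real derivation depends on what w, z, and v actually are.

```python
import pandas as pd

# Hypothetical toy tables; the real x and y data would come from the loaded files.
x_df = pd.DataFrame(
    {"x": [1.0, 2.5, 4.0], "w": ["2019-03-15", "2020-07-01", "2021-11-30"]}
)
y_df = pd.DataFrame({"y": [7, 8, 9], "z": ["2020-07", "2021-11", "2022-01"]})

# Assumption: v is the calendar year, which can be derived from either column.
x_df["v"] = pd.to_datetime(x_df["w"], format="%Y-%m-%d").dt.year
y_df["v"] = pd.to_datetime(y_df["z"], format="%Y-%m").dt.year

# Both tables now share a column with the same meaning and the same dtype.
print(x_df[["x", "v"]])
print(y_df[["y", "v"]])
```

The point of the step is that v ends up with the same meaning, units, and type in both tables, so that equal values really do refer to the same thing when the tables are combined.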

Merging

I’ll use this type of merge because …
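To help weigh the options, the sketch below compares two merge types on v using pandas `merge`. The toy tables and their values are hypothetical, and the sketch does not presume which merge type is the right one here.

```python
import pandas as pd

# Hypothetical toy tables that already share the computed column v.
x_df = pd.DataFrame({"v": [2019, 2020, 2021], "x": [1.0, 2.5, 4.0]})
y_df = pd.DataFrame({"v": [2020, 2021, 2022], "y": [7, 8, 9]})

# Inner merge keeps only the values of v present in both tables.
inner = pd.merge(x_df, y_df, on="v", how="inner")

# Outer merge keeps every value of v; indicator=True adds a _merge column
# recording whether each row came from the left table, the right, or both.
outer = pd.merge(x_df, y_df, on="v", how="outer", indicator=True)

print(len(inner), len(outer))
print(outer["_merge"].value_counts())
```

Comparing the row counts and the `_merge` column shows how many rows each merge type would drop or leave partially empty, which is useful evidence when justifying the choice.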