What is this?

I’m livestreaming myself doing data science. You can find out more about why I am livestreaming here, and about the project I am working on here.

What project are you working on?

I’m attempting to import, clean, combine, and analyze dozens of years of FBI Uniform Crime Reports and other data associated with the criminal justice system in the United States.

I’m not doing any “deep learning” or analyzing what you might think of as big data. I’m analyzing moderately sized public datasets using R. I’m not doing any machine learning yet, because that’s really only about 10% of being a data scientist. Right now I’m using the packages in the tidyverse to munge, reshape, clean, and diagnose these administrative record sets.

Who am I?

I’m a former data scientist for the state of Wisconsin who is now an independent statistical programming and research consultant. You can find out more about me on my homepage.

What will I see on the livestream?

Just me, my RStudio window (set to use notebooks), and my intermittent narration of my thought process as I work through a problem.

Screencapture of me streaming

What data science are you doing?

This week I’m mostly working on smoothing out inconsistencies in reporting by agency and year caused by common, known reporting irregularities. This involves writing a lot of logical statements to flag suspect records, fitting some simple models, and smoothing the data out by applying those models.
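To give a rough idea of what that flag, fit, and smooth loop can look like, here is a minimal sketch in base R. Everything in it is invented for illustration: the toy data, the column names, the half-of-median threshold, and the `smooth_agency` helper are assumptions, not the actual rules or models used on the stream.

```r
# Hypothetical toy data: yearly counts for two agencies (all values invented).
crime <- data.frame(
  agency = rep(c("A", "B"), each = 6),
  year   = rep(2000:2005, times = 2),
  count  = c(100, 104, 0, 112, 116, 120,  # agency A: 2002 looks like a missed report
             50, 48, 46, 44, 42, 40)      # agency B: no obvious irregularity
)

# Step 1: a logical rule. Flag a year as suspect when its count falls below
# half of that agency's median count (the threshold is an assumption).
agency_median <- ave(crime$count, crime$agency, FUN = median)
crime$suspect <- crime$count < 0.5 * agency_median

# Steps 2 and 3: within one agency, fit a linear trend on the non-suspect
# years, then replace only the suspect values with the model's predictions.
smooth_agency <- function(d) {
  d$smoothed <- d$count
  if (any(d$suspect) && sum(!d$suspect) >= 2) {
    fit <- lm(count ~ year, data = d[!d$suspect, ])
    d$smoothed[d$suspect] <- predict(fit, newdata = d[d$suspect, ])
  }
  d
}

# Apply the helper agency by agency and stack the results back together.
smoothed <- do.call(rbind, lapply(split(crime, crime$agency), smooth_agency))
rownames(smoothed) <- NULL
print(smoothed)
```

In this toy example only agency A's 2002 row is flagged, and its zero is replaced by a value on the trend line through the other five years. Real reporting irregularities usually need more specific rules and more careful models than a single threshold and a straight line.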