Channel: Kevin Davenport
Browsing all 29 articles

Significance Magazine Contribution

I am excited to have the opportunity to be a regular contributor to Significance Magazine. Below is the first of what I hope will be many contributions. I hope to see contributions from my talented (and...

Quick Look: Facebook’s Kaggle Competition

Following Friday’s news of yhat’s ggplot port (which I hope they promptly rename to avoid search engine conflation with other variants), I thought it’d be fun to explore the large Stack Overflow...

Mahalanobis Distance and Outliers

I wrote a short article on Absolute Deviation Around the Median a few months ago after a conversation with Ryon about robust parameter estimators. I am excited to see a wet lab scientist...

The Cost Function of K-Means

When exploring a novel dataset, I believe most analysts will run through the familiar steps of generating summary statistics and/or plotting distributions and feature interactions. Clustering and PCA...

A Real World Introduction to Information Entropy

I’ve been using the IPython notebook so much that it might finally be time to stand up a Pelican-based site on this server in order to use Jake VanderPlas’ IPython integration method. This post might...

Dynamic Time-Series Modeling

Today’s article will showcase a subset of Pandas’ time-series modeling capabilities. I’ll be using financial data to demonstrate them; however, the functions can be applied to any...

Regularized Logistic Regression Intuition

In this notebook we’ll manually implement regularized logistic regression in order to facilitate intuition about the algorithm’s underlying math and to demonstrate how regularization can address...

The 35-hour Workweek with Python

I was prompted to write this post after reading the NYT’s “In France, New Review of 35-Hour Workweek.” For those not familiar with the 35-hour workweek, France adopted it in February 2000 with the...

PyCon Montreal 2015 and Motivation

I just got back from a fun week in Montreal for PyCon 2015. Due to my work commitments since relocating to Seattle and leaving behind the San Diego Data Science Meetup I organized, I’ve been concerned...

Pure Python Decision Trees

By now we all know what Random Forests are. We know about their great off-the-shelf performance, ease of tuning and parallelization, as well as their importance measures. It’s easy for engineers...

Lending Club Data Analysis Revisited with Python

Two and a half years ago I analyzed Lending Club’s issued loans data (yikes! I was using R back then!). It was the most-visited blog post on my site from 2013 through 2014. Today it’s still number 5. Reddit picked...

Examining Your Presence on Twitter with Python

Image: My Evil The Following with absoluteBLACK’s direct mount oval ring.

The purpose of this post is to show how a sponsorship/marketing manager might track their athletes or brand ambassadors. The code...

A wild dataset has appeared! Now what?

Where do we start when we stumble across a dataset we don’t know much about? Let’s say one where we don’t necessarily understand the underlying generative process for some or all of the variables. Let’s...

Topic Modeling Amazon Reviews

Image: Adapted from Biel 2011.

I found Professor Julian McAuley’s work at UCSD when I was searching for academic work identifying the ontology and utility of products on Amazon. Professor McAuley and his...

