ISLR Fridays: Introduction

February 7th, 2014

UPDATE 2014-03-24: I pushed everything back because I have been busy.

UPDATE 2014-02-25: I pushed everything back 2 weeks because I have been busy.

Last week, I posted a link to a set of free books on this blog.  Not long after, I got a Twitter message from a friend:

You and I should setup to study the R book jointly. Somebody pushing along is tremendously helpful to me. Interested?

The R book is An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.

So I've decided to post to this blog every two weeks for the next 18 weeks and talk about what I've learned.  Responses are welcome in the comments or via email at andrew .- -.-. siliconcreek .-.-.- net (related comments may be posted to this blog).

The schedule is something like this, based on the chapters of the book:

  1. Statistical Learning - April 18
  2. Linear Regression - May 2
  3. Classification - May 16
  4. Resampling Methods - May 30
  5. Linear Model Selection and Regularization - June 13 (Friday the 13th???)
  6. Moving Beyond Linearity - July 4 (well, this is when it will post to the blog)
  7. Tree-Based Methods - July 18
  8. Support Vector Machines - August 1
  9. Unsupervised Learning - August 15

So this will be not-too-intense, and since much of my current workload is spent waiting for models to run (I'm waiting on one right now, which is partly why I read the introduction), I should be able to spend some time on it.

In addition to the exercises in the book, I intend to post a link to a sanitized version of the Greater Cincinnati Household Travel Survey.  This sanitized version will have a number of changes made to it to protect the privacy of the survey participants (for example, it will not include names, phone numbers, addresses, or GPS coordinates).