Spectrm Text Classification Challenge

Spectrm posted a fun hiring challenge involving text classification. I gave it a shot and managed a recall of 8%, which is not so bad considering there are as many classes as there are examples. Apparently what I came up with is a crude implementation of an inverse document frequency model.
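For a rough idea of what inverse document frequency weighting looks like, here is a minimal sketch. It is not the code from my solution: the toy corpus and the names `tokenize`, `idf_weights`, `score`, and `predict` are made up for illustration, and the whitespace tokenizer is an assumption. The idea is that words appearing in few documents count for more than words appearing everywhere.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase/whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def idf_weights(documents):
    # Document frequency: the number of documents each word appears in.
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    # Rare words get a large weight; a word found in every document gets 0.
    return {word: math.log(n / count) for word, count in df.items()}


def score(query, document, idf):
    # Sum the IDF weights of the words the query and document share.
    shared = set(tokenize(query)) & set(tokenize(document))
    return sum(idf.get(word, 0.0) for word in shared)


def predict(query, documents, idf):
    # Return the index of the best-scoring document (one class per document).
    return max(range(len(documents)), key=lambda i: score(query, documents[i], idf))


# Hypothetical toy corpus, not the challenge data.
documents = [
    "the cat sat on the mat",
    "the dog chased the ball",
    "the bird sang a song",
]
idf = idf_weights(documents)
best = predict("where is the ball", documents, idf)
```

In this toy corpus "the" appears in every document, so it contributes nothing to the score, while "ball" appears in only one and decides the match.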

Predictions over the first 200 samples. Note the encouraging band along the diagonal!

You can see my solution on GitHub and run it yourself if you like. I also included a Jupyter notebook that explains how it all works.