Just spent a sunny afternoon (that’s dedication for you) data munging and setting up the ML approach to pace comment analysis I outlined in the previous blog post. I was curious to see whether an ML approach would categorize horses’ positions better than the pace program I already use. To do this I used data from TPD, which gives an accurate source of horse positions within a race. Once I had mapped early position I could designate horses as leaders, trackers, mid division or hold ups. The next task was to feed this data along with the corresponding race comments, 15844 lines in total, to an ML analyser after first splitting it into train and test sets. The algorithm was asked to learn from the training data what category a horse was in, based on its race comment. Here are a couple of example lines.
led until over 1f out weakened final furlong,1
tracked leader led over 1f out headed inside final furlong kept on same pace,2
I then asked it to predict the positional number, i.e. 1 to 4, for the comments held out in the remaining test data. It turned out that it did reasonably well, with an accuracy of 69.5%. Well, I say reasonably well; the truth is I have never put my own pace figures to the test against actual TPD data. That was the next step: checking my pace program’s performance against the TPD data.
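For anyone wanting to try something similar, here is a minimal sketch of the train, predict and score loop. It assumes a simple bag-of-words Naive Bayes classifier written with only the Python standard library, which is not necessarily the algorithm used above. The function names are made up for illustration, and the last two toy comments are invented (only the first two come from the examples above).

```python
import math
import random
from collections import Counter, defaultdict

# Toy rows in the same "comment,category" shape as the real file.
# The first two are the example lines above; the last two are invented.
toy_rows = [
    ("led until over 1f out weakened final furlong", 1),
    ("tracked leader led over 1f out headed inside final furlong kept on same pace", 2),
    ("midfield headway 2f out kept on", 3),
    ("held up in rear never dangerous", 4),
]

def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and cut it into (train, test)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def train_naive_bayes(rows):
    """Count how often each word appears under each category."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    label_counts = Counter()            # category -> number of comments
    vocab = set()
    for comment, label in rows:
        words = comment.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}

def predict(model, comment):
    """Return the category with the highest log-probability for the comment."""
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n in model["label_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + vocab_size  # add-one smoothing
        score = math.log(n / total)
        for word in comment.lower().split():
            if word in model["vocab"]:  # words never seen in training are skipped
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, rows):
    """Fraction of rows where the predicted category matches the real one."""
    hits = sum(predict(model, comment) == label for comment, label in rows)
    return hits / len(rows)
```

With the real file you would call `split_rows` on all the lines, train on the first part and pass the second part to `accuracy`. Four toy rows are far too few to say anything about accuracy, so they are only there to show the data shape.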
Across the whole test file my program came in at 68.2% accuracy. Not a huge difference from the ML algorithm, but what was very useful about this exercise is that it allowed me to check how my pace program, and of course the figures on the SmarterSig site, perform on each of the categories 1 to 4.
Predicting leaders it did OK at 69.6% accuracy
Predicting trackers it came in at 81.5% accuracy
Predicting mid divs it achieved 44.7%
Predicting hold ups it managed 73.9%
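A breakdown like this can be sketched in a few lines of Python. The version below assumes the per-category figure means "of the horses that really were in a category, what fraction were predicted as that category"; dividing by the predicted counts instead would give a different, equally reasonable figure. The function name is hypothetical, and the names for categories 3 and 4 are assumed from the order the categories are listed in.

```python
from collections import Counter

# 1 and 2 match the example lines; 3 and 4 are assumed from the listing order.
CATEGORY_NAMES = {1: "leader", 2: "tracker", 3: "mid division", 4: "hold up"}

def per_category_accuracy(actual, predicted):
    """For each real category, the fraction of its horses predicted correctly."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: hits[label] / totals[label] for label in sorted(totals)}

# Made-up numbers purely to show the call, not real results.
actual = [1, 1, 2, 3, 3, 4]
predicted = [1, 2, 2, 3, 4, 4]
for label, score in per_category_accuracy(actual, predicted).items():
    print(f"{CATEGORY_NAMES[label]}: {score:.1%}")
```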
This is very useful information as it allows me to re-examine the pace program and potentially fine tune it, or possibly jump ship to the ML side of processing. The mid divs seem to need a bit of TLC, although we will always be hostage to how race comment compilers write up a race.
Now time for a beer before the sun goes down.