Superforecasters

What makes a good forecaster, whether the field is horse racing, political predictions, social movements or currency fluctuations? We probably all have an opinion on this one. Some would say intelligence, maybe IQ, but they would be wrong. Being smart is no disadvantage, but it is not the main driver behind the super forecasters out there. Perhaps it's the men and women on TV? Almost certainly not; they are selected on the basis of how much air time they can consume, not how accurate they are. The book I have almost finished, Superforecasting: The Art and Science of Prediction, attempts to shine a more objective light on what makes people good forecasters and seeks out those in the general public who fall into the category.

The people in charge of this project simply advertised for volunteers (they actually got some gift vouchers at the end of the year) to become subjects in an experiment designed to find out who could become accurate forecasters and, more importantly, why they had such traits. Once the individuals had been tested and selected, they were periodically assigned questions such as: will the left or right party win the next Honduras election? What are the chances of Italy leaving the EU or defaulting on its debt? Members had to assign confidence levels to their answers and were allowed, as time progressed, to update them. Interestingly, people who were diligent at updating tended to be the best forecasters when their objectively based scores were compiled.

It is a must-read for anyone involved in forecasting. One of the most interesting points for me came when, after a year and before the forecaster rankings were known, the people running the experiment decided to form groups. They compiled the groups randomly, even though they knew the dangers of groupthink and group fallout. Despite this fear, the groups performed better than individuals, and later, when they compiled groups of superforecasters, those too performed better than individual superforecasters, something I have found myself.

So what qualities make up a superforecaster, and do they apply to horse betting?

1. Cautious – Nothing is certain, they are able to think in terms of percentages
2. Humble – Reality is infinitely complex
3. Nondeterministic – What happens is not meant to be and does not have to happen
4. Actively open minded – Beliefs are hypotheses to be tested, not treasures to be protected
5. Intelligent and Knowledgeable With a Need For Cognition – Intellectually curious, enjoy puzzles
6. Reflective – Introspective and self-critical
7. Numerate – Comfortable with numbers

Within their forecasting they tend to be

8. Pragmatic – Not wedded to any idea or agenda
9. Analytical – Capable of stepping back and considering other views
10. Dragonfly Eyed – Value a wide range of views, which they then synthesize
11. Thoughtful Updaters – When facts change, they change their minds
12. Good Intuitive Psychologists – Aware of, and checking for, personal biases

In their work they tend to be

13. Growth Mindset – Believe it's possible to get better
14. Grit – Determined to keep at it however long it takes

I am sure you will tick a few of those as supremely relevant to horse betting. At the moment I am running a similar forecasting group in horse betting. Each member is assigned a specialist distance, e.g. 5f, and is asked to make selections to the group based on morning value prices. I hope to report back on this later in the year. By the way, we have one vacancy in the group to cover 6f races.


Deep Learning and Horse Racing


I came back inspired and fascinated from the cinema the other day, having sat with one other lone cinema-goer watching AlphaGo.
AlphaGo is a deep learning program created by the company DeepMind to challenge the world champion Go player. Since the defeat of world chess champion Kasparov by a computer program, the next mountain to climb was always Go. So far it had proved elusive; the number of game permutations in Go makes chess look like noughts and crosses, and it was thought that Go might be just too difficult for an AI program. If you get a chance you must see the documentary, as it tracks the development, first beating the European champion and then the world champion. Even more interesting is the reaction of the huge crowd watching the event.

If you have read any of my other posts you will know that I have been impressed by the gains that seem to be real surrounding the machine learning algorithm Gradient Descent Boosting. This algorithm seems to be the de facto Kaggle competition winner at the moment. Kaggle, if you are not familiar, is a website where data science hobbyists and pros take on submitted data sets and see who can produce the best machine learning solution. Inspired by Go, I finally got around to checking out deep learning and was not surprised to find further gains. I tested three approaches on a simple data set consisting of just two features, namely horse age and days since last run. In all three cases I trained the models on two years of flat handicap data and tested them on one year of handicap data. Deep learning came out ahead of GDB, which in turn beat Random Forests in terms of profit and loss of top rated.
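For anyone curious what that comparison looks like in code, here is a minimal sketch. The file names, column names and the simple win/lose target below are placeholder assumptions rather than my actual setup, but the shape is the same: fit a Random Forest, a gradient boosted model and a small Keras network on the same two features and then score the held-out year.

# Sketch only: assumes CSVs with columns 'age', 'days_since_run' and a binary 'won' flag
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from keras.models import Sequential
from keras.layers import Dense

train = pd.read_csv('handicaps_train.csv')   # two years of flat handicap data (assumed file)
test = pd.read_csv('handicaps_test.csv')     # the held-out year (assumed file)

features = ['age', 'days_since_run']
X_train, y_train = train[features], train['won']
X_test = test[features]

# The two shallow learners
rf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
gdb = GradientBoostingClassifier(n_estimators=200).fit(X_train, y_train)

# A small deep learning model via Keras
net = Sequential([
    Dense(16, activation='relu', input_dim=len(features)),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid'),
])
net.compile(optimizer='adam', loss='binary_crossentropy')
net.fit(X_train.values, y_train.values, epochs=20, batch_size=64, verbose=0)

# Each model's score can then be attached to the test runners, the top rated
# in each race picked out and profit/loss to BFSP compared across the three.
test['rf_score'] = rf.predict_proba(X_test)[:, 1]
test['gdb_score'] = gdb.predict_proba(X_test)[:, 1]
test['net_score'] = net.predict(X_test.values).ravel()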

If this topic would be of interest, perhaps as a hands-on tutorial, then please leave a comment below. In the meantime, probably the first thing you need to do if you want to get involved is to install TensorFlow and Keras. Keras is a front end built on top of TensorFlow that provides simplified access to deep learning. You will need Anaconda Python installed, which you should have if you followed my earlier blog post on machine learning; see here

https://markatsmartersig.wordpress.com/2016/01/13/profitable-punting-with-python-1/

Installing Tensorflow and Keras

First you need to create a new environment for your Keras-based programs. Pull up a command box (type command in the Windows search box).

Assuming you have Anaconda installed, enter the following command (note that it is a double dash before name, which WordPress does not always display clearly)

conda create --name deeplearning python

You can change deeplearning to whatever you’d like to call the environment. You’ll be prompted to install various dependencies throughout this process—just agree each time.

Let’s now enter this newly created virtual environment. Enter the following command

activate deeplearning

The command prompt should now be flanked by the name of the environment in parentheses—this indicates you’re inside the new environment.

We now need to install into this new environment any libraries we may need, as they won't be accessible from the original root environment created when Anaconda was installed.

IPython and Jupyter are a must for those who rely on Jupyter notebooks for data science. Enter the following commands

conda install ipython
conda install jupyter

Pandas is the de facto library for exploratory analysis and data wrangling in Python. Enter the following command

conda install pandas

SciPy is an exhaustive package for scientific computing, but the namesake library itself is a dependency for Keras. Enter the following

conda install scipy

Seaborn is a high-level visualization library. Enter the following

conda install seaborn

Scikit-learn contains the go-to library for machine learning tasks in Python outside of neural networks.

conda install scikit-learn

We’re finally equipped to install the deep learning libraries, TensorFlow and Keras. Neither library is officially available via a conda package (yet) so we’ll need to install them with pip. One more thing: this step installs TensorFlow with CPU support only and not GPU support. Enter the following

pip install --upgrade tensorflow
pip install --upgrade keras

Check all is OK

Get Jupyter Notebook up and running by entering

jupyter notebook

Once you are in the notebook server, create a new notebook file and simply enter

from keras.models import Sequential
from keras.layers import Dense

Now run the above cell and hopefully all will be OK
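If the imports work, a slightly stronger check is to build and compile a tiny throwaway model on random numbers. Nothing below is meaningful; it simply confirms that Keras can talk to the TensorFlow backend in the new environment.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# A throwaway two-layer network trained for one epoch on random data
model = Sequential()
model.add(Dense(8, activation='relu', input_dim=4))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy')

X = np.random.rand(20, 4)
y = np.random.randint(0, 2, size=20)
model.fit(X, y, epochs=1, verbose=0)
model.summary()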

Should you at any point wish to remove the new environment simply use the following command

conda remove --name deeplearning --all

That's enough for now; if there is interest then we could perhaps explore the code in later sessions.

Just Think Then Did It (actually)

I have just spent a sunny afternoon (that's dedication for you) data munging and setting up the ML approach to pace comment analysis I outlined in the previous blog post. I was curious to see if an ML approach would categorize horses' positions better than the pace program I already use. To do this I used data from TPD, which gives an accurate data source for horse positions within a race. Once I had mapped early position I could designate horses as leaders, trackers, mid division or hold ups. The next task was to feed this data, along with the corresponding race comments, 15,844 lines of data in total, to an ML analyser after first splitting into train and test sets. The algorithm was asked to learn what category a horse was, based on its race comment, from the training data. Here is an example line or two.

led until over 1f out weakened final furlong,1
tracked leader led over 1f out headed inside final furlong kept on same pace,2

I then asked it to predict the positional number, i.e. 1 to 4, for the comments held out in the remaining test data. It turned out that it did reasonably well, with an accuracy of 69.5%. Well, I say reasonably well; the truth is I have never put my own pace figures to the test against actual TPD data. This was the next step: checking my pace program's performance against the TPD data.
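For anyone wanting to try something similar, the pipeline looks roughly like the sketch below. The file and column names are placeholders, and TF-IDF feeding a logistic regression is just one reasonable choice of text representation and classifier rather than a description of my exact setup.

# Sketch: classify race comments into pace categories 1 to 4
# Assumes a CSV with a 'comment' column and a 'position_cat' column
# (1 = led, 2 = tracked, 3 = mid division, 4 = held up) derived from TPD data
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

data = pd.read_csv('pace_comments.csv')   # assumed file name

train_txt, test_txt, y_train, y_test = train_test_split(
    data['comment'], data['position_cat'], test_size=0.25, random_state=1)

vec = TfidfVectorizer(ngram_range=(1, 2))   # single words and word pairs
X_train = vec.fit_transform(train_txt)
X_test = vec.transform(test_txt)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print('Accuracy:', accuracy_score(y_test, clf.predict(X_test)))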

Across the whole test file my program came in at 68.2% accuracy. Not a huge difference from the ML algorithm, but what was very useful about this exercise is that it allowed me to check how my pace program, and of course the figures on the SmarterSig site, do against each of the categories 1 to 4.

Predicting leaders it did OK at 69.6% accuracy
Predicting trackers it came in at 81.5% accuracy
Predicting mid divisions it achieved 44.7%
Predicting hold ups it managed 73.9%

This is very useful information, as it allows me to re-examine the pace program and potentially fine-tune it, or possibly jump ship to the ML side of processing. The mid divisions seem to need a bit of TLC, although we will always be hostage to how the race comment compilers write things up.

Now time for a beer before the sun goes down.

Just Think Then Do It (maybe)

It is a glorious Sunday morning and I am sat in the garden at 8am barefoot and earthing.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3265077/

Sometimes it is best to just stop and think, and then when you have done that, think a bit more. Of course Nike would like you to just do it, but then again increasing their sales is dependent on your compulsion to buy.

While I was thinking I hit upon a rather simple and obvious idea that I had been pondering over the last few days. I produce pace figures daily on the SmarterSig web site and, even if I say so myself, they are rather good. They should be: I have tended them carefully over the last 10 years, tweaking them whenever I spot a misreading of a race comment. They form an integral part of my daily betting. I have, however, been pondering another recent interest of mine, which is machine learning and sentiment analysis. Sentiment analysis is the use of machine learning techniques to analyse text and derive meaning. If you have read my earlier blog on this you will get the picture. What I am now pondering is whether a machine learning approach could produce better predictions of pace than my hard-wired program approach. Furthermore, it would also be capable of self-updating, or learning. For those of you not familiar with ML, it involves feeding the program as many examples of race comments as you can muster, along with an outcome for each comment. This outcome could be, for example, 1 for held up, 2 for tracked and 3 for led, or perhaps some other variation. Now the off-putting aspect to this idea is that you would need to sit and watch a lot of races in order to really accurately tag each comment with the correct pace position. Even barefoot in the garden, such a task would guarantee a reversal of that Nike philosophy.

However, the rather obvious occurred to me while I wasn't 'just doing it': the TPD tracking data gives me as accurate a track on race position as you can get. Marry this with race comments, plug into the sentiment analysis, and potentially thousands of race comments can be accurately machine trained. Once trained, the system can then be used to predict future race pace. If TPD extend to all tracks then there would be no need for this, and of course there is no guarantee that this will outperform my carefully hand-crafted pace figure program, but it sure as hell beats 'just don't do it'.

Sectionals and GDB

I spent a few hours, courtesy of TotalPerformanceData, looking at sectional times for the various courses they cover. I was working on creating average split times at the various split or gate points, as they are called, for each track, based on the winners of the races. So in other words, what were the average split times achieved in 2017 by Wolves 7f winners?

Final furlong 12.07 secs
2f to 1f 11.87
3f to 2f 12.33
4f to 3f 12.26
5f to 4f 11.47
6f to 5f 11.83
Start to 6f 16.58

The beauty of AW racing is that we do not have as much going variation to contend with, although this can eventually be factored in with averages for each going type. Class may also have to be factored in to get a very fine level of detail on how fast or slow a race was run at different points.

My initial thought was to go down the same line as others and start trying to determine fast and slow run races, and then another idea occurred to me. Why not let machine learning take the strain? We are, after all, interested in which horses to back next time. With a more manual approach we have to determine what type of race a horse has just run in and then figure out which types of performance are best backed next time.

To do this we could feed a Gradient Descent Boosting ML algorithm, or some other ML method of choice, with a file of the following data.

All 7f races at Wolves

Deviation from split 1 average, dev from split 2 average, ..., dev from final split average, FinPos next time out

It could be that, with such data, machine learning would do a reasonable job of identifying which split time profiles lead to the best next-time-out results. It would not just be a case of best overall time but also the manner in which the race was run.

I must confess to not having thought this through fully just yet, it is very much a work/thought in progress and I welcome views.
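Purely as a sketch of the shape of the data and model I have in mind, with placeholder file and column names, it might look something like this.

# Sketch: learn next-time-out finishing position from sectional deviations
# 'dev_split_1' ... 'dev_split_7' are a runner's deviations from the course
# average at each gate; 'fin_pos_next' is its finishing position next time out
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor

data = pd.read_csv('wolves_7f_sectionals.csv')   # assumed file name
split_cols = [c for c in data.columns if c.startswith('dev_split_')]

X_train, X_test, y_train, y_test = train_test_split(
    data[split_cols], data['fin_pos_next'], test_size=0.25, random_state=1)

model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05)
model.fit(X_train, y_train)
print('Test R squared:', model.score(X_test, y_test))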

Guineas Trials: Are They Relevant?

It's Guineas week and many a pundit is poring over the trials run over the last few weeks, but are they very helpful from a betting perspective? I decided to take a look at the pound-in-your-pocket effect, as I was particularly taken by Saxon Warrior ahead of Saturday's race. He held off a good hold-up horse when running prominently in a fast-run Racing Post Trophy at the back end of last season, but if he runs as promised then he will be another O'Brien runner without a prep race.
An AE value is a measure of how under- or over-bet a factor is. If we took 10 white-faced horses, where 5 went off at 2.0 and 5 went off at 3.0 BFSP, we would expect, according to the market, to see 4.1667 winners. That's

5 X (1 / 2.0) + 5 X (1 / 3.0)

If, say, 6 of them won then they are doing better than the market suggests, whereas if only 2 won then they are under-performing in relation to the market.

AE values above 1.0 (actual wins divided by expected wins) suggest that the market is under-betting them and we want to be with them in the future.
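In code the calculation is trivial: actual winners divided by the expected winners implied by the BFSPs. A minimal sketch using the white-faced horse example above:

# AE value: actual winners divided by the winners expected from the market
def ae_value(results):
    # results is a list of (bfsp, won) pairs, won being 1 or 0
    expected = sum(1.0 / bfsp for bfsp, _ in results)
    actual = sum(won for _, won in results)
    return actual / expected

# Five horses at 2.0 and five at 3.0, six of which won
sample = [(2.0, 1)] * 4 + [(2.0, 0)] + [(3.0, 1)] * 2 + [(3.0, 0)] * 3
print(round(ae_value(sample), 2))   # 6 winners / 4.17 expected = 1.44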

Taking horses that have run in the Guineas since 2009 and dividing them into those with a prep run within 55 days and those without (anything above 55 days counting as a no-trial runner), we get the following AE values

Prep runs AE = 0.75
No prep AE = 1.26

Horses without a prep run have been significantly under-bet in previous Guineas.

Interestingly, pretty much the exact opposite is the case for fillies in the 1,000 Guineas. Perhaps trainers are reluctant to push fillies at home so early.

I am not a big fan of race trending, but I could not resist a quick look at the draw across both Guineas, and the AEs for high, middle and low came out at

High 1.04
Mid 0.68
Low 1.37

Horses that are of interest this year are

Saxon Warrior Won
Elarqam
Ghaiyyath
Headway
Barraquero
Tip Two Win
Murello

Shortlist for the 1,000 Guineas, tweeted out before the race

Altyn Orda
Anna Nerium
Billesdon Brook Won BFSP 168.1
Dans Dream
I Can Fly
Sarrochi
Sizzling
Soliloquy

Gradient Descent Boosting

The new kid on the block in the world of machine learning is deep learning, but among the shallow learners the method kicking ass in the world of Kaggle is Gradient Descent Boosting, and I have to admit I am a convert.

Up until now I have not quite seen the gains from ML techniques over non-ML methods of forming ratings, such as AE values. The old adage that data is king, and that no method will ever polish a turd to such an extent that it shines, has never been in doubt to me. Gains from ML techniques have seemed minimal, but that may be about to change in my world.

I fed a set of data into a scikit-learn Random Forest machine learning algorithm. If you are not familiar with Random Forests, they are basically a tree-based algorithm, but their strength comes from the fact that they form multiple trees on subsections of the input data. Branches are also split on random subsets of the available fields within the data. This means that a single, more dominant feature or field within the data is less likely to swamp the decision making of the trees and hence bias the overall model towards one field.

Working with BFSP as a model evaluator, I first checked the performance of top-rated horses using the model on fresh, unseen data. This produced, after commission

8031 bets PL -33.6 Points ROI -0.41% VarPL -3.96 VarRoi -0.31%

Creating a Gradient Descent Boosting model and applying it to the same data produced

9250 bets (more joint tops) PL +713.7pts ROI +7.7% VarPL +100.5 VarRoi +5.54%

Splitting at roughly the mid point of the rating values, so as to divide the runners roughly in half, the GDB method produced

33536 bets PL +861 ROI +2.56 %

By contrast, the Random Forest produced

33155 bets PL -784 pts ROI -2.36%

How does boosting differ from plain old Random Forests? Random Forests rely on a technique called bagging. A selection of the input data is placed in a 'bag' and a tree is built on this bag, or subset, of data. Another bag is then selected, another tree is built on that data, and so on. The results are then averaged across all the trees to produce a final prediction. With boosting, however, an extra step takes place. When the first bagged set of data is analysed, weights are assigned to the data ready for selection into the second bag. Those data items that were predicted poorly in the first bag are prioritized for inclusion in the second bag, in the hope that within this new mix they will be predicted more effectively. It is rather like you doing a random set of revision questions from a question bank and then, when I select a second random set for you to try, I increase the chance of selecting the ones you got wrong the first time.
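As a bare-bones sketch of how the two models slot into the same workflow, with placeholder file and column names rather than my actual data, the comparison looks something like this (joint top-rated runners are ignored here for simplicity).

# Sketch: fit Random Forest and Gradient Boosting models on the same features
# and compare top-rated profit/loss to BFSP on the unseen data
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

train = pd.read_csv('ratings_train.csv')   # assumed files and column names
test = pd.read_csv('ratings_test.csv')
features = [c for c in train.columns if c not in ('race_id', 'won', 'bfsp')]

rf = RandomForestClassifier(n_estimators=500).fit(train[features], train['won'])
gdb = GradientBoostingClassifier(n_estimators=500).fit(train[features], train['won'])

for name, model in [('Random Forest', rf), ('Gradient Boosting', gdb)]:
    scored = test.copy()
    scored['score'] = model.predict_proba(scored[features])[:, 1]
    top = scored.loc[scored.groupby('race_id')['score'].idxmax()]   # one top rated per race
    pl = (top['won'] * (top['bfsp'] - 1) - (1 - top['won'])).sum()  # level stakes to BFSP
    print(name, 'top-rated P/L:', round(pl, 2))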

The following short video does a pretty good job of describing the process. Needless to say the above results have sparked my interest

Cadence and the Guineas Favourite

Simon Rowlands is always a thought-provoking read, and he recently extended his sectional coverage to embrace stride length and frequency, a new addition to horse racing data which I wrote about a month or so ago. His piece on this can be found at

http://www.attheraces.com/blogs/sectional-spotlight/12-April-2018

I do not wish to steal Simon's thunder, as I guess he will have more articles in the pipeline, but in the above piece he did not mention a number of variables that may need to be accounted for when assessing this new data. Going and class are two obvious ones, and I want to touch on this with regard to the Guineas favourite that Simon mentions in his article, Saxon Warrior.

Saxon Warrior recorded 2.36 strides per second when winning the Racing Post Trophy, and the RPT always conjures up the question of whether the winner is a Derby horse or a Guineas horse. Simon does not actually come down on either side, and rightly so: one cannot say that because Saxon Warrior recorded an SPS of 2.36 over a mile he could not quite happily record an SPS that would win him a Derby. The real question at this stage is: does his SPS give him a chance of winning the Guineas?

So far the average SPS of horses winning over 12f at class 2 is 2.32, whereas at class 6 it is 2.26. Clearly lower-class horses cannot muster the same stride speed. By contrast, 8f horses at class 2 have an average SPS of 2.35, with class 6 at 2.33.

The striking feature of these averages is that over 12f a class 2 performance average is not strikingly different from an 8f class 2. Compare this with a 5f class 2 at 2.49. Whether the differences are non-linear, or whether the going really needs to be factored in, is a topic I can write about later.

From here we can see that Saxon Warrior would certainly appear to be a genuine class 2 or higher over 8f. Now you may say that's obvious, as he won a class 1 at a mile, but this is not necessarily the case. If last year just happened to be a year of staying horses running in the Post, then he may have won it and yet only clocked the SPS of a class 3/2 animal.

I would say to any Saxon Warrior fans: do not be put off his Guineas chance simply because he is a Post winner.

Does Hugh Taylor Eat Breakfast?

I have to confess that in my interview with Hugh for my book, The Newmarket Wizards, it did not occur to me to ask such a question. What would be useful to know is whether Hugh ever skips breakfast, as knowing which days he skips breakfast might be more valuable than any other variable we could choose to look at. Eating or not eating breakfast may also be something you might wish to consider from your own betting, and indeed health, perspective.

I have recently investigated the blood glucose effect (bear with me, I will get on to betting) of various breakfast meals on myself.

https://heartattackandthenhs.wordpress.com/2018/01/23/personalized-diet/

The results astonished me and made me realize that breakfast is the one meal we all tend to build habits around. In other words it is the easiest meal of the day to control, and yet it could be the most damaging. Research shows we are all most insulin resistant in the morning, and yet breakfast is when we carb-load. Breakfast is also the meal that breaks a perfect opportunity to fast for around 18 hours, and the research around fasting and longevity is pretty compelling.

But what if all this health talk does not motivate you to skip those cereal killers, or at least replace them with low-glycaemic alternatives like eggs? Maybe appealing to your punting instincts will do the trick.

Always gamble on an empty stomach. It sounds ridiculous, and certainly the exact opposite of what most people do, and yet a research team from Utrecht in 2014 looked at this very phenomenon. They took two groups of university students and fed half a decent breakfast whilst the other half had no breakfast. They were then set a standard risk assessment test known as the Iowa Gambling Task.

https://en.wikipedia.org/wiki/Iowa_gambling_task

The task involves picking cards from four piles, with each pile having varying reward and risk. The aim was to see whether the notion that when we are hungry we take bigger risks and have a poorer grasp of risk/reward was actually true. This idea derives from the fact that when we are hungry we seek food more irrationally, and when we are sexually aroused we take bigger risks seeking sex.

The results were in fact the exact opposite. Fasting participants were significantly better at the task in terms of evaluating risk/reward than those who ate breakfast. Their decision-making skills were more finely honed, even when allowing for other factors. We bet in a world where Betfair offers us less than 2% over-round, and therefore slight gains and edges can push us into profit. Skipping breakfast before making those betting selections just might be one of them. Does Hugh Taylor eat breakfast? I don't know, but he would do well to stay away from the fridge after issuing a 'possibly one more bet' message.

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0111081

Sentiment Analysis and Hugh Taylor 3

At this point it might be worth mentioning a bit about what I am tracking here and how it is being tracked. I am obviously looking at whether it's possible to detect more or less positive selections from Hugh Taylor, but what measures are being used?

At the moment I am tracking the following criteria, which I will then try to explain.

Profit on Positive Probability
Profit on Polarity
Profit on Subjectivity
Profit on tip text length
Profit on good old back them all to BFSP

All the above are to BFSP before commission.

OK, so what do they mean? The last two are self-explanatory, the next to last being simply: is he more confident when he has more to say?

To explain the first three I need to mention the Python library being used to carry out the analysis. The library is called TextBlob. There are a plethora of introductions out there should you be interested in the detail, and it is by no means the only option for this kind of work.

To analyse text and derive sentiment from it, the library has to refer to a lexicon of words, phrases and sentences if it is to determine what is positive and what is negative. So the word 'great' in some text may push the overall positive probability up, whilst the word 'poor', or even 'not great', will pull it down.

TextBlob comes with two ready-to-use corpora to refer to. One is a library of movie reviews, and it is this option that gives us the positive probability scores (let's hope Hugh does not tip anything called The Shawshank Redemption). The other is based on a lexicon of words with associated positivity scores, and gives us the Polarity score, a measure of positivity, and the Subjectivity score, a measure of how subjective or objective the text is. At this stage I am not sure that the Subjectivity score will be very useful, but let's track it and see.
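For anyone wanting to reproduce the measures, the TextBlob calls look roughly like this; the example comment is invented, and you will need the corpora installed (python -m textblob.download_corpora).

from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

# A made-up comment purely for illustration
text = "Shaped well on debut and looks sure to improve plenty for this step up in trip."

# Default analyser: polarity (-1 to +1) and subjectivity (0 to 1) from a word lexicon
blob = TextBlob(text)
print(blob.sentiment.polarity, blob.sentiment.subjectivity)

# NaiveBayesAnalyzer: trained on the movie reviews corpus, gives a positive probability
nb = TextBlob(text, analyzer=NaiveBayesAnalyzer())
print(nb.sentiment.p_pos)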

Another option, once a back catalogue of Hugh's text builds up, is to train an algorithm on Hugh's language itself, so that future selections are assessed based on his past language and its success rate. This is why I am logging not just winners but placed horses too, as his win rate is low and this may prove a problem when training such an algorithm. That is on the back burner for now; let's continue to see if how people feel about their movies is how Hugh feels about his horses (Godfather excluded).

I will update at the end of week 1 with the figures on this blog entry.

End of week 1

If you had backed all Hugh's tips to BFSP you would be -4.02 pts down
If you had backed or laid them as indicated by positive probability you would be -0.02 pts down
If you had backed or laid depending on text length you would be -7.98 pts down
If you had backed or laid as indicated by sentiment polarity you would be -1.98 pts down
If you had laid where Hugh appears more subjective and backed where he appears less subjective you would be +5.98 pts up

Back all Plays/Back all PL
10/-4.02
Avg’ Prob’ Plays/Avg Prob PL
10/-0.02
Text Length Plays/Text Length PL
10/-7.98
Avg Polarity Plays/Avg Polarity PL
10/-1.98
Avg Subjectivity Plays/Avg Subjectivity PL
10/+5.98

End of week 2 18/3/18

Back all Plays/Back all PL
20/-4.27
Avg’ Prob’ Plays/Avg Prob PL
20/+1.73
Text Length Plays/Text Length PL
20/+3.77
Avg Polarity Plays/Avg Polarity PL
20/-3.77
Avg Subjectivity Plays/Avg Subjectivity PL
20/-5.77

End of week 3 25/3/2018

Back all Plays/Back all PL
31/-7.92
Avg’ Prob’ Plays/Avg Prob PL
31/+6.08
Text Length Plays/Text Length PL
31/-0.58
Avg Polarity Plays/Avg Polarity PL
31/-10.12
Avg Subjectivity Plays/Avg Subjectivity PL
31/+12.12

End of week 4 1/4/2018

Back all Plays/Back all PL
39/-3.78
Avg’ Prob’ Plays/Avg Prob PL
39/-18.06
Text Length Plays/Text Length PL
39/-12.72
Avg Polarity Plays/Avg Polarity PL
39/+2.02
Avg Subjectivity Plays/Avg Subjectivity PL
39/+22.26

End of week 5 8/4/2018

Back all Plays/Back all PL
50/+1.74
Avg’ Prob’ Plays/Avg Prob PL
50/-28.14
Text Length Plays/Text Length PL
50/-16.8
Avg Polarity Plays/Avg Polarity PL
50/-6.7
Avg Subjectivity Plays/Avg Subjectivity PL
50/+15.38

End of Week 6 15/4/2018
Back all Plays/Back all PL
62/+10.69
Avg’ Prob’ Plays/Avg Prob PL
62/-17.19
Text Length Plays/Text Length PL
62/-14.85
Avg Polarity Plays/Avg Polarity PL
62/-25.65
Avg Subjectivity Plays/Avg Subjectivity PL
62/+4.93

End of Week 7 22/4/2018
Back all Plays/Back all PL
74/+30.61
Avg’ Prob’ Plays/Avg Prob PL
74/+4.73
Text Length Plays/Text Length PL
74/-0.93
Avg Polarity Plays/Avg Polarity PL
74/-46.37
Avg Subjectivity Plays/Avg Subjectivity PL
74/-14.19

End of week 8 29/4/2018
Back all Plays/Back all PL
87/+43.15
Avg’ Prob’ Plays/Avg Prob PL
87/-8.49
Text Length Plays/Text Length PL
87/+7.15
Avg Polarity Plays/Avg Polarity PL
87/-21.83
Avg Subjectivity Plays/Avg Subjectivity PL
87/-38.73
