Betfair API NG Session 12

In this session we will look at how to access Irish racing in addition to UK racing.

This is a fairly simple process and involves editing one line in the MyAPILib library that we created. This will hardwire, so to speak, our library to pick up Irish and UK races, in the same way that the library is at the moment hardwired to pick up only UK races. It might be better to write the subroutine so that we pass it a parameter indicating which markets to pick up.

First let us take the simple approach and simply change the routine to gather Irish and UK races. In our library we need to edit only one line within the getEvents subroutine. This line will now read as

market_catalogue_req = '{"jsonrpc": "2.0", "method": "SportsAPING/v1.0/listMarketCatalogue", "params": {"filter":{"eventTypeIds":[' + eventId + '],"marketCountries":["GB", "IE"],"marketTypeCodes":["WIN"], "marketStartTime":{"from":"' + now + '","to":"' + to + '"}},"sort":"FIRST_TO_START","maxResults":"100","marketProjection":["MARKET_START_TIME","RUNNER_DESCRIPTION", "RUNNER_METADATA", "EVENT"]}, "id": 1}'

Notice how the marketCountries parameter has IE added to it. API NG takes two letter country codes as specified by ISO 3166 at

To make the getEvents subroutine more generic we need to modify it to receive an extra parameter stating the required markets

def getEvents(appKey, sessionToken, eventId, reqMarkets):

Now we can call this routine with the following modified call

myMarkets = '"GB","IE"'
races = myAPILib2.getEvents(appKey, sessionToken, horseRacingEventTypeID, myMarkets)

The market_catalogue_req initialisation line will now need changing to

market_catalogue_req = '{"jsonrpc": "2.0", "method": "SportsAPING/v1.0/listMarketCatalogue", "params": {"filter":{"eventTypeIds":[' + eventId + '],"marketCountries":[' + reqMarkets + '],"marketTypeCodes":["WIN"], "marketStartTime":{"from":"' + now + '","to":"' + to + '"}},"sort":"FIRST_TO_START","maxResults":"100","marketProjection":["MARKET_START_TIME","RUNNER_DESCRIPTION", "RUNNER_METADATA", "EVENT"]}, "id": 1}'
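
As an alternative to gluing the request together by hand, the same string could be built from a Python dictionary and serialised with the json module, which avoids mismatched quotes. This is only a sketch: the function name build_market_catalogue_request and its parameters are my own inventions for illustration and are not part of the library as it stands, and it assumes the countries are passed as a Python list rather than a pre-quoted string.

```python
import json


def build_market_catalogue_request(event_type_id, countries, time_from, time_to):
    """Return the listMarketCatalogue JSON-RPC request as a string.

    Hypothetical helper: countries is a list of ISO 3166 two-letter codes,
    e.g. ["GB", "IE"]; time_from and time_to are timestamp strings.
    """
    payload = {
        "jsonrpc": "2.0",
        "method": "SportsAPING/v1.0/listMarketCatalogue",
        "params": {
            "filter": {
                # Assumption: the event type id is sent as a string
                "eventTypeIds": [str(event_type_id)],
                "marketCountries": list(countries),
                "marketTypeCodes": ["WIN"],
                "marketStartTime": {"from": time_from, "to": time_to},
            },
            "sort": "FIRST_TO_START",
            "maxResults": "100",
            "marketProjection": [
                "MARKET_START_TIME",
                "RUNNER_DESCRIPTION",
                "RUNNER_METADATA",
                "EVENT",
            ],
        },
        "id": 1,
    }
    return json.dumps(payload)


# Example with made-up times; UK and Irish races in one request
market_catalogue_req = build_market_catalogue_request(
    "7", ["GB", "IE"], "2016-10-09T00:00:00Z", "2016-10-10T00:00:00Z"
)
```

The resulting string can be sent exactly as before; only the way it is put together changes.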

Comment Commentary

When I think back to my earliest memories of betting on horses I recall an autumn day when a colt called The Minstrel powered up the Newmarket heath to beat Saros in the Dewhurst Stakes. Somewhat fitting given that we watched Churchill win yesterday’s race of the same name. The other nostalgic memory I have is my fascination with the new language I had to master to understand this conundrum of a sport. What did ‘ro’ mean, or ‘nearest finish’, and how did the latter differ from ‘finished well’ or ‘finished fast’?

I would imagine that there are plenty of seasoned observers who even today do not fully understand the meaning of some of them. Given the way I now think about betting, what each of these peculiar abbreviations and phrases means is of less importance than how the public use them and what effect they have on the odds of horses. In other words their worth is more important than their meaning, although in this video age their worth is perhaps a little less of a currency than it used to be.

One member of the Smartersig forum asked about this topic today in relation to a past article, to which I replied that I had done some analysis using both AE values and a k-nearest neighbour machine learning approach. The latter had not produced any revelations but the AE values did highlight some interesting observations. I also mentioned that Peter May had tried to formalise Raceform comments when he worked there, something I think should be implemented so as to avoid different race readers using varied comments to mean the same thing. He also responded to a particular example and gave a definition of what the terms actually meant. The extract is copied below.

“Nearest finish” means the horse ran through beaten runners to secure its
best position at the end of the race; “finished well” implies the horse
improved at the end of the race to get the better of horses that were not
necessarily beaten at the time. I’d rather be on a “finished well” next
time out than a “nearest finish” unless the latter was a non-trier and
merely running for a handicap mark.

Now, who wants to do: “stayed on well” and “ran on well”?

My reply to the above was

You would be right to value Finished well higher than nearest finish, the figures below show that it has a far better AE value although both are > 1
Comments from the previous race applied to next race

Comment Actual Wins Expected Wins AE
NEAREST FINISH 460 431.4271 1.066229
FINISHED WELL 38 29.68467 1.280122

Sadly combining all AE values for sub comments within a race comment and then using these next time out for a horse does not produce an automatic ticket to wealth. Betting only those below 8/1, for example, will lose you around 2% less in the pound for those with an overall AE > 1 compared to those with an AE less than 1. It may be possible however to tease out profitable combinations from the figures, and some of the values do highlight the old adage that more likely winners does not always mean better overall results to the pound. The effect on the pocket is often the reverse of the one we expect. For example which would you rather back, ‘stayed on same pace inside final furlong’ or ‘stayed on well’? Below is a list of the top 100 based on expected win numbers.

Comment Actual Wins Expected Wins AE
CHASED LEADERS 2671 2587.35969 1.03232651
HELD UP 2231 2265.561685 0.984744761
TRACKED LEADERS 1873 1871.609995 1.000742679
RIDDEN OVER 2F OUT 1785 1820.174781 0.980675053
TOOK KEEN HOLD 1481 1361.466088 1.087797936
RIDDEN OVER 1F OUT 1356 1354.724618 1.000941433
LED 1182 1243.74324 0.950356924
WEAKENED INSIDE FINAL FURLONG 1050 1026.676437 1.02271754
IN TOUCH 979 1011.940749 0.967447947
SLOWLY INTO STRIDE 933 974.9412843 0.956980708
PROMINENT 1008 945.7114615 1.06586421
HELD UP IN REAR 856 920.3450153 0.930085985
MIDFIELD 849 891.8348109 0.951970017
HEADWAY OVER 2F OUT 824 837.1442708 0.984298679
DWELT 782 772.8943664 1.011781214
KEPT ON 702 726.5056188 0.966269196
WEAKENED OVER 1F OUT 683 700.8964292 0.974466371
HELD UP IN TOUCH 725 700.8810366 1.03441235
MADE ALL 639 650.0103887 0.983061211
RIDDEN 2F OUT 631 637.7561372 0.989406394
RAN ON 624 637.5882859 0.978687993
IN REAR 633 632.1035041 1.001418274
SOON RIDDEN 607 632.0653357 0.960343758
TRACKED LEADER 643 626.1577366 1.026897797
SOON WEAKENED 576 598.9307325 0.961713882
HELD UP TOWARDS REAR 576 567.435946 1.015092548
HEADWAY OVER 1F OUT 532 564.6852995 0.942117672
CLOSE UP 574 560.7940987 1.023548574
STAYED ON 569 548.0675889 1.03819312
STEADIED START 467 527.8751126 0.884678949
HELD UP IN MIDFIELD 460 490.6925151 0.937450615
CHASED LEADER 518 470.069006 1.101965868
NEAREST FINISH 460 431.4270504 1.066228925
LED OVER 1F OUT 431 420.5246004 1.024910313
KEPT ON SAME PACE 403 409.498403 0.984130822
HEADWAY 2F OUT 368 389.0303672 0.945941579
KEPT ON FINAL FURLONG 369 382.5430579 0.964597298
ALWAYS TOWARDS REAR 346 377.4591406 0.916655507
ALWAYS PROMINENT 382 373.7864294 1.021973967
HEADWAY 3F OUT 371 372.4799733 0.996026704
RIDDEN WELL OVER 1F OUT 374 371.5040007 1.006718634
RIDDEN OVER 3F OUT 350 369.287482 0.947771092
TOWARDS REAR 373 368.4816783 1.012261998
IN TOUCH IN MIDFIELD 374 362.3074659 1.032272407
RIDDEN 3F OUT 379 356.4176777 1.063359153
HEADWAY OVER 3F OUT 343 333.6436157 1.028043049
EFFORT OVER 2F OUT 326 329.1876265 0.990316688
TRACKED LEADING PAIR 353 313.414136 1.126305292
LED OVER 2F OUT 326 310.5600987 1.049716307
RIDDEN OUT 333 301.1864019 1.105627605
KEPT ON INSIDE FINAL FURLONG 284 290.6621099 0.977079538
ONE PACE 284 284.4649458 0.998365543
RAN ON WELL 300 282.0159815 1.063769501
RIDDEN TO LEAD OVER 1F OUT 293 268.7380356 1.090281096
STAYED ON WELL 260 268.7236763 0.967536629
STAYED ON SAME PACE INSIDE FINAL FURLONG 282 268.1975814 1.051463621
HELD UP IN LAST PAIR 250 261.3892469 0.95642802
LED INSIDE FINAL FURLONG 272 260.428051 1.044434342
NOT REACH LEADERS 289 260.0312104 1.111405049
NO IMPRESSION 270 256.222184 1.053772924
SOON LED 287 253.9916809 1.129958269
READILY 260 242.4380556 1.072438893
DRIVEN OUT 212 237.5200836 0.892556102
HEADED INSIDE FINAL FURLONG 230 236.4138702 0.972870161
HEADED OVER 1F OUT 232 233.9553457 0.991642227
RACED KEENLY 234 229.4599047 1.019786007
RIDDEN ALONG OVER 2F OUT 208 228.9571069 0.908467105
COMFORTABLY 234 224.0209046 1.044545376
STAYED ON SAME PACE 226 221.8574082 1.018672317
BEHIND 225 221.6497903 1.015114879
DRIVEN OVER 1F OUT 231 220.145094 1.049307962
NEVER TROUBLED LEADERS 210 215.2400129 0.975655024
NO EXTRA INSIDE FINAL FURLONG 239 213.8537856 1.117586015
SOON BEATEN 213 212.5922548 1.001917968
KEPT ON SAME PACE FINAL FURLONG 210 211.0349585 0.995095796
RIDDEN ALONG 2F OUT 198 203.6188192 0.972405207
RIDDEN AND HEADED OVER 1F OUT 190 200.915252 0.945672358
HELD UP IN LAST TRIO 188 198.4807439 0.94719516
PUSHED ALONG OVER 2F OUT 213 197.7588924 1.077069139
LED 2F OUT 208 196.9414763 1.056151319
DRIVEN OVER 2F OUT 192 195.4939592 0.982127534
KEPT ON WELL 190 190.0810739 0.999573477
STAYED ON FINAL FURLONG 193 182.1342079 1.059658162
EFFORT 2F OUT 171 180.8516812 0.945526184
STAYED ON INSIDE FINAL FURLONG 161 179.4288736 0.897291483
NEVER DANGEROUS 178 178.4931228 0.997237301
TAILED OFF 164 171.4225248 0.956700412
NEVER ON TERMS 154 170.1821368 0.904912836
NEVER ABLE TO CHALLENGE 173 168.309055 1.02787102
STAYED ON SAME PACE FINAL FURLONG 149 165.9114395 0.898069479
SOON CLEAR 164 163.855825 1.00087989
NO CHANCE WITH WINNER 149 160.0478459 0.930971605
RIDDEN ALONG 3F OUT 171 158.544772 1.078559689
TOOK KEEN HOLD EARLY 148 157.3422748 0.940624509
WEAKENED OVER 2F OUT 140 156.5396692 0.894341995
HELD UP IN LAST PLACE 134 156.2194758 0.857767569
PULLED HARD 168 155.8111567 1.078228309
RIDDEN HALFWAY 148 153.9541078 0.961325437
PUSHED ALONG OVER 3F OUT 165 151.0215281 1.092559466

The NR Conspiracy

In the 8.10 at Chelmsford today there are bang on 16 runners in a handicap. Bookmakers do not like 16 runners in handicaps, perhaps even more so than 8 runners in a handicap. It always seems however that they have little to worry about. Doesn’t one always come out and drop the runners to 15 or fewer? The punter friendly each way terms of 16 runner handicaps seem to be inevitably snatched away by a non runner or two.

Is something going on here or are we imagining it? Is this the equivalent of a football manager bung, with trainers getting a brown envelope for saving the bookmakers thousands of pounds?

Let’s look at some figures covering 2009 to 2011 inclusive for handicaps across both Flat and NH.
The table shows the number of decs, the number of races with that many decs, the number of NRs and the NR percent expressed in relation to total decs for races of that dec size.
This is for hcps only.
We can see that there were 1646 races with 8 decs and they were subject to 1058 NRs, which is 8.03% ie 1058 / (1646 * 8).
This is below the total NR percentage across all races of 8.62%.
16 decs by comparison stand at 10.40%.
Is there a confounding reason why 16 runner races would get more NRs as a percentage?
There does seem to be a spike around the 15, 16 and 17 dec mark, although quite why 15 decs should be spiking is a puzzle.

Decs NumOfRaces NRs NRpercent
1 1 0 0
2 3 0 0
3 22 6 9.090909
4 148 30 5.067568
5 468 143 6.111111
6 863 375 7.242178
7 1199 580 6.910521
8 1646 1058 8.034629
9 1554 1104 7.893608
10 1630 1374 8.429448
11 1550 1416 8.304985
12 1740 1769 8.472222
13 1312 1535 8.999765
14 1369 1756 9.162058
15 529 853 10.74984
16 613 1020 10.39967
17 420 738 10.33613
18 247 397 8.929375
19 61 95 8.196721
20 170 260 7.647059
21 22 53 11.47186
22 32 61 8.664773
23 7 2 1.242236
24 28 25 3.720238
25 3 4 5.333333
26 3 11 14.10256
27 10 18 6.666667
28 10 17 6.071429
29 8 9 3.87931
30 7 6 2.857143
31 2 9 14.51613
32 2 7 10.9375
34 2 3 4.411765
35 2 3 4.285714
36 1 4 11.11111
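
For anyone wanting to reproduce the NR percent column, the calculation is simply NRs divided by the total number of declarations for that field size. The short Python sketch below uses a handful of rows taken from the table; the function name nr_percent is my own.

```python
def nr_percent(decs, num_races, nrs):
    """NRs as a percentage of all declarations in races of this field size."""
    total_declared = decs * num_races
    return 100.0 * nrs / total_declared


# (decs, number of races, NRs) copied from the table above
rows = [(8, 1646, 1058), (15, 529, 853), (16, 613, 1020), (17, 420, 738)]

for decs, num_races, nrs in rows:
    print(decs, round(nr_percent(decs, num_races, nrs), 2))
```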

Newcastle AW Pace

It is perhaps a little early to be evaluating the pace angle at the new Newcastle AW track but I thought an update on how things are measuring up might be in order.

So far, working to BFSP before commission and using SmarterSig pre race pace figures, and I emphasise pre race here, we have the following figures for the prior race pace figure

Hold up ie pacefig less than 1.4, runs 372 wins 31 PL -86pts

Prominent ie pacefig greater than 1.4 and less than 3.2, runs 308 wins 35 PL +47.7pts

Led ie pacefig greater than 3.2, runs 114 wins 9 PL -59.55pts

Horses with a prominent run in their previous race, as opposed to held up or led, have done very well next time out at Newcastle so far, although the duration and sample sizes are very small.
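
The grouping above can be sketched in a few lines of Python. The thresholds of 1.4 and 3.2 are the ones quoted; how a figure landing exactly on a threshold should be treated is not stated, so the handling of boundaries here is an assumption, as are the function names.

```python
def pace_group(pace_fig):
    """Assign a prior race pace figure to one of the three groups.

    Assumption: a figure exactly on 1.4 counts as prominent and a figure
    exactly on 3.2 counts as led.
    """
    if pace_fig < 1.4:
        return "hold up"
    if pace_fig < 3.2:
        return "prominent"
    return "led"


def strike_rate(runs, wins):
    """Winners as a percentage of runners."""
    return 100.0 * wins / runs


# (runs, wins, PL in points to BFSP before commission) from the figures above
results = {
    "hold up": (372, 31, -86.0),
    "prominent": (308, 35, 47.7),
    "led": (114, 9, -59.55),
}

for group, (runs, wins, pl) in results.items():
    print(group, round(strike_rate(runs, wins), 1), round(pl / runs, 3))
```

The last figure printed is profit per bet in points, which makes the three groups easier to compare given their different sizes.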

ITV More Of The Same

We are soon to get a new TV presentation team for Racing with one or two familiar faces still in place. There is much debate at the moment with sighs of relief as some are left out whilst groans as others are not included. The truth is that the ITV team is much like the next USA government or indeed our own UK government. The faces change but the underlying driving force and message remains the same. The illusion is that you are getting something new whilst the truth is that you are in for pretty much the same.

So what is the ‘same’, and why do I no longer listen to TV pundits, who in reality should come with a wealth warning?

First of all we all know that the punditry will be geared towards protecting the main sponsors, namely the bookmakers, and this is even more so on a commercially run TV channel. This is not all bad if you learn to recognise how this anti punter stance can work to your advantage. TV pundits provide one huge disservice to losing punters and paradoxically a positive service to winning or potentially winning punters. They pretty much all, to a man or woman, approach a race from the point of view of finding the winner. Sure, every now and then they mention that so and so is too short for a bet, but this is hardly value betting. Anyone who has managed to move from being a losing punter to a winning one over the years will without doubt have first suffered from the ‘what’s going to win’ mentality inflicted on them by the media. Eventually you twig that those on the screen either do not know how successful punting works, or prefer the short term safety of finding a few winners over trusting that the watching public might stick with them through riskier selection methods that actually yield a long term profit. But let’s not moan about this, the fact that your presenters are incompetent is a bonus. It keeps the majority of punters in the dark, which is where they need to be if you are to keep winning.

That last sentence might sound a little harsh but the reality is that most punters have to lose in order for some punters to win. The truth is that they don’t have to lose as much or as rapidly as they do. If every losing punter switched to the exchanges they would immediately, as a group, lose less without impacting on winning punters in a negative fashion, in fact quite the opposite.

So in conclusion, do not complain about the dumbing down or the lack of smart punting advice. Without it your life would be a lot harder, unless of course you really are listening to them.

Six Degrees of Separation

What do Sean Bean the actor and I have in common? At first I thought it might be that we are both from Sheffield, or maybe that he studied drama at Rotherham College of Art and Tech where I taught for 4 years back in the 80’s. Even closer to home is that my 81 year old cycling buddy in Portugal is a guy who has a regular ‘last of the summer wine’ Friday meet up with his mates in Sheffield, one of whom used to be Sean’s dad until he passed away. It could also be that we both support Sheffield United and idolise Tony Currie, the best English midfield player of the 70’s.

But closer to home for me is that a playwright called Steve Wakelam, a Yorkshire lad, wrote a play back in the late 70’s about two young lads who try their arm at professional punting. I know Steve, although my friend, the guy who introduced me to Racing, knows him better having been taught by him when Steve was a school teacher. I met Steve on several occasions at our annual York races soiree in August and I have always been aware that the play was based on my friend John and myself. What I did not know until recently is that it was filmed as a BBC1 play with Sean playing what appears to be the third lead role (alas not John or me). I have not seen it but at least it would be one role in which Sean would not have a problem with the accent.

I do not think the two parts are quite distinct in terms of me and my mate John, rather they appear to be an amalgamation of both of us. My friend did work as a groundsman at a monument and does have a more romantic view of Racing whilst I hold the more hard nosed mathematical viewpoint. The second character, who appears to be the proverbial loser along for the ride, is hopefully purely a fictional character.

AE Ratings V Random Forests

I have spent the last few days working on a Random Forest version of my own flat handicap ratings. The original ratings are based on AE values, or Actual divided by Expected values to give them their full name. Let me remind you of what AE values are. If we are, say, calculating the AE value of last time out winners, we can look at all lto winners and for each horse calculate its market chance by taking its SP or BFSP, stripping out the over round and then taking the reciprocal of the decimal odds as its chance of winning. So an even money shot should win 0.5 times per run if the odds are true. We sum up all these win chances, along with the actual win count for these horses. This gives an E (expected) value and an A (actual) value. If we divide the A value by the E value and it is greater than 1 then, in our example, last time out winners are winning more times than the market estimates. If the value is less than 1 then the market is over betting them.
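
As a rough illustration, the AE calculation described above might look like the following in Python. The function names are my own, and stripping the over round by scaling every horse's chance in proportion is just one simple way of doing it, not necessarily the one used in my ratings.

```python
def implied_probabilities(decimal_odds):
    """Turn the decimal odds for one race into win chances that sum to 1.

    Assumption: the over round is removed by proportional scaling.
    """
    raw = [1.0 / odds for odds in decimal_odds]
    book = sum(raw)
    return [chance / book for chance in raw]


def ae_value(records):
    """records is a list of (win_chance, won) pairs for the qualifying horses."""
    actual = sum(1 for _, won in records if won)
    expected = sum(chance for chance, _ in records)
    return actual / expected


# A three runner race priced to a 105.3% book
chances = implied_probabilities([1.9, 2.85, 5.7])
print([round(c, 3) for c in chances])

# Four qualifiers: two even money shots and two 3/1 shots, two of them won
sample = [(0.5, True), (0.5, False), (0.25, True), (0.25, False)]
print(round(ae_value(sample), 3))  # 2 wins against 1.5 expected -> 1.333
```

An AE above 1, as in the small sample here, says the qualifiers won more often than their prices implied.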

I trained a Random Forest model on my data for 2009 to 2013 and then tested on the years 2014 and 2015. The original AE model produced the following results for top rated horses.

Bets 7918 Wins 1276 PL +305 to BFSP after comm’ ROI +3.2%

The Random Forest model produced the following results for top rated horses

Bets 7699 Wins 1164 PL +323 to BFSP after comm’ ROI +4.1%

The software used was Python with the scikit-learn Random Forest library. See my intro blog entry on this software.

The initial interest in this area stems from an excellent article published by Stefan Lessmann which is linked below

The next step for me is to extend the model by following the example of Lessmann and co and moving to a second step, in which the resulting RF ratings are combined with the market price of each horse using regression to eventually produce an oddsline. Of course BFSP is not known until after the off but final prices can be a good estimate. Lessmann and Benter argue for this two step separation of fundamental race parameters and odds, to stop the odds swamping the model parameters when both are used together at the same time.

I should also perhaps look at some more metrics on this model first, as it may not have escaped your notice that the win rate on the AE model is greater than that of the RF model. Lessmann puts up some strong arguments for Random Forests in his article so if you are interested in race modelling it might be worth taking a look.

Watching Frankel

Today saw the first son of Frankel make his debut in the UK, and this also coincided with my finishing the sequel to that excellent book Watching Racehorses by Geoffrey Hutson, the obviously named Watching More Racehorses.

I loved the first book, which attempted to numerically represent those soft subjective observations we get thrown at us every weekend by so called paddock watchers. The new book is not as good simply because it is padded out somewhat with observations on areas outside the paddock. Nevertheless it still adds more data to some of those familiar and unfamiliar areas of paddock watching. For example in the first book sweating is not cited as a negative, but in the second he puts more meat on this observation by stating that when the temperature is above 21C sweating is not a negative. Another interesting observation is that coltishness is also not a negative. So what is a negative? Well, if you want a negative you can get your teeth into sample size wise then consider cross nose bands.

How does this all relate to Cunco, the son of Frankel who has just bolted in? Well, he drifted like a barge after becoming coltish in the parade ring. Only he and Mr Hutson seemed to know.

Betfair SP’s Part 2

In my previous blog post I mentioned the care needed when doing research to Betfair SP. This was courtesy of an alert by an observant member of the SmarterSig email forum.

Today I will demonstrate just how much difference this anomaly can make. At the moment I am tracking a betting method based on a combination of racing selection strategy and financial trading methods. At first glance the option of betting to BFSP seemed more attractive than taking a price, provided you can find some sort of stake threshold below which you do not cannibalise your own BFSP with the size of your stake.

Using the odds displayed when you download your betting summary to calculate a level stake PL to BFSP I get the following results when comparing backing the selections to available price compared to BFSP.

Available price,  Bets = 1717 PL = +27.3 points after comm

SP Price, Bets 1717 PL = +35.2 points after comm

A small increase using BFSP

Of course things are never that simple and the prices handed to me via the Betfair download do not account for R4’s. Now calculating the profit as the winnings divided by the bet stake, we get the following profit for the two categories

Available price, Bets 1717 PL = -3.04 points after comm

SP Price, Bets 1717 PL = +22.1 points after comm

The profit from the live prices simply has not survived the R4’s that occurred between taking the bets in the last few minutes and off time. The BFSP’s however will have fewer R4’s, perhaps only being affected by markets that have not re-formed, perhaps due to a stall non entry. There was a 0.7% drop in ROI when the R4’s on BFSP were accounted for.

Conclusion – You need to make sure when calculating points profit on bet summaries that you use the profit divided by stake and not the price to calculate. Also when assessing new strategies retrospectively to BFSP you need to account for late R4’s. A reduction of 1% on ROI would seem prudent.
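
To make the difference between the two calculations concrete, here is a small Python sketch. The field names stake, price, won and profit are made up for illustration and are not the column names in the Betfair download, the two bets are invented, and commission is ignored to keep things short.

```python
def points_from_price(bets):
    """Level stake points worked out from the displayed price (ignores R4's)."""
    total = 0.0
    for bet in bets:
        total += (bet["price"] - 1.0) if bet["won"] else -1.0
    return total


def points_from_settlement(bets):
    """Level stake points worked out as settled profit divided by stake."""
    return sum(bet["profit"] / bet["stake"] for bet in bets)


# Invented example: the winner shows a price of 5.0 but only 30 was paid
# on a 10 stake because of a reduction after a late withdrawal
bets = [
    {"stake": 10.0, "price": 5.0, "won": True, "profit": 30.0},
    {"stake": 10.0, "price": 3.0, "won": False, "profit": -10.0},
]

print(points_from_price(bets))       # 3.0
print(points_from_settlement(bets))  # 2.0
```

The first figure overstates the result by a full point on just two bets, which is the effect described above.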

Betfair SP’s

I have mentioned before about the fact that Betfair SP’s seem to produce a race overround or should I say underround, below 100% on a good number of races. A possible explanation for this was put forward by a member of the Smartersig email forum

The member stated the following which I have to admit I had overlooked.

I assume that I’m not alone in using Betfair SPs as the benchmark to assess the profitability of a potential new system.  Of course there’s an argument that this isn’t entirely accurate as your own theoretical bets might have altered the BFSP but nothing is perfect.

However, I recently noticed that Betfair SPs are NOT recalculated to allow for Rule 4 deductions after a late withdrawal.  I know that Betfair apply their own deduction to any bets (whether a price was taken or BFSP) but had assumed that SPs would be re-normalised after the race to account for this.  Unfortunately it seems they are not, and the historical BFSPs that are released by Betfair in CSV format (or on the Timeform website) do not account for withdrawals.  As far as I know there’s no easy way to get this information, so it means that the profit/loss of any system researched using Betfair SPs is flawed because of this.

The other thing to watch is dead-heats as this will also affect the bottom line.  It’s relatively straightforward to calculate in Win markets but it becomes more complex in place markets.  For instance if there’s a 6-runner race with 2-places paid out in the place market and your horse dead-heats for first place then the dead heat is irrelevant.  It will be treated as a full-stake bet.  However if another horse wins and your horse dead-heats for second in the place market then your return will be calculated to a half stake.  In other words there’s no ‘one size fits all’ solution for dealing with dead-heats.

The main point is about the Rule 4’s though.  Just wondering if anyone else has dealt with this issue before?  it’s hard to assess how much it might actually affect the ‘true’ bottom line of a researched sequence of 1000s of bets.
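
The dead-heat example in the message above can be sketched in Python as follows. This is my own reading of the example rather than a statement of Betfair's settlement rules: the bet is treated as being struck to a fraction of the stake equal to the places still available divided by the number of horses tied for them, capped at the full stake. The function name and parameters are made up.

```python
def dead_heat_stake_fraction(places_paid, finished_ahead, tied):
    """Fraction of the stake a dead-heating horse is settled to.

    places_paid: number of places paid in the market (1 for a win market)
    finished_ahead: horses that finished clear in front of the dead heat
    tied: number of horses in the dead heat, including ours
    """
    places_left = max(0, places_paid - finished_ahead)
    return min(1.0, places_left / tied)


# 2 places paid, our horse dead-heats for first: both tied horses are placed
print(dead_heat_stake_fraction(2, 0, 2))  # 1.0

# 2 places paid, another horse wins and ours dead-heats for second
print(dead_heat_stake_fraction(2, 1, 2))  # 0.5

# Win market, dead heat for first between two horses
print(dead_heat_stake_fraction(1, 0, 2))  # 0.5
```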

Clearly a pinch of salt is needed when assessing any betting approach to Betfair SP, which these days is the common method used. There is probably less of a problem with NH racing as the bulk of late withdrawals will be stall problems on the flat.
One possible simple solution would be to readjust all prices so that a book which comes out at sub 100% is normalised to around 101% or 102%, which is more likely to reflect the true BFSP after R4’s. If there is any interest in this topic I could look into it further. If the writer of the above is correct then there should be some whopping underrounds when an even money shot gets withdrawn at the stalls.
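
A minimal Python sketch of that readjustment is shown below. It is a crude first attempt under my own assumptions: every price in an underround book is shortened by the same factor until the book adds up to the target percentage, books already at 100% or more are left alone, and the function names are made up.

```python
def book_percentage(prices):
    """Book percentage implied by a list of decimal prices."""
    return 100.0 * sum(1.0 / price for price in prices)


def renormalise(prices, target_pct=101.0):
    """Shorten every price by the same factor so the book equals target_pct.

    Books that are not underround are returned unchanged.
    """
    current = book_percentage(prices)
    if current >= 100.0:
        return list(prices)
    scale = current / target_pct  # below 1, so every price gets shorter
    return [price * scale for price in prices]


# Three 3/1 shots left after a withdrawal, giving a 75% book
prices = [4.0, 4.0, 4.0]
adjusted = renormalise(prices)
print(round(book_percentage(prices), 1), round(book_percentage(adjusted), 1))
```

With a very large underround a uniform scaling like this could push a short priced horse below a price of 1.0, so any real use would need a check for that.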