Following on from the previous post, where I looked at a simple rating system underpinned by expected goals (xG), I will continue the exploration by:

  1. Extending the Premiership analysis to cover 2018/19 through 2021/22 (4 seasons of data)
  2. Looking at an alternative baseline model that simply uses goal difference

First, the results for the xG model using a walk-forward train and test. To explain: the software trains the model on 18/19 and tests it on 19/20. It then trains on 18/19 and 19/20 and tests on 20/21. Finally, it trains on 18/19, 19/20 and 20/21 and tests on 21/22. While doing this it accumulates the results from each test period, and it reports them together at the end.
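The walk-forward scheme described above can be sketched in a few lines of Python. This is only an illustration of the expanding training window, not the software actually used; the function name `walk_forward_splits` and the season labels are my own for the example.

```python
def walk_forward_splits(seasons):
    """Yield (train_seasons, test_season) pairs with an expanding training window.

    Hypothetical helper for illustration: each test season is preceded by
    every earlier season as training data.
    """
    for i in range(1, len(seasons)):
        yield seasons[:i], seasons[i]


seasons = ["2018/19", "2019/20", "2020/21", "2021/22"]
splits = list(walk_forward_splits(seasons))

for train, test in splits:
    # In a real run the model would be fitted on `train`, scored on `test`,
    # and the test-period results accumulated before reporting.
    print("train:", train, "-> test:", test)
```

With four seasons this gives three test periods, so the reported figures cover 19/20, 20/21 and 21/22 combined.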

Using a single train/test split rather than walk-forward, so that the model trains on the first 80% of the data and tests on the last 20%, we have

Both sets of results show a good return on value bets, and the calibration plot looks reasonable in the range where the bulk of the ratings will reside.
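For comparison with walk-forward, the 80/20 split mentioned above can be sketched as a simple chronological cut. This assumes the matches are already sorted by date; the function name `chronological_split` and the placeholder data are hypothetical, used only to show the idea.

```python
def chronological_split(matches, train_pct=80):
    """Split date-ordered matches into the first train_pct% and the remainder.

    Hypothetical helper: no shuffling, so the test set is always the most
    recent matches. Integer arithmetic avoids float rounding at the cut.
    """
    cut = len(matches) * train_pct // 100
    return matches[:cut], matches[cut:]


# Placeholder data: ten match ids, assumed to be in date order.
matches = list(range(10))
train, test = chronological_split(matches)
print(len(train), len(test))
```

The point of not shuffling is that the model is never tested on matches played before the ones it was trained on.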

Next I trained a model, but this time using the goal difference over a team's last 3 matches as a measure of its worth. So team A with results of 1-0, 2-0 and 1-5, where its own score comes first, would have a gross difference of -1 (that is +1, +2 and -4) and an average of -0.33.
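The arithmetic for team A can be checked with a short sketch. The function name `goal_difference` is my own for the example, and each result is assumed to be a (scored, conceded) pair from the team's point of view.

```python
def goal_difference(results):
    """Return (gross, average) goal difference for a list of results.

    Each result is a (scored, conceded) tuple for the team being rated.
    """
    gross = sum(scored - conceded for scored, conceded in results)
    return gross, gross / len(results)


# Team A's last 3 matches from the example: 1-0, 2-0 and 1-5.
last_three = [(1, 0), (2, 0), (1, 5)]
gross, average = goal_difference(last_three)
print(gross, round(average, 2))  # -1 -0.33
```

Note how the single heavy defeat outweighs the two wins, which is one reason raw goal difference over a short window can be a noisy input.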

In the above, ignore the fact that the input feature is named xgdiff. I did not change the feature name, but I did populate it with the actual goal difference. The results using a train/test split were as follows

The value bet profit has evaporated here, adding extra weight to the idea that xG is a superior input to goal difference when it comes to profit generation. There is enough evidence here to prompt further investigation using more data and exploring other leagues.

Many thanks to and for data supply