Apologies to anyone who read the initial post under this title. Half an hour after posting it I realised that, due to a schoolboy error in my code, the outputs so far are not quite as uniform as I first thought. So far this week, up to today (Thursday), Hugh’s positivity prediction scores according to sentiment analysis are as follows.

Pos Prob      FinPos
0.9999987     0
0.99999948    0
0.99999474    1
0.99929637    2
0.99951526    ?

The average is running at 0.99976091, which means only three tips have been above average, including the winner. Early days, I agree, but I mention it now simply as one way of looking at the numbers.
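For anyone who wants to check the arithmetic, here is a minimal sketch in Python that recomputes the average and counts the tips above it. The variable names are mine, and I have assumed that the "?" in the table means the result is not yet known (stored as None). The 0 entries are kept exactly as they appear in the table.

```python
from statistics import mean

# (positive probability, finishing position) for each tip this week.
# None stands in for the "?" in the table (assumed: result not yet known).
tips = [
    (0.9999987, 0),
    (0.99999948, 0),
    (0.99999474, 1),
    (0.99929637, 2),
    (0.99951526, None),
]

# Average positive probability across all tips so far.
average = mean(prob for prob, _ in tips)

# Tips whose positive probability is above that average.
above = [(prob, pos) for prob, pos in tips if prob > average]

# Did the winning tip (finishing position 1) score above average?
winner_above = any(pos == 1 for _, pos in above)

print(f"average = {average:.8f}")
print(f"tips above average = {len(above)}")
print(f"winner above average = {winner_above}")
```

With only five tips the average moves a lot with each new score, so this is just a way of keeping the running figures honest as the data builds up.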

It may turn out that a generally positive Hugh Taylor, or indeed any tipster for that matter, may require a specific machine learning algorithm trained on that tipster’s specific vocabulary, and of course tipping style, in order to tease out any nuances. For example, ‘tends to break slowly’ might be far more negative within a specifically trained algorithm than within the general corpus used here. This approach may be fine for sifting positive and negative movie reviews but may be only average at figuring out which side of the bed Hugh Taylor got out of this morning. Before I can do that, however, the sample size will have to build up, unless someone knows of a back source of Hugh Taylor write-ups.
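To illustrate why the training corpus matters, here is a toy sketch only. It is a tiny hand-rolled Naive Bayes classifier, not the model behind the scores above, and every training phrase in it is invented by me for illustration. None of it is real Hugh Taylor copy or real review data. The same phrase is scored by a model trained on general, movie-review-style text and by one trained on racing-style text.

```python
import math
from collections import Counter

def train(examples):
    """Count words per label from (text, label) pairs; labels are 'pos' or 'neg'."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab

def prob_positive(model, text):
    """Naive Bayes probability that text is positive, with add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in ("pos", "neg"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Turn the two log scores into a probability for the positive label.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights["pos"] / (weights["pos"] + weights["neg"])

# Invented, hypothetical training phrases (assumptions for illustration only).
general_examples = [
    ("the plot builds slowly to a wonderful ending", "pos"),
    ("a wonderful film with great acting", "pos"),
    ("a dull and terrible film", "neg"),
    ("terrible acting and a boring plot", "neg"),
]
racing_examples = [
    ("won easily and looks well handicapped", "pos"),
    ("ran on strongly and looks well treated", "pos"),
    ("tends to break slowly and was well beaten", "neg"),
    ("missed the break and tends to hang", "neg"),
]

phrase = "tends to break slowly"
general_score = prob_positive(train(general_examples), phrase)
racing_score = prob_positive(train(racing_examples), phrase)

print(f"general corpus: {general_score:.3f}")
print(f"racing corpus:  {racing_score:.3f}")
```

On these made-up phrases the general model leans positive on the phrase while the racing model leans firmly negative, which is the kind of nuance a tipster-specific model might pick up once there is enough real data to train one.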

One reader also commented on using this approach to analyse race reader comments. I have done some work on this before, both statistically and using ML (see smartersig mags), and I will try to return to it and blog about it soon.

I will keep you updated on the current approach via this blog as data builds up.