I’m currently working my way through a fantastic book called “Mathletics” by Wayne L Winston, which provides a great introduction to the basics of statistical analysis in sports for novices like me. One part that I’m finding particularly fascinating is the breadth of analysis and metrics that US sports like Baseball use to assess the skill level of Batters, Pitchers and Fielders. You’ve got metrics there like WAR, WHIP, wOBA, BABIP, WPA and countless other abbreviations and statistics that baseball analysts can use.
It makes our analysis in racing when assessing jockeys and trainers against metrics like win % and place % look pretty archaic to be honest.
That got me thinking about different metrics to assess the skill level of jockeys, and so far I’ve thought about 3 metrics that might be interesting to explore. We’ll look at each of them in turn to see if they are both “interesting” and “useful”. The answers to those two questions will not always be the same.
1) Settling — jockeys’ ability to settle a horse in a race (Part #1)
2) Breaking — jockeys’ ability to help a horse to break well from the stalls (Part #2)
3) Passing — jockeys’ ability to pass rivals during a race (Part #3)
I am no statistics expert, so all constructive feedback is welcomed.
· Metric: % runners that the jockey was able to settle well in their races
· Definition: using the Run Comments for each race, I flagged the horses that were said to have raced “keen”, “pulled hard”, “did not settle” or ran “green”.
· Health Warning: the quality of the data is limited to the quality of the Run Comments provided against the race result. That’s definitely imperfect but it’s the best data set that I have at the moment.
· Data: January 2017 — July 2020
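To make the definition a bit more concrete, here is a rough Python sketch of how a Run Comment could be flagged. The phrase list comes from the definition above; the regular expressions, the inclusion of “keenly”, the function name `is_unsettled` and the example comments are all my own assumptions for illustration, not the exact rules I used on the real data.

```python
import re

# Phrases from the definition above. Matching "keenly" as well as "keen"
# is an assumption about how the comments are worded.
UNSETTLED_PATTERNS = [
    r"\bkeen(ly)?\b",
    r"\bpulled hard\b",
    r"\bdid not settle\b",
    r"\bgreen\b",
]
UNSETTLED_RE = re.compile("|".join(UNSETTLED_PATTERNS), re.IGNORECASE)


def is_unsettled(run_comment: str) -> bool:
    """Return True if the comment contains any of the 'unsettled' phrases."""
    return bool(UNSETTLED_RE.search(run_comment))


# Made-up comments, for illustration only
examples = [
    "Held up in rear, pulled hard, no impression",
    "Tracked leaders, ridden 2f out, kept on well",
    "Raced keenly, led, headed final furlong",
]
flags = [is_unsettled(comment) for comment in examples]
print(flags)
```

A simple keyword match like this will miss differently worded comments and may pick up false positives, which is another reason to treat the metric with the health warning above in mind.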
Whilst cleaning the data I noticed that the Irish Run Comments did not seem to be recorded in the same way as the UK Run Comments. Almost all Irish jockeys were scoring at the top of my settling metric, because very few Irish runners appear to be recorded as running “keen”. For that reason I have removed all Irish races from my dataset.
Top 20 — the best “settlers”
Let’s start with those that perform best in terms of settling their horses in races.
Let me explain what that means…
In the data set that I looked at, only 6 of Grace McEntee’s 284 rides were recorded with a Run Comment of “keen”, “pulled hard”, “did not settle” or ran “green”. In other words, 98% of her rides appeared to settle.
Stl% is the % of rides that a jockey was able to “settle”. xSTLA is the expected number of settled rides, based on the average across all jockeys, which was 92% (0.92). SOA is “Settles Over Average”: the jockey’s actual number of settled rides minus the expected number (total number of rides * the all-jockey average).
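As a worked example of those three columns, here is a small Python sketch using the Grace McEntee figures quoted above (284 rides, 6 unsettled). The function and variable names are just mine for illustration, and because 0.92 is a rounded average the results may differ slightly from the figures in the table.

```python
LEAGUE_SETTLE_RATE = 0.92  # average across all jockeys, as quoted above (rounded)


def settle_metrics(rides: int, unsettled: int, league_rate: float = LEAGUE_SETTLE_RATE):
    """Return (Stl%, xSTLA, SOA) for one jockey."""
    settled = rides - unsettled
    stl_pct = settled / rides      # share of rides that settled
    x_stla = rides * league_rate   # expected settled rides at the average rate
    soa = settled - x_stla         # Settles Over Average
    return stl_pct, x_stla, soa


stl_pct, x_stla, soa = settle_metrics(rides=284, unsettled=6)
print(f"Stl% = {stl_pct:.3f}, xSTLA = {x_stla:.2f}, SOA = {soa:.2f}")
# Stl% = 0.979, xSTLA = 261.28, SOA = 16.72
```

Note that SOA rewards volume as well as rate: a jockey with many rides at a slightly above-average settle rate can post a bigger SOA than one with few rides at a very high rate.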
The top 20 on Settles Over Average are as follows:
Bottom 20 — The “worst” settlers
Now let’s take a look at those jockeys who seem to perform poorly based on the notes captured in the Run Comments: those who have a higher % of rides that seem to run keen, run green or do not settle.
Immediately the one that jumps out is Frankie Dettori as the 2nd worst jockey in terms of horses running keen! We need to investigate that one further.
Settling: is it useful?
Whilst it might be interesting to see the % of rides that jockeys appear to be able to settle well, is it actually useful and in any way predictive of wins or win %?
I started by running a Pearson Correlation on the Settle % and Win % of Jockeys. It showed little relationship between the ability of a jockey to help a horse settle and their overall Win %.
Pearson tells us that the Settling Metric is not correlated with Jockey Win %.
I also had a go at a regression to see if Settle % and SOA could be in any way predictive of Win %. They were not.
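For anyone who wants to try the same checks, here is a minimal sketch of a Pearson correlation and a one-variable least-squares fit in plain Python. The five data points are made up purely so the code runs; they are not my jockey data and say nothing about the real result. The function names are my own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Made-up numbers for five hypothetical jockeys, NOT the real data
settle_pct = [0.98, 0.95, 0.92, 0.90, 0.84]
win_pct = [0.08, 0.17, 0.11, 0.09, 0.25]

r = pearson_r(settle_pct, win_pct)
slope, intercept = fit_line(settle_pct, win_pct)
print(f"r = {r:.2f}, R^2 = {r * r:.2f}, slope = {slope:.2f}")
```

With a single predictor, the R² of the regression is just the square of Pearson’s r, so the two checks are telling you the same thing in slightly different ways.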
The Frankie Conundrum
The position of Frankie at the bottom of the table got me thinking about whether true skill of a jockey can be defined by their ability to help horses settle in their races or whether the skill is getting horses that run keenly to still win.
That led me to look at the difference between Jockey Win % and the % of unsettled rides: were jockeys still able to win with horses that had not settled?
The table shows the difference between the jockey win % and the jockey unsettled rides %. Daniel Tudhope had a win % of 0.17 and a % of unsettled rides of 0.04, so therefore a difference of 0.13. Frankie had a win % of 0.25 and a % of unsettled rides of 0.16, therefore a difference of 0.09. What’s interesting about this table is that we’re seeing the generally perceived “good” jockeys rise to the top.
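The arithmetic behind that table is simple enough to sketch in a few lines of Python. Only the two jockeys quoted above are included, and the dictionary layout and key names are just my own choice for illustration.

```python
# Win % and unsettled % for the two jockeys quoted above
jockeys = {
    "Daniel Tudhope": {"win_pct": 0.17, "unsettled_pct": 0.04},
    "Frankie Dettori": {"win_pct": 0.25, "unsettled_pct": 0.16},
}

# Difference = win % minus unsettled %, rounded to avoid float noise
diffs = {
    name: round(stats["win_pct"] - stats["unsettled_pct"], 2)
    for name, stats in jockeys.items()
}

# Rank from largest difference to smallest
ranked = sorted(diffs.items(), key=lambda item: item[1], reverse=True)
for name, diff in ranked:
    print(f"{name}: {diff:.2f}")
```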
Summary: “Settling” as a concept is perhaps interesting to look at, but not useful (correlated or predictive) in terms of determining Jockey Win %.