Sprinters Statistics: A First Crack
Last Friday I casually threw out the idea of trying to devise some useful statistics for cycling. Today, I'm rolling out a few such ideas, and looking forward to your feedback. Below, I have charted a few recent seasons for a specific category of riders: pack sprinters. I chose the last two seasons of Mark Cavendish and Daniele Bennati, and four seasons of Tom Boonen. With this data, I looked at how often the rider made the finale in a sprint finish (as opposed to finishing with a time gap), and how often from that finale he won or placed in the top 5 or top 10. Results follow:
| Rider | Finale % | Wins | Win % | Avg Place | Top 10 % | Top 5 % | CQ Pts |
| Mark Cavendish '08 | 82 | 17 | 53 | 4.06 | 87 | 75 | 1080 |
| Mark Cavendish '07 | 71 | 11 | 40 | 3.85 | 92 | 74 | 774 |
| Daniele Bennati '08 | 86 | 5 | 26 | 10.30 | 73 | 63 | 987 |
| Daniele Bennati '07 | 72 | 9 | 31 | 3.75 | 93 | 82 | 1233 |
| Tom Boonen '08 | 65 | 14 | 50 | 3.33 | 92 | 78 | 1328 |
| Tom Boonen '07 | 70 | 10 | 35 | 8.12 | 75 | 60 | 1479 |
| Tom Boonen '06 | 86 | 19 | 57 | 2.24 | 97 | 91 | 2559 |
| Tom Boonen '05 | 79 | 10 | 37 | 3.41 | 92 | 62 | 2073 |
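For anyone who wants to tinker, here is a rough sketch of how the columns above could be computed from a season of results. It follows the definitions in the post (finale % out of all sprint finishes; wins, top 5, top 10 and average place out of finales made), but the `SprintResult` record layout, the `sprint_stats` helper and the sample season are all invented for illustration and are not taken from the spreadsheet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SprintResult:
    """One bunch-sprint race day for a rider (hypothetical record layout)."""
    made_finale: bool             # finished in the sprint group, not with a time gap
    place: Optional[int] = None   # finishing place; only set when made_finale is True


def sprint_stats(results):
    """Compute the table's columns for one rider-season."""
    finales = [r.place for r in results if r.made_finale]
    n = len(finales)
    if n == 0:
        raise ValueError("need at least one sprint finish where the rider made the finale")
    wins = sum(1 for p in finales if p == 1)
    return {
        "finale_pct": round(100 * n / len(results)),
        "wins": wins,
        "win_pct": round(100 * wins / n),
        "avg_place": round(sum(finales) / n, 2),
        "top10_pct": round(100 * sum(1 for p in finales if p <= 10) / n),
        "top5_pct": round(100 * sum(1 for p in finales if p <= 5) / n),
    }


# Invented sample season: 12 sprint days, 10 finales made.
season = [SprintResult(True, p) for p in (1, 1, 2, 4, 1, 6, 12, 1, 3, 9)]
season += [SprintResult(False), SprintResult(False)]

stats = sprint_stats(season)
print(stats)
```

Keeping one record per race day, rather than running totals, makes it easy to re-cut the numbers later, for example by grand tours only.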
Discussion on the flip...
First, some rationale:
- I am starting with sprinters because they are relatively easy to chart. They show up just about whenever there's a bunch finish, and it's easy to measure relative performance from simple placings (as opposed to climbers or time trialists, where you might want to use time gaps). I picked a few guys who are vaguely comparable, though it might say more about the statistics if we look at a wider quality range.
- I am including a few basic assumptions. First, a sprinter's job is to make the finale. If you can't make it to the front, you really aren't ready. Secondly, your job is to win, but it's just as useful to see how close you come to winning as to just count wins.
- I ran some numbers specific to classics, but the sample sizes were mostly in the single or low double digits, too small to seem very valuable, so I left them off for now. It might be worth focusing simply on sprints in a stage race, or in a grand tour, but I left in all races for now.
Now, some caveats. I tabulated these by hand, which is a bit labor intensive and not overly accurate. I had to guess on a couple of occasions, because Cycling Quotient really only has full results going back a couple of years, after which you get some abbreviated ones. So turning this into a more comprehensive project will not be easy.
I also struggled with whether to count certain placings. For example, is it a sprint finish if someone wins by a second or two and the pack comes charging behind? I say yes; but if someone won by several seconds, or multiple riders finished off the front, I excluded it. Also, does it skew the average placings if you put in every s.t. finish? Boonen had a couple of triple-digit finishes in 2007, where he coasted home. I probably should just exclude them from the finale %.
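To get a feel for how much a couple of coasting-home finishes can move the average, here is a small sketch. The placings and the `CUTOFF` of 20th place are invented for illustration; they are not Boonen's actual 2007 results, and the cutoff is just one possible way to define "actually contested the sprint".

```python
from statistics import mean, median

# Invented placings for a season of finales; the two triple-digit entries
# stand in for same-time finishes where the rider sat up before the line.
placings = [1, 2, 1, 3, 5, 2, 1, 4, 118, 131]

CUTOFF = 20  # hypothetical threshold for "actually contested the sprint"
contested = [p for p in placings if p <= CUTOFF]

print(f"mean, every s.t. finish:   {mean(placings):.2f}")
print(f"mean, contested only:      {mean(contested):.2f}")
print(f"median, every s.t. finish: {median(placings)}")
```

Two outliers in ten races are enough to drag the mean from the low single digits into the twenties, while the median barely notices them, so either a cutoff or a median would be a reasonable alternative to a plain average.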
Finally, a few conclusions:
- Boonen's 2006 was magical, no two ways about it.
- Cavendish and Bennati both increased their finale % by a dozen points or so from 2007 to 2008, showing that a little more experience helps a sprinter make sure he doesn't get left behind.
- Cavendish is off to an impressive start in win percentage: 53% wins this year (and 40% last year) are better than anything Benna has done, and comparable to Boonen's best.
- Bennati, however, logged a top-five percentage of 82 last year (down this year, a hard year for him all around). By comparison, you could say that Cavendish is all-or-not-so-much, while Bennati has a tendency to be right there at the line more consistently.
- The Top-10% and Top-5% numbers mostly fall within a limited range, which means we don't learn all that much from them. I suspect once you broaden out the spreadsheet to include, oh, the ten best sprinters from all the grand tours, then these percentages will say more -- e.g., if you start seeing elite sprinters who only make the top ten 60% of the time, then you'll know what makes Boonen, Benna and Cav special. For this elite group, they simply don't miss the top ten much, or the top five for that matter.
- Average placing is amusing. This consists simply of adding up the finish placings where a rider made the finale and dividing by the number of races (where the rider made the finale). When a rider is on, and his tactics put him in the finish, where does he typically end up? This is over a pretty large number of races (range is 19-33), which I think is fairly illustrative. For 2006 Tom Boonen to average 2.24 in 33 finales is unreal. Except for Benna this year and Boonen in 2007, they all average 4.06 (fourth place) or better.
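As a sanity check on that 19-33 range, the number of finales behind each season can be backed out of the table. This sketch assumes Win% is simply wins divided by finales made, which is how the post describes it; because the percentages in the table are rounded, the results are estimates rather than exact counts.

```python
# (wins, win %) per season, copied from the table above.
seasons = {
    "Mark Cavendish '08": (17, 53),
    "Mark Cavendish '07": (11, 40),
    "Daniele Bennati '08": (5, 26),
    "Daniele Bennati '07": (9, 31),
    "Tom Boonen '08": (14, 50),
    "Tom Boonen '07": (10, 35),
    "Tom Boonen '06": (19, 57),
    "Tom Boonen '05": (10, 37),
}

# Assumption: Win% = wins / finales, so finales is roughly wins / (Win% / 100).
# Win% is rounded in the table, so these are estimates, not exact counts.
estimates = {rider: wins / (pct / 100) for rider, (wins, pct) in seasons.items()}

for rider, finales in estimates.items():
    print(f"{rider}: about {finales:.1f} finales")
```

The smallest and largest estimates come out at roughly 19 (Bennati '08) and 33 (Boonen '06), which lines up with the range quoted above.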
And finally, a few requests. All commentary on the value or methodology is welcomed. Also, if you're aware of any source of more extensive data, particularly going back in time, please let me know. I'd love to compare some of these seasons to other current sprinters... and to some of the great sprint seasons in recent times. Finally, if you want to tinker with the numbers yourself or add to the list of seasons, feel free. Email me or use the comments if you have questions about how I did these initial data gatherings. And lastly, here's the full spreadsheet.
Comments
An excellent start
As a huge baseball fan, and even bigger fan of sabermetrics, I like the concept of using statistical analysis to prove if common assumptions are correct or false.
Applying some common sabermetrics measures, I think we have to assign some sort of weight to what type of finish it is. In baseball terms, the closest comp I can think of is park value, where each ballpark is given a plus or minus weight when measuring offensive statistics. In simplified terms, hitting a home run in Cincinnati is a lot easier than hitting one in San Diego, and therefore, home runs in San Diego carry more weight.
Relating this to cycling, winning a bunch sprint on a wide boulevard with no major turns in the last kilometer is significantly different than a dangerous finish on a small road with a turn right before the line. So I think we have to weight this type of finish to get a better measure or predictor about who is better in certain types of sprint finishes.
A second factor I’d like to see weighted is team strength. We all know the importance of a strong team in sprinting, and we know how some riders need one to win, and others don’t. I think it would be important to measure the strength of a team in the final run up as a variable of the strength of the sprinter.
Both of these variables would be ridiculously time consuming to determine by going back and assigning a value for the past four years. But I would be more than willing to chip in for the 2009 season with trying to determine the statistical variables that go into sprinting.
As a starting point, I throw out these two measures to consider in your statistical review of sprinters:
1. Finale type: Rate the last 2 kilometers of a race in terms of difficulty. You could possibly use this as a multiple to the percent finale listed above.
2. Team Strength: Rate the team numbers and positioning of each sprinter's lead-out train in the final 3k.
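One possible reading of the finale-type idea, sketched in code: give each finish a difficulty rating and let harder finishes count for more when computing finale %. The 1-5 scale, the `weighted_finale_pct` helper and the sample races are all assumptions made up for illustration; nobody has rated any real finishes yet, and a team-strength rating could be folded in the same way.

```python
def weighted_finale_pct(races):
    """Difficulty-weighted finale %.

    races: list of (difficulty, made_finale) pairs, where difficulty is a
    hypothetical 1-5 rating of the last 2 km (5 = most technical).
    """
    total_weight = sum(difficulty for difficulty, _ in races)
    made_weight = sum(difficulty for difficulty, made in races if made)
    return 100 * made_weight / total_weight


# Invented example: the rider makes four of five finales, but misses the hardest one.
races = [(1, True), (2, True), (5, False), (3, True), (4, True)]

unweighted = 100 * sum(1 for _, made in races if made) / len(races)
weighted = weighted_finale_pct(races)

print(f"unweighted finale %: {unweighted:.0f}")
print(f"weighted finale %:   {weighted:.1f}")
```

Here the unweighted figure is 80%, but the weighted one drops to about 67% because the one miss came on the most difficult finish. Whether that tells us anything useful is exactly what a season of rated finishes would show.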
If we can build a database on these sprinting variables in 2009, plus your information listed above, I think we can generate some pretty solid predictions for certain races. Count me in for this project in 2009!
by PopUp Rolen on Nov 4, 2008 10:41 PM EST
hm
well the easy weighting system would be to use the UCI ratings. I guess that doesn’t work well with stage races though.
CQRanking.com, you complete me.
by Chris... on Nov 5, 2008 1:05 AM EST
I don't think UCI rating would be accurate
because doesn’t it put a value on the entire race? I think the only way to get an accurate weight is if you have a knowledgeable viewer watch the last 2ks and determine the difficulty on say a 1-5 scale.
by PopUp Rolen on Nov 5, 2008 9:47 AM EST
yeah
Well, using baseball as the analogy, maybe it’s not worth weighting. In baseball, they don’t weight batting stats based on what pitcher you faced, because over time it will pretty much even out. Likewise, sprinters don’t pick and choose certain sprints, they chase all of them. So maybe there’s nothing to gain from weighting?
The answer, of course, is to try weighting and see if we learn anything from it.
by Chris... on Nov 5, 2008 12:36 PM EST
adding
park effects are weighted because they don’t even out. But if all players played, say, 5% of their games in each of 20 different parks, we probably wouldn’t bother measuring park effects.
by Chris... on Nov 5, 2008 12:38 PM EST
Good points
I guess I’m looking at the issue as more of a predictor. I would like to be able to home in on who we can expect to win a sprint based on the type of finish it is. The general consensus is that Cav is the fastest sprinter. But maybe with data on finish types, we can get to a point where we can say Cav has a 13% chance of winning this sprint because the road is only 20 feet wide and there is a turn greater than 30 degrees less than 2k from the finish.
by PopUp Rolen on Nov 5, 2008 1:05 PM EST
"The answer, of course, is to try weighting and see if we learn anything from it."
Do you ever sleep?
Carlos Sastre - Tour de France winner - Born From Jets
by Jens on Nov 5, 2008 12:43 PM EST
