VDS analysis: finding patterns.

 

This is a long one, so grab your espresso now...

 

I was wondering recently if we could look at the season's VDS scores and determine whether there were any predictions that we could make by group (by price).  A note about these groupings: I have grouped some of the higher priced riders together so that I could actually see patterns, so the groups are 20-28, 16-18, 12-14, 8-10, 6, 4, 2 and 1.  And yes, I have ignored and will continue to ignore both Contador and Valverde.  

Below the flip, I compare these groups using two separate approaches.  

 

 


 

 

First of all, a theory of pricing (at least in my head) would be that higher priced riders should bring in more points.  So to look at this we can calculate the average points per price for each group (PPP = sum(points)/sum(price)).  
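For anyone who wants to redo the sums, here is a minimal Python sketch of the PPP calculation. The rider rows are made-up placeholders, not the real VDS data, and the function name `points_per_price` is just something I picked for illustration. Note that it divides the group totals (sum of points over sum of price) rather than averaging each rider's own ratio.

```python
from collections import defaultdict


def points_per_price(riders):
    """Return {group: sum(points) / sum(price)} for (group, price, points) rows."""
    price_sum = defaultdict(int)
    points_sum = defaultdict(int)
    for group, price, points in riders:
        price_sum[group] += price
        points_sum[group] += points
    return {group: points_sum[group] / price_sum[group] for group in price_sum}


# Hypothetical rows for illustration only: (price group, price, points scored).
riders = [
    ("4", 4, 300),
    ("4", 4, 100),
    ("4", 4, 0),        # a non-scorer still counts towards the price total
    ("16-18", 16, 900),
    ("16-18", 18, 460),
]

ppp = points_per_price(riders)
for group, value in ppp.items():
    print(group, round(value, 2))
```

Because non-scorers still add to the price total, a group with lots of donuts gets dragged down even if its best riders did well.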

Here is a graph of price plotted against PPP.  In theory the line should run generally from bottom left to upper right if increased price predicts increased income (hmmm... apparently I don't know how to show graphs made in Google Docs... help, anyone?).  So here's the table for said graph:

 

Price group   PPP
1             37.64
2             38.34
4             51.34
6             41.95
8-10          49.12
12-14         56.24
16-18         38.18
20-28         56.99

What you might notice (and it is screamingly obvious in the graph) is that, for the most part, the more you spend on a rider, the more you get.  However, there are two exceptions: the 4pt'ers are higher than expected, and the 16-18pt'ers are lower.  The fact that the average points per price is slightly lower for the 16-18s (38.18) than for the 2s (38.34) is kind of embarrassing.  

 

But averages can only tell you so much, so I looked at each group's distribution.  To do this I took the full range of scores in a group and split it into four equal-width bands.  Then I assigned each of the riders in the group to one of the four quartiles based on their score and looked at the distribution amongst the four quartiles.  If the scores in a group are normally distributed, then you would expect the quartiles to look something like this:


 

4th   15%

3rd   33%

2nd   33%

1st   15%

 

with more of the riders clustered around the average score.  I will go through each group in turn.  Percentages may not add to 100 because I am a lazy rounder.  Numbers in brackets are the score range for that quartile.  
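As a rough sanity check on those expected percentages, the sketch below works out what share of a normal distribution lands in each of four equal-width bands. It assumes the observed score range spans about two standard deviations either side of the mean; that is my assumption for illustration, not something measured from the VDS data.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def equal_width_shares(half_width_sd=2.0, bands=4):
    """Share of a normal distribution in each equal-width band of the range.

    The range is assumed to run from -half_width_sd to +half_width_sd
    standard deviations; shares are rescaled so they add up to 1.
    """
    step = 2.0 * half_width_sd / bands
    edges = [-half_width_sd + i * step for i in range(bands + 1)]
    total = normal_cdf(edges[-1]) - normal_cdf(edges[0])
    return [
        (normal_cdf(upper) - normal_cdf(lower)) / total
        for lower, upper in zip(edges, edges[1:])
    ]


shares = equal_width_shares()
print([round(100 * share, 1) for share in shares])
```

Under that assumption the shares come out at roughly 14%, 36%, 36% and 14%, which is in the same ballpark as the table above.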

In my initial analysis I used all riders in the group unless there was a good reason for their exclusion.  But in the last big VDS analysis the issue of non-scorers was raised, so I will include a second breakdown of groups where exclusion of non-scorers changes things dramatically.  
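The banding described above can be sketched in a few lines of Python. This is only an illustration of the approach as I read it: the scores are invented, and `range_quartile_shares` and its `drop_non_scorers` flag are hypothetical names, not anything from the spreadsheet behind this post.

```python
def range_quartile_shares(scores, drop_non_scorers=False):
    """Share of riders in each quarter of the score range, lowest band first.

    The bands are equal-width slices of [min score, max score], not
    equal-count quartiles.
    """
    if drop_non_scorers:
        scores = [s for s in scores if s > 0]
    if not scores:
        raise ValueError("no scores to analyse")
    lowest, highest = min(scores), max(scores)
    width = (highest - lowest) / 4
    counts = [0, 0, 0, 0]
    for s in scores:
        # The top score sits exactly on the upper edge, so clamp it to band 3.
        band = 0 if width == 0 else min(int((s - lowest) / width), 3)
        counts[band] += 1
    return [c / len(scores) for c in counts]


# Invented scores for one price group, including two donuts.
scores = [0, 0, 16, 120, 200, 310, 400, 650, 900, 1293]

all_riders = range_quartile_shares(scores)
scorers_only = range_quartile_shares(scores, drop_non_scorers=True)

# Print top band first to match the 4th-to-1st layout used in this post.
print([round(100 * share) for share in reversed(all_riders)])
print([round(100 * share) for share in reversed(scorers_only)])
```

Dropping the donuts does two things: it shrinks the head count in the bottom band, and it raises the bottom of the range, so a rider near a boundary can slip into a lower band (the 650 scorer does in this made-up example).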

 

20-28pt (13 riders)

4th (1746-2140): 15%

3rd (1347-1746): 38%

2nd (948-1346): 23%

1st (550-947): 23%

 

This distribution is normalish.  Most of the riders are in the middle quartiles.  This isn't the biggest group (only 13 riders), so the 1st and 4th quartiles only differ by 1 rider.  You are just as likely to boom as bust with this group.  But it's a pretty safe bet.  (No non-scorers.)

16-18pt (13 riders)

 

4th (766-994): 46%

3rd (537-765): 7%

2nd (308-536): 7%

1st (80-307): 38%

 

First of all, both Nibali and Pellizotti have been excluded.  Nibs is an animal, scoring over twice as much as the second rider in the group (Basso), and Franco's big fat donut has non-sporting reasons (and he was the only non-scorer).  We knew from the PPP graph that this group was weird, and now we know why.  The distribution is split, with almost half of the riders in the top quartile and nearly as many in the bottom.  Not only is this an underperforming group on average, but almost half of them flopped massively.  **Risky group**

12-14pt (15 riders)

 

4th (1127-1482): 20%

3rd (773-1127): 13%

2nd (419-773): 33%

1st (65-419): 33%

 

Like in the last group, I excluded the obvious outliers, in this case J-Rod and Kim Kirchen.  Here we see almost the identical average PPP as the high rollers in the 20-28 group.  But as you can see from the distribution, this is due to the hard work of a few.  (Again, no non-scorers besides Kirchen.)

8-10pt (26 riders, 3 donuts)

4th (970-1293): 11%

3rd (646-970): 15%

2nd (323-646): 23%

1st (0-323): 50%

First of all, I looked at how this changes with the exclusion of the 3 non-scorers.  Not a lot: the first non-non-scorer is Lars Bak with 16pts, so the range is not affected much.  But the excluded riders are all from the bottom quartile, so it does alter the percentages a bit (13, 17, 21 & 47%).  This group's distribution looks much like that of the group ahead of them, but a little riskier.  About a third clustered around the middle, but a larger proportion flopped than excelled.  

6pt (28 riders, 5 donuts)

 

4th (513-685): 14%

3rd (342-513): 17%

2nd (171-342): 28%

1st (0-171): 39%

 

The tendency for these distributions to skew towards the bottom reminds me of what majope said in the last VDS analysis thread: "only 20 teams actually scored below [the score for a team of average scoring riders]".  In theory one could compose a team of 25 6pt riders.  But even if you picked the top 25 from this category, you would only score 7048, placing you over a hundred points behind the Drewd.  

Oh, and if we exclude the 5 non-scorers, the distribution flattens out a bit again (13, 26, 30 & 30%).  Note that since the range shrinks, Romain Feillu gets the boot from the upper quartile.  

 

4pt (75 riders, 11 donuts)

 

4th (570-760): 5%

3rd (380-570): 8%

2nd (190-380): 30%

1st (0-190): 56%

First of all, Horner has been excluded because he's just too awesome.  As you may recall from hours ago when you were reading the top of this post, this group was one of the groups flagged by the group PPP graph.  As we can see, this is due to the fact that the quartiles of the 4pters and the 6pters line up almost exactly.  This suggests that 4pters are better value than 6pters.  But is this true?  The boundaries of the quartiles are pretty much the same, so we can compare directly.  The number of riders in both groups that scored in the upper two quartiles is about the same (9 and 11 in the 6pt and 4pt groups respectively).  However, since the 4pt group is that much larger, those riders make up 31% of 6pters and only 14% of 4pters (if you remove the non-scorers this shifts to 39% and 17%).  So, even though the ranges of points scored are about the same, and the PPP favours the 4pters, the distribution shows that the 6pters are a safer bet.  With non-scorers excluded, the 4pt quartiles come to 6, 11, 33 & 49%.
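The 6pt versus 4pt comparison boils down to dividing the same handful of top-half riders by two very different group sizes. Here is a small Python sketch using only the counts quoted in this post (group sizes, donuts, and riders in the upper two quartiles); the dictionary layout and the `top_two_share` helper are my own illustrative names.

```python
# Counts as quoted in the post.
GROUPS = {
    "6pt": {"riders": 28, "non_scorers": 5, "top_two": 9},
    "4pt": {"riders": 75, "non_scorers": 11, "top_two": 11},
}


def top_two_share(group, exclude_non_scorers=False):
    """Fraction of a group's riders that landed in the upper two quartiles."""
    size = group["riders"]
    if exclude_non_scorers:
        size -= group["non_scorers"]
    return group["top_two"] / size


for name, group in GROUPS.items():
    with_donuts = 100 * top_two_share(group)
    without_donuts = 100 * top_two_share(group, exclude_non_scorers=True)
    print(f"{name}: {with_donuts:.1f}% of all riders, "
          f"{without_donuts:.1f}% of scorers")
```

The results land within about a percentage point of the figures quoted above; the small gaps are presumably down to the lazy rounding already confessed to.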


2pt (200 riders, 78 donuts)

4th (511-680): 1%

3rd (340-511): 2%

2nd (170-340): 10%

1st (0-170): 87%

 

Initially, I just excluded Anton from this analysis.  As you can see, that doesn't really get at what this group looks like.  So here is the same breakdown, but with the top three riders excluded (Anton, Marcato, Pozzovivo).  If you look at their scores in the context of the rest of the top 10, they are head and shoulders above the rest.  So, chapeau to you... get out of my analysis.  And because of the large number of non-scorers, I've taken them out too.  With this smaller group of scorers (120) it looks like this.  

 

4th (313-417): 5%

3rd (210-313): 9%

2nd (107-210): 23%

1st (4-107): 62%

 

By no stretch of the imagination does this even out the distribution, but it tells a little more about how the 2pters are doing.  And they look a lot like the 4pters.  14% in the top two quartiles... ring any bells?  Regardless, a lot of these guys aren't doing very much.  

1pt (643 riders, 399 donuts!!!)

I'm not even going to put up the first breakdown...here is the 1pters with non-scorers excluded (and Porte and Sagan too, duh):

 

4th (293-390): 3%

3rd (197-293): 9%

2nd (100-197): 12%

1st (4-100): 75%

 

Wow... these guys are a crapshoot.  Even with the non-scorers removed, 75% of riders are in the bottom quartile.  

Besides those crazy 16-18pters, only the top group had more riders score in the top two quartiles (52%) than in the bottom two, reinforcing the idea that at the very top end of the scale, you are paying for reliability.  As long as you don't pick Cunego...

 

If you've made it to the end, congratulations; if you got anything of value... you deserve a cookie.  If you just skipped to the end 'cause I was blabbering too much, feel free to check my work directly and make your own analyses.  

Even if this doesn't help you design your teams next year, I think that something like this could be used to objectively determine riders' values for next year.  I know that there are subjective ways of determining rider values... but are they working? (*cough*EBH*cough*)

Discuss...

Comments


This was a fun fun read, the stats guy in me was geeking out

I’d really love to see the groups broken down into points/cost for quartile analysis too – would really help show what’s going on with respect to abilities even more. If I am productive this afternoon, may be something for me to return to…

All in all, thanks! Maths is fun.

My fruit bowl is full of sex wax--gavia

by Douglas Ansel on Oct 30, 2010 3:07 PM EDT

Dude

This is fucking awesome. I’m probably going to have to read it 5 times before I understand it.

"It was getting colder and colder as we went up. About halfway up, I started to go a little backwards and as I passed Thor he looked at me and said, "If you lose my wheel I will smash you." I took his wheel and found an extra gear." João Correia

by jsallee00 on Oct 30, 2010 3:18 PM EDT

Nice analysis...

I will have to look it over more closely when I have the time. Glad to see that the high-pointers are more reliable, the low-pointers less so. With the 1-pointers at least, I would guess that many of the non-scorers were selected on few if any teams; wonder if that’s true for the 2-pointers and 4-pointers as well.

I tend to think the poor performance of the 16-18 range is just noise due to the smallish sample (though I remember Majope pointed out the 18ers sucked last year too.) But if, for example, you had chosen your ranges as 18-32 and 12-16 instead, I bet the graph would have looked much smoother.

It’s still an open question for me as to whether the ideal team structure should be top-heavy or middle-heavy (pretty sure bottom-heavy is a bad idea.) And of course it may vary from year to year—you have to have guys who outperform, and those outperformers may come from different point ranges in different years.

What else can I say? I'm really happy. --Vincenzo Nibali

by tgartner on Oct 30, 2010 3:34 PM EDT

The sample size of the 16-18pters is pretty much the same as the high rollers and the 12-14pters.

But if you were to put the 16-18pters into the high-rollers quartiles, except for Nibali and Basso, all the rest would fall into the bottom quartile. So there is something about being labelled an almost high roller that sets you up for a fall.

by Hons on Oct 30, 2010 4:31 PM EDT

Intriguing

Looking at this group a little deeper, there are some interesting numbers that emerge.
Throwing out Jen Grey, there are 14 riders left in this group; only one (Nibali) showed improvement over his 2009 VDS score. Three stayed roughly the same (Basso, Martin, Breschel), and 10 guys had significant decreases. Looking further back (using the CQ ratings score), only 4 of the 10 showed a significant (> 20%) decrease over their 2008 scores: Kolobnev, Barbie, Kreuziger, Greipel (Armstrong and Basso don’t have 2008 scores). Oddly enough, Greipel had more CQ points in 2010 than he did in 2009, while dropping 258 VDS points (SSR anyone?).

Only two guys in this group (Levi, Ballan) didn’t show strong improvement from 2008-2009, so perhaps a lot of these guys’ “disappointing” seasons were predictable? Seven of the 12 riders with four years of CQ scores showed 2009 as the best year of the four; Greipel would make eight. Point being, it’s important to look at trends over multiple years instead of just the most recent.

This strikes me as a really speculative group: lots of young riders potentially on the way up, and some old guys just trying to keep it going. A lot of these guys will leave us wondering which was the anomaly – this year or last?

I leave you with Nibali’s last four seasons from CQ: 2203, 991, 657, 793. If he puts up 1000 pts next season, lots of people will consider it a bad year, but it’s much more consistent with his history.

(now what else can I do to avoid sitting on the trainer…)

"It was getting colder and colder as we went up. About halfway up, I started to go a little backwards and as I passed Thor he looked at me and said, "If you lose my wheel I will smash you." I took his wheel and found an extra gear." João Correia

by jsallee00 on Oct 30, 2010 5:24 PM EDT

!

I screwed that up. The four that showed significant decreases from 2008 to 2010 were Pozzato, Levi, Ballan and Gerrans. AK, Kreuz, Greipel and Barbie all had 2010 scores similar to their 2008 scores…

"It was getting colder and colder as we went up. About halfway up, I started to go a little backwards and as I passed Thor he looked at me and said, "If you lose my wheel I will smash you." I took his wheel and found an extra gear." João Correia

by jsallee00 on Oct 30, 2010 5:28 PM EDT

I didn't mean that the 16-18 group was an unusually small sample...

Just that when you’re comparing ranges as small as 15 riders (which all the upper ranges are), you are likely to see variations that are random. Nothing to be done about it—that’s how big the ranges are!—but it just makes me wonder if the underperformance in that one group is meaningful.

It does seem possible that the price range becomes sort of a dumping ground for riders who are prone to decline but too good to lowball—Armstrong, Levi, and Ballan certainly fit that profile—but I don’t know. I suspect that if we looked at several years’ worth of data we’d see a smoother curve.

What else can I say? I'm really happy. --Vincenzo Nibali

by tgartner on Oct 30, 2010 6:28 PM EDT

And here is one more reason why I sucked

5 six pointers.

Feillu
Boom
Langeveld
Maaskant
Seeldrayers

one in the Third Quartile and 2 in the Second and 2 in the First with one of the doughnuts.

3 Four pointers

Roche
Hooter
Kessaikopf

one in the 4th one in the 2nd and one in the 1st scoring 0

3 8-12

Hesjedal
Velits
Bozic

again one 4th one 2nd and one 1st

16-18 the one eliminated lowball outlier

20-28

Schleck
Gesink

one in the 3rd and one in the second.

So aside from Anton, Roche and Hesjedal, I showed a clear ability to steer away from the top picks in all categories and a definite tendency towards the bottom of the pack. So showing in the upper half of all teams seems to be a blessing.

'When playing a game, the goal is to win, but it is the goal that is important, not the winning' - Dr. Reiner Knizia

by bought with blood on Oct 30, 2010 3:46 PM EDT

I shudder to think what my team would look like through those lenses

by Hons on Oct 30, 2010 4:32 PM EDT

yeah it seems you really need

to get big chunks of points with your top guys to get a high score. In theory the best possible team would consist of a lot of outperforming lower scores, and the only 18+ guy (of which we were restricted to 3) would be Farrar, but you would have to be Nostradamus to pick the correct 15 or so 1-4 pointers to get that team. It seems safer to pick as many top-tier guys as you can, assuming you can get at least the 50-55 PPP return that is the average for them, and take the safe points.

by Nomer on Oct 30, 2010 7:59 PM EDT

Those quartile figures are really scary when you get down to the 4s, 2s, and 1s...

Definitely not the place to build your team.

Fascinating stuff.

What else can I say? I'm really happy. --Vincenzo Nibali

by tgartner on Oct 31, 2010 2:31 AM EDT

Having the year's magical riders is a huge key to winning.

Nice article, and I'd like to take a moment to reflect on picking one of the three 8-pointers who did not score – the honorable and usually absent Juan Jose Cobo.

I did great with the cheap riders, but I had a lot of them, and took some donuts, too.

More than any strategy, I just feel that having Sagan, Horner, Ryder, J-Rod, Anton, Porte, guys that went way above expectations all across the point scale, really determines the game, and it's hard to foresee any kind of Sagan/Porte performance coming, let alone Chris Horner, a veteran with a trackable history.

by rubesANdbabes on Nov 4, 2010 11:14 AM EDT
