@spinny I see what you're driving at. I disagree with the methodology of removing "peaky" athletes and recalculating scores though.
It also all depends on what the scoring system is trying to accomplish. Is H a better athlete than J? J is certainly more balanced, and J's worst weakness is not nearly as bad as H's. So which should be ranked higher? Except for cases like A and B, it's not going to be clear who should be ranked above someone else. Everything is a tradeoff, and the ranking system is going to need to decide how to weight different results.
@spinny I thought this would reward people who have more consistent finishes and punish specialists.
But looking at the +/- of the athletes around me, the athletes that moved up (-places) did 1 or 2 workouts really well, while all the (+places) like me did much better in all the other workouts. This ranking rewards specialists.
It looks like every single person that cleaned a lot moved up a lot, even though they did substantially worse in all the other workouts.
@adenna It doesn't actually care if your scores are consistent. It only cares if your overall combination of speed and strength are better than the next athlete.
If you compare someone with 10-10-10-10-10 placement vs 06-11-11-11-11, you could say they are equally fit overall, even though one looks like a specialist.
But then you compare 10-10-10-10-10 to 07-11-11-11-11 and the generalist wins. Or to 05-11-11-11-11 and the specialist wins.
With real world numbers, it's harder to subjectively decide who is better overall, so I make the computer do the math.
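For anyone who wants to check the three placement comparisons above, here is a minimal sketch. It assumes the placements are simply added up and the lower total wins, which is how the example numbers behave; it is not the normalized ranking discussed in this thread, and the function names `place_sum` and `compare` are made up for illustration.

```python
def place_sum(placements):
    """Add up per-workout placements; a lower total is better (assumed rule)."""
    return sum(placements)


def compare(generalist, specialist):
    """Return which athlete has the lower place total, or 'tie'."""
    g, s = place_sum(generalist), place_sum(specialist)
    if g == s:
        return "tie"
    return "generalist" if g < s else "specialist"


generalist = [10, 10, 10, 10, 10]

# The three specialist lines from the comment above: 06, 07 and 05 in the first workout.
for best_finish in (6, 7, 5):
    specialist = [best_finish, 11, 11, 11, 11]
    print(best_finish, place_sum(generalist), place_sum(specialist),
          compare(generalist, specialist))
```

Under that assumption the totals are 50 vs 50, 50 vs 51 and 50 vs 49, which matches the tie, generalist win and specialist win described above.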
"overall combination of speed and strength are better than the next athlete." This is an inaccurate way of stating what this ranking algorithm shows. I take that statement to mean the generalist should win, considering CrossFit is a measurement of work capacity across broad time and modal domains. I can see how the math works the other way; it just doesn't make sense to me when applied to CrossFit.
IMO, using a 1-rep max as the only measure in a workout really skews the data/ranking when the goal, in CrossFit terms, is finding the fittest person. I would rather have seen 18.2 done as part A over, say, 10 minutes for total reps, then part B with, say, 5 minutes to find a 1-rep max; combine reps + lb to get one score for 18.2.
@adenna I don't think the workouts are flawed. What if one workout was just to run a 5k for time? It's the same argument that a specialist could win. It's a useful measure by itself, but it needs to be combined thoughtfully with other workouts to determine an overall score.
@spinny This was an amazing read, though a good TLDR would be nice. What type of athlete would rank worse with a normalized system?
@spinny Nice work man. I don't suppose you used any python in your analysis?
I'm creating an open source python package to do various shit with crossfit data.
May be of interest https://github.com/raybellwaves/cfanalytics
@misseldiva Of course. I would love to post the women's data too. And even re-analyze the 2017 results. There's a lot of data out there, so I'm trying to figure out what is the most important, how to display it, how to automate my process a little better, and of course pick out any bugs in the algorithms.
One thing I found about Google Sheets is that an Excel workbook with a lot of worksheets may be too big for it. But if you copy one of the worksheets out to a new workbook, it might be small enough for Google Sheets to handle.
@spinny Thanks for the tip! I will try importing a bit differently. I was just trying to upload the CSV into Sheets since I don’t do any spreadsheet editing natively (therefore always relying on Sheets).
@spinny Very nice and interesting work. Definitely a different perspective on scoring that gives new insights, but I believe CF HQ wants to keep things simple, especially since it affects thousands of people.
It would be great if you could share your js code for the re-ranking, so we could reproduce it.
Also, I would like to ask whether you are planning to update it with all the scores now that the Open is over, and whether it's possible to include a column for the competitor ID.
@goldielocks I agree that the current system probably is what it is because it is simple. And it does find the athletes who are the very best at everything: the top 10 stay in the top 10.
Of course I plan to re-run it after 18.5 is posted, and I can add a column for competitor id (and affiliate id). I can share the code once I get it to a reasonable state.
@spinny I was actually agreeing with you. I thought it was unfair that I was ranked like 400 spots in my region behind somebody that I beat in 4 workouts, just because that person cleaned 50 pounds more than me.
If I had cleaned 50 pounds more, I would be ranked in the 700s. But because of only one WEAK lift, I am ranked 1300ish. I thought this ranking would put me in a fair spot, since I did okay in the 4 workouts but not in ONE lift. But nope, my ranking is actually worse on the spreadsheet. Can you explain why?