Phelps vs The World in Medals? A Statistical Look at Medal Share
Lots of "Michael Phelps has more medals than..." articles are popping up on my feed. As I suspected, most are using raw medal counts. Raw counts are one way to compare athletes or countries, but other factors are at play. For example, Phelps participates in more events than the majority of athletes. In the last two Olympics (2008 and 2012), he participated in 15 events and won gold 12 times, an 80% gold medal rate. In comparison, Usain Bolt participated in 6 events over the last two Olympics and won gold in all 6, a 100% gold medal rate. 6 gold medals versus 12 gold medals... 100% versus 80%... depending on how you cut your raw data, different ways to rank athletes emerge and tell a different story.
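To make the "how you cut the data" point concrete, here is a minimal sketch using only the four numbers quoted above. The dictionary layout and the gold_rate helper are my own illustration, not part of any dataset.

```python
# Figures quoted in the paragraph above (2008 and 2012 Games combined).
athletes = {
    "Phelps": {"events": 15, "golds": 12},
    "Bolt": {"events": 6, "golds": 6},
}


def gold_rate(record):
    """Fraction of entered events that ended in a gold medal."""
    return record["golds"] / record["events"]


# Two rankings built from the same raw data.
by_count = sorted(athletes, key=lambda name: athletes[name]["golds"], reverse=True)
by_rate = sorted(athletes, key=lambda name: gold_rate(athletes[name]), reverse=True)

print("Ranked by raw golds:", by_count)
print("Ranked by gold rate:", by_rate)
```

Same data, opposite order: Phelps leads on raw golds, Bolt leads on rate.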
The point isn't to make Phelps look "worse" than he actually is -- in fact, the data suggests otherwise -- but to offer a view into how to assess smaller countries, which is something Phelps sort of models all on his own.
Let's start with some basic stats. Above is a simple comparison of medal share for the US. Medal share is a very simple statistic: it's the number of medals a country has won divided by the number of medals awarded at an Olympic Games.
In the previous 5 Olympic Games, the USA has averaged about 12% of the medal share. That's around 100 medals every Olympic Games. Currently, out of all the events that have awarded medals, the USA has 14% of the medal share. You can interpret this one of two ways: 1) the USA is performing better than in other years and should therefore finish with more medals than usual, or 2) the USA is going to regress back to its average of 12% medal share over the coming days. Either interpretation is fine; we just won't know which is correct until the conclusion of the Games.
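The two interpretations can be put into rough numbers. This is only a sketch: it borrows the approximate counts quoted later in this post (about 230 medals awarded so far, about 950 by the end of the Games), and the "regression" scenario assumes the USA wins its historical 12% of the remaining medals, which is just one way to model it.

```python
def medal_share(medals_won, medals_awarded):
    """Share of all awarded medals that went to one country (0.0 to 1.0)."""
    if medals_awarded <= 0:
        raise ValueError("medals_awarded must be positive")
    return medals_won / medals_awarded


# Rough figures from this post; treat them as approximations.
AWARDED_SO_FAR = 230     # medals handed out so far at Rio
EXPECTED_TOTAL = 950     # estimated medals by the end of the Games
CURRENT_SHARE = 0.14     # USA share of medals awarded so far
HISTORICAL_SHARE = 0.12  # USA average over the previous 5 Games

medals_so_far = CURRENT_SHARE * AWARDED_SO_FAR
medals_remaining = EXPECTED_TOTAL - AWARDED_SO_FAR

# Interpretation 1: the USA keeps its current pace for the whole Games.
keep_pace_total = CURRENT_SHARE * EXPECTED_TOTAL

# Interpretation 2 (assumed model): the USA wins only its historical
# share of the medals that are still to be awarded.
regress_total = medals_so_far + HISTORICAL_SHARE * medals_remaining

print(f"Keep current pace:  about {keep_pace_total:.0f} medals")
print(f"Regress to average: about {regress_total:.0f} medals "
      f"({medal_share(regress_total, EXPECTED_TOTAL):.1%} final share)")
```

Under these assumptions the gap between the two readings is about 14 medals, which is why it is hard to tell them apart this early.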
Using this approach of current medal share versus previous average medal share, we can take a look at all competing countries over the last 5 Olympic Games.
You'll notice a few trends in this graph. The first is that many countries that have medaled in previous Games have yet to medal at Rio 2016. That is completely fine and logical -- barely 25% of the events have awarded medals, so there is plenty of time for countries with fewer athletes to medal in their respective sports. Another trend is that the USA has had the greatest medal share over the last 5 Games. The US is represented in nearly every Olympic event, with many elite athletes winning in multiple events across swimming, gymnastics, and track & field. One trend that seems to stick out to folks when I show them this graph is how poorly Germany is currently doing compared to its expected medal share. I try to reassure folks that 75% of the medals are still out there -- plenty of time for Germany to meet expectations. However, the counterargument is that this could be Germany's worst showing in the last 6 Olympic Games. We won't know until Rio concludes.
Let's take this back to Phelps versus The World. Phelps' golden age was the 2004 and 2008 Olympics, where he participated and medaled in 16 total events. Using some simple averaging, I calculated his expected 2016 medal share to be 0.37%, which comes out to 3-4 medals from the 5 events he is participating in. At the moment, he has 3 medals from 3 events. Again, this value could regress in his last two events, or perhaps Phelps medals in both and crushes expectations.
If you take the difference between his current medal share and his expected medal share, you end up with how close to or how far from expectations Phelps is performing. Here, a difference of 0.93% could mean he is performing within his typical range, or you could argue he is performing slightly above average, or you could argue he is due for a regression. A conservative estimate would say he is performing near expectation.
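For anyone who wants to check the arithmetic, here is a small sketch. It takes the 0.37% expected share as given (the averaging behind it is not reproduced here) and uses the rough totals of about 230 medals awarded so far and about 950 expected overall.

```python
# Approximate totals quoted in this post.
MEDALS_AWARDED_SO_FAR = 230
EXPECTED_TOTAL_MEDALS = 950

phelps_medals = 3        # medals Phelps has won so far at Rio
expected_share = 0.0037  # the 0.37% expected share, taken as given

current_share = phelps_medals / MEDALS_AWARDED_SO_FAR
difference = current_share - expected_share
expected_medals = expected_share * EXPECTED_TOTAL_MEDALS

print(f"Current medal share: {current_share:.2%}")    # 1.30%
print(f"Difference:          {difference:.2%}")       # 0.93%
print(f"Expected medals:     {expected_medals:.1f}")  # 3.5
```

The 3.5 expected medals is where the "3-4 medals" figure comes from.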
We can compare Phelps' medal share difference (the rightmost set of bar values) to the medal share difference of all other countries that have medaled so far. With 3 medals to his name and only 25% of events having awarded medals, Phelps has more medals than all but 10 countries at these Olympic Games. This number will definitely regress, since the inflation in Phelps' numbers is due to only ~230 medals having been awarded so far rather than ~950 (the estimated medal count by the end of the Olympics).
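The inflation is easy to see by holding Phelps' medals fixed and letting the denominator grow. This is a rough sketch using the approximate totals above; the end-of-Games scenarios are hypothetical, not predictions.

```python
def share_pct(medals, total_awarded):
    """Medal share expressed as a percentage of all medals awarded."""
    return 100 * medals / total_awarded


# Hypothetical scenarios built from the approximate totals in this post.
scenarios = {
    "now (3 medals of ~230 awarded)": share_pct(3, 230),
    "end of Games, no further medals (3 of ~950)": share_pct(3, 950),
    "end of Games, one more medal (4 of ~950)": share_pct(4, 950),
    "end of Games, medals in all 5 events (5 of ~950)": share_pct(5, 950),
}

for label, pct in scenarios.items():
    print(f"{label}: {pct:.2f}%")
```

His expected share of 0.37% sits between the 3-medal and 4-medal outcomes, so even a strong finish lands far below today's inflated figure.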
The underlying factor here is the availability of resources to promote elite athleticism. In a different country, someone with Phelps' physical attributes may never have gotten a chance to swim, for many reasons -- one being that swimming as a sport is non-existent in many countries. A big factor is how sports are prioritized in other countries. China is known for gymnastics, diving, weightlifting, and table tennis. Gymnastics in the US is a medal sport for women, but US men do not perform as well as US women in gymnastics. The US is not considered a threat in weightlifting, diving, or table tennis. China, on the other hand, prides itself on all four of these sports. Table tennis is the most popular sport in China, with professional leagues that pay their athletes. That's just not a prioritized sport in America, or in many other countries.
So how do you compare someone like Phelps to an entire country? Well, it's not really that fair. As said before, Phelps gets to medal in potentially 5 events. For events like weightlifting or any team sport, there is generally only one event you can medal in. Gymnastics and track & field are really the only fields comparable to swimming, since one athlete can contend for a medal in multiple events. But the rulers of swimming, gymnastics, and track & field are also US athletes.
There is no great way to compare the achievements of one person to the achievements of an entire nation. It's practically impossible even to compare the achievements of one athlete to those of another athlete in a different sport. There is one number that sticks out in my mind for Phelps, and that's the second-highest number of gold medals won by anyone. That number is 9. Larisa Latynina achieved this over 3 Olympic Games, across 4 different events. Paavo Nurmi did it over 3 Games, across 7 events. Mark Spitz did it over 2 Games, 7 events. Carl Lewis did it across 4 Games, 4 events. And then there's Phelps. 4 Games, 8 events, 21 gold medals and counting. That many events over the last 16 years of Olympics -- incomparable.
For a far more interesting data visualization of countries and their performances in specific Olympic events, I highly recommend this article!