Finally A Post! CrossFit Athlete Popularity?

It’s been quite a while since I last posted. I have had lots of things to talk about, but just haven’t found the time to write anything.

In this post, I am going back to analyzing CrossFit-related data, but this time I am taking a completely different angle. Rather than analyzing scores and comparing athletes based on their performances, I am comparing athletes based on their popularity and what the implications of popularity are within the CrossFit community.

The first thing I had to consider with this project was how to measure athlete popularity and, second, where to find enough data to actually measure it. I decided to go with a rather simplistic model: an athlete’s popularity is based on how often that individual is discussed in the CrossFit community.

There are lots of potential data sources for this kind of information. I went with CrossFit.com. Any time an athlete was featured in an article, had their results linked, or was discussed in the comments, I collected that information.

The technical details of how I did this are pretty simple. I got a list of all the 2010 Games athletes’ names, then wrote a small PHP script to request every article posted on CrossFit.com over the past year. I then counted how many times each name appeared within the article’s text and comments. I tracked these counts separately for articles posted before the CrossFit Games and for articles posted after the Games. I also added in a few popular non-2010 competitors, like Pat Barber, Dave Lipson, Josh Everett, Pat Sherwood, and Kelly Starrett.
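A rough sketch of the counting step, written in Python rather than the original PHP, might look like the following. The function name and the sample text are made up for illustration, and fetching the articles themselves is left out.

```python
import re
from collections import Counter

def count_mentions(text, names):
    """Count case-insensitive occurrences of each full name in a block of text."""
    counts = Counter()
    for name in names:
        # \b word boundaries stop a name from matching inside a longer word
        pattern = r"\b" + re.escape(name) + r"\b"
        counts[name] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

# Hypothetical article text, for illustration only
sample = (
    "Great work by Rob Orlando today. "
    "rob orlando and Chris Spealler both posted times."
)
mentions = count_mentions(sample, ["Rob Orlando", "Chris Spealler", "Pat Barber"])
```

Matching on the full name only is what makes this approach simple, and it is also why nicknames and partial names slip through.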

There are some assumptions with this approach, and as a result some data will be lost. Mentions that use a nickname, partial name, or pronoun rather than the person’s full name are not captured. Further, there’s no image analysis, so images in posts, which could potentially reference an athlete, are not taken into account.

The goal of dividing the data into two categories, pre-games and post-games, was to look at how the games influence someone’s popularity. Does doing well in the games make you more popular? Why is that important?

Using the two sets of frequencies, I normalized the values by the number of posts analyzed in each category (196 pre-games, 77 post-games). If you sort these normalized frequencies and plot them, one interesting thing is that they appear to follow a lognormal distribution. That is, very few athletes are mentioned a lot, and many athletes are mentioned only occasionally. This is mostly interesting from a statistical perspective, as many statistical tests assume the data are normally distributed.
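As a minimal sketch of the normalization, assuming it is simply raw mentions divided by the number of posts in the category: the post counts (196 and 77) come from above, while the athlete labels and raw counts here are hypothetical.

```python
PRE_GAMES_POSTS = 196   # from the post
POST_GAMES_POSTS = 77   # from the post

def normalize(counts, num_posts):
    """Divide each raw mention count by the number of posts analyzed."""
    return {name: round(count / num_posts, 3) for name, count in counts.items()}

# Hypothetical raw counts, for illustration only
pre_counts = {"Athlete A": 90, "Athlete B": 49}
pre_nf = normalize(pre_counts, PRE_GAMES_POSTS)
```

Dividing by the number of posts is what makes the two periods comparable, since there were far more pre-games posts than post-games posts.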

Below is a table displaying the most highly mentioned athletes pre-games and a table displaying the most highly mentioned athletes post-games.

Pre-Games Popularity
Name              NF (normalized frequency)
Rob Orlando       0.459
Pat Sherwood      0.316
Jason Khalipa     0.245
Dave Lipson       0.204
Chris Spealler    0.194
Josh Everett      0.158

Post-Games Popularity
Name              NF (normalized frequency)
Pat Sherwood      0.714
Graham Holmberg   0.675
Rob Orlando       0.519
Chris Spealler    0.519
Dave Lipson       0.519
Austin Malleolo   0.312

You can see that there’s some fluctuation in the order, and some athletes who were not in the original set of highly mentioned athletes are now in the list, in particular Graham Holmberg and Austin Malleolo. This makes sense, as both Graham and Austin did very well this year (1st and 6th respectively), whereas Graham was not near the top in 2009 and Austin did not compete at all. However, there was a lot of anticipation and expectation with regard to former games competitors like Josh Everett, Jason Khalipa, and Rob Orlando. Following the games, of those three only Rob remained among the most highly mentioned athletes. It’s also interesting that Dave Lipson is near the top in both categories, even though he does not have a strong games history.

To determine whether there’s a correlation between measured athlete popularity and finishing position in the games, I compared the athletes’ post-games popularity scores with their games rankings. There was a statistically significant correlation between these two vectors. Note that I had to remove any non-games athletes from this analysis.
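The post does not say which correlation measure was used. Since one of the two vectors is a ranking, a Spearman rank correlation is one reasonable choice, and the sketch below assumes it. The popularity scores and finishing positions are hypothetical, and no significance test is included.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: post-games popularity and finishing position
popularity = [0.60, 0.45, 0.30, 0.20, 0.05]
finish = [1, 3, 2, 4, 5]
rho = spearman(popularity, finish)
```

With this setup a strong relationship shows up as a negative coefficient, because a better finish is a smaller number.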

Following this, I compared the difference in popularity score pre- and post-games for all athletes by subtracting each athlete’s post-games normalized frequency from their pre-games normalized frequency. A large negative value indicates that an athlete’s number of mentions increased greatly post-games, while a large positive value indicates the opposite.
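The subtraction can be sketched as follows, using the four athletes who appear in both popularity tables above. The function name is made up for illustration.

```python
# Normalized frequencies taken from the two popularity tables above
pre = {
    "Pat Sherwood": 0.316,
    "Chris Spealler": 0.194,
    "Dave Lipson": 0.204,
    "Rob Orlando": 0.459,
}
post = {
    "Pat Sherwood": 0.714,
    "Chris Spealler": 0.519,
    "Dave Lipson": 0.519,
    "Rob Orlando": 0.519,
}

def popularity_change(pre, post):
    """Pre-games minus post-games, so a negative value means more mentions afterwards."""
    return {name: round(pre[name] - post[name], 3) for name in pre if name in post}

changes = popularity_change(pre, post)

# Most negative first, i.e. biggest post-games gain first
ranked = sorted(changes.items(), key=lambda item: item[1])
```

Because the inputs here are already rounded to three decimals, a result can differ from the published difference in the last digit.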

Athlete Popularity Fluctuations
Rank  Name                Difference
1.    Graham Holmberg     -0.614
2.    Pat Sherwood        -0.398
3.    Chris Spealler      -0.326
4.    Dave Lipson         -0.315
5.    Austin Malleolo     -0.291
46.   Blair Morrison       0.026
47.   Josh Everett         0.041
48.   Tommy Hackenbruck    0.045
49.   Spencer Hendel       0.058

We can see from this table that Graham Holmberg’s CrossFit.com presence increased dramatically after the games. His difference score is more than 4 standard deviations away from the mean difference. This is hugely significant.
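"Standard deviations away from the mean" is a z-score. A minimal sketch of that calculation is below. The difference values are hypothetical, since the full set of athlete differences is not listed here, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def z_scores(differences):
    """Express each difference as a number of standard deviations from the mean."""
    values = list(differences.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return {name: (value - mu) / sigma for name, value in differences.items()}

# Hypothetical difference scores, for illustration only
differences = {
    "A": -0.09, "B": -0.07, "C": -0.05, "D": -0.05,
    "E": -0.04, "F": -0.04, "G": -0.04, "H": -0.02,
}
scores = z_scores(differences)
```

In this toy data, athlete A sits two standard deviations below the mean; a value beyond four, as reported for Graham Holmberg, would be far more extreme.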

Pat Sherwood was consistently popular across both data sets, but he also had a huge increase in visibility post-games. I think a lot of this had to do with his success as a commentator during the games.

So why should anyone care about any of this?

A big reason, I think, is that as CrossFit becomes more mainstream and continues to grow, there are more money and business opportunities. There are starting to be huge financial incentives for doing well at the games, or at least for becoming a popular athlete. The financial incentives go beyond the money that you win at the actual Games and extend to your own business. Many of the games athletes are coaches and affiliate owners. Surely, being a “web-lebrity” is going to help drum up business. Furthermore, some athletes (e.g., Rob Orlando) are getting into selling equipment. Being a name within the community is going to help establish your brand and your company.

Please don’t think that I am insinuating that these athletes are competing solely for the money and business opportunities. But I think that as the sport grows, visibility/popularity and establishing yourself as a “brand” are going to play a larger role. Similar to Olympic athletes, top-level CrossFitters may need sponsorship to maintain a full-time training schedule. To entice sponsors, even the most reluctant athletes may need to increase their presence in the community.

Why should you care if you will never be a Games competitor?

Well, I think from a business perspective, it’s important to recognize how much this kind of visibility matters. If you’re a games competitor, you have a leg up on just about every other CrossFitter out there in terms of promoting your business. However, there are alternative means to increase your impact. Web Smith blogs about the importance of social media like Twitter and Facebook in helping to promote your business.

In California, there are tons of CrossFit gyms. There are three just in Palo Alto. How do you separate yourself, and how do you get potential members to choose you over so many alternatives? Well, you can take your chances and train for the games, or you can increase your popularity/presence through other means: Twitter, Facebook, sponsoring local athletes, competing locally, blogging, having a useful website, and possibly search optimization for your site.

I am going to conclude this rather long post with my standard warning about this kind of data analysis. The intention is not to insult any athletes; this is merely a thought experiment for me. Also, there are many ways these results could be analyzed. My interpretation and explanation may not be the whole story. Finally, I am making a rather large assumption here that popularity does lead to increased business. I do not know directly whether any of these athletes have had increased business at their boxes as a direct result of their web presence.

About the author

Sean Falconer

I write about programming, developer relations, technology, startup life, occasionally Survivor, and really anything that interests me.