Predicting the Winner of Survivor Heroes vs. Healers vs. Hustlers


In my last post, I discussed a large Survivor-related project that I’m involved in, where volunteers code episodes of Survivor by the observed behaviors that players exhibit. You can read all about that project here.

One of the potential applications of this crowdsourced data is using machine learning models to predict winners. All the nitty-gritty details are in my original post, but basically we can use the data collected from existing seasons to train a classifier that calculates, after each episode, the probability that a given player is a likely winner.

We currently have 7 volunteers coding the current season of Survivor, Heroes vs. Healers vs. Hustlers, so each week, I am going to update this post with the latest predictions about the winner.

Also, if you’re interested in this work and want to help by watching old seasons and helping us collect data, you can see our signup information here.

Winner Predictions Episode by Episode

For these predictions, I trained a Naive Bayes classifier based on 27 previous seasons of data. Although it might be useful to eventually consider things like idol finds, immunity wins, sex, age, and lots of other factors, for the training, I relied completely on the behavior data that volunteers sent me.
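To make the idea concrete, here is a minimal sketch of a Naive Bayes winner classifier in plain Python. It is not the actual model: it assumes the Gaussian variant (the post does not say which variant was used), and the behavior names, the per-player values, and the names `fit_gaussian_nb` and `winner_probability` are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical feature order; every value below is invented for illustration.
BEHAVIORS = ["analytical", "diplomatic", "naive"]

def fit_gaussian_nb(rows, labels, var_floor=1e-3):
    """Estimate a prior plus per-behavior mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # var_floor keeps variances away from zero on tiny samples.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model

def winner_probability(model, row):
    """Return P(winner | row), treating behaviors as independent Gaussians."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        total = math.log(prior)
        for v, m, var in zip(row, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_scores[label] = total
    # Subtract the max before exponentiating for numerical stability.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights[1] / sum(weights.values())

# Toy training data: one behavior vector per past player, 1 = winner.
past_players = [
    [0.30, 0.25, 0.05],
    [0.28, 0.20, 0.10],
    [0.05, 0.10, 0.30],
    [0.10, 0.05, 0.25],
    [0.08, 0.12, 0.20],
]
won = [1, 1, 0, 0, 0]
model = fit_gaussian_nb(past_players, won)

# Toy current-season players, ranked by their winner probability.
current = {"Player A": [0.29, 0.22, 0.07], "Player B": [0.06, 0.08, 0.28]}
probabilities = {name: winner_probability(model, vec) for name, vec in current.items()}
ranking = sorted(probabilities, key=probabilities.get, reverse=True)
print(ranking)
```

Re-running the last few lines on the updated behavior vectors after each episode would give a new ranking each week.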

Episode 1:

  1. Ryan Ulrich
  2. Ben Driebergen
  3. Roark Luskin
  4. Cole Medders
  5. Joe Mena

Quick Thoughts:

It’s interesting that Roark came out third this first week. She didn’t have a tremendous amount of actual airtime and maybe only one or two confessionals. However, what we did see of her was very strong and positive.

Episode 2:

  1. Ben Driebergen
  2. Ryan Ulrich
  3. Ali Elliott
  4. Joe Mena
  5. Devon Pinto

Quick Thoughts:

Ryan dropped to number two this week and Ali sneaked in at third. With the Hustlers going to tribal this week, it makes sense that there’d be some movement here. Ben’s alliance building with Chrissy moved him up to first. Although it may seem strange that Chrissy is not in the top 5, the big difference between Ben and Chrissy is that besides being strategic, Ben has exhibited behaviors of being empathetic, charming and funny.

Episode 3:

  1. Ben Driebergen
  2. Ali Elliott
  3. Ryan Ulrich
  4. Devon Pinto
  5. Joe Mena

Quick Thoughts:

For the second week in a row, Ben’s number one. Ali jumped up to number two with Ryan dropping one spot, and Joe and Devon switched places.

Episode 4:

  1. Ryan Ulrich
  2. Ben Driebergen
  3. Joe Mena
  4. Ali Elliott
  5. Devon Pinto

Quick Thoughts:

For the third week in a row, we end up with the same top 5 with the order changing slightly. One really interesting thing to note is that Chrissy has moved up to 6th place. She started at the back of the pack after the first episode but has been consistently gaining ground each week.

There are a lot of Healers we still haven’t seen much from, so it’s really hard to factor them into the ranking. Roark and Dr. Mike have yet to go to tribal or play a major role in the action. Cole and Jessica haven’t gone to tribal but have had some key scenes related to advantages in the game and their little flirtmance.

Deeper Analysis:

For this week, besides calculating which players have the greatest probability of winning based on the current set of episodes, I also calculated which prior winner is most similar to the top five players based on their behavioral similarities. I did this by taking each player from season 35 and calculating the dot product between their behavioral vector and all the winners. The combination with the highest dot product is the winner they are most similar to.
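The dot-product comparison described above can be sketched in a few lines. The behavior order, the vectors, and the placeholder winner names are assumptions made up for this example, not the coded data.

```python
def dot(u, v):
    """Dot product of two equal-length behavior vectors."""
    return sum(a * b for a, b in zip(u, v))

def rank_winners(player_vec, winners):
    """Return winner names ordered from most to least similar to the player."""
    return sorted(winners, key=lambda name: dot(player_vec, winners[name]), reverse=True)

# Hypothetical vectors in the order [analytical, diplomatic, deceptive].
winners = {
    "Winner A": [0.5, 0.3, 0.1],
    "Winner B": [0.1, 0.2, 0.6],
}
player = [0.4, 0.3, 0.1]

ranked = rank_winners(player, winners)
most_similar = ranked[0]
print(most_similar)
```

One thing to keep in mind with a raw dot product is that it rewards vectors with larger values overall, so a winner with a lot of coded behavior can score highly against many players.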

The results are below:

  1. Ryan Ulrich most similar to John Cochran
  2. Ben Driebergen most similar to Denise Stapley
  3. Joe Mena most similar to Tony Vlachos
  4. Ali Elliott most similar to Kim Spradlin
  5. Devon Pinto most similar to J.T. Thomas

These results seem pretty spot on to me.

Devon is similar to J.T. from Tocantins in the sense that he seems innocent, connects with people socially and is pretty athletic. We haven’t seen enough from Ali to say necessarily that she has Kim’s kind of dominance, but everyone wants to work with her and she’s smart and strategic. Joe to Tony is an easy comparison. Ben is relaxed like Denise and everyone wants to work with him. And Ryan to Cochran is an easy comparison as well.

Just to note, like Ryan, Dr. Mike came out most similar to Cochran, and Ashley is currently most similar to Danni Boatwright.

I also wanted to take a deeper look at specific behaviors to see why some people are being ranked highly over others. I calculated the average and standard deviation for each behavior after episode 4, and looked at who is more than 1 standard deviation away from the average (i.e. an outlier).
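As a rough sketch of that outlier check for a single behavior, assuming the population standard deviation (the post does not say whether the population or sample version was used) and using invented values and placeholder player names:

```python
from statistics import mean, pstdev

def find_outliers(values_by_player, threshold=1.0):
    """Return {player: z-score} for players more than `threshold` std devs from the mean."""
    values = list(values_by_player.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return {}  # everyone has the same value, so nobody is an outlier
    return {
        name: (value - mu) / sigma
        for name, value in values_by_player.items()
        if abs(value - mu) > threshold * sigma
    }

# Hypothetical share of coded behavior that was "analytical" for each player.
analytical = {"P1": 0.10, "P2": 0.12, "P3": 0.11, "P4": 0.40, "P5": 0.09}
outliers = find_outliers(analytical)
print(outliers)
```

A positive z-score marks someone well above the cast average for the behavior, and a negative one marks someone well below it.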

Below are some of the behaviors recorded and who at this point has the largest representation for a given behavior and who was the lowest.

Analytical

Most: Ryan Ulrich
Least: JP Hilsabeck

Diplomatic

Most: Ben Driebergen
Least: Jessica Johnston

Perceptive

Most: Ben Driebergen
Least: JP Hilsabeck

Charm

Most: Devon Pinto
Least: Joe Mena

Funny

Most: Ryan Ulrich
Least: Lots of people

Leadership

Most: Ben Driebergen
Least: Ashley Nolan

Deceptive

Most: Joe Mena
Least: Tie between Desi, Ashley, JP and Roark.

Naive

Most: Cole Medders
Least: Lots of people

Minion

Most: JP Hilsabeck
Least: Lots of people
 

We end up seeing Ryan and Ben show up at the top of a lot of the behaviors we would normally associate with good Survivor gameplay, while poor JP is dominating bad gameplay. Ryan is currently more than 2 standard deviations above the mean for analytical behavior (Chrissy is in 2nd place for this behavior), while Ben is more than 2 standard deviations above the mean for diplomacy.

The other members of the current top 5 are not always the top for a behavioral category, but they are generally above average for most “good” behaviors.

That’s all for this week. Can’t wait for episode 5.

Episode 5:

  1. Ryan Ulrich
  2. Ben Driebergen
  3. Ali Elliott
  4. Devon Pinto
  5. Joe Mena

Quick Thoughts:

Shockingly we still have the same top 5 as the last few weeks with a slight reshuffling. Not shown here, but Lauren slipped into 6th and Chrissy fell to 9th.

So why is Chrissy so low?

Looking at some of the individual codes recorded for last night’s episode, I see some great stuff like analytical and perceptive, but there’s also weak, aggressive and abrasive.

Having negative traits doesn’t mean you can’t be a winner. However, the probabilities are calculated based on 25 prior seasons of Survivor, so I can only measure how closely Chrissy’s representation resembles past winners’ behaviors in the game. At the present time, it appears that the existing data is not supporting her.

Let’s dive a little deeper.

Deeper Analysis

To try something different this week, I took each of the cast members from this episode and listed the behaviors for that member that are outliers in comparison to the average across all cast members.

Results are below:

Ben Driebergen: Diplomatic, Perceptive, Charming, Leader, Moral, and Emotional

Ashley Nolan: Diplomatic and not a Leader

Ali Elliott: Diplomatic, Perceptive, Empathy, and a Leader

Jessica Johnston: Not Diplomatic but Flirtatious

JP Hilsabeck: Not Analytical or Perceptive, but Athletic

Devon Pinto: Charming and Athletic (just ask Josh Wigler)

Desi Williams: Emotional

Cole Medders: Not Diplomatic or Perceptive, but Flirtatious and Naive

Chrissy Hofbeck: Analytical, Perceptive, Leader, and Weak

Ryan Ulrich: Analytical, Diplomatic, Perceptive, Charming, Funny, Industrious, and Deceptive.

Roark Luskin: No outliers.

Mike Zahalsky: Not Perceptive or a Leader, but Industrious and Weak.

Lauren Rimmer: Empathic but has a Temper.

Joe Mena: Perceptive, Industrious, Deceptive, Abrasive, Aggressive, Temper, and not Charming.

Going back to our question about Chrissy, why is she so low?

It is because her “weak” coding is holding her back. She currently has the highest rating for weakness, more than 2 standard deviations away from the mean.

As an experiment, I manually adjusted her weakness value down to zero and re-ran my rankings, and she popped up into the top 5.
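This kind of what-if experiment can be sketched as follows. The scoring function here is a made-up weighted sum standing in for the trained classifier's probability, and the weights, player names, and values are all hypothetical.

```python
# Hypothetical weights standing in for a trained model's scoring.
WEIGHTS = {"analytical": 1.0, "perceptive": 1.0, "weak": -2.0}

def toy_score(behaviors):
    """Stand-in score: a weighted sum of behavior values (not the real model)."""
    return sum(WEIGHTS[name] * value for name, value in behaviors.items())

def rank(players, score=toy_score):
    """Order player names from highest to lowest score."""
    return sorted(players, key=lambda name: score(players[name]), reverse=True)

def what_if(players, name, behavior, new_value, score=toy_score):
    """Re-rank after overriding one behavior value for one player."""
    adjusted = {n: dict(b) for n, b in players.items()}  # copy, leave input intact
    adjusted[name][behavior] = new_value
    return rank(adjusted, score)

players = {
    "Player A": {"analytical": 0.2, "perceptive": 0.2, "weak": 0.0},
    "Player B": {"analytical": 0.1, "perceptive": 0.1, "weak": 0.05},
    "Player C": {"analytical": 0.3, "perceptive": 0.3, "weak": 0.3},
}

before = rank(players)
after = what_if(players, "Player C", "weak", 0.0)
print(before, after)
```

Comparing `before` and `after` shows how much a single behavior is moving one player in the rankings.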

This makes a lot of sense in terms of her probabilities. It’s pre-merge right now and often at this stage a weak challenge performer gets eliminated. I hope that doesn’t happen because I’d love to see a Chrissy win.

Anyway, that’s all for now. Hit me up on Twitter if you have questions.

Episode 6:

  1. Ryan Ulrich
  2. Ben Driebergen
  3. Joe Mena
  4. Devon Pinto
  5. Chrissy Hofbeck

Quick Thoughts:

With Ali out of the mix, Chrissy pops up into the top 5 after this week’s episode. Joe moved up into third. I’d be pretty surprised if either Joe or Ryan ended up winning. They just seem like too big a target to not go out before the final three, but the data is the data.

Deeper Analysis

Since we had some mix up and it’s been a few episodes, I thought it would be interesting to look again at the top 5 and which winners they are most similar to based on their in-game behaviors thus far.

Results are below:

  1. Ryan Ulrich most similar to John Cochran
  2. Ben Driebergen most similar to Ethan Zohn
  3. Joe Mena most similar to Tony Vlachos
  4. Devon Pinto most similar to J.T. Thomas
  5. Chrissy Hofbeck most similar to Rob Mariano

When I originally did this, Ben was closest to Denise Stapley, but has now switched to Ethan. However, if I compare all winners to each other, Denise and Ethan come out as the most similar to each other, so behaviorally, not much of a switch.

Interestingly, Chrissy comes out as most similar to Boston Rob. Looking at their raw behavior vectors, you can see a lot of similarities. They are strong analytically, and are perceptive yet deceptive leaders. However, Boston Rob is more industrious and stronger athletically.

For next season, I’d really like to incorporate more data beyond the behaviors. For example, being on the right or wrong side of a vote, possessing an idol, and challenge wins could all be interesting features to improve the model’s predictive power. But for now, I want to keep things consistent.

This is all for this week. Next week we’ll dive back into behavior outliers. There should be some good stuff after the merge.

As always, hit me up on Twitter if you have questions.

Episode 7:

  1. Ben Driebergen
  2. Joe Mena
  3. Ryan Ulrich
  4. Devon Pinto
  5. Chrissy Hofbeck
  6. Lauren Rimmer
  7. Desi Williams
  8. Ashley Nolan
  9. Mike Zahalsky
  10. JP Hilsabeck
  11. Cole Medders

Quick Thoughts:

I decided to put in the full list this week. The top 5 is basically the same as the prior week but Ryan moved to third and Joe moved into second. I think Ben, Joe and Ryan each might be too big a target to actually end up making it to the end. Devon seems like he could be in a decent spot to win.

I’m a little surprised to see Dr. Mike down in 9th. Might change things if the model took into account idol possession. Have to do that for season 36.

Deeper Analysis:

This week I wanted to revisit our behavior outliers like I did a few weeks ago. Below are some of the behaviors and who is exhibiting that behavior more than anyone and less than anyone.

Analytical

Most: Ryan Ulrich
Least: JP Hilsabeck

Diplomatic

Most: Ryan Ulrich (was Ben)
Least: JP Hilsabeck

Perceptive

Most: Chrissy Hofbeck (was Ben)
Least: JP Hilsabeck

Charm

Most: Devon Pinto
Least: Joe Mena

Funny

Most: Ryan Ulrich
Least: Cole, Desi, and Ashley

Leadership

Most: Chrissy Hofbeck (was Ben)
Least: Ashley Nolan

Deceptive

Most: Ryan Ulrich (was Joe)
Least: Ashley and JP

Naive

Most: Cole Medders (not even close to anyone else)
Least: Desi

Minion

Most: JP Hilsabeck
Least: Lots of people
 

The last time I did this was episode 4 and Ben has moved out of the top spot for quite a few behaviors. Poor JP is still struggling across the board and Cole is a huge outlier in the naive category.

That’s all for this week. Looking forward to episode 8.

Episode 8:

  1. Ben Driebergen
  2. Ryan Ulrich
  3. Joe Mena
  4. Chrissy Hofbeck
  5. Devon Pinto
  6. Lauren Rimmer
  7. Ashley Nolan
  8. Mike Zahalsky
  9. JP Hilsabeck
  10. Cole Medders

Analysis and Thoughts:

Not too many changes in comparison to last week. Chrissy moved up into 4th ahead of Devon. At this point, the model has pretty much stabilized. We have our top 5 front runners, followed by Lauren, Ashley and Mike, each with an outside chance, and then JP and Cole with no shot of winning.

In tests prior to this season, the model’s predicted winner had, on average, stabilized by episode 8. We now have two weeks in a row with Ben holding onto the top spot. For the sake of the model, I’d love to see a Ben win :-). Subjectively, I feel like Ben and Devon have the best chance.

There’s a pretty steep drop-off between the probability the model returns for the top 5 being a winner and the probability for even 6th place. The top five all have values above 0.99 while Lauren drops to 0.15. With regard to Lauren, this seems to make sense. I don’t think we’ve ever had a winner like Lauren, so if she does end up winning, she’s definitely not the odds-on favorite.

One other interesting thing is that the past winner Ryan is most similar to has shifted from John Cochran to Tyson Apostol. Also, it’s kind of funny, but Cole comes out as most similar to Amber Brkich or Bob Crowley every week :-). (Note though, just because he’s most similar to them out of all the winners we have data for doesn’t mean he’s all that similar.)

For this week, I took each of the cast members left and listed the behaviors for that member that are outliers in comparison to the average across all cast members.

Results are below:

Ben Driebergen: Diplomatic (the most), Perceptive, Charming, Leader, and Moral

Ashley Nolan: Not Industrious

JP Hilsabeck: Not Analytical, Diplomatic, Perceptive, or Industrious but he is a Hard Worker and Athletic

Devon Pinto: Charming and Not Industrious

Cole Medders: Flirtatious, Abrasive and Naive

Chrissy Hofbeck: Analytical, Perceptive, and a Leader, but Weak and Not Athletic

Ryan Ulrich: Analytical (the most), Diplomatic, Funny, and Industrious, but Weak and Not Athletic

Mike Zahalsky: Empathetic and Weak

Lauren Rimmer: Empathic and Moral

Joe Mena: Industrious, Deceptive, Abrasive, Aggressive, and has a Temper

These are pretty interesting. As this is the first season I’ve really done this for live, I don’t have context for how this has looked at this level for prior seasons. It could be that Devon’s lack of outlying strategy-related behaviors means he really has no shot to win, or it could be that being middle of the road across behaviors is actually really good. I am looking forward to monitoring these results for future seasons.

Hit me up on Twitter if you have questions.

Episode 9:

  1. Ben Driebergen
  2. Ryan Ulrich
  3. Joe Mena
  4. Devon Pinto
  5. Chrissy Hofbeck
  6. Lauren Rimmer
  7. Ashley Nolan
  8. JP Hilsabeck
  9. Mike Zahalsky

Quick Thoughts:

A little movement this week, but not much. Devon took over 4th spot ahead of Chrissy and poor Dr. Mike dropped into last spot at 9th even behind JP. Cole has been a dead man walking in terms of the rankings for a while, so not too surprised to lose him this week.

Ben’s aggressiveness with his alliance members this week makes me a little nervous for a Ben win. The top 5 have been really close in terms of their probabilities for most of the season. I think you could make an argument for any of them. Devon would be my second subjective pick right now for being the potential winner.

That’s all I got for now. Short update this week.

Episode 10/11:

  1. Ben Driebergen
  2. Devon Pinto
  3. Chrissy Hofbeck
  4. Lauren Rimmer
  5. Ryan Ulrich
  6. Ashley Nolan
  7. Mike Zahalsky

Quick Thoughts:

Lots of interesting movement this week. Joe was the first of our top 5 to be eliminated since Ali. Devon jumped up into second and Lauren leapfrogged Ryan to enter the top 5 for the first time.

Prior to this week, the top 5 were extremely close in terms of their probabilities to win, but with the double boot this week, things are starting to separate. Ben and Devon are in the lead by a significant margin. Chrissy still has a pretty high probability, and then things drop off with Lauren and Ryan. According to our model, Ashley and Dr. Mike have no shot.

Episode 12:

  1. Ben Driebergen
  2. Devon Pinto
  3. Ashley Nolan
  4. Ryan Ulrich
  5. Chrissy Hofbeck
  6. Mike Zahalsky

Quick Thoughts:

Sorry for the late update. Took a while to get the data this week.

Some more interesting movements this week. Ben is still firmly in first overall with Devon riding hard in second, but Ashley has jumped up into third place.

Personally, it’s hard for me to see a path to the end for Ben. He just seems like such a huge target, but perhaps he’ll pull a Mike Holloway, or maybe he can convince everyone that Devon and Ashley are huge targets, or maybe people think they can beat him because he’s burned some bridges.

For the sake of science, I hope this is true :-). We should find out tonight.

Episode 13:

  1. Ben Driebergen
  2. Devon Pinto
  3. Ryan Ulrich
  4. Chrissy Hofbeck
  5. Mike Zahalsky

Quick Thoughts:

Not really any movement this week other than Ashley being removed from the board. This is to be expected at this point after so many episodes. I would expect the model to be somewhat stable.

From the 12 rankings that I have published this season, Ben has held the top spot 8 times and has never been out of the top two. Devon, who seems like our second most likely winner, has been in the top 5 every episode but the first. He hovered in the 4th and 5th spot until the double episode of 10/11 and has been ranked second since.
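These counts can be double-checked against the top 5 lists published above. The sketch below just transcribes those lists by first name; the variable names are my own.

```python
# Top 5 from each published ranking in this post, in episode order.
top5_by_episode = {
    "1": ["Ryan", "Ben", "Roark", "Cole", "Joe"],
    "2": ["Ben", "Ryan", "Ali", "Joe", "Devon"],
    "3": ["Ben", "Ali", "Ryan", "Devon", "Joe"],
    "4": ["Ryan", "Ben", "Joe", "Ali", "Devon"],
    "5": ["Ryan", "Ben", "Ali", "Devon", "Joe"],
    "6": ["Ryan", "Ben", "Joe", "Devon", "Chrissy"],
    "7": ["Ben", "Joe", "Ryan", "Devon", "Chrissy"],
    "8": ["Ben", "Ryan", "Joe", "Chrissy", "Devon"],
    "9": ["Ben", "Ryan", "Joe", "Devon", "Chrissy"],
    "10/11": ["Ben", "Devon", "Chrissy", "Lauren", "Ryan"],
    "12": ["Ben", "Devon", "Ashley", "Ryan", "Chrissy"],
    "13": ["Ben", "Devon", "Ryan", "Chrissy", "Mike"],
}

# How often Ben was ranked first, and his worst placement (1-based).
ben_first = sum(1 for top5 in top5_by_episode.values() if top5[0] == "Ben")
ben_worst = max(top5.index("Ben") + 1 for top5 in top5_by_episode.values())

# Episodes where Devon missed the top 5, and his placement when he made it.
devon_missing = [ep for ep, top5 in top5_by_episode.items() if "Devon" not in top5]
devon_places = {
    ep: top5.index("Devon") + 1
    for ep, top5 in top5_by_episode.items()
    if "Devon" in top5
}

print(len(top5_by_episode), ben_first, ben_worst, devon_missing, devon_places)
```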

I’d be happy with either winning and I think it would be a great first season outcome for the model.

Similarity Comparisons:

Given that it’s the last episode prior to the final, I wanted to revisit the winner similarity comparison. Below I have our top 5 and which winner they are most like based on their coded behaviors.

  1. Ben Driebergen most similar to Rob Mariano
  2. Devon Pinto most similar to J.T. Thomas
  3. Ryan Ulrich most similar to Todd Herzog
  4. Chrissy Hofbeck most similar to Rob Mariano
  5. Mike Zahalsky most similar to Sandra Diaz

These look pretty good, but I think a lot of people would expect to see a major similarity between Ben and Mike Holloway. I dug a little deeper, looking at the collection of winners Ben is most like. The top 5 are shown below in order of similarity (most similar to least).

  1. Rob Mariano
  2. Tyson Apostol
  3. J.T. Thomas
  4. Earl Cole
  5. Natalie Anderson

Mike Holloway is currently the 10th most similar based on our codings.

There could be a few reasons this goes against our expectation. One is that maybe they aren’t as similar as we think, at least based on their codings. Alternatively, we only have one coding for the Dirty 30 season, so it could be that having only a single point of data for that season biases the sampling. Finally, it could be that at this point in the season, all potential winners start to look pretty similar. The difference between the most similar winner (Boston Rob) and the 10th position is not that big.

That’s all I got for this week. Can’t wait for the final.

About the author

Sean Falconer

14 Comments

  • Love the past winner comparison. Ben is so good at talking to people and counterbalancing all of their arguments ("No, Ashley, we aren't keeping JP just because he's pretty"). That is VERY Denise Stapley.

    Curious to see what players overall some of these players are most like, not just winners.

    And I must be super biased toward Chrissy. I am coding the hell out of her yet she never makes your list, Sean.

  • Sean we love you for doing this. You are industrious, perceptive, a leader, analytical, charming, and awesome

  • Really cool stuff Sean and Co.!
    But while I do find this super fascinating, it feels like a misnomer to say that you're trying to "predict the winner." Obviously the winner is *sometimes* the player who plays the "best"/most like other winners, but obviously a lot of the time it's not. It makes sense that the 4 previous winners that you mention predicting early are all "Heisenberg"-style dominant players that most fans identify as having played an unusually strong/visible game. Even looking at your rankings for this season, it appears that the model is essentially tracking the casual audience's opinions about who's playing the best game, which is interesting to see quantified but, as we've seen with Angie's archetypes, probably not much better than the casual audience at the prediction game.
    If you were actually going to predict the winner, it would make more sense to do the same sort of computational analysis but using Edgic-esque story elements rather than micro-behaviors, e.g. (just off the top of my head) —
    – Initiates an alliance
    – Self-identifies as wanting to vote out the person who gets voted out in that episode
    – Explains his/her overarching strategy for the game
    – Cries in a confessional
    – Exhibits hubris (obviously include some that are anti-winner like you've done already)
    – (yadda yadda yadda, ad nauseam; obv there are more than enough others)
    Y'all have, of course, gone too far down tracking micro-behaviors to make that change now, but anyone who's analyzed the show episode-to-episode would have to expect that approach to have a much better chance of "predicting the winner", if that's the actual goal.
    Anyway, like I said, really cool stuff regardless, and it'll be fun to see how it continues to shake out!

  • Hey Damien,

    Thanks for stopping by and sharing your thoughts.

    I agree that as it stands right now, this initial model might not be the best way to predict winners, but I wanted to show off one potential application of the data that's easy for people to follow along with.

    With the machine learning approach that I am using, we aren't restricted to purely behaviors. I plan to eventually incorporate other information into someone's representation, like age, sex, location, voting the right way, idol finds, etc.

    Predicting is not the only goal with this. I think it will eventually be more interesting to look at things like clustering players based on these representations and to have a more data-driven approach to classifying groups of player game play. Angie's done this subjectively for character types, but I'd like to do this quantitatively for game play style.

    There's lots of other interesting applications as well.

    Cheers,

    Sean
