As a side project, I decided to predict the winner of BlizzCon 2016 in Heroes of the Storm (HotS). The tournament has already ended, but I still wanted to complete my ratings. If you just want to see them and skip everything else, click here. Also, please don't use these ratings for betting or any similar purpose. If I intended to use this project to make money, I would have put a lot more effort in, and I would definitely not post it online. I also don't like betting, so that gets in the way too.
The Method: Glicko (Similar to Elo)
While many people have likely heard of the Elo rating system, most famous for its use in chess, the Glicko rating system is less widely known. Feel free to click on the links for more information, but the gist is that Mark Glickman invented it as an improvement to the Elo rating system by introducing RD (rating deviation), a measure of how uncertain a rating is. Essentially, the longer a player or team goes without playing, the larger their RD becomes and the less their rating is trusted. I'll use the R implementation of it from the PlayerRatings package.
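To make the effect of inactivity concrete, here is a minimal sketch of the RD update from Glickman's description of the system, RD_new = min(sqrt(RD^2 + c^2 * t), RD_max), where t is the number of rating periods without a game. The function name rd.after.idle and the values c = 15 and RD_max = 350 are assumptions chosen for illustration, not settings taken from this analysis.

```r
# Growth of the rating deviation (RD) while a team is inactive,
# following Glickman's formula: RD_new = min(sqrt(RD^2 + c^2 * t), RD_max).
# c.val and rd.max are assumed, illustrative values.
rd.after.idle <- function(rd, periods.idle, c.val = 15, rd.max = 350) {
  pmin(sqrt(rd^2 + c.val^2 * periods.idle), rd.max)
}

# RD of a hypothetical team starting at 150 after 0 to 5 idle periods
idle <- 0:5
data.frame(PeriodsIdle = idle, RD = round(rd.after.idle(150, idle), 1))
```

The cap at RD_max keeps a long-absent team from becoming more uncertain than a brand new one.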
Collecting the data was probably the most difficult aspect of this project. Unlike League of Legends or Dota 2, the HotS competitive scene has far fewer fans and therefore far fewer people to update any wikis. Despite that, GosuGamers actually has a pretty good list of upcoming, live, and recent matches. The main downside to using GosuGamers is that you have to click through to each match to see its breakdown, which makes it harder to use an automatic web scraper. From what I understand, the matchticker API is also not particularly useful for historical games, since it only reports ongoing and recent matches. I ended up transcribing the data by hand from the Fall Global Championship and BlizzCon.
The PlayerRatings authors (Alec Stephenson and Jeff Sonas) roughly optimized the default parameter values for a specific chess data set, so I will tweak them somewhat for this analysis. The first parameter is gamma, which represents the advantage of player one (the first-listed team). Since there is no real 'home-field advantage' in an e-sport, I left it at the default value of 0. Next up is init, which defines the initial rating and the initial deviation of a team. In the future, I'll likely base the initial values for new teams on their region.
Thanks to the PlayerRatings package, the actual R code is very simple.
    # Required packages
    library(PlayerRatings)  # What's needed for everything
    # Loading in the data
    HotS.df <- read.csv("HotSCompetitiveData.csv", stringsAsFactors=FALSE)
    # Keeping only the four columns glicko() needs
    HotS.temp <- HotS.df[, c(2, 3, 4, 5)]
    # Creating a glicko object
    HotS.glicko <- glicko(HotS.temp)
    HotS.glicko
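The call above relies on the package defaults. As a minimal sketch of what the input and the two parameters look like when spelled out, the example below fits a tiny made-up data set. The team names, column names, and results are hypothetical, and init = c(2200, 300) is what I understand the package default to be, so treat it as an assumption and check the PlayerRatings documentation before relying on it.

```r
library(PlayerRatings)

# Hypothetical results in the four-column layout glicko() expects:
# time period, team one, team two, result (1 = team one won, 0 = team two won)
toy.df <- data.frame(
  Period = c(1, 1, 2, 2, 3),
  Team1  = c("Team A", "Team B", "Team A", "Team C", "Team B"),
  Team2  = c("Team B", "Team C", "Team C", "Team B", "Team A"),
  Result = c(1, 1, 1, 0, 0),
  stringsAsFactors = FALSE
)

# init = c(initial rating, initial deviation), assumed values;
# gamma = 0 gives no advantage to the team listed first
toy.glicko <- glicko(toy.df, init = c(2200, 300), gamma = 0)
toy.glicko$ratings
```

Passing the parameters explicitly makes it easier to see what changes when you experiment with different starting ratings.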
If you want to predict a specific game, you just need to use predict(). The predict function requires the fitted rating object (HotS.glicko in the code above) and a newdata data frame with the time period and the two teams who will be playing. Let's say you want to predict a game between MVP Black and Fnatic HotS. Then,
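    # Match to predict: time period, team one, team two
    newdata.df <- data.frame(2, 'MVP Black', 'Fnatic HotS', stringsAsFactors=FALSE)
    # Finding the chance of a team winning
    # (gamma=0 keeps the no-advantage assumption explicit in the prediction too)
    pvals <- predict(HotS.glicko, newdata.df, tng=1, gamma=0)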
pvals is the probability that the first-listed team (player one), MVP Black, wins.
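For intuition about where such a probability comes from, here is a minimal sketch of the expected-outcome formula from Glickman's description of the system, which shrinks the rating difference according to the combined uncertainty of both teams. Whether predict() in PlayerRatings computes exactly this is an assumption on my part; the function names g.factor and win.prob are made up for this example. The inputs are the ratings and deviations of Team Dignitas HotS and Zero Gaming HotS from the ratings table in this post.

```r
# Glickman's attenuation factor: the larger the deviation,
# the less weight the rating difference receives.
g.factor <- function(rd) {
  q <- log(10) / 400
  1 / sqrt(1 + 3 * q^2 * rd^2 / pi^2)
}

# Expected probability that team one beats team two, using the
# combined deviation of both teams (no first-team advantage).
win.prob <- function(r1, rd1, r2, rd2) {
  1 / (1 + 10^(-g.factor(sqrt(rd1^2 + rd2^2)) * (r1 - r2) / 400))
}

# Team Dignitas HotS (2389, RD 148) vs Zero Gaming HotS (2257, RD 158)
p <- win.prob(2389, 148, 2257, 158)
p
```

Swapping the two teams gives the complementary probability, and two teams with equal ratings come out at exactly 0.5 regardless of their deviations.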
|Rank|Player|Rating|Deviation|Games|Win|Draw|Loss|Lag|
|---|---|---|---|---|---|---|---|---|
|4|Team Dignitas HotS|2389|148|10|5|0|5|0|
|5|Zero Gaming HotS|2257|158|7|3|0|4|0|
|6|Please Buff Arthas|2160|120|14|7|0|7|0|
|8|eStar Gaming Blizzcon|2088|164|6|2|0|4|1|
|10|Astral Authority HotS|1937|171|5|1|0|4|0|
|11|Imperium Pro Team HotS|1907|188|4|0|0|4|1|
Creating this table was pretty easy using the htmlTable package from Max Gordon. The only code you need is below. You can also find plenty of documentation here. I do recommend viewing the table on a computer, or turning your phone sideways, if it is not showing up correctly on mobile.
    library(htmlTable)
    # Round the numeric columns to whole numbers, then build the HTML table
    HTML.output <- htmlTable(txtRound(HotS.glicko$ratings, 0))
My ratings differ somewhat from the GosuGamers rankings in that I have Fnatic rated higher than MVP Black, whereas GosuGamers still has the two Korean teams in first and second place. I also only include teams that actually took part in the Heroes Global Championship, which means I am missing a lot of data that GosuGamers has available. A good example of that is team Misfits HotS, which they rank at number four but which does not appear in my chart at all. The competitive scene should become a little more predictable now with Blizzard's new competitive setup.
In the future, I hope to automate data collection, and also collect information on which heroes are in each game and who is playing them. Mind you, Blizzard has access to all of this information, and the commentators even use it during their broadcasts. At the moment, I do not think the database would be so large that it would be unreasonable for Blizzard to release the data if they had enough of an incentive to do so. However, documenting data for public consumption is time-consuming, and there are simply not that many people who actually want to see the raw data for this kind of competition. If I am completely off base and Blizzard has open-sourced everything, please let me know in the comments or by e-mail.