Monday, December 13, 2010
What happened to Mankrik's wife?
With the overhaul of the entire 1-60 experience, I would highly recommend that everyone create a new character and just go out there and see the grandeur and splendour of a destroyed Azeroth. The leveling purgatory we all once felt has become nothing but a memory. Blizzard took on the challenge of making the old 1-60 grind into something that new players especially could enjoy and progress through at a relatively fast pace while still appreciating the lore and interest of questing. WotLK-style questing takes place in many old areas, such as the renovated Hillsbrad Foothills, in particular the quests following a particular death knight and his trip to the Gilnean island. Even boring old places such as Azshara have been revamped to become more player friendly.
Orgrimmar is arguably the most changed place in all of Azeroth now; the end of Thrall has spelt disaster for those who despise the skies, and the city has more than quadrupled in size.
Blizzard's changes to leveling are arguably among the best modifications they have made to the game, and something that will be highly appreciated in the future.
Say hello to 10 more alts, Azeroth, they are coming!
Thursday, December 9, 2010
Welcome to the newfound world, Azeroth style.
"Welcome to the new World of Warcraft" is a phrase that has been thrown around a lot with the arrival of Blizzard's 3rd expansion, Cataclysm. This expansion marks the end of an impoverished world of Azeroth and introduces one of splendour and grandeur. The game engine has undergone some notable changes, but the most significant changes were to the game world itself. Many places experienced "cataclysmic" events, such as the destruction of the Barrens into both an oasis and a crater, or the flooding of the Shimmering Flats in Thousand Needles.
And of course with a brand new expansion, comes this.
Monday, December 6, 2010
SC2 Laddering
Kudos to Excalibur_Z from the Teamliquid forums for making such a great post on the SC2 laddering system. Sourced from http://www.teamliquid.net/forum/viewmessage.php?topic_id=142211.
Introduction
This post is a followup to the original ladder analysis post, which shall go into further detail regarding the system. Please note that much of the content contained within this post is of a more speculative nature, and if a detail here is wrong it should not reflect poorly on the original analysis. I will be delving deeper into the mathematical underpinnings, though it should not be excessively complex and I will try to make it easy to follow.
Overview
To start with, we assumed that Blizzard used a system quite similar to their WoW Arena matchmaking system, albeit with refinements. The Arena system uses a Bayesian inference model to create its ladder and do its matchmaking. What this means in essence is that the rating used to represent your skill is easily updated after each match. For more details, see: http://en.wikipedia.org/wiki/Bayesian_analysis
In conjunction with this, the MMR is actually one part of the skill probability distribution. Blizzard also uses an “uncertainty” factor. That is, when you first start in Arena there is a lot of uncertainty in your rating. As you play more games, that uncertainty decreases and the system is more “confident” in the rating it has assigned to you. I will be referring to this uncertainty factor as sigma, and it is the inverse of the system's confidence. This forms a bell curve, also known as a Gaussian, or normal, distribution. For more details, see: http://en.wikipedia.org/wiki/Gaussian_distribution . The curve represents a couple related ideas: the range in which your skill may truly fall, as well as the fact that you do not play at exactly the same skill level every game. A more consistent player would have a narrower curve, for example.
This class of ladder and matchmaking is not new. The first system using a method similar to this is the Glicko system, used to rank chess players, and is arguably better than the famous Elo system, which encourages some strange behavior (e.g. it is better to draw in Elo than risk a loss in many cases). Another well-known system is Microsoft TrueSkill, used in every Xbox 360 game for matchmaking and ranking, as well as PC games such as Dawn of War 2.
The published data on TrueSkill gives a glimpse at the underpinnings of a modern Bayesian ranking system designed for videogames. Blizzard’s implementations are obviously different from TrueSkill, though we can infer much from what we know about TrueSkill, and what we know about the SC2 ladder.
For a layman’s primer on TrueSkill: http://research.microsoft.com/en-us/projects/trueskill/details.aspx
For an in-depth description of TrueSkill: http://research.microsoft.com/apps/pubs/default.aspx?id=67956
Matchmaking
The short version of what the links above show is that it is possible (and computationally efficient) to take the MMR and uncertainty factor (also known as sigma, or standard deviation) for both players. The MMR and sigma form a bell curve per player. It is possible to combine the bell curves into a 3D probability distribution. This is done by combining the data to form a shape like this:
It may help to think of it as combining the two 2D curves perpendicularly and forming this 3D shape. This shape is centered on a point in the (x,y) plane, where x represents player 1’s skill, and y represents the skill of player 2. Intuitively, the best matches will be between ratings where x=y. Thus, Blizzard attempts to keep it as close as possible. Looking at this same shape top-down (try to visualize it as a topographical map):
Run a line along x=y, and you will split the shape into 2 pieces. If you sum the volume under the shape on each side of this split, and compare their relative size you will get the probability of a player victory. If the curve is contained wholly within one side of the graph then clearly that player is overwhelmingly favored by the system (Note: this is NOT the same thing as the “Favored” display on the loading screen!). Also note that this does not need to be circular when looking at a top-down section. If players have different confidence values it will look like an ellipse.
Note that this figure is taken from a TrueSkill presentation, and is copyright Microsoft. TrueSkill incorporates the possibility of a draw. More intuitively, it can be thought of as the “matchmaking sweet spot”, and something similar is likely used by SC2’s ladder to provide the system some wiggle room in matchmaking.
After a match finishes, the system needs to update the MMR and sigma for both players. Displayed rating will be discussed later in this post. Whenever a match finishes the winner’s MMR increases and the loser’s decreases. More interesting is what happens to the sigmas. If the match finished as expected with the MMR favored player winning (and remember, the loading screen “favored” display is NOT this) then both players' sigmas will decrease. That is, the system gains confidence in the ratings it has assigned to the players. If the match finishes in an upset and both players' sigmas are small, then the sigmas for both players will increase as the system thinks it may have an incorrect rating assigned to both. The change in sigma scales based upon the difference in MMR and the difference in sigmas. That is, losing to someone close to your own rank will not change your sigma too much (though it will over the course of several games).
If a lower-MMR player wins, then what happens depends a lot more on the precise equations they are using. If a player's sigma is large in an upset (whether he's the winner or loser) it can decrease. That is because, given the right MMR and sigma values, it's possible in theory for the system to learn about that player's skill and rate him more accurately. If a player's sigma is small, however, it can become larger after an upset if that upset was truly unexpected.
To summarize: combining the MMR and uncertainty factor of a player creates a curve. Take two of these curves and form a 3D shape. This shape shows the probability of victory when split along x=y. Matchmaking tries to have x=y, but will expand the search if no match is found quickly.
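The split along x=y described above has a simple closed form if both curves are treated as independent normal distributions: the volume on one side of the line is the chance that one player's skill exceeds the other's. The sketch below shows that calculation in Python. The function name `win_probability` is made up for illustration, and this is the generic Gaussian math, not Blizzard's (unpublished) formula.

```python
import math


def win_probability(mmr_a: float, sigma_a: float,
                    mmr_b: float, sigma_b: float) -> float:
    """Probability that player A's skill exceeds player B's.

    Assumption: each skill is an independent normal curve N(mmr, sigma^2).
    """
    # The difference of two independent normals is itself normal.
    mean_diff = mmr_a - mmr_b
    std_diff = math.sqrt(sigma_a ** 2 + sigma_b ** 2)
    # Normal CDF evaluated at mean_diff / std_diff, written with erf.
    return 0.5 * (1.0 + math.erf(mean_diff / (std_diff * math.sqrt(2.0))))


if __name__ == "__main__":
    # Values borrowed from the worked example later in the post.
    print(round(win_probability(2600, 100, 2500, 50), 3))
```

Equal MMRs give exactly 0.5 regardless of the sigmas, which matches the intuition that the best matches sit on the x=y line.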
Promotion
As initially theorized, promotion requires your MMR to be above a certain league threshold. However, because MMR changes greatly after each match and the opponent variation is so wide, often spanning multiple leagues, the system requires a particular degree of confidence before it allows promotion. Our initial theory assumed that sigma just needed to be small enough to allow promotion, but it's been confirmed that sigma never gets this small. Instead, it does this by a moving average. Here's an example:
MMR is erratic. A moving average seeks to smooth out the rapidly changing data points over time by evaluating your progress over X number of games. As we previously estimated, the system doesn't use your full match history because if it did, you would eventually get stuck in a league. Once your moving average crosses a particular league threshold, that's when you'll get promoted.
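To make the lag concrete, here is a minimal sketch of a windowed moving average over recent MMR values. Everything specific in it is an assumption for illustration: the class name `PromotionTracker`, the window of 5 games (the real "X" is unknown), the threshold of 2300, and the rule that the window must be full before promotion is considered.

```python
from collections import deque


class PromotionTracker:
    """Tracks a moving average of MMR over the last `window` games."""

    def __init__(self, window: int, threshold: float):
        # deque(maxlen=...) silently drops the oldest value when full.
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def record(self, mmr: float) -> None:
        """Store the MMR observed after one game."""
        self.history.append(mmr)

    def moving_average(self) -> float:
        return sum(self.history) / len(self.history)

    def eligible(self) -> bool:
        # Hypothetical rule: full window and average at or above threshold.
        full = len(self.history) == self.history.maxlen
        return full and self.moving_average() >= self.threshold


if __name__ == "__main__":
    tracker = PromotionTracker(window=5, threshold=2300)
    for mmr in (2000, 2100, 2250, 2400, 2500):
        tracker.record(mmr)
    # Latest MMR is well above 2300, but the average still trails it.
    print(tracker.moving_average(), tracker.eligible())
```

With these numbers the latest MMR is 2500 while the average is only 2250, so the player is not yet promoted; one more game at a high MMR pushes the oldest value out of the window and lets the average catch up.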
Players like CauthonLuck and Ret who had obscene win ratios had their MMR data points skyrocket. However, the moving average lags behind. In the cases of those players, it will take much longer for the moving average to reach that required threshold. This is why players like IdrA who were affected by this problem have decided to intentionally throw games in order to get promoted, because it allows the moving average to catch up more quickly.
Possibly related are players who aren't getting promoted or demoted properly despite a high likelihood that their moving average would have crossed the confidence threshold. Blizzard has said that this is indeed a bug and will be fixed by moving the affected players to new divisions.
Displayed Rating
Ok, how does all of this tie into displayed rating and the whole “favored” deal? If you remember back to WoW, ratings changed based on a direct comparison of your displayed rating to the other team’s MMR. So if your current rating was 500 and you were playing people with MMRs of 2000, your rating would jump significantly after every win because of the wide disparity. Now, we’ve identified that on the loading screen quite often players are seeing the other person as favored and the opponent (who is nominally “favored”) also sees his opponent as favored! How can this be? The theory put forth here is the system is again comparing your displayed rating to your opponent’s hidden MMR.
The reason for this is so that the system brings you toward your MMR more quickly. kzn explains:
On August 08 2010 14:30 kzn wrote:
How it works was like this: Say you've got a MMR of 2500, and you start a new team. It starts at 0 rating, but the matchmaking system will match you with other players of MMR 2500. If you lose a game, your team rating would not change at all. If you won, it would increase by 47 (a hard cap that was in place at least when I played). This was not explained as arising due to an interaction between the team rating and the opponent's MMR, however - it was explained as the system trying to get your team's rating as close as possible to your team's MMR rapidly.
Therefore, a corollary here is that when determining rating increase, the hidden threshold value for your league is added to your displayed rating, then compared to your opponent’s MMR, for purposes of computing the gain/loss to your displayed rating.
Example: ExcaliburZ and I play a game. His MMR: 2600, sigma: 100, displayed rating: 300. My MMR: 2500, sigma: 50, rating: 150. Diamond’s MMR threshold: 2300. Excal wins because he rules. What happens?
- His MMR will increase
- My MMR will decrease
- Both of our sigmas will decrease
- His rating will increase. How? By comparing my MMR (2500) against his rating + Diamond’s MMR threshold: 300 + 2300 = 2600. His gain is thus based on 2600 vs my MMR of 2500.
- My rating will decrease. In the same way: his MMR: 2600. My rating + threshold: 150 + 2300 = 2450. Thus I lose points proportionally to 2450 vs 2600.
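The arithmetic in the example above can be written down as a short sketch. It only reproduces the comparison the theory proposes (displayed rating plus league threshold against the opponent's MMR); how that gap turns into actual points is not known, so no point formula is attempted. The function names are made up for illustration.

```python
def adjusted_rating(displayed_rating: int, league_threshold: int) -> int:
    """Displayed rating shifted onto the MMR scale, per the theory above."""
    return displayed_rating + league_threshold


def comparison_gap(displayed_rating: int, league_threshold: int,
                   opponent_mmr: int) -> int:
    """Opponent MMR minus own adjusted rating.

    A positive gap means the opponent looks stronger than your
    displayed rating suggests, i.e. the opponent would appear favored.
    """
    return opponent_mmr - adjusted_rating(displayed_rating, league_threshold)


if __name__ == "__main__":
    diamond = 2300  # threshold used in the example above
    print(comparison_gap(300, diamond, 2500))  # Excal's view of the match
    print(comparison_gap(150, diamond, 2600))  # the other player's view
    # Two players at MMR 2500 whose displayed ratings both lag behind:
    print(comparison_gap(100, diamond, 2500), comparison_gap(100, diamond, 2500))
```

The last line shows how both players can see each other as favored: if both displayed ratings trail the hidden MMRs, each player's gap is positive.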
Conclusions
SC2 uses a Bayesian inference system for its skill determination which forms an MMR and a confidence value for each player. These form a Gaussian distribution useful in determining win probability. Promotions/demotions occur when a player exceeds/drops below a threshold with sufficient confidence. Displayed rating changes according to a combination of the rating itself combined with the hidden MMR and league thresholds.
More clarifications from Vanick:
On August 08 2010 11:33 vanick wrote:
To be clear, the player's skill is never pinpointed. The sigma is never 0. All players vary in their performance from game to game and over time as their skill increases (or decreases!).
I left a point out in my writeup that I probably should have included. TrueSkill, and likely SC2's ladder, have a factor based off the time since your last game that increases the player's uncertainty level (sigma) by an amount related to that. Even if you're playing games back to back this factor will have a minimum value that will still increase sigma. This allows the system to adapt to a player whose skill increases over time.
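As a rough illustration of that time-based factor, the sketch below widens sigma by adding variances, with a floor so that even back-to-back games add a little uncertainty. The function name `inflate_sigma` and the parameters `tau_min` and `tau_per_day` are hypothetical, and the numbers are arbitrary; neither TrueSkill's nor Blizzard's real constants are implied.

```python
import math


def inflate_sigma(sigma: float, days_idle: float,
                  tau_min: float = 5.0, tau_per_day: float = 1.0) -> float:
    """Return sigma after adding time-based uncertainty.

    Assumptions (illustrative only): the added uncertainty grows
    linearly with idle time and never drops below tau_min.
    """
    tau = max(tau_min, tau_per_day * days_idle)
    # Independent uncertainties combine by adding variances.
    return math.sqrt(sigma ** 2 + tau ** 2)


if __name__ == "__main__":
    print(inflate_sigma(50.0, 0.0))   # back-to-back games: small increase
    print(inflate_sigma(50.0, 30.0))  # a month away: larger increase
```

Because the floor is always applied, sigma can never settle at 0, which is consistent with the point that a player's skill is never pinpointed.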
Questions
Some of these have answers. Some are open questions. You can add on; I will answer them as best I can.
Q: So what’s the deal with people stuck in Platinum who can’t get promoted to Diamond despite clearly belonging there?
A: Short answer? It’s a bug. Longer answer: a lot of people have suggested that the system requires you to lose in order to build its confidence factor. This is almost certainly incorrect. The system in theory learns enough about you from your wins to promote you. Intuitively, if your record is 60-5 against Diamond players, you ought to be in Diamond. The TrueSkill system can determine this, and I would bet dollars to donuts that Blizzard’s system can too, as designed anyways. Implementation may have introduced bugs that certain players hit under certain conditions. We don’t have enough evidence to flat out state that the system requires you to lose. It may be a workaround to the bug, however.
One possible explanation is that the moving average lags so far behind that more games are required in order to cross the promotion threshold. It's also possible that the bug prevents the moving average from changing.
Q: So how do bonus points affect the display rating changes? If the displayed rating change is based upon the comparison of the opponent's MMR with the player's displayed rating + the player's league cutoff, then wouldn't bonus points inflate the displayed rating and cause problems?
A: I'm not sure how they account for this. One possibility is they keep track of bonus points that make up your displayed rating, and ignore them when performing the calculation in the back-end.
Excal: It seems more likely that the bonus pool is only used to increase the displayed rating for division ranking purposes and ignored in back-end calculation because the bonus pool increases at the same rate for all players. This introduces a constant that is easily discarded when assessing actual skill within the system. Furthermore, if bonus points were considered in the process of point calculation, it would present an unfair advantage for players who have not yet used up their bonus pool (because their rating is therefore inflated giving them more to lose).
Q: Would it take longer to get promoted if you've played lots of games? Assume someone has played a large number of games (say 100 with a 50% win/loss ratio). If he were to start winning 70% of his games, would it be harder for him to get promoted than someone with similar percentages but fewer games played?
A: It would take longer, yes. The moving average trails behind sharp increases in skill.
Friday, December 3, 2010
2010 Update GSL and whatnot: What will we do without our Boxer?
Well, the last few rounds of the GSL have been quite interesting. A lot of big names have been eliminated from the competition, and after seeing the death of some of the big FOus and SlayersBoxer, it is hard to tell who exactly is going to take the Season 3 GSL cup.
One big surprise was seeing a westerner make the top 8, with the arrival of LiquidJinro on the Korean scene. Will he make the metagame more interesting? Only time will tell. We have seen what SlayersBoxer and FruitDealer have done to it; will Jinro's playstyle be just as influential?
Of course, FruitDealer is known for winning GSL 1, taking out HopeTorture (Intotherainbow).
He qualified for GSL 2 and defeated his first opponent in the round of 64. FruitDealer was then invited to do a showmatch against SlayersBoxer, where he defeated The Emperor. After he returned to Korea to play in the round of 32 of the GSL, his opponent, also with the ID Boxer but more commonly known as "Fake Boxer", defeated the reigning champion 2-0.
Hmmm, maybe it's time to create a blog where each day I reveal to all my avid readers (all three of you) a pro from the Korean scene whom you may or may not be familiar with. Hmmm, maybe I'll do that.
Featured above: SlayersBoxer, Korean emperor
Featured below left: Team Liquid's Jinro
Featured below right: EG's Greg "IdrA" Fields.