Win/loss-dependent system

xylrik12

Member
Thank you for making another Dota 2 junkyard. I'm so disappointed, and it's my own fault for believing the fairy tale that your number of losses doesn't mean anything. That's true from one point of view, because you can't drop below your hidden MMR. But it's an indisputable fact that the calibration system is based on wins and losses. Nobody cares about your impact; no real person sits and watches your game. During calibration I finally got Phantom 4, but thanks to leavers, AFKs, and other unskilled players I dropped to Archon 4. I thought, okay, I can climb back ))) No. The system let me win just 2 of 7 games. Then I got another losing streak and was doomed to play with low-skill players. Your team makes mistakes? You pay for everything. The enemy Patron is at low HP? Your team won't push )) The enemies are near your base? Someone tells you in chat "I spilled my coffee" and abandons the match without being kicked... Thank you for this satisfying experience. I hope all normal players move to FACEIT. There is no point in playing ranked in an alpha game developed by a small indie team named "Valve", which consists of 300 people working on 3 other projects at the same time.
 
I'm not a Deadlock developer, just an unrelated game dev and a nerd, but I have many things to say. Firstly, this isn't a bug report, and should probably be in "Changelog Feedback". There's no good reason to make this a public thread in my opinion.

Why W/L based matchmaking works:​

  1. Opponents on the opposing team are playing in the exact same matchmaking queue as you. If you're being held back by your teammates, so are they.
  2. If you truly are in a lower bracket than you should be, the system will figure that out over time. You will win a slightly higher than average amount of games and progress upward.
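As a rough illustration of point 2, here is a minimal sketch of how much an underrated player shifts their own win rate. It assumes (this is a simplification, not something Deadlock is known to do) that a team's strength is the plain average of its six players' skill and that win chance follows the standard Elo logistic curve. The function names are made up for the example.

```python
TEAM_SIZE = 6  # Deadlock teams have six players


def win_probability(skill_gap: float) -> float:
    """Standard Elo expected score for a side that is `skill_gap` points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-skill_gap / 400.0))


def underrated_win_rate(points_underrated: float) -> float:
    """Win chance for a team holding one player who is better than their rating.

    Assumption: team strength is the average of its players' skill, so one
    underrated player lifts the team average by 1/6 of their personal gap.
    """
    return win_probability(points_underrated / TEAM_SIZE)


for gap in (100, 200, 400):
    print(f"underrated by {gap}: win rate {underrated_win_rate(gap):.3f}")
```

Under these assumptions, a player who is 100 to 400 points better than their rating wins somewhere around 52% to 59% of their games: only slightly above average per game, but enough to climb over many games.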

Why metrics would be worse:​

  1. If you used total souls as a metric for success: That would encourage higher rank players to be greedy and prioritize hoarding as many souls as possible rather than helping out their team.
  2. If you think player kills, assists, deaths, or KDA as a whole should be used: That would encourage players to hunt down enemy players and neglect minions, jungle, mid boss, and other important aspects of the game.
  3. If you think objective damage would be a good metric for success: That would encourage players to move from lane to lane only when an objective is under attack.
  4. Healing obviously has no correlation to total success.

In conclusion:​

The only way to properly measure how skilled a player is at winning the game is to measure how often they win the game. The exact math and statistical evidence for why this system works in team games is well outside the scope of this one forum post, but I can run an Elo simulation and analysis with 6-player teams if you still think that the system is flawed.

Please don't take any of this as an insult or personal to you specifically. I've seen this same sentiment across a wide range of players, but I assure you the developers see things that you don't, and they've thought of things that you haven't. So please try to be patient with them.
 
"If you're being held back by your teammates, so are they."
This is only true as an average over time, and only if everyone on your team and everyone on their team is solo-queuing.

"If you truly are in a lower bracket than you should be, the system will figure that out over time. You will win a slightly higher than average amount of games and progress upward."
This is a vibe (it relies on multiple assumptions, like a functional ranking system and balanced teams per game), and personally I don't think the data supports that conclusion.
 
For your first response:
  1. No, because the game automatically puts you in a separate matchmaking queue if you're queuing with someone of a different skill range
  2. You can enable soloqueue-only matchmaking
And for your second point: No it's not a "Vibe".
  1. No I'm obviously not assuming balanced teams, the line you quoted directly says that I'm assuming you're in a lower skill bracket than you should be. In other words, there's an imbalance.
  2. My only actual assumption was that they're using the Elo rating system, which is pretty much the most basic MMR system there is, and has been used for a long time. More than likely, they're using something more finely tuned for their purposes, which would mean my conclusions underestimate its effectiveness.
  3. The only reason the "data" doesn't support that conclusion is because of survivorship bias. We're only seeing data from people who are complaining, so it's gonna be the kind of data that supports complaining.
I made no unreasonable assumptions in my post, and in fact I was quite quick to give the benefit of the doubt in a few cases. W/L based matchmaking has been shown to work in team games, this really doesn't even need to be discussed.
 
What are you talking about ))) The metric system you described above is the best and only way to define player skill. You work on developing your own skill and rank, not the rank of your team, the same way it works in real sports. You play with random noobs, and the matchmaking system is poor: I have low-skilled allies every second game. I played calibration with Phantoms and Ascendants and felt good at that rank (teamwork, communication, understanding of the game). But FOR NO REASON (THE REASON IS THAT VALVE NEVER MADE COMPETITIVE TEAM GAMES. DOTA WAS LIKE THIS, BUT ONLY UNTIL THE 2020s) I AM DOOMED TO PLAY IN THE EMISSARY POOL ))). I HAVE MAX IMPACT IN EVERY FUCKING GAME, I MOVE BETTER, PRESS BUTTONS BETTER, I HAVE A BETTER UNDERSTANDING OF MACRO, AND I DON'T FUCKING KNOW WHAT ELSE I SHOULD DO TO CLIMB IN THIS FUCKING GAME. I DOWNLOADED THE GAME JUST 3 DAYS AGO AND HAVE WON ONLY 4 OF 18 GAMES (and now I've even fallen from Archon 4 to Emissary 2). I AM ALWAYS THE MVP OF MY TEAM, WHICH CONSISTS ENTIRELY OF BOTS. So stop talking this braindead shit. Team play is a phenomenon of esports (imagine having to cooperate on the pro scene with 5 random teammates who, on top of that, are muted). Casual players and their ranks should be evaluated in other ways (at the very least, they could hire pro gamers to watch games and award ranks manually).
 
I'm going to be honest, most of this comment is unintelligible. But I'll try to address it to the best of my ability.
  1. The idea of using metrics like that as a form of ranking players is fundamentally a bad idea because it rewards players who attempt to increase their metrics to the detriment of the team or the overall game.
  2. A matchmaking system that doesn't count wins and losses is by definition a matchmaking system that rewards players for losing, and therefore it would be a matchmaking system that doesn't actually track skill. The only way to measure how good a player is at winning is to measure them winning.
  3. Players are sometimes rewarded for their individual accomplishments in real sports, but only because those individual accomplishments necessarily lead to better performance in the games. A basketball player wouldn't be rewarded for the amount of time he had the ball, for example, because that would encourage a lack of passing. While I'm sure some metrics could be used, the game itself does need to count wins and losses, or something very close to them.
  4. I think it's not only incredibly rude, but also incredibly hypocritical to call me braindead in response to a purely objective and logical post. If you can run the numbers and rigorously prove that your system would be somehow better, then by all means be my guest. But I don't see that happening because it simply isn't true.
 
1) You're talking about the metric system as if each metric would work in isolation, i.e. one player will only farm, another will only fight, and so on. But it should and would work as a whole: they have to build a metric system that counts and rewards the things players did according to their role in the game. For example: healing and assist rate for supports; objective damage, player damage, etc. for damage dealers. Measured together, these stats show the real impact each player had for their team.
2) How long has it been since you played? They removed ranked matchmaking in the strict sense. Now we have a casual mode where you lose 1 medal if you lose and gain 1 if you win. And the game never rewarded players for losing, that's total bullshit. In the previous calibration system you just couldn't drop below your hidden rank, even on a losing streak.
3) Man, you don't even know what you're talking about ) Do you wear pink glasses or what? I have all the evidence that it doesn't work like you said ) and I've also played enough Dota to realize it. I don't know what games you played before Deadlock, but have you really never heard of the hidden pool? I can give you all my demos so you can see how I play, what impact I have on my team, and who I play with )) I do everything: I win every lane, I take part in fights and try to push towers, I look at the map more often than at the game itself, I always support other players, always go for ganks, etc. But man, can you tell me how I'm supposed to win when almost 3 guys on my team are just passive noobs who either walk down a lane alone, AFK, up to the enemy's T3 and feed, or abandon the game for no reason when we can still defend? You can't even imagine what you're talking about. In such conditions there is no point in measuring a player's skill by their wins and losses. So you're talking really stupid shit that you brought here from Overwatch, Valorant, Fortnite, Rainbow Six, Battlefield, or whatever else Western people usually play.
Even CS2 doesn't have this problem, because the game gives you all the resources to carry solo. But a MOBA always requires every player to do their maximum. If 1-3 players fall short, you never win, because you're playing against six players who don't have that problem and whose team is fully focused. This is how the hidden pool works: it puts you in really deep trash just so that players from the normal pool, who don't have chat restrictions etc., can win with a 50% chance. Please stop justifying the problem. It really exists, and most players and developers know about it, but the reason they don't do anything is the fear of losing casual players, who make up most of the online population. If they put a random or skill-based player selection system in the game, so that all games finally became equal, it would be hard for unskilled players who just want to chill and watch themselves randomly win with minimal impact.
 
I ran a statistical analysis:
[Attached image: 1734040018418.png]
The results are exactly what I told you they were going to be, but this graph by itself doesn't mean much so let me break down how I did this:

How the simulation works​

Step 1: Starting conditions.​

First, we generate 1000 simulated players. Each player has a "True Skill" and a starting Elo score both picked from a normal distribution with a mean of 1550 and a standard deviation of 200. These numbers were picked based on chess Elo score data since it's readily available.
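A minimal Python sketch of this setup step might look like the following. The `Player` class, `make_players`, and the seed are names and choices made up for illustration; the sketch also assumes that true skill and starting Elo are drawn independently of each other, which the description above implies but does not spell out.

```python
import random
from dataclasses import dataclass

MEAN, STD_DEV, NUM_PLAYERS = 1550, 200, 1000  # values quoted above


@dataclass
class Player:
    true_skill: float      # hidden skill the rating system tries to recover
    elo: float             # visible rating, initially unrelated to true_skill
    games_played: int = 0


def make_players(n: int = NUM_PLAYERS, seed=None) -> list:
    """Draw true skill and starting Elo from the same normal distribution."""
    rng = random.Random(seed)
    # Assumption: the two values are sampled independently for each player.
    return [
        Player(true_skill=rng.gauss(MEAN, STD_DEV), elo=rng.gauss(MEAN, STD_DEV))
        for _ in range(n)
    ]


players = make_players(seed=0)
```

Starting every player with an Elo that has nothing to do with their skill is the pessimistic case: the rating system has to recover skill purely from match results.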

Step 2: Simulating matches.​

First, we pick a random player, and then remove 95% of the matchmaking queue to simulate players who are already busy in matches. We then pick the 11 remaining players closest in skill to the original random player. These will be the players of our simulated game. We assign six of them to the Sapphire Flame and six of them to the Amber Hand, and compute an average Elo score for each team from the players' Elo scores, as well as an average skill score. Using the average skill scores, we calculate the real probability of each team winning, and randomize the outcome of the match based on that probability. We then calculate the Elo points gained or lost from the expected probability implied by the Elo scores and the actual outcome. Each player gains or loses this amount of Elo points depending on which team they were on. The K value used for the Elo rating system here is 10+5000/(100+X), where X is the total number of games played by the individual players in the match.
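The match step can be sketched in Python roughly as below. This is not the spreadsheet itself, just an illustration under stated assumptions: players are plain dicts with hypothetical `skill`, `elo`, and `games` keys, team assignment within the lobby is random, and X in the K formula is taken to be the sum of games played by all twelve players (the description leaves that open).

```python
import random


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of side A against side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def k_factor(total_games: int) -> float:
    """K = 10 + 5000 / (100 + X), as quoted in the description."""
    return 10.0 + 5000.0 / (100.0 + total_games)


def team_average(team: list, key: str) -> float:
    return sum(p[key] for p in team) / len(team)


def pick_lobby(players: list, rng=random):
    """Pick one random player plus the 11 closest in skill among ~5% of the rest."""
    first = rng.choice(players)
    available = [p for p in players if p is not first and rng.random() < 0.05]
    available.sort(key=lambda p: abs(p["skill"] - first["skill"]))
    lobby = [first] + available[:11]
    rng.shuffle(lobby)  # assumption: teams are split at random
    return lobby[:6], lobby[6:]


def play_match(sapphire: list, amber: list, rng=random) -> bool:
    """Simulate one game and update Elo in place. Returns True if Sapphire wins."""
    # The real outcome is driven by hidden skill...
    p_real = expected_score(team_average(sapphire, "skill"), team_average(amber, "skill"))
    sapphire_won = rng.random() < p_real
    # ...while the rating update only sees Elo and the result.
    p_elo = expected_score(team_average(sapphire, "elo"), team_average(amber, "elo"))
    k = k_factor(sum(p["games"] for p in sapphire + amber))  # assumed meaning of X
    delta = k * ((1.0 if sapphire_won else 0.0) - p_elo)
    for p in sapphire:
        p["elo"] += delta
        p["games"] += 1
    for p in amber:
        p["elo"] -= delta
        p["games"] += 1
    return sapphire_won
```

Because both teams move by the same amount in opposite directions, the total Elo in the population stays constant. Only its distribution across players changes.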

Step 3: Measuring the results.​

In each iteration of the simulation, we simulate 1000 total games, since that's the number of players in the simulation. After each iteration we measure the average difference between each player's skill and their Elo score. After 50 iterations (which is the number of games it took to unlock ranked by the old system), the average difference between player skill and Elo score is 85 points, which represents about a 6-4 matchup, which is pretty negligible.
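To make the "85 points is about a 6-4 matchup" figure concrete, here is a small Python check using the standard Elo expected-score formula. The `mean_abs_error` helper assumes that "average difference" means the mean absolute gap between skill and Elo, which is a guess at how the spreadsheet measures it.

```python
def expected_score(rating_gap: float) -> float:
    """Standard Elo expected score for the side that is `rating_gap` points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def mean_abs_error(players: list) -> float:
    """Average |true skill - Elo| over a list of (skill, elo) pairs (assumed metric)."""
    return sum(abs(skill - elo) for skill, elo in players) / len(players)


print(round(expected_score(85), 2))  # 0.62
```

An 85-point gap gives the stronger side roughly a 62% expected score, which is where the "about 6-4" figure comes from.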

Conclusions​

There are of course issues with the game, but the fact that it uses wins and losses as a metric for skill is not one of those issues. Here's a link to the spreadsheet. If you can find any flaws in my system, let me know and I can run the numbers again. If not, put this to rest.

In response to what you were saying​

  1. That is not what metric-based matchmaking would reward. Players would never deliver the urn because they could be farming instead. Players would never go mid because they could be farming instead. Anything that doesn't improve a stat would be completely ignored.
  2. They did remove "ranked" mode, but you clearly misunderstand how it works now. It's not just "win a game get a medal, lose a game lose a medal". There is a more complicated algorithm happening that you clearly don't understand.
  3. No I don't wear pink glasses, I actually have phenomenal vision. But I do wear striped thigh highs like any good software engineer.
  4. (still responding to your third point) I don't know of any game that doesn't use wins and losses as a matchmaking metric. League of Legends and DotA 2 both use win/loss based matchmaking, so I don't know why you brought up DotA as an example. Though DotA does use a "Hidden MMR", it's clearly not doing what you think it does. The "Hidden MMR" is a hidden matchmaking system used for unranked play. It's the same thing, just not shown to you.
:3
 

Step 1

1) Okay... you just made up some random statistics, and I don't get what they're supposed to prove. Or were you just trying to show me how it's supposed to work in theory?

Step 2​

2) This is a common thing tech nerds like you do: shake off the other person by addressing him as if he were a developer too, with the relevant education. I'm not going to argue with your simulated statistics or numbers, because they tell me nothing about what actually happens. Besides the lack of information, investigating how it works in Deadlock and how many players are not at their real rank would take a lot of time, and I don't think anybody here will do it. Anyway, I'm not claiming that this win/loss-based rank system doesn't work in principle. Of course it works: it can narrow the gap between the theoretical "true skill" itself and Elo, but it only works that way in a mathematical model. One fact: before they added ranked, I had a constant 1900 Elo, as far as I remember from Tracklock, and my lowest rank after just my third week of calibration is Archon 4. Okay, that's roughly 6-7 games to reach Ascendant 2-4, which isn't many. But that's only on the condition that I win those games. I can also lose 6-7 games and end up at Ritualist 4, so now the gap is 14 games, and that's too many if my true rank is Ascendant. Your numbers don't show the qualitative difference between those ranks, so they can't predict whether I'll win my next game as a Ritualist, because I'm playing among "true Ritualists" who understand the game at their own level. I'll remind you that your math and your "true skill with a standard deviation of 200" don't provide any criteria for what true skill is. So I'm telling you a second time: players cannot carry a match solo, so the skill level of one person is not a sufficient condition for victory. Therefore the playstyle of my 5 Ritualist teammates, not my own skill, will determine the result. Do you think what I get after such a match represents my skill?

Step 3​

3) But that's not my point about why I think SBMM based on personal stats is better. I've already responded to your first statement: such a system wouldn't take a single stat like "amount of souls" and make it dominate the others; it should work as a whole. Your opinion is based on the idea that many in-game activities would be left out. Solution: just add entries such as "urns delivered: amount" and "aegis stolen: amount", and that's it. Players will be motivated to do these activities.

Step 4​


4) You really don't know what the hidden pool is in Dota, so should I explain? The hidden pool isn't hidden MMR. It's a certain manipulation of the player selection system that developers like Valve build into their games to make them more popular. It sounds like a conspiracy, but a lot of players, including high-ranked ones, are really sure about it. For example, if you are a bad player, the system puts players with a higher win rate or a win streak on your team and makes you play against players with lower skill levels. This is supposed to motivate people not to drop the game because of its difficulty and to keep them playing. On the other hand, to prevent extreme win streaks or smurfing, the system artificially puts 1-2 anchors (the community calls them agents), or even a full team of them, on your team, just to stabilize your win rate and hold it at 50%. That's why a lot of people have a 50% win rate, and those with a higher one usually play in a party. So rank growth follows a "2 wins, 1 loss" proportion, and the system tries to hold that value. But in Dota there is also a hidden pool: a place where people with a low reputation (behavior score, win rate, etc.) play on one team against casual players as a punching bag, so the latter can win easily. Everybody in the community knows about it. To be honest, I'm not sure it exists in Deadlock (the game has just 16k concurrent players))), but I'm 100% sure that the 50% system I described above doesn't work properly in the game, even though it is present. Of course, I'm not even talking about how fair this is on the developers' part or how it affects the competitive, equal foundation of the game: obviously badly. But check other threads on these topics, check the countless complaints in the "Bad mm thread", where people are indignant that games aren't equal, that someone with 300 hours gets a full team of 70-hour players and plays against 300+. When rank medals were shown, it was even visible as a noticeable imbalance between them.
Many players also get an extreme losing streak after an extreme win streak, made up of one-sided games. I had it too and can provide screenshots where the number of matches I won in a row is exactly the same as the number I lost before. I have this right now: my last 5 games were all wins after a lot of losses, and the players on the enemy team are as bad as the ones on my team were before.

Conclusions​


And IN THOSE SITUATIONS that take place in the game, I believe, after everything I've said, that an SBMM system based on personal statistics is far more fair. So take off your pink glasses and look at the game through the eyes of players (I've played about 5-6k hours of Dota, I don't know exactly how many since I have several accounts, but I know what I'm talking about; I've swum in this cauldron for too long).
Sorry for my English.
:3:#:#:#3
 
  1. The actual starting conditions don't particularly make a difference. I was just showing them here so that the results would be reproducible. But for the sake of being as fair as possible I used real-world conditions as a basis for the simulation.
  2. Most of what you said here was already addressed previously. But to clear up a few more things. There is no such thing as a ranking system that "only works in math". The whole game is made of math. The "true skill with a standard deviation of 200" is just a number representing the actual skill of players, and the simulation sees if it can figure out what their skill level is based on the outcomes of matches. Which it can.
  3. There are too many potential factors for an automated system to possibly rank you individually apart from your team, especially given the best course of action is heavily dependent on situation, and getting closer to a proper measurement will necessarily get you closer to predicting wins/losses, which is something you can and should already measure.
  4. The thing you've mentioned here is a conspiracy theory, and has been outright denied by the DotA 2 developers. This is simply not how the system works; you are falling victim to the gambler's fallacy.
  5. (in reply to the conclusion) I do agree the game often has a problem of seeming one-sided, but this is not a flaw of win/loss based matchmaking. I believe this is actually more of a flaw caused by lower player numbers, the player base's insistence on having separate matchmaking queues (lowering the player pool further), as well as possibly an issue with the natural snowbally-ness of MOBAs as a genre. There are issues of course, but win/loss based MMR is not one of them.
 
You cooked. I think you've sorta just won this at this point lmao
 
  1. The actual starting conditions don't particularly make a difference. I was just showing them here so that the results would be reproducible. But for the sake of being as fair as possible I used real-world conditions as a basis for the simulation.
  2. Most of what you said here was already addressed previously, but to clear up a few more things: there is no such thing as a ranking system that "only works in math". The whole game is made of math. The "true skill with a standard deviation of 200" is just a number representing the actual skill of players, and the simulation sees if it can figure out what their skill level is based on the outcomes of matches. Which it can.
  3. There are too many potential factors for an automated system to possibly rank you individually apart from your team, especially given that the best course of action is heavily dependent on the situation. Getting closer to a proper measurement will necessarily get you closer to predicting wins/losses, which is something you can and should already measure.
  4. The thing you've mentioned here is a conspiracy theory, and has been outright denied by the DotA 2 developers. This is simply not how the system works; you are falling victim to the gambler's fallacy.
  5. (in reply to the conclusion) I do agree the game often has a problem of seeming one-sided, but this is not a flaw of win/loss based matchmaking. I believe this is actually more of a flaw caused by lower player numbers, the player base's insistence on having separate matchmaking queues (lowering the player pool further), as well as possibly the natural snowbally-ness of MOBAs as a genre. There are issues of course, but win/loss based MMR is not one of them.
1) all your arguments once again miss the point; they fail to address what I am asserting. my thesis is as follows: in a game where matchmaking distributes players across teams so that a player's win rate does not exceed 50%, the rating should not be based on the number of wins, because that does not measure an individual player's skill. the 50% system is not a conspiracy - it is a fact that exists in many competitive team games, especially in Valve games - there's no point in arguing against this.
2) your vacuum simulation, whose initial values are borrowed from chess (where it is easier to measure a player's "true skill"; as far as I know, there are no open tools for monitoring player skill in deadlock, which raises questions about how its elo is calculated), does not account for things like stabilizing players' win rates by matching them with weaker teammates and stronger opponents. the 50% system is an immutable fact: the evidence is right before your eyes. if you've played the game, just review your match history (though since the system in deadlock functions incorrectly, it's better to analyze this in the context of dota).
3) regarding the hidden pool - you've essentially claimed out of thin air that valve has denied its existence. you should know that valve is smart enough not to make any statements about widespread rumors regarding their games, as doing so could damage their reputation if the rumors turn out to be true. developers have consistently avoided commenting on this topic, and for a long time they even ignored claims about shadow bans, which nevertheless exist in the game. even a developer on their forum confirmed this. officially, the game only includes the following types of punishment: low priority, explicit bans, and shadow bans. about the hidden pool, the developers remain silent.
4) let me reiterate: I do not deny that your simulation can transform the stated input data into results based on wins, etc. yes, it can. but does it work this way in deadlock? neither you nor I can know for sure from official sources, because the algorithms used for player matchmaking in the game are confidential. your simulation assumes that players are selected based on "true skill" among vacuum-like individuals, but in reality, this is not the case. therefore, I repeat, if the system prioritizes manipulating wins to ensure casual players don't quit the game, the best way to measure rank is through individual player statistics.
you said that you are not a deadlock developer, so you cannot assert what is a gambler's fallacy and what is not (the gambler's fallacy applies only in a casino, where the outcome really is random, not in competitive games lol). I don't even understand your interest in defending such a system if you neither develop nor play deadlock. it seems you just want to share your development experience - how you create games and what systems you use for calibrating players. okay, I'm glad your simulation successfully modeled the task you set for it - but it does not resolve the essence of the issue. read my original thesis and conclusions carefully and finally understand why a win-based rating is a real problem. this is not just a consequence of a small player base - dota serves as an example of this.
 
It's pretty well known at this point that the 50% system is a conspiracy theory. The only reason that your winrate trends toward around 50% is because that's a natural consequence of any good rating system. My system doesn't "stabilize" winrates because Deadlock almost certainly doesn't do that, and neither does any major competitive game.

Here is a DotA 2 developer EXPLICITLY explaining that the game DOES NOT use winrate stabilization. They usually don't make public statements like this, but people like you didn't understand how matchmaking worked so Jeff Hill felt the need to. While it's true Deadlock might not be using the Elo rating system, if they're not, they're almost certainly using something better. Elo is the default for any modern matchmaking system, so it's a safe bet to base a simulation off of that.
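The "natural consequence" part is easy to check with a toy model. The sketch below is not how Deadlock or DotA 2 actually work. It follows a single underrated player whose opponents always sit at that player's current rating, and it uses a fixed K of 32 rather than the K formula from the simulation. Both are simplifying assumptions, and `simulate_player` and `winrate` are hypothetical names. Nothing in it rigs teams or touches the outcome. The only thing that changes is who the player gets matched against.

```python
import random

def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B (400-point scale, assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def simulate_player(true_skill=1800.0, start_rating=1500.0, games=2000, k=32.0, seed=0):
    """Follow one player who starts 300 points below their real skill."""
    rng = random.Random(seed)
    rating = start_rating
    results, ratings = [], []
    for _ in range(games):
        # Fair matchmaking: the opponent's real skill equals our current rating.
        won = rng.random() < expected_score(true_skill, rating)
        results.append(won)
        # Same rating on both sides, so the system expects a 50% result.
        rating += k * ((1.0 if won else 0.0) - 0.5)
        ratings.append(rating)
    return results, ratings

def winrate(results):
    return sum(results) / len(results)

results, ratings = simulate_player()
print("first 50 games:", winrate(results[:50]))
print("last 1000 games:", winrate(results[-1000:]))
```

The early winrate is high only while the rating is still catching up to the player's skill. Once it has caught up, even matches are all that is left, and even matches mean a winrate near 50%.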
 
I ran a statistical analysis:
View attachment 32778
The results are exactly what I told you they were going to be, but this graph by itself doesn't mean much so let me break down how I did this:

How the simulation works

Step 1: Starting conditions.

First, we generate 1000 simulated players. Each player has a "True Skill" and a starting Elo score both picked from a normal distribution with a mean of 1550 and a standard deviation of 200. These numbers were picked based on chess Elo score data since it's readily available.
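As a rough Python equivalent of these starting conditions (the spreadsheet itself isn't reproduced here), the sketch below draws skill and starting Elo independently from the same normal distribution. The independence is an assumption, and `make_players` and its parameters are hypothetical names.

```python
import random

def make_players(n=1000, mean=1550.0, sd=200.0, seed=None):
    """Create n players with a hidden skill and an unrelated starting Elo."""
    rng = random.Random(seed)
    return [
        {
            "skill": rng.gauss(mean, sd),  # what the system is trying to discover
            "elo": rng.gauss(mean, sd),    # starting rating, drawn independently (assumed)
            "games": 0,
        }
        for _ in range(n)
    ]

players = make_players(seed=42)
```

Passing a seed makes the starting population reproducible, which matters if someone else wants to rerun the numbers.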

Based little game dev demolishing brainlets. Astounding work
 
"The Dota matchmaker does use many other factors when trying to make a match that are more than just player skill to ensure that the teams are compatible." nuff said, didn't read further
UPD: just noticed the following paragraph: "A 50%.....it's a consequence of trying to make the teams for each individual game fairly and players playing a large number of lifetime games. Consider what it would mean if this were not true - what if some player had a 70% lifetime win rate over a large number of games? That would mean that the teams that player was put on for those games objectively had a 70% chance to win in aggregate." based on which of those "many other factors" do they measure the chances of victory and select the players on your team? isn't it winrate )
 
Yeah, the winrate will naturally tend toward 50% because it's trying to put you in even matches. But they're not putting you in rigged matches or some "hidden pool" to balance out your winrate.
pls man, stop bothering me if you don't intend to change your opinion. your "valve developer" said precisely that they sort teams by putting players with a lower winrate together with ones who have a higher one and making them play against avg mmr players, because if they didn't do this, the player with the high winrate would "objectively had a 70% chance to win in aggregate". all you do is defend a conviction that is based on 0 proof. but so as not to be unfounded myself, I will provide you with some proof. I show only the profiles that are open, just so you can judge their avg skill. I picked only my last won game and my last lost game, because the list of players I've played with recently isn't infinite. and I'll warn you in advance: no need to say this is a consequence of low online. when online was 100k the situation was the same.



[Screenshots: MY LOSESTREAK (9 games), MY WINSTREAK (6 games)]

[Screenshot: my last won game, with profiles of enemies below]
[Screenshots: three enemy profiles]
Team:
[Screenshots: two teammate profiles]

MY LAST LOST GAME

[Screenshot: last lost game]

PROFILES OF ALLIES (ENEMIES' PROFILES ARE ALL HIDDEN)
[Screenshots: four ally profiles]
This is just a small part of what I can prove, because I don't have the profiles of all the teammates I've ever played with, or even access to their ranks or match histories. The situation is the same in dota. Player observations of the situation are worth a lot, so stop denying them. if you don't have any proof to back up your opinion, pls stop blowing hot air.
 

Everyone knows that to determine how good the matchmaking system is, all you need to do is look at your last games and see how often you won. If it's not 50/50, then you know it's broken. Obviously.
 