## RTS Balancing - Proven to be NP hard (or even solvable)?

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

Game_boy
Posts: 1314
Joined: Tue Mar 04, 2008 7:33 pm UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

Koa don't bother arguing with this guy.

Edit: OK.
Last edited by Game_boy on Wed Mar 28, 2012 2:54 pm UTC, edited 1 time in total.
The Reaper wrote:Evolution is a really really really long run-on sentence.

Koa
Posts: 512
Joined: Mon Dec 12, 2011 1:20 am UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

I'm not crazy about your denouncement of him. Even if someone is wrong, I would at least like to understand why they believe that they are right through argument. I'm pretty glad I did argue it in this case. I found a fairly satisfying answer to the original question when I didn't quite have one before.

freakish777
Posts: 354
Joined: Wed Jul 13, 2011 2:14 pm UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

Koa wrote:This is a problem of the disagreement on the definition of balance, which is why it would be important to define.

In agreement over the need of definition. I don't think balance can be defined by what is or isn't fun. Such a definition makes it (nearly) impossible to answer in a programmatic fashion.

You'd also have to replicate physical input limitations and the capacity to understand information.

Physical input limitations should be able to be addressed by sleeps and by introducing some randomness when placing where on a the screen to perform some action (humans don't always click the exact X, Y coordinate they want to, if whenever your bot tells a unit to go to some X, Y coord to perform action Z, we offset that X, Y coord by some small random amount, we should be able to achieve this aspect of how humans play).

Limiting information, you'll need to make sure that the bot only can only view information (health) of units selected. Most attempts at bots so far have just accessed the raw game objects, so a bot has a "God's Eye" view, which obviously you need to avoid.

You'll have to draw lines of what is acceptably possible for a human, and at what point should imbalance be acceptable when a human is capable of exceeding those limitations.

This is where you and I wholly disagree on the definition of balance. If Balance basically asks "Is the game Fair?" and the definition of Fair asks "if both players are equally skilled, and the game is not 100% luck based, is the game a 50/50 chance for both players? If the one player is more skilled than the opponent, is their chance of winning > 50%?" Clearly if a human player exceeds the skill level of all of their opponents in a skill based game, they should have a greater than 50% chance of winning (they are, after all, better).

Now, I know what you're getting at. Once you've decided on what the human limitation is, what happens when you were just a tiny bit incorrect and it messes up all the data you've done (because you've balanced for just below or just above your target). The answer is you'll want to balance for multiple levels of limitation. Again, since balance can be achieved in many different ways, you want to record every single balanced point (in your N-dimensional space) you come across, and then across all "skill levels" or bot limitations, you want to find the point that exists for as many skill levels as possible.

This also comes down to the definition of balance, and I'm afraid it may be subjective in nature.

Fun is (obviously) subjective in nature. Balance/Fairness should not be (not because of some "this is the way the world should be" attitude, but because it becomes nearly impossible to answer scientific questions when definitions are subjective, if we want to say that the definition of Force is subjective, instead of the definition being Force = Mass * Acceleration, well have fun in your Physics class, I guess).

I guess I wasn't very clear. I know what you're suggesting, but I don't think you understand how incredibly complicated that heuristic would have to be to gather all the information required to solve the game.

No, I'm fully aware, and I was up front about it being, by far, the most complicated part of all of this. Again, RTS games are all about converting one resource into another and then making favorable trades with your opponent (which is not all that different from Chess, it just happens in real time). So, before we can calculate value we need a list of conversion functions from one resource to another, over time. Then our Heuristic estimates value of units/resources, dependent on current game state (do we know our opponent has zero detectors? looks like those cloaking units are pretty good right now).

The hardest part will be ensuring that you've captured every conversion, so every aspect of the game needs to be taken into account (if you don't you leave yourself open to not being able to discover strategies that rely on that conversion).

Every unit is a resource, every building is a resource, every action that a building or unit is capable of is a resource, every unit of mineral, every unit of gas, every mineral field, every gas field, every limitation in the game (cannot build 2 gas mines on the same field, fog of war, can really only get 3 workers in a gas mine, etc), every upgrade (this one will make your head explode), time is a resource, space is a resource (the larger the map, the less likely you are to need to build against an early rush), obstacles are a resource, etc, etc.

So, if you missed a function that calculates the conversion of resources from minerals, gas, buildings, and time involved to get an upgrade, your bots shouldn't ever perform that upgrade (I'm sort of assuming that you've programmed things to exit gracefully and always return default values, which is normally a best practice, but in this case, perhaps crashing with a "Unable to Calculate Value from Heuristic of Action X" would be better since you'd find all the places you need to fix) which will lead to any strategy depending on that upgrade not being taken into account.

Note that the upgrade functions need to temporarily modify all of the functions (both conversion functions and value functions) for units/buildings effected by that upgrade.

You could attempt it by mimicking human play (and even that would be amazingly difficult to accomplish to any degree of accuracy)

Yeah, doing that is not the way to go.

, but a heuristic capable of truly solving SC2 is so over my head that... I can't even explain it... I'd have to assume you're an alien.

If that helps you sleep at night.

How do you plan on building the heuristic?

It requires intimate knowledge of the game (ie, you work for Blizzard/GameCompany and are handed a 2,000 page Requirements Document for the game that the game's software developers are also handed). As in, knowing every single resource available in the game, and how you get one from another (given zero minerals, you get to 5 minerals minerals with 1 Worker, 1 Base, and s seconds where s is dependent on how far away that base is from the mineral field, and how far away the worker starts from the mineral field).

Identify every game object. Is it a resource? How does it convert into other resources. Is it necessary (like a worker, or a base, or a gas mine) to convert one resource to another? What resources exist that aren't represented by game objects (hint, usually time and space)?

Once this is understood, then you go about creating the heuristic (for reference, the list of Resource Conversion Functions needs to get updated each time you change a unit's build time or speed, all of this can be automated since the calculations are all based on known information, one headache for Resource Conversion is in movement of workers and how that impacts conversion, in particular moving the first worker to an expansion you can't say "he moves this fast and it's this far apart" due to obstacles, and clogging a mineral field, so in reality you'd probably put in an average as determined by your thousands of games worth of data how long it takes to move that distance, or you can try to compute on the fly which will slow your bot's decision making down).

In this case your Heuristic is just a bunch of functions that estimate value for resources (which includes, units/buildings/etc), and coupled with the Resource Conversion Functions let the bot decide which thing to spend it's Time/Minerals/Gas/Units on. Build an economy? Rush? Rush with which units? Mix 75% of A and 25% of B (keep in mind your estimated values need to follow the Law of Diminishing Marginal Returns)?

So:

Resources W, X, and Y can be convert into resource Z at the following conversion rate - 50W + 10X + 25Y = Z. And Z has an estimated value of, say, 500. Or, 100W + 15X + 0Y = N, and N is estimated at 750. Which do you go for? Which is going to more likely to make you win? 2 Marines, or 2 SCVs? (Clearly dependent on game state).

A GA? How do you build that GA?

The same way we do every other GA.

So, we're updating the Estimated Values functions. You can think of these functions mathematically, as creating lines that we're trying to optimize a best fit function for. For the first GA (the one that updates Heuristics)

[*]Create initial population (we start with a population of heuristics, yes, multiple)
[*]Perform whatever it is that lets you eval fitness of each member (in this case, make them play against each other several hundred times)
[*]Repeat on this generation until condition X. (I recommend a specific number of games)
[*]Select best-fit for reproduction (in this case, we're assuming that the bots using the heuristic that most accurately reflected the values won)
[*]Create new members via crossover and mutation

So, say 2 heuristics did really well, and you have 2 Value Estimation Functions for Marine:

M = ( -0.005 * (current game time in seconds)) - (50 * (number of Tanks in opponent army)) + 100
M = ( -0.010 * (current game time in seconds)) - (2 * (number of Tanks in opponent army ^ 2 + 2 * number of Tanks in opponent army - 5)) + 75

(obviously these functions are going to be quite a bit more complicated)

You want to combine them in some way shape or form. There's several ways to do this, but the gist is that crossover means you'd take -0.005 and -0.010 and do some function with just the already existing numbers, whether it's average them, or take the first and discard the second, or add them, or subtract one from the other, etc. Where as mutation means you're inserting some randomness, think of each number in binary, 50 is 00110010 as a byte, and 2 is 00000010. Mutation would say "before we do whatever function it is that we normally do in crossover, flip one bit in 50 and 1 in 2 first before we do that function." Or whatever other means of adding randomness to your population is. It's important to add this randomness, otherwise you're GA doesn't explore new areas and can get caught finding sub-optimal results. Add too much randomness, and you run the risk of not converging/coalescing. Combine straight lines and the power curves is straight forward, you sort of at that point need to say "We're going to err on the side of functions with higher powers since we can have a leading zero effectively decrease it's power later if necessary." It's combining wholly disparate functions like a log and a sin where things start getting sticky (or when you have a 2 functions, one for X > 5 and one for X <= 5 and are trying to combine it with a continuous function).

[*]Replace least-fit with new members

So, say your population is a list of 20 heuristics. Each heuristic is a list of 500 Value Estimation Functions (or whatever the case actually is, it's probably bigger).

Every Bot uses the same DecideAndExecute() method (adversarial search), but each uses a different heuristic. They play round robin, X games against each opponent (in SC2, 9, number of factions squared, to represent the total possible different match ups between factions).

So now you know which are the fittest and the least, you pick your top 30% to create new members with, and replace your bottom 15%.

Now, we have the second GA which is trying to figure out "how do I determine when something is dominant and needs to be scaled back?" When the population of GA1 coalesces (it's "agreed" by the 20 members that a set of Estimated Valued Functions approximately represents the game and a particular strategy is winning over and over, you could say "When the Most Fit member is the same member 5 Generations in a row and has an X% win rate over that time period greater than our threshold").

Why not just build a fullblown sentient AI while you're at it? I can't even tell if I'm joking.

I'd like to. Unfortunately after a lot of thought, I've come to the conclusion that given today's hardware it would require billions of dollars at a minimum, if it's even possible.

I don't think there is a human on this planet that can say with any certainty that they can build such a bot before they've attempted to do so without exercising ignorance (no offence intended).

I'm confidant it's possible. For me the problem is time and money. Also, the run time to calculate 500+ Estimate Value Functions every time your bot asks about the value of anything (and at what depth?) would probably make performance pretty poor.

Expressive Intelligence Studio at UCSC I believe. There is a lot of scripting involved, but that's because it's necessary to develop something tangible within a meaningful time frame.

Did they print a paper on it? I'd be incredibly interested in their techniques.

But the game isn't even static!

I mean static for units/buildings/game object types. There aren't new units/buildings until an expansion gets released.

Assuming new map sizes are the same as a pre-existing map size, that shouldn't change how we go about doing things.

Now, say we want a new map, and on this new map you can't interact with your opponent outside of building flying units. If your Bots had determined "All Flying Units Suck" and you blindly balanced anyway, yeah, you'd have a problem. This is why you want GA2 to not just say "Hey I've coalesced/converged" until after you've gotten some minimum number of viable strategies (diversity of strategies required is something I assume Blizzard Business people give their "Balancing Engineers" or whatever the title is as a goal, and I wouldn't be surprised if they've given them some sort of minimum to hit in order to achieve player happiness).

Is it way over everyone in this thread's collective mind? Yes.

Disagree on this one.

Is it feasible in 2012? No.

Mostly because a budget doesn't exist for it.

Could an RTS be balanced with such a bot? Define balance. Could you define balance? No, you would have to redefine it.

Could a bot assist in modifying parameters to provide a state of such a redefinition of balance? Of course.

Would such a game, whose parameters are dictated by mathematics to achieve the state of such a redefinition of balance, be balanced? No, while the two definitions of balance (subjective, and defined objectivity) would be similar in some sense, they would never be equal. The best course will always be to give the players the same tools for success. While the pieces and board of chess are completely equal, they have unequal tools (white moves first) and is therefore, at least slightly, imbalanced.

This is why in tournament settings you alternate rounds playing white and playing black, and why World Championships are scheduled as Matches alternating between colors (and also why my first post attempting to provide definitions brought these up).

Any Game (capital G) in which an individual game (lower case g) played is unbalanced, the Game can still be balanced by going to a match format (this is why professional sports teams play half away games and half home games, and why certain championships are attempted to be played on a "neutral" field).

A game with unequal tools could in fact still provide equal chances to both players of the same skill level (and therefore be balanced by my definition).

Say you have 2 RTS players, one does very well on large maps, the other very well on small maps, but otherwise they are on the same skill level. They go head to head at a competitive event in the finals. In this tournament, in the finals, the player with the better record chooses the map for the first game, then their opponent chooses the map for the second game, and a random map is chosen for the last game.

Is this, in your opinion, a balanced tournament format? Why or why not? Is best 2 out of 3 the best format? Is best out of 6? Best out of 6 win by 2 (if player hasn't won 4-2, player more games until a player has won by at least 2 games)? Should the order in which the maps get played be random as well (to avoid the psychology of a "must win"/"back against the wall" game 2)?

I wholly disagree on balance requiring players to agree on whether or not they think it is balanced. Again, I think Balance is directly tied to whether or not the game is fair, which boils down to equally skilled players having equal percentage chances of winning a match (not a game).

Think about the 2 out of 3 example. Player A is good at small maps, they pick a small map for game 1 and win, player B picks a big map and wins game 2. It is randomly determined that a big map will be used in game 3 and Player A loses. Player A complains "That's not fair!" or "How unbalanced is that!" Player A is a now a whining 4 year old... Either it was just as likely that it was going to be a small map (in which case Player A should say "I took a bad beat" like any good Poker player and move on) or it was in fact more likely to be a large map (in which case Player A actually is less skilled than Player B who has recognized that large maps come up more frequently and has adjusted their play accordingly, unless of course Player B somehow cheated, but that's kinda irrelevant to this conversation).

Why does the word Balance have to be subjective? I see no reason for the definition to be anything other than dealing with whether or not equally skilled players have an equal percentage chance to win a match, and you haven't made it clear why a subjective definition should be used instead.

Derek
Posts: 2169
Joined: Wed Aug 18, 2010 4:15 am UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

Koa wrote:Also, you can't really have a computer teach itself to play an RTS such as SC2 using genetic algorithms.

I bet you could use a genetic algorithm to find effective build orders though.

Limiting information, you'll need to make sure that the bot only can only view information (health) of units selected.

SC2 has an option to always show the health of all units.

freakish777
Posts: 354
Joined: Wed Jul 13, 2011 2:14 pm UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

Derek wrote:SC2 has an option to always show the health of all units.

Hm, perhaps then that limitation could be used as a division for bot skill levels. I imagine a lot of players probably don't use that feature, or wouldn't be as adept at leveraging that information effectively as a bot would.

Koa
Posts: 512
Joined: Mon Dec 12, 2011 1:20 am UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

Derek wrote:I bet you could use a genetic algorithm to find effective build orders though.

@freakish
I have to posit that you don't understand even a fraction of the amount of raw information and complexity involved in SC2, nor how the game works and under what circumstances do people win. Thus you don't understand that what you're suggesting is not only ridiculously difficult to create but not even terribly meaningful to the goal of balance (which one can't define).

This is a video of a bot that was designed to control marines to a high degree of efficiency. This is something that your bot would accomplish and then some. APM limitations are insufficient for replicating human limitations. The marines would still spread more effectively than would be physically capable of a human. I'm hoping you see the problem here. While this is likely the most prominent example (but be my guest to look at others), it exists in every facet of the game in some form or another. You're going to have to decide what a human is capable of before you even attempt to replicate their limitations, and at that point balance is no longer objective, you're dictating a form of balance to that degree of ability. At the very least the game would be horribly imbalanced for people who can't accomplish the necessary things that your bot would do.
(If you don't understand why that is then it shows how much you don't understand SC2. There are times in a game where, by design, one race has stronger economy/production/army/etc than the other's. Your bot's heuristics would deem it necessary to take advantage of those windows, often in the form of a timing attack. A player who misses the window, or isn't even aware that it exists would be at a large disadvantage. If both players don't understand their windows, then it's a whole different game they're playing, requiring a whole different set of rules.)
Now, say we want a new map, and on this new map you can't interact with your opponent outside of building flying units.

Do you not understand that every single map in the game, currently, would provide an entirely different state on balance? The game does not exist without a map, and the map will completely alter the effectiveness of everything. You'd be better off designing a bot that would determine what terrain features are the most balanced for the current state of the game, and even that would be an undertaking regardless of how you decide to do it. The same caveats apply.

I don't even know how you'd account for the lack of information the players have of one another... It'd probably just increase the number of required epochs for statistical relevance considerably. Anyway, I get the impression that there are problems that you don't even know exist, which makes your certainty seem more than a little naive, but the design and implementation of the bot is irrelevant to the original question. I answered that, no, it's not solvable because balance can't be defined in this way.

You can give players different tools and then switch them after a game to be fair, such as in chess. The tools do not switch hands in SC2. The mirror matches (tvt,pvp,zvz) are inherently balanced with some very minor advantages due to the map, but every other matchup is not and cannot be unless the game is being played at a level that is solved, and thus, pointless to play.

I didn't know that before this thread, so thanks for arguing it.

Spoiler:
I wholly disagree on balance requiring players to agree on whether or not they think it is balanced. Again, I think Balance is directly tied to whether or not the game is fair, which boils down to equally skilled players having equal percentage chances of winning a match (not a game).

Well now you have to fucking define skill.

freakish777
Posts: 354
Joined: Wed Jul 13, 2011 2:14 pm UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

Koa wrote:I have to posit that you don't understand even a fraction of the amount of raw information and complexity involved in SC2

Do I understand the amount of raw information? Absolutely. Do I understand the nuances of SC2's actual data at the object level? No, I'm not a Software Engineer at Blizzard working on SC2. I'm sure there are Senior Software Engineers at Blizzard that do.

nor how the game works and under what circumstances do people win.

Thus you don't understand that what you're suggesting is not only ridiculously difficult to create

Oh no, I understand this a lot. Probably the confusing part for most people is that I said certain parts of it were trivial (which those certain parts are). There's other parts of it that would require huge amounts of information and more importantly time for a team of individuals (clearly this is not a "DIY Research Project" that one knocks out during a semester at college, this is a fully funded project with a full team of people if someone wanted it done right).

but not even terribly meaningful to the goal of balance (which one can't define).

Why can't one meaningfully define balance. This is a critical part of your argument that you still haven't given a succinct answer for.

This is a video of a bot that was designed to control marines to a high degree of efficiency. This is something that your bot would accomplish and then some. APM limitations are insufficient for replicating human limitations.

Of course an APM limitation is insufficient by itself.

The marines would still spread more effectively than would be physically capable of a human.

Not if they "misclick" with every placement. Hence, the need to add randomization to placing units. This cuts back a bots ability to out micro a human. Essentially you want the X, Y coords it picks for units to be "off" by some small enough amount such that it can't create concave lines to slaughter balls, but that aren't large enough such that it's workers can never end up on a mineral field. Random placing for buildings could be even less, since humans tend to be a little more careful with placing buildings at their ramp.

Furthermore, it's possible for bots with a sufficiently large enough sleeps (such that attempting to micro at that level actually just gets them crushed by a player with a ball and no micro) to determine that the correct strategy is to use balls instead of attempting to micro (assuming your heuristic captures metagame knowledge, this goes incredibly deep though with the heuristic, and I really don't think you want to go there).

You're going to have to decide what a human is capable of before you even attempt to replicate their limitations

Sort of, but not really. Their limitations simply have to be replicated before the first epoch. So objective 1 before you even get your bots playing games against each other is to twerk them until you're sufficiently happy that they don't exceed human capability. This needs to be done "by hand." There's not an automated process for this, and it requires some decision by those involved to say "yeah, we're pretty sure that this bot doesn't just blow people out of the water with their micro."

and at that point balance is no longer objective, you're dictating a form of balance to that degree of ability.

I disagree, you're balancing (not dictating) for that degree of ability. Now do it again for each degree of ability.

At the very least the game would be horribly imbalanced for people who can't accomplish the necessary things that your bot would do.

Totally and wholly incorrect. This is why you do it for every degree of ability your bots are capable of. One limits APM with sleep(4), another limits with sleep(5). One limits micro ability by forcing misclicks with random numbers in the range of 0 to 10 pixels, another limits micro by forcing misclicks with random numbers in the range of 0 to 50 pixels.

So you have varying degrees of ability your bots are capable of to represent the entire spectrum of human skill level of SC2 (or insert other similar game here). At each skill level, there will be a combination of points (in your N dimensional vector space of variables you're updating with the second Genetic Algorithm) that correspond to the game being balanced. It helps to think of this as a 2-d graph with points (even though it's an N-d graph). Now put your graphs "on top" of each other, and pick the point that exists across the largest number of skill levels.

If your base bot has sleep(1000000000000), and misclicks by up to 1600 pixels, every SC2 player, even the ones that get utterly destroyed in bronze league should be able to beat that bot.

(If you don't understand why that is then it shows how much you don't understand SC2. There are times in a game where, by design, one race has stronger economy/production/army/etc than the other's. Your bot's heuristics would deem it necessary to take advantage of those windows, often in the form of a timing attack. A player who misses the window, or isn't even aware that it exists would be at a large disadvantage. If both players don't understand their windows, then it's a whole different game they're playing, requiring a whole different set of rules.)

This, incidentally, is also why you store all of the heuristics your bots use. The ones where it tells your bot to over value Carriers, "because Carriers are awesome man, I win every game where I can get a Carrier out!" this leads them to miss their timing attack that gives them the best chances of winning (because they're time attacking a different build order). It also represents a different skill level, because their understanding of the game isn't as good as better skilled player.

Why do we store these heuristics? Once you established which heuristic is most dominant for that epoch, you place them all on a bell curve according to their winning percentage against that generation. As you balance you pit similarly performing heuristics against each other, to ensure that the eventual point of balance you pick is balanced for the largest number of different skill levels an ability sets. Some players have higher APM then other players. Some players misclick more often. Some players fundamentally misunderstand how good a unit or units are. A player with low APM that doesn't misclick very frequently and has a very good grasp of the game, could be on the same skill level as a player with high APM who doesn't misclick very frequently, but doesn't understand why Warp Gates or Upgrades are any good and plays Protoss.

Do you not understand that every single map in the game, currently, would provide an entirely different state on balance? The game does not exist without a map, and the map will completely alter the effectiveness of everything.

Of course I understand this. This is why I brought it up! That said, it's not like each map is 100% unique in regards to balance. This is because each map contains objects that exist in other maps. It's interesting you should mention terrain features, because those would be captured in the heuristics. As the number of chasm that can't be crossed aside from flying units goes up, so does the value of flying units...

This is why you have your bots play on a random map each time they play, so that all terrain features get used and impact the heuristics. The larger variation of maps you have to work with, the less likely a new map is to break the balance of the game (assuming you had this system in place, it's not like you would release the map before running it through the system anyways).

If a particular terrain feature ends up favoring a particular faction, you have a problem and need to balance for that terrain feature. Is the terrain feature chasms, and it benefits faction A because their flying units are better? Time to make sure faction A's flying units aren't superior.

I don't even know how you'd account for the lack of information the players have of one another... It'd probably just increase the number of required epochs for statistical relevance considerably.

What, why would it increase the number of required epochs?? Your adversarial search algorithm that all the bots are using will go "Hey I don't have any clue what my opponent is doing, but I know they're faction A (or maybe they're random!), faction A has strategy X that's really good against the faction I'm playing, so I'm going to guess they'll try to do that (what's the best play my opponent could make, according to the heuristic I'm using, assume they'll do that), so I'll respond accordingly by making units U that have better chances against that."

In any event, what should happen fairly early on is that the heuristics should start placing a value on scouting proportional to what it's actually worth in the game. The less important it is to scout (if all units are good units regardless of what your opponent's units are), then the less valuable it should be in the heuristic. This is why the heuristics in your initial generation (your "seed population") should be radically different from each other, so that when crossover does happen, your new members are sufficiently different from each, individual, parent (so that your GA sufficiently explores the problem space).

I answered that, no, it's not solvable because balance can't be defined in this way.

I still don't know why you don't think it can be defined that way?

The tools do not switch hands in SC2.

Could they switch "hands"? Would a more fair system be to play a match on one small map (favors Zerg rush?), one medium map, and one large map? Let players switch races in the middle of a match (this would favor players capable of playing multiple races at a high level, such a player would be more skilled than an opponent who couldn't and they should therefore enjoy a higher win rate, correct?)?

EDIT 1: Something I was trying to get at with my previous post. It may not be necessary for players to have the same tools in order to be a balanced game. If both players are of equal Skill level, and the tools available to them are different, but they still both have an equal chance of winning a Match, then the game is still balanced.

unless the game is being played at a level that is solved, and thus, pointless to play.

Solved games are not pointless to play. Connect 4 is solved. Kids everywhere still enjoy playing Connect 4 (though I will concede they probably enjoy it because they don't know it's solved). My point here is that when Chess is eventually solved, people will still play it and enjoy it.

Well now you have to fucking define skill.

A player's Skill (upper case) is the summation of all skills necessary to play a given game. A skill (lower case) is how well they perform a certain task, that task being a mechanic of the game.

Examples of skills in Chess (not a full list):

Knowledge of game rules, (imagine for a second that 2 players are playing in a tournament and one is blissfully unaware of the En Passant rule, that player has less total Skill than another opponent, all other things being equal, at some point that player will make a move that allows En Passant to happen, and when it does it could be to their disadvantage).

Ability to see some number of moves ahead (the further ahead you can see, the better you are at this skill, memory is a factor here).

Evaluation of Position (the more accurate your evaluation is, the better you are at this skill).

Strategy (long term planning, it's dependent on evaluation and Ability to see ahead).

Tactics (immediate piece placement in an attempt to create an advantage, examples of tactics in Chess are pins, discovered checks, forks, sacrifices, etc).

Knowledge of Openings (the more openings you know, the better you are at this skill).

Knowledge of Opening Strategy.

Knowledge of Middle game Strategy.

Knowledge of End Game Strategy.

So, in Chess your Skill would be "How well you know game rules" + "How many moves deep you can see and keep track of, and in what time frame" + "How well you evaluate positions" + "How well you strategize" + "How good are your tactics" + "Your knowledge of openings and it's strategy" + "Your knowledge of midgame strategy" + "Your knowledge of endgame strategy" + whatever I missed.

Also, I'm going to pre-empt any "Now you need to define what it means to do a task well!!!" comments. With respect to an opponent, or group of opponents. How players perform at a particular task required to play the game is compared to how other players perform that task.

Naturally, one would assume a bell curve/normal distribution of how humans perform any given task. Summation of varying skills should also lead to a normal distribution of Skill. Players of equal Skill playing a Fair match should have the same percentage chance of winning as their opponent (if no draws are allowed for a match, then 50%).

How much more Skill an individual player has over an opponent should give them a higher percentage chance to win over that opponent, and it should be proportional to the amount of Skill involved in the game (the more Skill and less luck, the larger the percentage advantage).

I know that what you're trying to get me to see is that "We can't really measure total Skill, and get some number value." That's actually incorrect, measuring win percentages is how you get some number value, then after you've assigned ratings to your players, if those player ratings don't loosely represent a bell curve, perhaps your game doesn't require enough different types of skills in order for your players to differentiate themselves from everyone else.

Lastly note that some skills are applicable to other games (and in the case of bots, your adversarial search algorithm could be if it was generic enough, while your Estimated Value Functions wouldn't be since they're specific to that game).

Koa
Posts: 512
Joined: Mon Dec 12, 2011 1:20 am UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

Why can't one meaningfully define balance. This is a critical part of your argument that you still haven't given a succinct answer for.

I've been explaining why... If you need a simple analogy, this is like accomplishing a balance between how far one person can throw a spear, and how far another person can walk on their hands. They're completely different skills, and you're suggesting that you can get an aggregate of information of people who have accomplished both goals and then create an effective formula or ratio for competition. Okay, but extrapolate that to all of the different skills that exist in SC2 (assuming that's even possible, there are many different skills from splitting marines to putting workers on gas to... everything a player can do for any reason), and weigh each skill against each other, and weigh the probability that a person has that skill for the people who don't (which is going to be constantly changing, assuming you could even gather such information)... You would have to stream the game's parameters to the players because the balance would be changing so often, and then no one could ever learn how to play because their understanding of the game would have to be constantly deconstructed.

The marines would still spread more effectively than would be physically capable of a human.

Not if they "misclick" with every placement.

So... what you're saying is that you're going to quantify human error. Do I really need to explain this?

Sort of, but not really. Their limitations simply have to be replicated before the first epoch. So objective 1 before you even get your bots playing games against each other is to twerk them until you're sufficiently happy that they don't exceed human capability. This needs to be done "by hand." There's not an automated process for this, and it requires some decision by those involved to say "yeah, we're pretty sure that this bot doesn't just blow people out of the water with their micro."

So basically it's not objective. You twerk the bots until it "looks about right" and then it's good to go and start balancing for you. Not. Objective. Players are being judged not only in the human error of the process of determining what human error looks like, but also because it will not and cannot be a perfect representation of the players. Dictating.

At the very least the game would be horribly imbalanced for people who can't accomplish the necessary things that your bot would do.

Totally and wholly incorrect. This is why you do it for every degree of ability your bots are capable of.

So, now you're quantifying human error for a plethora of different skills (ex marine split) over several different degrees of ability (ex league) that are also attempting to quantify human error (ex both how and why they suck). Very. Not. Objective.

And are you actually suggesting that the different leagues should have different game parameters? And are you suggesting that the team games (2v2, 3v3, 4v4, ffa) should also each have different parameters for each different league? How many states of the game would there be? How would anyone learn to play the game if the game is not only changing on them constantly, but catering to their weaknesses from the game's values?

If you're not saying that, then you're saying that you should take the average of every skill level to balance the game, and impose those rules on everyone. This accomplishes absolutely nothing other than "balance" (dictate) the game around the average. Everyone who is above or below the average skill level has an increasingly imbalanced game the further away they are from it.

I'm sorry, this is absurd from all angles.
This, incidentally, is also why you store all of the heuristics your bots use. The ones where it tells your bot to over value Carriers, "because Carriers are awesome man, I win every game where I can get a Carrier out!" this leads them to miss their timing attack that gives them the best chances of winning (because they're time attacking a different build order). It also represents a different skill level, because their understanding of the game isn't as good as better skilled player.

No it does not. One of my first arguments was that bots learn in a different way to humans. Humans make assumptions, humans have emotions. You cannot store a heuristic state to quantify human skill, and even if you did, you're still dictating to them how skilled they are rather than accurately representing how skilled they are. And thus, Not. Objective.

Of course I understand this. This is why I brought it up! That said, it's not like each map is 100% unique in regards to balance. This is because each map contains objects that exist in other maps. It's interesting you should mention terrain features, because those would be captured in the heuristics. As the number of chasm that can't be crossed aside from flying units goes up, so does the value of flying units...

It is 100% unique, because your only data for determining whether a strategy is strong or not is how often it wins or loses. If at any point a difference in the map can potentially determine that, and you can't determine at what point of the game a part of the map did determine it, then you have to account for it as absolutely unique.

This is why you have your bots play on a random map each time they play, so that all terrain features get used and impact the heuristics. The larger variation of maps you have to work with, the less likely a new map is to break the balance of the game (assuming you had this system in place, it's not like you would release the map before running it through the system anyways).

Same problem as above, balancing for an average only balances for the average and nothing else. Some maps would be closer to the average map and thus the game would be more balanced on that map. Some would be further away, and thus more imbalanced. You're deciding what type of map to include or not include in your average, as the number of different maps is near-infinite (and by that I mean any of the smallest of changes can affect the balance of the map, so every_point_on_every_size_map^all_the_potential_terrain_features). You're dictating to them what is a potential map whereby players are judged and then the game balanced. You can accept the already present dictation that is the currently played map pool, but then you would have to alter balance each time a new map is introduced (often), you would not account for maps not currently present in the pool, and you still have the problem that all of the maps would be imbalanced because an average always has a min and a max (and average != median, therefore the map that the game would be balanced for would likely not exist).

Not. Ob-whatever. Are you getting the point yet?

If a particular terrain feature ends up favoring a particular faction, you have a problem and need to balance for that terrain feature. Is the terrain feature chasms, and it benefits faction A because their flying units are better? Time to make sure faction A's flying units aren't superior.

How do you determine these things? How would you identify that a terrain feature caused an imbalance? I'm sure you understand the amount of computations that would be required to analyze that, ontop of everything else. \$75,000?

What, why would it increase the number of required epochs?? Your adversarial search algorithm that all the bots are using will go "Hey I don't have any clue what my opponent is doing, but I know they're faction A (or maybe they're random!), faction A has strategy X that's really good against the faction I'm playing, so I'm going to guess they'll try to do that (what's the best play my opponent could make, according to the heuristic I'm using, assume they'll do that), so I'll respond accordingly by making units U that have better chances against that."

Because that's not the only instance where lack of information affects the game... Think about army positions, drops, small skirmishes in the middle of the map... Pretend that one player had perfect information and one player did not. The player that did could attack wherever and whenever the other player did not have adequate defenses, which would be always. When both players are blind, you need much more data for statistical relevance.

The tools do not switch hands in SC2.

Could they switch "hands"? Would a more fair system be to play a match on one small map (favors Zerg rush?), one medium map, and one large map? Let players switch races in the middle of a match (this would favor players capable of playing multiple races at a high level, such a player would be more skilled than an opponent who couldn't and they should therefore enjoy a higher win rate, correct?)?

Yes, they could switch hands. No, they would have to be played on the exact same map in the exact same positions (reversed for the players, same for the race). This would be balanced. They both have an equal opportunity to win.

EDIT 1: Something I was trying to get at with my previous post. It may not be necessary for players to have the same tools in order to be a balanced game. If both players are of equal Skill level, and the tools available to them are different, but they still both have an equal chance of winning a Match, then the game is still balanced.

Then by that definition SC2 is balanced and we're arguing over nothing. SC2 is roughly equal across the board, with one race going up/down a self-respective 10% at most (your bot would accomplish no better). I don't have the data on hand, but I have seen the data several times. Here is some data separated by leagues. This statistic has little bearing on whether the game is balanced or not, even if it were exactly equal. I'd also just like to add the thought that since Blizzard considers win statistics for making balance adjustments, they're effectively doing what you're suggesting that your bot would anyway, just slower. Your bot would not provide an objective balance just like Blizzard is not, it would simply provide a faster process to attempt to keep the win statistics relatively equal. Which I'd argue they do naturally anyway...

Solved games are not pointless to play. Connect 4 is solved. Kids everywhere still enjoy playing Connect 4 (though I will concede they probably enjoy it because they don't know it's solved). My point here is that when Chess is eventually solved, people will still play it and enjoy it.

You completely missed my point... People play solved games because they themselves have not solved it. They don't know the correct move, and thus, deciding what the correct move might be is entertaining. I was saying that you would have to balance the game around the assumption that the game is solved. Such a balance would only be balanced for the people who have solved the game for themselves, and assuming they have, they would find it pointless because they always know the answer. Everyone else who has not solved the game for themselves would have an imbalanced game, as it does not account for their specific lack of understanding or ability. To put it in simpler terms, banelings would be overpowered because marines would be balanced around the assumption that the player can perfectly split. And to make a comparison to what that would mean in connect 4, it would something such as the lack of understanding that they can use all of the available slots, or that they can win with a diagonal pattern.

If you're thinking that marines wouldn't perfectly split in your bot, I've already addressed that above. You can't quantify human error without making assumptions... unless you have their brain plugged into a computer so that it can determine exactly what they are capable of and how much they understand. You'd probably also have to do a complete physiology study, as the brain can't determine whether or not the hands that it attempts to control function properly or even exist, but I jest. (Not really, you would. Manipulating physical inputs is also a skill, and if their hands are cold or something they won't be able to operate as well as the mind thinks they should.)

nor how the game works and under what circumstances do people win.

I'm not saying this to ridicule or attack your argument. I'm saying this because I genuinely believe this to be a large crux of our disagreement, and with each post I am finding it to be more evident (i.e. not understanding what a player's lack of information fully entails). The other crux being your misunderstanding of what information is (you've inclined me to believe this is not the case, but the things you are suggesting...), or professing your very subjective view of balance as an objective one. Not sure. It may also be the case that you feel you have to defend yourself rather than your argument, which is clouding your thinking. That one is always hard to assess though, and not something anyone but yourself could affirm.

freakish777
Posts: 354
Joined: Wed Jul 13, 2011 2:14 pm UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

Koa wrote:I've been explaining why... If you need a simple analogy, this is like accomplishing a balance between how far one person can throw a spear, and how far another person can walk on their hands.

It's not like that at all. It more like Balancing how far both of them throws the spear, but one of them throws it like a javelin and the other throws it like a discus. In the end it's the same task they're trying to accomplish.

So... what you're saying is that you're going to quantify human error. Do I really need to explain this?

If you want to explain it, go ahead.

but also because it will not and cannot be a perfect representation of the players.

Perfection is (almost) never the goal. Good enough is. Keep in mind that the kind of human error for a fast paced environment like SC2 is far different than the human error involved in building software. Each provides it's different set of challenges, but if you've got enough time, arriving at "good enough" is entirely feasible.

So, now you're quantifying human error for a plethora of different skills (ex marine split) over several different degrees of ability (ex league) that are also attempting to quantify human error (ex both how and why they suck). Very. Not. Objective.

How is this part not? One player's skill for splitting marines is worse than another's, because they misclick 10% of the time by an average of 3 pixels. We actually don't care about Why (why did they misclick? don't care) . Just How.

We're also not really concerned with how many players do this (just yet, we are eventually), just that it happens, and that there's a difference between skill levels.

And are you actually suggesting that the different leagues should have different game parameters?

What? No, how did you get that?

How many states of the game would there be?

1, at any given time.

If you're not saying that, then you're saying that you should take the average of every skill level to balance the game, and impose those rules on everyone.

No, I'm not.

Imagine a scatterplot on a 2-d graph (in this game there's exactly 2 variables that the game designers have said "we want you to twerk only these two to get a balanced game"). Each point represents an X, Y pair that would Balance the game. This graph is for one skill level. Now imagine you have 20 other skill levels (100 other, N other), and scatterplots on 2-d graphs for them as well. Lay them all down on top of each other. Where do the points line up the most? If you have 20 skill levels in total, and the point X = 3, Y = 6 appears in 19 of them you pick that point, as it is the most balanced for the largest number of players (barring the graph for that didn't have it representing the group of players that are the majority of all players somehow).

No it does not. One of my first arguments was that bots learn in a different way to humans. Humans make assumptions, humans have emotions. You cannot store a heuristic state to quantify human skill

Yes, it does. Let's say your micro and macro are superb. But you overvalue a particular unit. This is quantifiable human error, and is captured by a heuristic. You could still win lots of games, but an opponent who has the same superb micro and macro as you do, and has a better understanding of the game should beat you, on average.

Let's take your Connect 4 example you made (below) where a player never uses the middle row. A heuristic can be created such that the value of placing a piece in the middle row (taking that particular action) is so low, that it's lower than the value for losing the game. This forces any bot playing Connect 4 that uses that heuristic to never play a piece in the middle row. That is a quantifiable error in the player's skill (knowledge of the game) to play the game.

They have mis-assigned value. How badly have they mis-assigned value? This is 100% quantifiable (perhaps instead of it being so low that they never play the middle row, it's only so low that they almost never play it).

It is 100% unique

The map is unique. The map is not unique in regards to balance. Every map will have some overlap with other maps, due to having similar properties.

Same problem as above, balancing for an average only balances for the average and nothing else.

Not necessarily true. You don't believe that there's exists a combination of all N variables that could feasibly create balance for 2 or more skill levels?

Not. Ob-whatever. Are you getting the point yet?

No, I think you're the one not getting the fact that balance can exist across multiple skill levels simultaneously.

How do you determine these things?

By running enough epochs.

No, they would have to be played on the exact same map in the exact same positions (reversed for the players, same for the race). This would be balanced.

Why?

They both have an equal opportunity to win.

Uh... not if one faction happens to have better rushes and the other a better long game. SC2 happens to have taken care of this by attempting to balance all factions for all map sizes as well, but to say that a Game isn't balanced, as a whole because a faction has an advantage on a particular map size is just plain wrong.

Then by that definition SC2 is balanced and we're arguing over nothing.

Of course SC2 could already be balanced. What we're discussing though is whether or not an RTS can be balanced in an automated fashion, and therefore the definition of balance. Balance doesn't mean that both players (necessarily) need to have the same tools. It means players of similar skill levels have equal chances of winning a Match against each other. How big of an advantage they have over a player of a lower skill level is dependent on how much luck is involved in the game, and how big of a difference in skill level exists between them.

I'd also just like to add the thought that since Blizzard considers win statistics for making balance adjustments, they're effectively doing what you're suggesting that your bot would anyway, just slower.

Sort of. They also balance on utterly subjective things like player complaints in forums. Game/match win percentages are objective.

Strategy X won 40% of all competitive matches.

Compare to "Strategy X is so frustrating to play against! It's unfair!"

One is cold hard statistics. One is how a player feels.

Your bot would not provide an objective balance just like Blizzard is not

In your opinion is objective balance possible? If so, how? If not, why?

In my opinion Objective Balance could be possible, but more likely there will be a limit to the number of skill levels you're capable of balancing across and you'll need to leave some number of skill levels out.

You completely missed my point... People play solved games because they themselves have not solved it.

No, I got it, but there are a couple reasons to play a game you yourself have solved.

A. You're keeping someone else entertained (say, your child).

B. Knowing that there are opponent's who haven't solved it, and it's a gambling game (Poker), you're going to take your bankroll to a tournament and hope to encounter opponents who don't know what they're doing (I'm not going to claim that Poker is solved, but various aspects of it are). There's a fairly well known phrase that's along the lines of "Poker is a bunch of people who are good at the game trying to convince people who are bad at the game to give them money."

C. You're the kind of jerk that just likes winning for the sake of winning regardless of the challenge involved.

I was saying that you would have to balance the game around the assumption that the game is solved.

Balance with the assumption that the game is solvable.

Such a balance would only be balanced for the people who have solved the game for themselves, and assuming they have, they would find it pointless because they always know the answer.

If you've truly balanced the game, then yes, you would also have to have solved it as well.

No, it would not only be balanced for the top skill level. You keep assuming that balance somehow is relegated to one skill level. For N variables that we're given and told "pick values for these to make a balanced game" there are many combinations of values that would balance the game for a particular skill level (in fact the points are likely going to estimate an equation). Then we see what point occurs across the most skill levels (or say that you have a tight spiraling staircase of points, you'd pick the point in the middle of that staircase, this would make the game actually unbalanced across all skill levels, but you're hoping that it's "close enough" that human players won't find the little nuances that allow them to "break" the game repeatedly).

To put it in simpler terms, banelings would be overpowered because marines would be balanced around the assumption that the player can perfectly split.

Not necessarily true. See the above.

And to make a comparison to what that would mean in connect 4, it would something such as the lack of understanding that they can use all of the available slots, or that they can win with a diagonal pattern.

Ok, so, Connect 4 is unbalanced for a single game (it's solved, the player going first will always win if they've solved it), and it can be balanced as a match with an even number of games (so if both players have solved it they should always draw the match). If one player has a lack of understanding about the rules, that represents them not being as skilled (for that particular skill set, knowledge of rules) as a player who does. So, say the bot player who checks a heuristic for where to place, and the heuristic says "Every piece you put in column 1 is worth -5000 points" they're (probably) going to avoid column 1. Of course they're going lose against a bot using the same heuristic without that particular misunderstanding of the game. And they should.

and with each post I am finding it to be more evident (i.e. not understanding what a player's lack of information fully entails).

Enlighten me then.

A player's lack of information doesn't necessarily impact balance. Poker is a balanced game and involves a lack of information. The best Poker strategies require you to occasionally pay to see your opponent's hand to gain information about what type of player they are, and what types of bets they make, and to make assumptions about what your opponent is doing, and to what degree of confidence you think that's what they're doing. The best SC2 strategies require a player to scout early and often.

You don't need to run more epochs just because both players are blind. If "Build nothing but Marines" wins when both players are blind, then "Build nothing but Marines" is unbalanced for players who never scout, and that's good to know for balancing for players in that skill level.

It may also be the case that you feel you have to defend yourself rather than your argument, which is clouding your thinking.

No, if something looks ad hominem I call it out with an eye-roll and move on to a real argument.

Divinas
Posts: 57
Joined: Wed Aug 26, 2009 7:04 am UTC

### Re: RTS Balancing - Proven to be NP hard (or even solvable)?

I'd just like to provide some real world perspective on the issue.
This is something me and some colleagues (kinda) did, for a minigame. It was really much much simpler that sc2 and what other people are talking about, but it IS an RTS game: every player had a fixed pool of N units, at any given moment in time he could chose one of two actions: attack a spot from a discrete set with one unit, or send his unit to defend a specific spot. There are special nodes which are helping the player win, the other ones just provide topology. We wanted a specific 'feel' of the game, how the game would unfold and which nodes would be attacked, and to balance against different team sizes (one team would presumably outnumber the other one with ratios as much as 3,4 or even 5 to 1). We came up with some formulae and mechanics of how to do that, and created a genetic algorithm with just a few parameters: how far away the nodes are from your base, are they 'victory-related', how many other players are fighting a battle at that node, how is the battle going, how would such a move alter the topology of the map (as in, we calculated a simple split ratio, of how many other players would be cut off from each other and unable to help each other)

The end results: we found some bots that could play the game adequately, they showed us what the best winning strategies are. Even for so few parameters, getting a good bot would require several days of computation. It was really hard to analyze the data: we mostly had to visually observe how the bot played to be able to analyze his decisions, as well as some charts which showed us how much does each parameter count in the end decision of the bot. While this showed us some of the problems in our design, it provided no solutions. For example, we saw that bots preferred to attack when battles have just started, or are almost over, and almost never while they're in their fiercest moments. It was immediately obvious to us why that is so, given our game mechanics, but it provided 0 ideas on how to fix it. We also ran into a peculiar problem: we didn't want to tell it what to do, and design strategies, rather wanted it to think of those on its own. Designing the heuristics became quite a challenge, because most things that we wanted to measure were the ones that we wanted to use as a heuristic.

Anyhow, that was a little offtopic, but just wanted to say: you can't really slap on a few magic words like 'heuristic', 'genetic algorithm' and 'generation' and hope that everything will work out. Even deciding on the parameters for which your algorithm will optimize is a very very hard, and oftentimes, those parameters are really hard to measure and express as numbers for which to optimze.

As an example, let's presume you're trying to balance something very very simple: you have two abilities for one of your units, and need to balance between them. Ability A lets the unit teleport, ability B lets the unit cloak. What parameters are you going to put for those 2? Mana cost, duration, cooldown, range - that's obvious, but how are you going to measure their utility? If you just standoff bots, how are you going to decide when to cloak? When to teleport? Where to teleport to? Even totally ignoring production, and assuming you have the units from the start, just deciding if at a given time, it should teleport and waste mana or not is prohibitively hard. Factor in 'where should I teleport' and your search space grows even more, because you have to walk every possible position you could telpeport, at every moment in time. Even using a grid with as few as 9 squares, trying to do a min/max tree becomes insane, because for every tick you have another choice to make: should it move? should it stay? should it teleport (if off cooldown?).
Those are details that you're glossing over, but those details make such a thing very hard to even devise, let alone compute.