Development Discussion: Speeding up battle calculator (and thus Hard AI)
Alexei Svitkine last edited by Alexei Svitkine
Not sure if this is the best place to discuss this, since this is a more of a technical discussion on the code side as opposed to user-visible features...
Anyway, with the recent investigation into performance related to the battle calculator that's heavily used by Hard AI, I figured it's worth to discussing what would be the ideal implementation that could be much faster.
When battles are simulated, a copy of the game data (every state in the game) is made, so that when we perform battles in simulation, they don't change the real game. Obviously, this is really slow. So how can we make this better?
It seems there's a few ways to approach this - that could potentially be combined together.
We could try to do fewer calculations. For example, I added some logging code about what was being calculated and noticed some duplicates - the same battle was being calculated during two different parts of moveUnitsToDefendTerritories() (calculateBattleResults() called directly from there and through estimateDefendBattleResults()). So perhaps some kind of cache.
We could even generalize this further if we're careful - for example, if we're already simulated the attack of a given force against 2 infantry, that result could be re-used for a battle at a different territory that's also defended by two infantry. The challenge is of course taking into account anything that might affect the battle result (which may be hard to reason about all - since there's so many different techs and game rules - e.g. fighter scramble, that we'd probably need to enumerate them all somewhere and classify which can affect battles to decide what information needs to be taken into account of).
When copying game data, we could copy things selectively. For example, we probably just need all the technology and the territories involved in combat (which may need to include adjacent territories for scramble?). However, for this, we'd need custom copy logic instead of just serializing everything.
We can try not to copy the game data. For this to work, we would need to ensure several things:
a. That the combat code doesn't directly modify GameData but instead does it through an abstraction that we can have a custom implementation of.
b. That the battle calc isn't being done at the same time as the real GameData is changing. Assuming (a) is already in place, this should be safe to assume if battle calc is being done by the AI on its turn, but not if the player is using the battle calc (because they may not be the active player - or in theory they could bring up battle calc while doing their moves in parallel). So we'd still need to support a mode that does copy GameData, but for the purposes of Hard AI (which is what's slowed down the most by battle calc), we could have the fast mode.
Any other ideas? I'm thinking that (4) actually seems the most promising. We could start with a boolean flag on the battle calc that specifies not to copy game data and audit all the places that the Fight code is making changes back to GameData that need to be intercepted. In terms of keeping state, as long as the Fight code is keeping its own state (e.g. which units are left in the battle, etc), rather than querying it from the Territories and such, I think it may be pretty doable. I can look into it.
@Alexei-Svitkine best place to discuss this is probably git but you have been doing that! GOOD JOB! Hopefully this gets you some more feedback! Thanks for the hard work!
@Alexei-Svitkine Related to this, I've a game (a map), that I hope I'm close releasing, in which I have so many triggers and stuff that the battlecalculator takes 6 seconds to load the data, on my system. This must be because it is reading all the pointless triggers, instead of looking only at the things it actually needs, I suppose. A similar issue may be encountered in savegames of big maps with many units played for several tens of rounds, where you would create a huge game history within the savegame.
@Alexei-Svitkine I would recommend first performance testing to see where the AI spends most of the BC time: copying vs running simulations. Then we'll know where its best to focus time/energy on optimizing.
1 & 2 are a good idea and I've wanted to do this for a while. I'll add that caching results in addition to improving performance will also improve AI consistency and how well it does. There are some issues related to getting different simulation results from the same battle that cause non-optimal decisions in the AI. An example is let's say we have a territory A that is very close on whether we have enough units to win so we simulate attacking just that and get a slightly lucky result. Then the AI tries to attack A & B and this time the simulation for A goes badly so the AI determines it can't attack A & B even if let's say the 2 battles are completely on other sides of the map and have no unit overlap just got an unlucky result.
3 & 4 are a bit more difficult. I think doing things to simplify what needs copying is the best approach so avoid copying things that aren't needed like history, triggers, etc. Essentially just copy anything that relates to battles like units, territories, techs, etc.
- Greedily make copies for AI/BC so that a copy is ready ASAP once simulations are needed
- Within a single delegate phase, I don't believe anything should change that impact battle results as things like triggers/techs/etc only happen at beginning/end of delegate phases.
- Last time I did testing, casualty selection was the majority of the simulation time so any performance improvements there I think would make a major impact.
- The AI already tries to avoid running actual simulations for battles it deems aren't close (either obvious win or loss).
- I would recommend focusing your testing on larger maps like NWO, WaW, Dom NML, Global 40, etc
alkexr last edited by
@Alexei-Svitkine Well, we could speed up the battle calculator by not using it as much. (I'm not overly familiar with how Hard AI works, so this is going to be somewhat of a speculation.) I suppose that in most cases it's not necessary to know the exact win percentage. We just want to know if it's a battle we want to take, which usually means 90+%. In any case, there are ways to discard battles that don't fall into this category without having to simulate the same battle dozens of times.
Here is one example I quickly conjured: instead of simulating the battle a fixed number of times, we calculate (number of losses - number of wins) after every simulation. If this number ever reaches or exceeds +2, we discard this battle as obviously not good. This fairly reliably gets rid of battles below 50% (with a probability of 1, if we run the simulation an infinite number of times, but even in practice, battles below about 40% will usually be discarded in just a couple of rounds). Now this can also discard good battles (it will discard a 90% battle with probability 1 in 81), but the probability of a given discarded battle being 90+% is merely about 0.06%, assuming uniform priors for the win chance. (More precisely, the chance that any given discarded battle has a win chance larger than p is (-p + 1/p + 2log(p)) / (2 + 2log(0.5)), for values of p larger than 0.5).
This is just one example, of course, probably not the best option in practice. I'm just trying to point out that it does matter what you do with the win chance once you've calculated it, because maybe it's possible to do that without knowing the exact percentage. I can't really tell what's needed, since I don't know what types of battles eat up the most resources and what the AI really wants to know by calculating those battles, but with more information maybe I could do something useful.
alkexr last edited by
The AI already tries to avoid running actual simulations for battles it deems aren't close (either obvious win or loss).
Ahh I got sniped badly.
Its an interesting idea though the AI also runs simulations in parallel so I believe choosing to simulate a battle vs not simulate a battle has a higher cost than saying whether I run 5 or 10 simulations (though the AI already has some logic to determine how many simulation runs and try to minimize that number especially for larger battles). The other challenge is the first few simulations are by far the slowest as they are warming up the casualty selection cache (essentially it determines this is the optimal casualty order given I have these units and saves that to be used in other simulations as this takes a while to determine for large/complex battles). I believe is the majority of the simulation processing time. Once that is populated from the first few simulation runs, the rest become faster.
From what I recall in previous measurements the game data copy dominates. Copying only needed data or redefining the input data structure to not need a game data but only the needed battle data would be a win. From another perspective as well the game data copy scales to the size of the map and history. This can cause battle calc to be difficult to use in late stage games on large maps. That implies that AI would grind to a halt as well and is not scaling for that situation.
One tension and problem is direction and focus. To expound, we can have 'fast by design' or 'fast by happenstance'. There is lot of not good code that has been heavily optimized that puts us in a catch-22 situation. You can't update the code because it is complex and optimized, to get it to be simple and fast you need to update the code (which you can't do). The most significant performance gains are from design, architecture, not strategic fixes. For example with code that is so complex it takes 3 hours to understand 30 lines, we're not going to get it to ever be as fast as possible because we won't be able to see how we can get it so the code does not need to be executed.
With that said, I do have a re-write of default casualty selection in flight that is looking to be pretty fast and much simpler. I think we'll want to continue re-writing large chunks of MustFightBattle and related code as it has all snowballed to be incredibly complex.
From another perspective, there are problems/bugs, if we optimize incorrect code, we could be placing ourselves in a very bad situation where we have fast but buggy code. "Get it right, simple, fast, in that order". We kinda skipped the second step a few times in the name of adding more functionality and features and then getting it fast.
From another perspective, we do want to change game data copy/serialization to not be serialization. I was thinking this problem would be mostly solved when we change how game data is saved. We want the save game data to not be serialization so we can sanely control what constitutes a save game breaking change. For now we've been in a good spot since we are not maintaining save game compat, but soon changes will be rejected if they break compat (which could be extremely trivial changes, and again puts us in a place where we can't make the needed changes to fix things because the code becomes just static, complex, and not the best it could be).
Expounding on that more, I've been thinking beyond strategic re-writes of certain code to simplify it and make it faster, the larger problems would be solved by changing save-game serialization (saves) to be to a text format, make that fast, which then solves both a performance issue and the very major problem of save-game compatibility.
To get there it's hopeful too that we can break apart game data so that it is not quite such the god object. To that I was thinking we could move map data to be a sub-project that does not depend on game-core and provide as main output a 'map-data' object that is a java view of the XML data to later be interpretted as attachments etc. This would in turn help decouple some of the hardcoded and brittle aspects found in XML data, relieve some data found in game-data, create a set of immutable data that we would not need to save, which makes for faster copies and smaller save games.
Circling back around, a lot of the deeper fixes to solve multiple problems are quite heavy lifts, doing smaller lifts in the meantime can help but could make some problems worse as well. Overall it is a difficult situation. IMO backing up, adding good testing and then larger re-writes and then following up with sane performance improvements based on fast-by-design is likely to be the winning approach.
I don't know if this has been thought of or tried before but what about instead of running simulations, why not do something like expectiminimax (https://en.wikipedia.org/wiki/Expectiminimax)?
You can build out a tree where each level has all the possible losses. The chance nodes would then determine what the probability of the possible losses. You might be able to do this without needing to copy the game data or without running lots of simulations. You also might only need to calculate the order of losses once and use that to build out the tree.
That's an interesting idea, the number of leaves on the tree that result in a win divided by the number of total leaves would be the win probability. Each branch would need to have equal probability, so we'd have a tree that is potentially quite large, but perhaps still tractable.
For default casualty selection we cannot do that, we need to know the best next unit to select; AI though arguably should avoid using default casualty selection when simulating battles and is free to use any custom or clever algorithm it can.
An expected MiniMax Tree would be an interesting concept to apply here, but while it sounds reasonable in theory, I think the way battles currently work are way to complex right now for a feasibe implementation.
But in the end I'm convinced that from an AI perspective as well as a resource perspective this is a far better approach than "simulating" battles, so if someone has the endurance to give this project a shot I don't have any objections to this approach
@LaFayette One addition to your statements regarding game data cloning times:
IIRC (I believe this was the case the last time I checked but wouldn't bet on it) a "design problem" the current battle calc brings with it is that any simulated battles are not done in batches, so when fighting, each battle round the code removes the battles casualties from the underlying GameData so the next round of battle can re-use hat GameData to run a "new" battle as if nothing had happened.
This is nice for multiplayer because it updates "real-time" battles on the map by removing casualties, but is really bad for battle simulations because it would require us to have some sort of write-protection-layer on top of a GameData that pretends to pass any changes to the underlying game-data, but actually doesn't. As long as the battles work like this at least partial but still expensive cloning will be mandatory IMO.
I'll take a stab at the ExpectiMiniMax tree. I don't know the code very well so this will be more of a proof of concept. I'll probably miss a lot of the complex parts of the battles.
Once I get something working, I'll push it on Github.
I will read through the posts above when i have time.
right now, i just want to point out, for fastest AI, the order should be combat move, combat, then non-combat and finally purchase. this actually reduces the amount of 'thinking' that both humans and AI have to do.
@RoiEX , @LaFayette , I've written up a proof of concept for the tree. It isn't really expectiminimax since it isn't trying to find the best path. It is just calculating the win/lose/tie probabilities but it follows the same general idea.
I've tested it with some 1940 Global units (that don't give any support) and the probabilities are inline with the battle calculator. I've also done a few small trees by hand to verify the calculations are working.
You can see the code at https://github.com/trevan/triplea/commit/1658df71e12e8fb13ee09d5e2f6a33d1dc91f6ea. It doesn't do every battle situation (such as AA, supports, two hits, etc). I wrote a test file that mocks out ProData so I can test out the implementation.
What do you think? Should I create a WIP PR so discussion can happen in github?
@Trevan I must say I'm impressed, the code definitely looks promising.
I don't really have any objections to create a PR, but I'd rather have a way to actually make use of this class. If it's just gathering dust in the deoths of the repository it would be an unfortunate waste of time.
I think it would be a good idea to compare the new "IBattleCalculator" implementation with the 2 (technically 3, but whatever) existing implementations to see how close the estimated results are. Data is key here.
After that we could consider replacing one of the implementations for the AI, or just add an experimental AI in order for people to test this.
Ideally we get to a point where the "real" simulation results are super close to the estimated results 99% of the time so we're able to use the lightweight code for the hard ai.
We could also try to set up some custom maps to let your implementation battle it out in simulated games against the current hard ai using different IBattleCalculator implementations each.
Not sure what @redrum thinks about this, but I think he'll agree that this definitely looks promising
@RoiEX Agree. At a minimum if we got a decent implementation that's very fast then could probably replace Fast AI's calc as it just uses a very simple strength formula (low bar here).
@RoiEX, I was thinking the PR would be useful just for a place for discussion. But I'm fine with leaving it as is.
I decided to test a battle with 100 infantry attacking 100 infantry and it never finished so I need to look into that.
I've compared the estimates to the FastOddsEstimator and to the BattleCalculator in the UI (I think that is ConcurrentBattleCalculator). The results are basically the same as what ConcurrentBattleCalculator returns (if you use units that don't support each other).
After I've figured out the performance issue, I'll work on adding handling support units so that I can do better comparisons and then I'll post them here.
I've improved the performance and added support handling.
I used TestMapGameData.GLOBAL1940 for the comparisons. I got numbers for ConcurrentBattleCalculator with 200 rounds and 2000 rounds. I also used the PerfTimer to calculate how long it took to run:
1 infantry, 1 armour vs 2 infantry
Hard AI 200 runs: 1984.9 ms, Win: 0.43, Lose: 0.425
Hard AI 2000 runs: 1703.1 ms, Win: 0.4931640625, Lose: 0.388671875
MiniMax: 39.8 ms, Win: 0.506233909545216, Lose: 0.38491296598075025
1 armour, 1 tactical_bomber vs 2 infantry, 1 fighter
Hard AI 200 runs: 2031.1 ms, Win: 0.05, Lose: 0.95
Hard AI 2000 runs: 1709.8 ms, Win: 0.072265625, Lose: 0.9228515625
MiniMax: 56.0 ms, Win: 0.1150459321571266, Lose: 0.7947330200775955
1 infantry, 1 artillery vs 2 infantry
Hard AI 200 runs: 2091.1 ms, Win: 0.425, Lose: 0.52
Hard AI 2000 runs: 1740.2 ms, Win: 0.4521484375, Lose: 0.4580078125
MiniMax: 39.1 ms, Win: 0.4574200859119973, Lose: 0.45742008591199734
1 destroyer, 1 cruiser vs 1 destroyer, 1 cruiser
Hard AI 200 runs: 1832.9 ms, Win: 0.39, Lose: 0.425
Hard AI 2000 runs: 1616.8 ms, Win: 0.4228515625, Lose: 0.427734375
MiniMax: 36.1 ms, Win: 0.4250031963247701, Lose: 0.4250031963247701
1 destroyer, 1 cruiser, 1 battleship vs 1 destroyer, 2 cruiser
Hard AI 200 runs: 2019.2 ms, Win: 0.725, Lose: 0.18
Hard AI 2000 runs: 1787.6 ms, Win: 0.7939453125, Lose: 0.14453125
MiniMax: 39.0 ms, Win: 0.5358330747385637, Lose: 0.34400771480746317
1 infantry, 1 fighter vs 1 infantry, 1 aa gun
Hard AI 200 runs: 1844.4 ms, Win: 0.395, Lose: 0.605
Hard AI 2000 runs: 1727.8 ms, Win: 0.4306640625, Lose: 0.5625
MiniMax: 32.2 ms, Win: 0.6961033280443275, Lose: 0.2000276003717069
The numbers look pretty close except when AA guns or two hit battleships are used. I don't handle those.
@Trevan The speed is quite impressive. Do you think adding in additional unit properties like multi-hit units and AA rolls would be achievable? It might also be interesting to compare it to Fast AI as that uses a very simplistic formula but is very fast.