Gargantua's Hit-Handicap system ->New A&A Concept

variance

IL should beg him for the privilege of delivering the rightful badge of customizer.
(I am joking in case anyone with no sense of humour gets upset)

You know, this could be done in a face to face game too. You would just keep a running total of 4 things: the number of dice thrown by both sides, and the number of hits made by both sides. At the end of the game you take the number of dice thrown by each side and divide by 6, then subtract the number of actual hits made by each side throughout the game (the result may be positive or negative). Subtract these 2 numbers and you have Garg’s index. It would really not be too difficult. Maybe at each battle the attacker and defender roll the dice, and one of the other players is appointed to adjust the totals after each round of combat. Sounds like something you might do in a tournament or other high stakes situation.

Gargantua

That’s “close”, variance, but it won’t quite work. You’d need to know what the attacking / defending values of the units are to know the “expected value”.

variance

Oh right, so you would add up the attack value and hits made, and the defense value and hits made in each round of combat. Divide the total attack and defense values by 6, and take the difference from the hits each side actually made. Easy to do if using the battle board, especially in larger battles.
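A minimal sketch of that running tally, assuming each die is recorded by the number it has to roll at or under (for example 3 for an attacking fighter). The names `Tally`, `record` and `handicap` are made up for illustration, and the sign convention (actual minus expected, so positive means lucky) is an assumption rather than anything fixed in the thread.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Running totals for one side across the whole game."""
    value_total: int = 0   # sum of the hit values of every die rolled
    hits: int = 0          # hits actually scored

    def record(self, values, hits):
        """Add one round of combat: the value of each die rolled and the hits made."""
        self.value_total += sum(values)
        self.hits += hits

    @property
    def expected(self):
        # A die with value v hits on v faces out of 6.
        return self.value_total / 6

    @property
    def differential(self):
        # Assumed sign convention: positive means more hits than expected.
        return self.hits - self.expected


def handicap(side_a, side_b):
    """Difference between the two differentials, from side_a's point of view."""
    return side_a.differential - side_b.differential


attacker, defender = Tally(), Tally()
attacker.record([1, 1, 3], hits=1)   # hypothetical round: two infantry and a fighter attack
defender.record([2, 2], hits=2)      # hypothetical round: two infantry defend
print(round(handicap(attacker, defender), 2))
```

Keeping only two numbers per side (value total and hits) is what makes this workable at a table: whoever is appointed scorekeeper just adds to four totals after each round.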

Witt

I like it Garg.

Tizkit

Gargantua,

This is definitely a great idea and would be a great tool.

I’m wondering about a slightly different metric though:
Sum[(Battle Score) - (Expected Battle Score)]

You mentioned the two main drawbacks of your calc:
a) Includes all overkill rolls
b) Values all rolls equally

These two factors could produce a false handicap. Consider a match where in the early game one player hits 3/3 on AA fire. (+2.5 Handicap) An unexpected TUV swing of 50+ is very possible from the value of the planes and the loss of their potential rolls. With top tier players that could be enough to slant the game.

If later in the game the player with the lucky AA shots mashes his stack of 50 units into a blocker a couple of times the impact of the extra 100 rolls will almost certainly overshadow the +2.5 Handicap from the AA fire. (and the impact of those rolls could easily throw the handicap into the negatives instead)

So now you can lose to an opponent who was lucky but still has a negative handicap inaccurately confirming their brilliance. :-P

Using the Battle Score differential would compensate somewhat for the above problems because it incorporates a value system to the impact of each roll and ignores overkills.

The actual battle score is easy since it comes with the turn summary. The Expected battle score is harder. I believe you need a simulator to get that reliably because of the path dependency of the battle. (i.e. the second round depends on how the first round went) Unless you know of a reliable way to get expected TUV swing without a simulation?

Perhaps the developers could get the system to run the battle calculator for X trials before each battle and spit out the expected TUV swing with the turn summary so you would get both an actual Battle Score and Expected Battle Score.
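A rough sketch of what "run the battle calculator for X trials" could look like. This is not TripleA's calculator. It assumes a fight to the death, cheapest-first casualties, no special unit abilities, and units given as hypothetical `(combat_value, ipc_cost)` pairs. The fighter and destroyer numbers are assumptions for illustration.

```python
import random


def roll_hits(values, rng):
    """Roll one d6 per unit; a unit hits when the die is at or below its value."""
    return sum(1 for v in values if rng.randint(1, 6) <= v)


def simulate_battle(attackers, defenders, rng):
    """Fight one battle to the death and return the attacker's TUV swing.

    Units are (combat_value, ipc_cost) tuples. Casualties are taken
    cheapest-first, which is a simplifying assumption.
    """
    att = sorted(attackers, key=lambda u: u[1])
    dfn = sorted(defenders, key=lambda u: u[1])
    att_lost = dfn_lost = 0
    while att and dfn:
        # Both sides roll before any casualties are removed.
        a_hits = roll_hits([u[0] for u in att], rng)
        d_hits = roll_hits([u[0] for u in dfn], rng)
        for _ in range(min(d_hits, len(att))):
            att_lost += att.pop(0)[1]
        for _ in range(min(a_hits, len(dfn))):
            dfn_lost += dfn.pop(0)[1]
    return dfn_lost - att_lost


def expected_swing(attackers, defenders, trials=2000, seed=0):
    """Average TUV swing over many simulated battles (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(simulate_battle(attackers, defenders, rng) for _ in range(trials))
    return total / trials


fighters = [(3, 10)] * 5    # assumed: attack at 3, cost 10 IPCs
destroyer = [(2, 8)]        # assumed: defend at 2, cost 8 IPCs
estimate = expected_swing(fighters, destroyer)
print(estimate)
```

The per-battle differential would then be the actual Battle Score from the turn summary minus this estimate, summed over all combats.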

The principle would be the same as your calc, you find the Battle Score differential for both players for all combats and add them together to get an overall result. Plus the metric’s value is already in IPCs which we all understand. Measuring hits/kills is more subjective… kills of what?

Have you tested your method at all for false handicap results?

variance

Great points tizkit.

To fix the overkills problem, what if you limit the Actual Hits on each round of combat to the number needed to reduce the enemy’s force to 0 units.

For example, suppose you attack 1 destroyer with 10 fighters and let’s say you roll 3 hits. 1 hit would be sufficient to kill the destroyer so you only count that as 1 actual hit instead of 3. Drop the remainder.
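As a sketch, the cap is a single `min`. The function name is made up, and it assumes every enemy unit dies to one hit (so no two-hit units).

```python
def capped_hits(actual_hits, enemy_units_remaining):
    """Count only as many hits as there were enemy units left to absorb them."""
    return min(actual_hits, enemy_units_remaining)


# 10 fighters roll 3 hits against a single destroyer: only 1 hit is counted.
counted = capped_hits(actual_hits=3, enemy_units_remaining=1)
print(counted)  # 1
```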

Gargantua

After consideration I’ve decided that Overkill actually isn’t a problem, and I’ll explain why in a second. But before I get to that I will respond to the TUV metric.

TUV metric is what I’ll push for next. We are discussing it in the league general discussion; apparently it’s been play tested with great results. Basically, take expected TUV vs actual TUV received. But this has its own problems. Some battles can go wild. Running a battle calc can give you -20 to +30 on some larger battles, and vary in between those numbers each time you click the button. I’ve seen lots of 50/50 battles like this. So it’s not without error, but I do like it overall.

My vision for this would be: you start the turn, prep all your combats, select DONE combat move. It’s at that point that it calculates all your expected combat results; then you roll dice. The before and after TUVs can then be compared by TripleA for a final cumulative output.

But here’s the thing: there’s lots of kinds of luck in Axis and Allies, and having a low die roll is a luck that is independent from what type of unit it hits. The two should not be confused.

Ultimately - this is what I am trying to prove. That at any given point, when you roll a dice (1 die at a time), how you are performing compared to expected value. So you can wholesomely establish whether it’s entirely been just a dicing or not; regardless of what units got hit in what battles, or where, or on what turn (early hits better than later). The dice don’t care if it’s a battleship or a bomber.

A whiner’s argument is that every time they roll a die, their average is off the median. What we want to prove is whether that’s true or not, so we are isolating and compartmentalizing chance specifically down to each die, each time it’s rolled, one at a time, compared to its expected value.

This is why OVERKILL isn’t a problem. We calculate each dice. If you’re getting diced it will reflect equally in overkill as in normal battles.

Also, in my game against farmboy right now for example, I HAVE to send overkill in order to secure victory; it’s not a choice. The dice have been that bad. Sometimes it takes 3 turns of 3 bombers to get ONE hit. And if I sent 100 bombers and got 10 hits, in one battle or over several battles, those poor results should be reflected. But if I’ve lost some small battles but done average in the large ones, then I want to know that really in the big picture I haven’t been diced.

BACK to TUV calculations: other than the previously mentioned solution, I have another metric in mind that may help a lot, or in a different way. I’m pushing for a cumulative casualty reporting system, so we can know how much plastic we have killed each game, of what type. :) This will help show TUV scores as the game progresses.

Once we start down this type of road of statistical reporting, we will start getting more reports. Like what’s Russia’s kill ratio against Germany, or USA vs Japan, and that kind of thing. We’re going to get a TON of information mined out of the game, and things will continue to evolve from there.

Please understand for now, that the goal is just to prove whether the dice have been cruel or not; independent of when or where. Adding layers after or dissecting that information with different tools for different perspectives is stage #2.

Sorry for wall of text!

CWO Marc

One thing to keep in mind is that there’s a difference between the individual result of a single individual dice roll and the cumulated results of all the dice rolls that a player rolls during an entire game, in the same way that there’s a difference between weather (which is what you get on an individual day) and climate (which is the overall pattern of temperature, rainfall, etc. for a given region over the course of several years).

The cumulated dice rolls in an A&A game are like climate. In principle, they should more or less follow the normal statistical distribution that applies to the number of dice being rolled…and the more often you roll the dice, the more the results should match that distribution. Casino house games are built around this fundamental statistical principle, and this explains why the casino’s blackjack card dealers (for example) are encouraged to keep the game moving as fast as possible in order to play as many rounds as possible during their shift: because the more games are played, the more the results will fit the statistical distribution around which the payouts (which are designed to earn a profit for the house) are calculated.

An individual dice roll, by contrast, is like weather. By its very nature, it’s more prone to variability than a whole bunch of dice rolls taken together…and that’s where an important distinction comes in. If you roll two dice, and you get either two 1s or two 6s, it’s perfectly valid to say that the result doesn’t fit the predicted distribution, given that the highest probability involving two dice is a result that adds up to 7…just as it’s perfectly valid to say that the -20C daily high temperatures that prevailed in southern Ontario and Quebec in the week between Christmas and New Year’s Day did not fit the normal season average of -7C or so. The issue of whether a player is getting “bad dice” in general, however, can only be judged by the cumulative results that he gets over the course of an entire game, not by an individual dice roll (just as the weather of a single day or a single week can’t be used to draw conclusions about whether the climate is changing).

What I’m wondering about Garg’s proposed system, which is certainly an interesting concept, is whether it’s a system that has no effect in the early rounds of the game (at which point the system is simply collecting data, and at which point it can’t draw any conclusions because it’s only got a small statistical sample to work with), and then – once it’s dealing with enough rolls to see whether a player is indeed falling outside the normal distribution overall – which gradually has more and more of a compensating effect on those players who are indeed getting bad dice. (I can’t really tell from the posts in the thread, but it may just be because I’ve only had time to read them quickly.) This also raises a potential point to think about: if the system aims to compensate for players who get excessively bad dice by making their results better fit the normal distribution…shouldn’t it do the same thing to players who get excessively good dice? Nobody ever complains when they themselves get great dice, but I can understand why their opponents might complain about it.

Gargantua

Let’s break this down

LOW SAMPLING NEED NOT APPLY

What I’m wondering about Garg’s proposed system, which is certainly an interesting concept, is whether it’s a system that has no effect in the early rounds of the game (at which point the system is simply collecting data, and at which point it can’t draw any conclusions because it’s only got a small statistical sample to work with), and then – once it’s dealing with enough rolls to see whether a player is indeed falling outside the normal distribution overall – which gradually has more and more of a compensating effect on those players who are indeed getting bad dice. (I can’t really tell from the posts in the thread, but it may just be because I’ve only had time to read them quickly.) This also raises a potential point to think about: if the system aims to compensate for players who get excessively bad dice by making their results better fit the normal distribution…shouldn’t it do the same thing to players who get excessively good dice? Nobody ever complains when they themselves get great dice, but I can understand why their opponents might complain about it.

It does work on low samples. Let’s look at a firm example. I had an EPIC G1 against Mallery29 in late 2017. In just G1 alone, Germany was +9 hits over expected value, and the Allies were -6 under expected value. A Hit-Handicap of 15 units. Massacre. Sure, in theory smaller sampling won’t generally show greater results, but you don’t usually know you’re getting diced in a game until a few turns in, and you have to pick a “compartmentalization point”. Are you just looking at one battle? One turn? Or the whole game? I can’t say for sure how many dice are rolled over G1 on average, or over turn 1, but G1 is probably at least 60 dice rolled? The sampling is instantly enough to start a baseline, which trends from there.
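One way to judge whether a surplus like +9 is big for its sample size is to scale it by the standard deviation of the hit count. This is only an added illustration, not part of the proposed system, and the 30-dice mix below is hypothetical (it is not the actual G1 roll list). `hit_zscore` is a made-up name.

```python
import math


def hit_zscore(values, hits):
    """Standardized hit differential for independent dice.

    values: the hit value of each die rolled (a die with value v hits on v of 6 faces).
    """
    probs = [v / 6 for v in values]
    expected = sum(probs)
    # Each die is a yes/no trial, so its variance is p * (1 - p).
    variance = sum(p * (1 - p) for p in probs)
    return (hits - expected) / math.sqrt(variance)


values = [3] * 30                  # hypothetical: 30 dice that each hit on 3 or less
z = hit_zscore(values, hits=24)    # 24 hits is +9 over the expected 15
print(round(z, 2))
```

The further the result is from 0, the harder it is to put down to ordinary variation at that sample size.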

As for compensation… that’s basically house rules people can figure out to their own standard. Good or bad; and if they so want!

Something else to consider: the way the devs have started coding this at TripleA, it’s recording dice stats nation by nation. Maybe Germany’s hot and Japan folds. What then? :)

Like Weather vs Climate

The cumulated dice rolls in an A&A game are like climate. In principle, they should more or less follow the normal statistical distribution that applies to the number of dice being rolled…and the more often you roll the dice, the more the results should match that distribution. Casino house games are built around this fundamental statistical principle, and this explains why the casino’s blackjack card dealers (for example) are encouraged to keep the game moving as fast as possible in order to play as many rounds as possible during their shift: because the more games are played, the more the results will fit the statistical distribution around which the payouts (which are designed to earn a profit for the house) are calculated.

This is exactly why all die rolls need to be weighed equally, against their expected results per die. Arizona is dry; it shouldn’t rain there often. But if it rains every day there for a year, WTF? Something is off, and you can quantify it by recording each day and comparing it, to see what kind of dice climate you dealt with on a quantifiable level.

Gargantua

To illustrate the point on “overkill” attacks. They should be recorded.

An example from just this turn against Farmboy.
https://www.axisandallies.org/forums/index.php?topic=41042.195

Combat - British
Battle in 98 Sea Zone
British attack with 5 fighters
Italians defend with 1 destroyer
British roll dice for 5 fighters in 98 Sea Zone, round 2 : 1/5 hits, 2.50 expected hits
Italians roll dice for 1 destroyer in 98 Sea Zone, round 2 : 1/1 hits, 0.33 expected hits
1 destroyer owned by the Italians and 1 fighter owned by the British lost in 98 Sea Zone
British win with 4 fighters remaining. Battle score for attacker is -2
Casualties for British: 1 fighter
Casualties for Italians: 1 destroyer

I attacked 4 destroyers that round; in all battles the Allies had 3 or more units attacking. “Overkill” as you would say. I scored 1 hit per battle; his destroyers hit on 3 of 4 defenses. Just destroyers. All his ground units also hit at least once on defense as well.

On average I need to roll about 4 or 5 dice to get a hit on his destroyers, and his destroyers are rolling as if they are about a 5 defense unit. There’s nothing “overkill” about it.
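Since the log already prints hits and expected hits for every roll, a running total per side can be pulled straight out of it. A minimal sketch, assuming every roll line looks like the two in the excerpt above and that the side name is a single word; the regex and the `tally_log` helper are illustrative, not part of TripleA.

```python
import re
from collections import defaultdict

# Matches lines like:
#   "British roll dice for 5 fighters in 98 Sea Zone, round 2 : 1/5 hits, 2.50 expected hits"
ROLL_LINE = re.compile(
    r"^(?P<side>\w+) roll dice for .*: (?P<hits>\d+)/(?P<dice>\d+) hits, "
    r"(?P<expected>\d+(?:\.\d+)?) expected hits$"
)


def tally_log(lines):
    """Sum dice, hits and expected hits per side over all roll lines."""
    totals = defaultdict(lambda: {"dice": 0, "hits": 0, "expected": 0.0})
    for line in lines:
        m = ROLL_LINE.match(line.strip())
        if not m:
            continue  # skip casualty lines, battle headers, and so on
        t = totals[m["side"]]
        t["dice"] += int(m["dice"])
        t["hits"] += int(m["hits"])
        t["expected"] += float(m["expected"])
    return totals


log = [
    "British attack with 5 fighters",
    "British roll dice for 5 fighters in 98 Sea Zone, round 2 : 1/5 hits, 2.50 expected hits",
    "Italians roll dice for 1 destroyer in 98 Sea Zone, round 2 : 1/1 hits, 0.33 expected hits",
]
totals = tally_log(log)
for side, t in totals.items():
    print(side, round(t["hits"] - t["expected"], 2))
```

For this one round that comes to -1.5 for the British and about +0.67 for the Italians.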

How have the dice been treating you? Now you know :)

Gargantua

What we are effectively demonstrating should be called “Underkill” lol.

Omega1759

Regarding the “overkill”, the calculator can simply compare the outcome (# of hits or TUV damage) to the expected result (again # of hits or TUV damage).

Regarding the TUV logic, we would need to figure out how to factor retreat. Once a retreat is called, do you reduce the expected TUV to the number of rounds that were rolled?

Expected TUV appears to be the way to go. Could also add the value of territories / NOs gained in that equation.

Once all this is set up, maybe we can train an AI to read game scripts and play the game. :-D

Then skynet is born and we end up playing table top! :-D

Tizkit

@Omega1759:

Regarding the TUV logic, we would need to figure out how to factor retreat. Once a retreat is called, do you reduce the expected TUV to the number of rounds that were rolled?

That’s a good point. In an intentional strafe you might tend to look unlucky if you use pre-battle expected TUV vs actual. You are giving up the TUV swing of later rounds on purpose but the calc would attribute it to luck.

Tizkit

@Gargantua:

To illustrate the point on “overkill” attacks. They should be recorded.

I’m thinking of overkill a bit differently… not as sending extra to account for the variance in dice outcomes and making sure that you get the kill, but actual hits that are superfluous. In your example there isn’t overkill in that sense because you achieved just enough hits to clear his units. “evenkill” perhaps?

In the example of a 50 unit stack marching into a blocker, let’s say your expected hits are 20, but you only hit 10. The hit handicap calc would yield a -10 result, but this is superfluous data. It’s true that your rolls were poor, but they didn’t really have any game impact.

It would be nice if those could be eliminated, especially if you’re house ruling bonuses based on the hit handicap.

What about using the original calc when units remain, but if they are cleared your hit score is:

+0 ->if expected hits were greater than necessary hits
+(Actual) - (Expected) -> if expected hits were less than necessary hits

In the example of your 5x fighters vs the destroyer you would still get the -1.5 hit handicap, but if you happened to roll 5/5 you would score 0 rather than +2.5 since the extra hits didn’t actually help you.
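Read literally, the two rules and the two worked numbers don't quite line up (the 1/5 roll clears the destroyer with expected hits above the 1 needed, which the "+0" rule would score as 0, not -1.5). The sketch below uses one possible reading that reproduces both numbers: when the enemy is cleared and expected hits exceed the hits needed, bad luck still counts but surplus luck is dropped. That reading, and the name `capped_score`, are assumptions.

```python
def capped_score(actual, expected, needed, enemy_cleared):
    """Hit score for one round under an assumed reading of the capped rule.

    needed: hits required to clear the remaining enemy units.
    """
    diff = actual - expected
    if enemy_cleared and expected > needed:
        # Assumed reading: a shortfall still counts, a surplus is worth nothing.
        return min(0.0, diff)
    return diff


# 5 fighters (2.5 expected hits) against 1 destroyer, which gets cleared.
print(capped_score(actual=1, expected=2.5, needed=1, enemy_cleared=True))  # -1.5
print(capped_score(actual=5, expected=2.5, needed=1, enemy_cleared=True))  # 0.0
```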

Gargantua

@Tizkit:

@Omega1759:

Regarding the TUV logic, we would need to figure out how to factor retreat. Once a retreat is called, do you reduce the expected TUV to the number of rounds that were rolled?

That’s a good point. In an intentional strafe you might tend to look unlucky if you use pre-battle expected TUV vs actual. You are giving up the TUV swing of later rounds on purpose but the calc would attribute it to luck.

Assuming we get to the point where we can have TripleA calculate battles to get the expected TUV gains/losses, this shouldn’t be hard to code and quantify.

After the combat moves, it can just recalculate and replace data for any retreats. It will just do the battle calc again, and fill in the “retreat after round X” data field. EZ.

Gargantua

@Tizkit:

@Gargantua:

To illustrate the point on “overkill” attacks. They should be recorded.

I’m thinking of overkill a bit differently… not as sending extra to account for the variance in dice outcomes and making sure that you get the kill, but actual hits that are superfluous. In your example there isn’t overkill in that sense because you achieved just enough hits to clear his units. “evenkill” perhaps?

In the example of a 50 unit stack marching into a blocker, let’s say your expected hits are 20, but you only hit 10. The hit handicap calc would yield a -10 result, but this is superfluous data. It’s true that your rolls were poor, but they didn’t really have any game impact.

It would be nice if those could be eliminated, especially if you’re house ruling bonuses based on the hit handicap.

What about using the original calc when units remain, but if they are cleared your hit score is:

+0 ->if expected hits were greater than necessary hits
+(Actual) - (Expected) -> if expected hits were less than necessary hits

In the example of your 5x fighters vs the destroyer you would still get the -1.5 hit handicap, but if you happened to roll 5/5 you would score 0 rather than +2.5 since the extra hits didn’t actually help you.

It doesn’t matter if the hits are superfluous or not. What we are establishing is how you perform when you get down to it and actually drop a dice - one at a time.

Here’s the reason:

In the majority of battles, the attacker attacks with what they perceive as a “winning” or “superior” force, meaning they almost always have extra dice and will often hit overkill. If on someone’s G1 100% of their units hit, that’s outstandingly abnormal. But if we don’t count the superfluous hits, it won’t actually show how their dice performed; in fact they’ll only be able to establish when they under-hit.

Forward that over to the end of the game, and both parties will basically have negative values.

This will get worse especially with lots of smaller battles against 1 unit like an inf or destroyer. At worst you will probably only get to -1 and at best you can only score 0. Even if in reality you are hitting -2 or +4.

taamvan

An interesting proposal. Sure is easy to gather the data.

While you’ve already addressed it in your comments, I would mention that luck, in the abstract, and in general, does not matter; most battles are not outcome-in-doubt. You send what you need to win; usually this is 3-4 units + 1 air to blast 2-3 defenders, and it’s a 90% odds battle. Whether the Chinese infantry gets a retal or not, or whether 4 units attack West Russia and 2 survive or 3, really doesn’t move the ball enough to matter.

You could have great, consistent rolls across most of the game, during blowouts, and attrition, but miserable luck during the situations it matters, which account for less than 20% of the total rolls.

Alternately, you could have awful general luck, but simply hit the averages during key battles, which may be all you need to win (for example, when your opponent blows the retal).

When you get 9 hits among 12 “2s”, or get few or no hits at all for an entire round during a big fleet battle, those are luck-excursions occurring when they matter.

Dropping 3 / 5 AAA hits is devastating, but during dark skies, Germany just replaces the bombers and continues the blasting, such is their structural advantage. The same hits would be a lot worse during a fighting combat for Moscow where you lost punch at the start of the fight…

Big battles for Moscow, India, Fleet tend to move towards the averages, such that you don’t know the outcome and have to go to the odds, and there is always a chance for a flub. However, you tend to roll a lot of dice during these battles, which regresses to the mean.

Luck matters, but you have to make it matter as little as possible, to retain the value of skill, vs luck, since skill would be your advantage (we hope!)

The real challenge then is to consistently win, regardless of what the dice do. Good efforts.

Gargantua

The Key benefit of this is being able to quantify how good or bad things have been, and tangibly being able to present it, and have a baseline.

Suddenly one game can actually be compared to another on a somewhat similar scale for how good or bad it has been in general.

For example, let’s say Karl7 says he got diced terribly, and I say oh no, this game I played with Variance was worse. Now we can actually compare the differential, and enjoy our misery together :) Or say I played three games against a top 5 opponent, and won 1/3; I can then compare the handicaps to see if the luck mattered more than the strategies I may have employed. Or maybe some strategies (like continuing to take risks) need to change depending on the status of the handicap. If you’re down a bunch of units, maybe you should be playing more conservatively, or maybe you should press the attack. Whatever.

Maybe we can have achievements on the new forums for winning a game with a greater than 50 unit handicap or whatever lol.

farmboy

I’ve seen bits of the discussion and I saw that my game with Gargantua had been referenced. I’ll first note that this has been easily the most luck I have had in a game since I started league play (and I’ve had more than my share of luck so far).

In G3, my German stack wiped out his Soviet stack on 50% odds. I didn’t just win though, but had about twice as many units surviving as the calculator predicted. Given the situation, I didn’t think I had a choice but to do this, but had the result been even slightly less in my favour, the Soviets could have countered and would have had Eastern Europe open to them. I had nothing following that stack, whereas the Soviets did. This gave me time to recover. Following that, he had a couple more significant combats where his units significantly underperformed, and we have had several rounds (especially in the Pacific) where my destroyer blockers, defending lone subs, or single infantry get hits. None of these combats were individually critical, but the net effect has been that his forces have taken far more attrition and have had fewer options as a result. There have been a few battles where the luck has gone the other way but no rounds where the luck consistently and substantially went against me (except the very last one played, J12).

Obviously, luck and the uncertainty it provides is a key component of the game and, in my mind, makes it far more interesting and fun to play. The luck in that early stack battle was a pleasant surprise but I didn’t feel any guilt around benefiting from it. One should expect that to happen sometimes. And having the odd round where one’s destroyer blockers hit at 80% is going to happen too. This kind of luck is at best usually the difference between defeat now and defeat next round. But the consistency of my luck and Gargantua’s bad luck has had the effect that in a game where I was ready to surrender after G3, I am now in round 12 and have a reasonable shot at winning. I am certainly far closer to winning than my play (especially early on) deserved.

farmboy

On adding a hit differential calculator, I think having this additional info in the game would be quite interesting and useful even though I do agree that not all dice luck in the game is equal. It’s more information but will sometimes lend itself to a misinterpretation of the actual luck of each player.

I do think though that if there is a way to avoid counting superfluous hits, that would make it more accurate. Some dice luck doesn’t matter to the game and so shouldn’t be counted. If 6 fighters attack one destroyer the consequences in game terms of a lucky roll of 6 hits are the same as an ‘unlucky’ roll of 1 hit or an average roll of 3 hits. It doesn’t actually matter to the outcome of the game but the measure would make it appear significant.

A TUV differential might have other problems, but it would solve this since superfluous hits aren’t going to impact on it.

But I thought another way to do it is to take this hit differential counter and modify it when the hits exceed casualties in a combat. That is to say, in combat rounds where the expected number of hits and the actual number of hits both exceed the number of casualties taken, the hit differential of the victor in that combat round doesn’t get counted in the hit differential. This assumes we can break it down by combat round and I’m not sure if that can be done.
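A small sketch of that per-round filter, assuming the per-round numbers are available. The function name and parameters are made up, and "exceed" is taken strictly, so a roll that scores exactly the hits needed is still counted.

```python
def counted_differential(actual, expected, casualties_inflicted, is_victor):
    """Hit differential for one combat round, skipping the victor's surplus rounds.

    casualties_inflicted: enemy units actually removed by this side in the round.
    """
    surplus_round = (
        actual > casualties_inflicted and expected > casualties_inflicted
    )
    if is_victor and surplus_round:
        return 0.0  # both expected and actual hits exceeded the casualties: ignore
    return actual - expected


# 6 fighters (3.0 expected hits) attack a lone destroyer and win.
print(counted_differential(6, 3.0, 1, is_victor=True))  # 0.0
print(counted_differential(3, 3.0, 1, is_victor=True))  # 0.0
print(counted_differential(1, 3.0, 1, is_victor=True))  # -2.0
```

Under the strict reading the 1-hit roll still shows up as -2.0, because 1 hit does not exceed 1 casualty; dropping it as well would need "at least" instead of "exceed".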

If that can’t be done then perhaps in situations where the combat only goes one round, and the expected number of hits exceeds the number of casualties, we could just ignore the hit differential of the winner of that combat. That would at least remove all situations where you send a stack after a lone unit. I think?
