Repeated roshambo

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Larson, Moderators General, Prelates

Repeated roshambo

EDIT: these rules are old, the net points are calculated by bid*sign(#victims-#killers) instead

OLD:
Imagine a repeated game like rock paper scissors, in which each throw has a bid associated with it, and more than 2 players are allowed. Players start with 10 points, and may not bid more than the number of points they have that round. Points for each round are calculated by bid*(#victims-#killers). For example, if you throw Rock 4, player two throws a Rock 7 and player three a Scissors 3, you would be awarded 12 points, player two awarded 21 points, and player three docked 33 points.

I'm interested to hear thoughts on betting strategies for this game. The only other information you may base your strategies upon is the total rock, paper and scissors played in each preceding round, and the number of points each player has available to bet.
Last edited by robinesque on Sat Feb 11, 2012 12:06 am UTC, edited 1 time in total.
robinesque

Posts: 5
Joined: Tue Jan 17, 2012 4:16 am UTC

Re: Repeated roshambo

Well, first thing: Assuming perfect players, all play at random 1/3 rock, 1/3 paper, 1/3 scissors, as it is the nash equilibrium and any deviation from it can be exploited (but cannot exploit the 1/3 random strategy). So any strategy just modifies the bid.

What is the aim? Have most points after n rounds? That is quite a tricky thing, as you constantly have to keep track which amount others have.

Is it possible to bid 0? In that case, the 2-player game is easy: There is one round with non-zero wins/losses (otherwise the whole game is a draw) after that the winner will bid 0 every time and win the game.
Is it possible to go below 0 points? If not, the 3-player game cam get a similar case, where one player gets more than 15 points.
mfb

Posts: 803
Joined: Thu Jan 08, 2009 7:48 pm UTC

Re: Repeated roshambo

Sorry about the long delay in responding, been horribly busy...Also the rules have changed slightly from what I originally posted above.

Yes, the aim is to have the most points. No, bidding 0 is not allowed. You are allowed to bid -n..-1, 1..n where n is your current score. If n<=0 (so yes, you may have negative points), then you are constrained to bidding -1 or 1.
A player's net on each round = bid*sign(#victims-#killers) where sign is either -1 if #victims-#killers < 0, 0 if #victims-killers==0 and 1 if #victims-killers>0 (R has victims S and killers P, e.g.). The net is added to the player's total for the next round. This means you can lose as much as you bet each round, depending on the total outcome.
Each player starts with 1 point.
Players are ranked by their total points after 1000 rounds.

Not really interested in the 2-player version, because it is degenerative as you pointed out.

Given the optimal strategy is to play randomly, I think that there may be a slight edge to be had by playing a mixed strategy since likely other players will try to do the same. Setting that aside for now, you're right, the only thing to do is modify the bid.

I think the best solution probably involves tempering the Kelly strategy with your position in the pack, perhaps measured by how many std devs you are from the mean.
robinesque

Posts: 5
Joined: Tue Jan 17, 2012 4:16 am UTC

Re: Repeated roshambo

Wait, negative bids?

Then the two player case is clearly a tie. Let P1 always choose Rock, and P2 always choose Paper. P1 always bids -n, where n is their points, and P2 always bids n, where n is their current points. P1 always loses, but since you can have a negative bid, this is a payout for them. P2 always wins, and with a positive bid, again always has a payout. The payouts will always be equal, and thus after N rounds, P1 and P2 will have up to 2N points.

This generalizes easily to any even number of players. Half play X with negative bid, half play the strategy that defeats X with positive bid. If you play X, switching to the strategy that defeats X and going to a positive bid reduces your expected income (as you are now scoring from fewer players) and vice versa. Taking the third option virtually eliminates your income, regardless of your bid. So this is a Nash equilibrium. The question is only how quickly the group can agree to some particular X.

So it's only the odd player games that we need a strategy for now.
All Shadow priest spells that deal Fire damage now appear green.
Big freaky cereal boxes of death.

WarDaft

Posts: 1538
Joined: Thu Jul 30, 2009 3:16 pm UTC

Re: Repeated roshambo

I think you're misunderstanding the rules, or I'm just completely not following what you're saying. Let me re-state them completely:

Each agent starts with 1 point. There are 1000 rounds per tournament.

A legal play must be a list of an integer bid followed by R, P or S.

If the agent has n (more than 0) points, he may bid -n,.....,-1,1,....,n (note that 0 is omitted) else (n<=0) he must bid -1 or 1.
For example, on the first move (-1 R) is a legal play.

A player's net on each round = bid*sign(#victims-#killers) where sign is either -1 if #victims-#killers < 0, 0 if #victims-killers==0 and 1 if #victims-killers>0. The net is added to the player's total for the next round. (R has victims S and killers P, e.g.) This means you can lose as much as you bet each round, depending on the total outcome.

Players are ranked by their total points after 1000 rounds.

Also, agents may not collude, so even though that may be a valid strategy, it is explicitly against the rules.
robinesque

Posts: 5
Joined: Tue Jan 17, 2012 4:16 am UTC

Re: Repeated roshambo

>> (as you are now scoring from fewer players)
He changed the rules, just the sign gets counted.

Btw: P1 always choosing rock with a bid of -n is exploitable by playing scissors with negative bid.
mfb

Posts: 803
Joined: Thu Jan 08, 2009 7:48 pm UTC

Re: Repeated roshambo

I made this handy chart for understanding bids:

bid meaning
R : S > P
P : R > S
S : P > R

-R : S < P
-P : R < S
-S : P < R

In an attempt to come up with the Kelly bet for this game, I started out with the case where there are 4 players total.
b = expected payout
p = chance of winning
(p(b+1)-1)/b = fraction of payroll to wager = f

example:
4 players total
payout: 1:1
chance of this happening:
say I pick R, then I need S >= P to win or tie
Probability of S >= P:
Total number of possible bets: 3^3
number of bets where S>=P: there are 3 bets
since we're just looking at the sum of throws, we need only look at combinations

RRR 0
RRP -1
RRS 1
RPP -1
RPS 0
PPP -1
PPS -1
PSS 1
SSS 1
6/10 results are favorable, so the Kelly bet for n=4 is .2, i.e. 20% of your current score
If there are n OTHER players, then the total number of results is ((n+2) choose 3) (combinations with repetition).

Next, I spent quite a bit of time trying to figure out how many of those combinations were 'valid', i.e. #S>=#P. Eventually I wrote a little script that just brute forced it, and a very obvious pattern emerged: (starting with n=3) 6, 9, 12, 16, 20, 25, 30
A little math later and the number of valid combinations is floor((n+2)^2/4), i.e. quarter squares. If someone can tell me why this function works I would be very grateful; I just know that it does work.

In any case, the quotient of these two formulas, I think, is the probability of not losing
floor((n+2)^2/4)/((n+2) choose 3)
robinesque

Posts: 5
Joined: Tue Jan 17, 2012 4:16 am UTC

Re: Repeated roshambo

robinesque wrote: RRR 0
RRP -1
RRS 1
RPP -1
RPS 0
PPP -1
PPS -1
PSS 1
SSS 1
6/10 results are favorable

Wait, what? Symmetry requires S>P and P>S with the same frequency, assuming non-exploitable strategies. Looking at your list, you have SSS<->PPP, PSS<->PPS, RSS<->RPP, RRS<->RRP and RPS<->RRR each with the same probability within the pairs, and each pair cancels for the expectation value.
Also note that your results do not have the same probability. RPS will occur twice as often as RPP or RSS, as it has 6 out of 3^3 equal options and not just 3.

Neglecting events with P=S (nothing happens), the expected payout is 1, and the winning probability is 0.5 (and 0.5 losing)
Using this numbers, (p(b+1)-1)/b = 0. The best strategy to maximize the expected logarithm of your money is to do nothing (or risk as few as possible).

However, the Kelly strategy is not optimal: You want to beat a lot of opponents, so you need to take more risk in order to have a chance to perform much better than the average.
Imagine you have 1 million other players and play 20 rounds: Chances are good that one player will win all 20 rounds (as 2^20 ~ 10^6 and draws are very unlikely). In order to have a chance to win this, you would have to bed all your points in nearly all rounds (maybe with the exception of the last two, depending on the results of other players). But with just two players, that is obviously a bad idea. The other one could play carefully and you would lose with very high probability.

Considering #S>#P: The probability of #R=r rocks with n other players is (n choose r)
The following probability of #S=#P for fixed even n-r is (n-r choose (n-r)/2), if not #S=#P both #S>#P and #P>#S have the same probability (1-(n-r choose (n-r)/2))/2.
And for even n-r the probability of #S>#P is just 1/2
Introduce a sum over r, multiply (n choose r) with the appropriate probability from above, and you get the chance of #S>#P. However, I don't see how you want to use it.
mfb

Posts: 803
Joined: Thu Jan 08, 2009 7:48 pm UTC

Re: Repeated roshambo

I suppose you're right, but somehow my assumptions worked out pretty well and my agent ranked 3/150.

Probably the other students just wrote really poor agents.
robinesque

Posts: 5
Joined: Tue Jan 17, 2012 4:16 am UTC