1D battleship strategy

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Larson, Moderators General, Prelates

1D battleship strategy

Wasn't sure whether to put this under Science ('cause of the reason the problem came up in the first place), Computer Science (since it's kind of an algorithm), or Math (since all algorithms are just math anyways), but here goes.

My problem: I have lots of biological samples, taken from slices of a fruit fly egg embedded in slicing medium. Because of the size of the egg, I don't know whether a particular tube has any egg in it, or just the slicing medium. The eggs are approximately 8 slices long, so if one of the samples has an egg, 7 of its neighbors will. I did the samples in 6 sets of 18, so for any set of 18 samples, there are 0 or 1 eggs, and knowing where the egg is in one set tells you nothing about where it is in another.

I can check the samples for egg, but with the following caveats:
• I can only check them 10 at a time (or rather, there's no benefit to doing fewer than 10 at a time)
• It's moderately expensive/time consuming to do the check
• If I screwed up the slicing and there are less than 8 slices, the whole set of slices is worthless.
• The downstream process is very expensive, so I want to find only the samples that actually have egg in them.
I liken this to the game of Battleship, where you know how big your opponent's ships are, but not where they are, and you have to find them as quickly as possible. I assume some sort of divide-and-conquer strategy would be the best, but I'm not sure of the actual spacing to try them out in. Any suggestions?

For the truly geeky, try to generalize to sets of N samples, of which M will have egg, and you run them K at a time.
Ten is approximately infinity (It's very large)
Ten is approximately zero (It's very small)
rflrob

Posts: 234
Joined: Wed Oct 31, 2007 6:45 pm UTC
Location: Berkeley, CA, USA, Terra, Sol

Re: 1D battleship strategy

Initial thoughts -

Until an egg is found, any samples to be tested from the same set should be spaced out by at least 8 slices. Otherwise we risk getting redundant information (except if it's a bad sample). Better to find the eggs first and then work out where its boundaries are.

Not sure about divide and conquer because is it costs more to test n slices than it does to test n/2 slices. We could better use those slots by searching more sets at once, since each set is independent.

Assuming we will never have more than 1 egg in the same set, we can stop testing in 8+ slice intervals one once we find an egg or have tested enough samples to rule out an egg being there.

Do we need to find all of the eggs, or just maximise our chances of finding more in a given test? If we don't need them all we may want to abandon or delay searching a set without completely ruling it out.

If we label your samples like this:
a01 a02 a03 ... a18 b01 b02 b03 .... b18 ... f18

I would start by testing something like
a06 b06 c06 d06 e06 f06 a12 b12 c12 d12
(*edit* as mfb points out below, only two points are needed to search one set, so lets spread them out evenly)

Then, for each of the sets where we find an egg, we can then search the two each side of it on the next go, and then either side of that on the go after that, etc. Once we get a miss we stop searching in that direction.

Continue to pick unexplored segments of 8 spaces until there are none left, or test extra neighbours of found egg slices

Given we have found part of an egg, and know nothing about the (up to) 7 neighbours either side, we can calculate the probability of an adjacent slice being a miss based on the number of slices found already. We can use this to prioritise.
Last edited by Mat on Sat Aug 27, 2011 4:03 pm UTC, edited 3 times in total.

Mat

Posts: 395
Joined: Fri Apr 21, 2006 8:19 pm UTC
Location: London

Re: 1D battleship strategy

I am a bit unsure how the numbers 8, 6 and 18 are connected:
A sample is the basic unit? So in a set of 18 samples, you may or may not have 8 consecutive slices with an egg - and if there are less, you don't care? Why is it impossible to have 2 eggs there?

As you have 12 disjoint possible positions for eggs, you need 20 checks to find all possible eggs: For example, you can check position 7 and 12 in each set. If both do not contain an egg, the set does not contain one.
If you want to find the position and length of each egg candidate, you will need more checks - that is a tricky because of the 10-check-rule: While you do check11+12 for possible egg candidates, you can already check for the positions of known egg candidates. I think a perfect algorithm will need a bit more knowledge: How many egg candidates (slices with eggs) belong to real eggs, and how do failed eggs look like?
mfb

Posts: 803
Joined: Thu Jan 08, 2009 7:48 pm UTC

Re: 1D battleship strategy

Mat wrote:
If we label your samples like this:
a01 a02 a03 ... a18 b01 b02 b03 .... b18 ... f18

I would start by testing something like
a06 b06 c06 d06 e06 f06 a12 b12 c12 d12
(*edit* as mfb points out below, only two points are needed to search one set, so lets spread them out evenly)

Then, for each of the sets where we find an egg, we can then search the two each side of it on the next go, and then either side of that on the go after that, etc. Once we get a miss we stop searching in that direction.

By coincidence, that's exactly the labeling scheme I'm using.

If my goal is to "sink" the battleship by completely covering it, then this does seem like the best strategy. If I only need to know where the ends are, though, it seems like I might be able to do better with something like the following (in parallel across the A-Fs):
if A06 is a hit, then look at A03 and A09. If only one is a hit, then shift left or right appropriately for the next round (e.g. [A02 and A08] or [A04 and A10]), or if they're both hits shift outwards [A02 and A10].

I'm not sure if I'd be better off starting with A03 and A09, or maybe A02 and A10, or A04 and A10, or...

mfb wrote:I am a bit unsure how the numbers 8, 6 and 18 are connected:
A sample is the basic unit? So in a set of 18 samples, you may or may not have 8 consecutive slices with an egg - and if there are less, you don't care?

Sounds like you have the right idea, a sample is the basic unit and they come in independent sets of 18 samples. If there are fewer than 8 consecutive samples with egg, then most likely I've screwed something up and the samples are worthless to me.
Why is it impossible to have 2 eggs there?

Each series of 18 slices comes from a separate block that was sliced. I know that I only put one egg into that block, but when I do the actual slicing, I can't tell exactly where in the block the slices came from relative to the egg (there's a small, somewhat random amount that I lose off the front while getting the slicing set up).
As you have 12 disjoint possible positions for eggs, you need 20 checks to find all possible eggs: For example, you can check position 7 and 12 in each set. If both do not contain an egg, the set does not contain one.
If you want to find the position and length of each egg candidate, you will need more checks - that is a tricky because of the 10-check-rule: While you do check11+12 for possible egg candidates, you can already check for the positions of known egg candidates. I think a perfect algorithm will need a bit more knowledge: How many egg candidates (slices with eggs) belong to real eggs, and how do failed eggs look like?

Not sure what you mean by "real eggs" and "failed eggs". There aren't any false positives, if that's what you mean. As far as failed eggs, a couple ways I might have screwed up:
• The egg is too close to the start or end of the slices, e.g.: ++++-------------- or -------------+++++
• In processing the egg slices, I messed up one or more intermediate samples, e.g.: ----+++-++++------ or ---+++-+-++-------. I'll need to think really hard about my protocol, and whether or not having a single gap is acceptable, although more than 2 or 3 probably isn't.
• In processing the slices, I messed up one or more of the slices at the edge of the egg: ------+++++-------

Perhaps in the downtime I have while A6,A12,B6,B12,C6,C12,D6,D12,E6, and E12 are running, I'll code up a simulation. If I do it in a way that's not too bad, I may put up the code and see if other people can come up with better picking algorithms.
Ten is approximately infinity (It's very large)
Ten is approximately zero (It's very small)
rflrob

Posts: 234
Joined: Wed Oct 31, 2007 6:45 pm UTC
Location: Berkeley, CA, USA, Terra, Sol

Re: 1D battleship strategy

>> As far as failed eggs, a couple ways I might have screwed up:
That is what I mean with failed eggs: Positive samples, but not a useful egg. Combining the second and the third point means that you have to test all (or nearly all if 1 gap is acceptable) slices which are part of the egg.
A real egg are 8 slices with egg inside.

>> if A06 is a hit, then look at A03 and A09. If only one is a hit, then shift left or right appropriately for the next round (e.g. [A02 and A08] or [A04 and A10]), or if they're both hits shift outwards [A02 and A10].
That means that you need something like 4 (?) rounds. But there can be better solutions, depending on the number of eggs: Imagine the case of only 1 egg in the 6 sets: If you find the position with the first 10 samples, you can check 8 samples near the hit (2 checks are needed for the remaining possible eggs) at once.
mfb

Posts: 803
Joined: Thu Jan 08, 2009 7:48 pm UTC

Re: 1D battleship strategy

Relative costs matter.

Ie, imagine you have a defective 7 length egg. You could find one end of the egg, then assume the other end is 8 away. But if the egg detection divided by defective rate is cheap compared to wasting time processing a defective egg, then you'd want to confirm both edges of the egg.

Are long eggs a problem? Are unknown long eggs a problem? Ie, is an egg of length 9 a problem? Is it a problem to think that the egg is 8 long, and it is actually 9 long?
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Yakk

Posts: 10039
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove