## My Hobby: Simplifying Sentences

ColdFire
Joined: Thu Mar 25, 2010 9:02 pm UTC

### My Hobby: Simplifying Sentences

[Mandatory long time reader and lurker post]

A+B+A+B+A+A+A+C = 5A+2B+C
Now I was joking with a fellow maths-obsessed friend in Religious Education that we should simplify our names. By our quickly made rules, "Barrack Obama" = "2b4a2rck om".
Sounds silly but quite funny to do, or rather to undo. Yesterday she gave me: "5I8n 8a 3r5g9h15t 4l14e, 7o 3f 2s2d j2c 2u y", which I managed to work out to be the definition of cosine. (Can't remember the exact wording), something very similar to "The ratio of the length of the side adjacent to an acute angle of a right triangle to the length of the hypotenuse".
Today she gave me: "8o2w16i10n6g 14t 9s 10r8a2hf4d 5m2p8l11e 6c 4u5y , 2b", including the comma. This has about 50% more letters, and more unique letters. MUCH harder. At first I assumed it should be impossible, but then I found this: http://www.morewords.com/. Now with this and common sense I can work out these are likely:
"8o2w16i10n6g 14t" = "Owing to"
"10r8a2hf4d" = "Straightforward"
However, there are 33 possibilities for "5m2p8l11e", and simply too many hidden words. Hidden words are words whose letters have all appeared earlier in the sentence, so they vanish completely. For example, "together we run to the beach" = "4t2og5e3h2r w un bac", in which both "to" and "the" become hidden. Before I admit the impossibility, can any of you come up with any masterplan ideas?

(Not sure if this would also benefit from also being in "Language/Linguistics")
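For anyone who wants to try this at home, the rules as described in this post can be sketched in a few lines of Python. This is only one reading of them: the function name `simplify` is made up here, and it assumes case is ignored, punctuation stays in place, and a hidden word simply disappears along with its space.

```python
from collections import Counter


def simplify(sentence: str) -> str:
    """Keep each letter once, at its first position, prefixed by its total count."""
    text = sentence.lower()  # assumption: case is ignored, as in "2b4a2rck om"
    counts = Counter(ch for ch in text if ch.isalpha())
    seen = set()
    pieces = []
    for word in text.split():
        out = []
        for ch in word:
            if not ch.isalpha():
                out.append(ch)  # punctuation stays where it is
            elif ch not in seen:
                seen.add(ch)
                n = counts[ch]
                out.append(f"{n}{ch}" if n > 1 else ch)
        if out:  # a hidden word contributes nothing and is dropped
            pieces.append("".join(out))
    return " ".join(pieces)


print(simplify("Barrack Obama"))                 # 2b4a2rck om
print(simplify("together we run to the beach"))  # 4t2og5e3h2r w un bac
```

Both printed results match the examples given in the post.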

Helldrake
Joined: Fri Mar 26, 2010 4:20 am UTC

### Re: My Hobby: Simplifying Sentences

Are the spaces placed randomly? If they are, the possibilities are too many...

afk2011
Joined: Thu Jan 07, 2010 2:08 am UTC

### Re: My Hobby: Simplifying Sentences

Owing to its straightforward implementation in digital electronic circuitry using logic gates, the binary system is used internally by all modern computers

Helldrake
Joined: Fri Mar 26, 2010 4:20 am UTC

### Re: My Hobby: Simplifying Sentences

afk2011 wrote:Owing to its straightforward implementation in digital electronic circuitry using logic gates, the binary system is used internally by all modern computers

WTF...how did you manage to solve this?

afarnen
Joined: Mon May 05, 2008 12:12 pm UTC

### Re: My Hobby: Simplifying Sentences

Helldrake wrote:Are the spaces placed randomly? If they are, the possibilities are too many...

As far as I understand, word order, spaces and punctuation are preserved. But ignoring this, they are still solvable.

Since the first-of-each-letter order is preserved, they are always easier than anagrams, which are themselves solvable if the message is reasonably short and written in a language in which most strings are meaningless (whether syntactically or semantically), like English. Context also helps. However, as letters are used up, this basically becomes an anagram.

I think it would help if repeated and trailing spaces were preserved, which would tell you not only how many words, but how many words fall between any two chosen words (assuming one space between each word in the original message). For visibility purposes, you can use a tilde to mark a hidden word. For example "together we run to the beach" -> "4t2og5e3h2r w un ~ ~ bac"

P.S. Obama's first name has only one "r"

afk2011
Joined: Thu Jan 07, 2010 2:08 am UTC

### Re: My Hobby: Simplifying Sentences

I think it would help if repeated and trailing spaces were preserved, which would tell you not only how many words, but how many words fall between any two chosen words

Yes, some sort of thing to indicate how many words or how many words of how many letters each would help reduce the lossy-ness of this. At this point, most messages could be lots of things. For example, after the comma you could rearrange almost any of those letters and stuff to make different words and it would still be valid.

ColdFire
Joined: Thu Mar 25, 2010 9:02 pm UTC

### Re: My Hobby: Simplifying Sentences

WOW, I really didn't think this would be possible, mind blown. I did post in some other places, but I guessed correctly that XKCD would have the best selection of intelligent people who would also be happy to try things like this. The subject of binary was something I should have guessed, it's one of my current teases that we discovered recently in a chemistry lesson that she had completely misunderstood how binary works.

Ok then, the solution, if I understand correctly you made this into a list of letters, I work this out to be:
"oooooooowwiiiiiiiiiiiiiiiinnnnnnnnnnggggggttttttttttttttssssssssrrrrrrrrrraaaaaaaahhfddddmmmmmpplllllllleeeeeeeeeeeccccccuuuuyyyyybb"
Then ran it through an anagram solver. As you say, this negates the advantage of knowing the order of the first instance of each letter, and the limited spacing and punctuation that is given. However, I cannot find an anagram solver that would support this size of input.
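Expanding an encoded string into its bag of letters by hand is error-prone, so here is a small Python sketch that does it. The name `expand` is just an illustration, and it assumes every letter is preceded by an optional decimal count, with anything else (spaces, commas) ignored.

```python
import re


def expand(encoded: str) -> str:
    """Turn '2b4a2rck om' into 'bbaaaarrckom': each letter repeated by its count."""
    pairs = re.findall(r"(\d*)([A-Za-z])", encoded)
    # A missing count means the letter occurs exactly once.
    return "".join(letter * int(count or 1) for count, letter in pairs)


puzzle = "8o2w16i10n6g 14t 9s 10r8a2hf4d 5m2p8l11e 6c 4u5y , 2b"
print(expand("2b4a2rck om"))  # bbaaaarrckom
print(len(expand(puzzle)))    # 133
```

The 133 letters agree with the letter count of the sentence afk2011 posted, which is a quick sanity check on any hand-made list.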

Thanks so much for answering this, I think I will have to spend a bit more time on these forums. If you could explain exactly what you did to solve this I would be even more grateful, if that is possible

[EDIT] I completely agree, the only way I can see of making these possible to work out without the aid of a computer is to include hidden word markers.

redrogue
Joined: Tue Dec 15, 2009 9:17 pm UTC

### Re: My Hobby: Simplifying Sentences

I would add some kind of annotation to the end, indicating word length.

The above sentence would become: 6i2w7ou2l7d4asm3ek7nf5t2hc2gr (1 5 3 4 4 2 10 2 3 3, 10 4 6.)
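A rough Python sketch of just the annotation part, assuming a "word" is any unbroken run of letters and everything else (spaces, commas, apostrophes) is copied through unchanged. The function name `length_annotation` is invented for this example.

```python
import re


def length_annotation(sentence: str) -> str:
    """Replace every run of letters with its length, keeping all other characters."""
    lengths = re.sub(r"[A-Za-z]+", lambda m: str(len(m.group())), sentence)
    return f"({lengths})"


text = "I would add some kind of annotation to the end, indicating word length."
print(length_annotation(text))  # (1 5 3 4 4 2 10 2 3 3, 10 4 6.)
```

With this reading, a contraction like "you're" comes out as 3'2.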

afk2011
Joined: Thu Jan 07, 2010 2:08 am UTC

### Re: My Hobby: Simplifying Sentences

"Owing to" is a given.
So is "Straightforward".

Something goes between "to" and "straightforward" that is a combination of the
letters "owingts". It's obviously not going to be something like "swinging"
"gowns" or "sting". Other common choices such as "the" and "a" are not possible
because there is no 'h' or 'a' yet, so 'its' is really the only thing that fits.

I was at first thinking "simple" for mple, but after putting it into the
anagram finder "implementation" immediately stuck out at me as something that
is definitely a word to follow 'straightforward'.
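The filtering step described above can be sketched in Python: a candidate fits a fragment only if the letters it introduces for the first time, in order, are exactly the fragment's letters. The helper name `new_letters` and the tiny candidate list are made up for illustration; a real attempt would feed in a proper word list.

```python
def new_letters(word: str, seen: set) -> str:
    """Letters of `word` not in `seen`, in order of first appearance."""
    known = set(seen)
    fresh = []
    for ch in word.lower():
        if ch.isalpha() and ch not in known:
            known.add(ch)
            fresh.append(ch)
    return "".join(fresh)


# After "owing to", the fragment "9s" introduces only the letter s.
seen = set("owingt")
candidates = ["its", "the", "a", "sting", "gowns", "is"]
matches = [w for w in candidates if new_letters(w, seen) == "s"]
print(matches)  # ['its', 'sting', 'gowns', 'is']

# "5m2p8l11e" introduces m, p, l, e once o, w, i, n, g, t, s, r, a, h, f, d are known.
print(new_letters("implementation", set("owingtsrahfd")))  # mple
print(new_letters("simple", set("owingtsrahfd")))          # mple
```

The letter test only narrows the field; picking "its" over "sting" and "implementation" over "simple" still comes down to what reads sensibly.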

"Owing to its straightforward implementation" <-- google search

--> the full thing.

It's really more the magic of the internet than my intelligence that made this
easy. If your friend hadn't chosen a phrase that was plastered everywhere, it
would have been pretty much impossible, and like I said, there are still other
possible things that this could have been, e.g.:

Owing to its straightforward implementation in digital electronic circuitry using logic gates,
the binary system is used internally by all cute modern proms

ColdFire
Joined: Thu Mar 25, 2010 9:02 pm UTC

### Re: My Hobby: Simplifying Sentences

Ahh, great work there. That does seem to be a sensible way of completing the problem. It is similar to what I did for the cosine. I must have not made the link between straightforward and implementation before I posted on this forum. I bow down to your general epicalness. Thanks again

EDIT, hmm Tatooine epicalness, interesting one from the cheese grater there, I assume that is far better than normal epicalness

Vesuvius
Joined: Tue Mar 02, 2010 5:47 am UTC

### Re: My Hobby: Simplifying Sentences

Simplifying seems to be misused here. I'd say they actually get more complicated.

Condensing maybe?

ColdFire
Joined: Thu Mar 25, 2010 9:02 pm UTC

### Re: My Hobby: Simplifying Sentences

Vesuvius wrote:Simplifying seems to be misused here. I'd say they actually get more complicated.

Condensing maybe?

Condensing and complicating, definitely. We called it simplifying as it was inspired by mathematical simplification.

feddd
Joined: Fri May 27, 2011 11:23 am UTC

### Re: My Hobby: Simplifying Sentences

Hm very interesting...
I didn't see anything like this before.
I mean, if I want to say "Hello mom" it will be something like this? - 1h1e2l2o2m?

zemerick
Joined: Thu May 12, 2011 6:01 am UTC

### Re: My Hobby: Simplifying Sentences

feddd wrote:Hm very interesting...
I didn't see anything like this before.
I mean, if I want to say "Hello mom" it will be something like this? - 1h1e2l2o2m?

The 2m should have had a space before it. Wherever you can, you need to preserve words. This leads me to another idea that would help make this a reliably solvable system, but it also undoes a lot of the work: simply condense each word on its own, not the entire sentence.

Also, how would this system handle actual numbers within the sentence? Perhaps it should be specified that all numbers must be spelled out.

An interesting idea, and I'll have to think on it a bit more, and see if I can come up with a more reliable variation.

EDIT: Another possibility is to always keep the first letter of a word. This would give you a starting place for each word, but still maintain a lot of the original compression.

EX: That that is, is. That that is not, is not. This is it, is it not?
Becomes: 8T5h4a t i7s, i. T t i n3o, i n. T i i, i i n?

This particular sentence would likely be quite difficult to de-code with the original, as it becomes: 13T5h4a 9i7s. 3n3o, . , ?
With the majority of the words as "hidden" words, it would make this one fairly difficult without stumbling across the exact text. However, using my original solution, we only get: 2Tha 2tha is, is. 2Tha 2tha is not, is not. This is it, is it not? Clearly, this does not work, as it now lacks almost any compression, defeating the purpose. So, I think the new solution is a decent compromise between compression and the ability to decode it.
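To see how little the word-by-word version compresses, here is a minimal Python sketch of it. The function names are made up, and it assumes letters are counted within each word only, ignoring case but printing the first occurrence as written.

```python
from collections import Counter


def condense_word(word: str) -> str:
    """Condense a single word on its own: '2Tha' for 'That'."""
    counts = Counter(ch.lower() for ch in word if ch.isalpha())
    seen = set()
    out = []
    for ch in word:
        if not ch.isalpha():
            out.append(ch)  # punctuation is copied through
            continue
        key = ch.lower()
        if key in seen:
            continue  # later repeats within the word are dropped
        seen.add(key)
        n = counts[key]
        out.append(f"{n}{ch}" if n > 1 else ch)
    return "".join(out)


def condense_words(sentence: str) -> str:
    return " ".join(condense_word(w) for w in sentence.split())


print(condense_words("That that is, is."))       # 2Tha 2tha is, is.
print(condense_words("This is it, is it not?"))  # This is it, is it not?
```

Any word without a repeated letter comes back unchanged, which is why most of the sentence survives as plain text.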

It's definitely interesting, so I'll keep looking at it and see if there's anything else simple that can be done.

dedalus
Joined: Fri Apr 24, 2009 12:16 pm UTC
Location: Dark Side of the Moon.

### Re: My Hobby: Simplifying Sentences

What I think might help is if we indicate the last repetition of every letter (maybe signifying it with a symbol - clearly this isn't needed for unique letters), also preserving spaces, and maybe using underscores so that they're easier to read. E.G (very easy one):

7t5o_2b4e_r_2n__,_2ha_2i2s_-h_qu-e-s-t-i-o-n

This way you restrict the range of some letters, and give a bit more information... 8E3i5th3r_w2a2y_-y5o2u'_3s3l_3g5n__b_-u2ck_-w-i-h_-a_-l-o-g-r_-s-t-n-c-e.

Edit: whoops, made a mistake. Also, this really isn't any shorter than the original sentence, lol.

zemerick
Joined: Thu May 12, 2011 6:01 am UTC

### Re: My Hobby: Simplifying Sentences

dedalus wrote:What I think might help is if we indicate the last repetition of every letter (maybe signifying it with a symbol - clearly this isn't needed for unique letters), also preserving spaces, and maybe using underscores so that they're easier to read. E.G (very easy one):

7t5o_2b4e_r_2n__,_2ha_2i2s_-h_qu-e-s-t-i-o-n

This way you restrict the range of some letters, and give a bit more information... 8E3i5th3r_w2a2y_-y5o2u'_3s3l_3g5n__b_-u2ck_-w-i-h_-a_-l-o-g-r_-s-t-n-c-e.

Edit: whoops, made a mistake. Also, this really isn't any shorter than the original sentence, lol.

Since the last repetition letters are represented on their own, I would subtract them from the numbers rather than leave them.

Anyways, I would be hesitant to add any kind of symbols to it, as that would mean you would lose those same symbols when encoding. Just removing the symbols, and acknowledging that any letter without a number must be its last repetition, will make your examples a little shorter.

Here is a look at the simplifications so far, in order of shortest to longest using the last sentence Dedalus gave:

Spoiler:
Coldfire ( Maintaining spaces for hidden words ):

`8E4i6t2h3r 2w2a2y 4o2u' 3s3l 3g4n  b 2ck    .`

Mine:

`8E4i5t2h3r way y4o2u' s2l 3g4n t b s2ck w a l s.`

Modified Dedalus ( Removed symbols ):

`8E3i5th3r w2a2y y5o2u' 3s3l 3g5n  b u2ck wih a logr stnce.`

Not Encoded:

`Either way you're still going to be stuck with a longer sentence.`

Original Dedalus:

`8E3i5th3r_w2a2y_-y5o2u'_3s3l_3g5n__b_-u2ck_-w-i-h_-a_-l-o-g-r_-s-t-n-c-e.`

Redrogue:

`8E4i6t2h3r 2w2a2y 4o2u' 3s3l 3g4n  b 2ck    .(6 3 3'2 5 5 2 2 5 4 1 6 8.)`

And here's a look at mine and ColdFire's on a much longer sentence ( 416 characters ):

Spoiler:
Coldfire ( Maintaining spaces for hidden words ):

`8F18o18l4w35i33n12g 29t16h43e 15c8m, 3p28a, 17r5y, 15s, 14u, 16d     4B    K    , v   ,        - ,          ,       ,   q       .`

Mine:

`6F15o14l2w32i33n11g 20t14h39e e11c5m, 3p21a, m16r5y, 13s, c11u, a14d c i o G B a t U K f t e c, v t B E, a o t U S s t m-t c, t e l h b w d a t w, b t l l o i d, a h aq u a l f i m r.`

I'm interested to see if anyone can figure out this new sentence from either of those encodings without cheating and searching the web! I think that, even keeping the first letter, this one could prove quite difficult, but if anyone can do it... they surely browse these forums :)

zemerick
Joined: Thu May 12, 2011 6:01 am UTC

### Re: My Hobby: Simplifying Sentences

Anyone? No one?

Well, I've come up with another tweak that can be applied fairly easily to most any variation of these, and that can help make longer sentences easier to decipher.

Replace hidden words with a number for the word length...or in the case of my version that has no hidden words, it's only used for those without numbers. So, if you see a number by itself, or at the end after any letters, then that is the length of that word. I believe the initial words that have multiple letters do not need this though, as you will notice.
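Both afarnen's tilde marker and this word-length idea come down to choosing what to print in place of a hidden word, so one Python sketch can cover the two for the whole-sentence encoding. The function name `simplify_marked` and its `hidden` parameter are invented here, and it assumes case is ignored and punctuation attached to a hidden word is printed after the marker.

```python
from collections import Counter


def simplify_marked(sentence: str, hidden=lambda word: str(len(word))) -> str:
    """Whole-sentence encoding where hidden words are replaced by hidden(word)."""
    text = sentence.lower()  # assumption: case is ignored
    counts = Counter(ch for ch in text if ch.isalpha())
    seen = set()
    pieces = []
    for word in text.split():
        out = []
        has_new_letter = False
        for ch in word:
            if not ch.isalpha():
                out.append(ch)  # punctuation is copied through
            elif ch not in seen:
                seen.add(ch)
                has_new_letter = True
                out.append(f"{counts[ch]}{ch}" if counts[ch] > 1 else ch)
        if not has_new_letter:
            letters = "".join(ch for ch in word if ch.isalpha())
            out.insert(0, hidden(letters))  # marker first, then any punctuation
        pieces.append("".join(out))
    return " ".join(pieces)


text = "together we run to the beach"
print(simplify_marked(text))                  # 4t2og5e3h2r w un 2 3 bac
print(simplify_marked(text, lambda w: "~"))   # 4t2og5e3h2r w un ~ ~ bac
```

A bare number then always means a hidden word of that length, which is the reading used in the spoilers below.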

So, the original Coldfire, maintaining spaces with letter counts gives us:
Spoiler:
8F18o18l4w35i33n12g 29t16h43e 15c8m, 3p28a, 17r5y, 15s, 14u, 16d 8 9 2 5 4B 3 3 6 K 4 3 10 7, v 3 7 6, 3 2 3 6 6 5 3 3-9 7, 3 7 8 3 4 6 9 6 3 5, 6 3 7 8 2 13 9, 3 3 q 3 2 6 6 2 4 7.

and using my first letter with the new letter counts gives:
Spoiler:
6F15o14l2w32i33n11g 20t14h39e e11c5m, 3p21a, m16r5y, 13s, c11u, a14d c8 i9 o2 G5 B7 a3 t3 U6 K7 f4 t3 e10 c7, v3 t3 B7 E6, a3 o2 t3 U6 S6 s5 t3 m3-t9 c7, t3 e7 l8 h3 b4 w6 d9 a6 t3 w5, b6 t3 l7 l8 o2 i13 d9, a3 h3 aq8 u3 a2 l6 f6 i2 m4 r7.

As you can see, this does unfortunately increase the final size again by a fair bit, but I believe it will make ColdFire's solvable for this rather long sentence, and mine should probably be fairly easy. At least a 90%+ solve anyways; there are a couple of odd words that prove troublesome. If we look at the actual sizes, we'll see: ColdFire+Spaces is 129, ColdFire+Spaces+Letter Count is 181, Mine is 183, Mine+Letter Count is 239, and the original is 416.

I do worry though, that without adding any symbols or anything, we can't really get a better accuracy than my newest version without losing much of what makes this interesting. We could, for example, only remove and count vowels. This would however be mostly readable without any work, actually just turning the sentence into internet speak. It would not appear mathematical. Other variations, as shown, would actually increase the size of the sentence rather than "simplifying" it.

Well, again, please let me know if you are ever able to decode the above messages. It would be best to try from the first one in the previous post and work your way down through the easier versions, so we know what the hardest one actually solvable is. Please spoiler any solve attempts for the time being too, so that others can make a fresh solve attempt at the harder ones.