Help me decode this "secret" message


Monika
Welcoming Aarvark
Posts: 3661
Joined: Mon Aug 18, 2008 8:03 am UTC
Location: Germany, near Heidelberg

Help me decode this "secret" message

Postby Monika » Sat May 17, 2014 11:42 am UTC

Can you guys help me decode this "secret" message (which is most likely a German-language text)?

http://pastebin.com/3HVxYUp4

Interpreting these ones and zeroes as bytes and those as letters gives (not all characters can be displayed):
ç-paƒY2¾V§ú‰öt€™å3Äö7M<z–xÄÆ=7UÒ¹G™º¥‰ýXÝ­'§dð¢¼å‹\]Cã'ˆK¬inAsÅgÆe¥È.›eïfHX1Ž16L9õá”eڊ†º±?ãl®T%ÛsÄLTD¤[&ȾgŠãÛùŸ´ØM4dçÂÄS˜1Dy¥ øÓK;7‹LFqÉÅùÄQ¤¥R¸²ÜgyªÔº)78Tt]Pˍö¡³ú¼Êm³‰üd™W/¸áRS)PÉi¬Ö9!.ªµpŠ™8HéÍWM•¤’›æ^âÕþHFèɞ›xX”(O¦GDþl°‚ža¢­@ƒ¢°Õ¢¹('æ'ª­ÜjØ&³»Ý–ÉIw_¼Á©€<côWöí —íEM(þÏ S%ʜ&Àf~8u|óV—\ÿ‹˜Ø•ü¹ê

These are 181 different characters, so it's not some kind of Caesar-like cipher: there are too many characters (even considering that besides letters there are also spaces, periods and maybe other punctuation, and that a German-language text would have a lot more capital letters than text in other languages written in the Latin alphabet). Also, their frequencies are not distributed as letters would be: many appear once, and a few appear 2 or 3 times, or rarely 4 or 5 times:
(Map with the character before the = and the number of times it occurs after the =, some characters cannot be displayed)
{=1, =2, =2, =2, =2,=5, =2, =1, =1, =3, =1, =1, ▒=3, 1, ▒=2, =2, =2, =3, !=1, &=3, '=4, %=2, (=3, )=2, .=2, /=1, -=1, 3=1, 2=1, 1=3, 7=4, 6=1, 4=1, ;=1, 9=2, 8=3, ?=1, ==1, <=2, D=3, E=1, F=2, G=2, @=1, A=1, C=1, L=3, M=4, O=1, H=3, I=1, K=2, U=1, T=3, W=3, V=2, Q=1, P=2, S=3, R=2, ]=2, \=2, _=1, ^=1, Y=1, X=3, [=1, f=2, g=3, d=3, e=3, c=1, a=2, n=1, l=2, m=1, j=1, i=2, w=1, u=1, t=2, s=2, q=1, p=2, =1, ~=1, |=1, z=1, y=2, x=2, =3, =1, =3, =3, =1, =1, =1, =1, =2, =2, =1, =1, =2, =4, 3, =1, =2, =1, =1, =2, =2, =2, =2, ª=3, ©=1, ®=1, ­=3, ¬=2, ¢=4, ¡=1, §=2, ¦=1, ¥=4, ¤=3, º=3, »=1, ¸=2, ¹=3, ¾=2, ¼=3, ²=1, ³=3, °=2, ±=1, ´=1, µ=1, Í=1, Ï=1, É=4, È=2, Ë=1, Ê=2, Å=2, Ä=5, Æ=2, Á=1, À=1, Â=1, Ü=2, Ý=2, Ø=3, Ú=1, Û=2, Ô=1, Õ=2, Ö=1, Ò=1, Ó=1, ï=1, í=2, ê=1, é=1, è=1, ç=2, æ=2, å=2, ã=3, â=1, á=2, þ=3, ÿ=1, ü=2, ý=1, ú=2, ø=1, ù=2, ö=4, ô=1, õ=1, ó=1, ð=1}

I thought it might be a Vigenère cipher with a short password of length 3, 4, 5 or 6 and counted the frequencies, but the result looks nothing like a distribution of letters.
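One way to make the key-length check above more systematic is the index of coincidence: split the bytes into columns by assumed key length and measure how uneven each column's distribution is. The following is a minimal Python 3 sketch, assuming a byte-wise polyalphabetic cipher. The function names and the toy sample are made up for illustration, and nothing here is known to match the student's actual scheme.

```python
from collections import Counter


def index_of_coincidence(data):
    """Probability that two bytes drawn at random (without replacement) are equal."""
    n = len(data)
    if n < 2:
        return 0.0
    counts = Counter(data)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def ioc_by_key_length(data, max_len=8):
    """Average index of coincidence over the columns for each assumed key length."""
    result = {}
    for key_len in range(1, max_len + 1):
        columns = [data[i::key_len] for i in range(key_len)]
        result[key_len] = sum(index_of_coincidence(c) for c in columns) / key_len
    return result


if __name__ == '__main__':
    # Toy data only: English filler text XORed with a hypothetical 3-byte key.
    plain = b'the quick brown fox jumps over the lazy dog and runs away again ' * 8
    key = b'abc'
    cipher = bytes(p ^ key[i % len(key)] for i, p in enumerate(plain))
    for key_len, ioc in ioc_by_key_length(cipher).items():
        print('%d: %.4f' % (key_len, ioc))
```

Uniformly random bytes score about 1/256 (roughly 0.0039), while columns that are really natural-language text under a single substitution score markedly higher. If no assumed key length lifts the score noticeably above 1/256, a short-key Vigenère cipher becomes unlikely.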

So I guess it's not Vigenere. Any ideas what else I could try?

Background, if you're curious: The encoded text was created with a symmetric encryption by a 12th grade student (coded in Perl) and he would like "hackers" to attempt to crack it so he knows how hard it would be. (He's the son of a colleague.)
#xkcd-q on irc.foonetic.net - the LGBTIQQA support channel
Please donate to help these people

PM 2Ring
Posts: 3647
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Mid north coast, NSW, Australia

Re: Help me decode this "secret" message

Postby PM 2Ring » Sun May 18, 2014 8:00 am UTC

Monika wrote:Can you guys help me decode this "secret" message (which is most likely a German-language text)?

http://pastebin.com/3HVxYUp4

Interpreting these ones and zeroes as bytes and those as letters gives (not all characters can be displayed):
ç-paƒY2¾V§ú‰öt€™å3Äö7M<z–xÄÆ=7UÒ¹G™º¥‰ýXÝ­'§dð¢¼å‹\]Cã'ˆK¬inAsÅgÆe¥È.›eïfHX1Ž16L9õá”eڊ†º±?ãl®T%ÛsÄLTD¤[&ȾgŠãÛùŸ´ØM4dçÂÄS˜1Dy¥ øÓK;7‹LFqÉÅùÄQ¤¥R¸²ÜgyªÔº)78Tt]Pˍö¡³ú¼Êm³‰üd™W/¸áRS)PÉi¬Ö9!.ªµpŠ™8HéÍWM•¤’›æ^âÕþHFèɞ›xX”(O¦GDþl°‚ža¢­@ƒ¢°Õ¢¹('æ'ª­ÜjØ&³»Ý–ÉIw_¼Á©€<côWöí —íEM(þÏ S%ʜ&Àf~8u|óV—\ÿ‹˜Ø•ü¹ê

These are 181 different characters, so it's not some kind of Caesar-like cipher: there are too many characters (even considering that besides letters there are also spaces, periods and maybe other punctuation, and that a German-language text would have a lot more capital letters than text in other languages written in the Latin alphabet). Also, their frequencies are not distributed as letters would be: many appear once, and a few appear 2 or 3 times, or rarely 4 or 5 times:


Sorry, I don't have the skills to help break this code, Monika. But that "181 different characters" thing inspired me to do some simple stats on the bit pattern frequencies, and I noticed that your code has a bug. The linked data actually contains all 256 possible 8-bit patterns, but the 181st pattern in the stream decodes as a NUL byte. I guess that whatever you're using to do your analysis decided that this was a \0 terminator byte at the end of the data. Oops!
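The truncation described above is easy to reproduce. Here is a minimal Python 3 sketch with made-up sample bytes (not the real ciphertext): anything that treats the data as a C-style string stops at the first NUL byte, so the count of distinct byte values comes out too low. The helper names are hypothetical.

```python
def distinct_bytes(data):
    """Number of different byte values present in data."""
    return len(set(data))


def truncate_at_nul(data):
    """Mimic a C-style string handler: drop everything from the first NUL onwards."""
    return data.split(b'\x00', 1)[0]


if __name__ == '__main__':
    sample = b'abcabc\x00xyz'  # made-up sample with an embedded NUL
    print('full data:      %d distinct' % distinct_bytes(sample))
    print('cut at the NUL: %d distinct' % distinct_bytes(truncate_at_nul(sample)))
```

Working on the raw bytes (or on the bit string itself) rather than on a text conversion avoids the problem.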

Anyway, here's a hexdump of the linked data.

Code: Select all

00000000  e7 2d 70 61 83 59 32 02  be 56 a7 fa 89 f6 08 74  |.-pa.Y2..V.....t|
00000010  80 99 e5 33 c4 05 f6 37  4d 3c 7a 96 1d 78 c4 c6  |...3...7M<z..x..|
00000020  3d 37 55 d2 b9 47 99 ba  a5 18 89 81 fd 58 dd 10  |=7U..G.......X..|
00000030  ad 27 a7 64 f0 a2 bc e5  8b 5c 08 5d 43 08 e3 27  |.'.d.....\.]C..'|
00000040  88 4b ac 69 6e 0e 41 73  c5 67 c6 65 a5 c8 2e 9b  |.K.in.As.g.e....|
00000050  65 ef 66 02 48 58 31 8e  31 36 4c 17 39 f5 e1 94  |e.f.HX1.16L.9...|
00000060  65 da 8a 86 ba b1 3f e3  6c ae 54 18 25 db 73 c4  |e.....?.l.T.%.s.|
00000070  4c 54 44 a4 12 5b 26 c8  be 67 8a e3 db f9 9f b4  |LTD..[&..g......|
00000080  d8 4d 34 64 e7 c2 c4 53  98 31 44 79 13 a5 14 20  |.M4d...S.1Dy... |
00000090  f8 0a 1b d3 4b 3b 37 8b  4c 46 71 c9 c5 1d f9 c4  |....K;7.LFq.....|
000000a0  51 a4 a5 52 b8 18 12 b2  03 05 dc 67 79 aa d4 ba  |Q..R.......gy...|
000000b0  29 37 38 54 7f 74 5d 08  50 cb 8d f6 a1 b3 0e fa  |)78T.t].P.......|
000000c0  bc ca 6d b3 89 fc 64 0a  99 57 2f 1c b8 e1 52 53  |..m...d..W/...RS|
000000d0  29 50 c9 69 ac d6 39 03  21 2e aa b5 70 8a 99 38  |)P.i..9.!...p..8|
000000e0  48 e9 cd 57 4d 95 a4 92  9b e6 5e 07 e2 d5 fe 48  |H..WM.....^....H|
000000f0  1c 46 1a e8 07 c9 9e 9b  12 78 58 94 28 4f a6 47  |.F.......xX.(O.G|
00000100  44 fe 6c b0 82 9e 61 a2  ad 40 83 a2 b0 d5 a2 b9  |D.l...a..@......|
00000110  28 27 e6 27 aa ad dc 6a  d8 26 b3 bb dd 96 8f 1a  |('.'...j.&......|
00000120  c9 49 77 5f bc c1 a9 80  3c 63 f4 57 f6 ed a0 97  |.Iw_....<c.W....|
00000130  ed 45 4d 28 fe cf 08 09  53 25 ca 9c 26 c0 66 7e  |.EM(....S%..&.f~|
00000140  38 75 7c f3 56 01 97 5c  ff 8b 98 d8 95 fc b9 ea  |8u|.V..\........|
00000150  00 ce f6 f5 89 98 0b 06  0f 32 02 46 52 c4 cc dd  |.........2.FR...|
00000160  40 a9 3b a2 41 c7 1e b4  00 d9 dd ab e4 e7 f6 ec  |@.;.A...........|
00000170  dc 51 3b 13 14 c1 00 d1  ea c6 64 19 2a a4 e9 40  |.Q;.......d.*..@|
00000180  44 f0 c8 bb f0 e4 0c ee  2f 6a 10 7e 33 af b3 ee  |D......./j.~3...|
00000190  29 be cb 5d 34 16 06 12  5d b2 6e 1f ea 4a 3d 0d  |)..]4...].n..J=.|
000001a0  28 e6 31 a3 be 0a 37 83  07 3e 29 f0 de 33 2c 3b  |(.1...7..>)..3,;|
000001b0  a5 d9 a9 23 23 df 6e 6d  aa c2 7a 19 15 99 e3 83  |...##.nm..z.....|
000001c0  ad db 58 97 5b 91 3a ad  d2 c6 05 66 0a 7b 4c b0  |..X.[.:....f.{L.|
000001d0  09 94 ec 62 1b aa b6 ca  9d e6 a9 14 8c ca f4 06  |...b............|
000001e0  60 8a 80 67 90 e2 84 64  fb 05 a1 3c 01 78 a1 d7  |`..g...d...<.x..|
000001f0  79 8f 99 08 de cf 07 c9  80 9d 41 6e 0a 6a 63 e4  |y.........An.jc.|
00000200  58 59 47 2d ee 13 b9 9e  c4 38 a3 ed 74 c0 84 55  |XYG-.....8..t..U|
00000210  5d f6 02 cf 75 38 40 41  41 8a 18 a0 51 0e 52 38  |]...u8@AA...Q.R8|
00000220  b9 ff 27 f6 78 64 cd 56  5d a7 77 dc 47 dc dd 0b  |..'.xd.V].w.G...|
00000230  4d fa 5b d5 b2 f3 00 54  4d 61 a9 2a 6b 51 66 44  |M.[....TMa.*kQfD|
00000240  5d 26 17 ca 04 e1 44 10  a9 75 5c cf 93 1f 5b 0c  |]&....D..u\...[.|
00000250  68 0e bd ce ad 76 a0 c6  5f 5b eb 7a 82 3a ec f8  |h....v.._[.z.:..|
00000260  3d 9e 3a 6c 1c 6f fc 41  4d f7 56 08 c3 fb 5c 16  |=.:l.o.AM.V...\.|
00000270  3d dd 4c 6f 12 be b9 5e  74 cc ed d0 10 a2 a3 64  |=.Lo...^t......d|
00000280  f0 8a 29 5b 16 e5 18 e8  79 d6 ee 4f 22 df a3 f4  |..)[....y..O"...|
00000290  bd 3c 10 c1 fd b5 a2 14  8a 07 a9 da fc 23 e7 1c  |.<...........#..|
000002a0  fc 2a df 54 4c 2f da 9a  2f a9 2a 07 fa 72 9a 1b  |.*.TL/../.*..r..|
000002b0  1d 72 ce 86 e4 fa cb f2  0b 12 57 70 59 09 73 9c  |.r........WpY.s.|
000002c0  31 d6 e2 b4 47 ec e8 a0  23 9e 90 e3 7b 2d ad 8d  |1...G...#...{-..|
000002d0  68 92 a9 25 ee a3 9c 34  3e b2 04 46 0e 16 de 00  |h..%...4>..F....|
000002e0  9c 75 6f d5 0e e1 6f ff  dd 6d 88 5d 72 a4 35 88  |.uo...o..m.]r.5.|
000002f0  6c 11 1f 56 c8 25 3f c5  8d 85 5c 27 24 4c d1 46  |l..V.%?...\'$L.F|
00000300  0d 3f 1a e7 7e 2f 66 9b  ab f8 aa 92 b1 b0 d5 7d  |.?..~/f........}|
00000310  81 c2 32 54 2e 76 66 e4  d2 1f c9 14 d1 8b 91 8f  |..2T.vf.........|
00000320  9d b5 37 8f 5d 0f 79 f8  db 17 18 71 c5 d9 6e 4d  |..7.].y....q..nM|
00000330  38 fd af 48 e3 7a a4 50  97 e7 f0 0f fc 79 95 78  |8..H.z.P.....y.x|
00000340  50 bf 4d 55 20 34 5b b0  b0 4d ff 14 60 e6 9b b1  |P.MU 4[..M..`...|
00000350  ac 4f 55 55 3a 70 2c fb  75 7e b2 74 b0 2f 07 51  |.OUU:p,.u~.t./.Q|
00000360  45 c0 8f d7 81 e0 fd fd  6f e1 c0 03 82 e7 83 cb  |E.......o.......|
00000370  bd 2d 26 3d ec 8d a1 2a  b2 cb 08 c1 ee fc 00 c8  |.-&=...*........|
00000380  ac 66 1e cf 5f 6a f2 0f  9d 32 29 c8 c3 b0 16 77  |.f.._j...2)....w|
00000390  69 60 16 40 57 53 b1 2b  01 a2 ac 74 df 88 6d 0a  |i`.@WS.+...t..m.|
000003a0  ac 03 02 44 98 cd ef 88  81 f4 92 22 fa 9b f8 5b  |...D......."...[|
000003b0  ec 33 c4 0f ae e0 da 77  83 d9 da 5d 37 da 41 ad  |.3.....w...]7.A.|
000003c0  34 8f 65 37 c3 f6 2d c7  e6 c2 e8 45 d8 be eb bf  |4.e7..-....E....|
000003d0  50 9f 0d a7 57 2e 23 32  9d 17 81 40 6d ba ec a4  |P...W.#2...@m...|
000003e0  a0 3a 3d 6e f5 de 57 f6  02 8f f5 2b 83 89 3d 74  |.:=n..W....+..=t|
000003f0  fd 30 84 5c 27 1e b6 72  0c aa 57 50 19 8d 0a f9  |.0.\'..r..WP....|
00000400  49 5c 8b be 75 46 44 df  4b 48 1c 12 77 a4 60 69  |I\..uFD.KH..w.`i|
00000410  e9 5a 07 55 a9 40 fc 9d  bf 3f 60 ce 66 92 8a 18  |.Z.U.@...?`.f...|
00000420  d9 51 35 e1 b3 50 76 c7  ac 3f c7 13 0a 88 8a ab  |.Q5..Pv..?......|
00000430  30 fb 3a cd a3 a3 bc 0d  8c 2a 69 5c 9b 0e 86 ba  |0.:......*i\....|
00000440  21 9d 1f 86 e7 0d 99 a2  d8 56 37 a9 b1 f4 fd 81  |!........V7.....|
00000450  f5 7d 9a c1 09 07 51 f8  3c 96 e9 0e 15 80 3e 67  |.}....Q.<.....>g|
00000460  d9 00 9b 72 c8 f8 ce 00  98 30 1b 2a bd 5b aa 10  |...r.....0.*.[..|
00000470  4d 16 69 22 15 15 0e 2d  1e dc f0 22 e4 16 4a a5  |M.i"...-..."..J.|
00000480  99 55 d9 cb 23 2a 36 e9  d1 db ed 16 8b 84 04 fb  |.U..#*6.........|
00000490  91 ce ca df b6 5d 12 e9  d4 91 fc 73 2a 09 af d2  |.....].....s*...|
000004a0  28 e7 be 94 1e 45 dd 68  21 43 d3 98 96 b4 15 ce  |(....E.h!C......|
000004b0  0d f8 34 b6 1a d1 5f 4c  01 6a 9d da 5b 86 c1 44  |..4..._L.j..[..D|
000004c0  d4 fc cb fd d0 4f 06 77  07 c7 c4 a0 85 1d 98 c5  |.....O.w........|
000004d0  99 6d 1e 57 b7 ec 67 82  fc ca a4 b4 78 9c d5 4b  |.m.W..g.....x..K|
000004e0  40 b9 dc 47 06 2f 43 69  f9 32 0f 5e ed 90 bd ee  |@..G./Ci.2.^....|
000004f0  ed cd 33 be 3d 5f 8e 97  ba 38 58 a8 0d 03 62 17  |..3.=_...8X...b.|
00000500  95 9e e2 f2 7e 9f 89 bf  64 8e 27 2c 7a 19 e6 8b  |....~...d.',z...|
00000510  7d 0b 6e 7e 70 8f cc cd  cf 12 55 5d 28 37 d2 eb  |}.n~p.....U](7..|
00000520  09 73 54 69 80 7f 14 da  22 3f af ca 8e 47 eb 86  |.sTi...."?...G..|
00000530  7d 75 7e bd 40 a6 21 89  3c 68 09 80 b7 b3 ae 6d  |}u~.@.!.<h.....m|
00000540  2c 8d c1 38 50 5f 7b 81  c4 9f 26 b3 79 ef 4a 6c  |,..8P_{...&.y.Jl|
00000550  b1 fb f1 33 a2 40 cf 69  ec 51 1b e9 1a 2c 8f 8a  |...3.@.i.Q...,..|
00000560  69 4e 54 75 1a a2 a9 24  f9 cb a5 d6 6b 43 a3 0b  |iNTu...$....kC..|
00000570  79 0e 19 d4 d7 38 07 1a  4c 28 ab e6 7a 6d 9b 5c  |y....8..L(..zm.\|
00000580  cd d3 74 3e 62 29 32 98  eb fd 86 3a 42 07 42 08  |..t>b)2....:B.B.|
00000590  69 52 44 aa e7 5d 98 da  e8 fd 8d bd 11 62 c4 03  |iRD..].......b..|
000005a0  4c 39 41 5a 15 66 10 70  8a c3 79 a9 5c db 75 e2  |L9AZ.f.p..y.\.u.|
000005b0  f5 c6 03 5a 71 15 79 3a  f8 0e 2f 7a fd a3 a0 92  |...Zq.y:../z....|
000005c0  bc 43 c0 12 2e 24 d5 bf  51 79 a1 f6 3e a9 2b 2b  |.C...$..Qy..>.++|
000005d0  f1 d2 b8 73 3c 8b e3 64  25 6d 12 5b 13 70 bc 86  |...s<..d%m.[.p..|
000005e0  39 52 ec 1e 67 f1 b8 f7  35 d6 77 fc 25 22 4f db  |9R..g...5.w.%"O.|
000005f0  58 09 3f 11 af d1 12 a2  0d ce 54 1c 21 59 70 ee  |X.?.......T.!Yp.|
00000600  ed d4 ff 7f 2f 19 3e 4a  12 fa 2b 4c 81 69 23 6a  |..../.>J..+L.i#j|
00000610  0d c7 6a 9f a6 46 1c 49  aa 1f 81 ca 0d ce 90 75  |..j..F.I.......u|
00000620  e5 ad 51 fa ea 84 81 59  2e b3 33 6f 46 fa 91 85  |..Q....Y..3oF...|
00000630  80 87 b5 bd 9a b4 d2 36  2d 64 66 9c d0 ad 45 aa  |.......6-df...E.|
00000640  4c 0b 87 4e e6 b8 69 4c  32 15 d7 a3 81 21 be b0  |L..N..iL2....!..|
00000650  b4 a4 17 6a 9f 92 ca bb  34 31 3a b2 ad 2d a6 a9  |...j....41:..-..|
00000660  dd 60 40 9c 69 94 3b 1d  19 da 23 18 e2 09 39 3f  |.`@.i.;...#...9?|
00000670  1d 7c 41 cb fe 80 9a 1c  e2 fc 49 2e 49 84 8c 36  |.|A.......I.I..6|
00000680  54 b0 37 e7 8d 2e cd b6  b4 d7 75 12 d8 da d6 a4  |T.7.......u.....|
00000690  bd 58 a6 39 30 80 c1 40  0c 61 9d da ee 17 91 bc  |.X.90..@.a......|
000006a0  29 73 92 00 09 0e 00 1a  4e 6f 45 d0 17 16 20 40  |)s......NoE... @|
000006b0  e8 60 38 02 d7 79 ab 96  53 d1 ba ae d9 9e 8a ba  |.`8..y..S.......|
000006c0  70 a2 72 9d 2e 60 68 eb  9a 18 2a db 56 e6 df e1  |p.r..`h...*.V...|
000006d0  b9 6d b9 38 d6 a8 7f b1  49 57 e1 c4 be aa 3e 82  |.m.8....IW....>.|
000006e0  48 70 1e ac 74 d9 b6 b5  4d a3 6f a9 ca bb 44 ef  |Hp..t...M.o...D.|
000006f0  2d 10 d1 55 ab 54 1f 56  01 7d 0b 50 53 96 36 60  |-..U.T.V.}.PS.6`|
00000700  b9 15 9c ec da 4a 6c 06  78 14 2c 94 1d 98 a5 bd  |.....Jl.x.,.....|
00000710  94 c3 4a cd f3 20 11 f2  62 cb 05 f1 11 63 ef ef  |..J.. ..b....c..|
00000720  48 69 64 9c ca c1 ba c1  a0 2f 79 49 72 d5 c7 6e  |Hid....../yIr..n|
00000730  fd 3f 2e 6d 6c 10 da 43  8a 65 c0 34 62 79 f3 bd  |.?.ml..C.e.4by..|
00000740  1d 7f e0 16 5f c9 9b b3  ba 12 50 46 7c cf 49 b1  |...._.....PF|.I.|
00000750  6c 05 6a ef 21 11 56 d5  b6 7f 6c b1 70 54 ab ab  |l.j.!.V...l.pT..|
00000760  68 06 c4 78 a2 3b 66 98  85 a0 dd 4d 05 fd 00 cd  |h..x.;f....M....|
00000770  3c 31 f1 97 ee 68 54 cf  02 bd a0 0a 89 0f c7 0e  |<1...hT.........|
00000780  1d f7 a8 ae 69 d8 6e 05  78 2a 84 8a 57 c8 1c 52  |....i.n.x*..W..R|
00000790  54 b8 7b 4f 91 6d 49 e3  23 41 fc d4 a0 88 e0 85  |T.{O.mI.#A......|
000007a0  65 b2 73 09 6a c0 7a 2d  bd aa e1 ae 8f 97 db 9d  |e.s.j.z-........|
000007b0  1c 14 ff 9d a8 dc 2b 24  ba 96 fa 2a 7f 7a 77 8b  |......+$...*.zw.|
000007c0  40 e3 da 99 c4 e7 11 21  f2 4b 5d 73 ff fa f9 d2  |@......!.K]s....|
000007d0  88 bf 30 c9 50 09 67 ec  f6 a5 f5 7a 45 3a 12 25  |..0.P.g....zE:.%|
000007e0  59 91 fa 74 20 ae e1 81  9a 0a 25 c1 74 f6 60 1e  |Y..t .....%.t.`.|
000007f0  c8 58 77 22 6f f0 d9 0e  ae 52 31 00 e5 ff 74 51  |.Xw"o....R1...tQ|
00000800  cc 22 72 a8 9d 1d d2 1c  bc ba a2 01 aa 1b b6 66  |."r............f|
00000810  cc 64 15 67 5c ed 92 82  a0 b8 a0 a3 2d 74 8b f5  |.d.g\.......-t..|
00000820  9d a1 79 6d 12 bf f9 c5  ec 11 93 4c 2c 6e 9f 22  |..ym.......L,n."|
00000830  10 81 59 05 05 f6 6b fd  99 11 bc cb 86 82 ca b7  |..Y...k.........|
00000840  7c b9 ac f5 b3 6d 3e 09  51 5c d1 b7 9d e0 62 ad  ||....m>.Q\....b.|
00000850  a9 19 84 5e 40 71 cd e3  c0 ab a2 d7 93 74 37 b8  |...^@q.......t7.|
00000860  c8 fb 9e d4 d3 64 ea 8c  66 5c 4d 68 5c 53 2c 3f  |.....d..f\Mh\S,?|
00000870  60 97 bf 88 c9 7f 4b b0  b0 cd 66 81 24 83 ca dc  |`.....K...f.$...|
00000880  18 4d 30 94 ca 1f 1f e9  3f e9 ed 6b 5e f6 83 0b  |.M0.....?..k^...|
00000890  89 10 d2 7c 7d dd 55 1b  d1 36 0c c3 5f 35 8d 98  |...|}.U..6.._5..|
000008a0  9c bb a0 93 a6 47 f2 08  53 90 ff 97 8f a6 1d 7e  |.....G..S......~|
000008b0  91 bd c0 99 3e 49 11 19  0b 88 76 59 c8 68 ea c1  |....>I....vY.h..|
000008c0  85 bf 47 e2 be 62 b5 9e  2e 4b 0e 6e af 74 99 d5  |..G..b...K.n.t..|
000008d0  40 f3 14 64 e9 d5 a1 54  88 16 55 37 fe 44 87 e9  |@..d...T..U7.D..|
000008e0  a1 c7 ca 70 1b 72 7d 62  59 a8 63 b8 05 b0 59 aa  |...p.r}bY.c...Y.|
000008f0  8d b8 cf c4 7c d0 4c 72  35 97 51 d7 5a fb da 89  |....|.Lr5.Q.Z...|
00000900  08 46 54 c8 29 6f 30 41  04 f0 b5 b8 b4 ab d9 4e  |.FT.)o0A.......N|
00000910  dd 18 65 e9 c1 ed 24 fe  5a 9d 32 4e 0a a5 ea 28  |..e...$.Z.2N...(|
00000920  39 0a 5c 93 44 93 24 05  43 69 cb b9 0b 6d d0 c0  |9.\.D.$.Ci...m..|
00000930  78 d8 96 4e 72 77 e9 01  42 c9 10 2f 06 ba 9e dd  |x..Nrw..B../....|
00000940

And here's the Python 2 program I used to create the binary file:

Code: Select all

#! /usr/bin/env python

'''
 Convert a string of bits to binary.
 Input data is encoded as ASCII '0' & '1'
'''

iname = '/home/michael/Documents/MonikaCodeBits.txt'


def main():
    f = open(iname, 'rb')
    data = f.read()
    f.close()
   
    #Cut data into 8 bit blocks & convert to ASCII chars
    width = 8
    blocks = (data[i:i+width] for i in xrange(0, len(data), width))
    s = ''.join(chr(int(b, 2)) for b in blocks)

    f = open(iname + '.bin', 'wb')
    f.write(s)
    f.close()


if __name__ == '__main__':
    main()


I suppose it's reasonable to assume he's using 8 bit codewords, since the file size = 18944 = 2**9 * 37. But I guess we shouldn't rule out a different word size, or even a variable-width code, as is used in Huffman coding & some versions of LZW.
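As a quick check on that reasoning, the fixed word sizes that fit the stream length exactly can be listed. This is a small Python 3 sketch; the function name and the cut-off of 64 bits are arbitrary choices for illustration.

```python
def candidate_widths(total_bits, max_width=64):
    """Fixed codeword widths (in bits) that divide the stream length exactly."""
    return [w for w in range(1, max_width + 1) if total_bits % w == 0]


if __name__ == '__main__':
    total_bits = 18944  # length of the linked bit string, as stated above
    for width in candidate_widths(total_bits):
        print('%2d bits -> %5d codewords' % (width, total_bits // width))
```

Besides the powers of two, 37 also divides the length, so the length alone does not single out 8-bit words. A variable-width code would not show up in this check at all.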

FWIW, here are some histograms of the data.

Code: Select all

Width = 1. 2 block patterns found.

Sorted by block pattern:
0: 9562
1: 9382

Sorted by block count:
1: 9382
0: 9562


Width = 2. 4 block patterns found.

Sorted by block pattern:
00: 2356
01: 2426
10: 2424
11: 2266

Sorted by block count:
11: 2266
00: 2356
10: 2424
01: 2426


Width = 4. 16 block patterns found.

Sorted by block pattern:
0000: 311
0001: 303
0010: 278
0011: 252
0100: 304
0101: 293
0110: 296
0111: 277
1000: 292
1001: 327
1010: 338
1011: 279
1100: 305
1101: 333
1110: 276
1111: 272

Sorted by block count:
0011: 252
1111: 272
1110: 276
0111: 277
0010: 278
1011: 279
1000: 292
0101: 293
0110: 296
0001: 303
0100: 304
1100: 305
0000: 311
1001: 327
1101: 333
1010: 338


Width = 8. 256 block patterns found.

Sorted by block pattern:
00000000: 12
00000001: 7
00000010: 8
00000011: 7
00000100: 4
00000101: 12
00000110: 8
00000111: 12
00001000: 11
00001001: 13
00001010: 12
00001011: 10
00001100: 5
00001101: 10
00001110: 15
00001111: 7
00010000: 12
00010001: 10
00010010: 17
00010011: 5
00010100: 10
00010101: 10
00010110: 12
00010111: 8
00011000: 11
00011001: 9
00011010: 8
00011011: 8
00011100: 11
00011101: 11
00011110: 9
00011111: 9
00100000: 5
00100001: 8
00100010: 9
00100011: 9
00100100: 7
00100101: 8
00100110: 6
00100111: 8
00101000: 8
00101001: 9
00101010: 12
00101011: 6
00101100: 8
00101101: 11
00101110: 11
00101111: 11
00110000: 7
00110001: 8
00110010: 9
00110011: 7
00110100: 8
00110101: 5
00110110: 6
00110111: 13
00111000: 12
00111001: 7
00111010: 10
00111011: 6
00111100: 8
00111101: 8
00111110: 9
00111111: 11
01000000: 16
01000001: 11
01000010: 3
01000011: 7
01000100: 13
01000101: 7
01000110: 10
01000111: 9
01001000: 7
01001001: 10
01001010: 6
01001011: 7
01001100: 15
01001101: 15
01001110: 6
01001111: 6
01010000: 11
01010001: 13
01010010: 8
01010011: 8
01010100: 16
01010101: 11
01010110: 9
01010111: 11
01011000: 9
01011001: 10
01011010: 5
01011011: 11
01011100: 15
01011101: 14
01011110: 5
01011111: 8
01100000: 11
01100001: 4
01100010: 9
01100011: 4
01100100: 14
01100101: 7
01100110: 14
01100111: 9
01101000: 9
01101001: 17
01101010: 10
01101011: 4
01101100: 9
01101101: 15
01101110: 11
01101111: 10
01110000: 12
01110001: 4
01110010: 11
01110011: 9
01110100: 15
01110101: 11
01110110: 4
01110111: 10
01111000: 10
01111001: 15
01111010: 10
01111011: 4
01111100: 6
01111101: 7
01111110: 8
01111111: 8
10000000: 10
10000001: 14
10000010: 7
10000011: 9
10000100: 8
10000101: 6
10000110: 9
10000111: 3
10001000: 11
10001001: 10
10001010: 14
10001011: 10
10001100: 4
10001101: 10
10001110: 4
10001111: 11
10010000: 5
10010001: 9
10010010: 9
10010011: 6
10010100: 8
10010101: 4
10010110: 8
10010111: 10
10011000: 12
10011001: 13
10011010: 7
10011011: 10
10011100: 10
10011101: 16
10011110: 10
10011111: 7
10100000: 14
10100001: 8
10100010: 16
10100011: 12
10100100: 11
10100101: 10
10100110: 7
10100111: 4
10101000: 6
10101001: 17
10101010: 15
10101011: 10
10101100: 9
10101101: 12
10101110: 8
10101111: 6
10110000: 13
10110001: 9
10110010: 8
10110011: 10
10110100: 9
10110101: 7
10110110: 8
10110111: 4
10111000: 11
10111001: 12
10111010: 13
10111011: 5
10111100: 9
10111101: 14
10111110: 12
10111111: 9
11000000: 10
11000001: 13
11000010: 4
11000011: 6
11000100: 15
11000101: 6
11000110: 6
11000111: 9
11001000: 12
11001001: 10
11001010: 16
11001011: 12
11001100: 5
11001101: 12
11001110: 9
11001111: 10
11010000: 6
11010001: 10
11010010: 10
11010011: 4
11010100: 7
11010101: 11
11010110: 7
11010111: 8
11011000: 8
11011001: 11
11011010: 16
11011011: 9
11011100: 9
11011101: 13
11011110: 4
11011111: 7
11100000: 5
11100001: 10
11100010: 8
11100011: 10
11100100: 6
11100101: 5
11100110: 10
11100111: 12
11101000: 6
11101001: 13
11101010: 7
11101011: 6
11101100: 13
11101101: 11
11101110: 10
11101111: 7
11110000: 9
11110001: 5
11110010: 6
11110011: 5
11110100: 5
11110101: 9
11110110: 15
11110111: 3
11111000: 9
11111001: 7
11111010: 12
11111011: 8
11111100: 14
11111101: 14
11111110: 6
11111111: 9

Sorted by block count:
01000010: 3
10000111: 3
11110111: 3
00000100: 4
01100001: 4
01100011: 4
01101011: 4
01110001: 4
01110110: 4
01111011: 4
10001100: 4
10001110: 4
10010101: 4
10100111: 4
10110111: 4
11000010: 4
11010011: 4
11011110: 4
00001100: 5
00010011: 5
00100000: 5
00110101: 5
01011010: 5
01011110: 5
10010000: 5
10111011: 5
11001100: 5
11100000: 5
11100101: 5
11110001: 5
11110011: 5
11110100: 5
00100110: 6
00101011: 6
00110110: 6
00111011: 6
01001010: 6
01001110: 6
01001111: 6
01111100: 6
10000101: 6
10010011: 6
10101000: 6
10101111: 6
11000011: 6
11000101: 6
11000110: 6
11010000: 6
11100100: 6
11101000: 6
11101011: 6
11110010: 6
11111110: 6
00000001: 7
00000011: 7
00001111: 7
00100100: 7
00110000: 7
00110011: 7
00111001: 7
01000011: 7
01000101: 7
01001000: 7
01001011: 7
01100101: 7
01111101: 7
10000010: 7
10011010: 7
10011111: 7
10100110: 7
10110101: 7
11010100: 7
11010110: 7
11011111: 7
11101010: 7
11101111: 7
11111001: 7
00000010: 8
00000110: 8
00010111: 8
00011010: 8
00011011: 8
00100001: 8
00100101: 8
00100111: 8
00101000: 8
00101100: 8
00110001: 8
00110100: 8
00111100: 8
00111101: 8
01010010: 8
01010011: 8
01011111: 8
01111110: 8
01111111: 8
10000100: 8
10010100: 8
10010110: 8
10100001: 8
10101110: 8
10110010: 8
10110110: 8
11010111: 8
11011000: 8
11100010: 8
11111011: 8
00011001: 9
00011110: 9
00011111: 9
00100010: 9
00100011: 9
00101001: 9
00110010: 9
00111110: 9
01000111: 9
01010110: 9
01011000: 9
01100010: 9
01100111: 9
01101000: 9
01101100: 9
01110011: 9
10000011: 9
10000110: 9
10010001: 9
10010010: 9
10101100: 9
10110001: 9
10110100: 9
10111100: 9
10111111: 9
11000111: 9
11001110: 9
11011011: 9
11011100: 9
11110000: 9
11110101: 9
11111000: 9
11111111: 9
00001011: 10
00001101: 10
00010001: 10
00010100: 10
00010101: 10
00111010: 10
01000110: 10
01001001: 10
01011001: 10
01101010: 10
01101111: 10
01110111: 10
01111000: 10
01111010: 10
10000000: 10
10001001: 10
10001011: 10
10001101: 10
10010111: 10
10011011: 10
10011100: 10
10011110: 10
10100101: 10
10101011: 10
10110011: 10
11000000: 10
11001001: 10
11001111: 10
11010001: 10
11010010: 10
11100001: 10
11100011: 10
11100110: 10
11101110: 10
00001000: 11
00011000: 11
00011100: 11
00011101: 11
00101101: 11
00101110: 11
00101111: 11
00111111: 11
01000001: 11
01010000: 11
01010101: 11
01010111: 11
01011011: 11
01100000: 11
01101110: 11
01110010: 11
01110101: 11
10001000: 11
10001111: 11
10100100: 11
10111000: 11
11010101: 11
11011001: 11
11101101: 11
00000000: 12
00000101: 12
00000111: 12
00001010: 12
00010000: 12
00010110: 12
00101010: 12
00111000: 12
01110000: 12
10011000: 12
10100011: 12
10101101: 12
10111001: 12
10111110: 12
11001000: 12
11001011: 12
11001101: 12
11100111: 12
11111010: 12
00001001: 13
00110111: 13
01000100: 13
01010001: 13
10011001: 13
10110000: 13
10111010: 13
11000001: 13
11011101: 13
11101001: 13
11101100: 13
01011101: 14
01100100: 14
01100110: 14
10000001: 14
10001010: 14
10100000: 14
10111101: 14
11111100: 14
11111101: 14
00001110: 15
01001100: 15
01001101: 15
01011100: 15
01101101: 15
01110100: 15
01111001: 15
10101010: 15
11000100: 15
11110110: 15
01000000: 16
01010100: 16
10011101: 16
10100010: 16
11001010: 16
11011010: 16
00010010: 17
01101001: 17
10101001: 17



The above histograms were generated using this Python program:

Code: Select all

#! /usr/bin/env python

'''
 Create histograms for substrings of various width from a string of bits
 in plain ASCII. I.E., data is encoded as ASCII '0' & '1'
'''

import sys

iname = 'MonikaCodeBits.txt'


def make_histogram(data, width=1):
    datalen = len(data)
    #Adjust datalength down to a whole number of blocks
    datalen -= datalen % width
   
    #Cut data into blocks & count the occurrence of each block type
    hist = {}
    for block in (data[i:i+width] for i in xrange(0, datalen, width)):
        hist[block] = hist.get(block, 0) + 1
       
    print 'Width = %d. %d block patterns found.\n' % (width, len(hist))
    print 'Sorted by block pattern:'
    for k in sorted(hist.keys()):
        print '%s: %d' % (k, hist[k])
    print
   
    print 'Sorted by block count:'
    for t in sorted(hist.items(), key=lambda (k,v): (v,k)):
        print '%s: %d' % t
    print '\n'


def main():
    #Get block widths from cmdline
    widths = [int(i) for i in sys.argv[1:]]
    if len(widths) == 0:
        widths = [1, 2, 4, 8]   

    f = open(iname, 'rb')
    data = f.read()
    f.close()
   
    for width in widths:
        make_histogram(data, width)   


if __name__ == '__main__':
    main()


If we assume he is using 8-bit fixed-width codewords, then the fairly wide range of block frequencies suggests, I think, that the data is not compressed. But I Am Not A Cryptanalyst.
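One way to put a number on how wide that range of frequencies is would be a chi-square statistic against a uniform distribution. The sketch below is Python 3 and assumes 8-bit codewords; the function name is made up. For uniformly random bytes the statistic averages 255 (the number of byte values minus one), so values far above that indicate a clearly uneven distribution.

```python
from collections import Counter


def chi_square_uniform(data, n_values=256):
    """Pearson chi-square statistic of the byte counts against a uniform distribution."""
    if not data:
        raise ValueError('need at least one byte')
    expected = len(data) / n_values
    counts = Counter(data)
    return sum((counts.get(v, 0) - expected) ** 2 / expected for v in range(n_values))


if __name__ == '__main__':
    # Toy inputs: perfectly flat data versus plain text.
    print(chi_square_uniform(bytes(range(256)) * 4))
    print(chi_square_uniform(b'plain text is far from uniform ' * 20))
```

Well-compressed and well-encrypted data both tend to look close to uniform at this level, so the statistic on its own may not be enough to tell the two apart.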

Monika
Welcoming Aarvark
Posts: 3661
Joined: Mon Aug 18, 2008 8:03 am UTC
Location: Germany, near Heidelberg

Re: Help me decode this "secret" message

Postby Monika » Sun May 18, 2014 10:33 am UTC

PM 2Ring wrote:But that "181 different characters" thing inspired me to do some simple stats on the bit pattern frequencies, and I noticed that your code has a bug. The linked data actually contains all 256 possible 8-bit patterns, but the 181st pattern in the stream decodes as a NUL byte. I guess that whatever you're using to do your analysis decided that this was a \0 terminator byte at the end of the data. Oops!

Oh wow, thank you so much!
I used http://paulschou.com/tools/xlate/ to convert the binary to ASCII (or maybe it was one of the other binary to ASCII converters that popped up on the first Google page that did not choke on the length).
Then in my program to count the frequencies I used the ASCII string ... a mistake, as I see now.

All possible 8-bit patterns ... that sounds suspicious. There must be a reason for that ...

scarecrovv
It's pronounced 'double u'
Posts: 674
Joined: Wed Jul 30, 2008 4:09 pm UTC
Location: California

Re: Help me decode this "secret" message

Postby scarecrovv » Mon May 19, 2014 3:55 pm UTC

Monika wrote:All possible 8-bit patterns ... that sounds suspicious. There must be a reason for that ...


There are 2368 bytes of data. There are 256 possible bytes. The probability that a particular byte would never appear in 2368 random bytes is (I think) (255/256)^2368 = 0.0000944. The probability that any byte would never appear should be 256 times that, or 0.0242. So the probability that any random string of 2368 bytes hits every possible byte is about 98%.
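The arithmetic above can be reproduced in a few lines. This is a Python 3 sketch with made-up function names; multiplying by 256 gives an upper bound (the union bound) on the chance that at least one value is missing, not the exact figure.

```python
def p_value_missing(n_bytes=2368, n_values=256):
    """Chance that one particular byte value never appears in n_bytes random bytes."""
    return ((n_values - 1) / n_values) ** n_bytes


def p_any_missing_bound(n_bytes=2368, n_values=256):
    """Union bound on the chance that at least one byte value never appears."""
    return n_values * p_value_missing(n_bytes, n_values)


if __name__ == '__main__':
    print('one particular value missing: %.7f' % p_value_missing())
    print('at least one value missing (upper bound): %.4f' % p_any_missing_bound())
```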

Probability can be tricky though, and I'm not completely confident I did that right. However, if I did, then this data is consistent with some form of good encryption, such as a one time pad. That reminds me of something I did back in high school: I wrote a "one time pad" using the output of rand() as the encryption key. Somebody might want to try that for various seeds, just on the off chance that that's what this person has done. I might do it later if nobody beats me to it, but I have to go to work.

phlip
Restorer of Worlds
Posts: 7554
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia

Re: Help me decode this "secret" message

Postby phlip » Tue May 20, 2014 6:13 am UTC

scarecrovv wrote:The probability that any byte would never appear should be 256 times that, or 0.0242.

Not quite, you can't just multiply it out like that, because the events "byte X never appears" are not mutually exclusive. For these low probabilities, though, it does give a reasonable approximation.

From a quick calculation, I think that it's about a 97.6% chance that 2368 independent random bytes will cover all 256 possible values.
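For anyone who wants to check that figure, one way to compute it exactly is inclusion-exclusion over the set of missing values. This Python 3 sketch is not necessarily the calculation used above; the function name is made up, and it relies on exact integer arithmetic to avoid rounding trouble in the alternating sum.

```python
from fractions import Fraction
from math import comb


def p_all_values_seen(n_draws, n_values=256):
    """Chance that n_draws independent uniform draws hit every one of n_values values."""
    # Count the sequences that use every value (inclusion-exclusion), exactly.
    favourable = sum(
        (-1) ** k * comb(n_values, k) * (n_values - k) ** n_draws
        for k in range(n_values + 1)
    )
    return float(Fraction(favourable, n_values ** n_draws))


if __name__ == '__main__':
    print('%.4f' % p_all_values_seen(2368))
```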

Continuing down this road... the frequencies of each different byte seem consistent with what you'd expect from random data... that is, the number of byte values that appear 3 times, the number that appear 4 times, etc. all seem reasonably close to the curve of what you'd expect:
histogram.png
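The comparison behind that plot can be sketched as follows, assuming independent, uniformly random bytes as the reference model. This is Python 3 with made-up names, not the code used for the image. The observed tallies are counted from the "Sorted by block count" listing for width 8 earlier in the thread.

```python
from math import comb

N_BYTES = 2368
N_VALUES = 256

# How many of the 256 byte values occur exactly k times (tallied from the listing).
OBSERVED = {3: 3, 4: 15, 5: 14, 6: 21, 7: 24, 8: 30, 9: 33, 10: 34,
            11: 24, 12: 19, 13: 11, 14: 9, 15: 10, 16: 6, 17: 3}


def expected_values_with_count(k, n=N_BYTES, m=N_VALUES):
    """Expected number of byte values seen exactly k times in n uniform random bytes."""
    p = 1 / m
    # Binomial probability of exactly k hits for one value, times the number of values.
    return m * comb(n, k) * p ** k * (1 - p) ** (n - k)


if __name__ == '__main__':
    print(' k  observed  expected')
    for k in sorted(OBSERVED):
        print('%2d  %8d  %8.1f' % (k, OBSERVED[k], expected_values_with_count(k)))
```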


This would imply that either (1) the data is encrypted with something that makes it appear, at the byte level at least, to be random data, which is what a good crypto scheme will try to achieve... (2) the data actually is random, and this is all a wild goose chase, or (3) the bitstream isn't supposed to be split into 8-bit bytes, and the fact that its length is a multiple of 8 is either a coincidence or an intentional red herring... so it looks random when you split it into 8-bit bytes, but if you split it up differently (either shorter or longer or with some form of varying length) then patterns can emerge.

Code: Select all

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

scarecrovv
It's pronounced 'double u'
Posts: 674
Joined: Wed Jul 30, 2008 4:09 pm UTC
Location: California

Re: Help me decode this "secret" message

Postby scarecrovv » Tue May 20, 2014 6:24 am UTC

scarecrovv wrote:That reminds me of something I did back in high school: I wrote a "one time pad" using the output of rand() as the encryption key. Somebody might want to try that for various seeds, just on the off chance that that's what this person has done. I might do it later if nobody beats me to it, but I have to go to work.


I tried this today for 10^9 possible seeds, counting for each one the number of printable ASCII values that came out, and didn't come up with anything significant. That doesn't mean much, though: there are so many possible slight variations on what I tried that I could have come within an inch of the right answer and never known it.
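The shape of such a seed search can be sketched like this. It is a toy Python 3 version only: Python's own PRNG stands in for C's rand(), the message and seed are made up, and the way a real program maps rand() output to key bytes is unknown, so every name and detail here is an assumption.

```python
import random


def keystream_xor(data, seed):
    """XOR data with a byte stream from Python's PRNG (a stand-in, NOT C's rand())."""
    rng = random.Random(seed)
    return bytes(b ^ rng.getrandbits(8) for b in data)


def printable_count(data):
    """Number of bytes in the printable ASCII range 0x20..0x7e."""
    return sum(0x20 <= b <= 0x7e for b in data)


def best_seed(cipher, seeds):
    """Seed whose trial decryption yields the most printable ASCII bytes."""
    return max(seeds, key=lambda s: printable_count(keystream_xor(cipher, s)))


if __name__ == '__main__':
    # Toy demonstration with a made-up message and seed.
    secret = keystream_xor(b'Dies ist nur ein Beispieltext zum Testen.', 1234)
    seed = best_seed(secret, range(5000))
    print(seed, keystream_xor(secret, seed))
```

Because XOR is its own inverse, the same function encrypts and decrypts. A wrong seed leaves only about 95 in 256 bytes printable on average, so the right one stands out quickly, provided the trial decryption matches the original scheme exactly.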

The puzzle might be a lot more interesting if the OP can ask the OC (original cryptographer) to encrypt some known text, so we can all try a known plaintext attack. Or perhaps if the OC could divulge the source code but not the key, that would be interesting too.

Monika
Welcoming Aarvark
Posts: 3661
Joined: Mon Aug 18, 2008 8:03 am UTC
Location: Germany, near Heidelberg

Re: Help me decode this "secret" message

Postby Monika » Tue May 20, 2014 9:02 pm UTC

Okay, his encoding program requires three inputs: the text to be encoded, a numeric "security level" between 1 and 8 (not sure exactly what it does), and a password.

I tried this:
Input aaaaaaaaa (9 a's)
Level 1
Password aaaa (4 a's)
=> Results in output

Code: Select all




Input aaaaaaaaa (9 a's)
Level 1
Password aa (2 a's)
=> Results in output

Code: Select all




The same input with security level 2 (9 a's + 4 a's) results in

Code: Select all



and (9 a's + 2 a's)

Code: Select all

10111011011111110010100110100111011010111011011110110011011110011101000001011000000000100101001010011110101100101100110101000011111110001010010100110011011110111000100000101010000001001011100101111110010000110001100101111001111100001000001010011111011110101110011101100101111110110001110001001010110111010100100100001100100110001000010011001110011111000111100010110000101110010100010011011000100111110011111101100110100010110001001110001111000111000000010100000010111000111001100000101011110010101111010010011100101001001110000001110000111000100001101001010100010001100110100011001001101011000100001011111110000100000100100001110100010101010111000010000011110111000011001001111000110010010110011011010001000000101000110001000000000101111001110101010110010000011101110110011000100011011010010110100110111011100000101101100001001011010000000101000000010110011001111110110011100111011010010110011011001111100011100001011001111110110101000111101010010010100110100000000111001000110111110100011111100111011010101000110110000001111011100100011010101110011011011101110010101100101101100001101001011100100101001011111110011111001011010101011001011001101111110011011101000100001011011101101100010010001100101110001000010101101111010010101100011001101111010100001011000111011111001000110110111000011000000111011001000101110110000101001000001000011111111011001110010101101001010000000110010110100100000000100011110111111110111011011101101001111100010101001110101011100101001101010010110110101010110010100011010011011001111000110111111000000110010110110011111101001111011100001101101100110001011101101100001010001110010110001010001110001010001101101000010101110010010000001101101000011000010100101000100111111100010101111111110000000011111111101010111101110110100100001010011011101111000110000010010000101011110100001111011010100010100110111000001010110101110000010001111101100110111001001001011110111110110000000110110001001000101011000000111001110011001010100101101010001001111010000011011101100100011011010100
10111001110011111100101111101110010011110111011011110011000001110000100011000101111111010101111100110011000010110000011011110101101101101100100100101011001001001110110111010011011000110110001100001111110100100111000010100100011111111000011111000101011110010100010001010010100001011101110100100100110010100011111110001111111110110110100111010111000111011001101111111011111010000110100001110101011101000010111101100101000111111000001111101011011111100100101111000101001010000111110010011001110011011100111101000001100001010010110110010011010010100000001101101100


So the "security level" makes it longer. But it's already pretty long for just 9 chars that were encoded. Maybe a lot of buffer bytes? Random ones?

/edit: There is definitely a random component in it. I asked for encoding aaaaaaaaa (9 a's) with level 1 and password aa three more times and got three different results:

Code: Select all



Code: Select all



Code: Select all




But it's not a joke encoding that generates nothing but random data. So I think random bytes are added, and higher "security levels" add more of them.
#xkcd-q on irc.foonetic.net - the LGBTIQQA support channel
Please donate to help these people

Derek
Posts: 2178
Joined: Wed Aug 18, 2010 4:15 am UTC

Re: Help me decode this "secret" message

Postby Derek » Tue May 20, 2014 10:29 pm UTC

Generate several strings (~10 or so?) of the same encoding and see if there is a consistent set of bits that don't change.

alessandro95
Posts: 109
Joined: Wed Apr 24, 2013 1:33 am UTC

Re: Help me decode this "secret" message

Postby alessandro95 » Wed May 21, 2014 6:07 am UTC

Could you give us the whole program? It would be very helpful.
The primary reason Bourbaki stopped writing books was the realization that Lang was one single person.

obfpen
Posts: 59
Joined: Mon Jun 20, 2011 1:16 pm UTC

Re: Help me decode this "secret" message

Postby obfpen » Thu May 22, 2014 5:09 am UTC

Monika wrote:The encoded text was created with a symmetric encryption by a 12th grade student (coded in Perl) and he would like "hackers" to attempt to crack it so he knows how hard it would be.
But that isn't how cryptanalysis works. For anything more complex than the most simple ciphers, just looking at a small snippet of ciphertext isn't going to be enough. It might as well be a one-time pad. To have any real chance at decryption, we're going to need more, whether that's additional ciphertext, some idea of the encryption mechanism, or greater clues to the type of plaintext involved.

Having one small ciphertext that isn't known to be decrypted isn't really going to demonstrate security. If the cryptographer-in-training wants a critique of their system, they should reveal its workings as alessandro95 requests.

In the absence of a stolen Enigma however, let's look at what we have so far:
  • 9 characters at security level 1 produce 1280 bits of ciphertext.
  • 9 characters at security level 2 produce 2560 bits of ciphertext.
  • The original problem-text has 18,944 bits.
  • The same input set can produce different output. Since these all need to be decrypted to the same message, there must be an additional indicator in the system. It may be hardcoded into the algorithm, or may be present in the ciphertext.
  • The four instances of plaintext 'aaaaaaaaa' + key 'aa' at security level 1 produce the following common output:

Code: Select all

1---------------------------1--------0------1-0--10-----1--------1------0--------0------0----------------------1----------1----0
1--0-------0---0---------------1-------0---------0--------------1------00---0--------1---11---11----------------0---------0-----
101------------0---------0----------------1----0-0---0-------------------1---11-----------0-------------------0-----------------
10------------0-1-----------1-----0------------0--------0------------1--------------------0---1------------------------------0--
1----------0-----0------------1------0--1------------1--------------------11---1----1-1-1--1-------0------------------0---0-----
1--0----0--------------------------------0-------0----------0--1------0---------1-----0--1-------111---0------------------------
1-----------------------0----------0------0----0---0-0----1--1--0----1-------1-------------------------1----0-------1-----------
1-01--------1-----1--------1---0--------1-0-------0---1--1-------0-----------11-1--1-0----1-0--1----1----1----------------------
1-------0-0---0---------0----0-----------0-----------------1------------000-------10----1--1--0-1-------00----------------------
011-------------------1--0--------1--0--------1----------1-01-0------001---1-------1---------0-----------1--1--------------0-0--
The only pattern I could discern was the '1111111110' appearing at positions 1 mod 128, i.e. reading down the first column of the ten 128-bit rows above. This is present in all the 9x'a' ciphertexts; the same sequence is found in the two level-2 ciphertexts at positions 1 mod 256.
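For anyone wanting to repeat this comparison on more samples, here is a minimal Python sketch (an illustration only, not taken from the program under discussion). It assumes the ciphertexts are equal-length strings of '0' and '1'; the names `common_mask` and `wrap` are made up for this example.

```python
def common_mask(ciphertexts):
    """Keep bits shared by every ciphertext; mark differing positions with '-'."""
    if not ciphertexts:
        raise ValueError("need at least one ciphertext")
    length = len(ciphertexts[0])
    if any(len(c) != length for c in ciphertexts):
        raise ValueError("ciphertexts must all have the same length")
    # zip(*ciphertexts) walks the ciphertexts column by column.
    return "".join(
        column[0] if len(set(column)) == 1 else "-"
        for column in zip(*ciphertexts)
    )


def wrap(mask, width=128):
    """Split a mask into rows of `width` characters for easier eyeballing."""
    return [mask[i:i + width] for i in range(0, len(mask), width)]


if __name__ == "__main__":
    samples = ["1010", "1000", "1011"]  # toy data, not real ciphertext
    for row in wrap(common_mask(samples), width=4):
        print(row)
```

Wrapping at 128 (or 256 for the level-2 samples) lines the rows up so that any column that never changes stands out.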

The highest common factor for 1280, 2560, and 18944 is 256. If we simply assume that each character produces the same quantity of encrypted output, then 256 bits per character is immediately problematic, at least for security level 1, as 1280 bits would only permit 5 characters, not the 9 we know to have been entered. At level 2, however, it's fine, permitting 10 plaintext characters.

If we then assume that level 1 uses 128 bits to level 2's 256, consistent with the earlier pattern coincidence, then each allows 10 characters, one more than the known 9-character input. If we further assume that the 'randomisation' indicator is present in the output as a single equivalent chunk, then we can generate quite a neat formula connecting the number of plaintext characters, n, to the ciphertext bits, b, at a security level s:
b = 128 * s * (n + 1)

Or alternatively:
b = 128 * 2^(s-1) * (n + 1)


The factorisation of 18944 provided by PM 2Ring would appear to fit in with either of these.
18944 = 2^9 * 37 = 128 * 148 = 256 * 74 = 512 * 37.


So if chunks are multiples of 128 bits, the original ciphertext has 148 1x128-bit chunks, 74 2x chunks, or 37 4x chunks, which makes me wonder whether the original plaintext contained 147, 73 or 36 characters respectively.
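The arithmetic can be checked with a short Python sketch. It is only a restatement of the two candidate formulas, not anything taken from the program itself; the function names and the upper bound of 8 levels are assumptions made for the example.

```python
# (security level s, plaintext characters n, ciphertext bits b) seen so far
OBSERVED = [(1, 9, 1280), (2, 9, 2560)]


def bits_linear(s, n):
    """b = 128 * s * (n + 1)"""
    return 128 * s * (n + 1)


def bits_doubling(s, n):
    """b = 128 * 2^(s-1) * (n + 1)"""
    return 128 * 2 ** (s - 1) * (n + 1)


def candidate_lengths(total_bits, formula, max_level=8):
    """Return (s, n) pairs for which formula(s, n) equals total_bits.

    max_level is an assumed upper bound on the security level.
    """
    pairs = []
    for s in range(1, max_level + 1):
        chunk = formula(s, 0)  # bits in a single chunk at level s
        if total_bits % chunk == 0:
            n = total_bits // chunk - 1  # one chunk is the extra 'indicator'
            if n >= 1:
                pairs.append((s, n))
    return pairs


if __name__ == "__main__":
    for s, n, b in OBSERVED:
        print(s, n, b, bits_linear(s, n) == b, bits_doubling(s, n) == b)
    print(candidate_lengths(18944, bits_linear))
    print(candidate_lengths(18944, bits_doubling))
```

Both formulas agree with the two known samples and give the same three plaintext lengths for the 18,944-bit message; they differ only in whether the 512-bit chunk size is called level 3 or level 4.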

If the key is simply repeated, then 'aa' and 'aaaa' are equivalent keys. This whittles down the common output somewhat, but the 1 mod 128 pattern remains. Whether this is significant I cannot say, but it is at least intriguing.

My current suspicions, therefore, are that each input character is enciphered with the key, and then padded out to a level-dependent multiple of 128 bits to obscure things, with another chunk of equal size indicating how to undo this padding.

Of course, this could just be a load of numerological nonsense stacked on coincidences, but it is at least testable given further results of known plaintext. For example, if correct, a single letter at level 1 would produce 256 bits of output, and either 768 or 1024 bits at level 3.

As far as plaintext suggestions go, I'd like to see the output of the following plaintext-key pairings at level 1:

Code: Select all

'Z' + 'U'
'U' + 'Z'
'UP' + 'Z_'
'UU' + 'ZZ'
'h' + '='
'UPh=' + 'Z_=h'
Why these combinations? Well, in addition to providing three different lengths with some common characters in common positions, which can help indicate the presence of transposition, they are also able to hint at character encoding and key-plaintext combination. Encoded as ASCII and combined with modulo-2 addition (XOR) they provide the following bit patterns:

Code: Select all

'Z' + 'U'       --> 0000 1111
'U' + 'Z'       --> 0000 1111
'UP' + 'Z_'     --> 0000 1111 0000 1111
'UU' + 'ZZ'     --> 0000 1111 0000 1111
'h' + '='       --> 0101 0101
'UPh=' + 'Z_=h' --> 0000 1111 0000 1111 0101 0101 0101 0101
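These patterns are easy to verify with a few lines of Python. The sketch below assumes plain ASCII and a key of the same length as the plaintext; `xor_bits` is a name invented for the example.

```python
def xor_bits(plaintext, key):
    """XOR the ASCII codes pairwise and print each byte as two 4-bit groups."""
    if len(plaintext) != len(key):
        raise ValueError("plaintext and key must have the same length")
    groups = []
    for p, k in zip(plaintext, key):
        byte = format(ord(p) ^ ord(k), "08b")
        groups.append(byte[:4] + " " + byte[4:])
    return " ".join(groups)


if __name__ == "__main__":
    pairs = [("Z", "U"), ("U", "Z"), ("UP", "Z_"),
             ("UU", "ZZ"), ("h", "="), ("UPh=", "Z_=h")]
    for plaintext, key in pairs:
        print(repr(plaintext), "+", repr(key), "-->", xor_bits(plaintext, key))
```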
As Derek points out, several encryptions of each are going to be needed to get a handle on the apparent randomisation of the output.

And a few more longer pieces of ciphertext, even for unknown plaintext, would obviously help too.

Derek
Posts: 2178
Joined: Wed Aug 18, 2010 4:15 am UTC

Re: Help me decode this "secret" message

Postby Derek » Thu May 22, 2014 7:30 am UTC

Hypothesis: One "character" is being used (probably either prepended or appended) to indicate the randomization used. This would explain why the output has one more "character" than the input. In the lines you posted the last line stands out as the only one not to start with a 1, so that would be a good candidate (I'm not sure why the "randomization character" would need to be marked if it's always at the last position, mind you, but it still seems worth investigating).

Monika
Welcoming Aarvark
Posts: 3661
Joined: Mon Aug 18, 2008 8:03 am UTC
Location: Germany, near Heidelberg

Re: Help me decode this "secret" message

Postby Monika » Thu May 22, 2014 8:09 pm UTC

Okay, here's the code from the boy:
perl_o_crypt_beta_0_6_pwgen.pl http://pastebin.com/TVPku1v9 I translated the user output, but not (yet) the comments
hash_table.pm http://pastebin.com/AjNUGLKc This module is used by the above code, put it into the same folder

This is the "simple" message that is to be decoded (same link as in original post): http://pastebin.com/3HVxYUp4
Here's a message that is supposed to be more difficult (but maybe it's just longer?): http://javaschubla.de/2014/extern/vs_schwer.txt

He also gave me a sample encrypted file unlock_test.txt http://pastebin.com/LBHbwgnT (it did not originally contain linebreaks, my text editor added them) with a key_test.txt file that contained 13300112000870012400081000090007800096. The decryption results in "Testdatei. Wenn das hier korrekt angezeigt wird, ist alles in Ordnung."

Two important things:
- The program doesn't ask for a security level, but the first digit of the password is the security level.
- The passwords were really supposed to be numeric, I didn't understand that correctly before. What does Perl do when supplying aa where it expects a number? Does it convert it to 0?

If you figure out the two messages, the simple original one and/or the difficult one, please post whether you did it with known-plaintext attacks only or whether you read the code.

obfpen
Posts: 59
Joined: Mon Jun 20, 2011 1:16 pm UTC

Re: Help me decode this "secret" message

Postby obfpen » Fri May 23, 2014 8:21 am UTC

Monika wrote:Two important things:
- The program doesn't ask for a security level, but the first digit of the password is the security level.
- The passwords were really supposed to be numeric, I didn't understand that correctly before. What does Perl do when supplying aa where it expects a number? Does it convert it to 0?

Having eventually looked at the source, the key actually has to be in a very specific format. Since that format is itself a clue to the encryption mechanism, I've spoilered it below.
Spoiler:
The key is a 38-digit number.
  • The first digit, s, must be in the range 1 to 8 inclusive.
  • The second and third digits must be in the range 00 to 70 inclusive.
  • The remaining 35 digits should be treated as seven groups of five. Each group should be unique and in the range 00000 to 2^(6+s) - 1.
Without that particular type of key, the encryption isn't reliable so the results are of limited use. They provide hints (what I wrote earlier is along the right lines) but aren't useful for direct attacks.
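To make the format concrete, here is a rough Python sketch of a validator for such a key. It follows the three rules as listed and nothing more; `parse_key` and its return shape are assumptions for the example, not the Perl program's own interface.

```python
def parse_key(key):
    """Split a 38-digit key into (level, shift, positions), validating each part."""
    if len(key) != 38 or not (key.isascii() and key.isdigit()):
        raise ValueError("key must be exactly 38 decimal digits")
    level = int(key[0])      # first digit: security level s
    shift = int(key[1:3])    # second and third digits: alphabet rotation
    # Remaining 35 digits: seven groups of five.
    positions = [int(key[i:i + 5]) for i in range(3, 38, 5)]
    if not 1 <= level <= 8:
        raise ValueError("security level must be 1 to 8")
    if not 0 <= shift <= 70:
        raise ValueError("shift must be 00 to 70")
    block_size = 2 ** (6 + level)
    if any(p >= block_size for p in positions):
        raise ValueError("every position must be below 2^(6+s)")
    if len(set(positions)) != len(positions):
        raise ValueError("positions must be unique")
    return level, shift, positions


if __name__ == "__main__":
    # The sample key posted with unlock_test.txt.
    print(parse_key("13300112000870012400081000090007800096"))
```

Run on the sample key from key_test.txt, this gives level 1, a rotation of 33, and the positions 112, 87, 124, 81, 9, 78 and 96, all below the level-1 block size of 128.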

And now for the actual method:
Spoiler:
It's a three-step process, operating on a character at a time with the same sequence of steps:
  • First, a Caesar shift. The alphabet has 71 characters comprising a space (1), the lower and upper case letters (52), numerals (10), and some punctuation (8). The rotation of this alphabet is determined by the second part of the key.
  • Next, an encoding. Each character is effectively looked up in a code book containing 7-digit binary numbers. Consecutive letters within each case, and the numerals, generate consecutive binary numbers. The number is equivalent to the base position within the Caesar-shift alphabet.
  • Finally, a stencil cipher with transposition. The carrier text is a block of random binary digits. The size of this block is determined by the first part of the key, s, as 2^(6+s). Each of the seven bits of the code generated in step 2 is placed into this block at positions determined by the final seven groups provided by the key.
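The three steps can be sketched in Python roughly as follows. This is a reconstruction from the description above, not a port of the Perl source. The alphabet order and the eight punctuation marks are guesses, so it will not decrypt the real files; it only shows the mechanism. All function names are made up for the example.

```python
import random
import string

# Hypothetical alphabet: only the counts are known (1 space, 52 letters,
# 10 numerals, 8 punctuation marks); the order and punctuation are guesses.
ALPHABET = (" " + string.ascii_lowercase + string.ascii_uppercase
            + string.digits + ".,!?:;-'")
assert len(ALPHABET) == 71


def encrypt_char(ch, level, shift, positions, rng=random):
    """Encipher one character into a block of 2**(6+level) bits."""
    block_size = 2 ** (6 + level)
    code = (ALPHABET.index(ch) + shift) % len(ALPHABET)    # step 1: Caesar shift
    bits = format(code, "07b")                             # step 2: 7-bit code
    block = [rng.choice("01") for _ in range(block_size)]  # step 3: random carrier
    for bit, pos in zip(bits, positions):
        block[pos] = bit                                   # stencil the 7 bits in
    return "".join(block)


def decrypt_char(block, shift, positions):
    """Read the 7 stencilled bits back out and undo the shift."""
    code = int("".join(block[pos] for pos in positions), 2)
    return ALPHABET[(code - shift) % len(ALPHABET)]


def encrypt(text, level, shift, positions):
    return "".join(encrypt_char(ch, level, shift, positions) for ch in text)


def decrypt(bits, level, shift, positions):
    block_size = 2 ** (6 + level)
    blocks = [bits[i:i + block_size] for i in range(0, len(bits), block_size)]
    return "".join(decrypt_char(b, shift, positions) for b in blocks)


if __name__ == "__main__":
    positions = [112, 87, 124, 81, 9, 78, 96]
    ciphertext = encrypt("Testdatei.", 1, 33, positions)
    print(len(ciphertext), decrypt(ciphertext, 1, 33, positions))
```

Every run produces a different ciphertext for the same input, because all bits outside the seven key positions are random filler.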

So how would I go about trying to break this?
Spoiler:
For a start, while the block size is unknown, there are only 8 possibilities (128, 256, 512, 1024, 2048, 4096, 8192, and 16384), and even that can be reduced. Obviously, a ciphertext with only 4096 bits cannot be encoded at security level 7 or 8. And since the system seems to add a newline character automatically at the end of each plaintext prior to enciphering, security level 6 can also be ruled out.
But it's possible to narrow it down further given the right message. Obscuring the security level is dependent on the length of the plaintext having factors of 2. For example, a 7-character message enciphered at level 2 will produce a ciphertext with seven 256-bit blocks. 512-bit blocks and larger are ruled out since they will not neatly divide the ciphertext (a message cannot be 3.5 characters long). Each factor of two in the length of the plaintext (including added newline) gives one extra possible security level. To obscure the security level, s, completely, the plaintext length must have 8-s factors of 2. Without careful padding, a message enciphered at level 3, for example, is only going to have this kind of obfuscation around 3% of the time.
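As a rough Python sketch of this narrowing step (the function names are invented for the example, and plaintext lengths are taken to include the added newline):

```python
def candidate_levels(total_bits):
    """Security levels whose block size 2**(6+s) divides the ciphertext length."""
    return [s for s in range(1, 9)
            if total_bits > 0 and total_bits % 2 ** (6 + s) == 0]


def factors_of_two(n):
    """How many times 2 divides n (n must be positive)."""
    count = 0
    while n > 0 and n % 2 == 0:
        n //= 2
        count += 1
    return count


def fully_obscured(level, plaintext_len):
    """True if the length leaves all eight levels possible."""
    return factors_of_two(plaintext_len) >= 8 - level


if __name__ == "__main__":
    print(candidate_levels(18944))    # the original 18,944-bit message
    print(candidate_levels(7 * 256))  # seven 256-bit blocks
    hidden = sum(fully_obscured(3, n) for n in range(1, 3201))
    print(hidden / 3200)              # share of lengths that hide level 3
```

For the original 18,944-bit message this leaves levels 1, 2 and 3, and at level 3 only one plaintext length in 32 hides the level completely, which is where the "around 3%" comes from.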

The Caesar shift is by far the weakest step, easily susceptible to frequency analysis but simple enough to just brute force along with the reduced set of security levels.

The stencil step is the real bottleneck here, but it also has its weaknesses.
Every character uses the same pattern, with the random bits in the same position within each character block. This makes it susceptible to digraph attacks, especially double letters, which will produce pairs of blocks that are identical at seven positions. While the random bits tend to create pairs of blocks with identical bits in half the positions, messages with several double letter digraphs will skew the totals.
There's still the transposition, of course, but because the binary codes cover the numbers 0 to 70, the first bit will be 0 for around 90% of the characters (0 to 63), so that'll be a little easier to identify.

How secure is it?
Spoiler:
Secure enough that I'd want help from a computer to attack it. What I've done so far is look at things in Notepad. That's been more revealing than anyone should want an encryption method to be, but it's not quite enough for actual decryption. For example, just looking at unlock_test.txt with 128-bit rows shows a suspicious column of mostly 1s at position 87 (0-indexed), matching 00087 in the known key. Stencil ciphers in general are quite good, particularly for short messages with regularly changed stencils. Using the same pattern for every character is a problem, especially for longer messages, although the larger chunk sizes do a lot to offset that.
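The Notepad exercise of lining up rows and looking for lopsided columns can be automated. The Python sketch below is one way to do it, assuming the ciphertext is a string of '0' and '1' (whitespace is ignored). The threshold of 0.25 is an arbitrary choice for illustration, and short messages will throw up false positives by chance.

```python
def column_ones_fraction(bits, block_size):
    """Fraction of 1s in each column when the bits are laid out in rows."""
    bits = "".join(bits.split())  # drop line breaks added by text editors
    if not bits or len(bits) % block_size:
        raise ValueError("bit count must be a positive multiple of block_size")
    rows = [bits[i:i + block_size] for i in range(0, len(bits), block_size)]
    return [sum(row[c] == "1" for row in rows) / len(rows)
            for c in range(block_size)]


def suspicious_columns(bits, block_size, threshold=0.25):
    """Columns whose share of 1s is further than `threshold` from one half."""
    fractions = column_ones_fraction(bits, block_size)
    return [c for c, f in enumerate(fractions) if abs(f - 0.5) > threshold]


if __name__ == "__main__":
    toy = "1010\n1001\n1011\n1000"  # toy data: columns 0 and 1 never change
    print(column_ones_fraction(toy, 4))
    print(suspicious_columns(toy, 4))
```

Random filler columns should hover around one half, while a column carrying a strongly biased message bit drifts towards 0 or 1 as the message gets longer.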

How could it be improved (without fundamentally changing the approach)?
Spoiler:
  • The Caesar shift is just about the weakest substitution cipher out there. The substitution step could be improved by randomizing the alphabet order ('Aj7;t'... instead of 'ABCDE'...).
  • Similarly the binary encoding could be randomized. Having half the numbers greater than 64 would make the highest bit less obvious.
  • The bits could also be spread across different positions for each character. This doesn't even need a longer key. Instead of having the key dictate the positions directly, use the key to seed a pseudorandom number generator. This can then supply seven positions for each character that will be different each time, but replicatable given the same seed from the key. Having the message bits spread out differently per character block would (I believe) protect against the attacks I've mentioned.
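The last suggestion might look something like this in Python. It is a sketch of the idea only: `positions_for_message` is a made-up name, and Python's `random` module (a Mersenne Twister) stands in for whatever generator would really be used. It is not cryptographically secure, so a serious design would want a keyed cryptographic generator instead.

```python
import random


def positions_for_message(seed, n_chars, block_size, bits_per_char=7):
    """Derive a fresh set of bit positions for every character from one seed."""
    rng = random.Random(seed)  # same seed gives the same sequence of positions
    return [rng.sample(range(block_size), bits_per_char)
            for _ in range(n_chars)]


if __name__ == "__main__":
    # The seed here is an arbitrary illustrative value derived from the key.
    for positions in positions_for_message(seed=12345, n_chars=3, block_size=128):
        print(positions)
```

Sender and receiver derive the same list from the shared seed, so the key need not grow, but no column of the ciphertext carries the same message bit for every character.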

Looking at the source code had me wondering which I was rustier with, Perl or German. It's been some time for both! I can offer a tip or two though.
Spoiler:
  • For loops would be clearer than most of the do...until loops.
  • There are bitwise operators which simplify the binary conversion.
So, for example:

Code: Select all

@binary_figure = ('0','0','0','0','0','0','0',);
$index = 6;
$position = 0;
do{
    $check = 2**$index;
    if ($check <= $figure) {
        $binary_figure[$position] = 1;
        $figure = $figure - $check;
    }
    else{
       $binary_figure[$position] = 0;   
    }
    $position = $position +1;
    $index = $index-1;
   
}until($position == 7);

can be replaced with:

Code: Select all

@binary_figure = ();
# Extract the seven bits of $figure, most significant bit first.
for ($i=6; $i>=0; $i--) {
    push @binary_figure, ($figure >> $i) & 1;
}

  • Modular arithmetic has an operator too.

Code: Select all

if ($rot_lock[$count_lta] +$rot <=70) {
    $rot_lock[$count_lta] = $rot_lock[$count_lta]+ $rot;     
}
else{
    $rot_lock[$count_lta] = ($rot_lock[$count_lta]+ $rot)-71;
}

can become:

Code: Select all

$rot_lock[$count_lta] = ($rot_lock[$count_lta] + $rot) % 71;

  • foreach loops for iterating over existing arrays, such as @lock_text_array, mean the array length and position don't need to be tracked.
Iterating through the plaintext then becomes (in a simplified fashion):

Code: Select all

foreach $plain (@lock_text_array) {
    $encrypted = &encrypt($plain);
    push @encrypted_array, $encrypted;
}


Monika wrote:If you figure out the two messages, the simple original one and/or the difficult one, please post whether you did it with known-plaintext attacks only or whether you read the code.
I haven't figured out either; the short one is short, and the long one is a lucky length that doesn't help to narrow down the security level used, so could actually be rather short as well. I don't think that known-plaintext attacks are going to be any use unless they're encrypted with the correct key (which, if we actually had it, would render the exercise needless). So we'd need to know that a message was a letter starting "Dear Monika", for example, or somehow induce the sender to include something like "Messages like these can only be broken 0.000000000001842% of the time!" where that run of 0s and the following specific digits would make for really helpful indicators. If this encryption system were tied into e-mail, this would actually be quite easy, given how common it is to quote original messages in replies. But for the current situation, where unknown short messages are encrypted with different keys, the task of breaking the encryption is pretty much a futile one.

Monika
Welcoming Aarvark
Posts: 3661
Joined: Mon Aug 18, 2008 8:03 am UTC
Location: Germany, near Heidelberg

Re: Help me decode this "secret" message

Postby Monika » Mon May 26, 2014 7:38 pm UTC

Thanks a lot obfpen, I hope I will find the solution with all this help!

