32 hex character strings md5 hashing to themselves
Moderators: phlip, Moderators General, Prelates
32 hex character strings md5 hashing to themselves
Is it possible for a 32-hex-character-string to md5 hash to... itself?
Had this thought while on #stats. Anyone know if this is possible?
Had this thought while on #stats. Anyone know if this is possible?
Last edited by Blipo on Thu Nov 22, 2007 3:35 am UTC, edited 1 time in total.
Re: 32 hex character strings hashing to themselves
What's your hash function?
Re: 32 hex character strings hashing to themselves
Rysto wrote:What's your hash function?
Whoa. Fix'd. =\
-
- Posts: 1478
- Joined: Sun Nov 05, 2006 12:49 am UTC
Re: 32 hex character strings md5 hashing to themselves
md5 you say?
well I guess there's only one way to find out. There's a finite number of possible strings. Get crackin.
well I guess there's only one way to find out. There's a finite number of possible strings. Get crackin.
Re: 32 hex character strings md5 hashing to themselves
joeframbach wrote:md5 you say?
well I guess there's only one way to find out. There's a finite number of possible strings. Get crackin.
It'd take rather more time than I have to figure this out, and I don't have nearly enough talent to write code to check for me.
Anyways, it's hypothetical. Is there anything preventing it from happening?
Re: 32 hex character strings md5 hashing to themselves
There are 16^32 possible strings like that.
A 128 bit md5 has has 16^32 (2^128) possible values.
If we pretend md5 is totally random, the odds of the first string ("0000000000000000") hashing to itself is 1/(16^32). The second string ("0000000000000001") also has a 1/(16^32) chance of hashing to itself.
So the odds of no string hashing to itself is (1-1/(16^32))^(16^32), which works out to.. uh.. something like 0.37.
In other words there probably is a md5 hash that hashes to itself. And would someone please double check my math?
A 128 bit md5 has has 16^32 (2^128) possible values.
If we pretend md5 is totally random, the odds of the first string ("0000000000000000") hashing to itself is 1/(16^32). The second string ("0000000000000001") also has a 1/(16^32) chance of hashing to itself.
So the odds of no string hashing to itself is (1-1/(16^32))^(16^32), which works out to.. uh.. something like 0.37.
In other words there probably is a md5 hash that hashes to itself. And would someone please double check my math?

Re: 32 hex character strings md5 hashing to themselves
I got the same. The chances seem to come out to about 1/e.
- Yakk
- Poster with most posts but no title.
- Posts: 11073
- Joined: Sat Jan 27, 2007 7:27 pm UTC
- Location: E pur si muove
Re: 32 hex character strings md5 hashing to themselves
Just to note: there are hash functions that do not hash to themselves. The hash function "+5" doesn't, as an example. 
Write or find an md5 function, and that should do it.

Code: Select all
#include <vector>
#include <assert.h>
static bool progress_report = true;
static unsigned int progress_freq = 100000;
typedef unsigned int (*hash_function)( unsigned char* data, size_t length );
bool find_to_itself( hash_function hash, unsigned int* result, size_t min_length_ = 0 ) {
unsigned int min_length = (min_length_<sizeof(unsigned int))?sizeof(unsigned int):min_length);
std::vector<unsigned char> buff(min_length);
unsigned char* buff_ptr = &buff.front();
unsigned int* data_ptr = reinterpret_cast<unsigned int*>(buff_ptr);
if (!data_ptr) {
assert(false);
return false;
}
unsigned int& data = *data_ptr;
data = 0;
unsigned int hash_result = hash_function( &buff.front(), min_length );
if (hash_result == data) {
if (result) *result = data;
return true;
}
while( (++data) != 0 ) {
hash_result = hash_function( &buff.front(), min_length );
if (hash_result == data) {
if (result) *result = data;
return true;
}
if (progress_report) {
if ((progress_freq % data) == 0) {
printf("Progress: up to %d checked\n", data);
}
}
}
return false;
}
void main() {
unsigned int result = 0;
bool found = find_to_itself( md5hash, &result );
if (found) {
printf("Found: %d hashes to itself\n", result);
} else {
printf("No self-hash found\n");
}
}
Write or find an md5 function, and that should do it.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Re: 32 hex character strings md5 hashing to themselves
There are 2^128 different possible strings. Surely that would take a prohibitively long time to check.
- Yakk
- Poster with most posts but no title.
- Posts: 11073
- Joined: Sat Jan 27, 2007 7:27 pm UTC
- Location: E pur si muove
Re: 32 hex character strings md5 hashing to themselves
sizeof(unsigned int) is usually 4, and char's usually have 8 bits.
So that's 2^32 different unsigned ints to check, which is about 4 billion. I'd be shocked if the above program took more than a week to run, and if you remove the printf's and do some optimization you should be able to do this on the order of hours (if not minutes or seconds) on a modern consumer computer.
The real problem becomes "how expensive is generating an md5 checksum". And the easiest way to check that is to run the above program and time how long it takes to hit 1 million md5s. Multiply that by 4000, and that is your estimated time of computation.
So that's 2^32 different unsigned ints to check, which is about 4 billion. I'd be shocked if the above program took more than a week to run, and if you remove the printf's and do some optimization you should be able to do this on the order of hours (if not minutes or seconds) on a modern consumer computer.
The real problem becomes "how expensive is generating an md5 checksum". And the easiest way to check that is to run the above program and time how long it takes to hit 1 million md5s. Multiply that by 4000, and that is your estimated time of computation.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
- Yakk
- Poster with most posts but no title.
- Posts: 11073
- Joined: Sat Jan 27, 2007 7:27 pm UTC
- Location: E pur si muove
Re: 32 hex character strings md5 hashing to themselves
Oh god, I'm sorry. I misread the original problem.
32 character hex string? Ahhhh! PAIN!
Ya, that's far to large to test exhastively.
I thought he said 32 bit number.
32 character hex string? Ahhhh! PAIN!
Ya, that's far to large to test exhastively.
I thought he said 32 bit number.

One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
- phlip
- Restorer of Worlds
- Posts: 7550
- Joined: Sat Sep 23, 2006 3:56 am UTC
- Location: Australia
- Contact:
Re: 32 hex character strings md5 hashing to themselves
Now the real question is whether you're trying to hash the 16 bytes of binary data, or 32 bytes of that data as hexadecimal in ASCII... and in that latter case, whether it's uppercase or lowercase... or whether you'll accept a hex string with mixed cases that still hashes to itself... or maybe EBCDIC is OK...
Better test them all, just to be sure.
Better test them all, just to be sure.
Code: Select all
enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵╰(ಠ_ಠ ⚠) {exit((int)⚠);}
- Yakk
- Poster with most posts but no title.
- Posts: 11073
- Joined: Sat Jan 27, 2007 7:27 pm UTC
- Location: E pur si muove
Re: 32 hex character strings md5 hashing to themselves
Oh ya: regardless of the medium, if the 32 bit result equals the input, the input must fit in 32 bits of freedom.
So it is only a 2^32 search, which is quite doable. My code should work, you just have to have the hash function take the integer input, turn it into the format you want to CRC from, then do the CRC32, and return the result as a 32 bit int.
So it is only a 2^32 search, which is quite doable. My code should work, you just have to have the hash function take the integer input, turn it into the format you want to CRC from, then do the CRC32, and return the result as a 32 bit int.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Re: 32 hex character strings md5 hashing to themselves
SHA-5 gives a 128-bit hash.
Re: 32 hex character strings md5 hashing to themselves
In case you are curious, the md5 hash function introduces some permutation on 32 hex character strings, as you realized. You are asking the question if the md5 induced permutation is (not) a derangement. You can read about them on wikipedia here.
http://en.wikipedia.org/wiki/Derangement
http://en.wikipedia.org/wiki/Derangement
Re: 32 hex character strings md5 hashing to themselves
i find taht very improbable, when you look how md5 works. First it makes a block of 512bits, then makes 16blocks of 32bits, then makes some nuts functions, rotations and XORing on that. You'd have to have mighty luck to find such a hash... and i still think it doesn't exist and i'm just not able to se how to prove it by looking at md5...
42
--
If God intended us to program we would be born with serial I/O ports.
--
If God intended us to program we would be born with serial I/O ports.
Re: 32 hex character strings md5 hashing to themselves
Simple.
Done.
You'll be wanting a supercomputer to do this quickly though.
- First, generate all possible MD5 hashes
Second, hash them all.
Third, compare the two for any items that match themselves
Done.
You'll be wanting a supercomputer to do this quickly though.
"Wile E. Coyote was a theoretical mathematician." - Leliel
"Modern life can be so boring without elements of the bizarre or the fantastical. Hence, we have steampunk." - Me
"Modern life can be so boring without elements of the bizarre or the fantastical. Hence, we have steampunk." - Me
Re: 32 hex character strings md5 hashing to themselves
xyzzy wrote:Simple.First, generate all possible MD5 hashes
Second, hash them all.
Third, compare the two for any items that match themselves
Done.
You'll be wanting a supercomputer to do this quickly though.
many of those i think...
42
--
If God intended us to program we would be born with serial I/O ports.
--
If God intended us to program we would be born with serial I/O ports.
- Yakk
- Poster with most posts but no title.
- Posts: 11073
- Joined: Sat Jan 27, 2007 7:27 pm UTC
- Location: E pur si muove
Re: 32 hex character strings md5 hashing to themselves
Rysto wrote:SHA-5 gives a 128-bit hash.
That is what I get for having CRC-32 on the brain.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
Who is online
Users browsing this forum: No registered users and 3 guests