MD5CRK

In cryptography, MD5CRK was a volunteer computing effort (similar to distributed.net) launched by Jean-Luc Cooke and his company, CertainKey Cryptosystems, to demonstrate that the MD5 message digest algorithm is insecure by finding a collision – two messages that produce the same MD5 hash. The project went live on March 1, 2004. The project ended on August 24, 2004 after researchers independently demonstrated a technique for generating collisions in MD5 using analytical methods by Xiaoyun Wang, Feng, Xuejia Lai, and Yu.[1] CertainKey awarded a 10,000 Canadian Dollar prize to Wang, Feng, Lai and Yu for their discovery.[2]

Pollard's Rho collision search for a single path

A technique called Floyd's cycle-finding algorithm was used to try to find a collision for MD5. The algorithm can be described by analogy with a random walk. Using the principle that any function with a finite number of possible outputs placed in a feedback loop will cycle, one can use a relatively small amount of memory to store outputs with particular structures and use them as "markers" to better detect when a marker has been "passed" before. These markers are called distinguished points, the point where two inputs produce the same output is called a collision point. MD5CRK considered any point whose first 32 bits were zeroes to be a distinguished point.

Complexity

The expected time to find a collision is not equal to 2 N {\displaystyle 2^{N}} where N {\displaystyle N} is the number of bits in the digest output. It is in fact 2 N ! ( 2 N K ) ! × 2 N K {\displaystyle 2^{N}! \over {(2^{N}-K)!\times {2^{N}}^{K}}} , where K {\displaystyle K} is the number of function outputs collected.

For this project, the probability of success after K {\displaystyle K} MD5 computations can be approximated by: 1 1 e K × ( 1 K ) 2 N + 1 {\displaystyle 1 \over {1-e^{K\times (1-K) \over 2^{N+1}}}} .

The expected number of computations required to produce a collision in the 128-bit MD5 message digest function is thus: 1.17741 × 2 N / 2 = 1.17741 × 2 64 {\displaystyle {1.17741\times 2^{N/2}}={1.17741\times 2^{64}}}

To give some perspective to this, using Virginia Tech's System X with a maximum performance of 12.25 Teraflops, it would take approximately 2.17 × 10 19 / 12.25 × 10 12 1 , 770 , 000 {\displaystyle {2.17\times 10^{19}/12.25\times 10^{12}\approx 1,770,000}} seconds or about 3 weeks. Or for commodity processors at 2 gigaflops it would take 6,000 machines approximately the same amount of time.

See also

References

  1. ^ Xiaoyun Wang; Dengguo Feng; Xuejia Lai; Hongbo Yu (17 August 2004). "Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD". Cryptology ePrint Archive.
  2. ^ "Popular, Yet Obsolete, Banking Algorithm Broken". CertainKey News (Press release). 17 February 2005. Archived from the original on 13 May 2011.

Further reading

  • Paul C. van Oorschot; Michael J. Wiener. Parallel Collision Search with Application to Hash Functions and Discrete Logarithms (PDF). ACM Conference on Computer and Communications Security 1994. pp. 210–218.