[vox-tech] shell script challenge - Now MD5sum errata
Charles Polisher
vox-tech@lists.lugod.org
Thu, 8 Aug 2002 12:35:06 -0700
Micah Cowan wrote:
> GNU Linux writes:
> > Found a very interesting page on md5sum. It's:
> >
> > http://hills.ccsf.org/~jharri01/project.html
> >
> > "So why does MD5 seem so secure? Because 128 bits allows you to have
> > 2^128=340,282,366,920,938,463,463,374,607,431,768,211,456 different
> > possible MD5 codes"
> >
> > Lots of good reading for insomniacs.
>
> It still shouldn't be relied upon, however, that two identical MD5
> checksums are sufficient evidence that the corresponding files are
> identical; I've heard more than one person claim to have encountered
> identical MD5 sums for different files, and it's certainly not
> impossible, just improbable.
I'm dubious ;^)
<H. Lecter voice>
The voices tell you they've seen MD5s collide; do they
tell you other things, Micah?
</H. Lecter voice>
> But it's a heluva lot better than running diff from one file to every
> other file - a factorial-time operation! :)
And now, a quibble:
Actually, a comparison of the entire file would be
no different from a comparison of the md5sum. From a
Big O standpoint, it's just a constant factor. If
comparing md5sums isn't factorial, neither is a full
diff. If there were a ton of files and the lengths were
spread out, the comparisons could be further reduced
by sorting the list by file size, then comparing only
among groups of the same size.
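
The size-grouping idea above can be sketched roughly as follows.
This is only an illustration in Python, not anything posted in the
thread: the function name find_duplicates is made up for the
example, and it assumes the standard library's filecmp.cmp (with
shallow=False) is an acceptable stand-in for a full diff.

```python
import filecmp
import os
from collections import defaultdict


def find_duplicates(paths):
    """Return lists of paths whose contents are byte-for-byte identical.

    Hypothetical helper, written only to illustrate the idea.
    """
    # Bucket by size first: files of different sizes can never be equal.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size needs no content comparison at all
        # Within a bucket, gather files into classes of identical content.
        classes = []
        for path in same_size:
            for cls in classes:
                # shallow=False forces a real content comparison.
                if filecmp.cmp(cls[0], path, shallow=False):
                    cls.append(path)
                    break
            else:
                classes.append([path])
        duplicates.extend(cls for cls in classes if len(cls) > 1)
    return duplicates
```

Even in the worst case (every file the same size, every file
different) this makes at most n(n-1)/2 comparisons for n files,
which is quadratic rather than factorial. Files with a unique
size are never opened at all.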
--
Fscking Pedants. I mean that in the nicest possible way of course.