[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Suggestions on keyed-MD5



Here is my suggestion on how to proceed with the issue of keyed-MD5.
(The busy people (like me ;-) can read only the first part
and skip the more technical Notes at the end.)

As for background: my suggestions below are based on very recent
analysis work done by Mihir Bellare, Ran Canetti and myself.
(We will make the paper available as soon as it is written, here I
am just mentioning some conclusions, and I allow myself some inaccuracies
in the arguments to avoid being too technical).

If I need to give a fast and easy to understand answer I would say
go to MD5(K,text,K) (which, for example, is not subject to attacks
via collisions-finding while Burt Kaliski's suggestion is - see
Note 1 below)

However, our analysis suggests a better construction.
The later is related to Burt's suggestion
but with some (significant) differences.

Burt suggested:

              MD5(MD5(text),K)

We suggest:

              MD5_K(MD5_K(text))

which is very similar to Burt's EXCEPT:

     the key K is used (not to apppend or prepend to the text but)
     as the INITIAL value of the registers ABCD in the computation
     of MD5 (subsequent iterations of MD5 are untouched).

This suggestion has some significant security advantages in reducing the
risk due to potential vulnerabilities of MD5 (see Note 2 -
please read it if you really want to judge this proposal).

It also has a minor performance advantage: no need to append or
prepend text (this may, in some cases, save one MD5 application -
no big deal).

However, it also
requires that the code of MD5 is (slightly) changed: instead of
initiallizing ABCD to MD5's "magic numbers", you need to read
into them the value of the key.
This is a really minor change in the code (the
iterations are untouched, only the initialization changed) but still
a change. IMHO, that's enough to require a separate RFC to define it.

CONCLUSION:

My suggestion is to have in the current draft MD5(K,text,K) which is
the best approximation to the above MD5_K that does not requires
code changes to MD5 (it is also backed by many people in this list).
In addition, there should be a pointer in the draft
to an RFC that defines keyed-MD5 and that implementators should check.
Such an RFC is being prepared (to the best of my knowledge) by Jim
Galvin and Burt Kaliski.
Whether the above MD5_K suggestion is acceptable should be judged by
the RFC authors (Burt and Jim) and by the IETF, and reflected in that RFC.
(Hopefully, that RFC will be ready by the time the IPSP drafts
become a standard).

As a final remark, all of the above applies to SHA (and other
hash functions that work iteratively), therefore if relevant weaknesses
of MD5 are found in the future one can move to another hash function,
without modifying the basic scheme.

Hugo

Note 1: A fast and inacurate justification of MD5(K,text,K) as opposed
        to MD5(MD5(text),K) is that the later is very close
- in its actual security - to MD5(text, K). In particular,
both are broken if you find collisions in MD5. (A more ellaborated
analysis is possible but "beyond the scope" of this message)

That is, MD5(K,text,K) gives you most of the security of MD5(text,K)
but also added security like resistance to collisions in MD5.

The issue of resistance to collisions is IMPORTANT. It makes the
current cryptanalysis work on the MD-family, e.g. collisions on the
compression function, insufficient to attack the keyed version.
Also (in a more subtle/paranoid point),
whoever will build a collision finding machine (a la van OOrschot-Wiener)
for MD5 shouldn't be able to use it against keyed-MD5. Notice that
breaking MD5 via collisions is much valuable than breaking
a message authentication function. The former can be used
to change long-term contracts, the later looses any value once
the key expires.

Last, but not least. MD5(K,text,K) represents
a better approximation to the MD5_K construction
whose advantages are discussed in Note 2.

Note 2: Security advantages of MD5_K(MD5_K(text))
       (this is a rough sketch, and follows the mentioned paper in
        preparation)

   * Even if collisions in MD5 are found the message authentication is
     NOT BROKEN (Burt's suggestion is insecure given collisions)

   * If ONE iteration of MD5 (i.e. its compression function) with random
     ABCD is pseudorandom then the whole construction IS SECURE for
     message authentication.
     Roughly speaking, pseudorandom means that
     the output of the function "looks random", i.e. behaves as a
     good pseudorandom generator. This property is implicitly or
     explicitly assumed in uses of MD5 to generate keys
     (as SSL, MKMP, Phil Karn and many other have suggested).
     In the case of SHA, its pseudorandomness properties are
     suggested by the designers themselves (it is suggested as a
     pseudorandom generator for DSS).

   * If MD5 is collisions free and its compression function is a good
     MAC (i.e., unforgeable even after seeing many input-output pairs)
     the construction is secure.

Burt's suggestion is based on (a variant) of the last assumption,
while ours will work in this case but also under the first two cases.
Namely, in order to break our construction one needs to break (keyed)
MD5 as a pseudorandom function AND as a collision free function, OR
the compression function needs to be broken as a MAC (in the later case
no hope is left to base message authentication in MD5).