[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: IKEv2 use of HMAC-SHA-1 for Key Derivation



Bill,

You are right in your description below of what HMAC does. And surely
enough HMAC is not stronger than its key length. If the key is 160 bits
(as in SHA-1) then you can break the function in 2^160 operations by
exhaustive search. In particular the key derivation procedures of IKEv1
and IKEv2 are NOT STRONGER than the strength of the prf. If you use
HMAC-SHA1 as the prf then the whole key derivation is no stronger than
2^160. If someone finds this level of security insufficient (?) then 
this someone can use longer-keys prf's (e.g. HMAC with SHA2-256).
But then this someone wil have to use (prohibitively costly?) DH groups, 
192-bit key AES, etc. Nothing that (baring major cryptanalytical
breakthroughs) makes sense today or in the visible future in commercial
applications.

Let me also note that the dependncy on the strength of the prf shows 
up already in the definition of SKEYSEED (which is later used as 
K in the key derivation mechanism that you refer to). Indeed, SKEYSEED 
is obtained by hashing down the full DH key into L bits (where L 
is the length of the prf output). A naive observer of this hashing 
could think that it is silly to give up many g^xy bits (say 1024) 
for "just" 160 bits of the prf output. But it is actually the other 
way around: the hashing of the DH key represents a strengthening 
of the DH exchange in the sense that instead of directlty using the long
g^xy key with relatively low "computational entropy" you apply the hashing 
to "extract" and concentrate the ful entropy in a single (shorter) 
key. The theoretical basis for this type of "entropy smoothing" (this 
is the usual technical term) via hash functions can be found in Goldreich's 
book (Foundations of Cryptography). The only heuristic twist in IKE's 
key derivation (v1 and v2) is that the prf is used in lieu of a universal 
hash family (using an explicit UH family would have been better from 
the theoretical point of view but would have required to add yet another 
transform to the protocol).

Hugo

On Mon, 2 Dec 2002, Bill Sommerfeld wrote:

> [note: from my perspective, I don't think the truncation described
> below will be a problem in the near term -- i.e., it's definitely not
> worth holding up IKEv2 to fix; moreover, if you believe
> draft-ietf-ipsec-ike-modp-groups-04.txt, the estimated limits on group
> strength seem to be a weaker link than the key derivation hash..]
> 
> > Hilarie, can you  clarify what you mean by your comment below?
> > In particular, what is wrong if K is large and what 
> > do you mean by "blocksize" in this context?
> 
> I'm not Hilarie but I was in a side discussion on this very issue in
> Atlanta.  I believe she's referring to the output size of the
> underlying compression function.
> 
> (notation: || constitutes concatenation; B is the input block size in
> bytes; L is the output block size.  I'm reusing some of the notation
> from RFC2104 but distinguishing between concatenation and separate
> parameters).
> 
> Hmac is:
> 
> 	HMAC(K, text) == H(K XOR opad || H(K XOR ipad || text))
> 
> [with K padded with zeros to the input blocksize (B) of the hash's
> compression function; ipad and opad are different constants.]
> 
> The underlying hash function H is built out of repeated applications
> of a compression function CF which takes a previous value of size L
> and an input block of size B and produces an output block of size L.
> 
> so, when "text" is short, the inner value:
> 
> 	H1 = H(K XOR ipad || text)
> 
> is computed:
> 
> 	C1 = CF(C0, K XOR ipad)
> 	C2 = CF(C1, text || pad)
> 
> 	H1 = C2
> 	
> and the final value is computed:
> 
> 	H2 = H(K XOR opad || H1)	
> 
> 	D1 = CF(D0, K XOR opad)
> 	D2 = CF(D1, H1 || pad)
> 
> 	H2 = D2
> 
> [other details: C0 and D0 are equal and are the constant initial value
> for the iterated hash; "pad" is the usual MDn/SHAn "fill to near the
> end of the input block then add bitcount"]
> 
> Note that:
> 
>  - the value of both C1 and D1 depend only on the key
>  - the key is not used directly as input in any later stage.
>  - sizeof(C1) == sizeof(D1) == L
> 
> so, this isn't a problem for a single hmac invocation since you only
> have L bytes of output, but in the iterated hmac case:
> 
> >     T1 = HMAC-SHA1(K, S | 0x01)
> >     T2 = HMAC-SHA1(K, T1 | S | 0x02)
> >     T3 = HMAC-SHA1(K, T2 | S | 0x03)
> >     T4 = HMAC-SHA1(K, T3 | S | 0x04)
> 
> .. the intermediate C1/D1 values above are the same for each of
> T1,T2,T3, ...
> 
> if we regard [T1,T2,T3,...] as a keystream, it would seem that there
> can be at most 2^(8*2*L) possible keystreams for a given HMAC, even if
> K is significantly longer than 2*L bytes.
> 
> Hilarie seemed to have a more pessimistic evaluation than this but I'm
> not sure what it is.
> 
> 						- Bill
>