RE: draft-ietf-ipsec-ciph-aes-ctr-00.txt



At 12:44 PM -0700 8/5/02, David A. Mcgrew wrote:
>Steve,
>
>personal remarks elided, your technical comments and my responses below.
>
>Steve Kent wrote:
>>
>>  On the topic of the overhead posed by an explicit vs. implicit IV, I
>>  questioned the 20-byte number you used as an example. Your rebuttal
>>  was vague and in no way countered my observation that the payload
>>  size you cited is a very misleading value. Dave Oran pointed out that
>>  header compression can reduce the IP header overhead, and the ESP
>>  header as well. But the UDP header is inside the encryption boundary,
>>  as are the ESP padding and pad length and NEXT field, and the ESP ICV
>>  is random. So, my point stands, i.e., either you were very careless
>>  in making the comparison between a 20-byte payload and an 8-byte IV,
>>  or you were intentionally skewing the argument.
>
>No.  Perhaps I should have provided more detail when I referred to RFC2507.
>From Section 7.10:
>
>    ... when the ESP Header is used in tunnel mode an entire IP
>    packet is encrypted, and the headers of that packet MAY be compressed
>    before the packet is encrypted at the entry point of the tunnel.
>
>Also, perhaps you are unaware of the work being done to compress headers
>inside of tunnels.  Examples of this work include
>draft-ietf-avt-crtp-enhance-04.txt and draft-ietf-avt-tcrtp-06.txt, and much
>of the work of the Robust Header Compression (ROHC) WG is applicable as
>well.  It's worth noting that compression that can be used in conjunction
>with IPsec is an explicit goal of the ROHC WG.  Many of these technologies
>were originally developed with voice over IP in mind, but can be used to
>general benefit, especially over wireless links.

I see that you elided my personal comments to make room for yours :-)

The question here, David, is not what I or the WG is aware or unaware 
of. The question is why you chose to use an obviously misleading 
figure as a basis for the argument. If you wanted to make a precise 
argument about a realistic size for VoIP packets, before and after 
applying ESP, then you could have done so.  You certainly have no 
trouble devoting considerable text to the math associated with the 
issue of how many blocks of data can be safely encrypted with counter 
mode. What we have seen in subsequent messages on this topic from 
others is that:

	- another protocol was developed specifically to secure VoIP, 
calling into question whether we should bother trying to optimize 
ESP for this application
	- a very limited analysis of the likely, average compression 
that may be obtained for SOME of the headers that you omitted in 
your initial message
	- another message suggesting that, well, maybe there is still 
a desire to have a very efficient version of ESP to use with VoIP 
anyway
	- a message from the IP storage folks saying that 8 bytes is 
not an issue for them

What we still do not have is a concrete analysis of the percentage 
overhead that would be associated with use of an explicit vs. 
implicit IV, under various realistic cases, e.g., no compression, 
outer header compression only, inner and outer header compression, 
for VoIP. This analysis is what you should have provided in your 
message, or in any subsequent message, but it is still missing.
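
To make the request concrete, here is a rough sketch of the kind of 
calculation I have in mind. Every packet and header size below is an 
illustrative assumption (a 20-byte voice payload, RTP over UDP inside 
the tunnel, a 12-byte ICV, compressed headers of a few bytes), not a 
figure taken from the draft or from anyone's message, and ESP padding 
is ignored:

    def packet_bytes(outer_ip, inner_hdrs, payload, iv_len,
                     icv_len=12, esp_spi_seq=8, esp_trailer=2):
        # Total on-the-wire size of one ESP tunnel-mode packet.
        # Padding to the cipher block size is ignored for simplicity.
        return (outer_ip + esp_spi_seq + iv_len + inner_hdrs +
                payload + esp_trailer + icv_len)

    payload = 20  # assumed voice payload size

    cases = {
        # name: (outer IP header, inner IP+UDP+RTP headers), assumed
        "no compression":              (20, 40),
        "outer compression only":      (2, 40),
        "inner and outer compression": (2, 4),
    }

    for name, (outer, inner) in cases.items():
        with_iv = packet_bytes(outer, inner, payload, iv_len=8)
        without_iv = packet_bytes(outer, inner, payload, iv_len=0)
        pct = 100.0 * (with_iv - without_iv) / with_iv
        print(f"{name:30s}: 8-byte IV is {pct:4.1f}% "
              f"of a {with_iv}-byte packet")

Whatever the exact numbers turn out to be, that is the form of 
analysis the WG needs in order to weigh the 8-byte savings.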

>  > With regard to the security evaluation boundary argument, the issue
>>  is exactly that sharing SA state, specifically sequence number
>>  values, across chip or PC board boundaries presents a limitation on
>>  performance. I serve (or have served) as a technical advisor to three
>>  different companies developing high speed crypto chips for IPsec. The
>>  hardware engineers in each company agree with my observation that is
>>  would be a fundamentally bad idea to try to maintain sequence number
>>  sync across chip/board boundaries, for very high speed
>>  implementations.
>
>Steve, I've participated in high speed hardware designs using AES counter
>mode, and we have not seen the limitation that you describe.  I think that
>the best way to get to the root of our disagreement is for you to provide
>more details about the design that you have in mind.  What are the goals to
>which that design is aiming (10 gigabits per second?  100 gigabits per
>second? 1 terabit per second?) and what assumptions does it make (use
>existing chips?  use some particular ASIC technology or particular
>backplane?)?  Why is it necessary to use the same key in different ASICs or
>boards (which is apparently driving your concern about sequence numbers),
>and why is it desirable to do so if those ASICs are within different
>security boundaries?  If it is really necessary to have multiple senders for
>a single key, why not use a short sender-id value?  Is the design that you
>describe in brief a proprietary one or an open one?  If the latter, can you
>refer to an example in the published literature?

I have participated in 10 Gb/s IPsec chip designs, but this issue 
arises at lower speeds as well.

You seem to have lost track of the underlying issue, an issue we 
discussed at length in the design team list, specifically in my 
message of May 17, 2002. At that time I pointed out the problem faced 
by PPVPN and other "aggregating" IPsec implementations, where there 
is a need to mux traffic from one SA over multiple chips/boards, to 
maximize throughput while minimizing hardware costs. Product 
developers trying to do this have discovered that it is infeasible 
(from a performance perspective) to maintain the shared state for 
each SA's sequence number since that state must be updated by 
multiple chips, perhaps on different PC boards, within a product, for 
each packet sent or received on an SA. The WG discussed this issue on 
the list over a year ago, when product developers first encountered 
the problem. The vendors were asking if the sequence numbers could be 
made optional, for the sender, to avoid the problem. The WG said no. 
The WG then strongly recommended managing the sequence numbers for 
each SA before sending the traffic for the SA to a crypto module (as 
viewed from the transmitter's perspective). This is the context that 
motivates managing these values off chip, although others may exist 
as well. This design strategy is consistent with security evaluation 
practices such as 140-1/2, i.e., it does not affect the security 
boundary for the product evaluation so long as the sequence number is 
used only for its originally defined purpose.
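
To make the dispatching point concrete, here is a minimal sketch of 
the arrangement the WG recommended; the names are hypothetical and 
the engine objects merely stand in for the crypto chips, so this is 
not any vendor's design. The sequence number for an SA is assigned 
once, by a single off-chip counter, and the already-numbered packet 
is then handed to whichever crypto engine is free:

    import itertools

    class OutboundSA:
        def __init__(self, spi, engines):
            self.spi = spi
            self.seq = itertools.count(1)  # single per-SA counter
            self.engines = engines         # crypto chips/boards

        def send(self, plaintext):
            # The sequence number is assigned BEFORE dispatch, so no
            # cross-chip synchronization of per-SA state is needed.
            seq = next(self.seq)
            engine = self.engines[seq % len(self.engines)]
            return engine.encrypt(self.spi, seq, plaintext)

The difficulty your draft creates is that this off-chip value would 
now also be a counter mode input, which is exactly what drags it 
inside the security boundary.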

>Additionally, I do not understand your implicit contention that LFSR
>synchronization would be easier to maintain across chip or board boundaries
>than integer synchronization.  The synch problem is fundamental to the
>movement of data, not to the algebraic mechanism by which a next-value is
>generated.

This is not an LFSR-specific issue. This too has been discussed on 
the design team mailing list. Your colleague, Scott Fluhrer, first 
suggested that one could use part of the counter for a per-chip ID, 
to avoid conflicts in the per-packet IV. One of the attractions of 
the compromise design that Russ proposed is that it allows the sender 
to include a per-chip value, if needed, as part of the per-packet IV, 
or to omit it if there is no need to mux an SA over multiple chips. 
This is essentially invisible to the receiver, who does not need to 
examine IV structure. One cannot make use of the same technique with 
the ESP sequence numbers, because the semantics for them require 
sequential generation of values.
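
As an illustration only (the field split below is arbitrary, and the 
receiver never parses it), a sender muxing one SA over several chips 
could build the 64-bit per-packet IV like this:

    CHIP_ID_BITS = 8  # assumed width; 256 chips/boards per SA

    def make_iv(chip_id: int, per_chip_counter: int) -> bytes:
        # 64-bit per-packet IV: chip_id || per-chip packet counter.
        assert chip_id < (1 << CHIP_ID_BITS)
        assert per_chip_counter < (1 << (64 - CHIP_ID_BITS))
        value = (chip_id << (64 - CHIP_ID_BITS)) | per_chip_counter
        return value.to_bytes(8, "big")

    # A sender that never muxes an SA across chips simply uses
    # chip_id = 0, or any other scheme that keeps IVs unique per key.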

>  > Separately, you maintained that your employer had not encountered
>>  any problems in this regard, and I countered that this was probably
>>  because your employer was not building high assurance products, and
>>  thus the security boundary was very large. A check of the FIPS 140-1
>>  evaluated products list confirms my assertion about the assurance
>  > level of all Cisco evaluated products.
>
>The FIPS-140 evaluations of Cisco gear have no bearing on this discussion.
>It's worth pointing out that the approach of using the ESP sequence number
>as a per-packet counter does *not* decrease security, and that you have not
>even argued that it does.  What you have argued is that that approach could
>prove limiting in a high-speed design - *not* that it is less secure.  That
>approach does not limit the security of a system, and does not have any
>bearing on the FIPS-140-2 process.

YOU argued that maintaining ESP sequence numbers (that are used as 
counter mode inputs) within the security boundary was easy, based on 
Cisco's experience. Since the context of the discussion was the 
security issues associated with maintaining these values off chip, 
for the reasons explained (yet again) above, I responded that keeping 
the counters within the security boundary was easy only if one had a 
relatively broad security boundary, consistent with a low level of 
assurance, e.g., FIPS 140-1/2 level 2 or less. So, my observation 
that the evaluation of Cisco IPsec products is at level 2 is 
thoroughly relevant to this discussion, given your initial message on 
the topic.

The use of these sequence numbers as counter inputs almost certainly 
does decrease security assurance, relative to current designs, 
whenever SAs have to be muxed across chips/boards. This is because 
the counters need to be managed off chip under these circumstances, 
for the reasons explained above, again. At best one might be able to 
maintain the currently attainable assurance level if the security 
boundary were extended to encompass the parts of the systems where 
the (now overused) sequence numbers were generated, but almost 
certainly at increased system costs/complexity. I believe we 
discussed this topic in the design team message exchanges earlier 
this year.

>  > Again, your response has not
>>  countered my criticisms. Instead you asked what vendors didn't
>>  maintain the sequence number within the security perimeter. That's
>>  not the point. The point is that, historically, there has been no
>>  need to maintain the ESP sequence counter within the perimeter for
>>  FIPS 140-1.
>
>RFCs 2401 and 2406 state that "protection against replays" is one of the
>goals of IPsec and that ESP provides that service by using monotonically
>increasing sequence numbers.   Given this fact, how can an implementer *not*
>put the ESP sequence number within the security perimeter?
>
>	<SNIP>
>
>>  But, reusing it for counter mode creates that need and
>>  adversely limits the design space for higher assurance products.
>
>I do not believe that the use of a monotonically increasing sequence number
>limits the design space.  This is apparently the real source of our
>disagreements.
>
>  > The above argument segues into the disagreement re the pitfalls of
>  > reusing ESP sequence numbers for counter mode, and what I maintain is
>>  an irrelevant discussion of other crypto modes.  You claim to not
>>  understand the security difference between sequence numbers used,
>>  optionally, for anti-replay, and unique values used as inputs to a
>>  crypto algorithm.
>
>No, what I don't understand is the contention that the current standards do
>not mandate the uniqueness of the ESP sequence number.
>
>  > The distinction is that the former use is of
>>  secondary security importance, as evidenced by the fact that it is an
>>  optional feature (for the receiver) and that it is outside the
>>  security evaluation boundary under 140-1/2. In contrast, the inputs
>>  for counter mode are security critical values that fall within the
>>  evaluation boundary for 140-1/2.
>
>Sure, but the fact that anti-replay may be of lesser importance than
>confidentiality has no bearing on the fact that RFCs mandate the uniqueness
>of the ESP sequence numbers.
>
>
>  > As is so often the case, the term
>>  "trust" has no relevance here. The ESP sequence number plays a
>>  well-defined role, and reusing it for another purpose is a bad design
>>  approach, from a security standpoint.
>
>Including the ESP sequence number inside the security boundary *increases*
>security.

What the preceding text says is that you really do not understand 
FIPS 140-1/2! The security concerns for ESP sequence number values 
are NOT the same as for crypto inputs such as IVs.  This is not a 
matter of opinion; it is a clear-cut difference in the evaluation 
criteria under FIPS 140-1/2.  Ask anyone who is familiar with the 
criteria.

Today, if an IPsec chip manages the sequence number on chip, the 
assurance level is potentially quite high, but this is irrelevant to 
FIPS evaluation, so long as the sequence number is NOT used as an 
input to the crypto function as an IV or equivalent. But, we have 
identified a set of scenarios where it is not feasible to manage the 
sequence number on chip, as explained above.  So, the WG suggested, 
over a year ago, that designers adopt a strategy that will require 
moving the sequence number off chip, under some circumstances. This 
posed no problem re basic crypto security, UNTIL you proposed using 
this sequence number as an input to counter mode. That adversely 
affects the security boundary and the assurance for products.

>  > With regard to the use of counters for per-packet and intra-packet
>  > inputs to counter mode, your response did not rebut my observation
>>  that the 2-fold difference in adder size was an issue ignored by your
>>  initial claim.
>
>Fine, but that has nothing to do with the point that I was making: the
>performance advantage of a 64-bit LFSR over a 64-bit integer increment
>function is completely negligible when compared to the computational cost of
>AES-encrypting a packet.  This is an important point because
>draft-ietf-ipsec-ciph-aes-ctr-00.txt uses that performance advantage as a
>false premise in support of an argument as to why an implementer might want
>to use an LFSR rather than counter mode.

The issue here is a subtle one, but the argument in Russ' ID is not 
incorrect, although it may be incomplete. Consider an IPsec device 
optimized for your favorite application, VoIP. Let's say that the 
encrypted payload, including all protocol headers and the ESP trailer 
is always between 16 and 32 bytes. I could design an IPsec device 
using a crypto chip that contains two AES cores. (Many high end 
crypto chips today employ multiple algorithm cores.)  When I receive 
an outbound packet for an SA, I know I will need no more than  32 
bytes of key stream, because of the limited application context for 
which I have optimized the device. I select the key for the SA and I 
generate a new, per-packet IV and use it to create key stream for 
both AES cores, in parallel, with the intra-packet counter values set 
to "0" and "1" (maybe hardwired). Now I can begin to encrypt that 
packet. When I am one ROUND into the encryption process, I am now 
ready to begin encrypting a new packet, perhaps for another SA. So, 
the time to generate the new per-packet IV might be bounded by the 
time to execute one ROUND of AES, rather than by a full, 12-round 
encryption process, as you suggest. This is one example where the 
time for per-packet IV generation might be as critical as the 
intra-packet counter update. I suspect there are others.
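
For whatever it is worth, here is a software sketch of the keystream 
step I am describing. The counter block is assumed to be a per-SA 
nonce, the 64-bit per-packet IV, and a 32-bit intra-packet counter 
(the draft fixes the exact layout and starting value), the random 
values merely stand in for the nonce and IV a real sender would 
maintain, and the two "cores" are simulated sequentially here rather 
than running in parallel hardware:

    import os
    from cryptography.hazmat.primitives.ciphers import (
        Cipher, algorithms, modes)

    def keystream_32_bytes(key, nonce, per_packet_iv):
        # Two counter blocks, one per AES core, yield 32 bytes of
        # key stream, enough for the 16-32 byte payloads assumed above.
        ecb = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
        blocks = []
        for intra_packet_ctr in (0, 1):  # one value per core
            block = (nonce + per_packet_iv +
                     intra_packet_ctr.to_bytes(4, "big"))
            blocks.append(ecb.update(block))
        return b"".join(blocks)

    key = os.urandom(16)
    ks = keystream_32_bytes(key, os.urandom(4), os.urandom(8))
    assert len(ks) == 32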

>  > You did respond to my observations about the greater flexibility
>  > afforded to a product designer by a per-packet input approach that
>>  allows the sender to choose whatever means of generating the value
>>  that meets the security and performance requirements for a product.
>>  Your response was an assertion that the 8-byte overhead of an
>>  explicit IV is not worth the flexibility. This is a value judgement,
>>  not a technical argument. However, I am annoyed by the way you tend
>>  to couch the argument, i.e., as though this is an extra 8 bytes,
>>  whereas the reality is that this is the same 8-byte overhead that has
>>  been the default for ESP (in CBC mode) since it became a standard. So the
>>  question is not whether to add 8 bytes, but whether there is a
>>  pressing need to remove 8 bytes.
>
>Yes, reducing the encapsulation overhead of ESP while maintaining security
>is a worthwhile goal, in my opinion.

Your proposal reduces security, in a range of real system 
implementations, in exchange for an 8-byte savings. We still don't 
have a firm percentage overhead we can assign to that 8-byte savings, 
in your favorite case, but we do know that it is tiny in most other 
contexts.

You have argued for a very conservative bound on the number of 
packets that can be encrypted per SA, because you understand the 
potential adverse, theoretical consequences of sending more than 2^64 
blocks, even though you cannot describe an attack that would exploit 
this potential weakness.  On the other hand, experience has shown 
that the vast majority of security failures in deployed crypto 
systems arise not from subtle crypto vulnerabilities of the sort you 
are focusing on, but from system design and implementation 
vulnerabilities. My emphasis on not reusing the ESP sequence number, 
to save 8 bytes, is based on that latter set of concerns and a desire 
to offer an architecture that preserves maximum implementation and 
performance flexibility for developers, while not degrading security 
assurance.
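
For reference, the theoretical concern behind the 2^64-block figure 
is just the standard distinguishing bound for a 128-bit block cipher 
used as a keystream generator. The calculation below is my own 
framing of that bound, not something from this thread, and it 
describes an advantage bound, not a concrete attack:

    import math
    from fractions import Fraction

    def prp_prf_bound(q_blocks, block_bits=128):
        # Standard PRP/PRF switching bound: q*(q-1) / 2^(block_bits+1),
        # an upper bound on the distinguishing advantage after q blocks
        # are encrypted under one key.
        return Fraction(q_blocks * (q_blocks - 1),
                        2 ** (block_bits + 1))

    for log_q in (32, 48, 60, 64):
        adv = float(prp_prf_bound(2 ** log_q))
        print(f"2^{log_q} blocks under one key: "
              f"advantage bound ~= 2^{math.log2(adv):.1f}")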

Steve