
Counterpane comments, ASCII version



My annotations are in brackets.


Steve

----------


\chapter*{Executive summary}

IPsec is a set of protocols that provides communication security for computers using IP-based communication networks. It provides authentication and confidentiality services on a packet level. To support IPsec, a key management protocol called ISAKMP is used. ISAKMP uses public-key cryptographic techniques to set up keys between the different parties to be used with IPsec.


Both IPsec and ISAKMP are too complex. [a protocol is too complex only relative to a specified set of requirements that are satisfied by a simpler protocol. To substantiate this observation, one ought to define the requirements that one believes the protocol is trying to satisfy, and then offer a simpler protocol.] This high complexity leads to errors. We have found security flaws in both IPsec and ISAKMP, and expect that there are many more. We expect any actual implementation to contain many more errors, some of which will cause security weaknesses. These protocols give the impression of having been designed by a committee: they try to be everything for everybody at the cost of complexity. For normal standards, that is bad enough; for security systems, it is catastrophic. In our opinion, the complexity of both IPsec and ISAKMP can be reduced by a large factor without a significant loss of functionality.


IPsec is in better shape than ISAKMP. The description and definitions are reasonably clear. A careful implementation of IPsec can achieve a good level of security. Unfortunately, IPsec by itself is not a very useful protocol. Use on a large scale requires the key management functions of ISAKMP. [while I would tend to agree with this observation, I should note that a non-trivial number of IPsec implementations, used in constrained contexts, are manually keyed.]


ISAKMP is currently not in a suitable state for implementation. Major work will be required to get it to that point. There are many security-critical errors, as well as many unnecessary cross-dependencies within the protocol. These should all be eliminated before a new evaluation is done.


Based on our analysis, we recommend that IPsec and ISAKMP not be used for confidential information. At the moment we cannot recommend a direct alternative. Some applications might be able to use SSL \cite{SSLv3Nov96}, which in our opinion is a much better protocol that provides a much higher level of security when used appropriately.


\tableofcontents


\chapter{Introduction}


At the request of NSA, Counterpane has conducted a security review of the IPsec and ISAKMP security protocols.


This evaluation is based on RFCs 2401--2411 and RFC 2451 \cite{RFC2401,RFC2402,RFC2403,RFC2404,RFC2405,RFC2406,RFC2407,RFC2408,RFC2409,RFC2410,RFC2411,RFC2451}. The Oakley protocol \cite{RFC2412} is only an informational RFC; it is not part of the standard and is not used in ISAKMP. RFC documents are available from {\tt ftp:\slash\slash ftp.isi.edu\slash in-notes\slash rfc{\it n}.txt}, where {\it n} is the RFC number.


As \cite{RFC2401} states: ``The suite of IPsec protocols and associated default algorithms are designed to provide high quality security for Internet traffic. However, the security offered by use of these protocols ultimately depends on the quality of their implementation, which is outside the scope of this set of standards. Moreover, the security of a computer system or network is a function of many factors, including personnel, physical, procedural, compromising emanations, and computer security practices. Thus IPsec is only one part of an overall system security architecture.'' This evaluation only deals with the IPsec and ISAKMP specifications and is not directly concerned with any of the other factors. However, we do comment on aspects of the specifications that affect other security factors.


IPsec and ISAKMP are highly complex systems. Unfortunately, we cannot give a sufficiently detailed description of these systems in this document to allow the reader to understand our comments without being familiar with IPsec and ISAKMP. Our comments frequently refer to specific places in the RFC documents for ease of reference.


The rest of this report is structured as follows. Chapter~\ref{chap:general} gives some general comments. Chapter~\ref{chap:bulk} discusses the IPsec protocols that handle bulk data. Chapter~\ref{chap:ISAKMP} discusses the ISAKMP generic definitions. Chapter~\ref{chap:IPsecDOI} talks about the IPsec Domain of Interpretation, which gives more details on how the generic ISAKMP structure applies to the IPsec protocols. Finally, Chapter~\ref{chap:IKE} discusses the IKE protocol, which is the default key management protocol used with ISAKMP.


\chapter{General comments}\label{chap:general}


\section{Complexity}


Complexity is the biggest enemy of security. This might seem an odd statement in the light of the many fielded systems that exhibit critical security failures for very simple reasons. It is true nonetheless. The simple failures are simple to avoid, and often simple to fix. The problem is not that we do not know how to solve them; it is that this knowledge is often not applied. Complexity, however, is a different beast because we do not really know how to handle it.


Designing any software system is always a matter of weighing various requirements. These include functionality, efficiency, political acceptability, security, backward compatibility, deadlines, flexibility, ease of use, and many more. The unspoken requirement is often complexity. If the system gets too complex, it becomes too difficult, and therefore too expensive, to make. As fulfilling more of the requirements usually involves a more complex design, many systems end up with a design that is as complex as the designers and implementors can reasonably handle.


Virtually all software is developed using a try-and-fix methodology. Small pieces are implemented, tested, fixed, and tested again.\footnote{Usually several iterations are required.} Several of these small pieces are combined into a larger module, and this module is tested, fixed, and tested again. The end result is software that more or less functions as expected, although we are all familiar with the high frequency of functional failures of software systems.


This process of making fairly complex systems and implementing them with a try-and-fix methodology has a devastating effect on security. The central reason is that you cannot test for security. Therefore, security bugs are not detected during the development process in the same way that functional bugs are. Suppose a reasonably sized program is developed without any testing at all during development and quality control. We feel confident in stating that the result will be a completely useless program; most likely it will not perform any of the desired functions correctly. Yet this is exactly what we get from the try-and-fix methodology when we look at security.


The only reasonable way to ``test'' the security of a security product is to perform security reviews on it.\footnote{A cracking contest can be seen as a cheap way of getting other people to do a security analysis. The big problem is interpreting the results. If the prize is not claimed, it does not imply that any competent analysis was done and came up empty.} A security review is a manual process; it is relatively expensive in terms of time and effort, and it will never be able to show that the product is in fact secure. [this seems to ignore the approaches usually employed for high assurance system design and implementation, i.e., careful design and review coupled with rigid development procedures, all prior to testing.]


The more complex the system is, the harder a security evaluation becomes. A more complex system will have more security-related errors in the specification, design, and implementation. We claim that the number of errors and the difficulty of the evaluation are not linear functions of the complexity, but in fact grow much faster.


For the sake of simplicity, let us assume the system has $n$ different options, each with two possible choices.\footnote{We use $n$ as the measure of the complexity. This seems reasonable, as the length of the system specification and the implementation is proportional to $n$.} Then there are $n(n-1)/2 = O(n^2)$ different pairs of options that could interact in unexpected ways, and $2^n$ different configurations altogether. Each possible interaction can lead to a security weakness, and the number of possible complex interactions that involve several options is huge. As each interaction can produce a security weakness, we expect that the number of actual security weaknesses grows very rapidly with increasing complexity.
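To make the growth concrete using only the formulas above: for $n = 20$ binary options there are already $20 \cdot 19 / 2 = 190$ pairwise interactions to consider and $2^{20} > 10^6$ distinct configurations; doubling to $n = 40$ gives $780$ pairs and $2^{40} \approx 10^{12}$ configurations, far beyond what any evaluation can enumerate.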


The same holds for the security evaluation. For a system with a moderate number of options, checking all the interactions becomes a huge amount of work. Checking every possible configuration is effectively impossible. Thus the difficulty of performing security evaluations also grows very rapidly with increasing complexity. The combination of additional (potential) weaknesses and a more difficult security analysis unavoidably results in insecure systems.


In actual systems, the situation is not quite so bad; there are often options that are ``orthogonal'' in that they have no relation or interaction with each other. This occurs, for example, if the options are on different layers in the communication system, and the layers are separated by a well-defined interface that does not ``show'' the options on either side. For this very reason, such a separation of a system into relatively independent modules with clearly defined interfaces is a hallmark of good design. Good modularization can dramatically reduce the ``effective'' complexity of a system without the need to eliminate important features. Options within a single module can of course still have interactions that need to be analyzed, so the number of options per module should be minimized. Modularization works well when used properly, but most actual systems still include cross-dependencies where options in different modules do affect each other.


A more complex system loses on all fronts. It contains more weaknesses to start with, it is much harder to analyze, and it is much harder to implement without introducing security-critical errors in the implementation.


Complexity not only makes it virtually impossible to create a secure implementation, it also makes the system extremely hard to manage. The people running the actual system typically do not have a thorough understanding of the security issues involved. Configuration options should therefore be kept to a minimum, and the options should provide a very simple model to the user. Complex combinations of options are very likely to be configured erroneously, which results in a loss of security. The stories in \cite{TheCodebreakers} and \cite{A:WhyFail} illustrate how management of complex systems is often the weakest link.


Both IPsec and ISAKMP are too complex to be secure. The design obviously tries to support many different situations with different options. We feel very strongly that the resulting system is well beyond the level of complexity that can be implemented securely with current methodologies.



\section{Stating what is achieved}

A security analysis evaluates the security aspects of a system. To be able to give any sensible answer, it should be clear what properties the system claims to have. That is, the system documentation should clearly state what security properties are achieved. This can be seen as the functional specification of the security properties. This applies not only to the entire system, but also to the individual modules. For each module or function, the security properties should be specified.


A good comparison is the testing of a product. The testing verifies that the product performs according to the functional specifications. Without specifications, the testers might have some interesting comments, but they can never give a real answer.


Without security specifications, the first task of the security analysis is to create descriptions of the security properties achieved, based on the perceived intentions of the system designer. The subsequent evaluation might then turn up problems of the form ``this function does not achieve the properties that we think it should have.'' The obvious answer will be: ``but those are not the properties that I designed it to have.'' Very quickly the discussion moves away from the actual security into what was meant. The overall result is a security evaluation that might point out some potential weaknesses, but that will hardly help in improving the security.


The IPsec and ISAKMP protocols do not specify clearly which security properties they claim to achieve. [RFCs 2401, 2402, and 2406 clearly state the security services offered by the AH and ESP protocols.] The same holds for the modules and functions. [modules are not specified by these standards; they are implementation artifacts.] We recommend that each function, module, and protocol be extended to include clear specifications regarding the security-related functionality they achieve. We feel that unless this is done, it will not be possible to perform an adequate security evaluation on a system of this complexity.



\chapter{Bulk data handling}\label{chap:bulk}


In this chapter we discuss the methods used to handle the encryption and authentication of the bulk data, as specified in \cite{RFC2401,RFC2402,RFC2403,RFC2404,RFC2405,RFC2406,RFC2451,RFC2410,RFC2411}. Together these documents specify the IPsec protocol. They specify the actual encryption and authentication of packets, assuming that symmetric keys have already been exchanged. We refer the reader to \cite{RFC2401} sections 1--4.2 for an overview of this part of IPsec and the relevant terminology.



\section{Functionality}

IPsec is capable of providing authentication and confidentiality services on a packet level. The security configuration of an IPsec implementation is done centrally, presumably by the system administrator. [In some environments, a single administrator might control the configuration of each IPsec implementation, or each user might have some control over it. The latter would tend to be characterized as a distributed management paradigm, not a central one. Also, two IPsec peers communicate ONLY if both agree on the security parameters for the SA, i.e., there is suitable overlap in the SPDs. In that sense too, security configuration is distributed.]


IPsec is very suitable for creating a VPN over the Internet, improving security for dial-in connections to portables, restricting access to parts of a network, etc. These are very much network-level functions. IPsec by itself does not supply application-level security. Authentication links the packet to the security gateway of the originating network, the originating host, or possibly the originating user, but not to the application in question or the data the application was handling when it sent the packet. [true, but for many applications, application layer security is not needed, and its implementation might well be accorded less assurance than the network layer security provided by IPsec. This paragraph seems to suggest that there is some important benefit to linking data to an application, through an application-specific security mechanism. There are good examples of where this is true, e.g., e-mail and directories. However, unless there are application-specific security semantics that cannot be captured by use of an application security protocol, your own arguments about simplicity, as well as a number of arguments re assurance, argue against proliferation of application security protocols.]


The IPsec functionality can significantly increase the security of the network. It is not a panacea for all security problems, and applications that require security services will typically have to use other security systems in addition to IPsec. [I might disagree with the term "typically" here. A lot depends on the application, where IPsec is implemented, etc.]



\section{Complexity}\label{sec:complexity}

Our biggest criticism is that IPsec is too complex. There are too many options that achieve the same or similar properties. [if they were completely equivalent this would be a good basis for simplifying IPsec. However, there are subtle differences that have resulted in the proliferation of options you address below.]


\subsection{Options}


IPsec suffers from an abundance of options. For example, two hosts that want to authenticate IP packets can use four different modes: transport/AH, tunnel/AH, transport/ESP with NULL encryption, and tunnel/ESP with NULL encryption. The differences between these options, both in functionality and performance, are minor.


In particular, the following options seem to create a great deal of needless complexity:


\begin{enumerate}

\item There are two modes that can be used: transport mode and tunnel mode. In transport mode, the IP header of the packet is left untouched. AH authenticates both the IP header and the packet payload. ESP encrypts and authenticates the payload, but not the header. The lack of header authentication in transport/ESP is a real weakness, as it allows various manipulations to be performed. In tunnel mode, the full original IP packet (including headers) is used as the payload in a new IP packet with new headers. The same AH or ESP functions are used. As the original header is now included in the ESP authentication, the transport/ESP authentication weakness no longer exists.


Transport mode provides a subset of the functionality of tunnel mode. The only advantage that we can see to transport mode is that it uses somewhat less bandwidth. However, tunnel mode could be extended in a straightforward way with a specialized header-compression scheme that we will explain shortly. This would achieve virtually the same performance as transport mode without introducing an entirely new mode. We therefore recommend that transport mode be eliminated. [transport mode and tunnel mode address fundamentally different requirements, from a networking point of view. When security gateways are involved, the use of tunnel mode is an absolute requirement, whereas it is a minor (and rarely used) feature for communications between end systems. A proposal to make all traffic tunnel mode, and to try to offset the added overhead through compression, seems to ignore the IPCOMP facility that is already available to IPsec implementations. Today, transport mode is used primarily to carry L2TP traffic, although this is primarily an efficiency issue.]


\item There are two protocols: AH and ESP. AH provides authentication, and ESP provides encryption, authentication, or both. In transport mode, AH provides a stronger authentication than ESP can provide, as it also authenticates the IP header. One of the standard modes of operation would seem to be to use both AH and ESP in transport mode. [although this mode is required to be supported, it seems to be rarely used today. A plausible, near-term use for AH is to provide integrity and authenticity for IPsec traffic between an end system and a first-hop intermediary. For example, AH can be used between a host inside an enclave and a security gateway at the perimeter, to allow the SG to control what traffic leaves the enclave, without granting the SG access to plaintext traffic. This, and similar concatenated SA examples, motivate retention of AH. One could achieve a similar effect with (authentication-only) ESP tunnels, but with increased bandwidth and processing overhead.] In tunnel mode, the authentication that ESP provides is good enough (it includes the IP header), and AH is typically not combined with ESP \cite[section 4.5]{RFC2401}. [the example above shows why one might wish to use AH for the outer header, but most likely with ESP in transport mode.] (Implementations are not required to support nested tunnels that would allow ESP and AH to both be used.)


The AH protocol \cite{RFC2402} authenticates the IP headers of the lower layers. [AH authenticates the IP header at the SAME layer, in many respects. AH was originally described as an IP (v4) option. In IPv6, AH is viewed as an extension header, and may appear before other header extensions (see section 4.1 of RFC 2401). I agree that AH represents ugly layering, but it's not as bad as you suggest here.] This creates all kinds of problems, as some header fields change in transit. As a result, the AH protocol needs to be aware of all data formats used at lower layers so that these mutable fields can be avoided. [this is an inaccurate characterization, especially given the status of AH re IPv6. Don't think of AH as a transport protocol. It isn't.] This is a very ugly construction, and one that will create more problems when future extensions to the IP protocol are made that create new fields that the AH protocol is not aware of. [RFC 2402 explains how to deal with new IP header fields in v6 (see section 3.3.3.1.2.2). The existence of a mutability flag in such extensions makes processing relatively straightforward.] Also, as some header fields are not authenticated, the receiving application still cannot rely on the entire packet. To fully understand the authentication provided by AH, an application needs to take into account the same complex IP header parsing rules that AH uses. The complex definition of the functionality that AH provides can easily lead to security-relevant errors.
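To illustrate the kind of special-casing involved, here is a minimal sketch (ours, not from the RFCs) of an AH-style integrity check over an IPv4 packet, with the mutable header fields zeroed before authentication. The field list follows RFC 2402's treatment of IPv4; the truncation length matches HMAC-SHA1-96, but the helper itself is hypothetical and greatly simplified.

\begin{verbatim}
# Illustrative sketch: AH-style ICV over an IPv4 packet, zeroing
# mutable header fields before authentication (offsets per RFC 791).
import hmac, hashlib

MUTABLE_V4 = [(1, 1),    # TOS: rewritten by some routers
              (6, 2),    # flags / fragment offset
              (8, 1),    # TTL: decremented at every hop
              (10, 2)]   # header checksum: recomputed per hop

def ah_icv(key, packet):
    tmp = bytearray(packet)
    for offset, length in MUTABLE_V4:
        tmp[offset:offset+length] = bytes(length)  # zero mutable bytes
    return hmac.new(key, bytes(tmp), hashlib.sha1).digest()[:12]
\end{verbatim}

Any field not on this list is authenticated; any field on it is silently unprotected, which is exactly the subtlety an application must grasp to know what AH's ``authentication'' actually covers.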


The tunnel/ESP authentication avoids this problem, but uses more bandwidth. [but it does not provide exactly the same features, as noted above, so the alternative is not quite equivalent.] The extra bandwidth requirement can be reduced by a simple specialized compression scheme: for some suitably chosen set of IP header fields $X$, a single bit in the ESP header indicates whether the $X$ fields in the inner IP header are identical to the corresponding fields in the outer header.\footnote{A trivial generalization is to have several flag bits, each controlling a set of IP header fields.} The fields in question are then removed to reduce the payload size. This compression should be applied after computing the authentication but before any encryption. The authentication is thus still computed on the entire original packet. The receiver reconstitutes the original packet using the outer header fields, and verifies the authentication. A suitable choice of the set of header fields $X$ allows tunnel/ESP to achieve virtually the same low message expansion as transport/AH.
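The scheme is small enough to sketch in full. The following is our illustrative model of it (a proposal, not part of any RFC); the choice of $X$ as the inner source and destination addresses is arbitrary, and the helper names are made up.

\begin{verbatim}
# Sketch of the proposed one-bit tunnel/ESP header compression.
# X = inner IPv4 source and destination address fields.
X_FIELDS = [(12, 4), (16, 4)]   # (offset, length) pairs

def compress(inner, outer):
    # If every X field matches the outer header, strip those bytes
    # and set the flag bit carried in the ESP header.
    if all(inner[o:o+l] == outer[o:o+l] for o, l in X_FIELDS):
        payload = bytearray(inner)
        for o, l in sorted(X_FIELDS, reverse=True):
            del payload[o:o+l]
        return 1, bytes(payload)
    return 0, bytes(inner)

def decompress(flag, payload, outer):
    if not flag:
        return payload
    inner = bytearray(payload)
    for o, l in sorted(X_FIELDS):   # reinsert from the outer header
        inner[o:o] = outer[o:o+l]
    return bytes(inner)
\end{verbatim}

Because compression runs after the authentication computation, the receiver first reconstitutes the packet with {\tt decompress} and only then verifies the authentication over the full original packet.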


We conclude that eliminating transport mode allows the elimination of the AH protocol as well, without loss of functionality. [counter examples provided above suggest that this claim is a bit overstated.]


\item The standard defines two categories of machines: hosts and security gateways. Hosts can use transport mode, but security gateways must always use tunnel mode. Eliminating transport mode would also allow this distinction to be eliminated. Various computers could of course still function as hosts or security gateways, but these different uses would no longer affect the protocol.


\item The ESP protocol allows the payload to be encrypted without being authenticated. In virtually all cases, encryption without authentication is not useful. The only situation in which it makes sense not to use authentication in the ESP protocol is when the authentication is provided by a subsequent application of the AH protocol (as is done in transport mode because ESP authentication in transport mode is not strong enough). [this is one example of when one might not need authentication with ESP, but it is not the only one. In general, if there is a higher layer integrity and/or authentication function in place, providing integrity/authentication in IPsec is redundant, both in terms of space and processing. The authentication field for ESP or AH is 12 bytes. For applications where packet sizes are quite small, and for some environments where packet size is of critical importance, e.g., packet voice in a wireless environment, ESP w/o authentication may be appropriate. This is especially true if the application protocol embodies an authentication mechanism. This might happen if the application protocol wants to offer uniform protection irrespective of the lower layers. Admittedly, this might also cause the application to offer confidentiality as well, but depending on the application, the choices of what security services are being offered may vary.] Without the transport mode to worry about, ESP should always provide its own authentication. We recommend that ESP authentication always be used, and only encryption be made optional. [the question of authentication as an intrinsic part of ESP is independent of mode, i.e., whether one chooses to provide authentication as a part of ESP is not determined by the choice of transport vs. tunnel mode.]


\end{enumerate}


We can thus remove three of the four operational modes without any significant loss of functionality. [sorry, can't agree, given the counter examples above.]


\subsection{Undesirable options}

There are existing combinations of options that are undesirable. These pose a problem when non-experts have to configure an IPsec installation. Given the fact that experts are rare and usually have better things to do, most IPsec installations will be configured by non-experts. [yes, we were aware of this concern. However, there is always a tradeoff between adopting the "we know what's best for you" approach, vs. the "you can screw it up if you want to" approach. We opted for a point somewhere along this spectrum, but not at either end.]


\begin{enumerate}

\item In transport mode, use of ESP provides authentication of the payload only. The authentication excludes the IP headers of the packet. The result is a data stream that is advertised as ``authenticated'' for which critical pieces of information (such as the source and destination IP address) are not authenticated. Unless the system administrator is intimately familiar with the different forms of authentication used by IPsec, it is quite likely that the administrator will assume that the authentication protects the entire packet. The combination of transport mode and the ESP protocol (without the AH protocol) should therefore not be allowed. [The IP source and destination address are covered by the TCP checksum, which is covered by the ESP integrity check, so this does limit (a tiny bit) the ability to change these values without detection. A more significant observation is that transport mode IPsec SAs will probably always use source and/or destination IP addresses as part of the selector set. In such cases, tampering with either address will result in a failed authentication check.]


\item The standard allows ESP to be used with the NULL encryption, such that it provides only authentication. The authentication provided by ESP in transport mode is less functional than the authentication provided by AH, at a similar cost. If transport mode is retained, either the ESP authentication should be extended or the use of ESP with only authentication should be forbidden and replaced by the use of AH. [ESP authentication is more efficient to compute than AH, because of the selective IP header coverage provided by AH. Thus there is good reason to allow authentication-only ESP as an alternative to AH. This point was debated by the group and, with implementation experience, vendors came to agree that this is true.]


\item The ESP protocol can provide encryption without authentication. This does not make much sense in an application. It protects the application against passive eavesdroppers, but provides no protection against active attacks that are often far more devastating. Again, this mode can lure non-expert users into using an unsafe configuration that they think is secure. Encryption without authentication should be forbidden. [as noted above, there are examples where this feature set for ESP is attractive.]


\end{enumerate}


\subsection{Orthogonality}

IPsec also suffers from a lack of orthogonality. The AH and ESP protocols can be used together, but should only be used in one particular order. In transport mode, ESP by itself provides only partial authentication of the IP packet, and using AH too is advisable. [not in most cases, as noted above.] In tunnel mode the ESP protocol authenticates the inner headers, so use of AH is no longer required. These interdependencies between the choices demonstrate that these options are not independent of each other. [true, but who says that this is a critical criterion? TCP and IP are not orthogonal either, e.g., note the TCP checksum covering parts of the IP header.]


\subsection{Compatibility}

The IPsec protocols are also hampered by the compatibility requirements. A simple problem is the TOS field in the IP header \cite[p.\ 10]{RFC2402}. Although this is supposed to be unchanged during the transmission of a packet (according to the IP specifications), some routers are known to change this field. IPsec chose to exclude the TOS field from the authentication provided by the AH protocol to avoid errors introduced by such rogue routers. The result is that, in transport/AH packets that have an authenticated header, the TOS field is not authenticated. This is clearly unexpected from the application point of view, which might want to rely on the correct value of the TOS field. This problem does not occur in tunnel mode. [it is unfortunate that cisco chose to not follow the specs here, and in several other places. I agree that an unenlightened system administrator might be surprised in this case. But, in practice, the effect is minimal. Your example cites transport mode, which means that the TOS bits are being acted upon by the end system. If end systems really paid attention to these bits in the first place, cisco would not have been able to corrupt them with impunity! The reason that these bits are being re-used by the ECN folks is because hosts have never made use of them. Still, going forward, one should pay attention to this vulnerability.]


A more complex compatibility problem is the interaction between fragmentation and IPsec \cite[appendix B]{RFC2401}. This is a complex area, but a typical IPsec implementation has to perform specialized processing to facilitate the proper behavior of higher-level protocols in relation to fragmentation. Strictly speaking, fragmentation is part of the communication layer below the IPsec layer, and in an ideal world it would be transparent to IPsec. Compatibility requirements with existing protocols (such as TCP) force IPsec to explicitly handle fragmentation issues, which adds significantly to the overall complexity. Unfortunately, there does not seem to be an elegant solution to this problem. [The requirement here is the same that arises whenever an intermediate system adds info to a packet, or when a smaller MTU intermediate system is traversed. IPsec in an SG is doing what a router along a path would do if the "other side" network were smaller. IPsec in a host is doing what the NIC would do if the LAN MTU changed. The real complexity arises when we wish to do this optimally, at a security gateway or a BITS or BITW implementation, in cases where different SAs use different combinations of AH and ESP, or different algorithms, etc.]


\subsection{Conclusion}

The overall result is that IPsec bulk data handling is overly complex. In our opinion it is possible to define an equivalent system that is far less complex.



\section{Order of operations}


\subsection{Introduction}

When both encryption and authentication are provided, IPsec performs the encryption first, and authenticates the ciphertext. In our opinion, this is the wrong order. Going by the ``Horton principle'' \cite{WS:SSL30}, the protocol should authenticate what was meant, not what was said. The ``meaning'' of the ciphertext still depends on the decryption key used. Authentication should thus be applied to the plaintext (as it is in SSL \cite{SSLv3Nov96}), and not to the ciphertext. [The order of processing is intentional. It is explicitly designed to allow a receiver to discard a packet as quickly as possible, in the event of DoS attacks, as you acknowledge below. The suggestion that this concern be addressed by the addition of a secondary MAC seems to violate the spirit of simplicity that this document espouses so strongly, and the specific proposed fix is not strong enough to warrant its incorporation. Moreover, this ordering allows parallel processing at a receiver, as a means of increasing throughput and reducing delay.]
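The two orderings are easy to contrast in miniature. The sketch below only illustrates the dataflow; the stream cipher and key handling are toy stand-ins, not the IPsec transforms.

\begin{verbatim}
# Toy contrast of the two orderings (illustrative only).
import hmac, hashlib, os

def mac(k, m):
    return hmac.new(k, m, hashlib.sha256).digest()

def encrypt(k, m):                        # toy stand-in cipher
    pad = hashlib.sha256(k).digest() * (len(m) // 32 + 1)
    return bytes(a ^ b for a, b in zip(m, pad))

ke, ka = os.urandom(16), os.urandom(16)
plaintext = b"payload"

# IPsec (ESP) ordering: encrypt first, authenticate the ciphertext.
ct = encrypt(ke, plaintext)
tag = mac(ka, ct)               # binds only what was *said*

# The ordering argued for here: authenticate the plaintext.
tag2 = mac(ka, plaintext)       # binds what was *meant*
ct2 = encrypt(ke, plaintext + tag2)
\end{verbatim}

In the first ordering a verified tag says nothing about which decryption key yields the intended plaintext; in the second, verification after decryption checks the plaintext itself.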


This does not always lead to a direct security problem. In the case of the ESP protocol, the encryption key and authentication key are part of a single ESP key in the SA. A successful authentication shows that the packet was sent by someone who knew the authentication key. The recipient trusts the sender to encrypt that packet with the other half of the ESP key, so that the decrypted data is in fact the same as the original data that was sent. The exact argument why this is secure gets to be very complicated, and requires special assumptions about the key agreement protocol. For example, suppose an attacker can manipulate the key agreement protocol used to set up the SA in such a way that the two parties get an agreement on the authentication key but a disagreement on the encryption key. When this is done, the data transmitted will be authenticated successfully, but decryption takes place with a different key than encryption, and all the plaintext data is still garbled. [The fundamental assumption is that an ESP SA that employs both encryption and an HMAC will have the keys bound together, irrespective of the means by which they are generated. This assumption probably could be better stated in the RFCs.]


In other situations, the wrong order does lead to direct security
weaknesses.


\subsection{An attack on IPsec}

Suppose two hosts have a manually keyed transport-mode AH-protocol SA, which we will call SAah. Due to the manual keying, the AH protocol does not provide any replay protection. These two hosts now negotiate a transport-mode encryption-only ESP SA (which we will call SAesp1) and use this to send information using both SAesp1 and SAah. The application can expect to get confidentiality and authentication on this channel, but no replay protection. When the immediate interaction is finished, SAesp1 is deleted. A few hours later, the two hosts again negotiate a transport-mode encryption-only ESP SA (SAesp2), and the receiver chooses the same SPI value for SAesp2 as was used for SAesp1. Again, data is transmitted using both SAesp2 and SAah. The attacker now introduces one of the packets from the first exchange. This packet was encrypted using SAesp1 and authenticated using SAah. The receiver checks the authentication and finds it valid. (As replay protection is not enabled, the sequence number field is ignored.) The receiver then proceeds to decrypt the packet using SAesp2, which presumably has a different decryption key than SAesp1. The end result is that the receiver accepts the packet as valid, decrypts it with the wrong key, and presents the garbled data to the application. Clearly, the authentication property has been violated. [this attack is not a criticism of the choice of ESP operation ordering, but rather the notion of applying AH and ESP (encryption only) in a particular order, as allowed by RFC 2401. The specific combination of keying operations described here, though not prohibited by 2401, does not seem likely to occur in practice. Specifically, if an IPsec implementation supports automated key management, as described above for the ESP SAs, then it is highly unlikely that the AH SA would be manually keyed. The push to retain manual keying as a base facility for IPsec is waning, and most implementations have IKE available. Under these circumstances, this vulnerability is unlikely to be realized.]


\subsection{Other considerations}

Doing the encryption first and authentication later allows the recipient to discard packets with erroneous authentication faster, without the overhead of the decryption. This helps the computer cope with denial-of-service attacks in which a large number of fake packets eat up a lot of CPU time. We question whether this would be the preferred mode of attack against a TCP/IP-enabled computer. If this property is really important, a 1- or 2-byte MAC (Message Authentication Code) on the ciphertext could be added. The MAC code allows the recipient to rapidly discard virtually all bogus packets at the cost of an additional MAC computation per packet. [a one or two byte MAC provides so little protection that this does not seem to be an attractive counter-proposal. Also, as noted above, it adds complexity \ldots]


\subsection{Conclusion}

The ordering of encryption and authentication in IPsec is wrong. Authentication should be applied to the plaintext of the payload, and encryption should be applied after that.



\section{Security Associations}

A Security Association (SA) is a simplex ``connection'' that affords security services to the traffic carried by it \cite[section 4]{RFC2401}. The two computers on either side of the SA store the mode, protocol, algorithms, and keys used in the SA. Each SA is used only in one direction; for bidirectional communications two SAs are required. Each SA implements a single mode and protocol; if two protocols (such as AH and ESP) are to be applied to a single packet, two SAs are required.
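As a compact restatement of this state, here is a sketch of a nominal SA record; the field names are ours, not the RFC's.

\begin{verbatim}
# Nominal per-SA state, as described above (field names illustrative).
from dataclasses import dataclass

@dataclass
class SecurityAssociation:
    spi: int           # Security Parameter Index, chosen by receiver
    protocol: str      # "AH" or "ESP"  -- exactly one per SA
    mode: str          # "transport" or "tunnel" -- exactly one per SA
    algorithms: dict   # negotiated cipher / MAC choices
    keys: dict         # symmetric keys, valid in one direction only

# Simplex: bidirectional traffic needs one SA per direction, and
# AH-plus-ESP on one packet needs two SAs per direction (four total).
\end{verbatim}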


Most of our aforementioned comments also affect the SA system; the use of two modes and two protocols makes the SA system more complex than necessary.


There are very few (if any) situations in which a computer sends an IP packet to a host, but no reply is ever sent. [we have a growing number of apps where this functionality may be appropriate. For example, broadcast packet video feeds and secure time feeds are unidirectional.] There are also very few situations in which the traffic in one direction needs to be secured, but the traffic in the other direction does not need to be secured. It therefore seems that in virtually all practical situations, SAs occur in pairs to allow bidirectional secured communications. In fact, the IKE protocol negotiates SAs in pairs. [IKE has not always been well coordinated with IPsec, unfortunately. This is why we have to have null encryption and null authentication algorithms. So, I don't think one should cite IKE behavior as a basis for making SAs bi-directional. I agree that the vast majority of examples that we see now are full duplex, but we have examples where this may not apply, as noted above.]


This would suggest that it is more logical to make an SA a bidirectional ``connection'' between two machines. This would halve the number of SAs in the overall system. It would also avoid asymmetric security configurations, which we think are undesirable (see section~\ref{sec:SPD}). [The SPI, which is used as a primary de-multiplexing value, must be chosen locally, by the receiver, so having bi-directional SAs probably won't change the size of the SAD substantially. Specifically, how do you envision that a switch to bi-directionality would simplify implementations?]


\section{Security policies}\label{sec:SPD}

The security policies are stored in the SPD (Security Policy Database). For every packet that is to be sent out, the SPD is checked to find how the packet is to be processed. The SPD can specify three actions: discard the packet, let the packet bypass IPsec processing, or apply IPsec processing. In the last case, the SPD also specifies which SAs should be used (if suitable SAs have already been set up), or with what parameters new SAs should be set up.
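A minimal model of that lookup, with selectors reduced to exact-match fields for brevity (the rule contents are invented for illustration):

\begin{verbatim}
# Minimal SPD model: ordered rules, first match wins.
DISCARD, BYPASS, PROTECT = "discard", "bypass", "protect"

spd = [
    {"match": {"dst_port": 23}, "action": DISCARD},
    {"match": {"dst": "10.1.2.3"}, "action": PROTECT,
     "sa_params": {"protocol": "ESP", "auth": True, "encrypt": True}},
    {"match": {}, "action": BYPASS},   # default: no IPsec processing
]

def spd_lookup(packet):
    for rule in spd:
        if all(packet.get(k) == v for k, v in rule["match"].items()):
            return rule
    return {"action": DISCARD}          # no match: drop
\end{verbatim}

Real selectors also cover address ranges, protocols, users, and so on; the point of the model is only the ordered match and the three possible actions.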


The SPD seems to be a very flexible control mechanism that allows a very fine-grained control over the security processing of each packet. Packets are classified according to a large number of selectors, and the SPD can match some or all selectors to determine the appropriate action. Depending on the SPD, this can result in either all traffic between two computers being carried on a single SA, or a separate SA being used for each application, or even each TCP connection. Such a very fine granularity has disadvantages. There is a significantly increased overhead in setting up the required SAs, and more traffic analysis information is made available to the attacker. At the same time we do not see any need for such a fine-grained control. [a lot of customers for IPsec products disagree!] The SPD should specify whether a packet should be discarded, should bypass any IPsec processing, requires authentication, or requires authentication and encryption. Whether several packets are combined on the same SA is not important. [yes it is. By allowing an administrator the ability to select the granularity of protection, one can control the level of partial traffic flow confidentiality offered between security gateways. Also, fine-grained access control allows an admin to allow some forms of connections through the gateway, while rejecting others. Access control is often the primary, underlying motivation for using IPsec. A number of attacks become possible if one cannot tightly bind the authentication provided by IPsec to the access control decision. Also, given the computational costs of SA establishment via IKE, it is important to allow an administrator to select the granularity of SAs.] The same holds for the exact choice of cryptographic algorithm: any good algorithm will do. There are two reasons for this. First of all, nobody ever attacks a system by cryptanalysis. Instead, attacks are made on the users, implementation, management, etc. Any reasonable cryptographic algorithm will provide adequate protection. The second reason is that there are very efficient and secure algorithms available. Two machines should negotiate the strongest algorithm that they are allowed. There is no reason to select individual algorithms on an application-by-application basis. [if one were to employ ESP without authentication, because a specific higher layer protocol provided its own authentication, and maybe because the application employed FEC, then one might well imagine using different encryption algorithms, or different modes (e.g., block vs. stream) for different SAs. while I agree that the focus on algorithm agility may be overstated, it does allow communicating parties to select a higher quality algorithm, relative to the mandated default, if they both support that algorithm.]


In our opinion, management of the IPsec protocols can be simplified by letting the SPD contain policies formulated at such a high level. As we argued in section~\ref{sec:complexity}, simplification will strengthen the actual system. [examples provided above illustrate why fine-grained access control is important.]


It would be nice if the same high-level approach could be taken in relation to the choice of SA end-points. As there currently does not seem to be a reliable automatic method of detecting IPsec-enabled security gateways, we do not see a practical alternative to manual configuration of these parameters. It is questionable whether automatic detection of IPsec-enabled gateways is possible at all. Without some initial knowledge of the other side, any detection and negotiation algorithm can be subverted by an active attacker. [the authors identify a good problem, but it is hardly an unsolvable one. A proposal was put forth (by Bob Moscowtiz, over a year ago) to include records in the DNS analogous to MX records. When one tried to establish an SA to a host "behind" an SG, fetching this record would direct the initiator to an appropriate SG. This solves the SG discovery problem. Other approaches have been put forth in the more recent BBN work on security policy management, which forms the basis for a new IETF WG, chaired by Luis Sanchez. The fact that none of the approaches has been deployed says more about the priorities of IPsec vendors and early adopters than about the intractability of the problem. The other part of the problem is verifying that an SG is authorized to represent the SA target. Here too, various approaches have been described on the IPsec mailing list.]


\section{General comments}

This section contains general comments that came up during our evaluation of IPsec.


\begin{enumerate}

\item In \cite[p.\ 22]{RFC2401}, several fields in the SAD are required for all implementations, but only used in some of them. It does not make sense to require the presence of fields within an implementation. Only the external behavior of the system should be standardized. [the SAD defined in 2401 is nominal, as the text explains. An implementation is not required to implement these fields, but must exhibit behavior consistent with the presence of these fields. We were unable to specify external behavior without reference to a construct of this sort. The SPD has the same property.]


\item According to \cite[p.\ 23]{RFC2401}, an SA can be either for transport mode, tunnel mode, or ``wildcard,'' in which case the sending application can choose the mode on a packet-by-packet basis. Much of the rest of the text does not seem to take this possibility into account. It also appears to us to be needless complexity that will hardly ever be used, and is never a necessity. We have already argued that transport mode should be eliminated, which implies that this option is removed too. If transport mode is to be retained, we would certainly get rid of this option. [I agree, but at least one knowledgeable WG member was quite adamant about this. So, chalk it up to the committee process!]


\item IPsec does not allow replay protection on an SA that was established using manual key management techniques. This is a strange requirement. We realize that the replay protection limits the number of packets that can be transmitted with the SA to $2^{32}-1$. Still, there are applications that have a low data rate where replay protection is important and manual keying is the easiest solution. [elsewhere this critique argues for not presenting options in a standard that can be misconfigured. Yet here, the authors make an argument for just such an option! The WG decided that there was too great a chance that a manually keyed SA would fail to maintain counter state across key lifetime and thus made a value judgement to ban anti-replay in this context.]


\item \cite[section 5.2.1, point 3]{RFC2401} suggests that an implementation can find the matching SPD entry for a packet using back-pointers from the SAD entries. In general this will not work correctly. Suppose the SPD contains two rules: the first one outlaws all packets to port $X$, and the second one allows all incoming packets that have been authenticated. An SA is set up for this second rule. The sender now sends a packet on this SA addressed to port $X$. This packet should be refused as it matches the first SPD rule. However, the back-pointer from the SA points to the second rule in the SPD, which allows the packet. This shows that back-pointers from the SA do not always point to the appropriate rule, and that this is not a proper method of finding the relevant SPD entry. [this is point #3 and is applied only after points #1 and #2. Since point #1 calls for a linear search of the SPD, the packet would be rejected, as required. Thus point #3 is not in error.]
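The counterexample fits in a few lines. This is our illustration of the concern, not text from the RFC:

\begin{verbatim}
# Back-pointer counterexample in miniature (illustrative only).
spd = [
    {"match": lambda p: p["dst_port"] == 23, "action": "DISCARD"},
    {"match": lambda p: p["authenticated"],  "action": "ALLOW"},
]
sa_backpointer = spd[1]       # the SA was negotiated for rule 2

packet = {"dst_port": 23, "authenticated": True}

# Ordered search finds the DISCARD rule first, as required.
ordered = next(r for r in spd if r["match"](packet))["action"]

# The back-pointer skips rule 1 and wrongly allows the packet.
via_pointer = sa_backpointer["action"]

assert ordered == "DISCARD" and via_pointer == "ALLOW"
\end{verbatim}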


\item The handling of ICMP messages as described in \cite[section 6]{RFC2401} is unclear to us. It states that an ICMP message generated by a router must not be forwarded over a transport-mode SA, but transport-mode SAs can only occur in hosts. By definition, hosts do not forward packets, and a router never has access to a transport-mode SA. [the text in the beginning of section 6 is emphasizing that an SA from a router to a host or security gateway must be a tunnel mode SA, vs. a transport mode SA. If we didn't make this clear, someone might choose to establish a transport mode SA from an intermediate system, and this would cause the source address checks to fail under certain circumstances, as noted by the text.]


The text further suggests that unauthenticated ICMP messages should be disregarded. This creates problems. Let us envision two machines that are geographically far apart and have a tunnel-mode SA set up. There are probably a dozen routers between these two machines that forward the packets. None of these routers knows about the existence of the SA. Any ICMP messages relating to the packets that are sent will be unauthenticated and unencrypted. Simply discarding these ICMP messages results in a loss of IP functionality. This problem is mentioned, but the text claims this is due to the routers not implementing IPsec. Even if the routers implement IPsec, they still cannot send authenticated ICMP messages about the tunnel unless they themselves set up an SA with the tunnel end-point for the purpose of sending the ICMP packet. The tunnel end-point in turn wants to be sure the source is a real router. This requires a generic public-key infrastructure, which does not exist. [RFC 2401 clearly states the dangers associated with blindly accepting unauthenticated ICMP messages, and the functionality problems associated with discarding such messages. System administrators are provided with the ability to make this tradeoff locally. The first step to addressing this problem is the addition of IPsec into routers, as stated in the RFC. Only then does one face the need to have a PKI that identifies routers. Yes, this second PKI does not exist, but a subset of it (at BGP routers) might be established if the S-BGP technology is deployed. These are the routers most likely to issue ICMP PMTU messages. So, the answer here is that the specifications allow site administrators to make security/functionality tradeoffs, locally. The longer term solution described would require routers to implement IPsec, so that they can send authenticated ICMP messages. Yes, this would require a PKI, but such a PKI may arise for other reasons.]


As far as we understand this problem, this is a fundamental compatibility problem with the existing IP protocol that does not have a good solution.


\item \cite[section 6.1.2.1]{RFC2401} lists a number of possible ways of handling ICMP PMTU messages. An option that is not mentioned is to keep a limited history of packets that were sent, and to match the header inside the PMTU packet to the history list. This can identify the host where the packet that was too large originated. [the approach suggested by the authors was rejected as imposing too much of a burden on an SG. section 6.1.2.1 offers options (not suggestions) for an SG to respond to ICMP PMTU messages, including heuristics to employ when not enough information is present in the returned header. These options may not be as responsive as a strategy that caches traffic on each SA, but they are modest in the overhead imposed. Also, an SA that carries a wide range of traffic (not fine-grained) might not benefit from a limited traffic history, as the traffic that caused the ICMP might well be from a host whose traffic has been flushed from the "limited history."]
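For concreteness, the limited history we have in mind is no more than a small bounded cache; the sketch below is our illustration (sizes and names are arbitrary, not from any RFC).

\begin{verbatim}
# Sketch of a bounded sent-packet history for PMTU attribution.
from collections import OrderedDict

class SentHistory:
    def __init__(self, limit=1024):
        self.limit = limit
        self.entries = OrderedDict()    # header prefix -> source host

    def record(self, header_prefix, src_host):
        self.entries[header_prefix] = src_host
        if len(self.entries) > self.limit:
            self.entries.popitem(last=False)    # evict the oldest

    def match_pmtu(self, returned_header):
        # An ICMP PMTU message echoes the offending packet's header;
        # match it against recent traffic to find the origin host.
        return self.entries.get(returned_header)
\end{verbatim}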


\item \cite[section 7]{RFC2401} mentions that each auditable event in the AH and ESP specifications lists a minimum set of information that should be included in the audit-log entry. Not all auditable events defined in \cite{RFC2406} include that information. [you're right. Exactly one auditable event in 2406 does not specify the list of data that SHOULD be audited. We'll fix that in the next pass.] Furthermore, auditable events in \cite{RFC2401} do not specify such a minimum list of information. [there are exactly 3 events defined as auditable in 2401, one of which overlaps with 2406. So, to be more precise, the other 2 auditable events defined in 2401 ought to have the minimum data requirements defined. Another good point that we will fix in the next pass.] The documentation should be reviewed to ensure that a minimum list of audit-log information is specified with each auditable event.


\item Various algorithm specifications require the implementation to reject known weak keys. For example, the DES-CBC encryption algorithm specification \cite{RFC2405} requires that DES weak keys be rejected. It is questionable whether this actually increases security. It might very well be that the extra code that this requires creates more security problems due to bugs than are solved by rejecting weak keys.


Weak keys are not really a problem in most situations. For DES, it is far less work for an attacker to do an exhaustive search over all possible keys than to wait for an SA that happens to use a weak key. After all, the easiest way for the attacker to detect the weak keys is to try them all. Weak-key rejection is only required for algorithms where detecting the weak key class by the weak cipher properties is significantly less work than trying all the weak keys in question.


We recommend that the weak-key elimination requirement be removed. Encryption algorithms that have large classes of weak keys that introduce security weaknesses should simply not be used. [I tend to agree with this analysis. The argument for weak key checking was made by folks who don't understand the cryptographic issues involved, but who are persistent and loud, e.g., Bill Simpson. Ted T'so (co-chair of the WG) and I discussed this problem, and tried to explain it to the list, but were unsuccessful. Another flaw in the committee process.]


\item The only mandatory encryption algorithm in ESP is DES-CBC. Due to the very limited key length of DES, this cannot be considered to be very secure. We strongly urge that this algorithm not be standardized but be replaced by a stronger alternative. The most obvious candidate is triple-DES. Blowfish could be used as an interim high-speed solution.\footnote{On a Pentium CPU, Blowfish is about six to seven times faster than triple-DES.} The upcoming AES standard will presumably gain quick acceptance and probably become the default encryption method for most systems. [DES as a default was mandated because of pressure from vendors who, at the time, could not get export permission for 3DES. Triple DES or AES will certainly augment DES as additional, mandatory defaults, and may replace it in the future.]


\item The insistence on randomly selected IV values in \cite{RFC2405} seems to be overkill. It is true that a counter would provide known low Hamming-weight input differentials to the block cipher. All reasonable block ciphers are secure enough against this type of attack. Use of a random generator results in an increased risk of an implementation error that will lead to low-entropy or constant IV values; such an error would typically not be found during testing. [In practice the IV is usually acquired from previous ciphertext output, as suggested in the text for CBC mode ciphers, which is easy to acquire and not likely to result in significant complexity. In a hardware-assisted environment an RNG is usually available anyway. In a high assurance hardware implementation, the crypto chip would generate the IV.]


\item Use of a block cipher with a 64-bit block size should in general be limited to at most $2^{32}$ block encryptions per key. This is due to the birthday paradox. After $2^{32}$ blocks we can expect one collision.\footnote{To get a $10^{-6}$ probability of a collision it should be limited to about $2^{22}$ blocks.} In CBC mode, two equal ciphertexts give the attacker the XOR of two blocks of the plaintext. The specifications for the DES-CBC encryption algorithm \cite{RFC2405} should mention this, and require that any SA using such an algorithm limit the total amount of data encrypted by a single key to a suitable value.
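
For concreteness, the arithmetic behind these limits: with $n$ blocks encrypted under a single key of a 64-bit block cipher, the probability of at least one colliding ciphertext pair is approximately
\[
p \approx \binom{n}{2} \cdot 2^{-64} \approx \frac{n^2}{2^{65}},
\]
which becomes likely around $n \approx 2^{32}$; requiring $p \le 10^{-6}$ gives $n \le \sqrt{2^{65} \cdot 10^{-6}} \approx 2^{22.5}$, matching the footnote.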


\item The preferred mode for using a block cipher in ESP seems to be CBC mode \cite{RFC2451}. This is probably the most widely used cipher mode, but it has some disadvantages. As mentioned earlier, a collision gives direct information about the relation between two plaintext blocks. Furthermore, in hardware implementations each block encryption has to be done in turn, which limits parallelism and hinders high-speed hardware implementations. [first, this is not an intrinsic part of the architecture; one can define different modes for use with existing or different algorithms if the WG is so motivated. Second, current hardware is available at speeds higher than the associated packet-processing capability of current IPsec devices, so this does not appear to be a problem for the near term. Transition to AES will decrease the processing burden (relative to 3DES), which may render this concern less serious.]


Although not used very often, the counter mode seems to be preferable. The ciphertext of block $i$ is formed as $C_i = P_i \oplus E_K(i)$, where $i$ is the block number that needs to be sent at the start of the packet.\footnote{If replay protection is always in use, then the starting $i$-value could be formed as $2^{32}$ times the sequence number. This saves eight bytes per packet.} After more than $2^{32}$ blocks counter mode also reveals some information about the plaintext, but this is less than what occurs in CBC. The big advantage of counter mode is that hardware implementations can parallelize the encryption and decryption process, thus achieving a much higher throughput. [earlier the authors criticize IPsec for a lack of orthogonality, but introducing interdependence between the anti-replay counter and encryption would certainly violate the spirit of the earlier criticism! Counter-mode versions of algorithms can be added to the list easily if there is sufficient vendor support.]
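
As an illustration only (these RFCs define no counter mode), the construction $C_i = P_i \oplus E_K(i)$ could be realized as in the following Python sketch, which assumes the pycryptodome package and uses AES as a stand-in for the generic block cipher $E_K$:

\begin{verbatim}
# Counter-mode sketch: C_i = P_i XOR E_K(i).
# AES (16-byte blocks) stands in for the block cipher E_K.
from Crypto.Cipher import AES

def ctr_crypt(key: bytes, start: int, data: bytes) -> bytes:
    ecb = AES.new(key, AES.MODE_ECB)  # raw block encryption E_K
    out = bytearray()
    for n, off in enumerate(range(0, len(data), 16)):
        keystream = ecb.encrypt((start + n).to_bytes(16, "big"))
        block = data[off:off + 16]
        out.extend(p ^ k for p, k in zip(block, keystream))
    return bytes(out)  # decryption is the identical operation
\end{verbatim}

Because each keystream block depends only on the counter value, all blocks of a packet can be processed in parallel, which is precisely the throughput advantage claimed above; the starting value could, as the footnote suggests, be derived from the anti-replay sequence number.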


\item \cite[section 2.3]{RFC2451} states that Blowfish has weak keys, but that the likelihood of generating one is very small. We disagree with these statements. The likelihood of getting two equal 32-bit values in any one 256-entry S-box is about ${256 \choose 2} \cdot 2^{-32} \approx 2^{-17}$. This is an event that will certainly occur in practice. However, the Blowfish weak keys only lead to detectable weaknesses in reduced-round versions of the cipher. There are no known weak keys for the full Blowfish cipher.
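
To make the estimate explicit: ${256 \choose 2} = 32640 \approx 2^{15}$, so the per-S-box collision probability is about $2^{15} \cdot 2^{-32} = 2^{-17}$; since Blowfish uses four 256-entry S-boxes, roughly one key in $2^{15}$ produces such a collision, and any large deployment will therefore encounter them.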


\item In \cite[section 2.5]{RFC2451}, it is suggested to negotiate the number of rounds of a cipher. We consider this to be a very bad idea. The number of rounds is integral to the cipher specification and should not be changed at will. Even for ciphers that are specified with a variable number of rounds, the determination of the number of rounds should not be left up to the individual system administrators. The IPsec standard should specify the number of rounds for those ciphers. [I agree that this algorithm spec ought not encourage negotiation of the number of rounds without specifying a minimum for each cipher, although this gets us into the crypto-strength value-judgement arena again. Also, the inclusion of 3DES in this table is inappropriate, as it is a 48-round algorithm, period. So, yes, there is definite room for improvement in this RFC.]


\item \cite[section 2.5]{RFC2451} proposes the use of RC5. We urge caution in the use of this cipher. It uses some new ideas that have not been fully analyzed or understood by the cryptographic community. The original RC5 as proposed (with 12 rounds) was broken, and in response to that the recommended number of rounds was increased to 16. We feel that further research into the use of data-dependent rotations is required before RC5 is used in fielded systems. [RC5 is not required by IPsec implementations. In the IETF spirit of flexible parameterization of implementations, vendors are free to offer any additional algorithms in addition to the required default. In general, the IETF is not prepared to make value judgements about these algorithms, and so one may see RFCs that specify a variety of additional algorithms.]


\item \cite[section 2.4]{RFC2406} specifies that the ESP padding should pad the plaintext to a length so that the overall ciphertext length is both a multiple of the block size and a multiple of 4. If a block cipher of unusual block size is used (e.g., 15 bytes), then this can require up to 59 bytes of padding. This padding rule works best for block sizes that are a multiple of 4, which fortunately is the case for most block ciphers. [this padding rule is based primarily on IP packet alignment considerations, not on common block cipher sizes! This is stated in the text.]
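
The worst case quoted above is easy to verify. Here is a minimal Python sketch of the padding arithmetic as we read the rule (the function name is ours; the two extra bytes account for the Pad Length and Next Header fields, which are included in the alignment):

\begin{verbatim}
# Sketch of the ESP padding arithmetic described above: the
# ciphertext, including the 2 trailer bytes (Pad Length, Next
# Header), must be a multiple of both the block size and 4.
from math import lcm  # Python 3.9+

def esp_pad_len(payload_len: int, block_size: int) -> int:
    align = lcm(block_size, 4)
    return -(payload_len + 2) % align

# A 15-byte block cipher can indeed require up to 59 bytes:
assert max(esp_pad_len(n, 15) for n in range(60)) == 59
\end{verbatim}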


\item \cite[p.\ 6, point a]{RFC2406} states that the padding computations of the ESP payload with regard to the block size of the cipher apply to the payload data, excluding the IV (if present), Pad Length, and Next Header fields. This would imply that the Pad Length and Next Header fields are not being encrypted. Yet the rest of the specification is clear that the Pad Length and Next Header fields are to be encrypted, which is what should happen. The text of point a should be made consistent with the rest of the text. [The text says "...the padding computation applies to the Payload Data exclusive of the IV, the Pad Length, and Next Header fields." The comma after "IV" is meant to terminate the scope of the word "exclusive," and thus the intent is to include the Pad Length and Next Header fields. The term "payload" in ESP applies to a set of data not including the latter two fields, so the sentence is, technically, unambiguous, and it is consistent with the terms employed in the figure in section 2. But, I admit the wording could be improved.]


\item There is a document that defines the NULL encryption algorithm used in ESP \cite{RFC2410}, but no document that defines the NULL authentication algorithm, which is also used by ESP \cite[section 5]{RFC2406}. [good point. Another RFC publication opportunity!]


\item The NULL cipher specifies an IV length of zero \cite{RFC2410}. This would seem to imply that the NULL cipher is used in CBC mode, which is clearly not the case. The NULL cipher is in fact used in ECB mode, which does not require an IV. Therefore, no IV length should be specified. [use of the NULL cipher in ECB mode would be inconsistent with the guidance in FIPS 82, and thus CBC mode is intended, to preserve the confidentiality characteristics inherent in this cipher :-).]


\end{enumerate}


\section{Conclusions}

The IPsec system should be simplified significantly. This can be done without loss of functionality or performance. There are also some security weaknesses that should be fixed. [the extensive comments above illustrate that the proposed changes to IPsec would change the functionality, contrary to the claim made here. One might argue about the importance of some of this functionality, but several examples have been provided to illustrate application contexts that the authors of this report did not consider in their analysis. Several misunderstandings of some RFCs also were noted.]


Due to its high complexity, we have not been able to analyze IPsec as thoroughly as we would have liked. After simplification, a new security analysis should be performed.


I have not reviewed the ISAKMP/IKE comments. However, I agree that this protocol is very complex. Much of the complexity results from incremental enhancement and a reluctance on the part of developers to discard older versions of code.

References: