
Path MTU Discovery



-----BEGIN PGP SIGNED MESSAGE-----


I originally sent this message to Steve Kent, who asked me to forward
it to the list for comments, corrections, etc.

Path MTU discovery breaks in the presence of multiple IPsec
encapsulations(*) (it might even break in the presence of ONE
intermediate encapsulating entity). We could make it a requirement of
IPsec that it does not break; here's a suggested algorithm:

a) A host that does encapsulation and cannot forward a packet because
the DF bit is set should send back to the source of the packet (the
original source or the previous encapsulating entity) a normal ICMP
with enough payload to include the SPI of the packet received. This
can be done by copying the first XXX bytes of the packet into a
separate buffer/mbuf chain if the DF bit is set, processing the packet
normally, and, if forwarding fails, sending back the ICMP based on the
saved copy (so no decapsulation is necessary).

b) Upon receipt of an ICMP Unreachable (fragmentation needed), if the
protocol in the internal header is AH or ESP, validate the
destination/SPI pair and keep the value of next_mtu as reported. If a
packet later arrives that would use this SPI and its size is larger
than the stored MTU value, subtract from that value the overhead this
host imposes on the packet and send back an ICMP (as per bullet -a-).

c) Alternatively, an IPsec implementation could keep a table of
ip_id|SPI pairs (a circular buffer of fixed size, preferably dependent
on the total speed of the network interfaces): each time an IPsec
packet is sent out, its ip_id and SPI are recorded. Upon receipt of an
ICMP fragmentation-needed message, check the ip_id in the internal IP
header against the table to find the relevant SPI. Proceed as per
bullet -b-.
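
To make bullet -b- a bit more concrete, here is a rough sketch in
Python (not kernel code). It assumes the ICMP quotes the outer IP
header plus at least 8 bytes, which is enough to reach the SPI for
both ESP (first word) and AH (second word). The names
spi_from_icmp_payload and SpiMtuTable, and the idea of passing the
per-host overhead in as a number, are made up for illustration; I also
assume "size" means the size after this host's encapsulation.

```python
import struct

IPPROTO_ESP = 50
IPPROTO_AH = 51


def spi_from_icmp_payload(quoted: bytes):
    """Return (dst_addr, spi) for the packet quoted in an ICMP error, or None."""
    if len(quoted) < 20:
        return None
    ihl = (quoted[0] & 0x0F) * 4              # quoted IP header length in bytes
    if ihl < 20 or len(quoted) < ihl + 8:
        return None
    proto = quoted[9]
    dst = quoted[16:20]
    if proto == IPPROTO_ESP:
        # ESP: the SPI is the first 32-bit word after the IP header
        (spi,) = struct.unpack_from("!I", quoted, ihl)
    elif proto == IPPROTO_AH:
        # AH: next header (1), length (1), reserved (2), then the SPI
        (spi,) = struct.unpack_from("!I", quoted, ihl + 4)
    else:
        return None
    return dst, spi


class SpiMtuTable:
    """Hypothetical per-SA store for next_mtu values (bullet -b-)."""

    def __init__(self, known_sas):
        self.known = set(known_sas)           # valid (dst, spi) pairs
        self.mtu = {}                         # (dst, spi) -> next_mtu as reported

    def on_frag_needed(self, quoted: bytes, next_mtu: int) -> bool:
        """Validate the destination/SPI pair and keep next_mtu as reported."""
        key = spi_from_icmp_payload(quoted)
        if key is None or key not in self.known:
            return False                      # not one of our SAs: ignore
        self.mtu[key] = next_mtu
        return True

    def mtu_to_report(self, key, inner_len: int, overhead: int):
        """MTU to put in the ICMP sent upstream, or None if the packet fits."""
        path_mtu = self.mtu.get(key)
        if path_mtu is None or inner_len + overhead <= path_mtu:
            return None
        return path_mtu - overhead            # subtract what this host adds
```

A caller would feed on_frag_needed() from the ICMP input path and call
mtu_to_report() before encapsulating a DF packet; a non-None result is
the value to report back as per bullet -a-.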

It might seem complicated, but all it really needs is a little more
book-keeping on the part of an implementation.
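
As an illustration of how little book-keeping bullet -c- needs, a
fixed-size circular buffer could look roughly like the Python sketch
below. The class name IpIdSpiRing is hypothetical, and storing the
destination next to the ip_id is my assumption (it gives bullet -b-
the destination/SPI pair to validate). Since ip_id is only 16 bits and
wraps, the lookup walks from the newest entry to the oldest.

```python
class IpIdSpiRing:
    """Fixed-size circular buffer of (ip_id, dst, spi) for sent IPsec packets."""

    def __init__(self, size: int):
        self.slots = [None] * size
        self.next = 0                         # index of the next slot to overwrite

    def record(self, ip_id: int, dst: bytes, spi: int) -> None:
        """Called each time an IPsec packet is sent out."""
        self.slots[self.next] = (ip_id, dst, spi)
        self.next = (self.next + 1) % len(self.slots)

    def lookup(self, ip_id: int, dst: bytes):
        """Find the SPI for the ip_id quoted in an ICMP, newest entry first."""
        n = len(self.slots)
        for i in range(1, n + 1):
            entry = self.slots[(self.next - i) % n]
            if entry is not None and entry[0] == ip_id and entry[1] == dst:
                return entry[2]
        return None                           # too old, or never sent by us
```

The buffer size is the knob mentioned above: faster interfaces send
more packets per round trip, so they need more slots before an entry
is overwritten.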

This still doesn't address the problem of the original TCP MTU (the
usable MTU of the outgoing interface could be less than that recorded
in the kernel structure, depending on whether a packet will be
IPsec'ed or not). But I doubt we can mandate a solution for that.

Also, there's the question of whether we accept as valid ICMPs from
anyone in between (which means anyone) or only from the encapsulating
entities (e.g. two tunneling firewalls). The network-correct approach
is anyone; the security-correct approach is the next encapsulating
entity.
- -Angelos

(*) Steve Kent replied that it shouldn't break for an end host;
however, the 4.4BSD TCP code checks the outgoing interface MTU
directly to determine the size of the packets, if the route entry does not
have an mtu (check tcp_input.c, tcp_mss()). This means that either TCP
is patched, or fragmentation will happen.
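
The arithmetic behind this footnote, as a small Python sketch. The 20
byte IP and TCP headers assume no options, and the 36 byte IPsec
overhead in the usage note is a made-up figure for illustration only;
the function names are mine, not the 4.4BSD ones.

```python
IP_HDR = 20    # IPv4 header without options
TCP_HDR = 20   # TCP header without options


def mss_from_mtu(mtu: int, ipsec_overhead: int = 0) -> int:
    """Segment size TCP would pick if it knew about the IPsec overhead."""
    return mtu - ipsec_overhead - IP_HDR - TCP_HDR


def wire_size(mss: int, ipsec_overhead: int) -> int:
    """Size of a full segment once IP, TCP and IPsec headers are added."""
    return mss + TCP_HDR + IP_HDR + ipsec_overhead
```

With a 1500 byte interface MTU, an unpatched TCP picks
mss_from_mtu(1500), i.e. 1460, and wire_size(1460, 36) comes to 1536,
which no longer fits. A TCP that accounted for the overhead would use
mss_from_mtu(1500, 36), and the resulting packet is exactly 1500.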

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv
Comment: Processed by Mailcrypt 3.4, an Emacs/PGP interface

iQCVAwUBMvlb8b0pBjh2h1kFAQH4VgP/c6m7K8UqUPbzZImhsUjZj6wRmphKsldP
0TcbhC9Rf7CrZcrG7spjCecM2I+c3hxG04G8r6XeXR9ajY7KqtphUxz+5xDH+B/N
/8U44zoNEyVQZbxvzeXSe/U93AIq+sgDcsy1BmGDXMq2MxYTefYIU1QOgTGIv7JT
aSG04Yy2vS8=
=l0al
-----END PGP SIGNATURE-----