
Re: Heartbeats Straw Poll

On Mon, 7 Aug 2000, Theodore Y. Ts'o wrote:

>    From: Bill Sommerfeld <sommerfeld@East.Sun.COM>
>    Date: Mon, 07 Aug 2000 19:09:50 -0400
>    Heh.  Actually, one of the ideas I'm kicking around:
>    If we have the system sign a "birth certificate" when it reboots
>    (including a reboot time or boot sequence number), we could include
>    that with a "bad spi" ICMP error and in the negotiation of the IKE SA.
>    This pushes the burden of reestablishing state to the end which
>    already thinks it has shared state and has traffic it wants to send.
>    The system which is receiving packets to unknown spi's merely has to
>    respond with a simple message which involves no real-time cryptography
>    (it should, of course, be rate limited).
>    The system receiving the error message can discard it if it doesn't
>    correspond to existing state or if it's "old news" (i.e., you get
>    replay protection); if it's not old news, you can rate-limit how often
>    you attempt to verify the signature.
> (with my wg-chair hat off)
> I like this idea a lot.  It solves the problem quite nicely, without
> having to deal with the complexities of negotiating whether or not to do
> heartbeats, and it avoids the possibility that a temporary loss of
> connectivity (perhaps caused by a lame LAN emulation over an ATM
> backbone :-) causes an SA to get unnecessarily flushed.
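
For concreteness, here is how I read the receiving side of Bill's proposal,
as a rough Python sketch.  The names (PeerState, BirthCertificate,
handle_bad_spi_error), the certificate layout, and the 5-second interval are
my own assumptions, not part of the proposal, and signature verification is
passed in as a callable rather than implemented.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PeerState:
    """What we remember about a peer we believe we share SAs with."""
    last_boot_seq: int                           # newest boot sequence number verified so far
    last_verify_attempt: float = float("-inf")   # time of the last signature check


@dataclass
class BirthCertificate:
    """Hypothetical layout: a statement signed once per reboot."""
    peer_id: str
    boot_seq: int
    signature: bytes


def handle_bad_spi_error(
    cert: BirthCertificate,
    peers: Dict[str, PeerState],
    verify_signature: Callable[[BirthCertificate], bool],
    min_verify_interval: float = 5.0,            # assumed rate limit, in seconds
    now: Optional[float] = None,
) -> str:
    """Decide what to do with a "bad spi" error carrying a birth certificate."""
    now = time.monotonic() if now is None else now

    state = peers.get(cert.peer_id)
    if state is None:
        return "discard: no existing state"

    # Replay protection: anything not newer than what we verified is old news.
    if cert.boot_seq <= state.last_boot_seq:
        return "discard: old news"

    # Rate-limit the expensive step (the public-key signature check).
    if now - state.last_verify_attempt < min_verify_interval:
        return "discard: rate limited"
    state.last_verify_attempt = now

    if not verify_signature(cert):
        return "discard: bad signature"

    # The peer really did reboot: remember that and rebuild the SAs.
    state.last_boot_seq = cert.boot_seq
    return "renegotiate"
```

Note that nothing here runs unless an error message arrives, which in turn
only happens if the side holding state sends traffic.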

If I understand this proposal correctly, it depends on the SG that still has
state actually transmitting data.  In a client-to-SG scenario, most of the
traffic from the SG to the client occurs only as the result of the client making
some transaction request.  Thus when the client disconnects, there is a high
probability that no traffic will flow in the direction of the client.  This
mechanism doesn't seem to allow the SG to reliably determine when the peer has
lost state.

> The key here is that it distinguishes between temporary loss of network
> connectivity from a peer reboot (and loss of IPSEC state).   There are
> other ways of getting the same result:  You could double check the loss
> of a heartbeat by attempting an unsecured ping to the IPSEC endpoint.  If
> the ping works, but the heartbeat doesn't, then it's likely that the
> communications peer has lost its state, and you need to flush all of the
> SA's and renegotiate them.  On the other hand, if you can't ping your
> peer, it's probably just a random network outage, and it's better to
> *not* flush the SA state in that case.
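
Spelled out, the check Ted describes is the following decision table (a
sketch only; the function name and the return strings are mine):

```python
def diagnose(heartbeat_ok: bool, ping_ok: bool) -> str:
    """Combine a secured heartbeat with an unsecured ping, as described above."""
    if heartbeat_ok:
        # The peer answered inside the SA, so its state is intact.
        return "keep SAs"
    if ping_ok:
        # Something answers at that address but not inside the SA:
        # assume the peer lost its IPSEC state.
        return "flush SAs and renegotiate"
    # Nothing answers at all: treat it as a network outage.
    return "keep SAs (probable network outage)"
```

The middle branch is the one that rests on the address still belonging to
the same peer.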

Pinging the IP address that previously belonged to your peer just doesn't work.
The only thing that uniquely identifies your peer is the cryptographic
information used to identify him and the keys negotiated during SA
establishment.  You just can't assume that the IP address is still valid.  In a
dynamic address assignment world, after the peer disconnects the address is
either going to be used by someone else (who you don't trust yet) or is going to
be unused.  IMO, any "ping" must be bound to the SA for it to have any value
whatsoever.

It seems to me that the only solution proposed so far that has merits over any
keepalive proposal is short SA lifetimes.  This is obviously the easiest
proposal to implement.  However, I am not sure I am crazy about refreshing my
keys every 1 to 2 minutes, especially if I am configured for PFS and am
terminating 1000s of tunnels.
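
To put rough numbers on that, a back-of-the-envelope sketch (it assumes one
rekey per tunnel per lifetime, spread evenly over time; the function name is
mine):

```python
def rekeys_per_second(tunnels: int, lifetime_seconds: float) -> float:
    """Average rekey rate if every tunnel rekeys once per SA lifetime."""
    return tunnels / lifetime_seconds


for lifetime in (60, 120):
    rate = rekeys_per_second(1000, lifetime)
    print(f"{lifetime:>3} s lifetime: {rate:.1f} rekeys/s")
```

For 1000 tunnels that is roughly 8 to 17 rekeys per second, sustained, and
with PFS each one carries a fresh Diffie-Hellman exchange.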


> Still,  Bill's approach is much cleaner, and much simpler.  
> 						- Ted