
Re: Heartbeats (was RE: keepalives)



Andrew,

I think the most useful kind of heartbeat is one that detects whether the
peer machine is alive, rather than individual SAs (phase 1 or 2). Detecting
a dead peer will allow me to tear down all SAs to or through that peer.
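
To make that concrete, here is a minimal sketch of tearing down SAs per peer
rather than per SA. The PeerTable class, its method names, and the SA
identifier strings are all hypothetical, chosen only for illustration; they
do not come from any IPsec implementation.

```python
from dataclasses import dataclass, field


@dataclass
class PeerTable:
    """Hypothetical index of SAs (phase 1 and phase 2) by peer address."""
    sas_by_peer: dict = field(default_factory=dict)

    def add_sa(self, peer, sa_id):
        # Record that sa_id terminates at, or is routed through, this peer.
        self.sas_by_peer.setdefault(peer, set()).add(sa_id)

    def peer_dead(self, peer):
        # Forget the peer and return every SA that must be torn down.
        return sorted(self.sas_by_peer.pop(peer, set()))
```

A single dead-peer event then yields the full list of SAs to delete, with no
need to time out each SA on its own.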

Andrew Krywaniuk wrote:

> How about, to get this discussion going, I suggest a format and you (the
> list) tell me if it seems appropriate. I can put this in a draft if there is
> interest in standardization.
>
> I think that different types of heartbeats (phase 1/phase 2,
> SA-referenced/host-referenced) provide different services, and we need to
> support all kinds.
>
> I use the term 'heartbeat' throughout. If you prefer keep-alive, etc. then
> search & replace with your favorite term. When I do refer to keep-alives at
> the end of the document, I use Tim's definition (a mechanism for disabling
> the peer's inactivity timeout).
>
> ----------------------------------------------------------------------------
>
> As I see it, there are two types of heartbeats: phase 1 heartbeats and phase
> 2 heartbeats.
>
> PHASE 1 HEARTBEATS:
>
> Phase 1 heartbeats tell you if the phase 1 SA is still up. Therefore, they
> also tell you that the peer is still there. However, this is a sufficient
> but not necessary condition. The peer may be a dangling implementation, in
> which case they may not send phase 1 heartbeats even though they are still
> running.
>
> However, phase 1 heartbeats still have a use because they ensure that the
> peers will always agree on whether a phase 1 exists. It avoids the
> clumsiness of one peer trying to send a message on the phase 1, receiving
> NOTIFY_INVALID_COOKIES and then timing out, before realizing that the phase
> 1 is down and needs to be rekeyed.
>
> Also, as was discussed in an earlier thread, using NOTIFY_INVALID_COOKIES as
> a means of determining when an SA has gone down is vulnerable to DoS
> attacks, whereas heartbeats are not.
>
> I'm not going to discuss a format for phase 1 heartbeats in this post
> because IMHO it's not a technically difficult issue. Any one of a million
> packet formats would suffice (info mode, config mode, acknowledged info
> mode, some new exchange) and the only real issue is getting a group of
> people to settle on any particular one.
>
> PHASE 2 HEARTBEATS:
>
> There are two types of phase 2 heartbeats: host-referenced and
> SA-referenced. A host-referenced heartbeat is a protocol that runs across a
> dedicated phase 2 SA between the two peers. An SA-referenced heartbeat is a
> protocol that runs across an existing (user) SA.
>
> Host-referenced heartbeats can only be used to detect if the peer is still
> up and running. Therefore, they are of limited use. (However, the fact that
> they don't carry any sensitive information means that they would never
> need to be deleted before their natural lifetime. Therefore, they would be
> the most reliable means of detecting if the peer is still alive since there
> is no possibility of a phase 2 delete being lost.)
>
> SA-referenced heartbeats detect if a specific phase 2 SA is still working.
> They also probably tell you when the peer is not there, since you wouldn't
> expect a phase 2 SA to disappear without receiving a delete (although I've
> been hearing some discussion of this recently on the list). However, they
> are probably the most useful type of heartbeat, which is why I am going to
> discuss them here.
>
> SA-referenced Phase 2 heartbeats are more technically complicated than other
> heartbeats because:
>
> 1) They must not interfere with the peer's inactivity timeouts.
> 2) They must not disturb any accounting services that may be running.
> 3) They must not result in any packets ending up on the peer's red network.
> 4) They must not assume that a phase 1 SA exists between the two peers.
>
> It is not, in general, possible to satisfy all of these constraints without
> some degree of cooperation. Therefore, both peers must be aware of the
> heartbeat scheme that is being used (i.e. it must be negotiated).
>
> In light of these constraints, I propose the following format:
>
> Every X seconds, peer 1 (the initiator) sends an encrypted ping to peer 2
> and peer 2 replies. In order to distinguish these pings from user traffic,
> the source and destination addresses are set to the hosts' black IPs. If
> either side fails to receive a heartbeat within N*X seconds then they can
> assume that the SA has gone down (and they should send a delete for it). (If
> they fail to receive a ping but they receive other traffic on the SA then
> something has gone wrong and they should log the event). Replay protection
> is not required, as IPSec automatically provides it.
>
> It is not necessary for peer 2 to ever initiate the pings. However, to
> increase reliability, if peer 2 does not receive a ping during the normal
> window [X, X*3/2], he may force the issue by initiating a ping in the
> opposite direction.
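
The timing rules in the proposal above can be sketched roughly as follows.
This is only an illustration under assumptions: the HeartbeatMonitor class,
its field names, and the "up"/"anomaly"/"down" labels are hypothetical, and
timestamps are plain seconds supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    interval: float              # X: seconds between pings
    max_missed: int              # N: SA is presumed down after N*X seconds
    last_heartbeat: float = 0.0  # time the last ping or reply arrived
    last_traffic: float = 0.0    # time the last user packet arrived on the SA

    def on_heartbeat(self, now):
        self.last_heartbeat = now

    def on_traffic(self, now):
        self.last_traffic = now

    def responder_should_ping(self, now):
        # Peer 2 may initiate a ping once the normal window [X, X*3/2] passes.
        return now - self.last_heartbeat > 1.5 * self.interval

    def status(self, now):
        if now - self.last_heartbeat <= self.max_missed * self.interval:
            return "up"
        if self.last_traffic > self.last_heartbeat:
            # Heartbeats stopped but user traffic kept arriving: log the event.
            return "anomaly"
        return "down"  # assume the SA is gone and send a delete for it
```

With X = 10 and N = 3, a peer that last heard a heartbeat at t = 100 would
let peer 2 start pinging after t = 115 and would treat the SA as down after
t = 130.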
>
> This technique has the following advantages:
>
> 1) It satisfies all of the above constraints.
> 2) It does not require the host to have any knowledge of the peer's red IP
> or red subnet.
> 3) Ping has universal brand-name recognition as a heartbeat protocol.
> Therefore, no special payload format is required.
>
> and the following disadvantages:
>
> 1) The SPD must make a specific exception for ping packets between the black
> IPs.
> 2) The accounting service should know not to bill the user for this traffic.
>
> However, I believe these disadvantages will be inherent in any SA-referenced
> heartbeat scheme.
>
> Note that a Host-referenced heartbeat scheme could be constructed in the
> same way as an SA-referenced scheme, simply by negotiating a dedicated SA
> using the black IPs as the endpoints. This could be done in tunnel mode
> (presumably using the same policy exception that is used for SA-referenced
> heartbeats) or it could simply be done in transport mode.
>
> FUTURE CONSIDERATIONS:
>
> One potential limitation of this scheme is that it does not generalize well
> to keep-alives. The use of ping as a packet format is simple, but it doesn't
> allow us to specify any additional information (all it says is
> STILL_CONNECTED). It may be desirable to send extra information in the
> packet. For example, a simple keep-alive (in the literal sense) scheme would
> be to take the heartbeat scheme and add one extra bit of information (e.g.
> STILL_CONNECTED, IDLE_TIMEOUT=disabled). On the other hand, I would prefer
> that idle timeouts be disabled via a negotiated attribute of the SA (if
> feature negotiation ever gets standardized).
>
> There is no particular reason to use ping as transport, except for the fact
> that it is already a universally accepted packet format and requires no
> approval from IANA.
>
> Andrew
> _______________________________________________
>  Beauty without truth is insubstantial.
>  Truth without beauty is unbearable.

--
Bronislav Kavsan
IRE Secure Solutions, Inc.
100 Conifer Hill Drive  Suite 513
Danvers, MA  01923
voice: 978-539-4816
http://www.ire.com




