
Re: heartbeats (summary of responses)



Good summary Andrew!

My vote goes to Phase 1-based heartbeats using never-going-away IKE SAs (or
skeletal ones for resource-restricted implementations - just enough to protect
Informational messages). I would also suggest using the Ack-ed NOTIFY
mechanism rather than inventing yet another scheme. Heartbeat management
messages will also be useful.

Andrew Krywaniuk wrote:

> Hi all. It's been a few days since my original post on this subject and
> there have been a fair number of replies. Thanks to everyone who offered
> comments. This message is a summary of all replies plus some comments from
> me.
>
> I'm attaching the original message for context:  <<RE: Heartbeats (was RE:
> keepalives)>>
>
> Most people expressed an interest in host-referenced heartbeats. However,
> they disagreed on what mechanism should be used to transport the heartbeats.
> Some people preferred a phase 1 solution; others preferred a phase 2
> solution; a couple of people preferred clear pings. Only a couple of people
> (Tero) agreed that it is desirable to support more than one kind of
> heartbeat protocol.
>
> Clear Pings:
>
> I don't believe that clear pings are an acceptable solution. We can't
> possibly prevent every kind of DoS attack, but we have an obligation to not
> make them *easy*.
>
> Phase 1 Sa-Referenced Heartbeats:
>
> In the absence of phase 1 host-referenced heartbeats, these basically tell
> you if the SA is still up. Since you have a reliable indication of when the
> SA is up, an attacker can't induce you to renegotiate (a DoS threat) by
> sending unauthenticated Notify Invalid Cookies (or spoofing a message with
> invalid cookies). Also, it speeds up negotiation of phase 2s, since you know
> ahead of time whether you need to negotiate a phase 1 first. No one showed
> much interest in this type of heartbeat.
>
> Phase 1 Host-Referenced Heartbeats:
>
> The biggest problem with this idea is that phase 1 heartbeats seem to be
> incompatible with dangling SA awareness. We have already established that
> many vendors are not willing to use the continuous channel model, therefore
> any phase 1 heartbeat scheme must be dangling SA aware.
>
> I know Jan has disputed this, but I think that most of us will want to send
> heartbeat packets at regular intervals. If you set your heartbeat rate to
> once every 30 seconds, how many people would want to renegotiate their phase
> 1 at 30 second intervals (under low memory conditions)?
>
> On the other hand, I believe that we could get around this limitation by not
> permitting dangling SAs, but allowing 'pseudo-dangling' SAs instead. In this
> scenario, implementations would not be permitted to delete their phase 1s at
> will. However, they would be allowed to convert them into a skeletal form
> (pseudo-phase 1), which contains just enough information to receive
> heartbeats (and probably, by extension, info modes), but not enough to
> negotiate QMs or use advanced features.
>
> This would kill two birds with one stone. It would allow implementations to
> save memory by discarding unused phase 1 info, but it would still allow them
> to send phase 2 deletes without renegotiating the phase 1. It would also
> allow applications to send host-referenced heartbeat packets in phase 1,
> which IMHO is the right place to send management-type packets.
>
> I know Dan said he would look into using "inline Isakmp" messages to
> accomplish this, which is a form of pseudo-phase 1 SA as well; in order to
> send the message, you still have to keep a state. Instead, why not have the
> pseudo-phase 1 just store the encryption and authentication objects, plus
> the iv for info modes... I guess you'd need to track the phase 1 lifetime
> params as well. Plus the heartbeat info, of course. Anything else? Wouldn't
> this be easier (and less computationally intensive) than
> generating/verifying an RSA signature on every heartbeat? (and it wouldn't
> expose known-plaintext RSA sig pairs to a passive attacker) (P.S. I have to
> give Slava credit for this idea since I just noticed he proposed it already)
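
The skeletal (pseudo-phase 1) state listed above could be pictured roughly as follows. This is only a sketch: the class names, field names and the to_skeletal helper are hypothetical, and keeping the cookies to identify the SA is an assumption added here in answer to "Anything else?".

```python
from dataclasses import dataclass, field


@dataclass
class FullPhase1:
    """Hypothetical full phase 1 state; all names are illustrative only."""
    cookies: tuple          # assumption: (initiator, responder) cookies identify the SA
    enc_key: bytes          # encryption object
    auth_key: bytes         # authentication object
    info_iv: bytes          # IV for info modes
    expires_at: float       # phase 1 lifetime, as an absolute time in seconds
    heartbeat_counter: int  # heartbeat info
    negotiation_state: dict = field(default_factory=dict)  # whatever QMs need


@dataclass
class SkeletalPhase1:
    """Just enough to receive heartbeats and info modes."""
    cookies: tuple
    enc_key: bytes
    auth_key: bytes
    info_iv: bytes
    expires_at: float
    heartbeat_counter: int

    def can_negotiate_quick_mode(self) -> bool:
        # A skeletal SA must be renegotiated before any QM can run.
        return False


def to_skeletal(sa: FullPhase1) -> SkeletalPhase1:
    """Keep only the items named in the quoted paragraph; drop the rest."""
    return SkeletalPhase1(sa.cookies, sa.enc_key, sa.auth_key,
                          sa.info_iv, sa.expires_at, sa.heartbeat_counter)
```

The point of the sketch is only that the conversion is one-way and cheap: nothing has to be recomputed, the unused negotiation state is simply discarded.
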
>
> The only other concern that I have heard about phase 1 heartbeats is that
> some people are worried that the IKE daemon may crash independently of the
> IPSEC task and they don't want a failure in IKE to cause the phase 2 SAs to
> go away. This may be an issue in load-sharing situations where the two
> processes may be running on different boxes.
>
> Jan suggested that the phase 1 heartbeat inform the peer which phase 2s are
> still up. I don't think this would be much of an issue if we had phase 1
> heartbeats AND acknowledged info modes. (This also assumes that the IKE
> daemon 'knows' when a phase 2 goes down unpredictably.)
>
> I was thinking that probably we would want to use a periodic heartbeat
> message with a replay counter (i.e. a synchronous uni-directional
> heartbeat). Derrell commented that some vendors have already implemented a
> query-type heartbeat protocol. This seems less useful to me than a periodic
> uni-directional protocol. (How do you know when to query? If you wait until
> you need to send a packet then you have high latency. If you query on a
> regular basis then you get the same results as the synchronous protocol, but
> use twice the bandwidth.)
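
A minimal sketch of the receiving side of such a periodic uni-directional heartbeat with a replay counter. It assumes the message has already been authenticated and decrypted under the phase 1 SA; the class name and the missed_allowed tolerance are hypothetical and not something proposed in the thread.

```python
class HeartbeatReceiver:
    """Accepts periodic one-way heartbeats that carry a replay counter."""

    def __init__(self, interval: float, missed_allowed: int = 3):
        self.interval = interval              # expected seconds between heartbeats
        self.missed_allowed = missed_allowed  # hypothetical tolerance before giving up
        self.last_counter = -1                # highest counter accepted so far
        self.last_seen = None                 # time of the last accepted heartbeat

    def on_heartbeat(self, counter: int, now: float) -> bool:
        """Return True if the heartbeat is fresh, False if it is a replay."""
        if counter <= self.last_counter:
            return False
        self.last_counter = counter
        self.last_seen = now
        return True

    def peer_alive(self, now: float) -> bool:
        """The peer counts as alive while heartbeats keep arriving on schedule."""
        if self.last_seen is None:
            return False
        return (now - self.last_seen) <= self.interval * self.missed_allowed
```

With this shape the receiver never has to decide when to query; it only compares the current time against the last accepted heartbeat.
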
>
> Phase 2 Host-Referenced Heartbeats:
>
> This is an alternative to host-referenced heartbeats of the phase 1 variety.
> The advantage is that they do not rely on the existence of the phase 1 SA,
> so they are automatically compatible with dangling implementations.
>
> Still, they don't solve every possible scenario. In the same way that a
> phase 1 heartbeat only tells you that the Isakmp daemon is running and not
> necessarily that the Ipsec layer is working, there is the possibility that
> one Ipsec SA may crash but another may continue working. In particular, I'm
> thinking of load-sharing scenarios where the tunnel endpoint is the same but
> the SAs are running on different host processors.
>
> I'm assuming that we would send them across a black-to-black tunnel. This
> would normally be used for management traffic, but Jan tells me that l2tp
> uses it as well. However, it would be easy to tell the l2tp traffic from the
> heartbeat traffic since the next protocol in the header would be l2tp, not
> icmp.
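
As a rough illustration of telling heartbeat traffic from other traffic on a black-to-black tunnel, the sketch below classifies a decapsulated packet by its addresses and next protocol. The protocol numbers come from Python's socket module; the function name, the category labels and the check for L2TP as UDP port 1701 are assumptions made for the example.

```python
import socket

L2TP_UDP_PORT = 1701  # assumption: L2TP is carried over UDP on this port


def classify(src, dst, proto, dst_port, black_a, black_b):
    """Classify a decapsulated packet (hypothetical helper).

    src/dst are the inner addresses, proto the IP protocol number,
    dst_port the UDP destination port (or None), and black_a/black_b
    the two tunnel endpoints' black IPs.
    """
    if {src, dst} != {black_a, black_b}:
        return "user"          # not between the black IPs: ordinary traffic
    if proto == socket.IPPROTO_ICMP:
        return "heartbeat"     # ping between the black IPs
    if proto == socket.IPPROTO_UDP and dst_port == L2TP_UDP_PORT:
        return "l2tp"
    return "management"        # anything else between the black IPs
```
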
>
> Someone thought we should define a completely new protocol for ipsec
> heartbeats. This doesn't seem realistic to me. I suppose if ping was
> unacceptable we could define a new Isakmp message type and allow it to be
> interpreted as a heartbeat if it is received on an ipsec tunnel. But asking
> IANA for a new protocol number just for the heartbeats seems presumptuous.
>
> Jan also suggested the possibility of a new doi for heartbeat messages. That
> wouldn't be such a bad idea for host-referenced heartbeats. After all, they
> don't need to 'hijack' existing SAs, so there is no reason why they couldn't
> be negotiated within a completely separate domain. It might make this into a
> slightly bigger project than I intended, though. This also provides a
> potential fix for the load-sharing scenario. Since there is no requirement
> to
>
> There is also some potential for legacy support. If an existing host allows
> negotiation of black-to-black SAs and the SPD allows pings then the legacy
> host can be monitored for liveness without any negotiation. I do think we
> need to support negotiation in the long run, though.
>
> A couple of people suggested sending a ping to the red (internal IP). I
> don't agree with this, since it would then require the peer to have
> knowledge of the internal IP, which is unnecessary (although I suppose the
> parameters for this could be negotiated).
>
> Phase 2 Sa-Referenced Heartbeats:
>
> A couple of people expressed interest in these, mostly as a
> resource-friendly alternative to host-referenced heartbeats. In remote
> access scenarios, typically there is only 1 (or only a few) phase 2 SA
> between the peers; adding a host-referenced heartbeat SA would double the
> resource requirements (assuming you are dropping the phase 1s). By hijacking
> an existing phase 2 you can save memory and negotiation time.
>
> I also foresaw two other uses: One was a simple solution to the load sharing
> scenario (sending the heartbeat on a hijacked SA guarantees that the
> physical host that is sending the ping is the same one that is transmitting
> on the SA). The other possibility is that some phase 2 SAs (probably in the
> VPN backbone) will be deemed 'critical' (in that they guarantee less than
> 0.01% downtime or such). In order to ensure that these SAs are not only up,
> but also WORKING (i.e. the state on the peers is synchronized), it might be
> desirable to send a regular heartbeat message.
>
> Jan expressed concern that this would slow down Ipsec processing too much.
> That depends. How many people verify the source/dest addresses for the
> tunneled packet against the ones in the SPD? If you are already doing this
> for every packet then you lose no speed; if you just check the spi currently
> then adding the check for the IPs would slow you down.
>
> But I was thinking, maybe this is another case where using a separate doi
> would be useful. I wouldn't want to negotiate a separate SA for the new doi
> (that would mostly defeat the point), but maybe a doi of IPSEC_MANAGEMENT
> could be used to identify management traffic that is flowing on an existing
> Ipsec SA.
>
> Jan expressed concern about not interfering with customer billing
> information, however all the proposed solutions have already taken this into
> account. Ditto with not allowing heartbeats to interfere with inactivity
> timers.
>
> Tero commented that phase 2 host-referenced heartbeats could be implemented
> as a special case of phase 2 sa-referenced heartbeats, which is what I
> originally intended.
>
> Heartbeat Negotiation Protocols:
>
> I didn't really bring this up before. I assume that we will need to have
> some kind of general negotiation framework in place (a subject that was
> discussed briefly and then dropped). In the meantime, I will probably label
> this an 'experimental future' and use a vendor id; however, since there will
> undoubtedly be parameters (e.g. heartbeat type, heartbeat frequency,
> recovery action, IP address to ping, etc.) regardless of which heartbeat
> format we decide on, maybe we will need a config exchange as well.
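
To make the parameter list above concrete, here is a sketch of what a negotiated heartbeat configuration might hold. Every name and every enum value is hypothetical; the thread names the parameters (type, frequency, recovery action, IP address to ping) but defines no encoding or value set for them.

```python
from dataclasses import dataclass
from enum import Enum
from ipaddress import ip_address
from typing import Optional


class HeartbeatType(Enum):
    PHASE1_SA = "phase1-sa"
    PHASE1_HOST = "phase1-host"
    PHASE2_SA = "phase2-sa"
    PHASE2_HOST = "phase2-host"


class RecoveryAction(Enum):   # hypothetical set of actions
    LOG_ONLY = "log"
    DELETE_SA = "delete"
    RENEGOTIATE = "renegotiate"


@dataclass(frozen=True)
class HeartbeatParams:
    kind: HeartbeatType
    interval_seconds: int                # heartbeat frequency
    recovery: RecoveryAction
    ping_address: Optional[str] = None   # only meaningful for ping-based schemes

    def __post_init__(self):
        if self.interval_seconds <= 0:
            raise ValueError("heartbeat interval must be positive")
        if self.ping_address is not None:
            ip_address(self.ping_address)  # raises ValueError if malformed
```
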
>
> Andrew
> _______________________________________________
>  Beauty without truth is insubstantial.
>  Truth without beauty is unbearable.
>
>   ------------------------------------------------------------------------
>
> Subject: RE: Heartbeats (was RE: keepalives)
> Date: Fri, 3 Dec 1999 16:15:16 -0500
> From: Andrew Krywaniuk <akrywaniuk@TimeStep.com>
> To: ipsec@lists.tislabs.com
>
> How about, to get this discussion going, I suggest a format and you (the
> list) tell me if it seems appropriate. I can put this in a draft if there is
> interest in standardization.
>
> I think that different types of heartbeats (phase 1/phase 2,
> SA-referenced/host-referenced) provide different services, and we need to
> support all kinds.
>
> I use the term 'heartbeat' throughout. If you prefer keep-alive, etc. then
> search & replace with your favorite term. When I do refer to keep-alives at
> the end of the document, I use Tim's definition (a mechanism for disabling
> the peer's inactivity timeout).
>
> ----------------------------------------------------------------------------
>
> As I see it, there are two types of heartbeats: phase 1 heartbeats and phase
> 2 heartbeats.
>
> PHASE 1 HEARTBEATS:
>
> Phase 1 heartbeats tell you if the phase 1 SA is still up. Therefore, they
> also tell you that the peer is still there. However, this is a sufficient
> but not necessary condition. The peer may be a dangling implementation, in
> which case they may not send phase 1 heartbeats even though they are still
> running.
>
> However, phase 1 heartbeats still have a use because they ensure that the
> peers will always agree on whether a phase 1 exists. It avoids the
> clumsiness of one peer trying to send a message on the phase 1, receiving
> NOTIFY_INVALID_COOKIES and then timing out, before realizing that the phase
> 1 is down and needs to be rekeyed.
>
> Also, as was discussed in an earlier thread, using NOTIFY_INVALID_COOKIES as
> a means of determining when an SA has gone down is vulnerable to DoS
> attacks, whereas heartbeats are not.
>
> I'm not going to discuss a format for phase 1 heartbeats in this post
> because IMHO it's not a technically difficult issue. Any one of a million
> packet formats would suffice (info mode, config mode, acknowledged info
> mode, some new exchange) and the only real issue is getting a group of
> people to settle on any particular one.
>
> PHASE 2 HEARTBEATS:
>
> There are two types of phase 2 heartbeats: host-referenced and
> SA-referenced. A host-referenced heartbeat is a protocol that runs across a
> dedicated phase 2 SA between the two peers. An SA-referenced heartbeat is a
> protocol that runs across an existing (user) SA.
>
> Host-referenced heartbeats can only be used to detect if the peer is still
> up and running. Therefore, they are of limited use. (However, the fact that
> they don't carry any sensitive information means that they would never
> need to be deleted before their natural lifetime. Therefore, they would be
> the most reliable means of detecting if the peer is still alive since there
> is no possibility of a phase 2 delete being lost.)
>
> SA-referenced heartbeats detect if a specific phase 2 SA is still working.
> They also probably tell you when the peer is not there, since you wouldn't
> expect a phase 2 SA to disappear without receiving a delete (although I've
> been hearing some discussion of this recently on the list). However, they
> are probably the most useful type of heartbeat, which is why I am going to
> discuss them here.
>
> SA-referenced Phase 2 heartbeats are more technically complicated than other
> heartbeats because:
>
> 1) They must not interfere with the peer's inactivity timeouts.
> 2) They must not disturb any accounting services that may be running.
> 3) They must not result in any packets ending up on the peer's red network.
> 4) They must not assume that a phase 1 SA exists between the two peers.
>
> It is not, in general, possible to satisfy all of these constraints without
> some degree of cooperation. Therefore, both peers must be aware of the
> heartbeat scheme that is being used (i.e. it must be negotiated).
>
> In light of these constraints, I propose the following format:
>
> Every X seconds, peer 1 (the initiator) sends an encrypted ping to peer 2
> and peer 2 replies. In order to distinguish these pings from user traffic,
> the source and destination addresses are set to the hosts' black IPs. If
> either side fails to receive a heartbeat within N*X seconds then they can
> assume that the SA has gone down (and they should send a delete for it). (If
> they fail to receive a ping but they receive other traffic on the SA then
> something has gone wrong and they should log the event). Replay protection
> is not required, as IPSec automatically provides it.
>
> It is not necessary for peer 2 to ever initiate the pings. However, to
> increase reliability, if peer 2 does not receive a ping during the normal
> window [X, X*3/2], he may force the issue by initiating a ping in the
> opposite direction.
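
The timing rules in the two paragraphs above can be sketched as a small state tracker. This is one reading of the proposal, not a definitive implementation: the class and method names are hypothetical, and "other traffic but no ping" is interpreted as user traffic seen after the last heartbeat.

```python
class SaHeartbeatMonitor:
    """Tracks heartbeat timing for one phase 2 SA (X = interval, N = multiplier)."""

    def __init__(self, x: float, n: int, started_at: float = 0.0):
        self.x = x
        self.n = n
        self.last_heartbeat = started_at
        self.last_user_traffic = started_at

    def on_heartbeat(self, now: float) -> None:
        self.last_heartbeat = now

    def on_user_traffic(self, now: float) -> None:
        self.last_user_traffic = now

    def responder_should_ping(self, now: float) -> bool:
        """Peer 2 may initiate a ping once the normal window [X, 3X/2] has passed."""
        return (now - self.last_heartbeat) > 1.5 * self.x

    def status(self, now: float) -> str:
        if (now - self.last_heartbeat) < self.n * self.x:
            return "up"
        if self.last_user_traffic > self.last_heartbeat:
            return "anomaly"  # traffic still arrives but heartbeats do not: log it
        return "down"         # assume the SA is gone and send a delete
```

For example, with X = 10 and N = 3, a responder that last saw a ping at time 0 would start pinging back after time 15 and would declare the SA down at time 30.
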
>
> This technique has the following advantages:
>
> 1) It satisfies all of the above constraints.
> 2) It does not require the host to have any knowledge of the peer's red IP
> or red subnet.
> 3) Ping has universal brand-name recognition as a heartbeat protocol.
> Therefore, no special payload format is required.
>
> and the following disadvantages:
>
> 1) The SPD must make a specific exception for ping packets between the black
> IPs.
> 2) The accounting service should know not to bill the user for this traffic.
>
> However, I believe these disadvantages will be inherent in any SA-referenced
> heartbeat scheme.
>
> Note that a Host-referenced heartbeat scheme could be constructed in the
> same way as an SA-referenced scheme, simply by negotiating a dedicated SA
> using the black IPs as the endpoints. This could be done in tunnel mode
> (presumably using the same policy exception that is used for SA-referenced
> heartbeats) or it could simply be done in transport mode.
>
> FUTURE CONSIDERATIONS:
>
> One potential limitation of this scheme is that it does not generalize well
> to keep-alives. The use of ping as a packet format is simple, but it doesn't
> allow us to specify any additional information (all it says is
> STILL_CONNECTED). It may be desirable to send extra information in the
> packet. For example, a simple keep-alive (in the literal sense) scheme would
> be to take the heartbeat scheme and add one extra bit of information (E.g.
> STILL_CONNECTED, IDLE_TIMEOUT=disabled). On the other hand, I would prefer
> that idle timeouts be disabled via a negotiated attribute of the SA (if
> feature negotiation ever gets standardized).
>
> There is no particular reason to use ping as transport, except for the fact
> that it is already a universally accepted packet format and requires no
> approval from IANA.
>
> Andrew
> _______________________________________________
>  Beauty without truth is insubstantial.
>  Truth without beauty is unbearable.



