
Re: heartbeats (summary of responses)



On Wed, 8 Dec 1999, Slava Kavsan wrote:
> Good summary Andrew!
> 
> My vote goes to Phase1-based heartbeats using never-going-away IKE SAs

I still disagree with this. It runs counter to what has been discussed
previously about continuous channel mode. I don't want to require anyone to
keep their IKE SA up all the time, only when it is needed.

The question I've seen is: Well, won't you need it all the time?

Answer: No. In some cases, when you might have keepalives only with ipsec
traffic (for dial-up scenarios), you may have the link go quiet for extended
periods of time, during which you certainly do NOT need to keep the IKE SA
around. The normal negotiated lifetimes apply, and after that *poof* it goes
away (on both sides, I might add). When more ipsec traffic needs to be sent,
and new keepalives need to be sent, you renegotiate a phase 1.

> (or
> skeletal ones for resource-restricted implementations - just enough to protect
> Informational messages). I would also like to suggest using Ack-ed NOTIFY
> mechanism and not to invent yet another scheme. Heartbeat management messages
> will also be useful.
> 
I don't think we need to burden this scheme with more management messages.
All you need to do is to negotiate the kind of keepalive you want
(continuous, or only with ipsec traffic) and what frequency (if applicable),
and so on. After that, don't worry about it.
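
To make the negotiation concrete, here is a minimal Python sketch of the two
things that need agreeing on up front: the keepalive type and the frequency.
The names (KeepaliveMode, KeepaliveParams, keepalive_due) are made up for
illustration and are not taken from any draft. Treating "ipsec traffic seen
within the last interval" as the test for the traffic-only mode is my
assumption.

```python
from dataclasses import dataclass
from enum import Enum


class KeepaliveMode(Enum):
    CONTINUOUS = "continuous"            # send regardless of traffic
    WITH_TRAFFIC = "with-ipsec-traffic"  # send only while the tunnel is in use


@dataclass(frozen=True)
class KeepaliveParams:
    mode: KeepaliveMode
    interval: float  # seconds between keepalives, agreed up front


def keepalive_due(params, now, last_sent, last_ipsec_traffic):
    """Return True if a keepalive should be sent at time `now` (seconds)."""
    if now - last_sent < params.interval:
        return False  # too soon since the previous keepalive
    if params.mode is KeepaliveMode.CONTINUOUS:
        return True
    # WITH_TRAFFIC: stay quiet unless ipsec traffic was seen within the
    # last interval (this idle threshold is an assumption).
    return now - last_ipsec_traffic <= params.interval
```

Once the parameters are fixed, each side only has to evaluate something like
keepalive_due() on a timer. No further management messages are involved.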

Uni-directional heartbeats (to be negotiated by each side, i.e. each side
requests heartbeats to be sent from the other side) are acceptable, too, I
think, for the long run. I understand the argument about this cutting
messages down by half. As long as you can negotiate the frequency and the
type up front, I think this is acceptable (despite what some of us already
have implemented (in more than one way)).

jan


> Andrew Krywaniuk wrote:
> 
> > Hi all. It's been a few days since my original post on this subject and
> > there have been a fair number of replies. Thanks to everyone who offered
> > comments. This message is a summary of all replies plus some comments from
> > me.
> >
> > I'm attaching the original message for context:  <<RE: Heartbeats (was RE:
> > keepalives)>>
> >
> > Most people expressed an interest in host-referenced heartbeats. However,
> > they disagreed on what mechanism should be used to transport the heartbeats.
> > Some people preferred a phase 1 solution; others preferred a phase 2
> > solution; a couple of people preferred clear pings. Only a couple of people
> > (Tero) agreed that it is desirable to support more than one kind of
> > heartbeat protocol.
> >
> > Clear Pings:
> >
> > I don't believe that clear pings are an acceptable solution. We can't
> > possibly prevent every kind of DoS attack, but we have an obligation to not
> > make them *easy*.
> >
> > Phase 1 Sa-Referenced Heartbeats:
> >
> > In the absence of phase 1 host-referenced heartbeats, these basically tell
> > you if the SA is still up. Since you have a reliable indication of when the
> > SA is up, an attacker can't induce you to renegotiate (a DoS threat) by
> > sending unauthenticated Notify Invalid Cookies (or spoofing a message with
> > invalid cookies). Also, it speeds up negotiation of phase 2s, since you know
> > ahead of time whether you need to negotiate a phase 1 first. No one showed
> > much interest in this type of heartbeat.
> >
> > Phase 1 Host-Referenced Heartbeats:
> >
> > The biggest problem with this idea is that phase 1 heartbeats seem to be
> > incompatible with dangling SA awareness. We have already established that
> > many vendors are not willing to use the continuous channel model, therefore
> > any phase 1 heartbeat scheme must be dangling SA aware.
> >
> > I know Jan has disputed this, but I think that most of us will want to send
> > heartbeat packets at regular intervals. If you set your heartbeat rate to
> > once every 30 seconds, how many people would want to renegotiate their phase
> > 1 at 30 second intervals (under low memory conditions)?
> >
> > On the other hand, I believe that we could get around this limitation by not
> > permitting dangling SAs, but allowing 'pseudo-dangling' SAs instead. In this
> > scenario, implementations would not be permitted to delete their phase 1s at
> > will. However, they would be allowed to convert them into a skeletal form
> > (pseudo-phase 1), which contains just enough information to receive
> > heartbeats (and probably, by extension, info modes), but not enough to
> > negotiate QMs or use advanced features.
> >
> > This would kill two birds with one stone. It would allow implementations to
> > save memory by discarding unused phase 1 info, but it would still allow them
> > to send phase 2 deletes without renegotiating the phase 1. It would also
> > allow applications to send host-referenced heartbeat packets in phase 1,
> > which IMHO is the right place to send management-type packets.
> >
> > I know Dan said he would look into using "inline Isakmp" messages to
> > accomplish this, which is a form of pseudo-phase 1 SA as well; in order to
> > send the message, you still have to keep a state. Instead, why not have the
> > pseudo-phase 1 just store the encryption and authentication objects, plus
> > the iv for info modes... I guess you'd need to track the phase 1 lifetime
> > params as well. Plus the heartbeat info, of course. Anything else? Wouldn't
> > this be easier (and less computationally intensive) than
> > generating/verifying an RSA signature on every heartbeat? (and it wouldn't
> > expose known-plaintext RSA sig pairs to a passive attacker) (P.S. I have to
> > give Slava credit for this idea since I just noticed he proposed it already)
> >
> > The only other concern that I have heard about phase 1 heartbeats is that
> > some people are worried that the IKE daemon may crash independently of the
> > IPSEC task and they don't want a failure in IKE to cause the phase 2 SAs to
> > go away. This may be an issue in load-sharing situations where the two
> > processes may be running on different boxes.
> >
> > Jan suggested that the phase 1 heartbeat inform the peer which phase 2s are
> > still up. I don't think this would be much of an issue if we had phase 1
> > heartbeats AND acknowledged info modes. (This also assumes that the IKE
> > daemon 'knows' when a phase 2 goes down unpredictably.)
> >
> > I was thinking that probably we would want to use a periodic heartbeat
> > message with a replay counter (i.e. a synchronous uni-directional
> > heartbeat). Derrell commented that some vendors have already implemented a
> > query-type heartbeat protocol. This seems less useful to me than a periodic
> > uni-directional protocol. (How do you know when to query? If you wait until
> > you need to send a packet then you have high latency. If you query on a
> > regular basis then you get the same results as the synchronous protocol, but
> > use twice the bandwidth.)
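
For illustration, the receiving side of a periodic uni-directional heartbeat
with a replay counter might look like the Python sketch below. It assumes
the heartbeat has already been authenticated under the phase 1 SA and that
the counter is strictly increasing. The class and method names are
hypothetical.

```python
class HeartbeatReceiver:
    """Accepts heartbeats whose counter is higher than any seen so far."""

    def __init__(self):
        self.highest_seen = 0  # counters are assumed to start at 1

    def accept(self, counter):
        # A counter that is not strictly greater than the highest one
        # seen so far is treated as a replay and rejected.
        if counter <= self.highest_seen:
            return False
        self.highest_seen = counter
        return True
```

A lost heartbeat only leaves a gap in the counter sequence, so the sender
never has to wait for an acknowledgement.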
> >
> > Phase 2 Host-Referenced Heartbeats:
> >
> > This is an alternative to host-referenced heartbeats of the phase 1 variety.
> > The advantage is that they do not rely on the existence of the phase 1 SA,
> > so they are automatically compatible with dangling implementations.
> >
> > Still, they don't solve every possible scenario. In the same way that a
> > phase 1 heartbeat only tells you that the Isakmp daemon is running and not
> > necessarily that the Ipsec layer is working, there is the possibility that
> > one Ipsec SA may crash but another may continue working. In particular, I'm
> > thinking of load-sharing scenarios where the tunnel endpoint is the same but
> > the SAs are running on different host processors.
> >
> > I'm assuming that we would send them across a black-to-black tunnel. This
> > would normally be used for management traffic, but Jan tells me that l2tp
> > uses it as well. However, it would be easy to tell the l2tp traffic from the
> > heartbeat traffic since the next protocol in the header would be l2tp, not
> > icmp.
> >
> > Someone thought we should define a completely new protocol for ipsec
> > heartbeats. This doesn't seem realistic to me. I suppose if ping was
> > unacceptable we could define a new Isakmp message type and allow it to be
> > interpreted as a heartbeat if it is received on an ipsec tunnel. But asking
> > IANA for a new protocol number just for the heartbeats seems presumptuous.
> >
> > Jan also suggested the possibility of a new doi for heartbeat messages. That
> > wouldn't be such a bad idea for host-referenced heartbeats. After all, they
> > don't need to 'hijack' existing SAs, so there is no reason why they couldn't
> > be negotiated within a completely separate domain. It might make this into a
> > slightly bigger project than I intended, though. This also provides a
> > potential fix for the load-sharing scenario. Since there is no requirement
> > to
> >
> > There is also some potential for legacy support. If an existing host allows
> > negotiation of black-to-black SAs and the SPD allows pings then the legacy
> > host can be monitored for liveness without any negotiation. I do think we
> > need to support negotiation in the long run, though.
> >
> > A couple of people suggested sending a ping to the red (internal IP). I
> > don't agree with this, since it would then require the peer to have
> > knowledge of the internal IP, which is unnecessary (although I suppose the
> > parameters for this could be negotiated).
> >
> > Phase 2 Sa-Referenced Heartbeats:
> >
> > A couple of people expressed interest in these, mostly as a
> > resource-friendly alternative to host-referenced heartbeats. In remote
> > access scenarios, typically there is only 1 (or only a few) phase 2 SA
> > between the peers; adding a host-referenced heartbeat SA would double the
> > resource requirements (assuming you are dropping the phase 1s). By hijacking
> > an existing phase 2 you can save memory and negotiation time.
> >
> > I also foresaw two other uses: One was a simple solution to the load sharing
> > scenario (sending the heartbeat on a hijacked SA guarantees that the
> > physical host that is sending the ping is the same one that is transmitting
> > on the SA). The other possibility is that some phase 2 SAs (probably in the
> > VPN backbone) will be deemed 'critical' (in that they guarantee less than
> > 0.01% downtime or such). In order to ensure that these SAs are not only up,
> > but also WORKING (i.e. the state on the peers is synchronized), it might be
> > desirable to send a regular heartbeat message.
> >
> > Jan expressed concern that this would slow down Ipsec processing too much.
> > That depends. How many people verify the source/dest addresses for the
> > tunneled packet against the ones in the SPD? If you are already doing this
> > for every packet then you lose no speed; if you just check the spi currently
> > then adding the check for the IPs would slow you down.
> >
> > But I was thinking, maybe this is another case where using a separate doi
> > would be useful. I wouldn't want to negotiate separate SAs for the new doi
> > (that would mostly defeat the point), but maybe a doi of IPSEC_MANAGEMENT
> > could be used to identify management traffic that is flowing on an existing
> > Ipsec SA.
> >
> > Jan expressed concern about not interfering with customer billing
> > information; however, all the proposed solutions have already taken this into
> > account. Ditto with not allowing heartbeats to interfere with inactivity
> > timers.
> >
> > Tero commented that phase 2 host-referenced heartbeats could be implemented
> > as a special case of phase 2 sa-referenced heartbeats, which is what I
> > originally intended.
> >
> > Heartbeat Negotiation Protocols:
> >
> > I didn't really bring this up before. I assume that we will need to have
> > some kind of general negotiation framework in place (a subject that was
> > discussed briefly and then dropped). In the meantime, I will probably label
> > this an 'experimental feature' and use a vendor id; however, since there will
> > undoubtedly be parameters (e.g. heartbeat type, heartbeat frequency,
> > recovery action, IP address to ping, etc.) regardless of which heartbeat
> > format we decide on, maybe we will need a config exchange as well.
> >
> > Andrew
> > _______________________________________________
> >  Beauty without truth is insubstantial.
> >  Truth without beauty is unbearable.
> >
> >   ------------------------------------------------------------------------
> >
> > Subject: RE: Heartbeats (was RE: keepalives)
> > Date: Fri, 3 Dec 1999 16:15:16 -0500
> > From: Andrew Krywaniuk <akrywaniuk@TimeStep.com>
> > To: ipsec@lists.tislabs.com
> >
> > How about, to get this discussion going, I suggest a format and you (the
> > list) tell me if it seems appropriate. I can put this in a draft if there is
> > interest in standardization.
> >
> > I think that different types of heartbeats (phase 1/phase 2,
> > SA-referenced/host-referenced) provide different services, and we need to
> > support all kinds.
> >
> > I use the term 'heartbeat' throughout. If you prefer keep-alive, etc. then
> > search & replace with your favorite term. When I do refer to keep-alives at
> > the end of the document, I use Tim's definition (a mechanism for disabling
> > the peer's inactivity timeout).
> >
> > ----------------------------------------------------------------------------
> >
> > As I see it, there are two types of heartbeats: phase 1 heartbeats and phase
> > 2 heartbeats.
> >
> > PHASE 1 HEARTBEATS:
> >
> > Phase 1 heartbeats tell you if the phase 1 SA is still up. Therefore, they
> > also tell you that the peer is still there. However, this is a sufficient
> > but not necessary condition. The peer may be a dangling implementation, in
> > which case they may not send phase 1 heartbeats even though they are still
> > running.
> >
> > However, phase 1 heartbeats still have a use because they ensure that the
> > peers will always agree on whether a phase 1 exists. It avoids the
> > clumsiness of one peer trying to send a message on the phase 1, receiving
> > NOTIFY_INVALID_COOKIES and then timing out, before realizing that the phase
> > 1 is down and needs to be rekeyed.
> >
> > Also, as was discussed in an earlier thread, using NOTIFY_INVALID_COOKIES as
> > a means of determining when an SA has gone down is vulnerable to DoS
> > attacks, whereas heartbeats are not.
> >
> > I'm not going to discuss a format for phase 1 heartbeats in this post
> > because IMHO it's not a technically difficult issue. Any one of a million
> > packet formats would suffice (info mode, config mode, acknowledged info
> > mode, some new exchange) and the only real issue is getting a group of
> > people to settle on any particular one.
> >
> > PHASE 2 HEARTBEATS:
> >
> > There are two types of phase 2 heartbeats: host-referenced and
> > SA-referenced. A host-referenced heartbeat is a protocol that runs across a
> > dedicated phase 2 SA between the two peers. An SA-referenced heartbeat is a
> > protocol that runs across an existing (user) SA.
> >
> > Host-referenced heartbeats can only be used to detect if the peer is still
> > up and running. Therefore, they are of limited use. (However, the fact that
> > they don't carry any sensitive information means that they would never
> > need to be deleted before their natural lifetime. Therefore, they would be
> > the most reliable means of detecting if the peer is still alive since there
> > is no possibility of a phase 2 delete being lost.)
> >
> > SA-referenced heartbeats detect if a specific phase 2 SA is still working.
> > They also probably tell you when the peer is not there, since you wouldn't
> > expect a phase 2 SA to disappear without receiving a delete (although I've
> > been hearing some discussion of this recently on the list). However, they
> > are probably the most useful type of heartbeat, which is why I am going to
> > discuss them here.
> >
> > SA-referenced Phase 2 heartbeats are more technically complicated than other
> > heartbeats because:
> >
> > 1) They must not interfere with the peer's inactivity timeouts.
> > 2) They must not disturb any accounting services that may be running.
> > 3) They must not result in any packets ending up on the peer's red network.
> > 4) They must not assume that a phase 1 SA exists between the two peers.
> >
> > It is not, in general, possible to satisfy all of these constraints without
> > some degree of cooperation. Therefore, both peers must be aware of the
> > heartbeat scheme that is being used (i.e. it must be negotiated).
> >
> > In light of these constraints, I propose the following format:
> >
> > Every X seconds, peer 1 (the initiator) sends an encrypted ping to peer 2
> > and peer 2 replies. In order to distinguish these pings from user traffic,
> > the source and destination addresses are set to the hosts' black IPs. If
> > either side fails to receive a heartbeat within N*X seconds then they can
> > assume that the SA has gone down (and they should send a delete for it). (If
> > they fail to receive a ping but they receive other traffic on the SA then
> > something has gone wrong and they should log the event). Replay protection
> > is not required, as IPSec automatically provides it.
> >
> > It is not necessary for peer 2 to ever initiate the pings. However, to
> > increase reliability, if peer 2 does not receive a ping during the normal
> > window [X, X*3/2], he may force the issue by initiating a ping in the
> > opposite direction.
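
The timing rules above can be sketched in a few lines of Python. This is
only one reading of the proposal: a peer is considered "alive" while the
silence is at most X*3/2, the responder may "probe" (ping in the reverse
direction) once that window has passed, and the SA is presumed "dead" after
N*X seconds of silence. The HeartbeatMonitor class and its state names are
hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    x: float           # heartbeat interval X, in seconds
    n: int             # multiplier N for the dead-peer timeout
    last_heard: float  # time the last heartbeat arrived

    def on_heartbeat(self, now):
        """Record the arrival time of a heartbeat."""
        self.last_heard = now

    def state(self, now):
        silence = now - self.last_heard
        if silence >= self.n * self.x:
            return "dead"   # assume the SA is down and send a delete
        if silence > 1.5 * self.x:
            return "probe"  # window [X, X*3/2] missed: ping the other way
        return "alive"
```

With X = 30 and N = 3, a peer last heard at t = 0 is "alive" at t = 40,
due a "probe" at t = 50, and presumed "dead" at t = 90.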
> >
> > This technique has the following advantages:
> >
> > 1) It satisfies all of the above constraints.
> > 2) It does not require the host to have any knowledge of the peer's red IP
> > or red subnet.
> > 3) Ping has universal brand-name recognition as a heartbeat protocol.
> > Therefore, no special payload format is required.
> >
> > and the following disadvantages:
> >
> > 1) The SPD must make a specific exception for ping packets between the black
> > IPs.
> > 2) The accounting service should know not to bill the user for this traffic.
> >
> > However, I believe these disadvantages will be inherent in any SA-referenced
> > heartbeat scheme.
> >
> > Note that a Host-referenced heartbeat scheme could be constructed in the
> > same way as an SA-referenced scheme, simply by negotiating a dedicated SA
> > using the black IPs as the endpoints. This could be done in tunnel mode
> > (presumably using the same policy exception that is used for SA-referenced
> > heartbeats) or it could simply be done in transport mode.
> >
> > FUTURE CONSIDERATIONS:
> >
> > One potential limitation of this scheme is that it does not generalize well
> > to keep-alives. The use of ping as a packet format is simple, but it doesn't
> > allow us to specify any additional information (all it says is
> > STILL_CONNECTED). It may be desirable to send extra information in the
> > packet. For example, a simple keep-alive (in the literal sense) scheme would
> > be to take the heartbeat scheme and add one extra bit of information (E.g.
> > STILL_CONNECTED, IDLE_TIMEOUT=disabled). On the other hand, I would prefer
> > that idle timeouts be disabled via a negotiated attribute of the SA (if
> > feature negotiation ever gets standardized).
> >
> > There is no particular reason to use ping as transport, except for the fact
> > that it is already a universally accepted packet format and requires no
> > approval from IANA.
> >
> > Andrew
> > _______________________________________________
> >  Beauty without truth is insubstantial.
> >  Truth without beauty is unbearable.
> 
> 
> 

 --
Jan Vilhuber                                            vilhuber@cisco.com
Cisco Systems, San Jose                                     (408) 527-0847


