
RE: Heartbeats draft available



Hi,

Thanks for the explanations. I have some more questions.

>From: Andrew Krywaniuk [akrywani@newbridge.com]
>Sent: Sunday, March 26, 2000 10:12 PM
>To: SRamamoorthi@NetScreen.com
>Cc: ipsec@lists.tislabs.com
>Subject: RE: Heartbeats draft available
>
>Hi Sankar,
>
>> The heartbeat protocol as described does not take data packets into
>> account. If there is data traffic already flowing between the client
>> and gateway, the heartbeat packets from the client to the gateway seem
>> redundant - is there a way data packets can double as
>> heartbeat packets?
>
>Yes, there is some redundancy, but the effect is small. The intent here is
>to keep the heartbeat scheduling simple, and not base it on subjective
>measures such as reception of traffic. (Reception of traffic can be used as
>a sanity check, though.)
>

The other redundancy concern I have is about the symmetric nature
of the described heartbeat protocol. Should two communicating parties
have to independently negotiate heartbeat protocol if they want to
ensure the other one is alive?

Also is the effect really small as you describe?
Consider a gateway hosting lots of clients. Each IKE SA to a client is
separate. If all clients were to send 'notify active' periodically, wouldn't
it add up to considerable overhead on the gateway to process the keep-alive
packets, verify the hash, and so on? The scaling problem worsens if the
gateway negotiates a smaller 'notify active' time interval.
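
As a rough back-of-the-envelope sketch of this concern: if every client sends
one 'notify active' per negotiated interval, the rate the gateway has to
process grows linearly with the number of clients and inversely with the
interval. The function name, client counts and intervals below are made-up
illustration values, not numbers from the draft.

```python
def heartbeat_load(num_clients: int, interval_seconds: float) -> float:
    """Average heartbeats per second a gateway must process.

    Assumes (hypothetically) one heartbeat per client per interval and
    ignores retransmissions and jitter.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return num_clients / interval_seconds


if __name__ == "__main__":
    # Illustrative values only: (number of clients, interval in seconds).
    for clients, interval in [(1_000, 60), (10_000, 60), (10_000, 10)]:
        rate = heartbeat_load(clients, interval)
        print(f"{clients} clients, {interval}s interval: {rate:.1f} msgs/sec")
```

Each of those messages also costs a hash verification, so the per-message
cost multiplies whatever rate comes out of this estimate.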

Also in your draft you state that

   The Stateful Request/Reply Phase 1 Heartbeat:

   When used properly (with a sequence number and a query interval),
   this heartbeat mode is similar to the one described in this document.
   The main difference is that it requires double the bandwidth to do
   the same thing.

Why does it require double the bandwidth? With a stateful Request/Reply
protocol the sender of the request/reply message could send it only
when needed - that should actually save processing time and bandwidth.

With a stateful Request/Reply, all that IKE needs to provide is a
mechanism (echo, echo-reply). Implementations can build a
'proof-of-aliveness' check on top of it without needing a protocol.
What am I missing?


>> I also have concerns with the one-way heartbeat message. It forces
>> negotiation
>> of expected timeout-interval, lost-packet tolerance window
>> and a number of
>> other parameters between the communicating parties. This
>> seems to add a
>> lot of complexity to the protocol and additional administrative knobs.
>
>Only the heartbeat interval is negotiated. The other parameters are not
>actually part of the protocol. The description of the other parameters is
>merely an implementation hint that suggests what seems to be the most
>logical way to decide when the connection is down "beyond a reasonable
>doubt". (This point applies to any heartbeat protocol, BTW, not just the
>one in the draft.)

The complexity is in determining the acceptable heartbeat interval
beforehand. How can the gateway determine a suitable interval to
negotiate? An interval of 2 hours may look appropriate in the
beginning (so that the gateway is not overly sensitive to link or
router failures), but as the load on the system increases it may
want to clean up stale SAs at a faster rate.

>
>> Heartbeat protocol is one way to detect failure of a session
>> - but it seems
>> to tie the failure of a transport with failure of a session.
>> For example if the router in the transport path is down for 5 minutes
>> should the SA's have to be renegotiated?
>
>I can't think of any way to distinguish a failure of transport from a
>failure of the session. Probably yes, the SAs need to be renegotiated.
>However, there is a note at the end of the draft which suggests that in the
>case of dialup connections, we may want to preserve the SAs even though the
>link has gone down. This would need to be negotiated.
>
>> Heartbeat protocol also seems to be a very proactive way of detecting
>> failures.
>> In addition it would be nice to have a ping-like mechanism
>> which would allow a communicating party to detect failures when needed
>> (For example if a gateway detects an SA has been inactive for a period
>> of time, it can issue a ping to detect if the client is still
>> active).
>
>Yes, proactive was the intent. The structure of the protocol allows the
>sender to choose the maximum load it will accept from heartbeats and the
>approximate recovery time. This results in a fixed overhead for the sender
>and it prevents the receiver from issuing pings capriciously (e.g. every
>time it receives an unauthenticated invalid SPI message).

This fixed overhead is per remote peer. Is it scalable?


>
>Your second point is one of the exact reasons why a gateway should not issue
>pings without negotiation. What if the client has an inactivity timer that
>expires every 5 minutes, but the gateway issues a ping whenever it doesn't
>receive any traffic on the SA for 4 minutes? In this case, the SA will stay
>up forever, even though no user traffic is ever being sent.
>

I did not get it. Why are we mixing up these pings with user traffic?
If done correctly, the client would see the ping from the gateway,
reprime its inactivity timer, and not send a ping to the gateway.
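
To make the two readings of the 5-minute/4-minute scenario concrete, here is
a minimal simulation sketch of an SA that carries no user traffic at all. It
assumes the gateway pings at a fixed period while the SA is idle; the
function name and the pings_reset_timer switch are hypothetical and only
model whether the client counts a ping as activity.

```python
from typing import Optional


def client_timeout_fires_at(client_timeout: int,
                            ping_period: int,
                            pings_reset_timer: bool,
                            horizon: int) -> Optional[int]:
    """Return the minute at which the client's inactivity timer fires,
    or None if it has not fired by `horizon`.

    Assumes no user traffic is ever sent and that the gateway pings at
    ping_period, 2 * ping_period, ... (all values in minutes).
    """
    deadline = client_timeout        # timer armed at t = 0
    t = ping_period                  # time of the next gateway ping
    while t <= horizon:
        if t >= deadline:
            return deadline          # timer fired before this ping arrived
        if pings_reset_timer:
            deadline = t + client_timeout   # ping treated as activity
        t += ping_period
    return deadline if deadline <= horizon else None


if __name__ == "__main__":
    # Pings count as activity: the timer keeps being pushed out.
    print(client_timeout_fires_at(5, 4, True, 60))
    # Only user traffic counts: the timer fires on schedule.
    print(client_timeout_fires_at(5, 4, False, 60))
```

Under these assumptions the idle SA outlives the whole simulated hour when
pings reprime the timer, and times out at minute 5 when they do not.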

Also the inactivity timeout has no bearing on SA lifetime - right?

 Thanks,
-- sankar --


