
Re: Peer liveliness



Just for clarification, DPD makes no mention of sending IC.  It's "out of scope" 
;-).

-g

Gregory Lebovitz wrote:
> Moving the discussion back to IKEv2, for the moment...
> 
> Several of us have spent a lot of time discussing this issue in the past few
> weeks. A main problem we are trying to solve (though not the only one) is
> rapid recovery from a rebooted peer.
> 
> If you look at the current DPD draft for IKEv1, it calls for sending
> INITIAL-CONTACT whenever a peer thinks this is its first contact, i.e. has
> no established SAs with the remote peer. This is done, even in the case
> where DPD is running on both peers, to let the other peer -- the persisted
> peer (as opposed to the rebooted peer) -- know to delete the old SAs asap.
> Because the DPD timers might not catch it fast enough.
> 
> It is a very good idea to do this because sending the empty notify (DPD),
> and the timer setting for how often, are totally optional. Therefore,
> depending on settings, and *without* the INITIAL-CONTACT, it could be quite
> some time before the persisted peer relinquishes its current SAs. 
> 
> Charlie, I can't remember, is the sending of INITIAL-CONTACT a MUST in the
> latest IKEv2 draft? Would it be a good idea to make INITIAL-CONTACT
> notification a MUST, if it is not already? Doing so would help shorten the
> tunnel black hole in most cases, regardless of dpd settings.
> 
> The next question is: what is the best behavior for a (rebooted) peer who
> receives an invalid SPI? Today the mandate is to drop silently. But, if two
> rules are checked first, it can be fine (I think) to respond. Those rules
> are:
>   - do I have an active SA with the sender of the invalid SPI? If yes, drop
> silently. If no, go to next rule check...
>   - do I have the source IP of the sender in my SPD? i.e. is the sender a
> valid peer? If no, drop silently. If yes...
>   - initiate IKE per SPD definition.
> 
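
For concreteness, the check order above could be sketched roughly as
follows. This is only an illustration in Python, not something taken from
any draft: the Gateway class, its two sets (stand-ins for the SADB and the
SPD), and the Action names are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    DROP_SILENTLY = "drop silently"
    INITIATE_IKE = "initiate IKE"


@dataclass
class Gateway:
    # Hypothetical in-memory stand-ins for the SADB and the SPD.
    active_sa_peers: set = field(default_factory=set)  # peers we hold an SA with
    spd_peers: set = field(default_factory=set)        # peers named in our policy

    def on_invalid_spi(self, src_ip: str) -> Action:
        # Rule 1: we already hold an active SA with the sender, so stay quiet.
        if src_ip in self.active_sa_peers:
            return Action.DROP_SILENTLY
        # Rule 2: only peers configured in the SPD deserve a response.
        if src_ip not in self.spd_peers:
            return Action.DROP_SILENTLY
        # Both checks passed: start a new exchange as the SPD prescribes.
        return Action.INITIATE_IKE
```

A freshly rebooted gateway would start with an empty active_sa_peers set, so
any packet with an unknown SPI from a configured peer leads to INITIATE_IKE.
That is exactly the window the spoofing concern below is about.
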
> If these two rules are followed, the only threat I see to responding with
> IKE initiation is that an attacker who knew all of my valid peers' IPs
> could, at the moment of recovery from reboot (or power up), cause me to
> establish IKE with all my peers listed in SPD, even though I might not have
> otherwise made those establishments. Attacker would do so by sending me
> invalid SPIs spoofed with source of each of my peers. I guess I see this as
> a pretty tough attack to pull-off in the real world (given spoof checking
> used on most ISP routers these days), and the pay-off of the attack likely
> doesn't merit the difficulty of execution. Does the value merit the risk? 
> 
> Summary: IKEv2 aliveness checking doesn't ensure fast recovery. It provides
> a mechanism that MAY be used for fast detection and recovery, but doesn't
> guarantee it. However, combining the initiate-IKE response behavior +
> INITIAL-CONTACT + liveness detection would ensure VERY fast
> re-establishments for valid peers after one rebooted (and covers all other
> cases too). If the liveness checking doesn't catch the failure fast enough,
> the initiate IKE response w/ IC will. 
> 
> Thoughts?
> 
> Gregory.
> 
> 
>>-----Original Message-----
>>From: Ravi [mailto:ravivsn@roc.co.in]
>>Sent: Wednesday, May 14, 2003 9:59 PM
>>To: Charlie_Kaufman@notesdev.ibm.com
>>Cc: Gregory Lebovitz; 'ddukes@cisco.com'; ipsec@lists.tislabs.com;
>>Michael Choung Shieh; owner-ipsec@lists.tislabs.com
>>Subject: Re: Peer liveliness
>>
>>
>>Hi,
>>  In IKEv2, the IKE SA is bound to the IPsec SAs, and the IPsec SAs
>>  (Child SAs) are deleted whenever the IKE SA is dead. Because of this,
>>  I don't see any problem with the approach in the IKEv2 specification.
>>  In IKEv1, however, this binding is not mandated, and an IPsec SA can
>>  exist without a corresponding IKE SA. This is where I see a problem,
>>  and the current DPD specification does not seem to consider it. I
>>  proposed earlier the need for Dead Tunnel detection on the remote
>>  SGs. I plan to come out with a draft on this in 1 to 2 weeks. It is
>>  only applicable to IKEv1 implementations.
>>
>>Regards
>>Ravi
>>
>>Charlie_Kaufman@notesdev.ibm.com wrote:
>>
>>>
>>>
>>>I believe that the current IKEv2 spec addresses this issue in a way that
>>>puts minimal requirements on implementations, guarantees interoperability
>>>(though with less than ideal convergence time), and allows implementations
>>>to do better.
>>>
>>>But it's quite possible that I don't understand all of the things that
>>>could go wrong, or have inadequately expressed what implementations MUST
>>>do, or just plain screwed up.
>>>
>>>The implementation requirements for robust interoperability are:
>>>
>>>(1) An IKE SA and all of its associated child SAs fail together. You aren't
>>>allowed a "partial crash" where some of the state is lost but some is kept.
>>>This will fall out naturally in most implementations, but may require some
>>>modular designs to have different modules poll one another for liveness.
>>>
>>>(2) A node may not send on a set of SAs associated with a single IKE SA
>>>indefinitely without hearing something back. If it hears nothing for long
>>>enough, it should send an IKE message requiring a reply, and if no reply
>>>comes it must declare all of the SAs dead.
>>>
>>>(3) A node that has packets to send according to its SPD and no SA to send
>>>them on must periodically attempt to open an SA for them.
>>>
>>>I believe these three requirements alone guarantee that the right thing
>>>will happen eventually. But it doesn't prescribe what the timers should be.
>>>So it's possible it will take unacceptably long for things to converge. (If
>>>network delays are long enough and timeouts short enough, the system could
>>>fail to work at all, but I believe that problem is unavoidable).
>>>
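
Requirements (1) and (2) can be pictured as a small timer-driven state
machine. The sketch below is only an illustration in Python: the IkeSa
class, its field names, and the returned strings are hypothetical, and the
two timeout values are arbitrary assumptions, since the timers are left
unspecified. It also simplifies (2) by probing after any period of silence
rather than only while the node is actively sending.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IkeSa:
    # Timer values are illustrative assumptions; nothing above prescribes them.
    idle_timeout: float = 30.0    # seconds of silence before probing
    reply_timeout: float = 10.0   # seconds to wait for the probe reply
    last_heard: float = 0.0
    probe_sent_at: Optional[float] = None
    child_spis: list = field(default_factory=list)
    alive: bool = True

    def heard_from_peer(self, now: float) -> None:
        # Any protected traffic from the peer resets the liveness state.
        self.last_heard = now
        self.probe_sent_at = None

    def tick(self, now: float) -> str:
        if not self.alive:
            return "dead"
        if self.probe_sent_at is not None:
            if now - self.probe_sent_at >= self.reply_timeout:
                # Requirement (1): the IKE SA and its children fail together.
                self.alive = False
                self.child_spis.clear()
                return "dead"
            return "waiting"
        if now - self.last_heard >= self.idle_timeout:
            # Requirement (2): send an IKE message that demands a reply.
            self.probe_sent_at = now
            return "send-probe"
        return "ok"
```

With these numbers a dead peer is noticed at most 40 seconds after it was
last heard from. Shorter timers converge faster but risk false positives on
a slow path, which is the trade-off described above.
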
>>>The problem with more sophisticated strategies is that they may be
>>>exploitable for denial of service attacks. Anyone can forge an INVALID_SPI
>>>notification message from an IP address of their choice (since such a
>>>message is not cryptographically protected). If such a message were
>>>sufficient to cause its recipient to shut down and restart the SA, it would
>>>be a very effective attack. So the spec says that such a message may be
>>>used only as a hint to a problem - for example to trigger a
>>>cryptographically protected liveness test. This will cause the failure to
>>>be detected more quickly, but will never cause one to be detected falsely.
>>>Similarly, the INITIAL_CONTACT notification can be used when setting up an
>>>SA to assure the other end that it should abandon any SAs it has open to
>>>the same identity. This is useful in - for example - the firewall case
>>>where an identity is tied to a single box and it would be an error for that
>>>box to bring up two connections at once. It would not be useful in the case
>>>of a user who is allowed to remotely log in from multiple workstations at
>>>the same time. Again, this makes convergence happen faster while never
>>>making the wrong thing happen.
>>>
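
The "hint only" rule for INVALID_SPI might look something like this in
code. It is a sketch under assumptions, in Python: the Notify type, the
authenticated flag, and the returned action strings are hypothetical names
chosen for illustration, not anything defined by the spec.

```python
from dataclasses import dataclass


@dataclass
class Notify:
    kind: str            # e.g. "INVALID_SPI"
    authenticated: bool  # True only if it arrived protected by an IKE SA


def handle_invalid_spi(notify: Notify) -> str:
    """Return the action to take for an INVALID_SPI notification."""
    if notify.kind != "INVALID_SPI":
        raise ValueError("not an INVALID_SPI notification")
    if notify.authenticated:
        # Protected by an IKE SA: the claim can be trusted, so abandon the SA.
        return "delete-sa"
    # Anyone can forge this; use it only to trigger a protected liveness test.
    return "start-liveness-probe"
```

The point of the split is that a forged message can at worst cost one
liveness exchange. It can never tear down a working SA by itself.
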
>>>Responding to the individual comments below...
>>>
>>>Gregory Lebovitz <Gregory@netscreen.com> wrote on 04/29/2003:
>>>
>>>
>>>>[WE] won't achieve interoperability unless it's mandated that
>>>>[IMPLEMENTORS] must
>>>>
>>>>
>>>>>reply INVALID_SPI (in clear or initiate IKE back to the sender)
>>>>>whenever it receives bad SPI packets.  The current IKEv2 draft doesn't
>>>>>address this issue (it only states you MAY reply with a clear notify
>>>>>message).
>>>>>
>>>>>IKEv1 vendors have implemented many ways to solve it, which leaves poor
>>>>>interoperability.  We should just pick a method and clarify it in IKEv2.
>>>>>===============
>>>>>Michael Shieh
>>>>>
>>>>
>>>I think we did, but if you don't think it works, explain why.
>>>
>>>
>>>
>>>>We have been having quite a debate in the ICSA IPsec consortium mail list
>>>>recently trying to figure out how to handle this in IKEv1 (YES, STILL!!!)
>>>>Here is what we know for sure of this problem statement:
>>>>(a) detecting liveness/deadness of peer is a good thing, but does not solve
>>>>all the failure cases in and of itself
>>>
>>>Which ones does it not solve?
>>>
>>>
>>>
>>>>(b) the behavior of a recently rebooted device when it receives an
>>>>encrypted packet for an SPI or IKE-SA not in its SADB MUST be mandated, or
>>>>else implementations will not interoperate (as is the case in IKEv1, 5 years
>>>>later).
>>>
>>>Can you give an example of how two implementations following IKEv2 could
>>>fail to interoperate?
>>>
>>>
>>>
>>>>(c) the behavior of a peer that receives a new IKE from a peer that it has
>>>>an existing IKE-SA with (i.e. the rebooted peer that is trying to initiate a
>>>>new connection) MUST be mandated, or else implementations will not
>>>>interoperate (as is the case in IKEv1, 5 years later).
>>>
>>>I believe it is mandated that the new IKE-SA must be accepted, and the old
>>>one either closed immediately or closed after a timeout, though perhaps
>>>that's just what I was thinking and not what I wrote. Is there anything
>>>specific you would recommend?
>>>
>>>
>>>
>>>>Darren Dukes wrote:
>>>>
>>>>
>>>>>I believe INVALID_SPI does what you are looking for.  If I receive an
>>>>>INVALID_SPI notify via an IKE SA I know to delete the SA and traffic will
>>>>>bring up a new one.
>>>>
>>>>I don't believe this will work, since it assumes that an IKE SA is
>>>>established. In the scenario, the IKE-SA would have been lost along with the
>>>>SPI of the CHILD-SA by the rebooted peer.
>>>>
>>>
>>>Until a new IKE-SA is established, any INVALID_SPI message would be
>>>cryptographically unprotected and therefore not to be taken as other
>>>than a hint. If a new IKE-SA is established, the INVALID_SPI could
>>>be taken as trustworthy and used to abandon the old SA. Without the
>>>INVALID_SPI message, abandonment would still happen but it would take
>>>longer.
>>>
>>>
>>>
>>>>Recommendations to solve the problem:
>>>>- the empty notify as an aliveness check is a good idea. It accomplishes
>>>>what the DPD draft did. Keep using this.
>>>>
>>>
>>>Generating them is not mandated, but the ability to respond to them is.
>>>
>>>>- do what you can to use empty notify to detect dead peer ASAP. The faster
>>>>the persisting peer can delete the old SPI and IKE-SA, the better. The best
>>>>case is for Persisting Peer to detect death and initiate new IKE to rebooted
>>>>peer before rebooted peer gets packets with old SPI, IKE-SA.
>>>>
>>>
>>>If the rebooted peer knows that the SA is needed, it can do that. If it
>>>sets them up based on traffic, it has to wait until a packet comes in from
>>>one side or the other.
>>>
>>>
>>>
>>>>- On the Rebooted peer side: If an implementation receives a protected
>>>>packet from an unknown SPI,
>>>>- simply relying on sending back an unprotected INVALID_SPI is not a good
>>>>idea. It is too easy to DoS the persisting peer by simply spoofing the
>>>>rebooted peer's address.
>>>>- initiate IKE to the persisting peer.
>>>
>>>This is allowed, although sending what looks like protected messages from
>>>randomly chosen IP addresses to cause the node to attempt lots of IKE
>>>connections is also a plausible DOS attack. Sending the INVALID_SPI message
>>>will tell the other end to probe this end for liveness and initiate its own
>>>new IKE connection if that liveness test fails. That's the path guaranteed
>>>to work. Others will speed things up if implementations choose to do them.
>>>
>>>>- On the Persisting Peer:
>>>>- If you get a new IKE request from a peer already in your SADB, respond
>>>>with the under-attack, 6 message method. This will mitigate the DoS attack.
>>>>If you get all the way through SA and TS negotiation successfully, you are
>>>>assured (unless I'm missing something) that this really is your peer, and
>>>>that he re-initiated because he lost the original IKE-SA. Start using the
>>>>new IKE-SA and the new CHILD-SA and delete the previous ones after some wait
>>>>period.
>>>>
>>>
>>>Only if there is an INITIAL_CONTACT notification message. Otherwise it's
>>>possible that the peer is opening multiple IKE SAs, perhaps because he is
>>>replicated. In some configurations this might be acceptable. In firewall to
>>>firewall tunnels, it would not and an implementation might reasonably treat
>>>any IKE-SA as an INITIAL_CONTACT.
>>>
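
The persisting-peer behaviour described here (replace old SAs only on
INITIAL_CONTACT, unless local policy allows a single SA per identity) could
be sketched as below. Again this is just an illustration in Python: SaTable,
single_sa_per_identity, and accept_new_ike_sa are hypothetical names, and
the sketch assumes the new IKE SA has already been fully authenticated
before the function is called.

```python
from dataclasses import dataclass, field


@dataclass
class SaTable:
    # Maps an authenticated peer identity to the IKE SA ids open for it.
    by_identity: dict = field(default_factory=dict)
    # Hypothetical policy switch for firewall-to-firewall tunnels.
    single_sa_per_identity: bool = False

    def accept_new_ike_sa(self, identity: str, sa_id: int,
                          initial_contact: bool) -> list:
        """Register a newly authenticated IKE SA; return the SA ids to delete."""
        existing = self.by_identity.setdefault(identity, [])
        stale = []
        if initial_contact or self.single_sa_per_identity:
            # The peer asserts (or policy assumes) it holds no other SAs with us.
            stale = list(existing)
            existing.clear()
        existing.append(sa_id)
        return stale
```

Without the flag and without the policy switch, a second IKE SA from the
same identity simply coexists with the first, which covers the replicated
peer and the user logged in from several workstations.
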
>>>
>>>
>>>>Would this proposal explicitly solve things?
>>>>
>>>>Gregory.
>>>
>>>
>>>      --Charlie
>>>
>>
>>
>>-- 
>>
>>
>>The views presented in this mail are completely mine. The company is not
>>responsible for whatsoever.
>>------------------------------------------------------------------------
>>Ravi Kumar CH
>>Rendezvous On Chip (i) Pvt Ltd
>>Hyderabad, India
>>Ph: +91-40-2335 1214 / 1175 / 1184
>>
>>ROC home page <http://www.roc.co.in>
>>
>>
>>