
Re: compression and encryption



When my customers have run compression, I've seen them get acceptable
benefit from non-history-based compression -- which is to say, I think
it's fine to reset the compression state after every packet.  That case
side-steps your analysis.  I don't have enough information at my
fingertips to confirm that your simulation matches the field data I
have seen.

If someone wants to accept the potential impact of a history-based
compression scheme, I don't see any reason to architect the protocol to
stop them.  On the other hand, if there's no 'rough consensus' for
putting compression into ESP, then fine -- don't put it in.

There's still the 'next header' field, so a stand-alone compression
transform can be used.

At 08:54 PM 11/6/96 -0500, you wrote:
>I've been thinking a lot about doing compression over a lossy medium
>such as IP.  The problem is that the receiver can lose compression
>synchronization.  While there may be other ways to handle this,
>one that has been suggested is the ``resync every n packets''
>approach.  That is, the sender would include a sequence number in
>each packet, so the receiver could detect missing packets; every
>so often, a flag bit would be set or the sequence number would be
>set to 0, and a new compression state would be initialized.  This
>has the effect of tying together a sequence of packets -- if a
>packet in the sequence is lost, all subsequent packets until the
>next resync point are also lost, because they cannot be decompressed.
>This implies that compression over a lossy medium, using this resync
>algorithm, acts as a packet loss multiplier -- the loss of one
>packet will often cause the loss of other packets as well.
>
>Call the probability of loss of an individual packet 'p'.  (We will assume
>that loss probability for separate packets is independent.  Whether or
>not this is true may be dependent on the implementation strategies of
>various routers, and on the hardware error characteristics of the
>underlying media.)  Every 'n' packets, the compression state is reset.
>We wish to calculate P, the fraction of packets lost out of a burst of n.
>
>The first packet is dropped with probability p, and hence arrives with
>probability 1-p.  The second packet arrives usably if and only if
>(a) it isn't dropped, and (b) the packet preceding it isn't dropped.
>In other words, the probability of safe arrival of the second packet in
>the burst is (1-p)^2.  Similarly, packet 'i', 1<=i<=n, arrives safely
>(along with all of its predecessors in the burst) with probability
>(1-p)^i.  We can calculate the average number of safely arriving packets
>in a burst by summing these individual arrival probabilities -- (1-p)^i
>-- and dividing by n.  Subtracting that from 1 gives the loss probability:
>
>	P = 1 - (1/n) * sum(i=1 to n) (1-p)^i
>
>(Btw, I've confirmed this equation via a simulation.)
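The closed form can indeed be checked against a quick Monte Carlo run; here
is a minimal sketch in Python (function and variable names are mine, not
part of any proposal):

```python
import random

def loss_fraction(p, n):
    """Closed form: P = 1 - (1/n) * sum_{i=1..n} (1-p)^i."""
    return 1.0 - sum((1.0 - p) ** i for i in range(1, n + 1)) / n

def simulate(p, n, bursts=200_000, seed=1):
    """Monte Carlo: a packet is usable only if it and every earlier
    packet in its burst arrived (compression state resets every n)."""
    rng = random.Random(seed)
    usable = 0
    for _ in range(bursts):
        alive = True
        for _ in range(n):
            if rng.random() < p:
                alive = False   # this packet, and the rest of the
                                # burst, cannot be decompressed
            if alive:
                usable += 1
    return 1.0 - usable / (bursts * n)

print(round(loss_fraction(0.05, 4), 2))   # matches the 0.12 table entry
print(round(simulate(0.05, 4), 2))
```

The simulation never decompresses anything; it only tracks whether the
state for the current burst is still intact, which is all the loss
multiplier depends on.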
>
>Here are the results of the equation for packet loss probabilities
>ranging from .01 to .12, and burst sizes ranging from 1 to 16:
>
>     0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08  0.09  0.10  0.11  0.12
>     ----------------------------------------------------------------------
>  1  0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08  0.09  0.10  0.11  0.12
>  2  0.01  0.03  0.04  0.06  0.07  0.09  0.10  0.12  0.13  0.15  0.16  0.17
>  3  0.02  0.04  0.06  0.08  0.10  0.12  0.13  0.15  0.17  0.19  0.20  0.22
>  4  0.02  0.05  0.07  0.10  0.12  0.14  0.16  0.18  0.21  0.23  0.25  0.27
>  5  0.03  0.06  0.09  0.11  0.14  0.17  0.19  0.22  0.24  0.26  0.29  0.31
>  6  0.03  0.07  0.10  0.13  0.16  0.19  0.22  0.25  0.27  0.30  0.32  0.35
>  7  0.04  0.08  0.11  0.15  0.18  0.21  0.24  0.27  0.30  0.33  0.36  0.38
>  8  0.04  0.09  0.13  0.16  0.20  0.24  0.27  0.30  0.33  0.36  0.39  0.41
>  9  0.05  0.09  0.14  0.18  0.22  0.26  0.29  0.33  0.36  0.39  0.42  0.44
> 10  0.05  0.10  0.15  0.20  0.24  0.28  0.31  0.35  0.38  0.41  0.44  0.47
> 11  0.06  0.11  0.16  0.21  0.26  0.30  0.34  0.37  0.41  0.44  0.47  0.50
> 12  0.06  0.12  0.18  0.23  0.27  0.32  0.36  0.39  0.43  0.46  0.49  0.52
> 13  0.07  0.13  0.19  0.24  0.29  0.33  0.38  0.41  0.45  0.48  0.51  0.54
> 14  0.07  0.14  0.20  0.25  0.30  0.35  0.39  0.43  0.47  0.50  0.54  0.56
> 15  0.08  0.15  0.21  0.27  0.32  0.37  0.41  0.45  0.49  0.52  0.55  0.58
> 16  0.08  0.15  0.22  0.28  0.34  0.38  0.43  0.47  0.51  0.54  0.57  0.60
>
>The first line is the loss probability; the first column is the
>burst size.
>
>We know empirically that a loss probability of .05 is not atypical
>within the U.S.; the transoceanic trunks can have loss probabilities
>of .15 or thereabouts.  (The packet loss figure reported by 'ping' is a
>measure of the round trip packet loss, which is too pessimistic.
>If g is the loss figure reported by ping, the one-way loss can be
>approximated as 1-sqrt(1-g), since the packet and the response
>together have more or less independent loss probabilities.)
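That round-trip-to-one-way conversion is just arithmetic, but it is easy
to get backwards; a sketch of the approximation as stated (independent
loss on probe and reply is the stated assumption):

```python
import math

def one_way_loss(g):
    """Approximate one-way loss from ping's round-trip loss g.
    If probe and reply are lost independently with one-way loss q,
    then 1 - g = (1 - q)^2, so q = 1 - sqrt(1 - g)."""
    return 1.0 - math.sqrt(1.0 - g)

print(round(one_way_loss(0.0975), 3))   # ~10% ping loss -> 0.05 one-way
```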
>
>We also know that when packet losses get above 10%, the TCP congestion
>control algorithms will serve to cut throughput drastically.  (N.B.
>The 10% figure is based on my experience; I haven't analyzed it or
>even checked the literature...)  Our goal, then, must be to keep
>P below .1, or the resulting TCP reactions will more than offset
>the gains from compression.
>
>We can see from the chart that for p=.05, we must keep n<=3; at n=4
>the chart already shows P=.12.  That is, the compression state must
>be reset at least every third packet.  For lossier links, the burst
>size should be 1.
>
>One more problem with larger burst sizes should be noted.  Even
>independent of the congestion control algorithms, TCP does only a
>finite number of retransmissions.  Depending on the stack, this
>number may be surprisingly low.  If too many packets in a burst
>are lost -- or if the resync packet following a group of useless
>packets is lost -- it is possible that the connection will be
>aborted.
>
>Given all this, it isn't clear how to implement compression under
>IPSEC.  Picking a standard now is clearly premature; we need to
>investigate various choices.  For example, 'n' might be negotiated
>during the key management exchange.  But it isn't clear how a host
>would know when to pick a large 'n' or when to pick a small 'n'.
>It might also be desirable to let systems adapt dynamically.  For
>example, a receiver might note the loss rate on received packets,
>as determined by the sequence numbers, and periodically send
>a control packet.  The sender could adjust its 'n' accordingly.
>(Receivers don't need to know 'n'; the change can be made unilaterally
>by the sender.)  Of course, the rate of such messages -- which are
>reminiscent of the link quality messages in PPP -- might need to be
>negotiable too.  But the prospect of any such messages, with the
>consequent implications for packet formats, is yet another reason to
>hold off on a compression standard.
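The sender-side adaptation described above might look roughly like this --
the class name, the .05 target, and the halve/increment policy are all my
own invention for illustration, not anything specified for IPSEC:

```python
class ResyncAdapter:
    """Hypothetical sender-side policy: shrink the resync interval n
    when the receiver reports heavy loss, grow it when loss is light.
    The receiver never needs to know n; only the sender changes it."""

    def __init__(self, n=4, n_max=16, target=0.05):
        self.n = n
        self.n_max = n_max
        self.target = target   # loss rate above which we back off

    def on_loss_report(self, observed_loss):
        if observed_loss > self.target:
            self.n = max(1, self.n // 2)          # back off quickly
        else:
            self.n = min(self.n_max, self.n + 1)  # probe upward slowly
        return self.n

a = ResyncAdapter()
print(a.on_loss_report(0.12))   # heavy loss: 4 -> 2
print(a.on_loss_report(0.01))   # light loss: 2 -> 3
```

Multiplicative decrease with additive increase is just one plausible
policy; the point is only that the adjustment is entirely sender-local.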
>
>We should also analyze just how much compression will help.  On a
>typical dial-up link -- the market for this technology -- I suspect
>that not much text is sent.  GIF and JPEG files are already
>compressed, and are not further compressible.
>
>There is a potentially large gain from VJ-like header compression,
>which we also lose with IPSEC.  It might be possible to use our
>cryptographic state to implement some form of this.  If we use
>per-connection keying, much of the TCP and IP headers can be
>reconstructed by caching the connection ID information in the
>security association table, and rebuilding the headers at the far
>end.  I'll leave it to someone else to do the detailed design, but
>my gut feeling is that this is a bigger win than, say, LZW data
>compression.
>
>A final choice, for some people, might be to let their ISP do the
>encryption at the POP.  (Please, return to your seats.  I'm well
>aware of the implications of this...)  Suppose that someone made
>IPSEC-capable terminal servers, with secure cryptographic hardware.
>The user's travel key could be transmitted to this box via a
>cryptographically protected exchange.  The traffic is thus protected
>from the POP to the user's host (or firewall).  It might even be
>desirable to do AH end-to-end (sacrificing VJ header compression but
>retaining modem or PPP compression), resulting in a final packet format
>of IP-ESP-AH-data.  (I don't remember if this was in Steve Kent's
>list.)  For most users and most data, the threat model is such that
>POP-to-firewall secrecy is good enough, provided that the ISP can't
>impersonate them.  (I would note that most people trust phone lines
>far more than they trust the Internet.  While I suspect that the
>Internet isn't quite as dangerous -- nor phone lines quite as safe --
>as is generally believed, I do think that on the whole, the perception
>is correct.)
>
>Based on all this, I make the following recommendations:
>
>	a) we proceed with ESP as it is now.  No changes should
>	be made, or hooks left, for compression.
>
>	b) the key management exchanges should permit the negotiation
>	of an arbitrary set of compression parameters, for when something
>	is defined.
>
>	c) we investigate IPSEC header compression.
>
>	d) the IPSEC architecture must allow for the header formats
>	resulting from POP-based encryption.
>
>Comments?
>

               Rodney Thayer <rodney@sabletech.com>       +1 617 332 7292
               Sable Technology Corp, 246 Walnut St., Newton MA 02160 USA
               Fax: +1 617 332 7970           http://www.shore.net/~sable
                           Communications Software Development


