From owner-spki@c2.net Tue Mar 16 12:17:49 1999
Received: from blacklodge.c2.net (blacklodge.c2.net [140.174.185.245])
	by lox.sandelman.ottawa.on.ca (8.8.7/8.8.8) with ESMTP id MAA01263;
	Tue, 16 Mar 1999 12:17:47 -0500 (EST)
Received: (from majordom@localhost)
	by blacklodge.c2.net (8.8.8/8.7.3) id IAA23823
	for spki-outgoing; Tue, 16 Mar 1999 08:15:02 -0800 (PST)
Message-Id: <3.0.3.32.19990316081155.035ecd58@spiritone.com>
X-Sender: cellison@spiritone.com
X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.3 (32)
Date: Tue, 16 Mar 1999 08:11:55 -0800
To: Gunther Schadow
From: Carl Ellison
Subject: RE: Display types?
Cc: ehg@research.bell-labs.com, nisse@lysator.liu.se,
	rgrimm@cs.washington.edu, schadow@aurora.rg.iupui.edu,
	paul@hedonism.demon.co.uk, rivest@theory.lcs.mit.edu, spki@c2.net
In-Reply-To: <199903161542.KAA13704@aurora.rg.iupui.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Sender: owner-spki@c2.net
Precedence: bulk
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by lox.sandelman.ottawa.on.ca id MAA01263

-----BEGIN PGP SIGNED MESSAGE-----

At 10:42 AM 3/16/99 -0500, Gunther Schadow wrote:
>... I forgot one thing. There are problems with canonical encodings
>for multilingual character sets, since the boundary between two
>characters sometimes is not clear. Accents, diacritical marks, and
>ligatures as they exist in almost all languages in the world
>(including English) pose a problem. Not only in Unicode do you have
>multiple ways to express such constructs. For instance you could
>represent an e-acute in at least three different ways: [EACUTE]
>(precoordinated), [E][ACUTE], or [ACUTE][E] (typewriter style).
>
>In general, for a canonical form you want to decompose as much as
>possible. Especially you want to get rid of those stylized ligatures
>(that are often wrong anyway).
>For instance, the ff, fi, fl, ffi, and
>ffl ligatures do only exist in roman or gothic fonts (not in fixed
>width fonts) and are often wrong if they span a word boundary in a
>composite word. English language does not have so many composite
>words, but German has. "Giffy" might have the ffi ligature, but
>"Auffassung" may not. You definitely want to keep this typographical
>stuff out of SPKI.

We decided a while back that what you're expressing in a canonical
form is bytes, not what appears on a screen. You are not signing the
pixels displayed to the user -- only the bytes used to transport those
pixels.

If there are ambiguous ways to express something -- e.g., é -- then
we're using (and signing) whatever choice the original composer made.

I agree that if you're doing security involving what a user sees, you
need to have secured the entire path to the user -- and that means
pixels -- but that's a topic way too big for SPKI or any other
certificate work.

Und, als jemand der Deutsch manchmal schreibt, ich möchte naturlich
Latin-1 benutzen.
[In English: And, as someone who sometimes writes German, I would of
course like to use Latin-1.]

 -- Carl

-----BEGIN PGP SIGNATURE-----
Version: PGP for Personal Privacy 5.5.3

iQCVAwUBNu6CyhN3Wx8QwqUtAQEUEQP+Mj/sEMaOW2H/eDIrF8e3KDZ8BcMY6DZK
yClwu6TSu+epRXwA19ZlOgqvC3FWPJzmF5Ug+SAED4pziw7JuySzEas4bmbTFVqS
7o2paDqNkoNuJYTW7TFd0Rtte5z2Bp7wgkpfhiITxqlh4utzgq1EHTCam0v4XqWv
r0JkAYXP05I=
=RsWp
-----END PGP SIGNATURE-----

+------------------------------------------------------------------+
|Carl M. Ellison       cme@acm.org     http://www.pobox.com/~cme   |
|    PGP:  08FF BA05 599B 49D2  23C6 6FFD 36BA D342                |
+--Officer, officer, arrest that man. He's whistling a dirty song.-+
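
The point made in the message, that a signature covers the transported bytes and not the rendered glyph, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for whatever hash a real signature scheme would apply, UTF-8 is chosen arbitrarily as the transport encoding, and the Unicode normalization forms NFC and NFD are used merely to show that the two spellings are the same character to a reader. None of these choices comes from the thread, and the message itself says SPKI signs whatever the composer wrote rather than normalizing it.

```python
import hashlib
import unicodedata

# Two spellings of the same visible character "é".
precomposed = "\u00e9"   # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def digest(text: str, encoding: str = "utf-8") -> str:
    """Hash the bytes that would be transported (hypothetical helper;
    SHA-256 is an assumption standing in for the signing step)."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()


# The byte sequences differ, so a hash or signature over them differs too.
same_bytes = precomposed.encode("utf-8") == decomposed.encode("utf-8")
same_digest = digest(precomposed) == digest(decomposed)

# Normalization shows the two spellings are canonically equivalent text.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_after_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

# In Latin-1 the precomposed form is a single byte.
latin1_bytes = precomposed.encode("latin-1")

print("same bytes:        ", same_bytes)
print("same digest:       ", same_digest)
print("equal after NFC:   ", same_after_nfc)
print("equal after NFD:   ", same_after_nfd)
print("Latin-1 bytes:     ", latin1_bytes)
```

Both spellings look identical on a screen, yet a verifier that checks bytes treats them as different messages, which is why the composer's original choice is what gets signed.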