RE: Display types?


At 10:42 AM 3/16/99 -0500, Gunther Schadow wrote:
>... I forgot one thing. There are problems with canonical encodings
>for multilingual character sets, since the boundary between two
>characters sometimes is not clear. Accents, diacritical marks, and
>ligatures as they exist in almost all languages in the world
>(including English) pose a problem. Not only in Unicode do you have
>multiple ways to express such constructs. For instance you could
>represent an e-acute in at least three different ways: [EACUTE]
>(precoordinated), [E][ACUTE], or [ACUTE][E] (typewriter style).
>In general, for a canonical form you want to decompose as much as
>possible. Especially you want to get rid of those stylized ligatures
>(that are often wrong anyway). For instance, the ff, fi, fl, ffi, and
>ffl ligatures do only exist in roman or gothic fonts (not in fixed
>width fonts) and are often wrong if they span a word boundary in a
>composite word. English language does not have so many composite
>words, but German has. "Giffy" might have the ffi ligature, but
>"Auffassung" may not. You definitely want to keep this typographical
>stuff out of SPKI.

We decided a while back that what you're expressing in a canonical form is 
bytes, not what appears on a screen.  You are not signing the pixels 
displayed to the user -- only the bytes used to transport those pixels.  If 
there are ambiguous ways to express something -- e.g., é -- then we're using 
(and signing) whatever choice the original composer made.  I agree that
if you're doing security involving what a user sees, you need to have 
secured the entire path to the user -- and that means pixels -- but that's a 
topic way too big for SPKI or any other certificate work.
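
To illustrate the bytes-versus-pixels point, here is a minimal sketch in Python. It is not taken from any SPKI draft. The UTF-8 encoding and the SHA-256 hash (standing in for the input to a signature) are assumptions chosen only for the example. Two spellings of é that look identical on screen produce different bytes, so they would be signed as different objects.

```python
import hashlib
import unicodedata

# Two ways to write the same visible character (illustrative choice).
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

# Assumption: the transport encoding is UTF-8.
bytes_a = precomposed.encode("utf-8")
bytes_b = decomposed.encode("utf-8")

# Assumption: SHA-256 stands in for whatever hash a signature would cover.
digest_a = hashlib.sha256(bytes_a).hexdigest()
digest_b = hashlib.sha256(bytes_b).hexdigest()

same_bytes = bytes_a == bytes_b
same_digest = digest_a == digest_b

# Normalizing would collapse the two forms, but that is a step the signer
# would have to choose; signing the composer's bytes does not do it.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

print(bytes_a, bytes_b, same_bytes, same_digest, same_after_nfc)
```

The verifier checks exactly the bytes the composer produced; whether the two forms render alike is outside what the signature covers.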

And, as someone who sometimes writes German, I would of course like Latin-1

 -- Carl

