Re: Display types?
>>>>> "Niels" == Niels Möller <nisse@lysator.liu.se> writes:
Niels> Paul Koning <pkoning@xedia.com> writes:
>> Why would the mismatch between character count and byte count
>> affect a human? I must be missing something here. If variable
>> length is the issue then Unicode with wide chars would serve
>> (provided they don't start using codes outside the Basic
>> Multilingual Plane, i.e., codes outside the 16-bit space). UTF-8
>> has an advantage, though, in that it encodes characters from the
>> Latin-1 set with the same bitstrings as Latin-1 does.
Niels> The last sentence is wrong. Perhaps you meant some other
Niels> encoding than UTF-8 here? For instance, my last name (written
Niels> in hex, to survive any transformations in the mail system) is
Niels> "4df6 6c6c 6572" in latin-1, "4dc3 b66c 6c65 72" in UTF-8,
Niels> and "004d 00f6 006c 006c 0065 0072" in UTF-16 (assuming
Niels> network byte order).
Right; as has been pointed out, I was mixed up with 7-bit ASCII
backwards compatibility.
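The byte strings quoted above can be checked with a short sketch. This
assumes Python 3 and its standard codec names ("latin-1", "utf-8",
"utf-16-be"); the variable names are only illustrative. It also shows the
7-bit ASCII case, where UTF-8, Latin-1, and ASCII do agree byte for byte.

```python
# "Möller", written with an escape so this source stays pure ASCII.
name = "M\u00f6ller"

latin1 = name.encode("latin-1")
utf8 = name.encode("utf-8")
utf16be = name.encode("utf-16-be")  # network (big-endian) byte order, no BOM

print(latin1.hex())   # 4df66c6c6572
print(utf8.hex())     # 4dc3b66c6c6572
print(utf16be.hex())  # 004d00f6006c006c00650072

# UTF-8 matches Latin-1 only for code points below 0x80 (7-bit ASCII).
ascii_only = "Moller"
same_for_ascii = (
    ascii_only.encode("utf-8")
    == ascii_only.encode("latin-1")
    == ascii_only.encode("ascii")
)
same_for_latin1 = utf8 == latin1

print(same_for_ascii)   # True
print(same_for_latin1)  # False: U+00F6 is one byte in Latin-1, two in UTF-8
```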
The UTF-8 and Unicode discussion in the Plan 9 document that was quoted
earlier makes the point that Unicode in its 16-bit form has the major
problem that its byte order isn't well defined or cleanly handled.
That does sound familiar, and it's totally unacceptable to introduce
such a problem anywhere.
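A minimal sketch of the byte-order problem, again assuming Python 3 and
its standard codecs (the variable names are made up for illustration):
the same 16-bit text has two possible byte sequences, and reading one
with the other's byte order silently yields different characters unless
a byte-order mark is present. UTF-8 is a byte stream and has no such
ambiguity.

```python
import codecs

text = "M\u00f6"  # "Mö"

big = text.encode("utf-16-be")     # bytes 00 4d 00 f6
little = text.encode("utf-16-le")  # bytes 4d 00 f6 00

# Reading big-endian data as little-endian raises no error; it just
# produces the wrong code points (U+4D00 and U+F600).
misread = big.decode("utf-16-le")

# With a byte-order mark in front, the generic "utf-16" codec can detect
# the order and strips the mark while decoding.
with_bom = codecs.BOM_UTF16_BE + big
recovered = with_bom.decode("utf-16")

print(big.hex(), little.hex())          # 004d00f6 4d00f600
print([hex(ord(c)) for c in misread])   # ['0x4d00', '0xf600']
print(recovered == text)                # True
```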
paul