
Just a sample of the Echomail archive


 Message 2231 
 Michiel van der Vlist to Nicholas Boel 
 Need volunteers to test another patch 
 03 Mar 24 16:45:34 
 
TID: FMail-W32 2.2.0.0
RFC-X-No-Archive: Yes
TZUTC: 0100
CHRS: UTF-8 4
MSGID: 2:280/5555 65e49e05
REPLY: 1:154/10 65e48d3c
Hello Nicholas,

On Sunday March 03 2024 08:46, you wrote to Vitaliy Aksyonov:

 NB> As for the pseudo-graphics wrapped to the next line, I have a
 NB> (probably dumb) question about this: If the pseudo graphics were
 NB> originally cp437 (single byte) and translated to utf-8, once they are
 NB> translated are they now multiple bytes per character?

I prefer dumb questions, they are easier to answer... ;-)

Yes, they are translated to multi-byte characters: two bytes for most of the
accented and Cyrillic letters used in Fidonet, three bytes for the box-drawing
pseudo-graphics. Only the ASCII characters (0-127) are not translated and so
remain one byte.
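
As a rough illustration of those byte counts, the sketch below decodes single
bytes from a legacy code page and measures their UTF-8 length. The helper name
utf8_length and the sample bytes are picked for this example only; the codec
names are the ones shipped with Python's standard library.

```python
def utf8_length(byte_value: int, codepage: str) -> int:
    """Decode one byte from a legacy code page and return its UTF-8 length."""
    char = bytes([byte_value]).decode(codepage)
    return len(char.encode("utf-8"))


# (byte value, code page) pairs chosen for illustration only
examples = [
    (0x41, "cp437"),  # 'A', plain ASCII
    (0x82, "cp437"),  # 'é', accented Latin letter
    (0x86, "cp866"),  # 'Ж', Cyrillic letter
    (0xC4, "cp437"),  # '─', box-drawing character
]

for byte_value, codepage in examples:
    char = bytes([byte_value]).decode(codepage)
    size = utf8_length(byte_value, codepage)
    print(f"{codepage} 0x{byte_value:02X} {char!r}: {size} byte(s) in UTF-8")
```

The ASCII letter stays at one byte, the two letters take two bytes each, and
the box-drawing character takes three, which is why a line of pseudo-graphics
grows to about three times its size in bytes after translation.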

 NB> If "UTF-8 uses 1 to 4 bytes to encode a single character", I guess
 NB> what I'm wondering is if the character was 1 byte to begin with, why
 NB> wouldn't it stay 1 byte when translated to utf-8? Or is it because
 NB> those _specific_ characters when in utf-8 are already multiple bytes?

A non-ASCII character cannot be translated to one byte for the simple reason
that the remaining 128 byte values with the highest bit set are not enough to
encode ALL the characters in ALL the single-byte character sets. The whole idea
of Unicode is to encode ALL the characters of ALL those character sets, CP437,
CP850, CP866, CP1250, etc. into ONE encoding scheme. One byte is just not
enough for all.

To put it simply: if you want to encode CP437 and CP866, you could put CP437
OR CP866 in the first byte, but you need at least one more bit of information
to say which one it is: CP437 or CP866. That is not exactly how UTF-8 works,
but it should give you an idea of why just one byte cannot be enough.
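
A small sketch of that argument, assuming Python's built-in codecs for the
four code pages named above. The names CODEPAGES and high_half are made up for
this example. It shows one byte value meaning two different characters, then
counts how many distinct characters the upper halves of the code pages hold
together.

```python
CODEPAGES = ["cp437", "cp850", "cp866", "cp1250"]


def high_half(codepage: str) -> set:
    """Collect the characters a code page assigns to byte values 128-255."""
    chars = set()
    for byte_value in range(128, 256):
        # errors="ignore" skips the few byte values CP1250 leaves undefined
        chars.update(bytes([byte_value]).decode(codepage, errors="ignore"))
    return chars


# The same byte value is a different character in each code page.
same_byte = {cp: bytes([0x86]).decode(cp) for cp in ("cp437", "cp866")}
print(same_byte)

# The union of all upper halves needs far more than 128 slots.
all_chars = set().union(*(high_half(cp) for cp in CODEPAGES))
print(len(all_chars), "distinct non-ASCII characters in", len(CODEPAGES), "code pages")
```

Byte 0x86 comes out as a Latin letter in CP437 and a Cyrillic letter in CP866,
and the combined count is well above the 128 values a single high-bit byte can
hold.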


Cheers, Michiel

--- GoldED+/W32-MSVC 1.1.5-b20170303
 * Origin: Nieuw Schnøørd (2:280/5555)
SEEN-BY: 15/0 18/200 90/1 103/705 105/81 106/201 124/5016 128/260
SEEN-BY: 129/305 135/225 153/757 7715 154/10 30 203/0 218/700 221/0
SEEN-BY: 221/6 226/30 227/114 229/110 112 113 206 307 317 400 426
SEEN-BY: 229/428 470 664 700 240/1120 5832 266/512 280/464 5003 5555
SEEN-BY: 282/1038 291/111 292/854 8125 301/1 310/31 320/219 322/757
SEEN-BY: 341/66 234 342/200 396/45 423/120 460/16 58 256 1124 5858
SEEN-BY: 467/888 633/280 712/848 770/1 5019/40 5020/400 1042 5053/58
SEEN-BY: 5054/30 5075/35
PATH: 280/5555 464 460/58 229/426



(c) 1994,  bbs@darkrealms.ca