Just a sample of the Echomail archive

 Message 275 
 Michiel van der Vlist to Sergey Dorofeev 
 UTF-8 nodelist report 
 09 Mar 25 11:42:16 
 
TID: FMail-W32 2.3.0.1-B20240319
TZUTC: 0100
CHRS: UTF-8 4
MSGID: 2:280/5555 67cd7088
REPLY: 2:5020/12000 4f5391fe
Hello Sergey,

On Friday March 07 2025 15:01, you wrote to me:

 MV>> He insists on entering the 'a' and 'o' with umlaut in Säve and
 MV>> Björn in 202/208 in Latin-1 in the normal ASCII nodelist. So in
 MV>> the ASCII list they are replaced by question marks by MakeNl. In
 MV>> the UTF list which in his case is just a copy of the ASCII
 MV>> segment submitted, they appear "as submitted" and the line is
 MV>> flagged as in error by my program.

 SD> I think it is not very contradictory. I he will success in entering
 SD> non-ASCII chars in nodelist (making it full 8-bit), encoding must be
 SD> defined.

The encoding for the regular nodelist IS defined: ASCII and ASCII only. For
backward compatibility it must stay that way. There may still be nodelist
processing software around that breaks when the highest bit is not zero. That
is why MakeNl (without the ALLOW8BIT setting) substitutes a question mark for
characters with the highest bit set.
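
To illustrate the substitution described above, here is a minimal Python sketch.
It is not MakeNl's actual code; the function name strip_high_bit is made up for
this example, and it only assumes that every byte with the highest bit set
(value 0x80 or above) is replaced by a question mark.

```python
def strip_high_bit(line: bytes) -> bytes:
    """Replace every byte with the highest bit set by an ASCII '?'.

    Hypothetical helper for illustration only, not taken from MakeNl.
    """
    return bytes(b if b < 0x80 else ord("?") for b in line)


# "Säve" entered in Latin-1: the 'ä' is the single byte 0xE4.
latin1_name = "Säve".encode("latin-1")
print(strip_high_bit(latin1_name))  # the 0xE4 byte becomes '?'
```

Pure ASCII input passes through unchanged, so a segment that already follows
the rules is not affected.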

The encoding for the UTF nodelist is also defined: UTF-8.
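
A rough sketch of how a line can be checked against that definition, and why
Latin-1 input in the UTF list ends up flagged as an error. This is an
assumption about the general technique, not the code of my program; the name
is_valid_utf8 is invented for the example.

```python
def is_valid_utf8(line: bytes) -> bool:
    """Return True if the raw bytes of a nodelist line decode as UTF-8.

    Hypothetical helper for illustration only.
    """
    try:
        line.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same name, once as proper UTF-8 and once as Latin-1 bytes.
print(is_valid_utf8("Björn".encode("utf-8")))    # valid: 'ö' is a two-byte sequence
print(is_valid_utf8("Björn".encode("latin-1")))  # invalid: a lone 0xF6 byte
```

Plain ASCII is valid UTF-8 as well, so an ASCII-only segment copied into the UTF
list passes this check; only stray 8 bit bytes fail it.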

 SD>  Ok, if it will be latin-1, but let it be only for European
 SD> segments. That is, lets define encoding on per-region or even
 SD> per-network basis.

Very bad idea. Having more than one encoding within the same file is a bad
idea anyway, not just for the nodelist but for ANY text file.

 SD> So when importing nodelist, it must be split back on segments and
 SD> correctly transcoded. E.g. default encoding if ASCII, so Zone records
 SD> must be ASCII. But zone may specify own encoding, so regions in it may
 SD> use it in own record, and define encoding for underlying regions.
 SD> Further, region record use zone encoding, and may define encoding for
 SD> networks. Network record use region encoding and may define encoding
 SD> for node records.

Are you serious? You really still want every back alley in Fidonet to have its
own 8 bit encoding? With all the forward and backward re-encoding and other
limitations? C'mon.. That's chaos! Unicode was invented for the very purpose
of getting rid of all this codepage shit.

Why do you think Microsoft went full Unicode internally? Three decades ago.
Why do you think 99% of what is on the web is UTF-8? To get rid of the mess of
all the hundreds of 8 bit encodings that floated around!

Nah, as far as the nodelist goes, it is either just ASCII or UTF-8. No more
codepage shit.


Cheers, Michiel

--- GoldED+/W32-MSVC 1.1.5-b20170303
 * Origin: Nieuw Schnøørd (2:280/5555)
SEEN-BY: 4/0 90/0 105/81 106/201 128/187 153/7715 154/10 110 203/0
SEEN-BY: 218/700 221/6 226/30 227/114 229/110 114 317 426 428 470
SEEN-BY: 229/700 705 240/5832 280/464 5555 291/111 292/789 301/1 310/31
SEEN-BY: 320/219 341/66 234 460/58 900/0 902/0 26 905/0 5019/40 5020/1042
SEEN-BY: 5075/35
PATH: 280/5555 341/66 902/26 229/426


(c) 1994,  bbs@darkrealms.ca