
Just a sample of the Echomail archive


 Message 302 
 Wilfred van Velzen to mark lewis 
 Re: sbbs as fidonet hub 
 01 Apr 21 15:08:42 
 
TID: FMail-lnx64 2.1.0.18-B20170815
RFC-X-No-Archive: Yes
TZUTC: 0200
CHRS: UTF-8 2
PID: GED+LNX 1.1.5-b20161221
MSGID: 2:280/464 6065cae8
REPLY: 16159.fido-synchron@1:3634/12 24cb32b5
* Originally in SYNCHRONET
* Crossposted in NET_DEV

Hi mark,

On 2021-04-01 08:33:26, you wrote to me:

 ml> in any case, dupe checking in FTN is not done by /just/ detecting
 ml> duplicate MSGIDs and rejecting the others... the header, including the
 ml> time stamp, as well as the message body should be taken into
 ml> account...

That's how FMail does it.

 ml> it should also be said that CRC16/CRC32 on the message bodies is also
 ml> not sufficient... even with filtering out white space and the various
 ml> EoLs... this because, and most programmers know this, there's a
 ml> limited supply of CRC values in the tables and it is all too easy to
 ml> find "hash clashes"... CRC16 has only 65536 values... CRC32 has only
 ml> 4294967296 values... the "Birthday Problem" also comes into play...

Indeed. That's why it's on my todo list to go from 32 bit to 64 bit hash
values in FMail. That would be more than enough to keep a (very) big dupe
"database", and still have a very small probability for hash collisions.

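To put rough numbers on the birthday problem mentioned above, here is a
minimal sketch using the standard approximation p = 1 - exp(-n(n-1)/(2N))
for n messages and N possible hash values. The function name and the sample
database sizes are illustrative assumptions, not anything taken from FMail.

```python
import math

def collision_probability(n_messages: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_messages share a hash value.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * N)),
    where N = 2**hash_bits is the number of possible hash values.
    """
    space = 2.0 ** hash_bits
    exponent = -n_messages * (n_messages - 1) / (2.0 * space)
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(exponent)

if __name__ == "__main__":
    # Hypothetical dupe database sizes, chosen only for illustration.
    for bits in (16, 32, 64):
        for n in (1_000, 100_000, 10_000_000):
            p = collision_probability(n, bits)
            print(f"{bits:2d}-bit hash, {n:>10,} messages: p = {p:.3g}")
```

With 32-bit values the chance of at least one clash reaches about 50% at
roughly 77,000 stored messages, while with 64-bit values it stays tiny even
for millions of entries.
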
 ml> these days, MD5 and SHA1 are also out due to defects in them...

That's because of their security weaknesses. Those aren't an issue when you
use them as hashes for dupe detection.

 ml> SHA256 would be the first really useful algorithm or SHA512...

Using a 256 or even 512 bit secure hash value would be overkill for dupe
detection, and it would use far too many resources to calculate and check
them.

64 bits is enough, and it doesn't have to be secure. My prime candidate for
FMail is this one:

https://github.com/Cyan4973/xxHash

 ml> but the real key is to filter out the stuff that can change and hash
 ml> only that which won't...

Indeed.
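
A minimal sketch of that idea, hashing only the parts that stay the same
between copies of a message: the header fields, the time stamp, and the
body with end-of-line variants and routing lines filtered out. Which lines
count as "can change" (here: lines starting with ^A and SEEN-BY lines) is an
assumption for illustration, as are all function and class names. This is
not FMail's code. Since the Python standard library has no xxHash, a 64-bit
BLAKE2b digest stands in for the 64-bit hash.

```python
import hashlib

def normalize_body(body: str) -> str:
    """Drop the parts of a body that may differ between copies (assumed set)."""
    lines = body.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    kept = []
    for line in lines:
        # Assumption: control (kludge) lines start with ^A (0x01), and
        # SEEN-BY lines are rewritten by every system the message passes.
        if line.startswith("\x01") or line.startswith("SEEN-BY:"):
            continue
        line = line.rstrip()  # ignore trailing white space
        if line:              # ignore empty lines
            kept.append(line)
    return "\n".join(kept)

def dupe_key(from_name: str, to_name: str, subject: str,
             date_time: str, body: str) -> int:
    """Return a 64-bit key built from the stable parts of a message."""
    fields = [from_name, to_name, subject, date_time, normalize_body(body)]
    data = "\x00".join(fields).encode("utf-8")
    # Stand-in for a fast 64-bit hash such as xxHash (assumption).
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big")

class DupeStore:
    """Hypothetical in-memory dupe database: a set of 64-bit keys."""

    def __init__(self) -> None:
        self._seen = set()

    def is_dupe(self, key: int) -> bool:
        if key in self._seen:
            return True
        self._seen.add(key)
        return False
```

Two copies of the same message that arrive over different routes, with
different SEEN-BY and PATH lines, then produce the same key, while a changed
time stamp or body produces a different one.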

 ml> anyway, i'm done with this topic in this area... the discussion really
 ml> belongs elsewhere for those that are truly interested in implementing
 ml> proper duplicate detection in FTNs...

(I've crossposted to NET_DEV.)


Bye, Wilfred.

--- FMail-lnx64 2.1.0.18-B20170815
 * Origin: FMail development HQ (2:280/464)
SEEN-BY: 1/123 18/200 90/1 103/705 105/81 120/340 123/131 124/5016
SEEN-BY: 154/10 203/0 221/0 226/30 227/114 229/424 426 664 700 1016
SEEN-BY: 229/1017 240/5138 5411 5824 5832 5853 249/206 317 280/464
SEEN-BY: 280/5003 282/1038 288/100 292/854 8125 310/31 317/3 322/757
SEEN-BY: 342/200 396/45 423/120 633/280 712/848 770/1 2432/390 2454/119
PATH: 280/464 240/5832 229/426



(c) 1994,  bbs@darkrealms.ca