... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"
comp.lang.c
Meh, in C you gotta define EVERYTHING
243,242 messages
[ << oldest | < older | list | newer > | newest >> ]
Message 242,631 of 243,242
Waldek Hebisch to Michael S
Re: Article of Melissa O'Nail (1/2)
28 Dec 25 05:38:55
   From: antispam@fricas.org   
      
   Michael S  wrote:   
   > On Wed, 24 Dec 2025 12:12:11 +0200   
   > Michael S  wrote:   
   >   
   >> On Wed, 24 Dec 2025 09:00:50 -0000 (UTC)   
   >> antispam@fricas.org (Waldek Hebisch) wrote:   
   >>   
   >> > Michael S  wrote:   
   >> > > On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)   
   >> > > antispam@fricas.org (Waldek Hebisch) wrote:   
   >> > >   
   >> > >> Michael S  wrote:   
   >> > >> > On Mon, 22 Dec 2025 18:41:10 +0100   
   >> > >> > Janis Papanagnou  wrote:   
   >> > >> >   
   >> >   
   >> > > Also, the TestU01 suit is made for generators with 32-bit output.   
   >> > > M. O’Neill used ad hoc technique to make it applicable to   
   >> > > generators with 64-bit output. Is this technique right? Or may be   
   >> > > it put 64-bit PRNG at unfair disadvantage?   
   >> >   
   >> > My point of view is that generator can be used to generate long   
   >> > bistream.  Then you can cut the bitstream and get number of   
   >> > desired size.  Good tests should check that such usage leads   
   >> > to reasonable properties.  So, fact that one generator produces   
   >> > 32-bit pieces and other produces 64-bit pieces should be irrelevant   
   >> > to the test.   
   >> >   
   >>   
   >> What you say is correct in few use cases. But there are many uses   
   >> cases (in field of testing of numeric code, probably, most of them)   
   >> in which "less random" LS bits are acceptable.   
   >> Not that I can see why it could be the case for MT19937-64, but it   
   >> could apply to one of two of other 64-bit generators tested by   
   >> O'Neill.   
   >>   
   >>   
   >> > > Besides, I strongly disagree with at least one assertion made by   
   >> > > O’Neill: "While security-related applications should   
   >> > > use a secure generator, because we cannot always know the future   
   >> > > contexts in which our code will be used, it seems wise for all   
   >> > > applications to avoid generators that make discovering their   
   >> > > entire internal state completely trivial."   
   >> > > No, I know exactly what I am doing/ I know exactly that for my   
   >> > > application easy discovery of complete state of PRNG is not a   
   >> > > defect.   
   >> >   
   >> > O’Neill is not a prophet, ignore what she say it you think you   
   >> > know better (which is probably the above).   
   >> >   
   >> > > Anyway, even if I am skeptical about her criticism of popular   
   >> > > PRNGs, intuitively I agree with the constructive part of the   
   >> > > article - medium-quality PRNG that feeds medium quality hash   
   >> > > function can potentially produce very good fast PRNG with rather   
   >> > > small internal state.   
   >> >   
   >> > She seem to care very much about having minimal possible state.   
   >> > That is may be nice on embeded systems, but in general I would   
   >> > happily accept slighty bigger state (say 256 bits).  But if   
   >> > we can get good properties with very small state, then why not?   
   >> > After all looking at state and updating it takes code, so   
   >> > small state helps with having fast generator.   
   >> >   
   >>   
   >> Agreed.   
   >>   
   >> > Concerning Mersenne Twister, she is not the only one to   
   >> > criticise it.  My personal opinion is that given large   
   >> > state and not so simple update Mersenne Twister would   
   >> > have to be very very good to justify its use.   
   >>   
   >> One theoretical advantage of MT19937 is that it has period of   
   >> astronomic proportions. Which means that one instance of PRNG could be   
   >> de-multiplexed into millions or billions of sub-streams with no   
   >> detectable degradation of the quality of each sub-stream.   
   >> However I fail to see how de-multiplexing into more than ~ one   
   >> thousand of sub-streams can be practical. And for the latter one does   
   >> not need to be astronomical, something like period=2**96 would be   
   >> fully sufficient with many bits to spare.   
   >> So, in theory I agree with the criticism. But in practice I am not   
   >> bothered by the size of MT state.   
   >>   
   >> > But it   
   >> > fails some tests, so does not look _better_ than other   
   >> > generators.   
   >> >   
   >>   
   >> It would be interesting to find out what were those tests that failed.   
   >> I wonder, if tests suit can run faster on multicore computer. I don't   
   >> want to wait 5-6 hours just to find out that report does not provide   
   >> an information that I am looking for.   
   >>   
   >   
   > I reproduced results of M. O'Neil. Luckily, on semi-modern hardware   
   > (Coffee Lake or EPYC3) and for PRNGs in question BigCrash finishes in   
   > 2-2.5 hours. Which is a pain, but considarably less so then 5 hours.   
   > mt19937 fails all tests of scomp_LinearComp() variaty (2 in Crash and 2   
   > in BigCrash). It passes all the rest of tests.   
   > After re-reading O'Neil, I see that she wrote that, but first time I   
   > didn't pay attention.   
   >   
   > I have read description of scomp_LinearComp(). I can't say that I   
   > understood much. Neither theory, nor parameters, in particular, even   
   > after looking though code I have no idea about the meaning of parameter   
   > r.   
   > I am not so sure that Pierre L’Ecuyer himself fully understands this   
   > test, apart from the fact that many moons ago basic algorithm for   
   > calculation of linear complexity was published in IEEE Transactions on   
   > Information Theory. Otherwise, his description would not look so much   
   > as hand waving. As to O'Neil, she likely understands it better than   
   > myself, but that's not a huge achievement :(   
   > My less than scientific feeling about this test is that one part of it   
   > is looking if test is LFSR or LFSR-family and if the answer is yes then   
   > test fails.   
   > So, eccentrically, mt19937 is punished for what it is rather than for   
   > randomness of results that it produces.   
      
   Well, I peeked at the code but did not read destription of the   
   test.  In the code I see mention of Berlekamp-Massey.  That is   
   well-known algorithm to find regularites in data, basically   
   tries to find linear recurence satisfied by the data.  One of   
   things that I do is looking for reccurences, including linear   
   ones.  Up to now I did not run any serious simulation for   
   such things but I may wish to do so (and IIUC other people did   
   run some simulations).  When doing simulations I really do   
   not want to see artifacts due to PRNG producing sequence with   
   different features than random sequence.  So for me, if mt19937   
   produces sequence with visible extra linear regulatities that is   
   significant failure.   
      
   > I made a minimal modification to mt19937 algorithm, telling it to skip   
   > every 19936th result word. With modification it easily passes all 3   
   > crash batteries of Pierre L’Ecuyer.   
   > Do I think that my modified mt19937 is better than original? No, I   
   > don't. IMHO, the only thing it is better in is passing batteries of   
   > L’Ecuyer.   
      
   Your modification is enough to fool the tests.  It is not clear   
   to me if it is enough to fool better regularity finders, so   
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)
[ << oldest | < older | list | newer > | newest >> ]