From: anton@mips.complang.tuwien.ac.at   
      
   antispam@fricas.org (Waldek Hebisch) writes:   
   >Anton Ertl wrote:   
   >> I wanted to generate a random sequence consisting of a number, NOOP,   
   >> and DROP. When EVALUATEing the sequence, starting with an empty   
   >> stack, it should end with an empty stack, and should not underflow the   
   >> stack.   
   >>   
   >> I implemented this twice. The first attempt just generated a random   
   >> word/number in the normal case, and a specific kind of word/number in   
   >> the boundary cases:   
   >>   
   >> : genseq {: u -- :}   
   >> 0 u 0 do   
   >> case   
   >> dup 0= ?of 1+ ." 9 " endof   
   >> dup i + u u>= ?of 1- ." drop " endof   
   >> dup i + 1+ u u>= ?of ." noop " endof   
   >> rnd 3 um* nip 1- dup case   
   >> -1 of ." drop " endof   
   >> 0 of ." noop " endof   
   >> 1 of ." 8 " endof   
   >> endcase   
   >> +   
   >> 0 endcase   
   >> loop   
   >> drop ;   
   >>   
   >> When I do 16 GENSEQ with seed 1234, I get the following sequence:   
   >>   
   >> 9 drop 9 8 8 noop drop drop 8 noop drop noop noop 8 drop drop   
   >>   
   >> The requirements are distributed across several places in this word,   
   >> which I did not like. I also did not like the non-randomness of the   
   >> sequence of DROPs at the end, and the non-randomness of the 9 whenever   
   >> the stack is empty. I also found that I needed to add a clause for   
   >> when the stack depth is one less than the number of remaining words.   
   >>   
   >> So I thought about other ways to do it.   
   >>   
   >> Eventually I came up with the following approach: Starts with counters   
   >> for the different kinds of words/numbers. Whenever one word/number is   
   >> generated, its counter is reduced. At each step the probability of   
   >> generating a kind of word/number is proportional to the remaining   
   >> counter for that word/number. The same-depth requirement is   
   >> implemented by having the same initial counter for the number and   
   >> DROP. The no-underflow requirement is implemented by skipping the   
   >> output of a DROP if the count of remaining DROPs = the count of the   
   >> remaining numbers.   
   >>   
   >> Here's the code:   
   >>   
   >> create counts 3 cells allot   
   >>   
   >> : initcounts ( u -- )   
   >> dup 3 / dup counts ! dup counts 2 th ! 2* - counts cell+ ! ;   
   >>   
   >> : .counts ( -- )   
   >> counts 3 cell array>mem mem+do   
   >> i @ 4 .r   
   >> loop ;   
   >>   
   >> : countsth-- ( u -- )   
   >> -1 counts rot th +! ;   
   >>   
   >> : select1 ( u -- )   
   >> \ u is the index of the word   
   >> dup case   
   >> 0 of ." 9 " countsth-- endof   
   >> 1 of ." noop " countsth-- endof   
   >> 2 of counts @ counts 2 th @ u< if   
   >> ." drop " countsth--   
   >> else   
   >> drop endif   
   >> endof   
   >> endcase ;   
   >>   
   >> : sum ( addr u -- u2 )   
   >> \ addr u is an array with u elements, u2 is the sum of the elements   
   >> 0 -rot cell array>mem mem+do   
   >> i @ +   
   >> loop ;   
   >>   
   >> : genseq ( u -- )   
   >> initcounts   
   >> begin   
   >> counts 3 sum dup while   
   >> rnd um* nip ( sum )   
   >> 3 0 u+do ( sum1 )   
   >> counts i th @ 2dup u< if   
   >> i select1 then   
   >> - loop   
   >> drop   
   >> repeat   
   >> drop ;   
   >>   
   >> The code is much longer, but I like it better. The requirements are   
   >> concentrated in two places.   
   >>   
   >> 16 GENSEQ generates   
   >>   
   >> 9 9 noop noop drop noop 9 drop noop drop noop 9 drop 9 noop drop   
   >>   
   >> for seed 123. A disadvantage of this approach is that it does not   
   >> generate a word/number on each iteration through the main loop. I   
   >> have thought about how to achieve this, but the result would be quite   
   >> a bit more complex; it's also only a minor issue. E.g., for   
   >> generating 1000 words/numbers with seed 123, the main loop has 1036   
   >> iterations.   
   >>   
   >> The second approach takes quite a bit more time; for generating a   
   >> million words/numbers into a string on a Zen 4, 192_876_892 cycles   
   >> compared to 146_644_156 cycles.   
   >>   
   >> The original goal was to produce a sequence where various parts of the   
   >> text interpreter have branch mispredictions; the less predictable the   
   >> sequence, the better. In this respect the second sequence is   
   >> significantly better. For a sequence of 1000 words/numbers,   
   >> text-interpreted 1000 times, the number of mispredictions is:   
   >>   
   >> 728_322 first approach   
   >> 821_813 second approach   
   >>   
   >> (with 247_125 mispredictions coming from (mainly) Gforth startup).   
   >   
      
      
   >In related problem I used the following technique. Basic part   
   >is: at each position you need to generate next step with right   
   >probablity. How to know probabilities? Just count all possible   
   >continuations. To do this with reasonable efficiency start   
   >from the end and observe that legality of a continuation depends   
   >only on number of elements on the stack, but does not depend on   
   >sequence of operations that put the items on the stack. Once   
   >you know counts for all contunuations of length k you can compute   
   >counts for contunuations of length k + 1 (each possible first step   
   >leads to new state for which count is known).   
   >   
   >This produces uniform distribution, so in this sense gives   
   >pretty good randomness. OTOH size of table of counts grows   
   >quadratically with length   
      
   It's unclear to me what benefit this table provides over my method   
   which uses just counts. My counts represent the number of outstanding   
   9s, NOOPs, and DROPs. And the counts are used for the probabilities   
   of the words used for the random selection (except that a selection is   
   suppressed if it is not legal (i.e., a DROP when the stack is empty)).   
      
   - anton   
   --   
   M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html   
   comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html   
    New standard: https://forth-standard.org/   
   EuroForth 2025 CFP: http://www.euroforth.org/ef25/cfp.html   
   EuroForth 2025 registration: https://euro.theforth.net/   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   
|