

   comp.arch      Apparently more than just beeps & boops      131,241 messages   


   Message 131,235 of 131,241   
   Stephen Fuld to Anton Ertl   
   Re: IA-64   
   24 Feb 26 07:51:13   
   
   From: sfuld@alumni.cmu.edu.invalid   
      
   On 2/24/2026 3:25 AM, Anton Ertl wrote:   
   > Stephen Fuld writes:   
   >> On 2/21/2026 8:18 AM, Anton Ertl wrote:   
   >>   
   >> big snip   
   >>   
   >>> Otherwise what kind of common code do we have that is   
   >>> memory-dominated?  Tree searching and binary search in arrays come to   
   >>> mind, but are they really common, apart from programming classes?   
   >>   
   >> It is probably useful to distinguish between latency bound and bandwidth   
   >> bound.   
   >   
   > If a problem is bandwidth-bound, then differences between conventional   
   > architectures and EPIC play no role, and microarchitectural   
   > differences in the core play no role, either; they all have to wait   
   > for memory.   
   >   
   > For latency various forms of prefetching (by hardware or software) can   
   > help.   
   >   
   >> Many occur in commercial (i.e. non scientific) programs, such as   
   >> database systems.  For example, imagine a company employee file (table),   
   >> with a (say 300 byte) record for each of its many thousands of employees   
   >> (each containing typical employee stuff).  Now suppose someone wants to   
   >> know "What is the total salary of all the employees in the 'Sales'   
   >> department?"  With no index on "department", but with it at a fixed   
   >> displacement within each record, the code looks at each record, does a   
   >> trivial test on it, perhaps adds to a register, then goes to the next   
   >> record.  This is almost certainly memory latency bound.   
   >   
   > If the records are stored sequentially, either because the programming   
   > language supports that arrangement and the programmer made use of   
   > that, or because the allocation happened in a way that resulted in   
   > such an arrangement, stride-based prefetching will prefetch the   
   > accessed fields and reduce the latency to the one due to bandwidth   
   > limits.   
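
   For concreteness, here is a minimal sketch of the scan I had in mind.
   Only the 300-byte record size comes from my example; the field offsets,
   the field widths and the helper names are made up for illustration.
   Python obviously says nothing about timing; the point is only the
   access pattern: one small touch per record at a fixed displacement,
   then on to the next record.

```python
import struct

RECORD_SIZE = 300     # bytes per record, from the example
DEPT_OFFSET = 40      # hypothetical displacement of the department field
DEPT_LEN = 8          # hypothetical width of the department field
SALARY_OFFSET = 48    # hypothetical displacement of a 4-byte salary field


def make_record(dept: str, salary: int) -> bytes:
    """Build one fixed-size record (illustrative layout only)."""
    rec = bytearray(RECORD_SIZE)
    # Department name, space-padded to the fixed field width.
    rec[DEPT_OFFSET:DEPT_OFFSET + DEPT_LEN] = dept.encode("ascii").ljust(DEPT_LEN)
    # Salary as a little-endian unsigned 32-bit integer.
    struct.pack_into("<I", rec, SALARY_OFFSET, salary)
    return bytes(rec)


def total_salary(table: bytes, dept: str) -> int:
    """Full scan without an index: test one field, maybe add, go on."""
    key = dept.encode("ascii").ljust(DEPT_LEN)
    total = 0
    for base in range(0, len(table), RECORD_SIZE):
        start = base + DEPT_OFFSET
        if table[start:start + DEPT_LEN] == key:
            total += struct.unpack_from("<I", table, base + SALARY_OFFSET)[0]
    return total


table = b"".join([
    make_record("Sales", 50_000),
    make_record("R&D", 70_000),
    make_record("Sales", 60_000),
])
sales_total = total_salary(table, "Sales")
```

   The loop body is a compare and an optional add, which is why I expected
   memory behavior rather than computation to dominate.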
      
   Let me explain better what I was trying to set up; then you can tell me   
   where I went wrong.  I did expect the records to be sequential, and   
   therefore prefetchable, but with an inner loop of just a few   
   instructions, I thought the CPU would quickly "get ahead" of the   
   prefetcher.  That is, I assumed there is a small limit on the number of   
   prefetches that can be in flight simultaneously, and that such a short   
   loop would quickly hit that limit and thus be latency bound.   
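
   To put rough numbers on that intuition, here is a back-of-the-envelope
   sketch using Little's law (requests in flight = latency x throughput).
   Every number below is an assumption chosen for illustration, not a
   measurement of any particular processor, and the function names are
   my own.

```python
def misses_in_flight_needed(latency_ns: float, bandwidth_gb_s: float,
                            line_bytes: int = 64) -> float:
    """Cache-line requests that must be outstanding to sustain a bandwidth."""
    # ns * GB/s = 1e-9 s * 1e9 B/s = bytes that must be in flight
    bytes_in_flight = latency_ns * bandwidth_gb_s
    return bytes_in_flight / line_bytes


def achievable_bandwidth_gb_s(max_in_flight: int, latency_ns: float,
                              line_bytes: int = 64) -> float:
    """Bandwidth one core can sustain when outstanding requests are capped."""
    return max_in_flight * line_bytes / latency_ns


LATENCY_NS = 80.0      # assumed DRAM access latency
PEAK_GB_S = 40.0       # assumed peak memory bandwidth
MAX_IN_FLIGHT = 12     # assumed cap on outstanding line requests per core

needed = misses_in_flight_needed(LATENCY_NS, PEAK_GB_S)
capped = achievable_bandwidth_gb_s(MAX_IN_FLIGHT, LATENCY_NS)
print(f"requests needed in flight for peak bandwidth: {needed:.0f}")
print(f"bandwidth with only {MAX_IN_FLIGHT} in flight: {capped:.1f} GB/s")
```

   Under these assumed numbers, the cap on outstanding requests, not the
   memory bandwidth, sets the scan rate.  With a larger cap or a lower
   latency the conclusion flips, which is really my question.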
      
      
   > If the records are stored randomly, but are pointed to by an array,   
   > one can prefetch the relevant fields easily, again turning the problem   
   > into a bandwidth-bound problem.  If, OTOH, the records are stored   
   > randomly and are in a linked list, this problem is a case of   
   > pointer-chasing and is indeed latency-bound.   
   >   
   > BTW, thousands of employee records, each with 300 bytes, fit in the L2   
   > or L3 cache of modern processors.   
      
   Yes, I miscalculated.  My intent was to force a DRAM access for each   
   record, which would make the problem worse (DRAM access time versus L3   
   access time).  But I think the same issue would apply even if the table   
   fits in an L3 cache; if it doesn't apply, increase the record size or   
   the number of records so that the table no longer fits in L3.  Either   
   way, that only changes the number of prefetches that must be in flight   
   to prevent the loop from becoming latency bound.   
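
   A quick check of the sizes involved.  The 300-byte record is from my
   example; the record count of 5,000 and the 32 MiB L3 are assumptions
   picked only to make the arithmetic concrete.

```python
RECORD_BYTES = 300                 # from the example
L3_BYTES = 32 * 1024 * 1024        # assumed L3 capacity (32 MiB)


def working_set_bytes(n_records: int) -> int:
    """Total size of the table if every record is RECORD_BYTES long."""
    return n_records * RECORD_BYTES


def records_to_exceed(cache_bytes: int) -> int:
    """Smallest record count whose table no longer fits in the cache."""
    return cache_bytes // RECORD_BYTES + 1


small_table = working_set_bytes(5_000)        # "many thousands" of employees
min_records = records_to_exceed(L3_BYTES)
print(f"5,000 records occupy {small_table:,} bytes")
print(f"records needed to overflow the assumed L3: {min_records:,}")
```

   So under these assumptions the table needs on the order of a hundred
   thousand records, not thousands, before every record forces a DRAM
   access.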
      
   Thanks.   
      
   --   
     - Stephen Fuld   
   (e-mail address disguised to prevent spam)   
      


