From: huey.dll@tampabay.rr.com   
      
   Mike Spencer wrote in   
   news:874itakjhn.fsf@enoch.nodomain.nowhere:   
      
   >   
   > Nearly 40 years ago, the MIT press published Parallel Distributed   
   > Processing, Vol. 1 & 2, by Rumelhart, McClelland et al. I read those   
   > as well as similar material published at MIT in the early 90s, wrote   
   > some functional toy code (on an Osborne I).   
   >   
   > But I haven't kept up.   
   >   
   > Can someone suggest books at more or less the same level of   
   > technicality that I might look at to catch up a bit on how neural
   > nets are now constructed, trained, connected etc. to produce what is   
   > being called "large language models"?   
   >   
   > The net is of course rife, indeed inundated, with stuff on the topic.   
   > But the vast bulk of it falls into one of two categories. One category   
   > is mass media news and pop science reporting, intended to provoke "Oh,   
   > gee whiz" by the average person or at best a vague notion of the   
   > subject for the literate but non-technical. The other category
   > is material intended for someone who has read all the technical   
   > literature for the last 40 years or at least has obtained a master's   
   > degree in AI computing/theory in the last decade. In the latter case,   
   > just the terminology is a barrier.   
   >   
   > I'm now an old guy. I'm not going to completely beat up all the math   
   > that has evolved since PDP but I'd like to get a more or less   
   > caught-up handle on how this stuff works internally.   
   >   
   > Any suggestions?   
   >   
   > [ Yes, I had a look at some of the AI newsgroups. Moribund or   
   > hijacked by politics.]
      
   Hi Mike,   
      
   You might look up Stephane Charette's work on Darknet. He has several
   examples and technical descriptions online. His work and examples mainly
   focus on object recognition in still images and video frames. LLMs, from
   my limited understanding, are just massive neural networks trained on
   huge text corpora, assembled by grabbing everything the developers deem
   potentially useful.
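
   If it helps make the training part concrete, here is a toy sketch of my
   own (nothing to do with Charette's code, and nowhere near real scale):
   a single weight matrix trained by gradient descent to predict the next
   character of a short string. Modern LLMs do essentially this next-token
   prediction, just with transformer layers, billions of parameters, and
   web-scale text.

   # Toy next-character "language model": one weight matrix, plain
   # gradient descent.  Illustrative only; real LLMs use transformers.
   import numpy as np

   text = "the cat sat on the mat. the cat ate the rat."
   chars = sorted(set(text))
   idx = {c: i for i, c in enumerate(chars)}
   V = len(chars)                        # vocabulary size

   # training pairs: current character -> next character
   xs = np.array([idx[c] for c in text[:-1]])
   ys = np.array([idx[c] for c in text[1:]])

   rng = np.random.default_rng(0)
   W = rng.normal(0, 0.1, (V, V))        # the whole "model"
   lr = 1.0

   for step in range(500):
       logits = W[xs]                                # one row per input char
       logits -= logits.max(axis=1, keepdims=True)   # numerical stability
       probs = np.exp(logits)
       probs /= probs.sum(axis=1, keepdims=True)     # softmax over next char
       loss = -np.log(probs[np.arange(len(ys)), ys]).mean()

       grad = probs.copy()
       grad[np.arange(len(ys)), ys] -= 1             # d(loss)/d(logits)
       grad /= len(ys)
       dW = np.zeros_like(W)
       np.add.at(dW, xs, grad)                       # accumulate per input char
       W -= lr * dW                                  # gradient descent step

   print("final loss:", round(loss, 3))

   Scale that idea up by many orders of magnitude and you have the rough
   shape of what the LLM people are doing.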
      
   He does explain the math and training concepts he found useful.   
      
   https://www.youtube.com/c/StephaneCharette/videos   
      