

   comp.misc      General topics about computers not cover      21,759 messages   


   Message 21,185 of 21,759   
   anthk to Toaster   
   Re: bad bot behavior   
   12 May 25 06:24:45   
   
   From: anthk@openbsd.home   
      
   On 2025-03-18, Toaster  wrote:   
   > On Tue, 18 Mar 2025 12:00:07 -0500   
   > D Finnigan  wrote:   
   >   
   >> On 3/18/25 10:17 AM, Ben Collver wrote:   
   >> > Please stop externalizing your costs directly into my face   
   >> > ==========================================================   
   >> > March 17, 2025 on Drew DeVault's blog   
   >> >   
   >> > Over the past few months, instead of working on our priorities at   
   >> > SourceHut, I have spent anywhere from 20-100% of my time in any   
   >> > given week mitigating hyper-aggressive LLM crawlers at scale.   
   >>   
   >> This is happening at my little web site, and if you have a web site,   
   >> it's happening to you too. Don't be a victim.   
   >>   
   >> Actually, I've been wondering where they're storing all this data;   
   >> and how much duplicate data is stored from separate parties all   
   >> scraping the web simultaneously, but independently.   
   >   
   > But what can be done to mitigate this issue? Crawlers and bots ruin the   
   > internet.   
   >   
      
   Gzip bombs + fake links = profit. Remember that gzip-compressed web pages
   are standard (Content-Encoding: gzip); even lynx can decode them natively.
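
   For anyone wondering what the gzip side looks like, here is a minimal
   sketch in Python, not anything from a real deployment. It assumes an
   all-zero payload (which compresses extremely well) and a demo size of
   10 MiB; the function name make_gzip_payload and both sizes are made up
   for illustration.

```python
import gzip
import io


def make_gzip_payload(uncompressed_bytes: int, chunk_size: int = 1 << 20) -> bytes:
    """Return gzip data that expands to `uncompressed_bytes` zero bytes.

    Hypothetical helper: the name and the chunked writing are illustration
    choices, not part of any existing tool.
    """
    buf = io.BytesIO()
    chunk = b"\0" * chunk_size
    # mtime=0 keeps the gzip header identical between runs.
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=9, mtime=0) as gz:
        remaining = uncompressed_bytes
        while remaining > 0:
            n = min(remaining, chunk_size)
            gz.write(chunk[:n])  # write in chunks so memory use stays small
            remaining -= n
    return buf.getvalue()


# Demo size only (assumption): 10 MiB of zeros.
payload = make_gzip_payload(10 * 1024 * 1024)
print(f"{len(payload)} bytes on the wire, "
      f"{len(gzip.decompress(payload))} bytes after decompression")
```

   All-zero input shrinks by roughly 1000:1 under deflate, so the stored
   file stays tiny. It would be served as a pre-compressed body with the
   "Content-Encoding: gzip" response header, so the client does the
   inflating.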
      
   Also, MegaHAL/Hailo under Perl. Feed it nonsense, and create some
   non-visible content under a robots.txt-disallowed directory
   full of Markov-chain-generated nonsense and gzip bombs.
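
   A rough sketch of the nonsense generator, as a plain word-level Markov
   chain in Python rather than MegaHAL or Hailo themselves. Everything
   named here is an assumption for illustration: the CORPUS string, the
   /trap/ directory (which would be listed under "Disallow: /trap/" in
   robots.txt), and the helper names.

```python
import random
from collections import defaultdict

# Hypothetical seed text; any pile of words works.
CORPUS = (
    "the crawler reads the page and the page links to another page "
    "and another page links to the crawler that reads the nonsense"
)


def build_chain(text, order=1):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain


def generate(chain, length, rng):
    """Random walk over the chain, returning exactly `length` words."""
    keys = sorted(chain)  # sorted so a seeded rng gives repeatable output
    order = len(keys[0])
    out = list(rng.choice(keys))
    while len(out) < length:
        followers = chain.get(tuple(out[-order:]))
        if followers:
            out.append(rng.choice(followers))
        else:
            out.extend(rng.choice(keys))  # dead end: restart from a random key
    return " ".join(out[:length])


def trap_page(chain, rng, n_links=5, prefix="/trap/"):
    """Build an HTML page of nonsense plus fake links back into the trap."""
    body = generate(chain, 50, rng)
    links = "".join(
        f'<a href="{prefix}{rng.getrandbits(32):08x}.html">'
        f"{generate(chain, 3, rng)}</a>\n"
        for _ in range(n_links)
    )
    return f"<html><body><p>{body}</p>\n{links}</body></html>"


rng = random.Random(0)
print(trap_page(build_chain(CORPUS), rng))
```

   Every generated page links to more generated pages, so a bot that
   ignores robots.txt keeps wandering inside the trap, while well-behaved
   crawlers never enter it.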
      
   --- SoupGate-DOS v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca