
   alt.os.linux.ubuntu      I preferred Xubuntu, seemed a bit faster      134,474 messages   


   Message 133,694 of 134,474   
   Michael F. Stemper to Dan Purgert   
   Re: Unable to wget some pages   
   12 Mar 24 09:14:22   
   
   From: michael.stemper@gmail.com   
      
   On 12/03/2024 04.24, Dan Purgert wrote:   
   > On 2024-03-11, Michael F. Stemper wrote:   
   >> On 11/03/2024 09.27, Dan Purgert wrote:   
   >>> On 2024-03-11, Michael F. Stemper wrote:   
   >>>> Late last week, a script that I have used for several years suddenly   
      
   >>>> Looking at the error message, one might think that this page/site   
   >>>> requires user login credentials. However, the same URL works just   
   >>>> fine in Firefox, with no login requested or required.   
   >>>   
   >>> Looks like the page *does* have a login button / javascript thing   
   >>> "somewhere" (at least I can see it when I open the page in lynx here).   
      
   >>> I'd imagine either   
   >>>   
   >>>     (1) wget is respecting some robots.txt somewhere OR   
   >>>     (2) wget is following that login link for some reason   
   >>   
   >> Any ideas how I could test for, or prevent, either of these?   
   >   
   > Potentially adding "-e robots=off" will avoid #1. More verbosity (-v) or   
   > turning on headers (-S?) may help for both as well.   
      
   No joy from robots=off, and wget's man page says that -v is the default.   
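   For anyone repeating the test, here is a rough sketch of the diagnostic
   fetch discussed above, wrapped in Python so the flags are explicit. The
   function names and the example URL are made up for illustration; the
   wget options themselves (-e robots=off, -S, -O -) are the ones mentioned
   in this thread or documented in wget's man page.

```python
import subprocess


def build_wget_command(url, ignore_robots=True, show_headers=True):
    """Assemble a wget argument list for a one-off diagnostic fetch.

    Hypothetical helper, not part of wget itself.
    """
    cmd = ["wget", "-O", "-"]            # write the page body to stdout
    if ignore_robots:
        cmd += ["-e", "robots=off"]      # same as putting robots=off in .wgetrc
    if show_headers:
        cmd.append("-S")                 # print the server's response headers
    cmd.append(url)
    return cmd


def run_diagnostic(url):
    """Run wget and return (exit_code, log_text).

    wget writes its log, including the -S headers, to stderr.
    """
    result = subprocess.run(
        build_wget_command(url), capture_output=True, text=True
    )
    return result.returncode, result.stderr
```

   Calling run_diagnostic("https://example.com/") (placeholder URL) returns
   wget's exit code together with the logged status line and headers, which
   is usually enough to see whether the server answered with an error or a
   redirect.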
      
   But, I just tried with curl, and think that I've found a clue. Included   
   in what it downloaded was:   
      "Please enable JS and disable any ad blocker"   
      
   I'm not sure if it's possible for wget to fake having JavaScript, but
   it seems as if that's the next place to look.
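
   A minimal sketch of how a script might notice this situation, assuming
   the downloaded body contains a message like the one quoted above. The
   marker strings and the function name are assumptions for illustration;
   other sites may word their challenge page differently.

```python
# Phrases that suggest the server sent a "turn on JavaScript" placeholder
# instead of the real page. Only the first is taken from this thread; the
# second is a guess at a common variant.
JS_CHALLENGE_MARKERS = (
    "please enable js",
    "enable javascript",
)


def looks_like_js_challenge(body):
    """Return True if the downloaded text looks like a JavaScript challenge page."""
    text = body.lower()
    return any(marker in text for marker in JS_CHALLENGE_MARKERS)
```

   A script could call looks_like_js_challenge() on whatever wget or curl
   saved and bail out with a clear message instead of parsing a placeholder
   page. wget itself does not execute JavaScript, so if the check fires, the
   fetch probably has to go through something that drives a real browser
   engine.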
   --   
   Michael F. Stemper   
   This sentence no verb.   
      
