|    alt.os.linux.ubuntu    |    I preferred Xubuntu, seemed a bit faster    |    134,474 messages    |
|    Message 133,694 of 134,474    |
|    Michael F. Stemper to Dan Purgert    |
|    Re: Unable to wget some pages    |
|    12 Mar 24 09:14:22    |
From: michael.stemper@gmail.com

On 12/03/2024 04.24, Dan Purgert wrote:
> On 2024-03-11, Michael F. Stemper wrote:
>> On 11/03/2024 09.27, Dan Purgert wrote:
>>> On 2024-03-11, Michael F. Stemper wrote:
>>>> Late last week, a script that I have used for several years suddenly

>>>> Looking at the error message, one might think that this page/site
>>>> requires user login credentials. However, the same URL works just
>>>> fine in Firefox, with no login requested or required.
>>>
>>> Looks like the page *does* have a login button / javascript thing
>>> "somewhere" (at least I can see it when I open the page in lynx here).

>>> I'd imagine either
>>>
>>> (1) wget is respecting some robots.txt somewhere OR
>>> (2) wget is following that login link for some reason
>>
>> Any ideas how I could test for, or prevent, either of these?
>
> Potentially adding "-e robots=off" will avoid #1. More verbosity (-v) or
> turning on headers (-S?) may help for both as well.

No joy from robots=off, and wget's man page says that -v is the default.

But, I just tried with curl, and think that I've found a clue. Included
in what it downloaded was:
 "Please enable JS and disable any ad blocker"

I'm not sure if it's possible for wget to fake having javascript, but
it seems as if that's the next place to look.
--
Michael F. Stemper
This sentence no verb.

--- SoupGate-Win32 v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
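
For anyone wanting to script the check described above, here is a minimal sketch in Python rather than wget or curl. It fetches a page and looks for the "Please enable JS" text quoted in the post. The function names, the marker list, and the default User-Agent string are all assumptions made for illustration; the only piece taken from the thread is the marker text itself. Like wget and curl, this does not execute JavaScript, so it can only detect such a challenge page, not get past it.

```python
import urllib.error
import urllib.request
from typing import Tuple

# Marker text quoted in the post. Other sites may word it differently,
# so treat this list as an assumption to be extended.
JS_CHALLENGE_MARKERS = ("please enable js",)


def looks_like_js_challenge(body: str) -> bool:
    """Return True if the body contains a known 'enable JavaScript' marker."""
    lowered = body.lower()
    return any(marker in lowered for marker in JS_CHALLENGE_MARKERS)


def fetch(url: str, user_agent: str = "Mozilla/5.0") -> Tuple[int, str]:
    """Fetch url and return (status, body).

    HTTP error responses (401, 403, ...) are returned rather than raised,
    because the body of the error page is what we want to inspect.
    The User-Agent default is a hypothetical placeholder.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")
```

Usage would be along the lines of `status, body = fetch("https://example.com/page")` followed by `looks_like_js_challenge(body)`, where the URL is a placeholder since the real one is not given in the thread. If the check returns True, the likely next step is a tool that actually runs JavaScript, such as a headless browser, rather than more wget options.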
(c) 1994, bbs@darkrealms.ca