Comment on Best web archiving software for complex sites and sites requiring logins?

There's a "philosopher" who the far-right techbro-oligarchs rely on, whose blog is grey-something-or-other..

I tried using wget, and there seems to be a bug in the site: it keeps inserting links to other sites into URLs, so you get bullshit like

grey-something-or-other.substack.com/e/b/a/http://en.wikipedia.org/wiki/etc..

The site apparently works for the people who browse it, but wget isn't succeeding in just cloning the thing.

I want the pages and files the usable site is made of, not endless failed requests chasing recursive errors forever..

Apparently one has to be ultra-competent to configure all the excludes and other command-line switches needed to get any particular site handled properly by wget.
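
For what it's worth, here is a rough sketch of the kind of exclude that would be needed, written in Python so the pattern can be checked before a crawl. It assumes the bad links all look like the example above (a second `http:/` or `https:/` glued into the path). `has_embedded_url` and `wget_args` are made-up helper names, and the depth and wait values are arbitrary. The wget options themselves (`--recursive`, `--level`, `--no-parent`, `--page-requisites`, `--convert-links`, `--adjust-extension`, `--wait`, `--reject-regex`) are real ones.

```python
import re

# Assumption: broken links contain a second "http:/" or "https:/" after a
# slash somewhere in the path, as in the example URL in this post.
EMBEDDED_URL = r".*/https?:/.*"


def has_embedded_url(url: str) -> bool:
    """Return True if another URL appears to be glued into this URL's path."""
    return re.fullmatch(EMBEDDED_URL, url) is not None


def wget_args(start_url: str, depth: int = 3) -> list[str]:
    """Build an argument list for a bounded recursive crawl (hypothetical helper)."""
    return [
        "wget",
        "--recursive",                      # follow links at all
        f"--level={depth}",                 # but only this many hops deep
        "--no-parent",                      # never climb above the start path
        "--page-requisites",                # fetch the CSS/images pages need
        "--convert-links",                  # rewrite links for local browsing
        "--adjust-extension",               # save HTML with an .html suffix
        "--wait=1",                         # pause one second between requests
        f"--reject-regex={EMBEDDED_URL}",   # skip URLs matching the pattern
        start_url,
    ]
```

The list from `wget_args(...)` could be handed to `subprocess.run`. Percent-encoded variants such as `http%3A` would slip past this pattern, so it may need widening for a given site.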

Sure, on static sites it's magic, but on too many sites with dynamically constructed portions it's a damn headache at times..

_ /\ _

Xanza@lemm.ee 8 months ago
That’s not a bug. You literally told wget to follow links, so it did.

- Paragone@piefed.social 8 months ago
  There ought to be a do-not-follow-recursive-links switch for it, Hoomin..
  
  _ /\ _
  
  - Xanza@lemm.ee 8 months ago
    There is. wget doesn’t recurse by default. If it is following links, you’re passing an option (such as `-r`) that tells it to…
    