Comment on Google Search is losing its 'cached' web page feature
Raiderkev@lemmy.world 1 year ago
Without getting into too much detail, a cached site saved my ass in a court case. Fuck you Google.
Tangent5280@lemmy.world 1 year ago
Would you be willing to share more? It’s fine if you don’t want to, I wouldn’t either.
Raiderkev@lemmy.world 1 year ago
No, it was pretty personal, and also a legal matter, so I gotta take the high road.
verity_kindle@sh.itjust.works 1 year ago
Respect for your discretion.
Flax_vert@feddit.uk 1 year ago
Need the tea!!!
drislands@lemmy.world 1 year ago
Was that not something the Wayback Machine could have solved?
icedterminal@lemmy.world 1 year ago
Depends. Not every site, or every one of its pages, will be crawled by the Internet Archive. Many pages are only available because someone submitted them to be archived. Google Search, on the other hand, typically caches a page once it has been indexed.
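For what it’s worth, anyone can trigger an archive manually. A rough sketch, assuming the Wayback Machine’s public “Save Page Now” endpoint (web.archive.org/save/) accepts a plain GET the way it does today; exact behaviour and rate limits may differ:

```python
import requests

# Hedged sketch: ask the Wayback Machine to archive a page on demand.
# Assumes https://web.archive.org/save/<url> accepts a plain GET and reports
# the snapshot path in the Content-Location header (may change over time).
def save_to_wayback(url: str) -> str:
    resp = requests.get(f"https://web.archive.org/save/{url}", timeout=60)
    resp.raise_for_status()
    return "https://web.archive.org" + resp.headers.get("Content-Location", "")

if __name__ == "__main__":
    print(save_to_wayback("https://example.com"))
```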
lud@lemm.ee 1 year ago
It sucks because it’s sometimes (though not very often) useful, but it’s not like they are under any obligation to support it, or getting any money from doing it.
modus@lemmy.world 1 year ago
Isn’t caching how anti-paywall sites like 12ft.io work?
megaman@discuss.tchncs.de 1 year ago
At least some of these tools change their “user agent” to whatever Google’s crawler uses.
When you browse in, say, Firefox, one of the headers Firefox sends to the website is essentially “I am using Firefox”, which might affect how the site is displayed to you, let the admin know they need Firefox compatibility (or be used to fingerprint you…).
You can simply lie in that header, though. Some privacy tools change it to Chrome, since that’s the most common.
Or you can say “I am the Google web crawler”, which many sites let past the paywall so their content can be indexed by Google. (Rough illustration below.)
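A minimal sketch of the idea, assuming a site that gates purely on the User-Agent string (the Googlebot UA shown is the commonly published one; the URL is hypothetical):

```python
import requests

# Rough sketch of the trick described above: fetch a page while claiming to be
# Googlebot via the User-Agent header. Only works on sites that check nothing
# but the UA string; many also verify the source IP, as noted in the replies.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def fetch_as_googlebot(url: str) -> str:
    resp = requests.get(url, headers={"User-Agent": GOOGLEBOT_UA}, timeout=30)
    resp.raise_for_status()
    return resp.text

# Example (hypothetical URL):
# html = fetch_as_googlebot("https://example.com/some-paywalled-article")
```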
sfgifz@lemmy.world 1 year ago
If I’m not wrong, Google has a known set of IP address ranges for its crawlers, so not all sites will let you through just because your UA claims to be Googlebot.
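A rough sketch of how a site might verify a claimed Googlebot, assuming the standard reverse-then-forward DNS check that Google documents (details simplified):

```python
import socket

# Hedged sketch: verify a visitor claiming to be Googlebot.
# Reverse-resolve the IP, check the hostname belongs to googlebot.com or
# google.com, then forward-resolve that hostname and confirm it maps back to
# the same IP. A spoofed User-Agent alone fails this check.
def is_real_googlebot(ip: str) -> bool:
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)    # reverse DNS
    except socket.herror:
        return False
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        forward_ip = socket.gethostbyname(hostname)  # forward DNS
    except socket.gaierror:
        return False
    return forward_ip == ip

# Example (the IP shown is an assumption, not a verified Googlebot address):
# print(is_real_googlebot("66.249.66.1"))
```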
lud@lemm.ee 1 year ago
I dunno, but I suspect they aren’t using Google’s cache if that’s the case.
My guess is that the site runs its own scraper that acts like a search engine, and because websites want to be visible to search engines, they allow it to see everything. This is just my guess, so it might very well be completely wrong.