These two shitty companies deserve each other.
Google Is the Only Search Engine That Works on Reddit Now Thanks to AI Deal
Submitted 1 month ago by shish_mish@lemmy.world to technology@lemmy.world
https://www.404media.co/google-is-the-only-search-engine-that-works-on-reddit-now-thanks-to-ai-deal/
Comments
MyOpinion@lemm.ee 1 month ago
BearOfaTime@lemm.ee 1 month ago
Excellent! Now I won’t get reddit results and then have to filter them out!
MehBlah@lemmy.world 1 month ago
Sounds great to me. With reddit gone maybe we can start to find what we are looking for without having to go sort through reddit.
tal@lemmy.today 1 month ago
Kagi has a “search lens” specifically to search the Threadiverse. Like, they track lemmy/kbin/etc sites and you can specifically include them in their own results section, and can set up ‘!threadiverse’ or whatever you want to specifically search that.
They do the same for Usenet.
I suppose, given this new robots.txt Reddit development, that they’ll probably never have a Reddit lens, though.
zutto@lemmy.fedi.zutto.fi 1 month ago
Kagi is a metasearch-engine (apart from their homebrew small-web index, known as Teclis), so the reddit lenses will continue to function as long as one of the search engines it’s querying is paying reddit.
BrianTheeBiscuiteer@lemmy.world 1 month ago
Thank you Lemmy, for making it so much easier to walk away from that dumpster fire!
maxenmajs@lemmy.world 1 month ago
Alright then. The 3rd party app drama already pushed me here. I really won’t go back for anything if I’m not allowed to search for Reddit anymore.
Petter1@lemm.ee 1 month ago
This seems illegal to me 😮
Reverendender@sh.itjust.works 1 month ago
Is Google really permitted to prevent any other search engine from looking at Reddit?
Evotech@lemmy.world 1 month ago
I guess Reddit is permitted to only let Google index it
helenslunch@feddit.nl 1 month ago
How can they do that, logistically?
Like I realize there’s a flag they can raise that asks not to be indexed but that’s not necessarily a legal requirement.
moe90@feddit.nl 1 month ago
just begin with site:reddit.com test for brave search and it still works
itslilith@lemmy.blahaj.zone 1 month ago
did you set time limit to last week? old posts are still indexed. just tried “site:reddit.com df:w” on DDG and no hits
pipows@lemmy.today 1 month ago
I can confirm, it works just fine on brave search. I set the time limit to yesterday and it still gave me results
z3rOR0ne@lemmy.ml 1 month ago
Well that’s annoying. One workaround is to use a redirect extension like Libredirect, and you can still search via the !reddit bang on DuckDuckGo. So if I type this into my search bar, which has DuckDuckGo as default:
!reddit some new post or topic
it will search reddit for the search term, then when it attempts to load the reddit page, the Libredirect extension will redirect and show the results. Requires a bit of configuring and sure is annoying, but hey, no Google search necessary to get the up-to-date reddit threads.
woelkchen@lemmy.world 1 month ago
test site:reddit.com
works fine from DDG for me.
tal@lemmy.today 1 month ago
Older results will still show up, but these search engines are no longer able to “crawl” Reddit, meaning that Google is the only search engine that will turn up results from Reddit going forward.
Robots.txt lets you ask specific user-agents not to index the site. My guess is that that’s how they restricted it. I don’t know how those changes are reflected in existing indexed pages – don’t know if there’s any standard there – but it’ll stop crawlers from examining new pages.
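For anyone curious how per-user-agent rules behave, here's a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URL below are made up for illustration; I haven't copied Reddit's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (NOT Reddit's real file): one named crawler
# is allowed everywhere, every other user-agent is asked to stay out.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/r/some_thread"  # placeholder URL
    for agent in ("Googlebot", "DuckDuckBot", "bingbot"):
        print(agent, can_crawl(agent, url))
```

Note that this only tells a crawler what the site is asking for; it's up to the crawler to run a check like this and act on the answer.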
squidspinachfootball@lemm.ee 1 month ago
iirc, isn’t robots.txt more of a gentlemen’s agreement? I vaguely recall bots being able to crawl a site regardless, it’s just that most devs respect robots.txt and don’t. Could be wrong though, happy to be corrected.
eager_eagle@lemmy.world 1 month ago
User-Agent: bender
Disallow: /my_shiny_metal_ass
itslilith@lemmy.blahaj.zone 1 month ago
set the date filter to something recent,
test site:reddit.com df:w
(results from last week only) gives 0 hits
upside431@lemmy.world 1 month ago
This should be allowed
hahattpro@lemmy.world 1 month ago
And Brave Search
Brewchin@lemmy.world 1 month ago
Parts of the Internet only searchable on specific sites now? What next - charging a monthly subscription to use Google?
This needs to be regulated before the Internet becomes like streaming TV.
tal@lemmy.today 1 month ago
Robots.txt has been around for a long time, and all the major search engines will honor it. Not having a full index of the Web is the norm.
That isn’t to say that the practice of signing agreements isn’t potentially a concern.
reddig33@lemmy.world 1 month ago
What isn’t the norm is to serve one robots.txt to one company, and a different robots.txt to everyone else. Which is what Reddit is doing here.
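To make that concrete, here's a minimal sketch of what serving different robots.txt bodies by requester could look like on the server side. This is a guess at the general technique, not Reddit's actual code: the function name, the allowlist, and both robots.txt bodies are hypothetical.

```python
# Hypothetical robots.txt bodies; the real files are not quoted in this thread.
ROBOTS_ALLOW_ALL = "User-agent: *\nAllow: /\n"
ROBOTS_DENY_ALL = "User-agent: *\nDisallow: /\n"

# Assumed allowlist of crawler name fragments (lowercase).
PARTNER_CRAWLERS = ("googlebot",)


def robots_txt_for(user_agent_header: str) -> str:
    """Pick which robots.txt body to serve for a given User-Agent header."""
    ua = user_agent_header.lower()
    if any(name in ua for name in PARTNER_CRAWLERS):
        return ROBOTS_ALLOW_ALL
    return ROBOTS_DENY_ALL


if __name__ == "__main__":
    print(robots_txt_for("Mozilla/5.0 (compatible; Googlebot/2.1)"))
    print(robots_txt_for("DuckDuckBot/1.1"))
```

The User-Agent header is trivially spoofed, so a real setup would presumably also verify the requester some other way (IP range, reverse DNS) before handing out the permissive version.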