These two shitty companies deserve each other.
Google Is the Only Search Engine That Works on Reddit Now Thanks to AI Deal
Submitted 3 months ago by shish_mish@lemmy.world to technology@lemmy.world
https://www.404media.co/google-is-the-only-search-engine-that-works-on-reddit-now-thanks-to-ai-deal/
Comments
MyOpinion@lemm.ee 3 months ago
BearOfaTime@lemm.ee 3 months ago
Excellent! Now I won’t get reddit results and then have to filter them out!
MehBlah@lemmy.world 3 months ago
Sounds great to me. With reddit gone maybe we can start to find what we are looking for without having to go sort through reddit.
tal@lemmy.today 3 months ago
Kagi has a “search lens” specifically to search the Threadiverse. They track lemmy/kbin/etc. sites, you can include them in their own results section, and you can have a ‘!threadiverse’ bang (or whatever name you want) search only those.
They do the same for Usenet.
I suppose, given this new robots.txt Reddit development, that they’ll probably never have a Reddit lens, though.
zutto@lemmy.fedi.zutto.fi 3 months ago
Kagi is a metasearch engine (apart from their homebrew small-web index, known as Teclis), so the reddit lenses will continue to function as long as one of the search engines it’s querying is paying reddit.
BrianTheeBiscuiteer@lemmy.world 3 months ago
Thank you Lemmy, for making it so much easier to walk away from that dumpster fire!
maxenmajs@lemmy.world 3 months ago
Alright then. The 3rd party app drama already pushed me here. I really won’t go back for anything if I’m not allowed to search for Reddit anymore.
Petter1@lemm.ee 3 months ago
This seems illegal to me 😮
Reverendender@sh.itjust.works 3 months ago
Is Google really permitted to prevent any other search engine from looking at Reddit?
Evotech@lemmy.world 3 months ago
I guess Reddit is permitted to only let Google index it
helenslunch@feddit.nl 3 months ago
How can they do that, logistically?
Like I realize there’s a flag they can raise that asks not to be indexed but that’s not necessarily a legal requirement.
moe90@feddit.nl 3 months ago
Just start the query with site:reddit.com (e.g. “site:reddit.com test”) on Brave Search and it still works.
itslilith@lemmy.blahaj.zone 3 months ago
did you set time limit to last week? old posts are still indexed. just tried “site:reddit.com df:w” on DDG and no hits
pipows@lemmy.today 3 months ago
I can confirm, it works just fine on brave search. I set the time limit to yesterday and it still gave me results
z3rOR0ne@lemmy.ml 3 months ago
Well that’s annoying. One work around is to use a redirect extension like Libredirect and you can still search via the !reddit bang on DuckDuckGo. Thusly if I type into my search bar which has DuckDuckGo as default:
!reddit some new post or topic
it will search reddit for the search term, then when it attempts to load the reddit page, the Libredirect extension will redirect and show the results. It requires a bit of configuring and sure is annoying, but hey, no Google search necessary to get the up-to-date reddit threads.
woelkchen@lemmy.world 3 months ago
test site:reddit.com
works fine from DDG for me.
tal@lemmy.today 3 months ago
Older results will still show up, but these search engines are no longer able to “crawl” Reddit, meaning that Google is the only search engine that will turn up results from Reddit going forward.
Robots.txt lets you ask specific user-agents not to index the site. My guess is that that’s how they restricted it. I don’t know how those changes are reflected in existing indexed pages – don’t know if there’s any standard there – but it’ll stop crawlers from examining new pages.
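To make the per-user-agent idea concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules below are hypothetical and written only for illustration; they are not Reddit's actual robots.txt, and `example.com` is a placeholder URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules (NOT Reddit's real file): one named crawler is
# allowed everywhere, every other user-agent is asked to stay out.
RULES = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str, rules: str = RULES) -> bool:
    """Return True if `rules` permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL; only the path matters for the rule matching.
    url = "https://example.com/r/technology/"
    for agent in ("Googlebot", "bingbot", "DuckDuckBot"):
        print(agent, allowed(agent, url))
```

Note that the check runs inside the crawler. The file only states a request, and it is up to each crawler to look at it before fetching.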
squidspinachfootball@lemm.ee 3 months ago
iirc, isn’t robots.txt more of a gentlemen’s agreement? I vaguely recall bots being able to crawl a site regardless, it’s just that most devs respect robots.txt and don’t. Could be wrong though, happy to be corrected.
eager_eagle@lemmy.world 3 months ago
User-Agent: bender
Disallow: /my_shiny_metal_ass
itslilith@lemmy.blahaj.zone 3 months ago
set the date filter to something recent,
test site:reddit.com df:w
(results from last week only) gives 0 hits
upside431@lemmy.world 3 months ago
This should be allowed
hahattpro@lemmy.world 3 months ago
And Brave Search
Brewchin@lemmy.world 3 months ago
Parts of the Internet are only searchable on specific sites now? What next - charging a monthly subscription to use Google?
This needs to be regulated before the Internet becomes like streaming TV.
tal@lemmy.today 3 months ago
Robots.txt has been around for a long time, and all the major search engines will honor it. Not having a full index of the Web is the norm.
That isn’t to say that the practice of signing agreements isn’t potentially a concern.
reddig33@lemmy.world 3 months ago
What isn’t the norm is to serve one robots.txt to one company, and a different robots.txt to everyone else. Which is what Reddit is doing here.
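For anyone wondering how a site could do that mechanically, here is a rough sketch in Python of a server that picks the robots.txt body from the request's User-Agent header. Everything in it (`FAVORED_CRAWLERS`, `robots_txt_for`, `RobotsHandler`, the port) is made up for illustration; it is not a description of how Reddit actually implements it.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

ALLOW_ALL = "User-agent: *\nAllow: /\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical allowlist of crawler name fragments (an assumption).
FAVORED_CRAWLERS = ("googlebot",)


def robots_txt_for(user_agent: str) -> str:
    """Pick a robots.txt body based on a naive User-Agent substring match."""
    ua = (user_agent or "").lower()
    if any(name in ua for name in FAVORED_CRAWLERS):
        return ALLOW_ALL
    return DENY_ALL


class RobotsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /robots.txt is served in this sketch.
        if self.path != "/robots.txt":
            self.send_error(404)
            return
        body = robots_txt_for(self.headers.get("User-Agent", "")).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Local demo server; stop it with Ctrl+C.
    HTTPServer(("127.0.0.1", 8000), RobotsHandler).serve_forever()
```

The User-Agent header is trivial to spoof, so a substring match like this only changes what well-behaved crawlers are told. Actually blocking anyone would need something stronger on the server side.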