Smokey’s Simple Guide To Search Engine Alternatives
This post was inspired by the surge in people mentioning the new Kagi search engine in various Lemmy comments. I happen to be somewhat knowledgeable on the topic and wanted to tell everyone about some other alternative search engines available to them, as well as the difference between meta-search engines and true search engines.
Understanding Search Engines Vs. Meta-Search Engines
There are many alternative search engines floating around that people use, but most of them are meta-search engines, meaning they are a kind of search result reseller, middlemen to true search engines. They query the big engines for you and aggregate their results.
Examples of Meta-search engines:
- DuckDuckGo: Bing
- Ecosia: Bing + Google
- Kagi (kagi.com): Google, Mojeek, Yandex, Marginalia; $10/month premium for unlimited searches & email signup
- SearXNG: Too many to list, basically all of them; configurable; Free & Open Source Software, AGPL-3.0
- Startpage: Google + Bing
- Swisscows: Bing
- Qwant: Bing
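To make the "query the big engines and aggregate their results" idea concrete, here is a rough sketch of the aggregation step only. It is not how any of the engines listed above actually rank things: the engine names, the URLs, and the reciprocal-rank scoring are all made up for illustration, and the upstream queries are replaced by hard-coded lists.

```python
from collections import defaultdict


def aggregate(results_by_engine):
    """Merge ranked URL lists from several engines into one ranked list.

    results_by_engine maps an engine name to its ranked list of URLs.
    The scoring (sum of 1/rank across engines) is an assumption chosen
    for illustration, not any real meta-search engine's method.
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / rank
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda u: (-scores[u], u))


# Hypothetical result lists standing in for real upstream responses.
fake_results = {
    "engine_a": ["https://example.org/a", "https://example.org/b"],
    "engine_b": ["https://example.org/b", "https://example.org/c"],
}
merged = aggregate(fake_results)
```

A URL returned by more than one upstream engine floats to the top, which is roughly the benefit you get from pulling from several sources at once.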
## True Search Engines & The Realities Of Web-Crawling

As you can see, the vast majority of alternative search engines rely on some combination of Google and Bing. The reason for this is that the technologies that power search engines, web crawling and indexing, are extremely computationally heavy and non-trivial. They require powerful enterprise machines, and the more popular the service (as in, the more people connecting to and using it per second), the more internet bandwidth and processing power is needed. It takes a lot of money to pay for power, maintenance, and development/security.
Examples of True Search Engines:
- Bing: Owned by Microsoft
- Google: Owned by Google/Alphabet
- Mojeek: Owned by Mojeek LLC
- Yandex: Owned by Yandex Inc
- YaCy: Free & Open Source Software, [GPL-2.0](www.gnu.org/licenses/…/gpl-2.0.en.html), powered by peer-to-peer technology, created by Michael Christen
- Marginalia Search: Free & Open Source Software, AGPL-3.0, developed by Marginalia/Martin Rue
This is a big financial ask for most companies interested in making a profit out of the gate, so they determine it’s worth just paying Google and Bing for access to their enormous pre-existing infrastructure without the headaches of dealing with maintenance and security risks.
# About Some Of The Free & Open Source Search Engines

SearX/SearXNG is a free and open source, highly customizable, and self-hostable meta-search engine. SearX instances act as a middleman: they query other search engines for you, stripping out all their spyware ad crap, and your connection never touches their servers.

## Decentralization Federation and Instances

As Lemmy users, you should at least vaguely understand the power of a decentralized service spread out among many individually operated/maintained instances that can cooperate with each other.
Companies that live and die by profit margins have to concern themselves with the choice of owning their own massive computer infrastructure or paying for lots of access to someone else’s. Decentralized free and open source search engines that can be self-hosted by thousands of computer hobbyists from around the world can spread the collective user load and eat the meager maintenance cost of hosting an instance on an old desktop computer or an extremely energy-efficient single-board computer.
Additionally, the spread of users across multiple instances helps prevent any one of them from exceeding the free/cheap allotment of API calls, in the case of meta-search engines like SearXNG, or from being rate limited, like 3rd party YouTube scrapers such as Invidious and Piped.
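If you want to spread your own searches around instead of hammering one instance, a small sketch like this picks the order to try instances in. Only paulgo.io comes from this post; the other two hostnames are placeholders I made up, so swap in real instances from the public list.

```python
import random


def instances_to_try(instances, rng=random):
    """Return the instances in a random order, leaving the input untouched.

    Trying them in shuffled order spreads load across instances and gives
    a natural fallback if the first one is down or rate limited.
    """
    order = list(instances)
    rng.shuffle(order)
    return order


# paulgo.io is mentioned in the post; the other two are hypothetical.
INSTANCES = [
    "https://paulgo.io",
    "https://searx.example.org",
    "https://search.example.net",
]
order = instances_to_try(INSTANCES)
```

Each call gives a fresh order, so over many searches no single instance ends up carrying all of your traffic.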
In the case of YaCy, decentralization is also federated: all individual YaCy instances communicate with each other through P2P to act as one big collective web crawler and indexer, making it the only truly independent FOSS search engine I know of.
## Self-Hosting For Maximum Privacy

Of course, you have to trust the SearX instance provider with your information. Trust is a big concern with every engine you use, because while they can promise to not log anything or sell your info for profit, they often provide no way of proving those claims to be true beyond ‘just trust me bro’. If you are seriously concerned with privacy and knowledgeable with computers, then self-hosting FOSS software on your own instance is the best option to maintain control of your data.
Free As In Freedom, People vs Company Run Services
I personally trust some FOSS-loving sysadmin who hosts social services for free out of altruism, who also accepts hosting donations, and whose server is located on the other side of the planet, with my query info over Google/Alphabet any day. I have had several communications with Marginalia over several years now through the Gemini protocol and the small web, and they are more than happy to talk over email. You can have a human conversation with your search engine provider, who is just a knowledgeable everyday Joe who genuinely believes in the project and freely dedicates their resources to it. Consider sending some cash their way to help with upkeep if you like the services they provide.
Here’s a list of all public SearX instances. I personally prefer to use paulgo.io. All SearX instances are configured differently to query different engines. If one doesn’t seem to give good results, try a few others.
Did I mention it has bangs like DuckDuckGo? If you really need Google, like for maps and business info, just use !!g in the query.
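As a small illustration of the bang tip, here is a sketch that builds a search link with the bang in front of the query. It assumes the instance serves searches at `/search` with a `q` parameter, which is the usual SearX/SearXNG layout but can differ per instance; the helper name is my own.

```python
from urllib.parse import urlencode


def build_search_url(instance, query):
    """Build a search URL for a SearX/SearXNG instance.

    Assumes the common /search?q=... layout; individual instances may be
    configured differently.
    """
    return f"{instance.rstrip('/')}/search?{urlencode({'q': query})}"


# The !!g prefix is the bang from the post; the rest is an example query.
url = build_search_url("https://paulgo.io", "!!g coffee shop opening hours")
```

The bang is just part of the query text, so it gets URL-encoded along with everything else.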
[Marginalia Search](search.marginalia.nu) is a completely novel search engine written and hosted by one dude. It aims to prioritize indexing lighter websites with little to no JavaScript, as these tend to be personal websites and homepages that have poor SEO and that the big search engines won’t index well. If you remember the internet of the early 2000s and want a nostalgia trip, this one’s for you. It’s also open source and self-hostable.
Finally, YaCy is another completely novel search engine that uses peer-to-peer technology to power a big web crawler which prioritizes what it indexes based on user queries and feedback. Everyone can download YaCy and devote a bit of their computing power to both run their own local instance and help out a collective search engine. Companies can also download YaCy and use it to index their private intranets.
They have a public instance available through a web portal. To be upfront, YaCy is not a great search engine for what most people usually want, which is quick and relevant information within the first few clicks. But it is an interesting use of technology, and it shows what a true honest-to-god community-operated search engine looks like, untainted by SEO scores or corporate money-making shenanigans.
I hope this has been informative to those who believe there are only a few options to pick from. I know these options are unknown to most people.
Xavier@lemmy.ca 1 year ago
Excellent writeup! With constant updates to boot 🥳
I’m saving it for future reference.
Thank you for putting your time and effort into this.