For context: I created a video search engine last year, then shut it down and put the data online. You can read about it here: bendangelo.me/…/failed-attempt-at-creating-a-vide…
I put that project on hold because of scaling issues. Anyway, I’m back with another idea. I’ve been frustrated with how AI slop is ruining the internet, and recently it’s been hitting YouTube pretty hard with AI videos. I’m brainstorming a tool for people to self-host:
Self-hosted crawler: Pick which sites/videos to index (blogs, forums, YT channels, etc.).
AI chat interface: Ask questions like, “Show me Rust tutorials from 2023” or “Summarize recent posts about homelab backups.”
Optional sharing: Pool indexes with trusted friends/communities.
Why?
No Google/YouTube spam: only content you choose.
Works offline (archive forums, videos, docs).
Local AI (Mistral) or cloud (paid) for smarter searches.
Would this be useful to you? What sites would you crawl? Any killer features I’m missing?
Prototype in progress—just testing interest!
CameronDev@programming.dev 3 days ago
I personally have zero interest in AI search, if you mean LLM. The fact that it can make stuff up also means it can miss stuff. Neither is acceptable for a search engine.
If you mean some kind of deterministic algorithm for indexing and searching, then maybe.
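To make "deterministic" concrete, here is a minimal sketch of the kind of thing I mean: a plain inverted index with AND-matching on keywords. It is only an illustration, not a proposal for how the tool should work. The function names and sample documents are made up, and a real index would also want ranking, stemming, and on-disk storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


# Hypothetical sample documents, made up for illustration only.
DOCS = {
    "a": "Rust tutorial for beginners",
    "b": "Homelab backups with restic",
    "c": "Advanced Rust tutorial on async",
}
INDEX = build_index(DOCS)
```

The same query against the same index always returns the same ids, and a document that contains every query word cannot be silently skipped.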
Also, attempting to crawl sites locally sounds like a great way to get banned from those sites for looking like a bot.
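If you do crawl, the usual way to lower that risk is to honour each site's robots.txt and wait between requests. A rough sketch using Python's standard `urllib.robotparser` follows. The user agent name, the 5 second default delay, and the sample robots.txt are all assumptions for illustration, and following robots.txt is no guarantee that a site won't block you anyway.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user agent name for the self-hosted crawler.
USER_AGENT = "my-selfhosted-crawler"


def load_rules(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def plan_fetch(parser, url, default_delay=5.0):
    """Return seconds to wait before fetching url, or None if disallowed."""
    if not parser.can_fetch(USER_AGENT, url):
        return None
    delay = parser.crawl_delay(USER_AGENT)
    # Fall back to an assumed default when the site sets no Crawl-delay.
    return float(delay) if delay is not None else default_delay


# Made-up robots.txt content; a real crawler would download it per site.
SAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""
RULES = load_rules(SAMPLE_ROBOTS)
```

A crawler loop would call `plan_fetch` before every request, skip the URL on `None`, and otherwise sleep for the returned number of seconds.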
T156@lemmy.world 3 days ago
I can’t imagine self hosting an LLM-based search engine would be too viable. The hardware demands, even for a relatively small quantised model, are considerable. Doubly so if you don’t have a GPU to accelerate with.
CameronDev@programming.dev 3 days ago
Yeah, absolutely. And running a GPU 24/7 to occasionally search is just a waste of power. I’m not convinced that Google’s and Bing’s AI search make financial sense either. Google dropped live search (where the results updated in real time as you typed) because it was too expensive, so how does LLM search end up cheaper than live search?!
JeremyHuntQW12@lemmy.world 2 days ago
You can run DeepSeek on a Raspberry Pi.