
Making a Blocklist to Remove Spam from Search Engines

140 likes

Submitted 8 months ago by popcar2@programming.dev to technology@lemmy.world

https://popcar.bearblog.dev/making-badwebsiteblocklist


Comments

  • blindbunny@lemmy.ml 8 months ago

    At this point should we really be using these search engines? Feels like you’re just doing their work for them.

    • Paradox@lemdro.id 8 months ago

      That's a very large part of why I moved to Kagi. It just works.

      Additionally, its system of weighting, instead of just a binary block, is very useful. Take Fandom wikis, for example. They're awful, yes, but sometimes they're the only result for a topic, and will do if needed. With a binary blocklist, you either see them or you don't. With the weighted system, you can downrank them, so if better results show up, they appear higher in the listing than the downranked ones.
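The difference between a binary block and a downrank can be sketched in a few lines of Python. This is only an illustration of the idea, not how Kagi implements it: the `Result` type, the `rerank` function, the example domains, and the 0.5 weight are all hypothetical, and matching is on the exact hostname only.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    score: float  # relevance score assumed to come from the engine; higher is better


def rerank(results, weights, blocked=frozenset()):
    """Drop blocked domains, scale the rest by a per-domain weight, sort best first.

    `weights` maps a hostname to a multiplier (below 1 downranks, above 1 boosts).
    Domains missing from `weights` keep their original score.
    """
    scored = []
    for r in results:
        host = urlparse(r.url).hostname or ""
        if host in blocked:
            continue  # binary block: the result disappears entirely
        scored.append((r.score * weights.get(host, 1.0), r))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored]


results = [
    Result("https://fandom.example/wiki/Topic", 0.9),
    Result("https://indie-wiki.example/Topic", 0.6),
]

# Downranked: the Fandom page drops below the better source but stays listed.
downranked = rerank(results, weights={"fandom.example": 0.5})

# Blocked: the Fandom page is gone, even if it were the only hit.
filtered = rerank(results, weights={}, blocked={"fandom.example"})
```

With the downrank, a query whose only hit is the Fandom page still returns that page; with the block, the same query returns nothing.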

    • NutWrench@lemmy.world 8 months ago

      This. I think the best way to make the Internet less sh*tty is to get away from Google search.

      I like the SearX search engine. It gives old-school, relevant search results, not Google-ranked ones.

      search.inetol.net

      It’s also spread out over many separate instances, so you can pick the one that best suits your search needs:

      searx.space

  • LedgeDrop@lemm.ee 8 months ago

    Fantastic! Thank you for sharing this.

    I have it installed, and I'm curious how effective it will be.

    Lately, I've been reporting AI-generated cruft as "spam" to DuckDuckGo. It's not really spam, since there are some nuggets of useful information, but they're so sparse that I'd rather have skipped the article/website entirely. I hope these kinds of blocklists will evolve to include this kind of quasi-spam.

    • Valmond@lemmy.world 8 months ago

      I bet we could make an effective blocklist just by discarding all websites with more than 200 words on them.

  • Zarxrax@lemmy.world 8 months ago

    This looks like it has some potential. I’ll probably give this a try.

  • Rambomst@lemmy.world 8 months ago

    Thanks for this, I’ll give it a go.
