hydroptic
@hydroptic@sopuli.xyz
- Comment on xkcd #2934: Bloom Filter 18 hours ago:
Well, yes and no. With a straight-up hash set, you’re keeping
set_size * bits_per_element
bits plus whatever the overhead of the hash table is in memory, which might not be tenable for very large sets. A Bloom filter with eg. a ~1% false positive rate and an ideal k parameter (number of hash functions, see eg. the Bloom filter wiki article) keeps only ~10 bits per element, completely regardless of element size, because it doesn’t store the elements themselves or even their full hashes. It only tells you whether some element is probably in the set or not; you can’t eg. enumerate the elements in the set. As an example of memory usage, a Bloom filter that has a false positive rate of ~1% for 500 million elements would need 571 MiB (noting that the false positive rate goes up the further you go beyond 500M elements).
Lookup time complexity for a Bloom filter is O(k), where k is the parameter I mentioned and a constant, ie. effectively O(1).
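The sizing figures quoted above can be reproduced with the standard formulas for an optimally sized Bloom filter, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. This is only a rough sketch; the function name `bloom_parameters` is made up for illustration.

```python
import math


def bloom_parameters(n_elements: int, false_positive_rate: float) -> tuple[int, int]:
    """Return (total_bits, hash_count) for an optimally sized Bloom filter."""
    # m = -n * ln(p) / (ln 2)^2
    total_bits = math.ceil(
        -n_elements * math.log(false_positive_rate) / math.log(2) ** 2
    )
    # k = (m / n) * ln 2, rounded to the nearest whole hash function
    hash_count = round(total_bits / n_elements * math.log(2))
    return total_bits, hash_count


n = 500_000_000
total_bits, k = bloom_parameters(n, 0.01)
print(f"bits per element: {total_bits / n:.2f}")       # 9.59
print(f"total size: {total_bits / 8 / 2**20:.0f} MiB")  # 571 MiB
print(f"hash functions: {k}")                           # 7
```

Note that the element size never appears in the calculation, which is why the per-element cost stays at roughly 10 bits whether the elements are short IDs or long URLs.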
Probabilistic set membership queries are mainly useful when you’re dealing with ginormous sets of elements that you can’t shove into a regular in-memory hash set. A good example in the wiki article is CDN cache filtering:
Nearly three-quarters of the URLs accessed from a typical web cache are “one-hit-wonders” that are accessed by users only once and never again. It is clearly wasteful of disk resources to store one-hit-wonders in a web cache, since they will never be accessed again. To prevent caching one-hit-wonders, a Bloom filter is used to keep track of all URLs that are accessed by users. A web object is cached only when it has been accessed at least once before, i.e., the object is cached on its second request.
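A minimal sketch of the "cache on second request" idea from the quote, assuming an in-memory dict as the cache and a toy Bloom filter built on SHA-256 double hashing. The names `BloomFilter`, `seen_urls`, `cache` and `handle_request` are all hypothetical, and a real CDN would do this very differently in the details.

```python
import hashlib
from typing import Callable, Iterator


class BloomFilter:
    """Toy Bloom filter; bit positions come from double hashing one SHA-256 digest."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the stride is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Sizes are arbitrary illustration values, not tuned for any real workload.
seen_urls = BloomFilter(num_bits=10_000, num_hashes=7)
cache: dict[str, bytes] = {}


def handle_request(url: str, fetch: Callable[[str], bytes]) -> bytes:
    """Serve from cache if possible; only admit a URL once it has been seen before."""
    if url in cache:
        return cache[url]
    body = fetch(url)
    if url in seen_urls:
        # Seen before (or a false positive): worth spending cache space on.
        cache[url] = body
    else:
        # First sighting: remember the URL only, do not store the body.
        seen_urls.add(url)
    return body
```

A false positive here just means an occasional one-hit-wonder gets cached anyway, which is harmless, so this is a case where the "maybe" answer doesn't even need double-checking.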
- Comment on xkcd #2934: Bloom Filter 20 hours ago:
Which example do you mean?
If you meant my user ID example, you’d prepopulate the bloom filter with existing user IDs on eg. service startup or whatever, and then update the filter every time a new user ID is added – keeping in mind that the false positive rate will grow as more are added, and that at some point you may need to create a new filter with a bigger backing bit array
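To put rough numbers on how the false positive rate grows as more IDs are added, here is a sketch using the standard approximation (1 - e^(-kn/m))^k. The filter size, the 2% threshold and the function names are assumptions picked for illustration.

```python
import math


def expected_false_positive_rate(num_bits: int, num_hashes: int, num_elements: int) -> float:
    """Approximate false positive rate after inserting num_elements items."""
    return (1.0 - math.exp(-num_hashes * num_elements / num_bits)) ** num_hashes


def should_rebuild(num_bits: int, num_hashes: int, num_elements: int, max_rate: float) -> bool:
    """True once the estimated rate has drifted above what we are willing to accept."""
    return expected_false_positive_rate(num_bits, num_hashes, num_elements) > max_rate


# Hypothetical filter sized for ~1% false positives at 1 million user IDs.
NUM_BITS = 9_585_059
NUM_HASHES = 7

for n in (1_000_000, 1_500_000, 2_000_000):
    rate = expected_false_positive_rate(NUM_BITS, NUM_HASHES, n)
    rebuild = should_rebuild(NUM_BITS, NUM_HASHES, n, max_rate=0.02)
    print(f"{n:>9,} IDs: ~{rate:.2%} false positives, rebuild: {rebuild}")
```

Since a plain Bloom filter can't be resized in place, "rebuild" here means allocating a bigger bit array and re-adding every ID from the source of truth.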
- Comment on xkcd #2934: Bloom Filter 1 day ago:
That’s definitely not what they’re most useful for. I mean, you probably can use a bloom filter for implementing spell check, but saying that’s where they’re most useful severely misses the point of probabilistic set membership queries.
Bloom filters and their relatives are great when you have a huge set of values – eg. 100s of millions of user IDs in some database – and you want to have a very fast way of checking whether some value might be in that set, without having to query the database. Naturally this assumes that you’ve prepopulated a bloom filter with whatever values you need to be checking.
If the result of the bloom filter query is “nope”, you know that the value’s definitely not in the set, but if the result is “maybe” then you can go ahead and double-check by querying the database. This means that the vast majority of checks don’t have to hit that slow DB at all, and even though you’ll get some false positives this’ll still be much much much faster than having to go through that DB every time.
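The "nope" / "maybe" flow could be sketched roughly like this, with a plain Python set standing in for the slow database. Everything here (`BloomFilter`, `UserLookup`, `db_queries`, the filter sizes) is made up for illustration and not taken from any particular library.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Toy Bloom filter; bit positions come from double hashing one SHA-256 digest."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the stride is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


class UserLookup:
    """Checks the Bloom filter first and only falls through to the 'database' on a maybe."""

    def __init__(self, database: set[str], num_bits: int, num_hashes: int) -> None:
        self.database = database  # stand-in for the slow DB
        self.db_queries = 0       # how many lookups actually reached the DB
        self.filter = BloomFilter(num_bits, num_hashes)
        for user_id in database:  # prepopulate with existing IDs
            self.filter.add(user_id)

    def exists(self, user_id: str) -> bool:
        if user_id not in self.filter:
            return False          # "nope": definitely not in the set, skip the DB
        self.db_queries += 1      # "maybe": double-check against the DB
        return user_id in self.database


lookup = UserLookup({"alice", "bob"}, num_bits=1024, num_hashes=7)
print(lookup.exists("alice"))  # True
print(lookup.db_queries)       # 1
```

The answer from `exists` is always correct because every "maybe" gets verified; the filter only decides how often the expensive path is taken.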
- Comment on xkcd #2934: Bloom Filter 1 day ago:
Do you want to learn about probabilistic data structures?
- maybe
- no
- Comment on Firefox version 126 introduces the collection of search data telemetry. 4 days ago:
And I am perfectly fine with that.
You wouldn’t be an asshole otherwise. Maybe some beautiful day you’ll realise being an insufferable twat might not be the best approach to life, but I’m not going to hold my breath.
- Comment on Firefox version 126 introduces the collection of search data telemetry. 4 days ago:
You’re not wrong, you’re just an asshole
- Submitted 2 weeks ago to science_memes@mander.xyz | 3 comments
- Comment on We can do all three things at once 2 weeks ago:
I absolutely would and have done so: www.vaneck.com/…/uranium-nuclear-energy-etf-nlr/
- Comment on It definitely *was* a good idea though 2 weeks ago:
I’ll call it the Me Prize
- Comment on It definitely *was* a good idea though 2 weeks ago:
Ah, a fun little joke I can do with the nitroglycerin I definitely don’t have because that would probably be illegal
- Submitted 2 weeks ago to science_memes@mander.xyz | 40 comments
- Comment on understanding 2 weeks ago:
Ooo, nice, thank you for the tip.
- Comment on Microsoft and IBM make MS-DOS 4.00 Open-Source 2 weeks ago:
Something being ancient and irrelevant doesn’t stop a lot of companies.
- Comment on The Tech Baron Seeking to “Ethnically Cleanse” San Francisco 2 weeks ago:
“Ethnically cleanse,” he said at one point, summing up his idea for a city purged of Blues (this, he says, will prevent Blues from ethnically cleansing the Grays first).
Conservatives are incredibly fucked up. They can’t fathom coexisting with people who aren’t like them without wanting to “ethnically cleanse” them, so they naturally assume everybody else thinks like this as well
- Comment on Microsoft and IBM make MS-DOS 4.00 Open-Source 2 weeks ago:
MIT license too, huh. I was sort of expecting a more restrictive one because, well, Microsoft and IBM
- Comment on Now all we need is a drink pairing guide 3 weeks ago:
Haven’t had the chance yet but I wouldn’t say no to licking uranium
- Submitted 3 weeks ago to science_memes@mander.xyz | 5 comments
- Comment on vengance 3 weeks ago:
Same, this was literally amazing. Here’s a LiveScience piece I found that’s a bit more in depth. It had a link to the article, and turns out it’s open access yay
- Comment on ‘Meta is out of options’: EU regulators reject its privacy fee for Facebook and Instagram 4 weeks ago:
lol tell us another one
- Comment on Dead satellites are filling space with trash. That could affect Earth’s magnetic field | Sierra Solter [Guardian, opinion] 4 weeks ago:
Based on your assumption that because screws, which notably aren’t in the ionosphere, don’t affect the magnetic field, neither will putting tons of conductive material in the ionosphere?
- Comment on Dead satellites are filling space with trash. That could affect Earth’s magnetic field | Sierra Solter [Guardian, opinion] 4 weeks ago:
Uh, how’s what you’re saying related to putting tons of metal in the ionosphere possibly being a bad idea?
- Dead satellites are filling space with trash. That could affect Earth’s magnetic field | Sierra Solter [Guardian, opinion] (www.theguardian.com) Submitted 4 weeks ago to earthscience@mander.xyz | 4 comments
- Comment on ripperonis 5 weeks ago:
I don’t think that’s anthropomorphizing, just an awareness that animals are being literally killed for the research and being respectful of it
- Comment on Use TikTok to combat misinformation, MPs tell government 5 weeks ago:
The problem with this plan is that misinformation is much easier to produce and spreads faster because it doesn’t have to have even a passing resemblance to reality. You’re not going to be able to counter people spewing emotionally charged bullshit.
Well, that’s one of the problems anyhow…
- Comment on What will happen to large companies once poor people have no more money to use? 5 weeks ago:
Yeah that’s true, their revolution did fail in many ways and basically just turned into more of the same for a long time
- Comment on What will happen to large companies once poor people have no more money to use? 5 weeks ago:
Unless, of course, the French Revolution happens.
Unfortunately I think the only sort of revolution any country in the “West” is likely to see in the near future is a fascist one. Things haven’t been going too great for the past N+1 years, so morons are flocking to the extreme right in droves because they promise easy solutions to complex problems – namely murderizing the fuck out of anyone who’s not like them
- Comment on The Rogue Tesla Mechanic Resurrecting Salvaged Cars - Disregarding Tesla's prohibitive attitude towards third-party repairs, this man buys wrecked Tesla's and uses them to make working ones. 5 weeks ago:
“Rogue Tesla mechanic” gives me Harry Tuttle vibes
- Comment on Next on the hydraulic press channel! 5 weeks ago:
Yeah I came here to say that this is a fucking terrifying LOTO procedure that most likely will eventually kill someone
- Comment on Trust exercise 5 weeks ago:
Yeah the shame’s definitely on you
- Comment on How Hidden Nazi Symbols Were the Tip of a Toxic Iceberg at Life Is Strange Developer Deck Nine - IGN 1 month ago:
I have a sneaking suspicion that if I were to look at your past comments you’d prove my point for me. You might do it right in this thread for that matter