The one-liner:
dd if=/dev/zero bs=1G count=10 | gzip -c > 10GB.gz
This is brilliant.
Submitted 5 hours ago by some_guy@lemmy.sdf.org to technology@lemmy.world
https://idiallo.com/blog/zipbomb-protection
At least in Germany, having one of these on your system is illegal.
When I was serving high volume sites (that were targeted by scrapers) I had a collection of files in CDN that contained nothing but the word “no” over and over. Scrapers who barely hit our detection thresholds saw all their requests go to the 50M version. Super aggressive scrapers got the 10G version. And the scripts that just wouldn’t stop got the 50G version.
It didn’t move the needle on budget, but hopefully it cost them.
How do you tell scrapers from regular traffic?
Most often because they don’t download any of the CSS or external JS files from the pages they scrape. But there are a lot of other patterns you can detect once you have their traffic logs loaded in a time series database. I used an ELK stack back in the day.
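To make the "no CSS or JS requests" signal concrete, here is a rough sketch using awk on a web server access log. The log lines, the file names access_sample.log and suspects.txt, and the min_pages threshold are all made up for illustration, and it assumes Common Log Format where field 1 is the client IP and field 7 is the request path. A real setup would combine this with other signals, as described above.

```shell
# Hypothetical sample in Common Log Format; real logs will differ.
cat > access_sample.log <<'EOF'
203.0.113.5 - - [01/May/2025:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 5120
203.0.113.5 - - [01/May/2025:10:00:01 +0000] "GET /style.css HTTP/1.1" 200 900
203.0.113.5 - - [01/May/2025:10:00:01 +0000] "GET /app.js HTTP/1.1" 200 1500
198.51.100.7 - - [01/May/2025:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 5120
198.51.100.7 - - [01/May/2025:10:00:03 +0000] "GET /about.html HTTP/1.1" 200 4100
198.51.100.7 - - [01/May/2025:10:00:04 +0000] "GET /blog/ HTTP/1.1" 200 7300
EOF

# Count requests per client IP and note which IPs ever fetched a .css or .js file.
# Print IPs with at least min_pages requests and no asset requests at all.
awk -v min_pages=3 '
  { total[$1]++ }
  $7 ~ /\.(css|js)([?]|$)/ { assets[$1]++ }
  END { for (ip in total) if (total[ip] >= min_pages && !assets[ip]) print ip }
' access_sample.log > suspects.txt

cat suspects.txt
```

In this sample only 198.51.100.7 gets flagged, since the other client pulled the stylesheet and script like a browser would.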
First off, be very careful with bs=1G as it may overload the RAM. You will want to set count accordingly
Yup, use something sensible like 10M or so.
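For what it's worth, a sketch of the smaller-block variant. It assumes GNU dd (BSD/macOS dd spells the suffix 10m), and the file name zeros_test.gz and the small count are only there so it runs quickly. dd allocates a single buffer of bs bytes, so memory use follows bs rather than the total size.

```shell
# Assumes GNU dd; count is scaled down for a quick trial: 10M x 4 = 40 MiB.
# For the 10 GiB target in the thread, bs=10M count=1024 gives the same
# total as bs=1G count=10 while only buffering 10 MiB at a time.
dd if=/dev/zero bs=10M count=4 2>/dev/null | gzip -c > zeros_test.gz

# Compare the compressed size on disk with the size after decompression,
# without ever writing the decompressed data to disk.
wc -c < zeros_test.gz
gzip -dc zeros_test.gz | wc -c
```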
Before I tell you how to create a zip bomb, I do have to warn you that you can potentially crash and destroy your own device.
LOL. Destroy your device, kill the cat, what else?
destroy your device by… having to reboot it. the horror! The pain! The financial loss of downtime!
It’ll email your grandmother all of your porn!
Anyone who writes a spider that’s going to inspect all the content out there is already going to have to have dealt with this, along with about a bazillion other kinds of oddball or bad data.
Competent ones, yes. Most developers aren’t competent, scraper writers even less so.
And if you want some customisation, e.g. some repeating string over and over, you can use something like this:
yes "b0M" | tr -d '\n' | head -c 10G | gzip -c > 10GB.gz
yes repeats the given string (followed by a line feed) indefinitely - originally meant to type “yes” + ENTER into prompts. tr then removes the line breaks again and head makes sure to only take 10GB and not have it run indefinitely.
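If you want to see what each stage does before committing to 10G, you can run the same pipeline at toy sizes. This is just an illustration; the file names stage1.txt and stage2.txt are made up.

```shell
# Stage 1: yes emits the string plus a line feed, over and over.
# head -n 3 stops it after three lines.
yes "b0M" | head -n 3 > stage1.txt
cat stage1.txt

# Stages 2 and 3: tr strips the line feeds, head -c cuts the stream
# at an exact byte count (12 bytes here, i.e. four repetitions).
yes "b0M" | tr -d '\n' | head -c 12 > stage2.txt
cat stage2.txt; echo
```

The second command prints b0Mb0Mb0Mb0M on a single line, which is the same thing the full pipeline feeds into gzip, only much shorter.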
If you want to be really fancy, you can even add some HTML header and footer to some files like header and footer and then run it like this:
yes "b0M" | tr -d '\n' | head -c 10G | cat header - footer | gzip -c > 10GB.gz
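A small-scale sketch of that, in case the cat header - footer part is unclear: the lone - tells cat to read standard input at that position, so the repeated string ends up between the two files. The HTML snippets and the output name wrapped_test.gz are placeholders, and the size is cut down to 1M (GNU head suffix) so it finishes instantly.

```shell
# Placeholder header and footer; put whatever markup you like in these files.
printf '<html><body>' > header
printf '</body></html>' > footer

# Same pipeline as above, scaled down to 1 MiB of filler.
yes "b0M" | tr -d '\n' | head -c 1M | cat header - footer | gzip -c > wrapped_test.gz

# Inspect both ends of the decompressed stream.
gzip -dc wrapped_test.gz | head -c 12; echo
gzip -dc wrapped_test.gz | tail -c 14; echo
```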
Funny part is I was using derivatives of this decades ago to test RAID-5/6 sequential read and write speeds.
Interesting. I wonder how long it takes until most bots adapt to this type of “reverse DoS”.
Then we’ll just be more clever as well. It’s an arms race after all.
How I read that code:
“If the dev’s bullshit is equal to 1 gram…”
aesthelete@lemmy.world 49 minutes ago
This reminds me of shitty FTP sites with ratio when I was on dial-up. I used to push them files full of null characters with good filenames. The modem would compress the upload as it transmitted it which allowed me to upload the junk files at several times the rate of a normal file.