Comment on All of Japan's Toyota Assembly Plants Shut Down for a Day Because Their Server Ran Out of Disk Space
Dkarma@lemmy.world 1 year ago
The answer here is not storage, it is better alerting.
Yes, alert me when disk space is about to run out so I can ask for a massive raise and quit my job when they don't give it to me.
Then when TSHTF they pay me to come back.
There are cases where a disk fills up quicker than one can reasonably react, even if alerts are in place. And sometimes the culprit is something you can’t just go and kill.
And sometimes the culprit is something you can’t just go and kill.
That’s what the Yakuza is for.
Had an issue like that a few years back: a standalone device that was filling up quickly. The poorly designed device could only be flushed via USB sticks. I told them that they had to do it weekly. Guess what they didn’t do. Looking back, I should have made it alarm and flash once a week on a timer.
nickhammes@lemmy.world 1 year ago
Why not both? Alerting to find issues quickly, a bit of extra storage so you have more options available in case of an outage, and maybe some redundancy for good measure.
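For anyone wondering what the "alerting" half looks like in its simplest form, here is a minimal sketch in Python. The function names and the 20% / 10% thresholds are made up for illustration, not taken from any real monitoring product; only `shutil.disk_usage` is a standard-library call.

```python
import shutil


def alert_level(total_bytes: int, free_bytes: int,
                warn_at: float = 0.20, critical_at: float = 0.10) -> str:
    """Classify free space as "ok", "warning" or "critical".

    warn_at and critical_at are free-space fractions; the defaults are
    illustrative assumptions and should be tuned per system.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    free_fraction = free_bytes / total_bytes
    if free_fraction <= critical_at:
        return "critical"
    if free_fraction <= warn_at:
        return "warning"
    return "ok"


def check_path(path: str = "/") -> str:
    """Read real usage for the filesystem holding `path` and classify it."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return alert_level(usage.total, usage.free)
```

A real setup would run something like `check_path` on a schedule and page someone on "warning" rather than waiting for "critical", which is where the extra headroom buys you time.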
RupeThereItIs@lemmy.world 1 year ago
A system this critical is on a SAN; if you’re alerting properly, adding a bit more storage space is a 5-minute task.
It should also have a DR solution, yes.
nightwatch_admin@feddit.nl 1 year ago
A system this critical is on a hypervisor with tight storage “because deduplication” (I’m not making this up).
RupeThereItIs@lemmy.world 1 year ago
This is literally what I do for a living. Yes, deduplication and thin provisioning.
This is still a failure of monitoring or slow response to it.
You keep your extra capacity handy on the storage array, not with some junk files on the filesystem.
You also need to know how overprovisioned you are and when you’re likely to run out of capacity… you know this from monitoring.
Then, when management fails to react promptly to your warnings, shit like this happens.
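The "when you're likely to run out" estimate can be sketched as a straight-line fit over recent usage samples. This is only an illustration under the assumption that growth is roughly linear; the function name and the sample format are hypothetical, and real arrays with deduplication and thin provisioning will need their own capacity numbers fed in.

```python
from typing import Optional, Sequence, Tuple


def hours_until_full(samples: Sequence[Tuple[float, float]],
                     capacity_bytes: float) -> Optional[float]:
    """Estimate hours until usage reaches capacity.

    samples: (hours_since_start, used_bytes) pairs from monitoring.
    Fits a least-squares line to get the growth rate, then projects
    forward from the most recent sample. Returns None if usage is
    flat or shrinking.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples need distinct timestamps")
    # Least-squares slope: bytes of growth per hour.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    _, last_used = samples[-1]
    remaining = max(capacity_bytes - last_used, 0.0)
    return remaining / slope
```

With samples of 10, 20 and 30 bytes used at hours 0, 1 and 2 and a capacity of 100, the fitted rate is 10 bytes per hour and the estimate is 7 hours. A small result is the cue to warn early, and it also shows when a disk is filling faster than anyone can reasonably react.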