Comment on CrowdStrike downtime apparently caused by update that replaced a file with 42kb of zeroes
CeeBee_Eh@lemmy.world 3 months agoSure, for a personal PC I would not necessarily want a BSOD, I’d prefer if it just booted and alerted the user. But for enterprise servers? Best not.
You have that backwards. I work as a dev and system admin for a medium sized company. You absolutely do not want any server to ever not boot. You absolutely want to know immediately that there’s an issue that needs to be addressed ASAP, but a loss of service generally means loss of revenue and, even worse, a loss of reputation. If you server is briefly at a lower protection level that’s not an issue unless you’re actively being targeted and attacked. But if that’s the case then getting notified of an issue can get some people to deal with it immediately.
ChairmanMeow@programming.dev 3 months ago
A single server not booting should not usually lead to a loss of service as you should always run some sort of redundancy.
I’m a dev for a medium-sized PSP that due to our customers does occasionally get targetted by malicious actors, including state actors. We build our services to be highly available, e.g. a server not booting would automatically do a failover to another one, and if that fails several alerts will go off so that the sysadmins can investigate.
Temporary loss of service does lead to reputational damage, but if contained most of our customers tend to be understanding. However, if a malicious actor could gain entry to our systems the damage could be incredibly severe (depending on what they manage to access of course), so much so that we prefer the service to stop rather than continue in a potentially compromised state. What’s worse: service disrupted for an hour or tons of personal data leaked?
Of course, your threat model might be different and a compromised server might not lead to severe damage. But Crowdstrike/Microsoft/whatever may not know that, and thus opt for the most “secure” option, which is to stop the boot process.