Comment

Comment on Western Digital details 14-platter 3.5-inch HAMR HDD designs with 140 TB and beyond

At some point a drive can get TOO big

I was thinking the same. I would hate to toast a 140 TB drive. I think I’d just sit right down and cry. I’ll stick with my 10 TB drives.

source

Sort:hotnew top

rtxn@lemmy.world ⁨2⁩ ⁨weeks⁩ ago
This is not meant for human beings. A creature that needs over 140 TB of storage in a single device can definitely afford to run them mirrored with hot swaps.

source
- thejml@sh.itjust.works ⁨2⁩ ⁨weeks⁩ ago
  Rebuild time is the big problem with this in a RAID Array. The interface is too slow and you risk losing more drives in the array before the rebuild completes.
  
  source
  - rtxn@lemmy.world ⁨2⁩ ⁨weeks⁩ ago
    Realistically, is that a factor for a Microsoft-sized company, though? I’d be shocked if they only had a single layer of redundancy. Whatever they store is probably replicated between hosts and datacenters several times, to the point where losing an entire RAID array (or whatever media redundancy scheme they use) is just a small inconvenience.
    
    source
    enumerator4829@sh.itjust.works ⁨2⁩ ⁨weeks⁩ ago
    Fairly significant factor when building really large systems. If we do the math, there ends up being some relationships between
    
    disk speed
    
    targets for ”resilver” time / risk acceptance
    
    disk size
    
    failure domain size (how many drives do you have per server)
    
    network speed
    
    Basically, for a given risk acceptance and total system size there is usually a sweet spot for disk sizes.
    
    Say you want 16TB of usable space, and you want to be able to lose 2 drives from your array (fairly common requirement in small systems), then these are some options:
    
    3x16TB triple mirror
    
    4x8TB Raid6/RaidZ2
    
    6x4TB Raid6/RaidZ2
    
    The more drives you have, the better recovery speed you get and the less usable space you lose to replication. You also get more usable performance with more drives. Additionally, smaller drives are usually cheaper per TB (down to a limit).
    
    This means that 140TB drives become interesting if you are building large storage systems (probably at least a few PB), with low performance requirements (archives), but there we already have tape robots dominating.
    
    The other interesting use case is huge systems, large number of petabytes, up into exabytes. More modern schemes for redundancy and caching mitigate some of the issues described above, but they are usually onlu relevant when building really large systems.
    
    tl;dr: arrays of 6-8 drives at 4-12TB is probably the sweet spot for most data hoarders.
    
    source
    brygphilomena@lemmy.dbzer0.com ⁨2⁩ ⁨weeks⁩ ago
    I’d imagine they are using ceph or similar.
    
    You have disk level protection for servers. Server level protection for racks. Rack level protection for locations. Location level protection for datacenters. Probably datacenter level protections for geographic regions.
    
    It’s fucking wild when you get to that scale.
    
    source
    thejml@sh.itjust.works ⁨2⁩ ⁨weeks⁩ ago
    True, but that’s going to really be pushing your network links just to recover. Realistically, something like ZFS or a RAID-6 with extra hot spares would help reduce the risks, but it’s still a non trivial amount of time. Not to mention the impact to normal usage during that time period.
    
    source
    -> View More Comments
- MonkeMischief@lemmy.today ⁨2⁩ ⁨weeks⁩ ago
  
  This is not meant for human beings.
  
  This is for like, Smaug but if he hoarded classic anime and the entirety of Steam or something. Lol
  
  source
gravitas_deficiency@sh.itjust.works ⁨2⁩ ⁨weeks⁩ ago
Yeah I’m running 16s and that’s pushing it imo

source