sobchak@programming.dev 1 day ago
HDDs have horrible random access times, so if you need to process or just copy a lot of small files, say photos, there’s a significant penalty.
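To put rough numbers on that penalty, here's a back-of-the-envelope sketch. The latency and throughput figures are my own ballpark assumptions (about 10 ms per random access and 150 MB/s for an HDD, about 0.1 ms and 500 MB/s for a SATA SSD), not measurements, and the function name is made up for illustration.

```python
def estimated_read_seconds(num_files, avg_file_bytes, access_latency_s, throughput_bytes_per_s):
    """Crude model: each file costs one random access plus its sequential transfer time."""
    per_file_s = access_latency_s + avg_file_bytes / throughput_bytes_per_s
    return num_files * per_file_s

# Assumed workload: one million photos of about 100 KB each.
NUM_FILES = 1_000_000
AVG_FILE_BYTES = 100_000

# Assumed device characteristics (ballpark figures, not benchmarks).
hdd_seconds = estimated_read_seconds(NUM_FILES, AVG_FILE_BYTES, 0.010, 150e6)
ssd_seconds = estimated_read_seconds(NUM_FILES, AVG_FILE_BYTES, 0.0001, 500e6)

print(f"HDD: ~{hdd_seconds / 3600:.1f} h, SSD: ~{ssd_seconds / 60:.1f} min")
```

Under those assumptions the HDD spends most of its time seeking rather than transferring data, which is why lots of small files hurt so much more than one big file of the same total size.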
Gladaed@feddit.org 1 day ago
Ok, but what are they doing that moves loads of random files?
sobchak@programming.dev 1 day ago
Rsync, Syncthing, backups, MP3s, photos, JSON files; idk, a lot of tasks involve large numbers of small files. I personally ran into this problem training models on millions of photos. My GPUs would only get up to 25% utilization with mirrored HDDs, so I had to switch to SSDs.
Gladaed@feddit.org 1 day ago
Why are you doing that on network storage as opposed to on-device?