What I do right now is run an rclone sidecar container that uploads the files in a directory every few seconds, plus an init sidecar that runs before the main application and downloads those files (including SQLite DBs) to the normal disk. This works okay but feels pretty clunky, and it can still result in corruption because I’m just copying the DB files while they are in use, rather than using SQLite commands to back up the DB to a separate file first.
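For reference, a minimal sketch of what that "back up to a separate file first" step could look like, assuming Python's standard `sqlite3` module is available in the sidecar. The function name `snapshot_sqlite` and the file paths are made up for illustration; rclone would then upload the snapshot instead of the live file.

```python
import sqlite3


def snapshot_sqlite(src_path: str, dest_path: str) -> None:
    """Copy a (possibly in-use) SQLite DB to dest_path via the online backup API.

    Hypothetical helper: the name and paths are illustrative, not from any
    existing setup.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Connection.backup copies the database page by page and yields a
        # consistent snapshot even if other connections have the source open.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

Something like `snapshot_sqlite("/data/app.db", "/backup/app.db")` on a timer, with rclone pointed at `/backup`, would avoid uploading a half-written file.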
How do you handle a job moving from one Nomad node to another? Or do you pin jobs like Grafana to specific hosts?
nopersonalspace@lemmy.world 9 months ago
That’s an interesting issue. Do you think the problem would be the same for any CSI plugin? I’m thinking of using my NAS as the storage brains of the operation and hooking it up with NFS or something, but would that have issues with stateful stuff like DBs too?
nico@r.dcotta.eu 9 months ago
I have never used NFS, but I think it would fare much better than seaweedfs, which uses FUSE to implement CSI. For NFS, I am sure the protocol takes half-finished writes into account.
No, it would depend on the CSI plugin and how it is implemented. Ceph, for example, has several that I know of, and cloud providers offer CSI volumes for their block storage (AWS EBS, GCP PD); they will all perform differently. See this comment from a seaweedfs issue:
I found it was easier to make recoverable, backed up, host volumes than to make DBs run on high availability filesystems like seaweedfs (I admit I have not tried Ceph - the deployment looked a bit complicated/overkill for a homelab).
Postgres and SQLite are just not made for that environment. To run a high-availability DB, it is better to run a distributed DB made for that (think etcd, Cassandra) than to run a non-distributed DB on top of a distributed filesystem.
Good luck! :)