Comment on [deleted]

Jason2357@lemmy.ca 1 week ago

Parquet is great, especially if there is a reasonable way to partition the records, for example by month or year, so that a search limited to 2024 only touches that partition. Because it is a columnar format, you only do I/O on the specific variables (columns) you are concerned with. If you can also partition the records and read only a fraction of them, operations can be extremely efficient.
