Comment on [deleted]

wise_pancake@lemmy.ca 1 week ago

If possible, convert those files to compressed Parquet, and sort and partition them.

I’ve gotten 10-100 GB CSV files down to 300 MB-5 GB just by doing that.

That makes searching and scanning so much faster, and you can do it all with free, open-source software like Polars and Ibis.
