If possible, convert those files to compressed Parquet, and apply sorting and partitioning to them.
I’ve gotten 10-100 GB CSV files down to 300 MB-5 GB just by doing that.
That makes searching and scanning much faster, and you can do it all with free, open-source software like Polars and Ibis.
Treczoks@lemmy.world 9 months ago
This depends on what you are actually looking for, and how you are looking for it.
Do you really need pattern matching, or do you only look for fixed strings? If fixed strings are enough, other tools may be faster.
If you need case-insensitive search on a mixed-case data set, make a copy that is all uppercase or all lowercase, and search there.
If you only search in certain columns, make a copy that only includes those columns.
Or import the data into a database.
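The copy-based suggestions above could look roughly like this with just the Python standard library. Both function names are made up for this example, and it assumes a UTF-8 CSV with a header row. The search is a plain substring test with no regex engine involved.

```python
import csv


def make_search_copy(src_path, dst_path, columns):
    """Write a lowercased copy of src_path that keeps only the given columns."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        writer.writerow(columns)  # header is written as given
        for row in reader:
            writer.writerow([row[c].lower() for c in columns])


def find_fixed(path, needle):
    """Return lines containing needle as a plain substring (no regex)."""
    needle = needle.lower()  # the copy is all lowercase, so lower the query too
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if needle in line]
```

The lowercasing cost is paid once when the copy is made, so each later search is a simple case-sensitive scan over a smaller file.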