
I think the counterargument here is that you're now including a CSV decoder in every CSV data file. At the data sizes we're talking about, that's negligible overhead, but it still seems overly complicated to me, almost like it's trying too hard to be clever.

How many different storage format implementations will there realistically be?



> How many different storage format implementations will there realistically be?

Apparently an infinite number, if we go with the approach in the paper /s


It does open up the possibility of specialized compressors for the data in the file, which might be interesting for archiving, where an improved compression ratio is worth a lot.


That makes sense. I think fundamentally you're trading off space between the compressed data and the lookup tables stored in your decompression code. I can see that amortizing well if the compressed payloads are large, or if there are many payloads with the same distribution of sequences.
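To make the amortization point concrete, here's a small sketch (my own illustration, not from the paper) using zlib's preset-dictionary support as a stand-in for the "lookup tables": shipping the dictionary alongside every small payload costs you its size each time, while sharing it once across n payloads costs it only once. All the names and numbers below are assumptions for illustration.

```python
import zlib

# A preset dictionary standing in for the decoder's lookup tables.
# (Hypothetical repeated-header data; real dictionaries would be trained.)
shared_dict = b"timestamp,sensor_id,reading\n" * 4

def compress(payload: bytes, zdict: bytes) -> bytes:
    """Compress with an optional preset dictionary."""
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(payload) + c.flush()

payload = b"timestamp,sensor_id,reading\n2024-01-01,7,3.14\n" * 50
with_dict = compress(payload, shared_dict)

# Round-trip check: the decoder needs the same dictionary to decompress.
d = zlib.decompressobj(zdict=shared_dict)
assert d.decompress(with_dict) + d.flush() == payload

# Embedding the dictionary in every file adds len(shared_dict) per payload;
# storing it once amortizes that fixed cost across all n payloads.
n = 1000
per_file_total = n * (len(with_dict) + len(shared_dict))
amortized_total = len(shared_dict) + n * len(with_dict)
print(per_file_total, amortized_total)
```

The inequality only flips in favor of embedding when the fixed decoder/dictionary cost is tiny relative to each payload, which matches the point above: large payloads, or many payloads with the same sequence distribution, are where this pays off.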



