Working with this is pretty painful, so I convert the pickled structure to other formats, including JSON.
Prettified, the file has always been around 500MB, but it recently expanded to about 3GB, I think because they’ve added extra regional parameters.
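The file inflates that much because Pickle memoizes objects it has already written and stores later occurrences as short back-references, whereas that deduplication is lost in JSON, where every occurrence is written out in full.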
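To illustrate the size difference, here is a minimal Python sketch, assuming the structure reuses the same object in many places. The `shared` dict and the `data` layout are made up for illustration and are not the real file's schema.

```python
import json
import pickle

# Hypothetical stand-in for a block of regional parameters.
shared = {"region": "example", "params": list(range(200))}

# 1,000 entries that all point at the very same dict object.
data = {"entries": [shared] * 1000}

# pickle writes `shared` once and refers back to it afterwards.
pickled = pickle.dumps(data)

# JSON has no notion of references, so the dict is written out 1,000 times.
as_json = json.dumps(data)

# After a pickle round trip the entries are still one shared object.
restored = pickle.loads(pickled)
same_after_pickle = restored["entries"][0] is restored["entries"][1]

# After a JSON round trip each entry is an independent copy.
roundtrip = json.loads(as_json)
same_after_json = roundtrip["entries"][0] is roundtrip["entries"][1]

print(f"pickle: {len(pickled):,} bytes, JSON: {len(as_json):,} bytes")
print(f"shared after pickle: {same_after_pickle}, after JSON: {same_after_json}")
```

The exact byte counts depend on the Python version and pickle protocol, but the JSON output should come out far larger than the pickle, since it repeats the shared block for every entry.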
I care about speed and about tools not choking on large inputs, so I use jaq for querying and instruct LLMs operating on the data to do the same.