I'm not an expert in this area, but I suspect the hard part isn't data collection itself. It's keeping the data secure, usable, and easy to maintain over many years.
A custom binary format can work, but it is likely to become a long-term maintenance commitment: you own schema evolution, tooling, and corruption recovery yourself.
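
To make that commitment a bit more concrete, here is a rough sketch of the minimum framing I would expect a custom format to need: a marker, a schema version, a length, and a checksum per record. Everything in it (the `DLOG` marker, the field layout, the function names) is made up for illustration and is not based on your actual setup.

```python
import struct
import zlib
from typing import Tuple

MAGIC = b"DLOG"                    # hypothetical 4-byte record marker
HEADER = struct.Struct("<4sHI")    # magic, schema version, payload length
TRAILER = struct.Struct("<I")      # CRC32 over header + payload


def encode_record(version: int, payload: bytes) -> bytes:
    """Frame a payload with a versioned header and a CRC32 trailer."""
    header = HEADER.pack(MAGIC, version, len(payload))
    crc = zlib.crc32(header + payload)
    return header + payload + TRAILER.pack(crc)


def decode_record(buf: bytes) -> Tuple[int, bytes]:
    """Return (schema version, payload), or raise ValueError on damage."""
    if len(buf) < HEADER.size + TRAILER.size:
        raise ValueError("record too short")
    magic, version, length = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    end = HEADER.size + length
    if len(buf) < end + TRAILER.size:
        raise ValueError("truncated record")
    (crc,) = TRAILER.unpack_from(buf, end)
    if zlib.crc32(buf[:end]) != crc:
        raise ValueError("checksum mismatch")
    # The caller dispatches on `version` to pick the right payload parser.
    return version, buf[HEADER.size:end]
```

Even this small amount only detects corruption and tells a reader which schema to apply. Actual recovery (resyncing on the next marker, migrating old versions, inspection tools) is extra work on top, which is the part that tends to accumulate over the years.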