Preferred
1. Platform-independent, character-based formats are preferred over native or binary formats as long as data is complete, and retains full detail and precision. Preferred formats include well-developed, widely adopted, de facto marketplace standards, e.g.
a. Formats using well known schemas with public validation tool available
b. Line-oriented, e.g. TSV, CSV, fixed-width
c. Platform-independent open formats, e.g. .db, .db3, .sqlite, .sqlite3
2. Any proprietary format that is a de facto standard for a profession or supported by multiple tools (e.g. Excel .xls or .xlsx, Shapefile)
3. Character Encoding, in descending order of preference:
a. UTF-8, UTF-16 (with BOM),
b. US-ASCII or ISO 8859-1
c. Other named encoding
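For anyone who wants to check where a text file falls in that encoding preference order, here is a rough Python sketch. The `classify_encoding` helper and its labels are made up for illustration (they are not from the LoC statement), and this is a heuristic: bytes that fail UTF-8 decoding cannot be told apart as ISO 8859-1 versus some other named encoding without outside knowledge.

```python
import codecs


def classify_encoding(data: bytes) -> str:
    """Hypothetical helper: label bytes by the most-preferred encoding that fits."""
    # UTF-16 only counts when a byte-order mark is present.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        try:
            data.decode("utf-16")
            return "UTF-16 (with BOM)"
        except UnicodeDecodeError:
            pass  # BOM-like prefix but not valid UTF-16; fall through
    # Pure 7-bit data is US-ASCII, which is also valid UTF-8.
    if data.isascii():
        return "US-ASCII"
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        # Cannot be distinguished further from the bytes alone.
        return "not UTF-8 (ISO 8859-1 or other named encoding)"


if __name__ == "__main__":
    print(classify_encoding("héllo".encode("utf-8")))
    print(classify_encoding("hi".encode("utf-16")))
```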
---
Acceptable
For data (in order of preference):
1. Non-proprietary, publicly documented formats endorsed as standards by a professional community or government agency, e.g. CDF, HDF
2. Text-based data formats with available schema
For aggregation or transfer:
1. ZIP, RAR, tar, 7z with no encryption, password or other protection mechanisms.
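If you want to sanity-check the "no encryption" requirement for a ZIP before depositing it, a minimal Python sketch follows. The `encrypted_members` helper is a made-up name; it only looks at bit 0 of each entry's general-purpose flags (the per-file encryption flag in the ZIP format), so it is not a complete check for every protection mechanism, and it does not cover RAR, tar or 7z.

```python
import io
import zipfile


def encrypted_members(zip_source) -> list[str]:
    """Hypothetical helper: names of ZIP entries whose encryption flag is set."""
    with zipfile.ZipFile(zip_source) as zf:
        # Bit 0 of the general-purpose flag marks an encrypted entry.
        return [info.filename for info in zf.infolist() if info.flag_bits & 0x1]


if __name__ == "__main__":
    # Build a small unencrypted archive in memory and inspect it.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("data.csv", "a,b\n1,2\n")
    buf.seek(0)
    print(encrypted_members(buf))
```

An empty list means no entry carries the encryption flag; anything else names the entries that would need to be repacked without a password.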
https://www.loc.gov/preservation/resources/rfs/data.html
[0]: https://7-zip.org/7z.html
[1]: CVE-2025-0411
What are the advantages or reasons to use zstd in a 7z container versus just .zst?
Why use it w/ 7-Zip though? 7-Zip archives multiple files/directories and supports encryption. It has a UI too. On Windows there is NanaZip, which is available in the Microsoft Store and has been graced by corporate for user-install (unlike zstd, which effectively needs WSL), and most folks won't be able to use the command-line tool.
Of course, using tar with zstd is always an option if you are on Linux.