If I point "tape archive" at a file system, I want that file system archived to tape. And so, tar does.
If I don't, well, that's a fine option, and there's a fine option for that.
So it's less a "workaround" or something that "gets worse" than "No, I don't really want a tape archive of this whole filesystem, only of some of it." And that's supported.
That said, never seeing another .DS_Store should be a system-wide option!
Principle of least surprise is good engineering practice. The question is always whose surprise. Someone who expects tar to behave like other UNIX systems is going to be surprised by this. Someone who expects tar on Apple to have perfect fidelity would be surprised by not-this.
I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies. You can't have this problem if your packaging system pulls in a specific portable `tar` library.
What is worse is that these utilities do not give any warnings when they do not make complete copies. For cp, the root cause is that it has bad default options, while for tar and cpio the standard file formats cannot store the metadata of modern file systems.
The various tar programs have their own different file format extensions to deal with modern file systems, which are guaranteed to work only when using the same tar program for both creation and extraction. The better tar programs implement both their own file format extensions and the file format extensions used by other popular tar programs.
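To see which extensions a given archive actually relies on, one option is to list the vendor-prefixed pax keywords on each member. This is a minimal sketch using Python's standard `tarfile` module. The helper names are made up for illustration, and the `SCHILY.xattr.` key in the demo is hand-written to mimic the prefix that, as far as I know, GNU tar and libarchive use for extended attributes; it is not produced by a real tar run here.

```python
import io
import tarfile


def vendor_pax_keys(archive_bytes):
    """Map each member name to the sorted vendor-specific pax keys it carries."""
    found = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tar:
        for member in tar:
            # Standard pax keywords (path, mtime, size, ...) are lowercase;
            # vendor extensions use an uppercase "VENDOR." prefix.
            keys = sorted(
                key for key in member.pax_headers
                if "." in key and key.split(".", 1)[0].isupper()
            )
            if keys:
                found[member.name] = keys
    return found


def build_demo_archive():
    """Build a tiny in-memory pax archive with one hand-made vendor key."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        data = b"hello\n"
        info = tarfile.TarInfo("notes.txt")
        info.size = len(data)
        # Assumption: this key only imitates what an xattr-aware tar writes.
        info.pax_headers = {"SCHILY.xattr.user.comment": "draft"}
        tar.addfile(info, io.BytesIO(data))
        tar.addfile(tarfile.TarInfo("plain.txt"))  # no extensions at all
    return buf.getvalue()
```

Calling `vendor_pax_keys(build_demo_archive())` reports only `notes.txt`. Note that `tarfile` merely exposes these keys as strings; it does not restore xattrs or ACLs from them on extraction.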
The author of TFA used some obsolete tar program, which is the cause of the surprising behavior that was seen.
To avoid loss of data on Linux, I always use the PAX file format instead of tar or cpio, with the extensions implemented by "bsdtar --create --format=pax" from libarchive, and I always alias cp to '/bin/cp --no-dereference --recursive --one-file-system --preserve=all --strip-trailing-slashes --verbose --interactive', where cp has been built with extended attributes support.
They shouldn’t. The GNU tar manual already shows this behavior. https://www.gnu.org/software/tar/manual/html_node/What-tar-D...:
“Because the archive created by tar is capable of preserving file information and directory structure, tar is commonly used for performing full and incremental backups of disks”
And yes, that same page also says:
“You can create an archive on one system, transfer it to another system, and extract the contents there. This allows you to transport a group of files from one system to another.”
> You can't have this problem if your packaging system pulls in a specific portable `tar` library.
You can’t pull in specific portable stuff all the way down (not even when running in Docker or a VM), so that will decrease the risk, but it cannot completely remove it. As an example, I think GNU tar will happily include .DS_Store files in archives.
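If the build does pull in its own archiver, the exclusion has to be explicit, since a portable tar will pack whatever is on disk. A rough sketch with Python's standard `tarfile` module; `skip_mac_cruft` and `make_clean_archive` are hypothetical names, and dropping every `._*` file is an assumption that could also drop a legitimate file with that prefix.

```python
import os
import posixpath
import tarfile


def skip_mac_cruft(tarinfo):
    """tarfile filter callback: returning None leaves the member out."""
    base = posixpath.basename(tarinfo.name)  # member names always use "/"
    if base == ".DS_Store" or base.startswith("._"):
        return None
    return tarinfo


def make_clean_archive(src_dir, archive_path):
    """Write src_dir to a gzip-compressed pax archive without Finder files."""
    top = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(archive_path, "w:gz", format=tarfile.PAX_FORMAT) as tar:
        tar.add(src_dir, arcname=top, filter=skip_mac_cruft)
```

Something like `make_clean_archive("dist", "dist.tar.gz")` would then produce the same member list regardless of which host the build runs on. It filters files on disk only; it does nothing about metadata a native tar would synthesize from extended attributes.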
See: the permanent undismissable red icon to "finish setting up your Apple TV with your iPhone"
I think that the surprise of more data than expected is more desirable than the surprise of data loss. So in this case, it seems like the safe choice.
But in this case, I think what it's doing is… basically fine? "Tar should faithfully reproduce the semantics of the source filesystem" is a perfectly reasonable starting point.
Ideally there would be a documented way to turn off the Apple-specific metadata with Apple's own tar, though.
--no-mac-metadata
(x mode only) Mac OS X specific. Do not archive or extract ACLs
and extended file attributes using copyfile(3) in AppleDouble
format. This is the reverse of --mac-metadata. and the default
behavior if tar is run as non-root in x mode.
Well, you see, while this frankly applies not just to build systems but to most software, the consensus among distro maintainers is that it's actually wrong: you should use your system's package manager and the tools it can install, and let it fiddle with the ambient environment and give you that delicious "path dependency". And if your distro's packaging environment doesn't let you do the things you need (e.g. installing both mongodb 3.8 and mongodb 5.0, ideally at the same time; okay, I can keep running apt remove/install over and over, but I do need to check whether my app correctly handles the wire protocol changes), well, that's your problem for desiring strange things.
That said, a lot of work is done in content-addressed hashing, but AFAIK it’s not the default yet.
The traditional UNIX tar and cpio utilities cannot archive the modern Linux file systems without loss of metadata.
Most modern tar programs implement various file format extensions as a workaround for this, but the extensions may be incompatible between distinct tar programs and frequently they are very poorly documented.
Some years ago, libarchive was the only archiver available on Linux that guaranteed lossless backups for the Linux file systems, e.g. xfs or ext4 (and also lossless file transfers between Linux file systems and FreeBSD file systems). Therefore that is what I have been using on Linux since then.
Presumably since then GNU tar and other tar programs should have caught up with it, but I have not verified this.
Whichever tar program was used in TFA, it was an obsolete tar program, and that was the real problem, not that the archives had been created on an Apple computer.
Yes please.
These can all die in a fire too, as far as I am concerned. macOS loves to treat the user's filesystem as its own personal garbage dump.
Filesystem attributes are for decorating files with meaning. Anything else that attempts to use filesystems in "interesting" ways is silly.
Apple and MS really ought to consider why they do this sort of fragile, idiosyncratic nonsense.
A "Centralized thumbnail cache" in the user profile folder, where it's been for a long while.
https://en.wikipedia.org/wiki/Windows_thumbnail_cache
> so that it always perfectly mirrors
Who cares? It's a cache.
Windows has been storing the thumbnail cache in the user profile folder since Vista (2006).
It's been 20 years. Time to let it go.
If you want a faithful archive of the data then a tar archive or disk image is what you want.
Linux developers already do. Using a BSD can already be a pain in the arse, thanks to (often poorly thought out) Linux-isms cropping up everywhere.
The problem described in TFA is not specific to Apple: the same problem appears when archiving any decent filesystem designed during the last three decades rather than half a century ago, including all Linux file systems.
The problem described in TFA is not caused by Apple, but by the author using an obsolete tar program and not being aware of this.
The traditional tar file format cannot store a lot of the metadata that is contained in modern file systems (e.g. high resolution timestamps, access control lists, extended file attributes), so it is useless for such file systems.
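The timestamp part of this is easy to reproduce without touching a real filesystem. A small sketch with Python's standard `tarfile` module, which can write both the classic ustar header and the pax format; `roundtrip_mtime` is a made-up helper, and this only shows how `tarfile` itself behaves, not any particular tar binary.

```python
import io
import tarfile


def roundtrip_mtime(mtime, fmt):
    """Write one empty member with the given mtime and read its mtime back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo("file.txt")
        info.mtime = mtime
        tar.addfile(info)
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r:") as tar:
        return tar.getmember("file.txt").mtime


original = 1700000000.5  # half a second past a whole second
from_ustar = roundtrip_mtime(original, tarfile.USTAR_FORMAT)
from_pax = roundtrip_mtime(original, tarfile.PAX_FORMAT)
print(original, from_ustar, from_pax)
```

The ustar header has only an octal whole-seconds field, so the fraction is dropped on the way in, while the pax writer puts the full value into an extended header record. ACLs and extended attributes are a separate matter that `tarfile` does not archive in either format.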
Most modern "tar" implementations have added extensions to the tar file format, to make it usable with modern file systems, such as Linux XFS or Linux EXT4. But many of these extensions are incompatible between themselves, so certain tar files can be fully extracted only with the same tar program that has created them.
I strongly recommend against using the old tar or cpio file formats. Even with various extensions it is not guaranteed that they always work correctly.
I always use only the pax file format, which has also required extensions in order to work with the modern file systems, but the pax extensions are cleaner than those for tar, because the file format is better designed.
Libarchive, which was mentioned in TFA, is available in most Linux distributions or it can be built from source on any Linux computer. It provides an executable that is preferable to tar (better invoked as "bsdtar --format=pax") for the backup or transfer of any Linux files.
I have not recently checked GNU tar or the other tar programs available on Linux, and I hope that they have since been upgraded to archive Linux file systems losslessly, but some years ago that was not true, so using tar or cpio on Linux could easily corrupt the archived files.
The warning can be suppressed by `--no-xattrs --no-mac-metadata`.
then just edited the code as:
- tar czf dist.tar.gz dist
+ COPYFILE_DISABLE=1 tar czf dist.tar.gz dist
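To confirm a change like this actually took effect, a build step could inspect the resulting archive for leftover sidecar entries. A minimal sketch using Python's standard `tarfile` module; `find_mac_metadata` is a hypothetical helper, and treating every `._*` member as an AppleDouble file is an assumption based on the naming convention alone.

```python
import posixpath
import tarfile


def find_mac_metadata(archive_path):
    """Return member names that look like AppleDouble sidecars or .DS_Store."""
    suspicious = []
    with tarfile.open(archive_path, "r:*") as tar:  # "r:*" autodetects gzip etc.
        for name in tar.getnames():
            base = posixpath.basename(name)
            if base.startswith("._") or base == ".DS_Store":
                suspicious.append(name)
    return suspicious
```

A CI job could fail when `find_mac_metadata("dist.tar.gz")` returns a non-empty list. This catches only metadata stored as separate members, not anything carried in pax extended headers.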