> Arguably, principle of least surprise is very Apple.

Principle of least surprise is good engineering practice. The question is always whose surprise. Someone who expects tar to behave like other UNIX systems is going to be surprised by this. Someone who expects tar on Apple to have perfect fidelity would be surprised by not-this.

I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies. You can't have this problem if your packaging system pulls in a specific portable `tar` library.

reply
What should be really surprising for the users of UNIX-like operating systems is when they lose data because traditional UNIX utilities like cp, tar or cpio do not make complete copies of files, as one would expect from their description.

What is worse is that these utilities do not give any warnings when they do not make complete copies. For cp, the root cause is that it has bad default options, while for tar and cpio the standard file formats cannot store the metadata of modern file systems.

The various tar programs have their own different file format extensions to deal with modern file systems, which are guaranteed to work only when using the same tar program for both creation and extraction. The better tar programs implement both their own file format extensions and the file format extensions used by other popular tar programs.

The author of TFA used an obsolete tar program, which is what caused the surprising behavior.

To avoid loss of data on Linux, I always use the PAX file format instead of tar or cpio, with the extensions implemented by "bsdtar --create --format=pax" from libarchive, and I always alias cp to '/bin/cp --no-dereference --recursive --one-file-system --preserve=all --strip-trailing-slashes --verbose --interactive', where cp has been built with extended attributes support.
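
To illustrate what the pax extensions buy you, here is a minimal sketch using Python's standard `tarfile` module rather than bsdtar. The `SCHILY.xattr.user.comment` key imitates the naming that some tar programs use for extended attributes, but the key, its value and the file name are assumptions for illustration; `tarfile` does not read xattrs from disk by itself.

```python
import io
import tarfile

def roundtrip_headers(fmt):
    """Write one member carrying an extra pax record, then read it back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        data = b"hello\n"
        info = tarfile.TarInfo(name="notes.txt")  # hypothetical file name
        info.size = len(data)
        # Assumed key name, standing in for an extended attribute.
        info.pax_headers = {"SCHILY.xattr.user.comment": "draft"}
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return dict(tar.getmember("notes.txt").pax_headers)

pax = roundtrip_headers(tarfile.PAX_FORMAT)
ustar = roundtrip_headers(tarfile.USTAR_FORMAT)
print("pax keeps the record:  ", pax.get("SCHILY.xattr.user.comment"))
print("ustar keeps the record:", ustar.get("SCHILY.xattr.user.comment"))
```

The ustar writer has nowhere to put the extra record, so it silently drops it, which is the same kind of quiet loss described above.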

reply
> Someone who expects tar to behave like other UNIX systems is going to be surprised by this

They shouldn’t. The GNU tar manual already shows this behavior. https://www.gnu.org/software/tar/manual/html_node/What-tar-D...:

“Because the archive created by tar is capable of preserving file information and directory structure, tar is commonly used for performing full and incremental backups of disks”

And yes, that same page also says:

“You can create an archive on one system, transfer it to another system, and extract the contents there. This allows you to transport a group of files from one system to another.”

> You can't have this problem if your packaging system pulls in a specific portable `tar` library.

You can’t pull in specific portable stuff all the way down (not even when running in Docker or a VM), so that will decrease the risk, but it cannot completely remove it. As an example, I think GNU tar will happily include .DS_Store files in archives.
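
For what it's worth, a pulled-in tar library at least lets you decide what goes into the archive. A rough sketch with Python's standard `tarfile` module, assuming you want to drop `.DS_Store` and AppleDouble `._*` entries at creation time; the helper names and the sample file names are made up for the example.

```python
import io
import os
import tarfile
import tempfile

def drop_mac_metadata(info):
    """Filter for TarFile.add: return None to skip a member, info to keep it."""
    base = os.path.basename(info.name)
    if base == ".DS_Store" or base.startswith("._"):
        return None
    return info

def archive_names(root):
    """Archive `root` into memory and return the member names that were stored."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        tar.add(root, arcname="proj", filter=drop_mac_metadata)
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return sorted(tar.getnames())

with tempfile.TemporaryDirectory() as root:
    # Hypothetical project directory with one real file and two macOS leftovers.
    for name in ("main.c", ".DS_Store", "._main.c"):
        with open(os.path.join(root, name), "wb") as fh:
            fh.write(b"x")
    names = archive_names(root)

print(names)
```

This only controls what the library writes; whatever the host filesystem or Finder leaves lying around is still there to be filtered, which is the "not all the way down" point.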

reply
Apple is always surprised that non-Apple devices exist.

See: the permanent undismissable red icon to "finish setting up your Apple TV with your iPhone"

reply
Apple can't control non-Apple devices. They can only control their own. So this makes perfect sense.
reply
They could control their own Apple TVs to allow that dialogue to be dismissed via the TV controls.
reply
Agreed, but why not just finish setting it up? Or do people own Apple TVs without iPhones? That never occurred to me, since a large part of the value prop is phone integration.
reply
No, the value prop is a streaming device with a clean UX not filled with ads. My phone (which is not an iPhone) has nothing to do with it. Apple TV is a far better YouTube device than Google TV. It's also the best device for Plex, Netflix, and all the streaming apps.
reply
What integrations do you use? I can't really think of what I would miss on the Apple TV if I switched from iPhone. I rarely use AirPlay, disable Photos for in-house privacy reasons, and… oh yeah, the remote control for keyboard, volume, and navigation via iPhone is neat! I think the Apple TV is just a strong product on its own.
reply
Yes, I believe it's possible to buy an Apple TV without owning an iPhone.
reply
> The question is always whose surprise.

I think that the surprise of more data than expected is more desirable than the surprise of data loss. So in this case, it seems like the safe choice.

reply
Agreed. I usually hate on Apple, and its terribly ancient utilities and gratuitous incompatibility with modern Linux utilities, motivated by hatred of the GPL license.

But in this case, I think what it's doing is… basically fine? "Tar should faithfully reproduce the semantics of the source filesystem" is a perfectly reasonable starting point.

Ideally there would be a documented way to turn off the Apple-specific metadata with Apple's own tar, though.

reply
From tar(1):

     --no-mac-metadata
             (x mode only) Mac OS X specific.  Do not archive or extract ACLs
             and extended file attributes using copyfile(3) in AppleDouble
             format.  This is the reverse of --mac-metadata.  and the default
             behavior if tar is run as non-root in x mode.
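
If you receive an archive that was made without that flag, the AppleDouble entries can be spotted by content rather than by name. A small sketch with Python's `tarfile`, relying on the AppleDouble magic number 0x00051607 at the start of the file; the sample archive below is synthetic (only a fake header, not a complete AppleDouble file) and the function name is my own.

```python
import io
import struct
import tarfile

APPLEDOUBLE_MAGIC = 0x00051607  # first four bytes of an AppleDouble file, big-endian

def appledouble_members(tar):
    """Return names of regular members whose content starts with the AppleDouble magic."""
    found = []
    for info in tar.getmembers():
        if not info.isfile() or info.size < 4:
            continue
        head = tar.extractfile(info).read(4)
        if struct.unpack(">I", head)[0] == APPLEDOUBLE_MAGIC:
            found.append(info.name)
    return found

# Build a synthetic archive: one ordinary file and one fake AppleDouble sidecar.
samples = {
    "photo.jpg": b"\xff\xd8\xff\xe0 not really a jpeg",
    "._photo.jpg": struct.pack(">II", APPLEDOUBLE_MAGIC, 0x00020000) + b"\0" * 16,
}
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name, data in samples.items():
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    sidecars = appledouble_members(tar)

print(sidecars)
```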
reply
> I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies.

Well, you see, while this, frankly, applies not just to build systems but to most software, the consensus in the community of distro-maintainers is that it's actually wrong: you should use your system's package manager, and the tools it can install, and let it fiddle with the ambient environment and give you that delicious "path dependency". And if your distro's packaging environment doesn't allow you to do the things you need (e.g. being able to install both mongodb 3.8 and mongodb 5.0, ideally at the same time, but okay, I can keep running apt remove/install over and over, but I do need to check if my app correctly handled the wire protocol changes), well, that's your problem for desiring strange things.

reply
NixOS has a pretty solid solution to this issue: key your dependencies with checksums of the content. That way you get the best of both worlds: you always get the exact version you want, and you can share a copy of that exact version with other software that wants to use that exact version too!
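
A toy sketch of the idea as described here, in Python. This is not Nix's actual store layout or hashing scheme; the class, the SHA-256 choice and the "libfoo" payloads are assumptions for illustration only.

```python
import hashlib

class ContentStore:
    """Toy in-memory store that keys each blob by the SHA-256 of its bytes."""

    def __init__(self):
        self.blobs = {}

    def add(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(key, data)  # identical content is stored only once
        return key

    def get(self, key: str) -> bytes:
        data = self.blobs[key]
        # Re-check on read: the key doubles as an integrity check.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("store entry does not match its key")
        return data

store = ContentStore()
app_a_dep = store.add(b"libfoo 1.2.3 contents")  # hypothetical dependency
app_b_dep = store.add(b"libfoo 1.2.3 contents")  # same bytes, requested by another app
other_dep = store.add(b"libfoo 1.2.4 contents")  # a different version lives alongside

print(app_a_dep == app_b_dep, len(store.blobs))
```

Two consumers asking for the same bytes end up with the same key and share one copy, while a different version simply gets a different key.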
reply
So it sounds like you don’t get the exact version you want because metadata is thrown away.
reply
It's a checksum, not the content itself.
reply
Yeah, Nix-like distributions (e.g. Guix, Lix) do for Linux systems what some language package managers (e.g. cargo) do for individual projects.
reply
Are the xattr / chattr / umask checksums rolled into the main data fork content or are they hashed separately (or not at all)?
reply
IIRC Nix hashes the source of the content, not the results.
reply
Hash of a normalization of the derivation, so this roughly means source, dependencies and the ‘build recipe’. The exceptions are fixed-output derivations, which are typically content-hashed.

That said, a lot of work is being done on content-addressed hashing, but AFAIK it’s not the default yet.
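
Roughly, the difference between the two kinds of key looks like this in Python. The JSON normalization, the function names and the sample recipes are stand-ins I made up; Nix's real derivation format and hashing differ.

```python
import hashlib
import json

def input_addressed(recipe: str, sources: list, dep_keys: list) -> str:
    """Hash a normalized description of the build, not what the build produces."""
    normalized = json.dumps(
        {"recipe": recipe, "sources": sorted(sources), "deps": sorted(dep_keys)},
        sort_keys=True,
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def content_addressed(output: bytes) -> str:
    """Hash the produced bytes themselves, as for a fixed-output fetch."""
    return hashlib.sha256(output).hexdigest()

# Two hypothetical recipes that differ only in an irrelevant flag order...
key_1 = input_addressed("cc -O2 -Wall main.c", ["main.c:abc"], ["libc:123"])
key_2 = input_addressed("cc -Wall -O2 main.c", ["main.c:abc"], ["libc:123"])
# ...and happen to produce byte-identical output.
out_1 = content_addressed(b"\x7fELF identical binary")
out_2 = content_addressed(b"\x7fELF identical binary")

print(key_1 == key_2)  # the recipe text changed, so the input-addressed keys differ
print(out_1 == out_2)  # the output bytes match, so the content-addressed keys agree
```

With input addressing you can compute the key before building, at the cost of rebuilding whenever the recipe changes in ways that do not affect the result.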

reply
> I want that file system archived to tape. And so, tar does.

The traditional UNIX tar and cpio utilities cannot archive the modern Linux file systems without loss of metadata.

Most modern tar programs implement various file format extensions as a workaround for this, but the extensions may be incompatible between distinct tar programs and frequently they are very poorly documented.

Some years ago, libarchive was the only archiver available on Linux that guaranteed lossless backups of Linux file systems, e.g. xfs or ext4 (and also lossless file transfers between Linux file systems and FreeBSD file systems). That is why I have been using it on Linux ever since.

Presumably since then GNU tar and other tar programs should have caught up with it, but I have not verified this.

Whichever tar program was used in TFA, it was an obsolete tar program, and that was the real problem, not that the archives had been created on an Apple computer.

reply
If you think that most people who run the tar command are assuming it will work like a tape archive, you'll probably be the one surprised.
reply
> That said, never seeing another .DS_Store should be a system-wide option!

Yes please.

reply
.DS_Store, .fseventsd, .Spotlight-V100, .Trashes, and ._this and ._that

These can all die in a fire too, as far as I am concerned. macOS loves to treat the user's filesystem as its own personal garbage dump.

reply
thumbs.db and those weird MS alternative stream files for recording origination.

filesystem attributes are for decorating files with meaning. Anything else that attempts to use filesystems in "interesting" ways is silly.

Apple and MS really ought to consider why they do this sort of fragile, idiosyncratic nonsense.

reply
But... thumbs.db is precisely not an "attempt to use filesystems in 'interesting' ways"; it's literally just a hidden file with previews stored in it. Storing the preview in an alternate data stream of the picture file itself would be "an interesting way".
reply
Agreed. Where else would you put that stuff? It’s gotta go somewhere, and this is the least surprising place IMO. Anywhere else would have to be a parallel store that follows filesystem mounts and unmounts, renaming directories, etc. so that it always perfectly mirrors the thing it’s configuring.
reply
> Where else would you put that stuff?

A "Centralized thumbnail cache" in the user profile folder, where it's been for a long while.

https://en.wikipedia.org/wiki/Windows_thumbnail_cache

> so that it always perfectly mirrors

Who cares? It's a cache.

reply
In the particular case of thumbs.db, storing them in NTFS alternate data streams would have been a good idea; they're essentially caches for the main data stream, so if they fail to copy to different filesystems it's totally fine. Of course, that wasn't viable because 1) IIRC that was before the widespread adoption of NTFS, and 2) they probably still need the cache somewhere for vFAT USB drives.
reply
And .DS_Store is just your folder-level preferences in Finder. If you don’t use Finder, they won’t be created.
reply
> Thumbs.db

Windows has been storing thumbnail cache in the user profile folder since Vista (2006).

It's been 20 years. Time to let it go.

reply
OTOH, if you want the information contained in those files, where else would you save it?
reply
To me it seems more sensible to store information relevant only to this OS in a specific cache somewhere within that OS. It would even make cache-like functionality such as evicting old entries super easy.
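
Something like this, as a rough Python sketch of a central per-user cache. The class, the (path, mtime) key and the size limit are all assumptions for illustration, not how any real OS does it.

```python
from collections import OrderedDict

class ThumbnailCache:
    """Toy per-user cache keyed by (path, mtime); least recently used entries go first."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def get(self, path: str, mtime: float):
        key = (path, mtime)
        if key not in self.entries:
            return None  # unknown file, or the file changed since it was cached
        self.entries.move_to_end(key)  # mark as recently used
        return self.entries[key]

    def put(self, path: str, mtime: float, thumb: bytes) -> None:
        key = (path, mtime)
        self.entries[key] = thumb
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used entry

cache = ThumbnailCache(max_entries=2)
cache.put("/usb/a.jpg", 1.0, b"thumb-a")  # hypothetical paths on a removable drive
cache.put("/usb/b.jpg", 1.0, b"thumb-b")
cache.get("/usb/a.jpg", 1.0)              # touch a, so b becomes the oldest entry
cache.put("/usb/c.jpg", 1.0, b"thumb-c")  # over the limit: b is evicted

print(cache.get("/usb/b.jpg", 1.0))
```

Nothing gets written to the drive itself, and a stale entry costs a regenerated thumbnail at worst.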
reply
There are some tradeoffs. For example, if you set up folder colours or any of the other things stored in the file on a USB drive, they would not move along with the drive when it is used on another computer.
reply
If I set a folder colour in Finder on my work MacBook, and then plug that USB drive into my personal computer which uses Thunar as a file browser on Debian, nothing would happen.
reply