I was just mentioning the Japanese word mojibake on the plain-text thread (https://news.ycombinator.com/item?id=47897681), and here you give an example. In fact, UTF-8 misinterpreted as Windows-1252 is the mojibake I personally encounter most often. Curly quotes (usually a right apostrophe inside a word like can't, it's, or didn't) are the most common victims, with em dashes only slightly less common. The other direction (Windows-1252 text being read as UTF-8) produces � (U+FFFD) everywhere instead. Either way, I still see both from time to time today, but far, FAR less frequently than I used to back in the late 2000s or early 2010s. I used to see "â€”" and similar sequences all the time 15-20 years ago, and now it's rare enough that I actually notice when it happens.
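For anyone who wants to reproduce the two failure modes described above, here is a minimal Python sketch. The function names are made up for illustration, and it assumes Python's built-in `cp1252` codec is a close enough stand-in for what browsers and databases call Windows-1252.

```python
def utf8_read_as_cp1252(text: str) -> str:
    """Simulate UTF-8 bytes being decoded as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def cp1252_read_as_utf8(text: str) -> str:
    """Simulate Windows-1252 bytes being decoded as UTF-8.

    Invalid byte sequences become U+FFFD, as a lenient decoder would do.
    """
    return text.encode("cp1252").decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(utf8_read_as_cp1252("can\u2019t"))  # canâ€™t
    print(utf8_read_as_cp1252("\u2014"))      # â€”
    print(cp1252_read_as_utf8("can\u2019t"))  # can�t
```

Both the right apostrophe (U+2019) and the em dash (U+2014) encode to three UTF-8 bytes starting with E2 80, which is why the garbage always starts with the same two characters.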
This is probably the case of a bodged migration from one CMS to another.

My blog suffered the same, and going through loads of old pages to check and fix them just isn't worth the effort.

The archived version from 2012 shows the characters correctly, so it was probably some migration, like you said.

https://web.archive.org/web/20120319180000/https://fffff.at/...

The website itself has been closed since 2015 according to the front page.

https://fffff.at/

It also suffers from encoding problems that make weird characters show up.

But it was showing the characters correctly on August 1st, 2015, when the site was closing down.

https://web.archive.org/web/20150801234212/http://fffff.at/

Who wants to bet that at some point after the site closed, they switched from a live CMS to a static copy of the site, and things got a little screwed up in the process? Exporting data from a MySQL database can hit all sorts of encoding weirdness, depending on how the db schema was set up.

Yeah this really looks like an encoding issue during migration.

I've run into similar problems when moving old content between systems, especially with MySQL and mixed encodings. It can get messy surprisingly quickly.
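If the damage really is UTF-8 bytes that were decoded as Windows-1252 somewhere along the way (an assumption; nobody here knows what actually happened to this site), it can sometimes be undone by reversing the round trip. A rough Python sketch, with a hypothetical helper name:

```python
def repair_mojibake(text: str) -> str:
    """Try to undo UTF-8-read-as-Windows-1252 damage.

    Re-encodes the text as cp1252 to recover the original bytes, then
    decodes those bytes as UTF-8. If either step fails, the text was
    probably not damaged in this particular way, so it is returned as-is.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    print(repair_mojibake("shouldn\u00e2\u20ac\u2122t"))  # shouldn’t
    print(repair_mojibake("caf\u00e9"))                   # café (left alone)
```

This only works when no bytes were lost. Python's `cp1252` codec leaves a few bytes (such as 0x9D) unmapped, so some characters, for example a right double quote, may not survive the round trip, and text that was mangled more than once needs more than one pass.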

> Why shouldn’t we be able to?

I have no idea why but my brain immediately interpreted this as a Scottish accent, like ‘shouldnae’. Weird.

… because "â€" and "ae" are visually similar?