The Cebuano wiki is a similar case: the language isn't spoken often, but the wiki was the personal project of an editor who was mad at political articles and started making animal articles there instead.

The solution is to differentiate and tag inputs and outputs, such that outputs can't be fed as inputs recursively. Funnily enough, Wikipedia's sourcing policy does this perfectly. Not only are sources the input and page content just an output, but page content is a tertiary source, and by policy sources should be secondary (and sometimes primary) sources, so the system is even protected against cross-tertiary-source pollution (say, an encyclopedia feeding off Wikipedia and vice versa).

It is only when articles posing as secondary sources fail to cite Wikipedia that a recursive quality loss can occur; see [[citogenesis]].
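
A minimal sketch of that tagging idea, in case it helps. The names here (`Tier`, `Source`, `usable_as_input`, `accepted_citations`) are made up for illustration and are not how Wikipedia actually implements anything; the only assumption taken from the policy description above is that tertiary sources are outputs and should not be fed back in as inputs.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    """Hypothetical tag for how far a source is from the original material."""
    PRIMARY = 1
    SECONDARY = 2
    TERTIARY = 3  # encyclopedias, including the wiki's own pages


@dataclass(frozen=True)
class Source:
    title: str
    tier: Tier


def usable_as_input(source: Source) -> bool:
    """Tertiary sources are outputs, so they are rejected as inputs."""
    return source.tier is not Tier.TERTIARY


def accepted_citations(sources: list[Source]) -> list[Source]:
    """Keep only the sources that are allowed to feed an article."""
    return [s for s in sources if usable_as_input(s)]


if __name__ == "__main__":
    candidates = [
        Source("Newspaper report", Tier.SECONDARY),
        Source("Another encyclopedia", Tier.TERTIARY),
        Source("Court record", Tier.PRIMARY),
    ]
    for s in accepted_citations(candidates):
        print(s.title)
```

The check only works if the tag is honest, which is exactly the failure mode described above: an article that silently copied from a tertiary source but is tagged as secondary passes straight through.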

reply
Many sources for Wikipedia articles refer to Wikipedia without citing it. Many journalists will work from Wikipedia, and most of Wikipedia's sources are journalistic articles. It happens to be that often this isn't noticed because the information obtained this way is true and uncontroversial. Citogenesis only documents examples where, by bad luck, the result is untrue information.
reply
No, citogenesis is present regardless of whether the information is "true". The concept of "truth" doesn't carry much weight on Wikipedia, mainly because determining truthfulness would be impractical, out of scope, and original research.
reply
> It is only when articles posing as secondary sources fail to cite wikipedia that a recursive quality loss can occur

I've seen a college professor cite Wikipedia in support of a false claim. On investigation, the text in Wikipedia was cited to an earlier blog post by that same professor.

I wasn't convinced.

reply
I don't think it's entirely illegitimate.

1- Citing Wikipedia (or any tertiary source) is valid; the problem is only when the source is not cited. Citing Wikipedia within Wikipedia is against its policy, but you are free to cite it elsewhere.

2- Citing the tertiary source and citing the secondary source are distinct, and both are valid. There is no "rule", in Wikipedia or otherwise, that says you need to cite the underlying source. In fact, citation chains can become quite deep, so it would be very impractical. For example, you could cite the Gospels for the passage where Jesus talks with the devil. If we had it your way, you wouldn't be able to cite an apostle; you would have to attribute the quote to Jesus, and furthermore, if Jesus quoted the Old Testament, you would have to cite that? If you think the Bible is an exception, consider case law: if you were to cite an attorney's defense and the attorney cited some cases, would you have to cite the original cases? If so, which? There might be multiple; it's not just a citation chain but a graph.

In this specific example your professor was not just quoting himself; his work is now part of Wikipedia and, importantly, was not contested or was not successfully contested. It is similar to how a trademark works: you claim you own the trademark, and if no one contests it for a year or so, you have a stronger case that it's yours.

reply
The background here is that Scots is not really a language. Try asking a Glasgow taxi driver who addresses you in 'Scots' whether he knows any English. Robert Burns wrote in English, with some of his spelling reflecting pronunciation in the Scottish English dialect.

The people who want it to be considered as a language for political reasons cannot be bothered to translate Wikipedia themselves. They read and edit English Wikipedia and understand it perfectly.

reply
Sort of?

The Glaswegian taxi driver may not consider themself to be speaking a different language but, if speaking to another local and leaving aside pronunciation, they’d use words, phrases and even grammar that are incomprehensible to someone with no experience of Scots.

I’m a “posh Scot”, raised middle class in Edinburgh, so my accent is minimal and thickens up or softens depending on who I’m speaking to. Even for me, there’s a lot of words, phrases and ways of speaking I’ve had to adjust to be consistently understood by American coworkers over the last 10+ years.

reply
Brits do the same. At best it is a dialect, at worst an accent. A lot of (most of) Scots is still English, but spoken with different grammar, unfamiliar phrases and unfamiliar pronunciation.

Sort of like extreme cockney rhyming slang or, for a more modern example, thick BME* full of slang.

* = British Multicultural English, think fam n blud, lots of Jamaican English influence plus South East Asian influence.

reply
> The background here is that Scots is not really a language.

This is supremely ignorant. Scots is its own language. It's a 'brother' or 'sister' of English, with both English and Scots being descendants of West Germanic languages.

The fact that many (all?) Scots speakers also speak English doesn't mean Scots is not a language on its own.

You could make your exact same arguments that Irish isn't a language because you could ask a Cork taxi driver whether he knows any English.

Scots = a language with some of the same ancestors as English.

Scottish English = a dialect (and accent) of English

Scots Gaelic = another language, with the same ancestors as Irish and Manx.

reply
Australians, Jamaicans, African Americans and English-speaking South Africans do not have their own Wikipedia, despite all these dialects having more legitimate demographic and linguistic claims to being languages than 'Scots'.

James Joyce wrote in English; no Irish person pretends that he wrote in a third language distinct from English and Irish. The fact that they do not do so does not compromise the political basis for independence, republicanism or reunification.

If a Cork taxi driver addressed you in Irish (very unlikely), and you asked him to speak English, the request would be both coherent and reasonable. The point you missed is that the Glasgow taxi driver would look at you with consternation and say "But, I am speaking English! What's wrong with my English?" (insert dialect spelling if you like)

Rabbie Burns wrote in the same language as his compatriots Louis Stevenson and Scott.

It would be ignorant if I did not know about the meretricious claim of a minority of Scottish people to have their own language, but it is not ignorant to reject that claim. I am Scottish fwiw.

reply
Scots is partially intelligible in written form to English speakers, but that does not make it the same language as English. You might as well say that Spanish and Portuguese are the same language.
reply
You might as well say that US English and Canadian English are different languages.

Geordie English is closer to Edinburgh 'Scots' than to RP English or US English or Indian English. Is it a dialect of Scots?

reply
There's also a smooth language continuum between Spanish and Portuguese, with varieties like Galego. This doesn't make them the same language. Historically the language continuum encompassed most of Europe, but people at the extremes would've had no expectation of understanding each other's language.
reply
What counts as a language is almost always determined by "political reasons" - as the witticism goes: "A language is a dialect with an army and navy."

There exist dialects that are less mutually intelligible than apparently distinct languages, and the designation of each as "dialect" or "language" is political. Language is often a proxy for culture, and political actors may wish to suppress or boost the legitimacy of such cultural expression depending on their aims.

reply
deleted
reply
[flagged]
reply
Yes, half of my entire political ideology is based on posts written by 12 year olds on the Internet. The other half is based on posts written by dogs[1].

1. https://en.wikipedia.org/wiki/On_the_Internet,_nobody_knows_...

reply
Yep, it's either dogs or r̸u̸s̸s̸i̸a̸n̸ chinese bots.
reply
> Yep, it's either dogs or r̸u̸s̸s̸i̸a̸n̸ chinese bots.

Please consider users of screen readers and other assistive technologies, as your nonstandard usage of nonstandard characters makes parsing your comment difficult if not impossible. Not a slight or a correction, as I am a fan of Zalgo text myself, but after being informed by others about how inscrutable it can be to the differently abled, I have reconsidered using it.
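
For anyone curious what is going on under the hood: the "strike-through" look comes from a combining mark attached to every letter, so a reader that announces each code point has a lot of extra noise to get through. A rough sketch of stripping those marks with Python's standard `unicodedata` module follows; the function name `strip_combining_marks` is my own, and this is not a claim about what any particular screen reader does.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    """Drop standalone combining marks (Unicode categories Mn and Me).

    The text is deliberately not normalized first, so precomposed
    letters such as U+00E9 (e with acute) are left untouched. Accents
    stored as separate combining marks would be removed as well.
    """
    return "".join(
        ch for ch in text
        if unicodedata.category(ch) not in ("Mn", "Me")
    )


if __name__ == "__main__":
    # U+0338 is COMBINING LONG SOLIDUS OVERLAY, the slash drawn through each letter.
    struck = "r\u0338u\u0338s\u0338s\u0338i\u0338a\u0338n\u0338 chinese bots"
    print(strip_combining_marks(struck))  # russian chinese bots
```

Note that this throws away the strike-through meaning entirely, so the struck-out word reads as if it were intended. It illustrates the problem rather than solving it.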

reply
Didn't realize that, thanks!

I wonder if the future of screen reading applications is bypassing these issues + avoiding parsing weird websites by just doing AI driven OCR.

reply
I used to use Zalgo text to make my name harder to read, as I use my “real name” and didn’t want it to be scraped by bots, but some folks literally blocked me on social media after a bit of a spat that I’ll admit was caused by a misunderstanding on my part. Apparently, how these kinds of characters are interpreted is context-specific. Their use as strike-throughs is readily apparent to some people, and while the meaning can be deduced by an AI, it shouldn’t be expected or assumed to be understood. My HN username in Zalgo text was taking over 30 seconds per post for all of the diacritics to be read out on platforms I used it on, so I had to change my ways or admit that I didn’t care about the experience others had, the latter of which couldn’t be further from the truth.

AI has a hard time deriving how many r’s are in strawberry, so I won’t expect it to parse my text on behalf of others any time soon, though I don’t think you meant any harm. In the interest of respect for those who don’t have a choice in using tech to help them do what comes easily and naturally to me, I thought I’d pay forward the knowledge of how the world and our perceptions of it are as unique as every individual.

reply
TalkBack on Android seems to read it just fine, presumably without needing any fancy AI or OCR.
reply
Jfc, not everything is about that.
reply
I meant it as an example of the road to hell being paved with good intentions, and of people who "naively failed to recognise the damage they were doing".

But you do you.

reply
That's extremely tangential. Bringing hot-button political topics into unrelated threads flattens everything into political arguments and starves all other topics of oxygen.
reply
More specifically, it gets important HN discussions quickly flagged and dropped.
reply