Can you go a bit deeper on this?

If the risk of mistranslation is high, I fail to see how letting AI "take a swing at it" does not reduce the translation quality.

How are they ensuring there is no drop in translation quality?

reply
They're doing transcription, not translation - so, turning someone's pages of scrawled script into typewritten text. They have around 20 people nationwide who are able to do this. Most of them are older volunteers who aren't all that interested in computer assistance, but about a third of them have started using the newer AI tools, and it has sped up their throughput significantly.

Having a 'best guess' at the lettering is really handy - in some cases the writing is really rather difficult to make out at all. Even being able to run something as simple as frequency analysis on stroke patterns would be a massive benefit.
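
To make the 'best guess' idea concrete, here is a minimal sketch of frequency analysis used to rank candidate readings of an unclear glyph. It is only an illustration, not what the archive or its tools actually do: it works on letter frequencies from text already transcribed rather than on stroke patterns, and the sample text, the candidate letters, and the function names are all made up for the example.

```python
from collections import Counter


def letter_frequencies(transcribed_text):
    """Relative frequency of each letter in text that is already transcribed."""
    letters = [ch.lower() for ch in transcribed_text if ch.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing transcribed yet, so no frequencies to report
    return {letter: count / total for letter, count in counts.items()}


def rank_candidates(candidates, frequencies):
    """Order candidate readings of an unclear glyph, most frequent letter first."""
    return sorted(
        candidates,
        key=lambda letter: frequencies.get(letter, 0.0),
        reverse=True,
    )


# Hypothetical sample: text already transcribed from the same writer.
known_text = "the weather was fine and we went to the market early"
freqs = letter_frequencies(known_text)

# Assumption: a scrawled glyph could plausibly be read as 'c', 'o' or 'e'.
ranked = rank_candidates(["c", "o", "e"], freqs)
print(ranked)
```

A ranking like this would only ever be a suggestion for the human transcriber, who still decides what the glyph says from the surrounding context.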

At this point they're becoming throughput bound on the scanning process. Diaries are digitized since the archive is in one place and their transcription experts are spread out over the country.

reply
I hadn't considered or read about this problem before but it makes sense.

It reminds me of the cuneiform problem. Between 500,000 and 1 million tablets have been collected. This is one of the earliest preserved writing systems. Even so, fewer than 10% of these tablets have been translated. I was surprised to learn this but it makes sense. There are several problems:

1. Scribes used a lot of shorthand;

2. Cuneiform itself changed over time;

3. Writers would use multiple languages (e.g. Sumerian, Akkadian), even on the same tablet. There are relatively few people fluent in these languages, particularly in several of them at once;

4. To some extent the tablets are 3D such that a 2D photo might not be sufficient to translate because you might need to physically turn the tablet to accurately see the marks; and

5. In some cases the tablets are incomplete or broken, so you may need to figure out how the pieces fit together.

I wonder if AI can help make inroads into this 90%. I really wonder what is waiting to be unearthed.

reply
Lots of 4,000-year-old complaints about copper, I would imagine.