I disagree because of how AI is progressing and because there's tons of neglected language markets they can pick up. Obviously your approach can work too, perhaps better. But 95% of language learning tools don't support Thai (my target language) for example so I am an eager user for that reason alone. I think they'll be able to make a generalized curriculum and have the AI use it in all languages.
reply
Most of the generalized curriculum stuff out there is crap because languages differ from each other in substantial ways. LLMs in principle should help here, as they can use their knowledge of the structure of the language to adapt the curriculum, but we're just not there with context windows and thinking capabilities. They will need at least a per-language (ideally per language pair) system prompt that contains a rough outline of the curriculum.
reply
I think the curriculum areas you're referring to are for learners in the beginning and intermediate stages. In which case, fair enough, although I still think you could get pretty far by just prompting an LLM, as the LLM has read hundreds of books teaching how to learn each language. But that's not really my point; my point is that once you're an advanced learner (they claim this is their target market) who knows about 12,000 words, I think you know almost all of the grammar, and the remaining bits will get picked up along the way effortlessly via immersion. What you need help with in this stage is slogging through the next 10,000 vocab words you need to learn to get to extreme fluency or the next 25,000 you need to learn to become plausibly native-level, as well as the speaking and reading practice to make your reading faster (if it's a different character set to your native language) and make your speaking effortless.
reply
At that point why engage with an LLM? Just go read a book.
reply
Probably too long of an answer, but: averaged out over the months, I spend 30 minutes every weekday doing flashcards, 45 minutes with a tutor, and another 1.25 hours watching TV or reading books in my target language. With 2.5 hours every weekday on average and without life immersion (at your home or office), it's possible to get to reading/writing/speaking/understanding fluency (including in terms of speed) in a difficult language in about 3-4 years and near-native in another 2 years. It's very difficult as an English native to learn a language like Chinese, Japanese, or Thai. It's not like learning Spanish or French (which I have also studied).

To answer your question directly: surprisingly, reading a book does very little to help your speaking or understanding abilities. Understanding accents/pronunciations quickly enough and structuring sentences quickly enough when speaking are completely different skills. Writing/reading/speaking/understanding are four remarkably unrelated skills that must be trained separately. Actually, typing on a keyboard and writing by hand are also different. Because Thai has a different keyboard on desktop vs. phone, I decided to simply use speech-to-text for the rest of my life once it became good enough. I'm remarkably fluent in comprehension and have read quite a few adult books, and yet if you give me a pencil and paper my brain can't figure out how to spell a word that I can easily say or type.

And why use an LLM instead of a tutor? To save $2,700 a year.
reply
Would you rather have a tool that teaches you accurate conversational Spanish?

Or something that tries to teach 60 languages but does so poorly?

reply
My tool supports Thai, if you'd like to try it - https://nuenki.app . I added it at the request of a user, who seems to be happy with it.

It's a browser extension that finds English sentences in webpages, and translates the ones at your difficulty level into the language you're learning.

reply
Thank you, I will try it, although I'd prefer to translate entire sentences into Thai randomly. Perhaps you can add this advanced mode. Actually, I saw your app before while looking for an alternative to Toucan that supported Thai, but at that point in time you hadn't added support yet. Thanks for doing so.
reply
Okay, I installed it and this is pretty great, although I think your extension doesn't work for Thai the way you think it does. Because Thai puts spaces between sentences instead of between words, it's translating entire sentences even with the "words only" setting enabled. This is what I want anyway, but it will be too difficult for most learners. I have written various bits of Thai learning software, and just so you know, you should use an LLM to do word-splitting, not a software library. If you do use a library, you need to split words by looking for the longest possible word at each position, but it won't work well. Basically, you can't tell without a brain whether it's a lot of small words next to one another or a smaller number of compound words. IME only an LLM or a human will do a good job of this.
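
To make the longest-match point concrete, here is a minimal sketch of greedy dictionary segmentation. It is not taken from any particular Thai library: the function name `longest_match_split` and the four-word `TOY_VOCAB` are assumptions made up for illustration. The input is a commonly cited ambiguous string that can be read as "ตา กลม" (round eyes) or "ตาก ลม" (exposed to the wind).

```python
def longest_match_split(text, vocab, max_len=None):
    """Greedy left-to-right segmentation: at each position take the
    longest dictionary word that matches (hypothetical helper)."""
    if max_len is None:
        max_len = max((len(w) for w in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocab:
                match = candidate
                break
        if match is None:
            # Unknown character: emit it on its own and move on.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Toy dictionary, assumed for this example only.
TOY_VOCAB = {"ตา", "ตาก", "กลม", "ลม"}

print(longest_match_split("ตากลม", TOY_VOCAB))
```

With this toy dictionary the greedy pass always commits to "ตาก" first, so it can only ever produce one of the two readings. Picking between them needs sentence-level context, which is the part a plain dictionary lookup does not have.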
reply
Translating entire sentences is the idea - I'm not sure what setting you mean with "words only"? I really ought to make the settings clearer, but it's hard to do when you know what they "ought" to express.

"Translate Isolated Words" allows it to translate "sentences" of only one word, but it doesn't disable full sentences.

And yeah, atm it word splits by spaces for the dictionary. I hadn't thought to do it with LLMs, though that's a good idea. There's a somewhat related problem when doing Furigana, where it has a hashmap of strings-to-pronunciations, and it starts with a 4-character sliding window looking for matches, then a 3 character, etc.
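
A rough sketch of the shrinking-window lookup described above, under stated assumptions: the real extension's code is not shown in this thread, so the function name `annotate_readings`, the tuple output format, and the tiny `READINGS` table are all hypothetical.

```python
def annotate_readings(text, readings, max_window=4):
    """Scan left to right; at each position try a 4-character window,
    then 3, then 2, then 1, and take the first hit in the table."""
    out = []
    i = 0
    while i < len(text):
        for size in range(min(max_window, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in readings:
                out.append((chunk, readings[chunk]))
                i += size
                break
        else:
            # No window matched: pass the character through unannotated.
            out.append((text[i], None))
            i += 1
    return out


# Illustrative string-to-pronunciation table (assumed, not the real data).
READINGS = {
    "日本語": "にほんご",
    "日本": "にほん",
    "語": "ご",
    "勉強": "べんきょう",
}

print(annotate_readings("日本語を勉強", READINGS))
```

Trying the widest window first means "日本語" is matched as one unit instead of "日本" followed by "語", which is the same longest-match trade-off as in the Thai word-splitting case.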

reply
That's a pretty sick idea. Unfortunately I presume it involves sending your browsing data (e.g. page contents) to the server?
reply
Yeah, though I've added lots of privacy protections to at least partially mitigate that:

- There's a global blacklist of sites, as well as phrases in the title/URL (e.g. "bank")

- You can blacklist sites yourself

- Each sentence is run against filters checking for medical/legal/etc info, as well as checks for addresses, card/social security numbers, etc. All the checks are done client side

- There are also some special implementations, e.g. it looks at the source code of websites to work out if they're an instance of an American health portal that I've forgotten the name of - each doctor's surgery self-hosts it.

- Websites can add `nuenki-ignore=true` on their end, if they'd like to disable it.

And of course it doesn't log anything, though there is an anonymous cache in order to make it economical.
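
As an illustration of what client-side checks like these could look like, here is a minimal sketch. It is not the extension's actual code: the host list, the regular expressions, and the function names are assumptions, and it covers only the site/phrase blacklist and the card/social security number checks, not the medical or legal filters. Only the "bank" phrase comes from the list above.

```python
import re
from urllib.parse import urlparse

BLOCKED_HOSTS = {"examplebank.com"}   # hypothetical global blacklist entry
BLOCKED_PHRASES = ("bank",)           # phrase checked in the title/URL
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")   # candidate card numbers
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")     # US SSN-shaped strings


def luhn_ok(digits):
    """Luhn checksum, used to cut false positives on long digit runs."""
    total = 0
    for idx, ch in enumerate(reversed(digits)):
        d = int(ch)
        if idx % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def page_allowed(url, title):
    """Page-level check: blacklisted host or phrase in the title/URL."""
    host = (urlparse(url).hostname or "").lower()
    if host in BLOCKED_HOSTS:
        return False
    haystack = (url + " " + title).lower()
    return not any(phrase in haystack for phrase in BLOCKED_PHRASES)


def sentence_allowed(sentence):
    """Sentence-level check, run before anything leaves the browser."""
    if SSN_RE.search(sentence):
        return False
    for m in CARD_RE.finditer(sentence):
        digits = re.sub(r"\D", "", m.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            return False
    return True
```

A sentence would only be sent for translation if both `page_allowed` and `sentence_allowed` return `True`. Simple substring matching on "bank" will also block pages such as a river-bank article, which errs on the side of privacy.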

reply
What about a whitelist? I might just be interested in only having certain sites, like this one or Reddit, translated into my target language. That way I can be certain it is only turned on for sites that I am OK with sharing browsing history and not be concerned that I might have missed adding something to the blacklist.
reply
That's a good point. At some point I ought to make a uBlock Origin-style list of customisable rules.

At the moment I'm focused on translation quality, but I intend to add that.

reply
Thanks!
reply
This is a great idea. Specifically, I want this enabled when I'm wasting time but not when I'm working, so I'd like it to be enabled only on X.com. This whitelist+blocklist functionality could be a user-side setting, like with ad blockers.
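
One way the whitelist+blocklist setting could be modelled is sketched below. This is a sketch under stated assumptions, not a description of how the extension works: `SiteRules`, `host_matches`, and `should_translate` are hypothetical names, and rules are matched on the domain plus its subdomains.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class SiteRules:
    """User-side rules (hypothetical shape).

    mode="blocklist": translate everywhere except blocked sites.
    mode="allowlist": translate only on explicitly allowed sites.
    """
    mode: str = "blocklist"
    allow: set = field(default_factory=set)
    block: set = field(default_factory=set)


def host_matches(host, domains):
    """True if host equals a listed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def should_translate(url, rules):
    host = (urlparse(url).hostname or "").lower()
    # A block entry always wins, in either mode.
    if host_matches(host, rules.block):
        return False
    if rules.mode == "allowlist":
        return host_matches(host, rules.allow)
    return True


rules = SiteRules(mode="allowlist", allow={"x.com", "reddit.com"})
print(should_translate("https://x.com/home", rules))
print(should_translate("https://mybank.example/", rules))
```

In allowlist mode anything not listed is skipped by default, so a site that was never added cannot be sent to the server by accident.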
reply