https://www.youtube.com/watch?v=GH9-EmgtABw

Saw this video recently, by an AI company working to get contextual cues from tone and body language. I think they're converting those cues to text and feeding them into an LLM, so it's not natively multimodal, but I still thought it was really cool.

I don't think written prompting will ever go away. Writing helps you organize your thoughts in a way that speaking, umm, ah, wait no, hang on, does not. With writing, I can go back and change what I've already written before I hit send. Anybody who's prompted with speech for any length of time has hit the "wait, no, never mind, start over" moment. So STT will get better, sure; it's already quite good. I just don't see text entry going away entirely, because Human Intelligence (HI) just doesn't work in a way that would let speech be the only interface.