Definitely don't want this to turn into a Matt Damon in Elysium type of situation, like that scene with the parole officer, hahah (which would stem from poorly integrating such subjective signals into existing workflows, more so than from the signals being available in the first place).
As for emotional intelligence, I personally see it as a prerequisite for any voice / language model that interacts with humans: just as an autonomous car has to be able to identify a pothole, a voice / video agent has to be able to spot and navigate a pothole in a conversation.
Candidate: That's the hotel.
HR: What?
Candidate: Where I live.
HR: Nice place?
Candidate: Yeah, sure. I guess. Is that part of the test?
HR: No. Just warming you up, that's all.