There could probably be cool uses but I don't think it will be a pure "upgrade" as the repeating dialog is kind of a feature honestly.
We'll have to see how it pans out xD
Current video games are designed around streamlining content. As a player, your job is to extract all content from an area before going to the next. That's why most areas are designed as linear corridors so that there is a straightforward progression, and most NPC interactions are meant to offer something meaningful so as to not waste the player's time.
But imagine if interaction with NPCs wasn't just a content delivery mechanism, but instead could sometimes be rewarding, sometimes useless, dynamically adjusted in how you interact with the world in non-predictable ways.
The player would just waste their time in their usual approach of canvassing each new area, which would become unsustainable. There would be no reliable way of ensuring you've extracted all the content. All he/she could do is roam around more naturally, hoping the glimpses they catch are engaging and interesting enough.
Maybe a new player skill would be to be able to identify the genuine threads of exciting content, be it designed or emergent, within the noise of an AI-generated world.
Realistically though, how do you build an exciting player experience with this framework? A starting point might be to approach it as something more akin to LARP or improvisational theater: you'd give each NPC and player a role they need to fulfill. Whether players actually enjoy this is another thing entirely.
That’s a slot machine, and the same mechanism which also gets us hooked on social media. Sounds like something which would immediately be exploited by vapid addiction-as-a-feature games à la FarmVille.
> The player would just waste their time in their usual approach of canvassing each new area, which would become unsustainable. There would be no reliable way of ensuring you've extracted all the content.
Sounds frustrating. Ultimately games should be rewarding and fun. Constraints are a feature.
> All he/she could do is roam around more naturally, hoping the glimpses they catch are engaging and interesting enough.
Good reminder to go take a walk outside. Take a train to somewhere we haven’t been. Pick a road we’ve never crossed. We don’t even need a mini map, and it sucks that we don’t have teleportation back to base, but we do have a special device which always points the way back.
> Realistically though, how do you build an exciting player experience with this framework? (…) Whether players actually enjoy this is another thing entirely.
Agreed. Though not enjoying it and abandoning it is fine, I’m more worried about people not enjoying it but feeling unable to quit (which already happens today, but I think the proposed system would make it worse).
> Sounds frustrating. Ultimately games should be rewarding and fun.
this seems to assume that the only way to feel rewarded / have fun is by comprehensively extracting content from the game. in order to have fun in an "emergent" generative game of this nature, you'd need to let go of that goal.
i do agree with the risks surrounding engineered engagement.
Not my intention, that is not something I believe. I’m not a completionist (I get those who are, but to me it can get boring or stressful) and I see the appeal in sandbox games (even if I don’t usually play them).
Wrong right from the outset. Some games are designed around content and "extraction". Many are not.
While I think the parent post leaves a lot of open ended questions, I think they are spot on about the tightness of design in games.
In many open world RPGs, or something like GTA, you cannot open every door in a city. In Street Fighter you can't take a break to talk to your opponent. In art games like Journey you cannot deviate from the path.
Games are a limited form of entertainment due to technical and resource restrictions, and they always will be. Even something as open ended and basic as Minecraft has to have limits to the design; you wouldn't want the player to be able to collect every blade of grass off of a block just because you could add that. You have to find the balance between engaging elements and world building.
Having an LLM-backed farmer in an RPG that could go into detail on how their crops didn't grow as well last season because it didn't rain as much seems good on paper for world building. But it is just going to devalue the human-curated content around it, as the player has to find what does and does not advance their goals in the limited time they have to play. And if you have some reward for talking to random NPCs, players will just spam the next button until it's over to optimize their fun. All games have to hold back from adding more so that the important parts stand out.
But even for story-driven games, you can signal when you're "done" extracting story-related details in various ways. For example, you can prompt the NPC to include dialogue elements A, B, and C when they fit the conversation, keep track of which ones were output (you can make it emit a marker so that tracking stays easy even if the dialogue element has been worded differently), and have it get annoyed and tell you it doesn't have more to tell you, or similar, as the repetition adds up.
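A minimal sketch of that marker-tracking idea, to make it concrete. Everything here is an assumption for illustration: the `[[BEAT:X]]` marker format, the `BeatTracker` name, and the hint strings are all made up, and no actual LLM is called. The game would feed `mood_hint()` back into the NPC's prompt on the next turn.

```python
import re
from dataclasses import dataclass, field

# Hypothetical marker format: the NPC prompt asks the model to append
# "[[BEAT:A]]" whenever it has conveyed story element A, however it worded it.
MARKER = re.compile(r"\[\[BEAT:([A-Za-z0-9_]+)\]\]")


@dataclass
class BeatTracker:
    required: set[str]                      # story elements this NPC must convey
    delivered: set[str] = field(default_factory=set)
    turns_after_done: int = 0               # how long the player kept talking afterwards

    @property
    def done(self) -> bool:
        return self.required <= self.delivered

    def process(self, raw_reply: str) -> str:
        """Record markers found in the reply and return the text without them."""
        was_done = self.done
        self.delivered |= set(MARKER.findall(raw_reply)) & self.required
        if was_done:
            self.turns_after_done += 1
        return MARKER.sub("", raw_reply).strip()

    def mood_hint(self, patience: int = 2) -> str:
        """Instruction to add to the NPC prompt for the next turn."""
        if not self.done:
            missing = ", ".join(sorted(self.required - self.delivered))
            return f"Work in these remaining points when they fit: {missing}."
        if self.turns_after_done >= patience:
            return "You have nothing more to tell; be politely impatient."
        return "You have shared everything important; keep replies brief."
```

The marker is stripped before the text reaches the player, so the "done" signal can be surfaced however the game likes (the NPC's tone, a UI cue, or both).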
But that's not how real life works at all, right? You talk to someone for as long as you want to talk to them, or until they start sending signals that they are done talking with you.
The way video game dialog works has always bothered me, it makes characters feel stilted and makes me care less about the characters and the world.
(Although it's a different game in many ways, consider by contrast how Portal 2 handles dialog, and the effect that has on immersion.)
How real life works is always a plausible interesting goal, but it's very often at odds with a bunch of other valuable goals for players.
A particularly sharp example of this is sports video games. It might well be interesting (and certainly realistic) to simulate bad referees in a sports game. Horrible blown calls by tennis line judges, or missed calls by basketball refs, or bad umpire calls on pitches. Real-life soccer makes working the refs and their inability to see everything an art form, as far as I can tell.
Perhaps that's interesting, but the irony here is that real life refs are actually bad simulations of the original perfect game code in the first place, from a certain point of view. I think debates about the use of instant replay in sports get at the heart of this, and one could imagine using real-time AI to help refs, taking this conversation much further.
I think the sports case is a particularly sharp example, but it definitely holds with all sorts of choices in games.
For Animal Crossing in particular, I remember that when I finally played it, it struck me after a while how much it had in common with recent MMOs (EverQuest and World of Warcraft) that had severely disrupted the lives of fellow game developer friends of mine. And when I played the original Animal Crossing, I remember noticing specifically how careful the designers were in having players use up every bit of interesting content in a day after 45 minutes or an hour, so that eventually you'd run out of things to do, and that was the game's signal to put it down and pick it up again the next day. And I remember being struck by how intentional it was, and how humane it was... particularly given their goal of wanting to make a game that was asynchronously co-op (where different family members could play in the same shared space at different times of day and interact asynchronously). As a game designer myself, I really respected the care they put into that.
Anyway, that's my immediate thought on seeing this (fascinating, valuable) experiment with LLM dialogue in Animal Crossing. The actual way NPCs work in these games as they are has been honed over time to serve a very specific function. It's very similar to personal testimonials by paid actors in commercials; a human expressing an idea in personal dialogue form triggers all sorts of natural human attention and reception in us as audience members, and so it's a lot more sticky... but getting across the information quickly and concisely is still the primary point. Even dialogue trees in games are often not used because of their inefficiency.
I totally think that there will be fascinating innovations from the current crop of AI in games, and I'm really looking forward to seeing and trying them. I just think it's unlikely they will be drop-in replacements for a lot of the techniques that game developers have already honed for cases like informational NPC dialogue.
Oh of course! Sorry, I was never trying to imply that it was in any way realistic. For video games often the most fun / compelling choice is not the realistic one! Striving for realism can be a great goal and often has a lot of positives, but it is often limiting. Video games are just art; being photorealistic can be beautiful and amazing but is often not the best choice for expressing an idea.
That seems a bit like deck-chairs on the Titanic. The hard part isn't icon design, the hard part is (A) ensuring a clear list exists of what the NPC is supposed to ensure the user knows and (B) determining whether those goals were received successfully.
For example, imagine a mystery/puzzle game where the NPC needs to inform the user of a clue for the next puzzle, but the LLM-layer botches it, either by generating dialogue that phrases it wrong, or by failing to fit it into the first response, so that the user must always do a few "extra" interactions anyway "just in case."
I suppose you could... Feed the output into another document of "Did this NPC answer correctly" and feed it to another LLM... but down that path lies [more] madness.
EDIT: Also, having the LLM botch a clue occasionally could be a feature. E.g. a bumbling character that you might need to "interrogate" a bit before you actually get the clue in a way that makes sense, and can't be sure it's entirely correct. That could make some characters more realistic.
Basically you have your big clever LLM generating the outputs, and then you have your small dumb LLM reading them and going “did I understand that? Did it make sense?” - basically emulating the user before the response actually gets to the user. If it’s good, on it goes to the user, if not, the student queries Einstein with feedback to have another crack.
https://openai.com/index/prover-verifier-games-improve-legib...
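A rough sketch of that generate-then-check loop, assuming both models are wrapped as plain Python callables. The names `generate_checked`, `make_stub_generator`, and `keyword_verifier` are hypothetical, and the stubs below stand in for the "big" and "small" models so the example runs without any API. A real verifier would be another model call rather than a keyword check.

```python
from typing import Callable, List, Optional, Tuple

# (prompt, feedback from the previous attempt or None) -> candidate reply
Generator = Callable[[str, Optional[str]], str]
# candidate reply -> None if acceptable, otherwise feedback for the generator
Verifier = Callable[[str], Optional[str]]


def generate_checked(prompt: str, generate: Generator, verify: Verifier,
                     max_attempts: int = 3,
                     fallback: str = "I... forget what I was saying.") -> str:
    """Only return a reply the verifier accepts; otherwise use a scripted fallback."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        reply = generate(prompt, feedback)
        feedback = verify(reply)
        if feedback is None:
            return reply
    return fallback


def make_stub_generator() -> Tuple[Generator, List[Optional[str]]]:
    """Stand-in for the large model: botches the clue until it gets feedback."""
    seen_feedback: List[Optional[str]] = []

    def generate(prompt: str, feedback: Optional[str]) -> str:
        seen_feedback.append(feedback)
        if feedback is None:
            return "Hmm, something about a door, maybe?"
        return "The cellar door opens with the brass key."

    return generate, seen_feedback


def keyword_verifier(required: List[str]) -> Verifier:
    """Stand-in for the small model: checks that the clue actually came through."""
    def verify(reply: str) -> Optional[str]:
        missing = [w for w in required if w.lower() not in reply.lower()]
        return None if not missing else "Mention: " + ", ".join(missing)

    return verify
```

The `max_attempts` cap and the scripted fallback matter for a game: each retry adds latency and cost, so the loop has to give up at some point and hand the player something.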
Kind of like in real life...
There are some significant issues with it at the moment. One is that you have to train on vast swathes of text to get an LLM, and it's difficult to remove things after the fact. If you cooperate with the AI and stay "in Skyrim" with what you say to them it works out OK, but if you don't cooperate it becomes clear that Skyrim NPCs know something about Taylor Swift and Fox News, just to name two examples. LLMs in their current form basically can't solve this.
The LLMs are also prone to writing checks the game can't cash. It's neat that the NPCs started talking about a perfectly plausible dungeon adventure they went on in a location that doesn't exist but "felt" perfectly Skyrim-esque, but there are clearly some non-optimal aspects to that too. And again, this is basically not solvable with LLMs as they are currently constituted.
Really slick experiences with this I think will require a generational change in AI technology. The Mantella mod is fun and all but it would be hard to sell that at scale right now as a gaming experience.
I didn't go into it in detail, but it isn't even that I got the NPCs to start babbling about Taylor Swift. It was just that they knew that she was a musician, and as such, might be at the tavern. That's very hard to remove.
One concrete example I'm sure these Skyrim mods aren't using is: enums in structured outputs [1] with a finite list of locations/characters/topics/etc that are allowed to be discussed. The AI is not allowed to respond with anything that is not in the enum. So you can give it a list of all the locations in the game in a huge array and it would be forced to pick one.
[1] https://platform.openai.com/docs/guides/structured-outputs#a...
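To illustrate the enum idea without calling any API, here is a hedged sketch: it builds a JSON Schema whose `location` field is restricted to a fixed list, plus a small local check of a reply. The field names (`line`, `location`), the helper names, and the three-city list are assumptions for the example, and `check_reply` is a minimal hand-rolled check, not a full JSON Schema validator. With a structured-output feature the restriction would be enforced during generation; the check just shows what the constraint means.

```python
import json

# Illustrative subset; a real game would load every valid location from its data files.
LOCATIONS = ["Whiterun", "Riften", "Solitude"]


def build_schema(locations):
    """JSON Schema for an NPC reply whose 'location' must come from a closed list."""
    return {
        "type": "object",
        "properties": {
            "line": {"type": "string"},
            "location": {"type": "string", "enum": list(locations)},
        },
        "required": ["line", "location"],
        "additionalProperties": False,
    }


def check_reply(raw_json, schema):
    """Minimal local check of a reply against the schema above."""
    data = json.loads(raw_json)
    if set(data) != set(schema["required"]):
        raise ValueError("unexpected or missing fields")
    if not isinstance(data["line"], str):
        raise ValueError("'line' must be a string")
    allowed = schema["properties"]["location"]["enum"]
    if data["location"] not in allowed:
        raise ValueError(f"unknown location: {data['location']!r}")
    return data
```

Note that the enum only constrains the structured field. The free-text `line` can still wander off-world, so this helps with "checks the game can't cash" more than with NPCs knowing about real-world pop stars.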
I wouldn't ever want a game to use it for the core story writing, because it's pretty important that it is consistent and unable to be derailed. But for less serious NPC interactions or like an RPG scenario it is such a great fit.
I also wouldn't want a single player game to rely on remote inference, because that will get turned off eventually and then your game doesn't work.
(Yes, this is a Paradox callout. Give me less fancy particle effects in Vic3 and use the GPU for computing pop updates faster!)
(Probably the biggest barrier to this is the lack of a convenient C++/C#-level cross-manufacturer compute API. Vulkan is a bit too low-level for game devs to work with, OpenCL kind of sucks, and CUDA is NVIDIA-only.)
Incoming new type of health crisis: video game addiction coupled with LLM-induced psychosis. Dudes spending 12 hours a day farming gold in a MMORPG while flirting with their AI girlfriend sidekick
Another issue would be emphasizing the meaninglessness of the dialogue. For example, Trails in the Sky has lots of NPC dialogue that's repetitive, but at least the dialogue is relevant to how the NPC's life progresses in the grander scheme of things, such as having difficulty with her entrance exams, or having an argument with his fiancé. It's not main dialogue but adds flavor for anyone who cares about the world enough to interact with the citizens.
I don't think I'd like to interact with characters when I know that whatever they have to say is generated on the fly and adds nothing other than random tidbits. The novelty would quickly wear off.
GLaDOS from Portal would offer one player pudding and another one a steak. You get to a wall which says “the ravioli is a fraud” and become utterly confused.
1. (predictability) Games like to have a clear arc and tend to use at least some of their NPCs to move that forward; it is harder to do this with a model that could make a choice you don't predict. They tend to have a set of items, quests, what-have-you that you need boundaries around.
2. (testing) Games like to test like crazy before launch, at least the AAA ones, so their QA folks just don't like a model that can have infinite responses/variants. Many then drop to a skeletal crew for maintenance and improvements after launch, where with ML models you actually need to keep improving the model, finding long tail bugs as more players interact with the system, etc.
3. (cost) Games are usually very cost aware; it's far cheaper to just have a set human-written dialogue path than to run a model, even an offline one. Cheaper in both actual dollar costs if you're talking about a high end LLM service call, and CPU/GPU/memory costs if you're talking about an on-box system.
4. (internationalization/localization) AAA games need to launch fast in many languages and locales. Using a model for NLP and dialogue management/natural language generation adds testing costs for each new language, where normally localization is just a very cheap translation that can be outsourced.
There have been some fun experiments in this space, and I expect to see this improve and become common use in the future, but it will take time and more work on how best to integrate a model into the flow of a game. I do love it for "presence" so talking to NPCs feels more human-like.
Nick and the crew at AI Dungeon (and related) have always done some interesting work in this space, trying out games where AI can be used in interesting ways.
1. I think people assume you have one LLM per character, but I think if you had specialized ones for each quest, item, etc., this would actually work quite well.
3. I actually think if you cached responses under certain conditions, costs can be saved significantly. This would require quite a robust context, though, to still feel dynamic.
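As a sketch of what that caching could look like: key each response on the NPC, a normalized form of what the player said, and whatever game state should change the answer. The `ResponseCache` class, the key fields, and the `generate` callable are all hypothetical, and `context` is assumed to be a JSON-serializable dict. Exact-match keys like this only catch repeated questions; matching paraphrases would need something fuzzier.

```python
import hashlib
import json


class ResponseCache:
    """Reuse a generated NPC line when the same question is asked in the same game state."""

    def __init__(self, generate):
        self._generate = generate   # callable(npc_id, player_text, context) -> str
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(npc_id, player_text, context):
        # Normalize case and whitespace so trivially different inputs share a key.
        normalized = " ".join(player_text.lower().split())
        payload = json.dumps([npc_id, normalized, context], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def reply(self, npc_id, player_text, context):
        key = self._key(npc_id, player_text, context)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        text = self._generate(npc_id, player_text, context)
        self._store[key] = text
        return text
```

The trade-off mentioned above shows up in what goes into `context`: the more state is included, the more dynamic the replies feel and the fewer cache hits there are.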
https://www.youtube.com/watch?v=xNPF9VKmzxw
Mods Name is
I'd argue throwing a game wrapper around an LLM is a new LLM experience, not a new game experience.
Once someone decides that will be a critical and fundamental part of a AAA game, the rest can be worked out, despite what will be, I am sure, many unintended emergent behaviors.
https://www.youtube.com/watch?v=TpYVyJBmH0g&ab_channel=DougD...
The real immersion breaker in RPGs, and the holy grail to solve, is the fact that NPCs have no life or goals outside of what the player does. Imagine a game where NPCs have wants and goals and do things to get those done. Where you could leave it running for 10 years and things would have happened without you.
I think sandbox games like Animal Crossing are the exception, if it ever becomes reliable enough to put in anything other than an M-rated game.
For Animal Crossing, The Sims, Cult of the Lamb, and similar games, this would infinitely extend the life of the game. But I am sure we can all already imagine the headlines when these "family" games start saying things that they really should not be... especially given recent issues.
For the record, not dismissing the person who did this work. But doing this commercially has its risks.