Just ask me a clarifying question before going into your huge pitch. Chats are a back & forth. You don’t need to give me a response 10x longer than my initial question. Etc
It's a fast track to public disdain and heavy handed government regulation.
There is no way without the protections that could be afforded by regulation to offer such wide-ranging uses of the product without also accepting significant liability. If the range of "foreseeable misuse" is very broad and deep, so is the possible liability. If your marketing says that the bot is your lawyer, doctor, therapist, and spouse in one package, how is one to say that the company can escape all the comprehensive duties that attach to those social roles. Courts will weigh the tiny and inconspicuous disclaimers against the very large and loud marketing claims.
The companies could protect themselves in ways not unlike the ways in which the banking industry protects itself by replacing generic duties with ones defined by statute and regulation. Unless that happens, lawyers will loot the shareholders.
Recall: "As part of our 'treat adult users like adults' principle, we will allow even more, like erotica for verified adults," Altman wrote in the Oct.
Some of the labs might be less worried about this, but they're not by any means homogenous.
That's the best part.
A person can even hammer out an unstructured list of behavioral gripes, tell the bot to organize them into instructional prose, have it ask clarifying questions and revise based on answers, and produce directions for integrating them as Custom Instructions.
From then on, it will invisibly read these instructions into context at the beginning of each new chat.
Mold it and steer it to be how you want it to be.
(My own bot tends to be very dry, terse, non-presumptuous, pragmatic, and profane. It's been years now since it has uttered an affirmation like "That's a great idea!" or "Wow! My circuits are positively buzzing with the genius I'm seeing here!" or produced a tangential dissertation in response to a simple question. But sometimes it does come back with functional questions, or phrasing like "That shit will never work. Here's why.")
Except, of course, when that is exactly what the user wants.
Chat is a back & forth.
Search is a one-shot.
In a real human to human conversation, you wouldn’t simply blurt out the first thing that comes to mind. Instead, you’d ask questions.
This is the fundamental limitation with generative AI. It only generates, it does not ponder.
That is to say, all of that activity I listed is activity I’m confident generative AI is not capable of, fundamentally.
Like I said in a cousin comment, we can build Frankenstein algorithms and heuristics on top of generative AI but every indication I’ve seen is that that’s not sufficient for intelligence in terms of emergent complexity.
Imagine if we had put the same efforts towards neural networks, or even the abacus. “If I create this feedback loop, and interpret the results in this way, …”
My take is that an artificial model of true intelligence will only be achieved through emergent complexity, not through Frankenstein algorithms and heuristics built on generative AI.
Generative AI does itself have emergent complexity, but I’m bearish that if we would even hook it up to a full human sensory input network it would be anything more than a 21st century reverse mechanical Turk.
Edit: tl;dr Emergent complexity is a necessary but insufficient criteria for intelligence
You can get it to ask you clarifying questions just by telling it to. And then you usually just get a bunch of questions asking you to clarify things that are entirely obvious, and it quickly turns into a waste of time.
The only time I find that approach helpful is when I'm asking it to produce a function from a complicated English description I give it where I have a hunch that there are some edge cases that I haven't specified that will turn out to be important. And it might give me a list of five or eight questions back that force me to think more deeply, and wind up being important decisions that ensure the code is more correct for my purposes.
But honestly that's pretty rare. So I tell it to do that in those cases, but I wouldn't want it as a default. Especially because, even in the complex cases like I describe, sometimes you just want to see what it outputs before trying to refine it around edge cases and hidden assumptions.
It’s similar to the challenge that foreigners have with cultural references and idioms and figurative speech a culture has a mental model of.
In this case, I think what is missing are a set of assumptions based on logic, e.g., when stating that someone wants to do something, it assumes that all required necessary components will be available, accompany the subject, etc.
I see this example as really not all that different than a meme that was common among I think the 80s and 90s, that people would forget buying batteries for Christmas toys even though it was clear they would be needed for an electronic toy. People failed that basic test too, and those were humans.
It is odd how people are reacting to AI not being able to do these kinds of trick questions, while if you posted something similar about how you tricked some foreigners you’d be called racist, or people would laugh if it was some kind of new-guy hazing.
AI is from a different culture and has just arrived here. Maybe we’re should be more generous and humane… most people are not humane though, especially the ones who insist they are.
Frankly, I’m not sure it bodes well for if aliens ever arrive on Earth, how people would respond; and AI is arguably only marginally different than humans, something an alien life that could make it to Earth surely would not be.
You _could_ say humans output similar answers to questions, but I think that is being intellectually dishonest. Context, experience, observation, objectivity, and actual intelligence is clearly important and not something the LLM has.
It is increasingly frustrating to me why we cannot just use these tools for what they are good for. We have, yet again, allowed big tech to go balls deep into ham-fisting this technology irresponsibly into every facet of our lives the name of capital. Let us not even go into the finances of this shitshow.
This is especially obvious when viewing the reasoning trace for models like Claude, which often spends a lot of time speculating about the user's "hints" and trying to parse out the intent of the user in asking the question. Essentially, the model I use for LLMs these days is to treat them as very good "test takers" which have limited open book access to a large swathe of the internet. They are trying to ace the test by any means necessary and love to take shortcuts to get there that don't require actual "reasoning" (which burns tokens and increases the context window, decreasing accuracy overall). For example, when asked to read a full paper, focusing on the implications for some particular problem, Claude agents will try to cheat by skimming until they get to a section that feels relevant, then searching directly for some words they read in that section. They will do this even if told explicitly that they must read the whole paper. I assume this is because the vast majority of the time, for the kinds of questions that they are trained on, this sort of behavior maximizes their reward function (though I'm sure I'm getting lots of details wrong about the way frontier models are trained, I find it very unlikely that the kinds of prompts that these agents get very closely resemble data found in the wild on the internet pre-LLMs).