You are correct that these models primarily address problems that have already been solved. However, that has always been the case for the majority of technical challenges. Before LLMs, we would often spend days searching Stack Overflow to find and adapt the right solution.
Another way to look at this is through the lens of problem decomposition. If a complex problem is a collection of sub-problems, receiving immediate solutions for those components accelerates the path to the final result.
For example, I was recently struggling with a UI feature where I wanted cards to follow a fan-like arc. I couldn't quite get the implementation right until I gave it to Gemini. It didn't solve the entire problem for me, but it suggested an approach involving polar coordinates and sine/cosine values. I was able to take that foundational logic and turn it into the feature I wanted.
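The core of the suggestion was roughly the sketch below (my own TypeScript reconstruction, not Gemini's output; the names and constants are illustrative): each card gets an angle along an arc, and a polar-to-Cartesian conversion turns that into an offset and a tilt.

    // Rough sketch of the fan-arc idea: spread the cards across an angular
    // range, then convert each polar position (radius, angle) into an x/y
    // offset and a rotation. Names and constants are illustrative only.
    interface CardTransform {
      x: number;        // horizontal offset from the fan's pivot
      y: number;        // vertical offset from the fan's pivot
      rotation: number; // tilt of the card, in degrees
    }

    function fanLayout(
      cardCount: number,
      radius = 400,       // distance from the pivot to each card
      spreadDegrees = 60  // total angular spread of the fan
    ): CardTransform[] {
      const transforms: CardTransform[] = [];
      for (let i = 0; i < cardCount; i++) {
        // Spread cards evenly across the arc, centered on straight up (-90 deg).
        const t = cardCount === 1 ? 0.5 : i / (cardCount - 1);
        const angleDeg = -90 - spreadDegrees / 2 + t * spreadDegrees;
        const angleRad = (angleDeg * Math.PI) / 180;
        transforms.push({
          x: radius * Math.cos(angleRad),
          y: radius * Math.sin(angleRad) + radius, // arc bows upward from the pivot
          rotation: angleDeg + 90,                 // cards tilt to follow the arc
        });
      }
      return transforms;
    }

Each transform then maps directly onto something like a CSS translate/rotate on the card elements.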
Was it a 100x productivity gain? No. But it was easily a 2x gain, because it replaced hours of searching and waiting for a mental breakthrough with immediate direction.
There was also a relevant thread on Hacker News recently regarding "vibe coding":
https://news.ycombinator.com/item?id=45205232
The developer created a unique game using scroll behavior as the primary input. While the technical aspects of scroll events are certainly "solved" problems, the creative application was novel.
For example, consider this game: a target is randomly generated somewhere on the screen, and a player at the middle of the screen needs to hit it. When a key is pressed, the player swings a rope attached to a metal ball in circles above its head at a certain rotational velocity. Upon key release, the player lets go of the rope and the ball travels tangentially from the point of release. Each time you hit the target, you score.
Now, if I'm trying to calculate the tangential velocity of a projectile leaving a circular path, I could dig up the trig formulas on Stack Overflow. But with an LLM, I can describe the 'vibe' of the game mechanic and get the math scaffolded in seconds.
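For the curious, the release math is tiny once it's spelled out. Here's a minimal sketch (my own, in TypeScript, with made-up names and a counter-clockwise convention, not anything an LLM actually produced for this game) of turning an angular speed into a tangential launch velocity:

    // A ball swung in a circle at angular speed omega (rad/s) leaves on a
    // tangent when released. The tangential velocity is perpendicular to the
    // radius vector, with magnitude |omega| * r.
    interface Vec2 {
      x: number;
      y: number;
    }

    function releaseVelocity(
      center: Vec2,  // player position (center of the swing)
      ballPos: Vec2, // ball position at the moment of release
      omega: number  // angular speed in rad/s, positive = counter-clockwise
    ): Vec2 {
      // Vector from the player to the ball.
      const rx = ballPos.x - center.x;
      const ry = ballPos.y - center.y;
      // 2D "cross product" of omega with the radius vector.
      return { x: -omega * ry, y: omega * rx };
    }

    // After release the ball just travels in a straight line each frame:
    //   ballPos.x += v.x * dt;  ballPos.y += v.y * dt;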
It's that shift from searching for syntax to architecting the logic that feels like the real win.
...This may still be worth it. In any case it will stop being a problem once the human is completely out of the loop.
edit: but personally I hate missing out on the chance to learn something.
Today with LLMs you can spend five minutes defining what you want, press send, go grab a coffee, and come back to a working POC of something, in literally any programming language.
This is genuinely the stuff of wonder and magic, and it redefines how we interface with computers and code. And the only thing you can think of is to ask whether it can do something completely novel (something so hard to even quantify for humans that it's the main reason we don't have software patents).
And the same model can also answer you if you ask it about math, put together an itinerary, or give you a lasagna recipe. C'mon now.
I'm using Copilot for Visual Studio at work. It's useful for speeding up some of my typing via the auto-complete. On the other hand, in agentic mode it fails to follow simple, basic instructions and needs hand-holding to run. This might not be the most bleeding-edge setup, but the discrepancy between how it's sold and how much it actually helps me is very real.
I want AI that cures cancer and solves climate change. Instead we got AI that lets you plagiarize GPL code, does your homework for you, and roleplays your antisocial horny waifu fantasies.
To bridge the containers in userland only, without root, I had to build: https://github.com/puzed/wrapguard
I'm sure it's not perfect, and I'm sure there are lots of performance/productivity gains still to be made, but it's allowed us to connect our CDN-based containers (which don't have root) across multiple regions, all talking to each other on the same WireGuard network.
No product I could find did this, and I could never have built it (within the timeframe) without the help of AI.
But I have plenty of examples of really atrocious human-written code to show you! TheDailyWtf has been documenting the phenomenon for decades.
And this matters because? Most devs are not working on novel, never-before-seen problems.
I can name a few times when I worked on something you could consider groundbreaking (for some values of groundbreaking), and even that was usually more a combination of small pieces of work or existing ideas.
As maybe a more poignant example: I used to do a lot of on-campus recruiting when I worked in HFT, and I think I disappointed a lot of people when I told them my day-to-day was pretty mundane and consisted of banging out Jiras, usually to support new exchanges and/or securities we hadn't traded previously. 3% excitement, 97% unit tests and covering corner cases.
Not to be outdone, ChatGPT 5.2 thinking (high) only needed about 8 iterations to get a mostly-working ffmpeg conversion script for bash. It took another 5 messages to translate it to run on Windows, in PowerShell (models escaping newlines on Windows properly will be pretty much AGI, as far as I'm concerned).
Some people just hate progress.
Sure:
"The resulting compiler has nearly reached the limits of Opus’s abilities. I tried (hard!) to fix several of the above limitations but wasn’t fully successful. New features and bugfixes frequently broke existing functionality.
As one particularly challenging example, Opus was unable to implement a 16-bit x86 code generator needed to boot into 16-bit real mode. While the compiler can output correct 16-bit x86 via the 66/67 opcode prefixes, the resulting compiled output is over 60kb, far exceeding the 32k code limit enforced by Linux. Instead, Claude simply cheats here and calls out to GCC for this phase (This is only the case for x86. For ARM or RISC-V, Claude’s compiler can compile completely by itself.)"[1]
1. https://www.anthropic.com/engineering/building-c-compiler
Another example: Red Dead Redemption 2
Another one: Roller coaster tycoon
Another one: ShaderToy
You're not gonna one-shot RDR2, but neither will a human. You can one-shot particles and shader passes though.
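To be concrete about the scale I mean by 'particles': something like the self-contained loop below is a plausible single-prompt result (this is my own illustrative sketch in TypeScript, not LLM output).

    // A tiny particle burst: spawn particles with random velocities, then
    // integrate position, apply gravity, and drop anything whose life expired.
    interface Particle {
      x: number; y: number;   // position
      vx: number; vy: number; // velocity
      life: number;           // remaining lifetime in seconds
    }

    function spawnBurst(count: number, x: number, y: number): Particle[] {
      return Array.from({ length: count }, () => {
        const angle = Math.random() * Math.PI * 2;
        const speed = 50 + Math.random() * 100;
        return { x, y, vx: Math.cos(angle) * speed, vy: Math.sin(angle) * speed, life: 1 + Math.random() };
      });
    }

    function update(particles: Particle[], dt: number): Particle[] {
      for (const p of particles) {
        p.vy += 200 * dt; // simple gravity
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.life -= dt;
      }
      return particles.filter(p => p.life > 0);
    }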
Also, try building any complex effects by prompting LLMs; you won't get very far. This is why all of the LLM-coded websites look stupidly bland.
As to your second question, it is about prompting them correctly, for example [0]. Now, I don't know about you, but some of those sites, especially after using the frontend skill, look pretty good to me. If those look bland to you, then I'm not really sure what you're expecting, keeping in mind that the examples you showed with the graphics are not regular sites but more design-oriented, and even then nothing stops LLMs from producing such sites.
Edit: I found examples [0] of games with generated assets as well. These are all one-shot, so I imagine with more prompting you could get a decent game without coding anything yourself.