by astlouis44 19 hours ago

by 0x62 19 hours ago

FWIW I've been experimenting with Three.js and AI for the last ~3 years, and noticed a significant improvement in 5.4 - the biggest single generation leap for Three.js specifically. It was most evident in shaders (GLSL), but also apparent in structuring of Three.js scenes across multiple pages/components.

It still struggles to create shaders from scratch, but is now pretty adequate at editing existing shaders.

In 5.2 and below, GPT really struggled with "one canvas, multiple page" experiences, where a single background canvas stays mounted and keeps rendering across routes. In 5.4, it still takes a bit of hand-holding and frequent refactor/optimisation prompts, but is a lot more capable.

Excited to test 5.5 and see how it is in practice.

by CSMastermind 19 hours ago

> It still struggles to create shaders from scratch

Oh just like a real developer

by accrual 18 hours ago

Much respect for shader developers; it's a different way of thinking and programming.

by Pym 15 hours ago

One struggle I'm having (with Claude) is that most of what it knows about Three.js is outdated. I haven't used GPT in a while, is the grass greener?

Have you tried any skills like cloudai-x/threejs-skills that help with that? Or built your own?

by import 16 hours ago

I'm using Claude in the same context, and it has been doing really well with GLSL since about last September.

by dataviz1000 17 hours ago

LLMs cannot do spatial reasoning. I haven't tried with GPT; however, Claude cannot solve a Rubik's Cube no matter how much prompt engineering I try. I got Opus 4.6 to solve ~70% of the puzzle, but then it got stuck. At $20 a run, it is prohibitively expensive.

The point is that if we can prompt an LLM to reason about 3 dimensions, we will likely be able to apply that to math problems it can't currently solve.

I should release my Rubik's Cube MCP server with the challenge to see if someone can write a prompt to solve a Rubik's Cube.

by variodot 3 hours ago

I’ve had a similar experience building a geometry/woodworking-flavored web app with Three.js and SVG rendering. It’s been kind of wild how quickly the SOTA models let me approach a new space in spatial development and 3D rendering (or SA optimization approaches, for that matter). That said, there are still easy "3D app" mistakes they make, like flipping the z-axis or misreading coordinate conventions. But these models make similar mistakes with CSS and page awareness. Both require good verification loops to be effective.
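As a rough illustration of the kind of verification loop meant here, the sketch below converts points from a Z-up right-handed frame (common in CAD-style data) to the Y-up right-handed frame that Three.js uses, and checks that a conversion does not silently mirror the scene. The assumption that the source data is Z-up, and all function names, are hypothetical and chosen only for this example.

```python
# Sketch only: the Z-up source frame and these function names are assumptions
# made for illustration, not part of any library.

Vec3 = tuple[float, float, float]


def z_up_to_y_up(p: Vec3) -> Vec3:
    """Map (x right, y forward, z up) to (x right, y up, z toward viewer)."""
    x, y, z = p
    return (x, z, -y)


def y_up_to_z_up(p: Vec3) -> Vec3:
    """Inverse of z_up_to_y_up."""
    x, y, z = p
    return (x, -z, y)


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Standard 3D cross product."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def preserves_handedness(convert) -> bool:
    """A conversion keeps handedness if the converted basis still satisfies
    x cross y == z. A mirrored (flipped-axis) conversion fails this check."""
    ex = convert((1.0, 0.0, 0.0))
    ey = convert((0.0, 1.0, 0.0))
    ez = convert((0.0, 0.0, 1.0))
    return cross(ex, ey) == ez
```

A check like `preserves_handedness` is cheap enough to run in a unit test, so a model-generated conversion that negates a single axis (for example `(x, y, -z)`) is caught before it reaches the renderer.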

by dataviz1000 3 hours ago

I think there is a pattern. It has a hard time with temporal and spatial reasoning.

Temporal. I had a research project where the LLM had no concept of preventing data from the future from leaking in. I eventually had to create a wall clock and an agent that would step through every line of code and, by writing out that line's logic, confirm that no data from beyond the wall clock could leak in.
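A minimal sketch of the wall-clock idea, under assumptions: the `Observation`, `WallClock`, and `visible` names are hypothetical and this is not the commenter's actual setup. The point is only that every read goes through a filter that hides anything timestamped after the simulated "now".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """One timestamped data point (hypothetical record type)."""
    timestamp: datetime
    value: float


class WallClock:
    """Simulated 'now' that can only move forward."""

    def __init__(self, start: datetime) -> None:
        self.now = start

    def advance_to(self, t: datetime) -> None:
        if t < self.now:
            raise ValueError("wall clock cannot move backwards")
        self.now = t


def visible(observations: list[Observation], clock: WallClock) -> list[Observation]:
    """Return only the observations that exist at the clock's current time,
    so code under test never sees data from the future."""
    return [o for o in observations if o.timestamp <= clock.now]
```

Routing all data access through something like `visible` turns a leak into a structural impossibility rather than something a reviewer (human or agent) has to spot line by line.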

Spatial. I created a canvas for rendering a thinking model's attention and feedforward layers for data visualization animations. It was having a hard time working with it until, after searching GitHub repositories, I pointed Opus 4.7 to some ancient JavaScript code [0] about projecting 3D to 2D. It worked perfectly with pan and zoom in one shot after that.

No matter how hard I tried, I couldn't get it to stack all the layers correctly. It must have merely remembered all the parts for projecting 3D to 2D, because it could not figure out how to position the layers.
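For readers unfamiliar with the 3D-to-2D step being discussed, here is a generic pinhole-projection sketch. It is not taken from the linked code; the function name, the parameters, and the assumption of a camera that looks down +z with no rotation are all illustrative.

```python
from typing import Optional

Vec3 = tuple[float, float, float]


def project(
    point: Vec3,
    camera: Vec3,
    focal_length: float,
    width: int,
    height: int,
) -> Optional[tuple[float, float]]:
    """Project a world-space point to screen pixels.

    Assumes the camera looks along +z and is not rotated (a simplification
    made for this sketch).
    """
    # Translate into camera space.
    cx = point[0] - camera[0]
    cy = point[1] - camera[1]
    cz = point[2] - camera[2]
    if cz <= 0:
        return None  # at or behind the camera plane, nothing to draw

    # Perspective divide: farther points shrink toward the screen centre.
    scale = focal_length / cz
    sx = width / 2 + scale * cx * width / 2
    sy = height / 2 - scale * cy * height / 2  # screen y grows downward
    return (sx, sy)
```

Stacking layers in a visualization like the one described would then come down to giving each layer its own z value before projecting, which is exactly the positioning step the model reportedly could not work out.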

There is a ton of information burnt into the weights during training, but it cannot reason about it. When it does work well with spatial and temporal problems, it is more sleight of hand than being able to generalize.

People say, why not just do reinforcement learning? That can't generalize in the same way an LLM can. I'm thinking about doing the Rubik's Cube because if people can solve that, it might open up solutions for working on temporal and spatial problems.

[0] https://jakesgordon.com/writing/javascript-racer-v1-straight...

by embedding-shape 16 hours ago

> I should release my Rubik's Cube MCP server with the challenge to see if someone can write a prompt to solve a Rubik's Cube.

Do it, I'm game! You nerdsniped me immediately and my brain went "That sounds easy, I'm sure I could do that in a night" so I'm surely not alone in being almost triggered by what you wrote. I bet I could even do it with a local model!

by versteegen 9 hours ago

Interesting (would like to hear more), but solving a Rubik's Cube would appear to be a poor way to measure spatial understanding or reasoning. Ordinary human spatial intuition lets you think about how to move a tile to a certain location, but not really how to make consistent progress towards a solution; what's needed is knowledge of solution techniques. I'd say what you're measuring is 'perception' rather than reasoning.

by William_BB 7 hours ago

> what's needed is knowledge of solution techniques

That's definitely in the training data

by Melatonic 15 hours ago

What about a model designed for robotics and vision? Seems like an LLM trained on text would inherently not be great for this.

DeepMind's other models, however, might do better?

by holoduke 2 hours ago

I bet I can even do it with the smallest gemma 4 model using a prompt of max 500 characters.

by snet0 16 hours ago

How are you handing the cube state to the model?

by dataviz1000 16 hours ago

Does this answer the question?

Opus 4.6 got the cross and started to get several pieces on the correct faces. It couldn't reason past this. You can see the prompts and all the turn messages.

https://gist.github.com/adam-s/b343a6077dd2f647020ccacea4140...

edit: I can't reply to the message below. The point isn't whether we can solve a Rubik's Cube with a Python script and tool calls. The point is whether we can get an LLM to reason about moving things in 3 dimensions. The prompt is a puzzle in the way that a Rubik's Cube is a puzzle. A 7-year-old child can learn 6 moves and figure out how to solve a Rubik's Cube in a weekend, yet the LLM can't solve it. However, given the correct prompt, can an LLM solve it? The prompt is the puzzle. That is why it is fun and interesting. Plus, it is a spatial problem, so if we solve that, we solve a massive class of problems, including huge swathes of mathematics the LLMs can't touch yet.

by libraryofbabel 9 hours ago

I wonder if the difficulties LLMs have with “seeing” complex detail in images are muddying the problem here. What if you hand it the cube state in text form? (You could try ASCII art if you want a middle ground.)

If you want to isolate the issue, try getting the LLM itself to turn the images into a text representation of the cube state and check for accuracy. If it can’t see state correctly it certainly won’t be able to solve.
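One possible way to set up that experiment, sketched under assumptions: the face order, the one-letter colour codes, and every function name below are hypothetical choices for illustration, not the format used in the gist above. The cube is stored as six lists of nine stickers, rendered as an unfolded text net, and a model's transcription is scored sticker by sticker against the ground truth.

```python
from collections import Counter

FACES = "ULFRBD"  # up, left, front, right, back, down (assumed ordering)

CubeState = dict[str, list[str]]


def solved_cube() -> CubeState:
    """Solved cube with an assumed colour scheme: white up, green front."""
    colors = dict(zip(FACES, "WOGRBY"))
    return {face: [colors[face]] * 9 for face in FACES}


def render_net(state: CubeState) -> str:
    """Render the cube as an unfolded cross-shaped net of sticker letters."""

    def row(face: str, r: int) -> str:
        return "".join(state[face][3 * r : 3 * r + 3])

    lines = ["    " + row("U", r) for r in range(3)]
    lines += [" ".join(row(f, r) for f in "LFRB") for r in range(3)]
    lines += ["    " + row("D", r) for r in range(3)]
    return "\n".join(lines)


def is_plausible(state: CubeState) -> bool:
    """Necessary (not sufficient) sanity check: six colours, nine of each."""
    counts = Counter(s for face in FACES for s in state[face])
    return len(counts) == 6 and all(n == 9 for n in counts.values())


def transcription_accuracy(truth: CubeState, guess: CubeState) -> float:
    """Fraction of the 54 stickers the transcription got right."""
    matches = sum(
        t == g for face in FACES for t, g in zip(truth[face], guess[face])
    )
    return matches / 54
```

If the model's image-to-text transcription scores below 1.0 here, the failure is in perception, and there is little point in evaluating its solving ability on that input.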

by osti 15 hours ago

Can't they write a script to solve Rubik's Cubes?

by Jensson 10 hours ago

That doesn't test whether the model can follow and execute a dynamic plan reliably.

by Torkel 16 hours ago

*yet

by vunderba 19 hours ago

I’ve had a lot of success using LLMs to help with my Three.js based games and projects. Many of my weird clock visualizations relied heavily on it.

It might not be a game engine, but it’s the de facto standard for doing WebGL 3D. And since it’s been around forever, there’s a massive amount of training data available for it.

Before LLMs were a thing, I relied more on Babylon.js, since it’s a bit higher level and gives you more batteries included for game development.

by peder 14 hours ago

> It really seems like we could be at the dawn of a new era similar to Flash

We've been there for a while... creativity has been the primary bottleneck.

by kingstnap 19 hours ago

The meshes look interesting, but the gameplay is very basic. The tank one seems more sophisticated with the flying ships and whatnot.

What's strange is that this Pietro Schirano dude seems to write incredibly cargo cult prompts.

  Game created by Pietro Schirano, CEO of MagicPath

  Prompt: Create a 3D game using three.js. It should be a UFO shooter where I control a tank and shoot down UFOs flying overhead.
  - Think step by step, take a deep breath. Repeat the question back before answering.
  - Imagine you're writing an instruction message for a junior developer who's going to go build this. Can you write something extremely clear and specific for them, including which files they should look at for the change and which ones need to be fixed?
  -Then write all the code. Make the game low-poly but beautiful.
  - Remember, you are an agent: please keep going until the user's query is completely resolved before ending your turn and yielding back to the user. Decompose the user's query into all required sub-requests and confirm that each one is completed. Do not stop after completing only part of the request. Only terminate your turn when you are sure the problem is solved. You must be prepared to answer multiple queries and only finish the call once the user has confirmed they're done.
  - You must plan extensively in accordance with the workflow steps before making subsequent function calls, and reflect extensively on the outcomes of each function call, ensuring the user's query and related sub-requests are completely resolved.

by torginus 18 hours ago

It's weird how people pep talk the AI - if my Jira tickets looked like this, I would throw a fit.

I guess these people think they have special prompt engineering skills, and that doing it like this is better than giving the AI a dry list of requirements (fwiw, they might even be right).

by mattgreenrocks 17 hours ago

It’s not surprising to me that the same crowd that cheers for the demise of software engineering skills invented its own notion of AI prompting skills.

Too bad they can veer sharply into cringe territory pretty fast: “as an accomplished Senior Principal Engineer at a FAANG with 22 years of experience, create a todo list app.” It’s like interactive fanfiction.

by dr_kiszonka 12 hours ago

That's quite similar to AI Studio's prompt: "You are a world-class frontend engineer..."

by eiksjs 17 hours ago

Indeed it is so utterly cringe.

by eloisant 15 hours ago

Yes, this is cargo cult.

This reminds me of so-called "optimization" hacks that people keep applying years after their languages have improved enough to make them unnecessary or even harmful.

Maybe at one point it helped to write prompts in this weird way, but with all the progress going on in both the models and the harnesses, if it's not obsolete yet, it soon will be. It's just cruft that consumes tokens and fills the context window for nothing.

by irthomasthomas 19 hours ago

> Think Step By Step

What is this, 2023?

I feel like this was generated by a model tapping into 2023 notions of prompt engineering.

by retr0rocket 18 hours ago

[dead]

by skirano 18 hours ago

Pietro here, I just published a video of it: https://x.com/skirano/status/2047403025094905964?s=20

by tantalor 19 hours ago

It comes across as an elaborate, sparkly motivational cat poster.

*BELIEVE!* https://www.youtube.com/watch?v=D2CRtES2K3E

by skolskoly 14 hours ago

https://m.media-amazon.com/images/I/71MTbRmLY8L._AC_UF894,10...

by bredren 18 hours ago

The prompt did not specify advanced gameplay.

I do not see instructions to assist in task decomposition and agent ~"motivation" to stay aligned over long periods as cargo culting.

See up thread for anecdotes [1].

> Decompose the user's query into all required sub-requests and confirm that each one is completed. Do not stop after completing only part of the request. Only terminate your turn when you are sure the problem is solved.

I see this as a portrayal of the strength of 5.5, since it suggests the ability to be assigned this clearly important role to ~one shot requests like this.

I've been using a cli-ai-first task tool I wrote to process complex "parent" or "umbrella" tasks into decomposed subtasks and then execute on them.

This has allowed my workflows to float above the ups and downs of model performance.

That said, having the AI do the planning for a big request like this internally is not good outside a demo, because you want the AI's planning to be part of the historical context and available for forensics when stalls, unwound details, or other unexpected issues come up at any point along the way.

[1] https://news.ycombinator.com/item?id=47879819

by ahoka 18 hours ago

"take a deep breath"

OMFG

by jameshart 9 hours ago

Claude would check to see if it had any breathing skills; if it didn't find any, it would start installing npm modules for breathing.

by mindhunter 16 hours ago

A friend is building Jamboree[1] (prev name "Spielwerk") for iOS. An app to build and share games. They're all web based so they're easy to share.

[1] https://apps.apple.com/uz/app/jamboree-game-maker/id67473110...

by nemo44x 15 hours ago

It’s like all these things though: it’s not a real production-worthy product. It’s a super-demo. It looks amazing until you realize there are many months of work left to make it something of quality and value.

I think people are starting to catch on to where we really are right now. Future models will be better, but we are entering a trough of disillusionment, and this attitude will be widespread in a few months.

by ZeWaka 19 hours ago

I personally don't think the gameplay itself is that impressive.

by gregpred 19 hours ago

[flagged]
