upvote
If you have a model that only know how to model CAD but also doesn't know history, and was trained on visual language of said history, how is it supposed to be able to model the Pantheon in the first place? It'd only be able to model exactly what you can describe with text, or even worse, exactly what it'd be able to visually extract from images via the vision encoders, for "vision models", but it'd be a far cry from what you see in this blogpost, would be my guess.
reply
> In future are we going to have same model for everything?

A model that knows more in general, will often be better at specific tasks. e.g. If you ask a model to "make a program that estimates the annual production of a solar installation", it needs to have been trained on a lot more than just Python code.

reply
> A model that knows more in general, will often be better at specific tasks

Is this your hypothesis or broad conclusion among AI experts?

reply
You might combine a general world model with a python coding model in that case. Not sure if it's better, just saying.
reply
What's the difference between a "general world model combined with a python coding model" and a multimodal LLM?
reply