undefined

points

[-]

Yeah it's just a semantic pet peeve. Let me ask you this: What is a "Language Model", if this is a "Large Language Model"? Inversely, if a 1.5B model is "Large" then what is the recent 1T param models? "Superlarge"?

In my own very humble opinion, it becomes "Large" when it's out of non-specialized hardware. So currently, a model which requires more than 32GB vram is large (as that's roughly where the high-end gaming GPUs cut off).

And btw, there is no way you can train a language model on a CPU, even with ddr5, lest you wait a whole week for a single training cycle. Give it a go! I know I did, it's a magnitude away from being feasible.

by joefourier50 minutes ago|

parent|

[-]

Calling anything "large" in computing is problematic since hardware keeps improving. GPT-1 was an LLM in 2017 and had 117M parameters, when did it stop being large?

GPT would have been a better term than LLM, but unfortunately became too associated with OpenAI. And then, what about non-transformer LLMs? And multimodal LLMs?

Maybe we should just give up, shrug and call it "AI".

by Malcolmlisk6 hours ago|

prev|

[-]

Then rewrite the title and call it "learn how to do a non usable llm from scratch"

by improbableinf6 hours ago|

parent|

[-]

Opus 4.7 is non-usable for the tasks I have — but it’s considered an LLM.

And no one is stopping anyone from tweaking few parameters in this repo to go above 10M parameters.

by skinfaxi1 hours ago|

parent|

[-]

What tasks is it non-usable for?