That's a fair point TBH. I said in my post that this LLM is first of all a learning project, and I skipped an important step: the training loop. But on the other hand, how many data scientists are writing their own training loops? Is it even worth it? And how much learning do you want from one project? I mean, where do you stop? Why use "Hugging Face Transformers" when you can write it from scratch, for learning? Why use Torch when you can write it from scratch, for learning? Why use Python when you can write it in C? And so on. It's cheating, right? In my case, I decided to skip the training loop and focus on the data processing, the hyperparameters, and the rest of the higher-level steps, which took a ton of time anyway, and that reduced the friction. I do get your point tho. Now that I know how to train an LLM, maybe I'll write a training loop from scratch as a project, to learn how to do it.

by abetusk 4 hours ago

This is like a modern form of "I could do that in a weekend". Try reading the article before making such statements.

There's a lot of pre-processing, experimentation and validation that went into this project. The training data collection and sanitization alone is a big undertaking.

As for the blog post itself, from the article:

> Note: This blog post is 100% written by me. No AI has been used whatsoever.

Put another way: you think you can just ask the LLM to do this project yourself? Please do, and share your prompt. I'd like to see it.

by JayNitram 6 hours ago

I get what you are saying, but at the same time, I was bored on a Saturday and 'vibe coded' a small VR game. Nothing special, but I had the LLM throw down a structure, and then I walked through it, looking at and thinking about why the code was placed where it was and how different things were handled. It was basically exactly like my job: jump into some okay-working legacy app with code I have never actually seen, try to get my brain around it, then personally tweak things until the app performs exactly the way I want.

by skerit 3 hours ago

I've been creating my own little from-scratch LLM for months now with Claude's help. I can safely say I learned a thing or two along the way.