That said, I need to clarify: the content was not written by AI, and certainly not generated from a database in one shot. If there's some agent + prompt that can produce what I wrote, I'd love to learn it—it would've saved me two weekends :)
Before addressing your questions further, some context: I'm a developer with no ML background but plenty of Cloud Infra experience. I'm currently building an open-source AI Infra project, which is why I studied nano-vllm. So my writing reflects some gaps in ML knowledge.
To your specific points:
> it goes into (nano)vLLM internals and doesn't mention PagedAttention once
I didn't find any explicit "paged attention" naming in nano-vllm. After reading the first article you linked—specifically the "Paged KV Caching" section—I believe the block management logic and CPU/GPU block mapping it describes are exactly what I covered in both posts. It may not be the full picture of paged attention, but I interpreted what I saw in the code and captured the core idea. I think that's a reasonable outcome.
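For readers who haven't seen the block-management idea, here is a minimal sketch of it: a block table maps each sequence's logical token positions onto fixed-size physical KV blocks drawn from a shared pool. All names, sizes, and methods here are illustrative, not nano-vllm's actual API.

```python
# Sketch of paged KV-cache block management (illustrative, not nano-vllm's API).
BLOCK_SIZE = 4  # tokens per KV block; real systems typically use 16 or more

class BlockManager:
    def __init__(self, num_blocks: int):
        self.free = list(range(num_blocks))  # pool of free physical block ids
        self.tables = {}                     # seq_id -> list of physical block ids

    def append_token(self, seq_id: int, pos: int) -> int:
        """Ensure a physical block backs token `pos`; return its block id."""
        table = self.tables.setdefault(seq_id, [])
        if pos // BLOCK_SIZE == len(table):  # crossed into a new logical block
            table.append(self.free.pop())    # grab any free physical block
        return table[pos // BLOCK_SIZE]

    def free_seq(self, seq_id: int):
        """Return a finished sequence's blocks to the shared pool."""
        self.free.extend(self.tables.pop(seq_id, []))

mgr = BlockManager(num_blocks=8)
blocks = [mgr.append_token(seq_id=0, pos=p) for p in range(6)]
assert len(set(blocks)) == 2  # 6 tokens at BLOCK_SIZE=4 span 2 physical blocks
mgr.free_seq(0)
assert len(mgr.free) == 8     # all blocks returned to the pool
```

The point is that physical blocks need not be contiguous or in order; the per-sequence table provides the indirection, which is what makes swapping blocks between CPU and GPU straightforward.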
> Part 2 will cover dense vs MoE's, which is weird because nanovllm hardcodes a dense Qwen3 into the source
This reflects my learning approach and background. Same as point 1—I may not have realized the block design was the famous PagedAttention implementation, so I didn't name it as such. For point 2, seeing a dense Qwen3 naturally made me wonder how it differs from the xx-B-A-yy-B MoE models I'd seen on Hugging Face—specifically what changes in the decoder layers. That curiosity led me to learn about MoE and write it up for others with the same questions.
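To make that dense-vs-MoE contrast concrete, here is a toy sketch of what changes in the FFN part of a decoder layer: a dense layer sends every token through one MLP, while an MoE layer routes each token to its top-k experts and mixes their outputs with gate weights. Dimensions, weights, and routing here are made up for illustration; real models use learned gates and far larger shapes.

```python
# Toy contrast: dense FFN vs top-k routed MoE FFN (illustrative shapes only).
import numpy as np

rng = np.random.default_rng(0)
d, num_experts, top_k = 8, 4, 2

# Per-expert MLP weights and a router ("gate") projection.
experts = [(rng.standard_normal((d, d)), rng.standard_normal((d, d)))
           for _ in range(num_experts)]
gate = rng.standard_normal((d, num_experts))

def ffn(x, w1, w2):
    return np.maximum(x @ w1, 0.0) @ w2  # simple ReLU MLP

def dense_layer(x):
    # Dense model: every token goes through the same single FFN.
    w1, w2 = experts[0]
    return ffn(x, w1, w2)

def moe_layer(x):
    # MoE: router scores pick top_k experts; outputs are combined
    # with softmax-normalized gate weights over the chosen experts.
    scores = x @ gate
    top = np.argsort(scores)[-top_k:]
    w = np.exp(scores[top])
    w /= w.sum()
    return sum(wi * ffn(x, *experts[i]) for wi, i in zip(w, top))

x = rng.standard_normal(d)
assert dense_layer(x).shape == moe_layer(x).shape == (d,)
```

This is why the "xx-B-A-yy-B" naming shows up: the total parameter count covers all experts, while only the activated experts' parameters are used per token.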
---
I completely understand that in this era, people care more about whether what they're reading is AI-generated—no one wants to waste time on low-effort slop with no human involvement.
But as I explained above—and as my hand-drawn Excalidraw diagrams show (I haven't seen an LLM produce diagrams with logic that satisfies me)—this is the result of learning shaped by my own knowledge background and preferences.
https://news.ycombinator.com/item?id=46858409
But this was never a problem before, and now we have to distinguish between LLM-generated, human-generated, and LLM-polished-but-human-written text. I'd much prefer it if people just wrote their own text, warts and all.
No offense intended to @yz-yu, by the way. I miss the times when more people wrote in an eccentric style -- like Steve Yegge -- but that doesn't detract from what you wrote.
So let me start from @jbarrow's comment: "AI written, generated from the codebase."
My actual learning process looked like this:
1. I walked through the nano-vLLM codebase, asking Claude Code some high-level questions to warm up.
2. Then I asked detailed questions one by one, let it explore, and double-checked the code myself. As someone without an ML background, it sometimes took hours to understand a single concept.
3. Once I felt I understood enough, I started drawing Excalidraw diagrams to explain what I learned.
Does this count as "generated from the codebase"? I don't think so.
Where we might disagree is the writing process.
As a non-native English speaker, my workflow looks like this:
1. Write a short paragraph (<100 words), then ask my writing agent to "fix this for readability and grammar."
2. Review the output. *If it changes any technical meaning, I correct it.* I consider this a responsible way to write a tech blog.
3. Move to the next paragraph.
Is this "AI-written"? I'd call it "AI-assisted." Every idea in every sentence is mine. Honestly, things like "em dashes" never stood out to me when reviewing. I suspect that's common for non-native speakers.
I wrote this comment the same way. The LLM fixed 14 grammar mistakes that I think would distract readers more than any LLM-ish phrasing.
That said, I'm open to suggestions on how to improve my writing process :)
To be honest, most native readers wouldn't register grammar errors, full stop.
I guess I'm more in awe of people who speak a foreign language at all than of anyone piping it through some agent malarkey.
Hasn't made me change the way I write, though. Especially because I never actually type an em dash character myself. Back when I started using computers, we only had ASCII, so I got used to writing with double dashes. Nowadays, a lot of software is smart enough to convert a double dash into an em dash. Discourse does that and that's how I ended up being accused of being an AI bot.
The contrast might become even greater because some humans who did use them have stopped, to avoid false accusations.
So if you're being accused of just spewing AI, then double down and spew what looks EVEN MORE like AI. What are you even doing?
Also, if a single character is how you're red-flagging LLM output, do you know how easy it is to avoid? I didn't use one here at all, but how do you know I didn't run this through some slop-machine to tighten my prose? It's a really low-effort take to say "just avoid em dashes so we know you're not an AI".
https://www.mcsweeneys.net/articles/the-em-dash-responds-to-...