Don't post generated/AI-edited comments. HN is for conversation between humans

https://news.ycombinator.com/item?id=47340079

reply
Noted, thanks. I had LLM help polishing this message, but I did the initial draft and the edits myself. Will keep it in mind for the future.
reply
That doesn't read like an AI-generated comment to me. He did mention he vibe-coded the project, but that's not against the guidelines.
reply
It's either written by an LLM, or written by someone who learned to write by reading LLM output
reply
A vibe-coded project is fine.

At least prompt your LLM to dodge the obvious tells when commenting!

reply
GPTZero says there's a 99% chance it's AI-generated.

It certainly has a lot of telltale signs.

reply
> The core insight:

That's a telltale sign of AI-written text.

reply
You need to change the title or actually include 1T-parameter model content.
reply
This is interesting work, thank you for sharing. What hardware would you buy today for experimenting? It seems like the new generation of MacBook Pros is pretty powerful?
reply
Yes, definitely. I use an M1 Max with 32 GB of RAM daily, and it's about on par, performance-wise, with the new base M5 Pro with 24 GB. You can check the benchmarks in the repo if you're interested in specific performance metrics, but investing in Apple hardware with as much memory as possible will generally get you furthest in this game.
reply
Have you ever generated access frequency statistics for the experts in these models, something like a histogram?
reply
ktransformers can do dynamic placement of experts and could presumably produce such a histogram, though currently its activation statistics are just a ".pt" file. https://github.com/kvcache-ai/ktransformers/blob/main/doc/en...

FWIW I never got it to work and did not dig into it much.
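To illustrate what such a histogram would show: once you have a trace of which experts the router picked per token, counting activations is trivial. A minimal sketch, assuming you've already extracted routing decisions as a flat list of expert IDs (the toy data below is hypothetical, not the actual ".pt" format ktransformers emits):

```python
from collections import Counter

def expert_histogram(routed_experts):
    """Count how often each expert ID was activated.

    routed_experts: iterable of expert IDs, one per (token, layer)
    routing decision. Returns a dict of expert ID -> activation count.
    """
    return dict(Counter(routed_experts))

# Toy routing trace: expert 3 is "hot", expert 7 is rarely touched.
trace = [3, 3, 1, 3, 7, 1, 3, 0, 3, 1]
hist = expert_histogram(trace)
for expert_id in sorted(hist):
    print(f"expert {expert_id}: {'#' * hist[expert_id]}")
```

A skewed histogram like this is exactly what would justify dynamic expert placement: pin the hot experts in fast memory, leave the cold ones on disk.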

reply
Why would llama with --mmap crash?
reply
This doesn't surprise me all that much; mmap support gets little attention in general and interacts poorly with GPU-side inference. (And that's despite it being the default; you don't even need to specify it as a CLI option.) OP has raised a discussion with the llama.cpp folks (https://github.com/ggml-org/llama.cpp/discussions/20852), but there's been little interest so far.
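For anyone unfamiliar with why mmap behaves differently from plain reads: mapping a file doesn't load it; each page is faulted in lazily on first access, so failures can surface mid-inference rather than at load time, and a GPU backend copying from the mapping has to cope with that. A small stdlib-only sketch of the lazy-loading semantics (nothing llama.cpp-specific):

```python
import mmap
import os
import tempfile

# Write a small stand-in "model file" to disk: four pages of zero bytes.
path = tempfile.mkstemp()[1]
with open(path, "wb") as f:
    f.write(b"\x00" * 4096 * 4)

# Map it read-only. No data is read here; pages fault in on first access.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first_byte = mm[0]  # touching a byte pulls in that page from disk
    # Best-effort prefetch hint, where the platform supports it.
    if hasattr(mmap, "MADV_WILLNEED"):
        mm.madvise(mmap.MADV_WILLNEED)
    mm.close()
os.remove(path)
print(first_byte)
```

This is also why `--no-mmap` (reading the whole file up front) can sidestep crashes at the cost of load time and resident memory.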
reply