Making a sentence like that requires deeply understanding a problem space to the point where such sentences emerge on their own, rather than any "craft" of writing.
So the craft is thinking through a topic, usually by writing about it; then deleting everything you've written, because you arrived at the self-evident position; and then writing from the vantage point of that self-evident statement.
I feel that writing is a personal craft: you must dig it out of yourself through practice rather than learn it from others. Using AI as a resource makes this much clearer to me. You must be confident in your own writing not because it follows best practices or others' techniques, but because it is the best version of your own voice at the time it was written.
> Yes, there is a relative scale level...
> Yes, having the smartest model will...
> Yes, Chinese AI companies have ...
Yes, yes, yes. I didn't say any of that, so why write in a way that insinuates I was thinking it?
I mean, it doesn't come off as AI slop, so that's a win in 2026. But why do you think it is so good?
I think he is referring to the art of refining an idea, though, and I do have something to say about that.
I prefer to run inference on my own HW, with a harness that I control, so I can choose for myself what compromise between speed and result quality is appropriate for my needs.
With complete control, and therefore predictable performance, I can work more efficiently, even with slower HW and somewhat inferior models, than when I am at the mercy of an external provider.
I have a few other computers with 64 GB of DRAM each and with NVIDIA, Intel, or AMD GPUs. Fortunately, all that memory was bought long ago; at today's prices I could not afford to buy more.
However, just last week I started working on modifying llama.cpp to allow optimized execution with weights stored on SSDs, e.g. using a couple of PCIe 5.0 SSDs, in order to use bigger models than those that fit within 128 GB, which is the limit of what I have tested so far.
By coincidence, this week there have been a few threads on HN reporting similar work on running big models locally with weights stored on SSDs, so I believe this will become more common in the near future.
The speeds reported so far for running from SSDs range from one token every few seconds to a few tokens per second. While such speeds would be too low for a chat application, they can be adequate for a coding assistant, if the improved code that is generated compensates for the lower speed.
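For context on how running from SSD works at all: llama.cpp already memory-maps the model file by default, so pages of weights are pulled from disk on first access and evicted under memory pressure; the optimization work is mostly about access patterns and read hints. Here is a minimal Python sketch of the idea, where the file name, the per-SSD bandwidth, and the bytes read per token are all illustrative assumptions, not measurements:

```python
import mmap
import os

WEIGHTS = "weights.gguf"  # hypothetical file name, for illustration only

# Map the weights file read-only: pages are loaded from the SSD on first
# access and evicted by the kernel under memory pressure, so a model
# larger than RAM stays usable.
fd = os.open(WEIGHTS, os.O_RDONLY)
size = os.fstat(fd).st_size
weights = mmap.mmap(fd, size, prot=mmap.PROT_READ)

# Hint that access within a layer is roughly sequential, so kernel
# readahead can keep the SSD queues full (Linux-specific madvise flag).
weights.madvise(mmap.MADV_SEQUENTIAL)

# Back-of-the-envelope throughput bound: tokens/s <= SSD bandwidth
# divided by bytes read per token (every active weight is touched once).
ssd_bandwidth_gbs = 2 * 12.0  # two PCIe 5.0 SSDs, ~12 GB/s each (assumed)
active_bytes_gb = 60.0        # active weights read per token (assumed)
print(f"upper bound: {ssd_bandwidth_gbs / active_bytes_gb:.2f} tokens/s")
```

With these assumed numbers the bound comes out to 0.4 tokens/s, consistent with the range reported above; MoE models read only the active experts per token, which is what makes this approach more attractive for them.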
I'm honestly surprised how many people have subscriptions and expect Anthropic to eat the cost lol
But your article is interesting. Do you think some of the degradation is because, when I think I'm using Opus, they're invisibly giving me Sonnet?
Maybe they're serving Sonnet, or a distilled Opus, or Opus with a reduced context; I'm not quite sure. But intelligence costs compute, so less intelligence means cheaper compute.
The cost of switching is too low for them to be able to get away with the standard enshittification playbook. It takes all of 5 minutes to get a Codex subscription and it works almost exactly the same, down to using the same commands for most actions.