upvote
EDIT:

After I have written the above, that the future cost of externally-hosted AI coding assistants is unpredictable, what I have written was confirmed by an OpenAI press release that the existing Codex users will be migrated during the following weeks towards token-based pricing rates.

Such events will not affect you if you use an open-weights assistant running on your own HW, when you do not have to care about token usage.

reply
You do have to care about token usage when chosing how to scale your hardware. If you do a negligible amount of AI inference for occasional simple Q&A (which is what most people do), you can get away with a very lean and cheap setup even when running very large, sophisticated models. Agentic use with function calls and responses etc. raises the amount of tokens you use over time by at least one order of magnitude.
reply