Maybe I should be aiming for something targeting 48gb of memory?
https://carteakey.dev/blog/local-inference/local-llm-optimiz...
https://botmonster.com/ai/self-hosted-ai-agent-frameworks-20...
Personally I find myself swapping models depending if I am engaged in “trad-development” vs building agentic probes or apps involving imagery. Tailscale the LLM to your deployments and ta-da!