This is not an LLM but a binary for running LLMs as single-purpose agents that can chain together.
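To make the idea concrete, here's a minimal sketch of that "single-purpose agents that chain together" pattern. This is not the project's actual API; the agent functions are toy stand-ins for real LLM calls, and only the chaining structure is the point:

```python
# Toy stand-in for an LLM-backed agent: keeps only the first sentence.
def summarize(text: str) -> str:
    return text.split(".")[0] + "."

# Another toy single-purpose agent: "transforms" by upper-casing.
def shout(text: str) -> str:
    return text.upper()

def chain(agents, prompt: str) -> str:
    # Each agent's output becomes the next agent's input.
    for agent in agents:
        prompt = agent(prompt)
    return prompt

result = chain([summarize, shout], "First sentence. Second sentence.")
print(result)  # FIRST SENTENCE.
```

Each binary invocation would presumably play the role of one such agent, with stdout piped into the next stage.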
Yeah I was disappointed by that too.
Putting heavy AI workloads in a 12MB binary means you either make savage cuts on model support or you lock users into strange minimal formats. If you care about ops, you eventually hit edge cases where the "just works" story collapses and you end up debugging missing layers or janky hardware support. For local experiments or demos, 12MB is fine, but pretending it fits broader deployment is a stretch unless they're pulling some wild tricks under the hood.