upvote
Why not try it yourself? Inference providers like BaseTen and AWS Bedrock have perfectly capable open source models as well as some licensed closed source models they host.

You can use "API-style" pricing on these providers which is more transparent to costs. It's very likely to end up more than 200 a month, but the question is, are you going to see more than that in value?

For me, the answer is yes.

reply
What makes you think I haven't tried it myself?

The "costs" are subsidized, it's a loss-leader.

reply
It's an important concern for those footing the bill, but I expect companies really in the face of being impacted by it to be able to do a cost-benefit calculation and use a mix of models. For the sorts of things GP described (iptables whatever, recalling how to scan open ports on the network, the sorts of things you usually could answer for yourself with 10-600 seconds in a manpage / help text / google search / stack overflow thread), local/open-weight models are already good enough and fast enough on a lot of commodity hardware to suffice. Whereas now companies might say just offload such queries to the frontier $200/mo plan because why not, tokens are plentiful and it's already being paid for, if in the future it goes to $2000/mo with more limited tokens, you might save them for the actual important or latency-sensitive work and use lower-cost local models for simpler stuff. That lower-cost might involve a $2000 GPU to be really usable, but it pays for itself shortly by comparison. To use your Uber analogy, people might have used it to get to downtown and the airport, but now it's way more expensive, so they'll take a bus or walk or drive downtown instead -- but the airport trip, even though it's more expensive than it used to be, is still attractive in the face of competing alternatives like taxis/long term parking.
reply