Cloud services like to present the illusion of infinite compute available at a fixed price per unit, but in reality, if you try to use too much of any service, you'll find you have a quota, and requests to increase it will fall on deaf ears if the provider doesn't have more of that resource.
Too much of my working life has been spent shoehorning services into less space/compute/RAM/spindles, or migrating to other data centers, to solve such issues.
Having said that, I agree with you. You have to request limit increases often and can't scale even in those instances if you don't plan ahead.
There has to be a name for this deceptive marketing tactic where you say something is unlimited and then it is only unlimited as long as you don't use very much.
It would be one thing if you occasionally got a "no more capacity" error when requesting large amounts of resources but it doesn't work that way. They confine you to a relatively small amount of resources the entire time you have an account. If you want more you have to request it.
The cloud sure isn't for the tiny blog, but the tiny blog also isn't the cloud's main customer.
> it's 20% more than you are currently using and you pay 300% more for that.
I'm assuming you are comparing to self-hosting. Then you need to account for things that are difficult to put a price on, like your time maintaining physical infrastructure and the lessons you will learn from it.
Sounds like I'm defending big cloud, but there is a valid use case that gets overlooked because it's trendy to hate on the cloud.
> They confine you to a relatively small amount of resources the entire time you have an account. If you want more you have to request it.
It's a form of KYC, nothing wrong with that.
Like literally 10x more expensive to do so, to run CI jobs...
I don't want to imagine the margin AWS has in general, because it could easily be 90% too.
I assume you're using your own server and not a provider like Hetzner? So you did have a substantial delivery time. Although in my city there is a recycler that resells used servers, and I could show up there with a truck and get a server within hours if I'm not too picky. Or use some random desktop or laptop off the pile, short-term.
Right now the biggest issue is that the vibe-coded CI program is not really meant to be a distributed multi-node thing yet, so we're on the biggest machines (there's some newer, bigger stuff we could migrate to), and the only issue is that at peak hours the queue can get a bit slow... but that was also down to some other bugs etc. making things not ideal.
Tbh it works pretty well, we just need to scale it to more than one node now (which is not to say that is easy, but still, 10x headroom to work with).
When Anthropic accuse Alibaba of distilling their models, you have to run that by a reality check of what is actually possible.
1) You can use another model as "LLM as judge" to rate alternative outputs that your own model has generated. Useful data perhaps, but certainly not distillation.
2) If what you are interested in are the reasoning steps (that are hidden from you) that arrived at an answer, not the answer itself, then you can try to train a model to guess what those steps were (this is a published technique). This may be better than nothing, but hardly distillation if it's your model that is suggesting the reasoning!
3) Depending on the model, you may be able to prompt engineer it to reveal its reasoning, not just show a summary, but this should be very obvious. Anthropic cite this as something they have seen. This would be useful data if you can get it (presumably they've now done a better job of preventing it), but at the end of the day all you'd be getting is some training data cheaper than if you'd had to create it by hand.
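To make point 1 concrete, here's a rough sketch of what "LLM as judge" looks like as a data pipeline. Everything in it is made up for illustration: `Judge`, `build_preference_pair` and `toy_judge` are hypothetical names, and the toy judge is a stand-in scoring rule rather than a call to any real model. The thing to notice is that every candidate text comes from your own model; the judge only contributes a score.

```python
from typing import Callable, List, Tuple

# Hypothetical judge signature: (prompt, candidate) -> score.
# In practice this would wrap a call to some other model; here it is a stand-in.
Judge = Callable[[str, str], float]


def build_preference_pair(
    prompt: str, candidates: List[str], judge: Judge
) -> Tuple[str, str]:
    """Return (chosen, rejected): the judge's highest- and lowest-rated candidates."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a pair")
    # Score each of our own model's outputs and sort best-first.
    ranked = sorted(candidates, key=lambda c: judge(prompt, c), reverse=True)
    return ranked[0], ranked[-1]


def toy_judge(prompt: str, candidate: str) -> float:
    # Placeholder rule (longer answer scores higher), NOT a real model call.
    return float(len(candidate))


if __name__ == "__main__":
    own_outputs = ["a", "abc", "ab"]  # pretend these came from your own model
    chosen, rejected = build_preference_pair("some prompt", own_outputs, toy_judge)
    print(chosen, rejected)
```

What comes out is a ranking over text you generated yourself, which is why I'd call it useful preference data rather than distillation.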