At my previous startup: because AWS gave us a bunch of credits and helped us design the infra. It meant we ran for free what they designed for free.

At a previous bigger company, getting procurement to sign up to a new provider requires writing a business case, justifying the spend and then getting multiple competing quotes and speaking to their sales teams. Signing up to a new service takes _months_ even for $10/mo as they’ll negotiate for bulk discounts and the best possible terms for something that will literally cost less per year than one of the meetings they hold to discuss the “value”. Meanwhile on AWS I can click a button in the marketplace and it gets thrown in the AWS account, which is pre-approved spending.

reply
Many a big company migrated because they have those very same slow procurement problems with internal data centers. I saw multiple cloud migrations because internal friction was at a level that the price didn't matter: 6 months for the smallest VM kind of thing. Very adversarial relationships, often with very poor incentives, as the service setup costs for other business units were way inflated, but then the maintenance costs didn't pay enough. Paying 3x-4x more a year for just a semblance of reliability was seen as a big plus.
reply
At my current team at a “bigcorp” I have noticed a similar pattern. We use AWS not because it’s efficient in any way.

We use it because we don’t want to deal with the slow procurement process. It kills all the momentum.

reply
Exactly. I want to set up Elasticsearch - I can either have procurement go through their sales, or be up and running via the marketplace in less time than it would take me to fill in the RFQ form to send to procurement.
reply
Have seen this repeatedly also.

Watched one company end up with a $250k AWS bill when their credits expired (which they could not pay).

reply
If you let it go that far then you were going to blow it one way or another - it’s not an excuse to totally ignore the cloud spend but it’s an excuse to defer it to a later date. If you’re successful, fix it; if you’re not, then AWS aren’t getting paid anyway!
reply
Yeah, they had an impossible-to-use number of credits (YC) until they expired, so every problem became an AWS solution.

As an example, they needed a lot of proxy servers. Instead of just using a proxy service, there was a fleet of EC2 instances.

reply
If all you have is AWS credits suddenly every problem looks like EC2.
reply
I think AWS is liked because when AWS started, being able to get a new VPS up in minutes was still quite unusual. Many hosts would require about 24hr, I suspect, for getting a new VM up. At least those are some experiences I had. But nowadays, there are probably many options for getting a VM instantly.

I agree that it's overcomplicated. Although having the self-service portal also for assigning IPs is useful. But most of it seems overkill. Although, being able to detach storage from VMs and such is also quite flexible. But still.

reply
It’s flexible but slow. We ran our C++ CI/CD on AWS at a previous company, and we used spot instances with volumes attached and detached dynamically. The performance was absolutely abysmal because in effect you’re running compilation across a networked file system, no matter what AWS says your throughput is.

Our 64 core spot instances on Windows were taking 8-10x longer than our developer machines with the same core count, and a bunch of engineering went into the scaling, queue management, etc. If we’d just had a single bare metal machine from Hetzner we could have saved money _and_ reduced our iteration times.

reply
> spending $20k a month on GCP

> burning money in cloud

I suspect there are two reasons why this happens.

One is just the disassociation with opex that seems ever present in the VC model. The other is that many startups settle on an ops solution before hiring ops and the cost of switching isn't that attractive until they're faced with a dwindling runway and a down round.

reply
> many startups settle on an ops solution before hiring ops

Sounds expensive

reply
Not really. It's cheaper than hiring an IT admin and sysadmin for a while.

Those tend to be tricky hires on the small end since you tend to want a jack-of-all-trades, who either demands a premium salary or doesn't exist.

When you have 10 software engineers, having 1 dedicate 10-20% of their time is cheaper than hiring 1-2 FTEs that aren't writing code.

reply
Unless we're talking actual PaaS (Heroku, Render, Railway, etc), the cloud also needs a dedicated skillset, so "cloud" doesn't remove the need for a sysadmin.

If you can get (and trust they do it right) developers to do AWS or Kubernetes, you should be able to trust them to do conventional Linux sysadmin on a bunch of dedicated boxes.

reply
I suspect you're either severely underestimating what the cloud offers or thinking of a very narrow set of software businesses.

A full stack/backend dev is more than capable of learning both, but one of those has way more foot guns than the other.

reply
> they said they must show they are on Hyperscaling cloud.

This is the main reason; and it applies to developers (they need cloud buzzwords on their resume), it applies to managers (who in turn hire only those with said buzzwords) and it applies to company execs/CTOs who can brag about the complex (self-inflicted) problems their company is solving at the next cloud provider conference, so they can justify yet another VC round.

Run this for over a decade, and you'll end up in a situation where an entire generation of "engineers" is no longer capable of configuring a Linux box to serve some basic webapp and will make up whatever reasons to avoid even attempting to do so.

reply
It's fairly easy to set up services without worrying about pages.

I can stand something up on AWS in a couple of hours and be fairly confident it will run reliably (assuming their service offering is actually decent--some suck).

We test backups and they never fail. Metrics and logs always work.

>People are unnecessarily complicating stuff, and these clouds can go very expensive very quickly.

I don't think that's the cloud vendors' fault. They make it easy to stand up new services so people get overly enthusiastic and create convoluted architectures. Have Postgres but need full text search? OpenSearch is just a few clicks (well, hopefully IaC config..) away, let's use that! When you're building it yourself and need to set up the stack, instrument, monitor, and configure backups, the cost is high enough that you say "hey, maybe pg fts is fine for now".

reply
> I replaced all of it with Prometheus, Grafana, Loki, and most stuff from Datastore to Postgres and Mongo with replicas. I added Redis.

But now you need staffing/headcount to be experts in, and maintain, upgrade and be oncall for this stuff?

reply
We are having this dilemma right now. I’m not involved in devops, but it’s quite annoying how slow everything related to our backend is.

I think we spend around 5k on AWS, and I’m pretty sure we could be much more performant for a fraction of that price.

The problem is, who is going to set everything up? Hiring someone would for sure cost more than 5k.

reply
If you can get that 5k/month down to let's say 1k, that's a saving of 48k over the course of a year. You can get a consultant/freelancer for half that sum that'll happily do it for you.
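
A rough sketch of that arithmetic, in Python. The 5k and 1k monthly figures are the ones from this thread; the function name is made up for illustration, and the consultant fee is simply taken as "half that sum", paid once:

```python
def annual_saving(current_monthly: float, target_monthly: float) -> float:
    """Yearly saving from cutting a monthly bill (illustrative helper)."""
    return (current_monthly - target_monthly) * 12


# Figures from the thread: 5k/month today, 1k/month after the migration.
saving = annual_saving(5_000, 1_000)

# Assumption: the consultant charges half of the first year's saving, once.
consultant_fee = saving / 2
first_year_net = saving - consultant_fee

print(f"annual saving:  {saving:,.0f}")
print(f"consultant fee: {consultant_fee:,.0f}")
print(f"first-year net: {first_year_net:,.0f}")
```

Under those assumptions the migration pays for itself within the first year, and the full saving applies every year after that. It ignores ongoing maintenance time, which is the cost the parent comment is worried about.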
reply
I worked for a startup company - the founders were really nice people and had put their own money in - quite a lot of money - to get the software built for the vision they had.

By the time I joined, 18 months after development had started, a giant, complex, hideously tentacled software beast had been built that used every possible AWS service that the massive offshore team of developers could find to use.

It should have been built on a single Linux box by a single senior developer with Python and Postgres or nodejs or Ruby or whatever.

They went out of business after not too long and I couldn't help wondering if things might have been different if they hadn't spent a fortune building a giant money making machine for AWS, instead of making a web application on a Linux box.

Every AWS project I have worked on has had some significant work put into programming AWS instead of writing business functionality.

reply
> hideously tentacled software beast had been built that used every possible AWS service that the massive offshore team of developers could find to use

To be fair, if they had an AWS Solutions Architect involved, they heavily push you down this road, and if they manage to get in management's ear they'll push the idea that serverless AWS features are vastly cheaper.

If you're only responding to a handful of requests that's true, but once things ramp up you get "nickel and dimed" for everything: API Gateway requests, lambda execution time, DynamoDB read/write units, CloudWatch logs, outgoing data, step function transitions, S3 requests.

I understand all those services cost money and they shouldn't be free, but I question if paying all those micro-transactions is worse than paying for your own VMs, especially once your customers complain about the cold starts and you think you can fix it with "lambda warming".
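
To make the "cheap for a handful of requests, expensive once things ramp up" point concrete, here is a minimal break-even sketch in Python. Both prices are made-up placeholders, not real AWS rates, and the function names are hypothetical; the per-million figure stands in for all the per-request charges listed above lumped together:

```python
def serverless_monthly_cost(requests: int, cost_per_million: float) -> float:
    """Usage-based bill: grows linearly with request volume."""
    return requests / 1_000_000 * cost_per_million


def break_even_requests(vm_monthly: float, cost_per_million: float) -> float:
    """Monthly request volume at which the flat VM bill equals the usage-based bill."""
    return vm_monthly / cost_per_million * 1_000_000


# Hypothetical placeholder prices, NOT actual AWS pricing.
COST_PER_MILLION = 5.0  # gateway + function + DB + logs, per million requests
VM_MONTHLY = 100.0      # flat monthly price for self-managed VMs

print(f"break-even: {break_even_requests(VM_MONTHLY, COST_PER_MILLION):,.0f} req/month")

for requests in (1_000_000, 50_000_000, 200_000_000):
    usage = serverless_monthly_cost(requests, COST_PER_MILLION)
    cheaper = "serverless" if usage < VM_MONTHLY else "VMs"
    print(f"{requests:>11,} req/month: usage ${usage:,.2f} vs flat ${VM_MONTHLY:,.2f} -> {cheaper}")
```

Real bills are messier (free tiers, data transfer, VMs that need scaling too), so treat this as the shape of the trade-off rather than a pricing tool.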

reply
To be fair that’s an AWS problem not a Lambda problem. If you replace Lambda with EC2, the only things you save on are Lambda and Step Functions (and maybe API Gateway, but now you need to pay for a load balancer or a public IP); the rest you need to pay for anyway.
reply
The ease of getting things set up quickly and usually for free when starting up is very tempting. Later, migration is usually considered risky and not worth it because of maintenance overhead - which I would argue has become very easy.
reply
Grafana (and especially Loki) is hot garbage compared to what you get out of the box in GCP. I'm in a Grafana organization today and the sheer amount of developer and devops time it wastes is mind boggling.

You moved something from a single datastore to three different database technologies? I don't know your domain, but that sure doesn't sound like a complexity reduction.

reply
>You moved something from a single datastore to three different database technologies? I don't know your domain, but that sure doesn't sound like a complexity reduction.

What's bad about Grafana? It's simply used for some alerts and monitoring. I've used it for a really long time and it has never failed me, not even once.

It's much simpler to query Postgres or Mongo compared to duplicating data dozens of times on Datastore.

reply
The UX is dreadful? I could spend hours picking apart the terrible design decisions. After many years with the comparative luxury of the stackdriver tools on GCP, I moved to a BigCorp with a grafana/loki/mimir system and a whole devops army to maintain it. For two years I've been unable to find anything positive to say about this experience. Our devops folks are super smart, so I can only conclude that the software sucks.

I can't really judge your database choices; I don't know your specific problems, we're just trading quips on the internet for fun. But man oh man grafana is disappointing.

reply
This isn’t a like for like comparison though, is it.

You removed all of their logging and all of their redundancy and reliability and replaced it with shitters that will all explode if the small provider’s one data centre goes down.

And if someone penetrates this mega server, they’ll be able to wipe all your logs or tamper with them, to hide the attack.

If your storage servers go down, everything they have is gone. And these providers don’t offer the finest hardware. How do you know all of those drives aren’t from the same batch? They will be, because they’re a bulk buyer with a single data centre.

reply
>You removed all of their logging and all of their redundancy and reliability and replaced it with shitters that will all explode if the small provider’s one data centre goes down.

They'll never need it; a misconfiguration on those services ends up costing several grand.

>If your storage servers go down, everything they have is gone

It’s just logs for an app server, not some banking critical info that will cause a panic if lost. Most of what they are using for logging is for finding some errors, not for mission-critical things which must not be lost.

reply
> How do you know all of those drives aren’t from the same batch?

Because it's explicitly something you can request when doing your server order from your vendor. In this particular case several years ago, Nutanix did good.

reply
Credits. It wouldn't make sense without free credits. And once you are hooked, good luck moving out.
reply