> Is this not what happens in most SaaS?

I think it's fairly popular to try to do more logical isolation in SaaS now, especially as VM-scheduling-as-a-service becomes more common. For example, I did security architecture at a company that did relatively simple financial processing. We worked to move to a model where customer documents were encrypted with a tenant key, which we'd then wrap in both a service key and a login key. Users could only get the login key stapled to their session by authenticating against that account, and the processing jobs ran on a cloud vendor's logical isolation. So the user needed a login key, the service needed the attested service key, and the job ran in what amounted to a mini-VM. That avoided issues like "whoops, we sent the wrong document ID and the backend gave it back to us" or "whoops, we routed the request to the wrong tenant backend!" This level of isolation would be really hard to achieve in an LLM vendor context.
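To make the key layering concrete, here's a rough sketch of one possible reading of it, where the tenant key is wrapped first under the login key and then under the service key, so neither party alone can recover it. The nesting order, the function names (`seal`, `open_sealed`, `read_document`), and the toy SHA-256 keystream plus HMAC construction are all assumptions for illustration, not the real system; production code would use a vetted AEAD or key-wrap primitive from a KMS or crypto library.

```python
import hashlib
import hmac
import secrets

# Toy authenticated encryption built from the stdlib, for illustration only.
# NOT secure for real use; a real system would use a vetted AEAD / key wrap.

def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC subkeys from one key.
    return hashlib.sha256(label + key).digest()

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt-then-MAC: returns nonce || ciphertext || tag."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    ct = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify the tag, then decrypt; raises ValueError on a wrong key."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered blob")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, stream))

def read_document(service_key: bytes, login_key: bytes,
                  wrapped_tenant_key: bytes, sealed_doc: bytes) -> bytes:
    # Both keys are required: the service peels the outer layer,
    # the user's session key peels the inner one.
    inner = open_sealed(service_key, wrapped_tenant_key)
    tenant_key = open_sealed(login_key, inner)
    return open_sealed(tenant_key, sealed_doc)

# Hypothetical setup for one tenant.
tenant_key = secrets.token_bytes(32)
login_key = secrets.token_bytes(32)
service_key = secrets.token_bytes(32)
wrapped_tenant_key = seal(service_key, seal(login_key, tenant_key))
sealed_doc = seal(tenant_key, b"invoice for tenant A")
```

With this shape, a request routed to the wrong tenant fails at the unwrap step instead of silently returning someone else's document, because the session's login key won't open that tenant's wrapped key.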

> I don't see why they would not hire the best to implement these relatively boring/solved things correctly at an architectural level.

I think a lot of these things develop over time. Obviously hiring people who have done them before helps, but it's hard: even people with strong experience often only know little slices. And unfortunately, every system operating at these scales has emergent behavior that becomes really challenging. Mistakes like "we used hash(id) as a key in a memory cache without a collision list, and it collided," which would simply never affect most startups, become more and more frequent at scale. A high rate of change makes it hard to suss these mistakes out and root-cause them, too: "a customer gave us a log where we swapped X and Y" is hard to bisect when you're doing 500 code deploys a day.
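For anyone who hasn't hit the hash(id) bug, here's a minimal sketch of the failure mode. The class names and the deliberately weak `bucket_key` hash are made up so the collision is reproducible; a real cache would collide far more rarely, which is exactly why it only shows up at scale.

```python
from typing import Optional

def bucket_key(doc_id: str, buckets: int = 8) -> int:
    # Deliberately weak, deterministic hash (hypothetical) so that
    # "ab" and "ba" land in the same slot.
    return sum(doc_id.encode()) % buckets

class UnsafeCache:
    """Stores values under hash(id) only, with no collision list."""
    def __init__(self) -> None:
        self._slots = {}

    def put(self, doc_id: str, value: str) -> None:
        self._slots[bucket_key(doc_id)] = value

    def get(self, doc_id: str) -> Optional[str]:
        # Never checks that the stored entry belongs to doc_id.
        return self._slots.get(bucket_key(doc_id))

class SafeCache:
    """Keeps the full id next to each value and compares it on lookup."""
    def __init__(self) -> None:
        self._slots = {}

    def put(self, doc_id: str, value: str) -> None:
        chain = self._slots.setdefault(bucket_key(doc_id), [])
        for i, (stored_id, _) in enumerate(chain):
            if stored_id == doc_id:
                chain[i] = (doc_id, value)
                return
        chain.append((doc_id, value))

    def get(self, doc_id: str) -> Optional[str]:
        for stored_id, value in self._slots.get(bucket_key(doc_id), []):
            if stored_id == doc_id:
                return value
        return None

unsafe, safe = UnsafeCache(), SafeCache()
for cache in (unsafe, safe):
    cache.put("ab", "tenant A doc")
    cache.put("ba", "tenant B doc")

leaked = unsafe.get("ab")   # returns tenant B's document
correct = safe.get("ab")    # returns tenant A's document
```

The unsafe version hands tenant A the document cached for tenant B, with no error anywhere in the logs.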
