The only way to defend against these types of issues is to encrypt the secrets in your environment with your own keys, with the decryption keys possibly baked into source since there are no other facilities to separate them. An attacker would then need to not only read the environment but also download the compiled functions and find the decryption keys.
It is not ideal but it could work as a workaround.
Please don't suggest this. The right way is to have the creds fetched from a vault, which is programmed to release the creds auth-free to your VM (with machine-level identity managed by the parent platform).
This is how Google Secret Manager or AWS Secrets Manager work.
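For example, on AWS the app can lean entirely on the machine identity and keep nothing secret in the environment or in source; a minimal sketch in Python with boto3 (the secret name and region below are placeholders, not anything from this thread):

    # boto3 picks up the machine identity (instance profile / task role / IRSA)
    # automatically; no access keys are configured anywhere in the app.
    import boto3

    def fetch_db_password() -> str:
        client = boto3.client("secretsmanager", region_name="eu-west-1")
        response = client.get_secret_value(SecretId="prod/db-password")
        return response["SecretString"]

    if __name__ == "__main__":
        password = fetch_db_password()
        # use it to open a connection; never log or persist it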
Or have whatever deployment tool currently populates the env vars use the same information to populate files on the filesystem instead (like mounting creds).
For example, it is possible to create a vault lease for exactly one CI build and tie the lifetime of the secrets the CI build needs to the lifetime of that build. Practically, this means that e.g. a token, an OAuth client-id/client-secret, or a username/password credential used to publish an artifact is only valid while the build runs, plus a few seconds. Once the build is done, it's invalidated and deleted, so exfiltration is close to meaningless.
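A rough sketch of that pattern with Python and hvac, assuming Vault's database secrets engine is mounted with a role named "ci-publisher" whose TTL is a few minutes (the role name and the build step are made up for illustration):

    import os
    import hvac

    # The CI platform injects a per-build Vault token via its machine identity.
    client = hvac.Client(url=os.environ["VAULT_ADDR"], token=os.environ["VAULT_TOKEN"])

    # Ask Vault for short-lived credentials; the lease TTL caps their lifetime.
    resp = client.secrets.database.generate_credentials(name="ci-publisher")
    lease_id = resp["lease_id"]
    username = resp["data"]["username"]
    password = resp["data"]["password"]

    try:
        run_build_and_publish(username, password)  # hypothetical build step
    finally:
        # Revoke the lease the moment the build ends, so an exfiltrated
        # credential is dead seconds after the job finishes.
        client.sys.revoke_lease(lease_id)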
There are two things to note about this though:
This means the secret management has to have access to powerful secrets, which are capable of generating other secrets. So technically we are just moving the goal posts from one level to another. That is usually fine though - I have 5 vault clusters to secure, versus 5 different CI builds every 10 minutes or so and a couple thousand application instances in prod. I can pay more attention to the vault clusters.
But this is also not easy to implement. It needs a vault cluster, dynamic PostgreSQL users take years to get right, we are discovering every month how terrible applications can be at handling short-lived certificates (and some even regress; Grafana seems to have regressed with PostgreSQL client certs in v11/v12), and we've found quite a few applications that never considered that certs with less than a year of lifetime even exist. Oh, and if your application is a single-instance monolith, restarting it to reload new short-lived DB certs is also terrible.
Automated, aggressive secret management and revocation IMO poses a huge problem for many secret exfiltration attacks, but it is hard to do and a lot of software resists it very heavily on many layers.
Like, sure, you can go HAM here and use network proxy services to do secret decryption, and only talk from the app to those proxies via short-lived tokens; that's arguably a qualitative shift from the app using the secret directly, and it has some real benefits (and costs, namely significant complexity and fragility).
Instead, my favored option is to scope secret use to network locations. If, for example, a given NPM token can only be used for API calls issued from the public IP endpoint of the user's infrastructure, that's a significant added layer of security. People don't agree on whether or not this counts as a "token ACL", but it's certainly ACL-like in its functionality--just controlled by location, rather than identity.
This approach can also be adopted gradually and with less added fragility than the proxy-all-the-things approach: token holders can initially allowlist broad or shared network location ranges, and narrow allowed access sources over time as their networks are improved.
Of course, that's a fantasy. API providers would have to support network-scoped API access credentials, and almost none of them do.
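To make it concrete, the provider-side check would not need to be much more than this (the token value, lookup table and CIDR ranges are invented for illustration):

    import ipaddress

    # Each token is only honoured when the call originates from its listed ranges.
    TOKEN_ALLOWED_SOURCES = {
        "npm-token-abc123": ["203.0.113.0/24", "2001:db8::/48"],
    }

    def is_request_allowed(token: str, source_ip: str) -> bool:
        ranges = TOKEN_ALLOWED_SOURCES.get(token, [])
        addr = ipaddress.ip_address(source_ip)
        return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)

    print(is_request_allowed("npm-token-abc123", "203.0.113.42"))  # True
    print(is_request_allowed("npm-token-abc123", "198.51.100.7"))  # False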
Security researchers always need to give an answer whenever there's a security incident and the answer can never be "too much centralization risk" even when that is the only reasonable answer. You can't remove centralization risk.
IMO, the future is: every major centralized platform will be insecure in perpetuity, and nothing can be done about it.
The service that encrypts the data should be the ONLY service that holds the private key to decrypt, and therefore the only service that can process the decrypted data.
It's easy to see how this would work with sufficiently sophisticated clients in some use cases, say via a vault plugin, but posing this as a universal necessity feels like a big departure from typical OAuth flows, and the added complexity could be harmful depending on what home-grown solutions are used to implement it.
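For what it's worth, the underlying principle is just ordinary public-key crypto; a toy sketch with the Python cryptography package (real systems would usually wrap a per-message data key instead of encrypting payloads with RSA directly):

    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    # Generated and kept only inside the processing service.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()  # handed out to anyone who submits data

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # Any client can encrypt...
    ciphertext = public_key.encrypt(b"customer-secret", oaep)

    # ...but only the service holding the private key can read it back.
    assert private_key.decrypt(ciphertext, oaep) == b"customer-secret"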
As far as I’m concerned, the only sane way is to dump credentials in a well-known path and let the environment decide what to bind them with at runtime (which is how Kubernetes does it, at least the EKS version I’ve had to work with).
IOW, JEE variable binding (JNDI) did it right 20 years or so ago.
It might be worthwhile for architecture designers to look back at that engineering monument (in all possible meanings of the word; it felt complicated at times) and study its solutions before coming up with a different answer to a problem it already solved.
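Coming back to the well-known-path approach above: the application side really is that simple, which is a lot of its appeal. A sketch assuming a Kubernetes-style secret volume (the mount path and file names are hypothetical, not mandated by Kubernetes):

    from pathlib import Path

    CREDS_DIR = Path("/var/run/secrets/myapp")  # hypothetical mount point

    def read_credential(name: str) -> str:
        # Each credential is one file; rotation just means the platform rewrites
        # the file and the app re-reads it.
        return (CREDS_DIR / name).read_text().strip()

    db_user = read_credential("db-username")
    db_pass = read_credential("db-password")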