I don't get how this can work, and Moxie (or rather his LLM) never bothers to explain. How can an LLM possibly exchange encrypted text with the user without decrypting it?

The correct solution isn't yet another cloud service, but rather local models.

by FrasiertheLion 4 hours ago

The model is running in a secure enclave that spans the GPU using NVIDIA Confidential Computing: https://www.nvidia.com/en-us/data-center/solutions/confident.... The connection is encrypted with a key that is only accessible inside the enclave.

Within the enclave itself, DRAM and the PCIe connections between the CPU and GPU are encrypted, but the CPU registers and the GPU onboard memory hold plaintext. So the computation does happen on plaintext data; it is just extremely difficult to access that data, even from the machine hosting the enclave.
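To make the trust boundary concrete, here is a toy sketch of the data flow described above. It is not Confer's or NVIDIA's actual protocol. Everything in it is an assumption for illustration: the `Enclave` and `Host` classes, the `seal`/`unseal` helpers, and the hash-based cipher are hypothetical stand-ins, and the session key is assumed to have been agreed with the enclave already (the attested key exchange is not shown). The point is only that the host relays bytes it cannot read, while the enclave decrypts and computes on plaintext.

```python
import hashlib
import hmac
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC keys from the shared session key.
    return hashlib.sha256(label + key).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256 in counter mode. Illustration only, not a vetted cipher.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    # Encrypt, then authenticate: nonce || ciphertext || 32-byte HMAC tag.
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def unseal(key: bytes, blob: bytes) -> bytes:
    # Verify the tag before decrypting; reject anything modified in transit.
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified in transit")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class Enclave:
    """Hypothetical enclave: the only place the session key and plaintext exist."""

    def __init__(self, session_key: bytes) -> None:
        self._key = session_key

    def handle(self, blob: bytes) -> bytes:
        prompt = unseal(self._key, blob).decode()
        reply = prompt.upper()  # stand-in for model inference on plaintext
        return seal(self._key, reply.encode())


class Host:
    """Hypothetical untrusted host: relays traffic and records what it can see."""

    def __init__(self, enclave: Enclave) -> None:
        self._enclave = enclave
        self.observed: list[bytes] = []

    def relay(self, blob: bytes) -> bytes:
        self.observed.append(blob)
        response = self._enclave.handle(blob)
        self.observed.append(response)
        return response


if __name__ == "__main__":
    key = secrets.token_bytes(32)  # assumed to come from an attested key exchange
    host = Host(Enclave(key))
    reply = unseal(key, host.relay(seal(key, b"hello enclave")))
    print(reply.decode())
    print([blob.hex()[:24] + "..." for blob in host.observed])
```

In a real deployment the client would first check the enclave's attestation report before trusting the key, and the hardware, not a Python class, would enforce that the host cannot read enclave memory.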

by boramalper 4 hours ago

They explain it in "Private inference" [0], if you want to read about it.

[0] https://confer.to/blog/2026/01/private-inference/