points
https://ai.google.dev/gemma/docs/gemma-3n#parameters
You can think of the per layer-embeddings as a vector database so you can in theory serve it directly from disk.