The answers started out OK, but fairly quickly it seemed to lose track of the material in the middle of the documentation, insisting on one concept instead of another even when I explicitly told it not to. Full attention over a 1M-token context is not really feasible (I don't believe that Google actually stores upwards of 1T of data just for my query), and there are various ways LLMs use selective attention. I'm not sure whether Google has published anything on how they do it.
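A back-of-envelope calculation shows where a terabyte-scale figure could come from: a dense attention-score matrix over n tokens has n² entries, so at 1M tokens a single matrix is already around a trillion values. The dtype and dimensions below are illustrative assumptions, not anything Google has disclosed:

```python
# Back-of-envelope: cost of one dense attention-score matrix
# over a 1M-token context. Numbers are illustrative, not from
# any published model details.

n_tokens = 1_000_000

# A full attention-score matrix is n x n (per head, per layer).
scores = n_tokens * n_tokens          # 1e12 entries
bytes_fp16 = 2                        # assuming fp16 scores
one_matrix_tb = scores * bytes_fp16 / 1e12

print(f"entries in one n x n score matrix: {scores:.0e}")
print(f"size of one matrix in fp16: {one_matrix_tb:.1f} TB")
```

Even before multiplying by heads and layers, a single matrix at fp16 would be about 2 TB, which is why long-context models rely on some form of sparse or approximate attention rather than materializing the full matrix.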