So the memory operation itself is O(1), around 100 ns per access, once we reach the point where every access is a full RAM fetch because the odds of the data being in CPU cache are low?
Big O notation typically describes an upper bound, and it holds well there.
That said, due to cache hits, the lower bound is much lower than that.
You see similar performance degradation if you iterate over a two-dimensional array along the wrong index first.
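To make the traversal-order point concrete, here is a rough sketch, assuming the grid is stored as one flat row-major buffer. The helper names (`make_grid`, `sum_row_major`, `sum_col_major`, `timed`) and the 2000 x 2000 size are made up for illustration. Both loops read exactly the same elements; only the order differs. Interpreter overhead in Python hides much of the gap, so the same loops in C would likely show it more clearly.

```python
import time
from array import array


def make_grid(rows, cols):
    # Flat row-major buffer: element (r, c) lives at index r * cols + c.
    return array("d", (float(i) for i in range(rows * cols)))


def sum_row_major(grid, rows, cols):
    # Inner loop walks along a row, so consecutive reads are adjacent in memory.
    total = 0.0
    for r in range(rows):
        base = r * cols
        for c in range(cols):
            total += grid[base + c]
    return total


def sum_col_major(grid, rows, cols):
    # Inner loop walks down a column, so consecutive reads are cols elements apart.
    total = 0.0
    for c in range(cols):
        for r in range(rows):
            total += grid[r * cols + c]
    return total


def timed(fn, *args):
    # Return the function result together with the elapsed wall-clock seconds.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    rows, cols = 2000, 2000  # illustrative size, about 32 MB of doubles
    grid = make_grid(rows, cols)
    for fn in (sum_row_major, sum_col_major):
        result, seconds = timed(fn, grid, rows, cols)
        print(f"{fn.__name__}: sum={result:.0f} in {seconds:.3f} s")
```

The two sums are identical, so any timing difference comes from the access pattern alone, not from the amount of work.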
This is an incorrect conclusion to make from the link you posted in the context of this discussion. That post is a very long-winded way of saying that the average speed of addressing N elements depends on N and the size of the caches, which isn't news to anyone. Key word: addressing.
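For anyone who wants to see the "average addressing speed depends on N and the cache sizes" effect for themselves, a minimal sketch follows. It is not the benchmark from the linked post; the names (`build_chain`, `walk`, `ns_per_step`) and the sizes are hypothetical. It follows a chain of indices through an array of N slots, once in sequential order and once in shuffled order, and reports the average time per step. Python adds a large constant cost per step, so treat the numbers as a trend only.

```python
import random
import time
from array import array


def build_chain(n, shuffled, seed=0):
    # Build nxt so that following nxt[pos] from slot 0 visits every slot
    # exactly once before returning to 0 (a single cycle of length n).
    order = list(range(n))
    if shuffled:
        random.Random(seed).shuffle(order)
    nxt = array("q", [0]) * n
    for i in range(n):
        nxt[order[i]] = order[(i + 1) % n]
    return nxt


def walk(nxt, steps, start=0):
    # Each read depends on the previous one, so accesses cannot overlap.
    pos = start
    for _ in range(steps):
        pos = nxt[pos]
    return pos


def ns_per_step(n, shuffled, steps=1 << 20):
    # Average wall-clock nanoseconds per dependent read.
    nxt = build_chain(n, shuffled)
    start = time.perf_counter()
    walk(nxt, steps)
    return (time.perf_counter() - start) * 1e9 / steps


if __name__ == "__main__":
    for n in (1 << 12, 1 << 16, 1 << 21):  # illustrative sizes only
        seq = ns_per_step(n, shuffled=False)
        rnd = ns_per_step(n, shuffled=True)
        print(f"N={n:>8}: sequential {seq:6.1f} ns/step, shuffled {rnd:6.1f} ns/step")
```

The work per step is the same in every run; what changes with N and with the ordering is how often a read is served from cache rather than from RAM.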
And no, the article you linked is about caching, not RAM access. Hardware-wise, it doesn't matter what you have in the cells; access latency is the same. There is gonna be some degradation with the number of read/write cycles, but that is beside the point.