The cat example is from the section on their block-causal attention mask. I really don't think the mask fixes the issue. So far as I can see, the block schedule only dictates _when_ each position gets sampled. It does _not_ change the fact that they basically have an array-of-token-vars representation, and once `t_i` is sampled, nothing can "move" that value left or right.
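To make that concrete, here's a toy sketch of what I mean. This is my own illustration, not their code: `sample_in_blocks`, `block_schedule`, the uniform `rng.choice` stand-in for the model, and the left-to-right order inside a block are all assumptions for the example. The only point is that the schedule decides the order in which slots get filled, while each value stays pinned to its index.

```python
import random

MASK = None  # placeholder for a position that has not been sampled yet


def block_schedule(length, block_size):
    """Yield the position ranges to fill, one block at a time, left to right."""
    for start in range(0, length, block_size):
        yield range(start, min(start + block_size, length))


def sample_in_blocks(length, block_size, vocab, seed=0):
    """Fill a fixed-length array of token slots block by block.

    rng.choice is a stand-in for the real sampler (an assumption);
    a real model would condition on the already-filled slots.
    """
    rng = random.Random(seed)
    tokens = [MASK] * length  # one variable per position, length never changes
    history = []
    for block in block_schedule(length, block_size):
        for i in block:
            tokens[i] = rng.choice(vocab)  # t_i is written at index i and stays there
        history.append(list(tokens))  # snapshot after each block
    return tokens, history


tokens, history = sample_in_blocks(length=6, block_size=2, vocab=["a", "b", "c"])
for snapshot in history:
    print(snapshot)
```

Every snapshot has the same length, and a slot that was filled in an earlier block has the same value at the same index in every later one. There is no operation in this representation that inserts a token and shifts the rest over, whatever the block size or mask is.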