I agree that this isn’t proof that it scales to trillions of tokens, but it does show that a scaled-up experiment would be worth a shot.
I do agree that it is a data point, but GP's point is that this model was undertrained, so it's hard to draw the same conclusions from it that we would from other research.