But even against E4B it's shaky, which is surprising given how many tokens they trained on. I guess it was trained on a lot of synthetic data.