I think the reality is that, at this point, the frontier labs regard CoT as extremely valuable, and none of them are giving you genuine CoT anymore. I don't think there is any future in attempting to measure or evaluate CoT from frontier models; I expect this to be a permanent shift.