These days it's bi, actually :) Although I don't see any CPU designer actually implementing that feature, except maybe MIPS (who have stopped working on their own ISA, and now want all their locked-in customers to switch to RISC-V without worrying about endianness bugs)
ARM works the same way. And SPARC is the opposite, instructions are always big-endian, but data can be switched to little-endian.
You should actually use format-swapping loads/stores (i.e. deserialization/serialization).
This is because your computer cannot compute on values of non-native endianness. As such, the value is logically converted back and forth on every operation. Of course, a competent optimizer can elide these conversions, but keeping values in the wrong byte order fundamentally lacks machine sympathy.
The better model is to view endianness as a serialization format and convert at the boundaries of your compute engine. Then endianness only matters when serializing and deserializing wire formats, and there is no accidental mixing of formats in your internals: everything has been parsed to native before any computation occurs.
Essentially, non-native endianness should only exist in memory and preferably only memory filled in by the outside world before being parsed.
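A minimal sketch of that boundary-conversion model in Rust (the wire layout here is a made-up example, not any real protocol): the byte swap, if one is needed at all, happens exactly once at parse time, and everything downstream works with native-endian values.

```rust
// Hypothetical wire header: a 4-byte big-endian length field.
// `from_be_bytes` deserializes regardless of host endianness;
// on a little-endian host it compiles to a single bswap, on a
// big-endian host it's a plain load.
fn parse_len(buf: [u8; 4]) -> u32 {
    u32::from_be_bytes(buf)
}

// Serialize back out at the other boundary.
fn emit_len(len: u32) -> [u8; 4] {
    len.to_be_bytes()
}

fn main() {
    let wire = [0x00, 0x00, 0x01, 0x00]; // 256, big-endian on the wire
    let len = parse_len(wire);
    // All arithmetic happens on the native-endian value:
    let doubled = len * 2;
    println!("{} {}", len, doubled);
    assert_eq!(emit_len(len), wire);
}
```

Everything between `parse_len` and `emit_len` never touches a non-native value, which is exactly the "only in memory filled by the outside world" property above.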