Aside: I wonder if complex-valued neural networks, with the activation function just being sum(inputs)*conj(sum(inputs)) and a threshold normalized by sqrt(num_inputs), could be the most universal, where incoherent inputs average an absolute value of sqrt(N) and coherent inputs give N, like lasers? (The squared amplitude would be N vs N^2 between uncorrelated and correlated populations.)
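A quick way to sanity-check the N vs N^2 scaling is a small simulation. This is only a sketch under my own assumptions: every input is a unit-magnitude phasor, "incoherent" means independent uniformly random phases, and the helper names (intensity, unit_phasors, mean_intensity) are made up for illustration.

```python
import cmath
import math
import random

def intensity(inputs):
    """Activation from the text: sum(inputs) * conj(sum(inputs)), i.e. |sum|^2."""
    total = sum(inputs)
    return (total * total.conjugate()).real

def unit_phasors(n, coherent, rng):
    """n unit-magnitude inputs (assumption): one shared phase, or independent random phases."""
    if coherent:
        phase = rng.uniform(0.0, 2.0 * math.pi)
        return [cmath.exp(1j * phase)] * n
    return [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]

def mean_intensity(n, coherent, trials=2000, seed=0):
    """Average |sum|^2 over many random draws."""
    rng = random.Random(seed)
    total = sum(intensity(unit_phasors(n, coherent, rng)) for _ in range(trials))
    return total / trials

N = 64
coherent_mean = mean_intensity(N, coherent=True)     # expected to sit at N**2
incoherent_mean = mean_intensity(N, coherent=False)  # expected to sit near N

print("coherent   |sum|^2:", coherent_mean)
print("incoherent |sum|^2:", incoherent_mean)
print("incoherent RMS amplitude / sqrt(N):", math.sqrt(incoherent_mean / N))
```

Dividing the amplitude by sqrt(N) puts the incoherent case near 1 and the coherent case at sqrt(N), which is the gap a threshold would have to separate.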
For the purpose of inverting a negative vector, you can think of squaring as rotating the vector 180 degrees around the unit circle to make it positive. Higher powers just keep rotating that vector back and forth; from this perspective the other even powers are the same transformation, obviously with a different magnitude.
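To see the rotation picture concretely, here is a small sketch that treats a negative real number as a complex number with a phase of 180 degrees and multiplies it by itself repeatedly. The value -2 and the variable names are arbitrary choices for illustration.

```python
import cmath
import math

z = complex(-2.0, 0.0)  # a "negative vector" on the real axis, phase = 180 degrees

rows = []
p = complex(1.0, 0.0)
for k in range(1, 7):
    p = p * z  # each multiplication adds another 180 degrees of phase
    degrees = round(abs(math.degrees(cmath.phase(p))))
    rows.append((k, abs(p), degrees))

for k, magnitude, degrees in rows:
    print(f"power {k}: magnitude {magnitude}, phase {degrees} degrees")
```

The phase alternates between 180 degrees for odd powers and 0 degrees for even powers, while only the magnitude keeps growing.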
And yet the inverse-distance laws for potential energy in gravity and electric fields use the absolute value, because they require an unsigned distance, and how you treat the singularity at zero is extremely important to the structure of those interactions.