bool DawsonCompare(float af, float bf, int maxDiff)
{
    int ai = *reinterpret_cast<int*>(&af);
    int bi = *reinterpret_cast<int*>(&bf);
    if (ai < 0)
        ai = 0x80000000 - ai;
    if (bi < 0)
        bi = 0x80000000 - bi;
    int diff = ai - bi;
    if (abs(diff) < maxDiff)
        return true;
    return false;
}

Signed zero and the sign-magnitude representation are more of an issue, but can be resolved by XORing the sign bit into the mantissa and exponent fields, flipping the negative range. This places -0 adjacent to +0, which is typically enough, and can be fixed up for minimal additional cost (another subtract).
func equiv(x, y float32, ignoreBits int) bool {
	mask := uint32(0xFFFFFFFF) << ignoreBits
	xi, yi := math.Float32bits(x), math.Float32bits(y)
	return xi&mask == yi&mask
}
The sensitivity is controlled by ignoreBits, with higher values being less sensitive. Supposing y is 1.0 and x is the predecessor of 1.0, the smallest value of ignoreBits for which equiv would return true is 24.
But a worst-case example is found at the very next power of 2, 2.0 (bitwise 0x40000000), whose predecessor is quite different (bitwise 0x3FFFFFFF). In this case you'd have to set ignoreBits to 31, at which point equivalence is no better than checking that the two numbers have the same sign.
There are cases where the quantization method is useful, hashing/binning floats being an example. Standard similarity checks don't work there because of lack of transitivity. But that's fundamentally a different operation than is-similar.
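To make the transitivity point concrete (the specific values are chosen for illustration): with an epsilon check, a≈b and b≈c need not imply a≈c, while quantization assigns each float a single key, so key equality is trivially transitive and usable for hashing:

```go
package main

import (
	"fmt"
	"math"
)

// near is a plain absolute-epsilon similarity check: not transitive.
func near(a, b, eps float64) bool { return math.Abs(a-b) < eps }

// bucket strips the low ignoreBits of the representation, giving each
// float exactly one key that can be hashed or binned on.
func bucket(f float32, ignoreBits int) uint32 {
	return math.Float32bits(f) >> ignoreBits
}

func main() {
	const eps = 0.1
	a, b, c := 0.00, 0.09, 0.18
	// a~b and b~c hold, but a~c fails: similarity is not transitive.
	fmt.Println(near(a, b, eps), near(b, c, eps), near(a, c, eps)) // true true false
	// Nearby floats land in the same bucket, and bucket keys compare
	// with plain ==, which is transitive.
	fmt.Println(bucket(1.0, 16) == bucket(1.00001, 16)) // true
}
```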
I'm unconvinced. Doesn't this just replace the need to choose a suitable epsilon with the need to choose the right number of bits to strip? And the latter affords far fewer choices for degree of "roughness" than the former.
I think I'll just use scaled epsilon... though I've gotten lots of performance wins out of direct bitwise trickery with floats (e.g., fast rounding with mantissa normalization and casting).
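For reference, a minimal sketch of the scaled-epsilon check mentioned here; the relTol parameter and the choice to scale by the larger magnitude are my assumptions, not a fixed convention:

```go
package main

import (
	"fmt"
	"math"
)

// relEqual compares the difference against a tolerance scaled by the
// larger magnitude, so it adapts across the float range the way a
// fixed absolute epsilon does not.
func relEqual(a, b, relTol float64) bool {
	return math.Abs(a-b) <= relTol*math.Max(math.Abs(a), math.Abs(b))
}

func main() {
	fmt.Println(relEqual(1e10, 1e10+1, 1e-9)) // true: 1 part in 1e10
	fmt.Println(relEqual(1e-10, 2e-10, 1e-9)) // false: 2x apart
}
```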
This breaks down across the positive/negative boundary, but honestly, that's probably a good property. -0.00001 is not all that similar to +0.00001 despite being close on the number line.
It also requires that the inputs are finite (no INF/NAN), unless you are okay saying that FLT_MAX is roughly equal to infinity.