Challenge accepted. Suppose we want to know the answer to 3 decimal places (so we'd match the headline). And suppose I allow my algorithm to be wrong one in a thousand times ("probably approximately correct").
Then sample some constant number C of random 64 bit integers. Run the following algorithm which separates each random sample into one of three classes: Y (has 32 but factors), N (does not have 32 bit factors), U (unknown).
Check if prime using probabilistic miller rabin. (Error prob goes to zero exponentially fast). If prime, return N. If it's not a prime, then run T steps of pollard rho to determine whether the number has 32 but factors; return Y,N, or U depending of the factors found up to step T.
The key observation is that T can be chosen to make the UNKNOWN class very small (with high probability), and so our estimate should rapidly converge to 17%Y, 83%N, ~0.001%U
For fixed error tolerance, this would run in roughly a constant number of iterations, independent of N.
While I find the 17% number interesting to think about, "most" is far less interesting. Multiplication doesn't care about order so you're instantly cutting 2^64 possibilities down to about 2^63. That's a hair's breadth away from "most" already, and considering even a tiny amount of overlapping results gets you there.
What gets interesting is actually trying to quantify the overlapping results.
Information quantities are more meaningfully expressed in number of bits.
Not sure I understand.
Adding two 32 bit integers takes you to 33 bit integers. (1111 + 1111 = 11110).
Addition doesn't care about order, so you're instantly cutting 2^33 possibilities down to 2^32. Or so is your argument. But in reality you can reach nearly all of those 2^33 numbers.
Commutativity introduces a relation on pairs of 32 bit ints (a,b) ~ (b,a), which accounts for one bit of information. Thus, at most 50% of 64bit ints show up as products of 32 bit ints.
The 2^64 number is the number of inputs. For an operation which is commutative, you expect the outputs to be 2^63+2^32 or smaller, since you’ve introduced symmetry.
Whether a 64-bit number can be written as the product of two 32-bit ones depends only on the prime factors of the 64-bit number - it's a property of the number itself, and apparently 17% of 64-bit numbers have this property.
However, since a * b = b * a, our input space has a lot of duplicate outputs. So from this alone you can conclude roughly half of the output space must be uncovered by any input pair, simply because there aren't enough input pairs.
(Just by way of example, for n=2^33, 2n=2^34 but also =2^17*2^17)
It's much worse than that. It's difficult for a 64-bit product to have the high bit set if the multiplicands are both no larger than 32 bits.
You have to redo the math to make the constraint work.
That should be "relatively few 20000000-bit integers", right?
Most 1s won't go towards 1.5, but sometimes you're lucky.
My current comment itself, for instance, also doesn't really add anything to the discussion about the article and I'd have no expectation people leave it from going negative. Maybe the will, maybe they won't, but there is no reason to expect they should in principle of me loving tangents :D.
Anyway here is a fun pattern you get when you multiply 8 bit unsigned integers. Not all pairs of (upper bits, lower bits) are reachable, and it has a lot of distinct patterns.
https://i.imgur.com/Gb3HDR0.png
(Should I host the image on GitHub Gists so it doesn't vanish?)
The chance of a random 64 bit integer being a 32 bit integer is 0.0000000233 %
The chance of a random 64 bit integer being a product of two 32 bit integers is 17%
Nice
Therefore the fact that relatively few 64-bit numbers are products of 32-bit integers means that a lot of pairs of 32-bit integers give by multiplication the same product.
This is more than just the prime numbers. For example, a 41-bit prime can be multiplied by 16 and it will still fit into 64 bits.