--------
[ academics ] Message 9266 (8 left): Sun July 11, 1999 10:37pm
From: Aaron Emigh (aaron@alumni.cse.ucsc.edu)
Subject: more bit twiddling
20 lines (?)
I have an O(lg n) solution, modulo bickering over machine capabilities.
Think of a number in binary notation as having, in each bit position,
the "number of ones" in each bit position, 0 or 1. Now think of using
each bit *pair* to hold the "number of ones" (zero, one or two) in that
pair of bits in the original number. Now think of each bit 4-tuple as
containing the "number of ones" (0-4) in the corresponding nibble of
the original number, and so on. Each successive stage of this transformation
can be done in O(1) with a shift and add (assuming a barrel shifter and
number of bits no greater than the native word size).
With lots of intermediate variables and funny formatting for exposition:
sum_2 = (n & 0x55555555) + ((n & 0xaaaaaaaa) >> 1);
sum_4 = (sum_2 & 0x33333333) + ((sum_2 & 0xcccccccc) >> 2);
sum_8 = (sum_4 & 0x0f0f0f0f) + ((sum_4 & 0xf0f0f0f0) >> 4);
sum_16 = (sum_8 & 0x00ff00ff) + ((sum_8 & 0xff00ff00) >> 8);
count = (sum_16 & 0x0000ffff) + ((sum_16 & 0xffff0000) >> 16);
--------
[ academics ] Message 9274 (0 left): Mon July 12, 1999 11:23am
From: Aaron Emigh (aaron@alumni.cse.ucsc.edu)
Subject: restatement
22 lines (?)
I'll go for the unification theory:
The algorithm as stated is O(lg n), where the fundamental operations are
AND, shifting and addition.
These fundamental operations are not in fact constant-time on arbitrarily
long datasets on real architectures -- they take n/, or O(n)
time. So you could call an implementation that followed the basic
algorithm exactly O(n lg n).
Of course, you wouldn't implement it that way -- it would be silly to
do these operations across the whole dataset. You'd just do it piecewise,
one natural word at a time, and add up the intermediate results. Each
word-sized piece will take O(lg ) = O(1) time since
is constant, and there are O(n) performances of the intermediate count and
subsequent additions, making the whole thing O(n), though with a smaller
constant factor than more traditional techniques.
Whether you want to call it O(lg n) or O(n) is ultimately a matter of
personal style -- I claim it's O(lg n) in theory, O(n) in practice.