This definitely sounds impressive, but, FMI:
On 11/28/06, Andrei Pelinescu-Onciul andrei@iptel.org wrote:
Why would you want to change the hash size from the config? Do you really know somebody who wanted/needed to do this? If you use a variable for the hash size, the compiler will not be able to optimize the modulo operation (x % hash_size) and will have to implement it using slow DIVs. With a 2^k constant, the whole modulo operation is optimized to & (2^k - 1). A DIV takes a minimum of 56 cycles on a P4 and 16 on an AMD64 (and that is if the operands are in _registers_, which will probably not be the case for the variable containing the hash size). An "and" takes only 1 cycle (and the operand is an immediate constant). Look also at the rest of the hash function: it uses only XORs, shifts and additions, operations that all execute in 1 cycle => for normal small URIs your DIV operation will take a very significant share of the total amount of time spent hashing.
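To make the quoted point concrete, here is a minimal sketch in C of the two forms, assuming an unsigned hash value and a hypothetical compile-time table size of 2^10 buckets (HASH_SIZE_LOG2, bucket_mod and bucket_mask are illustrative names, not taken from either code base). For unsigned operands and a power-of-two size both functions return the same bucket, which is why a compiler can replace the first with the second when the size is a constant.

```c
#include <stdio.h>

/* Hypothetical compile-time table size: 2^10 buckets (illustrative value). */
#define HASH_SIZE_LOG2 10
#define HASH_SIZE      (1u << HASH_SIZE_LOG2)

/* Modulo form: with a constant power-of-two size the compiler can
 * typically lower this to a single AND instead of a DIV. */
static unsigned bucket_mod(unsigned h)
{
    return h % HASH_SIZE;
}

/* Explicit mask form: equivalent to the above for unsigned h. */
static unsigned bucket_mask(unsigned h)
{
    return h & (HASH_SIZE - 1u);
}
```

Note that the equivalence only holds for unsigned values; with signed operands the result of % can be negative, so the two forms differ.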
Unfortunately Daniel didn't reply anymore (maybe he wants to cover trade secrets ;-) ), but OpenSER now uses the much faster masking operation instead of the always costly modulo (I expect the mask to execute faster even with the cost of an extra memory fetch for the hash size). I don't know whether the hashing itself is also cheap or whether the result has a good distribution, but this is something you should look at: I expect he did some research too before changing it, so maybe this is a fix worth porting.
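For illustration only, this is roughly how a configurable size and a mask can coexist. It is a sketch under my own assumptions, not OpenSER's actual code: the struct, the function names and the 2^30 upper limit are all hypothetical. The requested size is rounded up to a power of two once at startup, and each lookup then costs one memory fetch for the mask plus one AND.

```c
/* Hypothetical table descriptor; names are illustrative, not OpenSER's. */
struct hash_cfg {
    unsigned size;  /* number of buckets, always a power of two */
    unsigned mask;  /* size - 1, precomputed once at init */
};

/* Round the configured size up to the next power of two and store the mask.
 * Returns 0 on success, -1 if the request is 0 or above an assumed 2^30 limit. */
static int hash_cfg_init(struct hash_cfg *c, unsigned requested)
{
    unsigned size = 1u;

    if (requested == 0u || requested > (1u << 30))
        return -1;
    while (size < requested)
        size <<= 1;
    c->size = size;
    c->mask = size - 1u;
    return 0;
}

/* Per-lookup cost: one load of c->mask and one AND, no DIV. */
static unsigned hash_bucket(const struct hash_cfg *c, unsigned h)
{
    return h & c->mask;
}
```

With this scheme a configured size of 1000 would end up as 1024 buckets, so the config value is a lower bound rather than an exact size.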
WL.