Hi,
> Why would you want to change the hash size from the config? Do you really know somebody who wanted/needed to do this? If you use a variable for the hash size, the compiler will not be able to optimize the modulo operation (x % hash_size) and will have to implement it using slow DIVs. With a 2^k constant, the whole modulo operation is optimized to & (2^k - 1). A DIV takes a minimum of 56 cycles on a P4 and 16 on an AMD64 (and that is if the operands are in _registers_, which will probably not happen for the variable containing the hash size). An "and" takes only 1 cycle (and the operand is an immediate constant). Look also at the rest of the hash function: it uses only XORs, shifts and additions, operations that all execute in 1 cycle => for normal small URIs your DIV operation will take a very significant share of the total amount of time spent hashing.
Sorry for re-raising this issue now, but I just became curious and did the math. :-)
First of all, the hash table size will most probably not have to be fetched from main memory. It is accessed frequently, so it is likely to be in the 2nd level cache. Furthermore, CPUs nowadays do very clever instruction reordering and prefetching, so loading the operand from the cache will likely only add a few cycles of latency. Let's assume that the total time required for the DIV instruction with memory operand is 100 cycles.
Let's assume that we have a very busy server that does 20000 hash lookups per second. Furthermore, let's make the unrealistic assumption that this server handles all the traffic using a single 3GHz CPU.
Now, the total time spent on the DIV instruction will be 2 million cycles per second, or 0.07% of the available processing time.
Let's say that the same server is preloading 1 million contacts from a database on startup. The total time spent on the DIV instruction will be 100 million cycles, or 33 milliseconds (while the whole preload operation will, at the very least, take several seconds to complete).
On the other hand, with a default hash table size of 16k, the hash chains will still grow very long when the number of users gets large. I don't see the point in having users waste their valuable CPU time on searching through linked lists when they really wouldn't have to.
Many people will evaluate SER without first checking udomain.h to see whether there are any tunables inside. If they have many users, they're likely to be disappointed.
Looking at these figures, I don't see any reason not to make the hash table size configurable at runtime.
Just my 2 pence.
Regards, Jan