Re: [Serusers] Re: [Users] TM : retransmission timers

18 Mar 2007


      Hi,
...
...
Why would you want to change the hash size from the config? Do you
really know somebody who wanted/needed to do this? If you use a variable
for the hash size the compiler will not be able to optimize the modulo
operation (x % hash_size) and it will have to implement it using slow
DIVs. Using a 2^k constant the whole modulo operation will be optimized
to  & (2^k - 1). A DIV takes minimum 56 cycles on a P4 and 16 on an
AMD64 (and this if the operands are in _registers_ which will probably
not happen for the variable containing the hash size). An  "and" takes
only 1 cycle (and the operand is an immediate constant).
Look also at the rest of the hash function: it uses only XORs, shifts
and additions, operations that all execute in 1 cycle => for normal small
 uris your DIV operation will take a very significant time from the
 total ammount spent hashing.
Sorry for reraising this issue now, but I just became curious and did
the math. :-)
First of all, the hash table size will most probably not have to be
fetched from main memory. It is accessed frequently, so it is likely to
be in the 2nd level cache. Furthermore, CPUs nowadays do very clever
instruction reordering and prefetching, so loading the operand from the
cache will likely only add a few cycles of latency. Let's assume that
the total time required for the DIV instruction with memory operand is
100 cycles.
Let's assume that we have a very busy server that does 20000 hash
lookups per second. Furthermore, let's make the unrealistic assumption
that this server handles all the traffic using a single 3GHz CPU.
Now, the total time spent on the DIV instruction will be 2 million
cycles per second, or 0.07% of the available processing time.
Let's say that the same server is preloading 1 million contacts from a
database on startup. The total time spent on the DIV instruction will be
100 million cycles, or 33 milliseconds (while the whole preload
operation will, at the very least, take several seconds to complete).
On the other hand, with a default hash table size of 16k, the hash
buckets will still grow very big when the number of users gets large. I
don't see the point in having users waste their valuable CPU time on
searching through linked lists when they really wouldn't have to.
Many people will evaluate SER without first checking udomain.h to see
whether there's any tunables inside. If they have many users, they're
likely to be disappointed.
Looking at these figures, I don't see any reason not to make the hash
table size configurable at runtime.
Just my 2 pence.
Regards,
Jan
-- 
Jan Andres jan.andres@freenet.ag

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Re: [Serusers] Re: [Users] TM : retransmission timers