[Thread Prev][Thread Next][Thread Index]

Re: 2.2.6_andrea2.bz2



On Wed, 5 May 1999, Michael Schulz wrote:
> > http://www.citi.umich.edu/projects/linux-scalability/reports/hash.html
> 
> i just got curious and tock a glance at  your page. Astonishingly you never
> used prime numbers for the size of the hashtable. That's bad because hashing
> lives from computing modulus of integers to get the associated bucket-id. 
> 
> So for obvious reasons a hash-function will always reveil better spreading
> over the table, when it is totaly calculated within the prime modulus, which
> is the tablesize.  If you use nonprimes, objects tend to be stored in the same
> buckets, which you basicaly what to prevent.
> 
> Another property of calculating within modulus is, that you don't get integer
> overflows, even though entry-values are big.

michael-

thanks for your note.  if you read Knuth, you'll see that prime-sized hash
tables are not necessary if you use multiplicative hashing.  and, you *do*
want multiplicative hashing with overflow because modulus hashing requires
a division operation in the hash function, which is more expensive than a
multiplication operation.

in fact, Knuth proves that multiplicative hashing is at least as good as,
and sometimes better than, modulus hashing on a prime-sized table.

read my report again, and you'll see a histogram that shows an almost
perfect bucket size histogram and an 87+% bucket utilization, all with a
simple multiplicative hash function.  that's as good as it gets.

	- Chuck Lever
--
corporate:	<chuckl@netscape.com>
personal:	<chucklever@netscape.net> or <cel@monkey.org>

The Linux Scalability project:
	http://www.citi.umich.edu/projects/linux-scalability/


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/