This library supports two types of hash collections. Like the linear collections, these are based on arrays which are automatically expanded as more items are added. With the hash collections the connection to the underlying array is less direct, though, with the array accessed as a table indexed by the hash code for a key.

The hash collections in this library use a very simple form of hash table internally. The actual hash value of a key is used to find a starting slot in the hash table. If another key is present at that starting slot an offset value is added to the slot (wrapping if the result is beyond the end of the table) and the new slot obtained in this way is next checked for the key. This process is repeated as many times as necessary to either find a key in the table or get to an empty slot.

This approach gives very good insertion and lookup performance at the cost of more overhead when a key is removed from the hash table. Hash collections tend to be used more often for persistent values than for transient ones, though, so this gives a good performance trade off for typical usage.

Hash collections of both object and primitive types use considerably less memory than the corresponding standard Java library implementations, so a good approach to take when using these collections in high-performance applications is to be generous in setting the initial size for the tables. It may also improve performance if you lower the fill parameter value, so that the tables are expanded more often.

Hash collections use two separate base classes for primitive and object keys, com.sosnoski.util.PrimitiveHashBase and com.sosnoski.util.ObjectHashBase respectively. Each type of hash collection defines a specialization of each of these base classes, which is subclassed for the actual implementation classes.

The hash set collection type, in package com.sosnoski.util.hashset, supports collections of self-keyed values. The hash map collection type, in package com.sosnoski.util.hashmap, supports collections of keys with associated values.

Hash collections with object keys may be configured to use any of three possible combinations of hash method and key comparison techniques. The choice is determined by an optional parameter to the constructor, with the following values inherited from com.sosnoski.util.ObjectHashBase:

  • STANDARD_HASH - use object hashCode() and equals() comparison
  • IDENTITY_COMP - use object hashCode() and == comparison
  • IDENTITY_HASH - use System.identityHashCode() and == comparison

In testing with the Sun Java 1.3.1 JVM System.identityHashCode() turns out to be very slow in comparison to a typical hashCode() implementation, so be careful of using the IDENTITY_HASH option. IDENTITY_COMP is always faster than STANDARD_HASH, though, so it's a good choice when you know that objects are unique (such as Strings which have been interned).