wiki:Internals/LockFreeHashTables

Version 2 (modified by gz, 6 years ago) (diff)

--

Nearly-lock-free hash tables

CCL implements a modification of the lock-free hash table algorithm described by Cliff Click Jr. in  http://blogs.azulsystems.com/cliff/2007/03/a_nonblocking_h.html.

CCL uses this algorithm to implement thread-safe hash tables without the need for locking on every access. The result is that CCL's fully thread-safe hash tables have typical performance as good or better than non-thread safe hash tables.

The modifications have to do with the fact that our goal is just to minimize the performance impact of thread-safety, so I didn't bother with aspects of his algorithm that aren't relevant to that goal.

The main difference from Click's algorithm is that I don't try to do rehashing concurrently. Instead, rehashing grabs a lock, so that only one thread can be rehashing at any given time, and readers/writers will block waiting for the rehashing to finish.

In addition, I don't have a separate state for partially inserted key, I reuse the DELETED state for that. So in our implementation the following are the possible states of a hash table entry (where "object" means any object other than the special markers):

State Key Value
DELETED1 object free-marker
DELETED2 deleted-marker free-marker
IN-USE object object
FREE free-marker free-marker
REHASHING object rehashing-marker
REHASHING free-marker rehashing-marker
REHASHING deleted-marker rehashing-marker

No other states are allowed - at no point in time can a hash table entry be in any other state. In addition, the only transitions allowed on the key slot are free-marker -> object/deleted-marker -> deleted-marker. Once a key slot is claimed, it must never change to free or another key value, and once it's marked deleted, it cannot be reused. All this is true even after the hash vector has been discarded after rehashing -- because some process might still be looking at it. In particular this means rehashing in place is not an option. All rehashing creates a new vector and copies into it. This means it's kinda risky to use lock-free hash tables with address-based keys, because they will thrash in low-memory situations, but we don't disallow it because a particular use might not have this problem.

In addition to the state, a hash table maintains a count of available entries, i.e. number of new entries that can be added to the table. This is initialized each time the table is resized, based on the size of the table and the the rehash threshold specified by the user, and then decremented as entries are added. The count maintained is always less than or equal to the actual number of available entries - it doesn't account for deletions, and it might be decremented for an addition that doesn't actually happen. This means the table might be rehashed before strictly necessary, but it will never overflow.

The following operations may take place:

  • gethash: find matching key - if no match, return not found. Else fetch value, if value is rehashing-marker then maybe-rehash and try again; if value is free-marker, return not found, else return found value.
  • puthash: find matching key or FREE slot.
    • If found key, fetch value. if value is rehashing-marker then maybe-rehash and try again; else store-conditional the value -> new value, if fails try again.
    • Else have FREE slot, atomically decrement available count, if negative then maybe-rehash, else store-key-conditional free-marker -> key, and if that succeeds, store-conditional free-marker -> new value, if either fails, maybe-rehash and try again.
  • remhash: find matching key - if no match, done. Else fetch value, if value is rehashing-marker then maybe-rehash and try again; else store-conditional the value -> free-marker, if fails try again.
  • maybe-rehash: grab a lock (which will wait for any concurrent rehashing to complete), check whether the table still needs rehashing (either the available count is less than 0 or some address-based keys may have moved), if not, release the lock and return. Else rehash: estimate number of entries, make a new vector. Loop over the old vector, at each entry fetch the old value with atomic swap of rehashing-marker. This blocks any further state changes involving the value. It doesn't block state changes involving the key, but the only ones that can happen is FREE -> DELETED, and DELETED1 <-> DELETED2, all of which are equivalent from the point of view of rehashing. Anyway, if the old value was rehashing-marker then bug (because we have a lock). If the old value is free-marker then do nothing, else get the entry key and rehash into the new vector -- if no more room, start over. When done, store the new available count and then store the new vector in the hash table and release lock.
  • gc: for weak tables, gc may convert IN-USE states to DELETED2 states. For all lock-free hash tables (weak or not) gc converts DELETED1 states to DELETED2 - gc can do that because the entire gc takes place atomically wrt the lisp runtime. REMHASH cannot do it without violating the invariants.

It's left as an exercise to the reader to prove that this results in GETHASH always getting the correct value, no matter what other threads are doing.