hash: Rewrite hash table code

This is a complete rewrite of the code in hash.c.

Move from a chained hash table implementation to open addressing with
Robin Hood probing. This makes it possible to raise the maximum fill
factor and further reduce the growth factor, saving considerable amounts
of memory without sacrificing performance.

To make this work, hash values are now cached in the table entries,
which also avoids many key comparisons.
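
The idea can be illustrated with a minimal sketch. It is not the code
from hash.c: the names (Entry, rhFind, rhPut), the fixed capacity, the
toy hash function and the convention that a cached hash of 0 marks an
empty slot are all assumptions made for the example. It shows how the
cached hash yields an entry's displacement without rehashing the key,
and how strcmp is only reached when the cached hashes already match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAP 8u                  /* fixed power-of-two capacity (assumed) */

typedef struct {
    unsigned hash;              /* cached hash; 0 marks an empty slot */
    const char *key;
    int value;
} Entry;

static Entry table[CAP];
static unsigned count;

/* Toy hash. The top bit is forced so a valid hash is never 0. */
static unsigned hashKey(const char *key) {
    unsigned h = 5381;
    while (*key != 0)
        h = h * 33 + (unsigned char) *key++;
    return h | 0x80000000u;
}

/* Distance of the entry stored at pos from its ideal slot. */
static unsigned displacement(unsigned hash, unsigned pos) {
    return (pos - hash) & (CAP - 1);
}

/* Returns the matching entry, or NULL if the key is absent. */
static Entry *rhFind(const char *key) {
    unsigned h = hashKey(key);
    unsigned pos = h & (CAP - 1);
    unsigned dist;

    for (dist = 0; dist < CAP; dist++) {
        Entry *e = &table[pos];
        /* An empty slot or a "richer" entry means the key is absent. */
        if (e->hash == 0 || displacement(e->hash, pos) < dist)
            return NULL;
        /* Compare keys only when the cached hashes are equal. */
        if (e->hash == h && strcmp(e->key, key) == 0)
            return e;
        pos = (pos + 1) & (CAP - 1);
    }
    return NULL;
}

/* Inserts or updates. Returns 0 on success, -1 if the table is full. */
static int rhPut(const char *key, int value) {
    Entry cur = { hashKey(key), key, value };
    Entry *found = rhFind(key);
    unsigned pos = cur.hash & (CAP - 1);
    unsigned dist = 0;

    assert((CAP & (CAP - 1)) == 0);
    if (found != NULL) {
        found->value = value;
        return 0;
    }
    if (count == CAP)
        return -1;
    for (;;) {
        Entry *e = &table[pos];
        if (e->hash == 0) {
            *e = cur;
            count++;
            return 0;
        }
        /* Robin Hood step: evict an entry that is closer to its ideal
         * slot than the one being inserted, then carry on with it. */
        if (displacement(e->hash, pos) < dist) {
            Entry tmp = *e;
            *e = cur;
            cur = tmp;
            dist = displacement(cur.hash, pos);
        }
        pos = (pos + 1) & (CAP - 1);
        dist++;
    }
}
```

Because probe lengths stay evenly distributed, a failed lookup can stop
as soon as it meets an entry with a smaller displacement, which is what
keeps high fill factors workable.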

Tables are created lazily with a smaller minimum size.

Insertion functions now report an error if growing the table resulted in
a memory allocation failure.
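
A rough sketch of lazy creation combined with error reporting on
growth follows. Everything in it is hypothetical and chosen for
brevity: the names (Table, tableAdd, tableCalloc), the minimum size of
8, the 7/8 fill limit, doubling on growth, integer keys and plain
linear probing are assumptions, not the values or API used in hash.c.

```c
#include <assert.h>
#include <stdlib.h>

#define MIN_SIZE 8u             /* assumed minimum table size */

typedef struct { unsigned key; int value; } Slot;   /* key 0 = empty */
typedef struct { Slot *slots; unsigned size; unsigned used; } Table;

/* Allocator hook so that an allocation failure can be simulated. */
static void *(*tableCalloc)(size_t, size_t) = calloc;

static void *failingCalloc(size_t n, size_t size) {
    (void) n; (void) size;
    return NULL;                /* simulates out-of-memory */
}

/* Stores key/value; returns 1 if a new slot was used, 0 on update. */
static unsigned place(Slot *slots, unsigned size, unsigned key, int value) {
    unsigned pos = key & (size - 1);
    unsigned isNew;

    assert((size & (size - 1)) == 0);
    while (slots[pos].key != 0 && slots[pos].key != key)
        pos = (pos + 1) & (size - 1);
    isNew = (slots[pos].key == 0);
    slots[pos].key = key;
    slots[pos].value = value;
    return isNew;
}

/* Allocates the first array or doubles it; -1 on allocation failure. */
static int grow(Table *t) {
    unsigned newSize = t->size ? t->size * 2 : MIN_SIZE;
    Slot *newSlots = tableCalloc(newSize, sizeof(Slot));
    unsigned i;

    if (newSlots == NULL)
        return -1;              /* the old table is left untouched */
    for (i = 0; i < t->size; i++)
        if (t->slots[i].key != 0)
            place(newSlots, newSize, t->slots[i].key, t->slots[i].value);
    free(t->slots);
    t->slots = newSlots;
    t->size = newSize;
    return 0;
}

/* Returns 0 on success, -1 if growing the table failed. */
static int tableAdd(Table *t, unsigned key, int value) {
    assert(key != 0);
    /* Also true for an empty table, so the array is created lazily. */
    if (t->used + 1 > t->size / 8 * 7) {
        if (grow(t) != 0)
            return -1;
    }
    t->used += place(t->slots, t->size, key, value);
    return 0;
}

/* Returns a pointer to the stored value, or NULL if absent. */
static int *tableGet(Table *t, unsigned key) {
    unsigned pos;

    if (t->size == 0)
        return NULL;
    pos = key & (t->size - 1);
    while (t->slots[pos].key != 0) {
        if (t->slots[pos].key == key)
            return &t->slots[pos].value;
        pos = (pos + 1) & (t->size - 1);
    }
    return NULL;
}
```

A zero-initialized Table owns no memory until the first tableAdd call,
and a failed growth leaves the existing entries intact so the caller
can handle the error.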

Some string comparisons were optimized to call directly into libc
instead of using the xmlstring API.

The length of inserted keys is computed along with the hash, improving
allocation performance.
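
As an illustration only, a single pass can produce both the hash and
the key length, so the later copy of the key needs no separate strlen
call. The function names and the toy hash below are assumptions for
this sketch and are not taken from hash.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hashes key and stores its length in *plen, in one pass. */
static unsigned hashAndLength(const char *key, size_t *plen) {
    unsigned h = 5381;
    size_t i;

    assert(key != NULL && plen != NULL);
    for (i = 0; key[i] != 0; i++)
        h = h * 33 + (unsigned char) key[i];
    *plen = i;
    return h;
}

/* Copies a key whose length is already known; NULL on failure. */
static char *copyKey(const char *key, size_t len) {
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, key, len + 1);     /* includes the terminating 0 */
    return copy;
}
```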

Bounds checking was made more robust.

In dictionary-based mode, unneeded interning of strings is avoided.
Nick Wellnhofer 2023-09-16 19:12:25 +02:00
parent 4f221a7748
commit 4a513d5667
2 changed files with 855 additions and 896 deletions


@@ -1,4 +1,4 @@
-Except where otherwise noted in the source code (e.g. the files hash.c,
+Except where otherwise noted in the source code (e.g. the files dict.c,
 list.c and the trio files, which are covered by a similar licence but
 with different Copyright notices) all the files are:

hash.c (1749 lines changed)

File diff suppressed because it is too large.