Key Size - The implementation performed really well for small keys (e.g. uint64_t), but much worse for larger ones.
API - The "empty key" API didn't really sit well with me, even though it was only exposed to the user in the class declaration.
So, I began iterating on my implementation, testing and benchmarking (which is a lot of fun), and after a while I came up with a result I'm quite satisfied with. Not only is the API nicer, but the performance is better: much more uniform, and in most cases faster than my previous implementation! And at ~270 lines of code, I think it's a pretty good alternative to the more common implementations out there :)
Here are some samples from the benchmarks: