Hashing #161
```
| Method     | N     |       Mean |   Error |  StdDev |
|----------- |------ |-----------:|--------:|--------:|
| MurmurHash | 1000  |   209.1 ns | 0.11 ns | 0.11 ns |
| HashCode   | 1000  |   133.8 ns | 0.18 ns | 0.17 ns |
| MurmurHash | 10000 | 2,117.4 ns | 2.74 ns | 2.43 ns |
| HashCode   | 10000 | 1,349.3 ns | 2.95 ns | 2.62 ns |
```
Based on this, I removed `MurmurHash3.cs` entirely and replaced all uses with `HashCode`.
Hmm... this is difficult. MurmurHash3 was chosen for a reason: it has the lowest collision rate among the hash methods. In the past, several users complained that their programs crashed or showed undefined behavior. That problem was caused by the hash and was solved by switching to MurmurHash3. Therefore I'm holding this PR back for now until this is clarified ^^
Does the order invariance thing indicate a bug, or is it just a stale comment?
That's intended behaviour, which is quite important because e.g.
I just tested this out by running this program for 160M iterations. The final results were:
i.e.
In that case it looks like there's a much more serious bug here! MurmurHash3 is not order invariant:
Well, I was referring to this issue from the past: #87
I noticed that Arch contains a custom implementation of MurmurHash3. Benchmarking shows that the built-in `System.HashCode` is faster (and of course simpler):
Caveats!
There are some caveats to this method which may or may not matter.
Stability
`System.HashCode` is not stable between processes. Each process has a global hash seed that is mixed into the hash, so hashes must never be persisted or shared between processes in any other way. I doubt this matters for Arch.
Order Invariance
The comments in the code previously said:
The new method does not provide this.
However, neither did the old method! This is either a bug in the old code (in which case I'll replace this with an order-invariant hash) or a stale/misleading comment (in which case it's now fixed).
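To make the distinction concrete, here is a minimal sketch of the two behaviours being discussed. It is not Arch's code: `HashOrderDemo`, `SequentialHash`, `OrderInvariantHash` and `Mix` are hypothetical names, and plain `int` ids stand in for whatever elements Arch actually hashes. The sequential version feeds elements into `System.HashCode` one after another, so the result depends on their order. The invariant version mixes each element on its own and combines the results with wrapping addition, which is commutative, so any permutation gives the same value.

```csharp
using System;

public static class HashOrderDemo
{
    // Order-sensitive: each element is added to the running HashCode state,
    // so permuting the input will generally change the result.
    public static int SequentialHash(ReadOnlySpan<int> ids)
    {
        var hash = new HashCode();
        foreach (var id in ids)
        {
            hash.Add(id);
        }
        return hash.ToHashCode();
    }

    // Order-invariant (one possible approach, an assumption of this sketch):
    // mix every element independently, then sum with wrapping arithmetic.
    // Addition is commutative, so the element order cannot affect the sum.
    public static int OrderInvariantHash(ReadOnlySpan<int> ids)
    {
        uint sum = 0;
        unchecked
        {
            foreach (var id in ids)
            {
                sum += Mix((uint)id);
            }
        }
        // Fold in the length as well; this does not depend on order either.
        return HashCode.Combine(sum, ids.Length);
    }

    // 32-bit finalizer from MurmurHash3 (fmix32), used here only to spread
    // the bits of each element before summing.
    private static uint Mix(uint x)
    {
        unchecked
        {
            x ^= x >> 16;
            x *= 0x85EBCA6B;
            x ^= x >> 13;
            x *= 0xC2B2AE35;
            x ^= x >> 16;
            return x;
        }
    }
}
```

With this sketch, `HashOrderDemo.OrderInvariantHash(new[] { 1, 2, 3 })` and `HashOrderDemo.OrderInvariantHash(new[] { 3, 1, 2 })` return the same value within a process, whereas `SequentialHash` will usually differ between the two inputs. Summing per-element hashes typically collides more often than a sequential hash, so whether the trade-off is acceptable depends on which behaviour is actually intended.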