Commit
host.id has lower cardinality (#687)
host.hostname has cardinality 100 while host.id has cardinality 50. This happens because the dataset has one host.id for each pair of hostnames: every host.id maps to two hostnames, such as 'dustin.windows' and 'dustin.linux'. This is probably an artifact of the data generation script.

Lower cardinality fields might:
* reduce sorting overhead due to fewer comparisons
* improve compression due to more data clustering together

This change should at least allow us to see if there is any benefit in choosing a lower cardinality field.

(cherry picked from commit e2ca95e)
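The cardinality relationship described in this message can be reproduced with a small sketch. The host names, the document layout, and the `cardinality` helper below are hypothetical and only mirror the shape described above (one host.id shared by a '.windows' and a '.linux' hostname). They are not taken from the real dataset or its generation script.

```python
# Hypothetical data: 50 host ids, each paired with two hostnames.
# The names are illustrative, not the ones used in the real dataset.
host_ids = [f"host-{i:02d}" for i in range(50)]

docs = [
    {"host.id": host_id, "host.hostname": f"{host_id}.{os_name}"}
    for host_id in host_ids
    for os_name in ("windows", "linux")
]


def cardinality(field):
    """Count the distinct values of a field across all documents."""
    return len({doc[field] for doc in docs})


hostname_cardinality = cardinality("host.hostname")
id_cardinality = cardinality("host.id")

print(hostname_cardinality, id_cardinality)  # prints: 100 50
```

Under these assumptions, sorting on host.id groups documents into half as many distinct keys as sorting on host.hostname. That grouping is the effect the change is meant to measure.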