In learning more about the inner workings of value lookups, Bloom filters (paper, Wikipedia) needed to be understood. Along the way discovered more are more code that uses Bloom filters or similar and was wondering if TerminusDB is using them deep down under. I am aware that Bloom filters can not be expected to work with values that are variables and so may not be a natural fit with the underlying Prolog. I ask because I don’t want to go down the rabbit hole looking as it could take some time and someone might already know the answer.
Places where Bloom filters or similar are found:
Interesting question @EricGT !
We actually looked at using bloom-filters a bit some 4 or so years ago. The main reason we were interested in them was for speeding up large “shortest path” queries between nodes where there are super nodes. If you have a node with extremely high connectivity to other nodes, it can slow down shortest path queries significantly. It therefore makes sense to have some sort of bloom filter to remove from consideration those entries which could not have connectivity to some other node, within some cost.
We didn’t end up using them simply because it increased complexity in a way which was not very flexible, and the demand for high performance over very large graphs for point-to-point queries was not in very high demand from any of our community. However it remains in our heads as something to be applied for similar problems.