Querying a graph is expensive. This idea is about making it faster.
Area grows quadratically in Euclidean space, volume by the cube, and so on. In hyperbolic space, area and volume grows exponentially. It’s quite insane just how much space there is in hyperbolic space, so here’s a video to get some intuition about it:
Anyway, the above implies an interesting property: graphs can be embedded in hyperbolic space without much distortion, in a way that the distance between the nodes is mostly preserved (trees can be embedded isomorphically, with no distortion at all), and that in a much smaller number of dimensions (for example, 2-3 instead of hundreds). Hyperbolic space can in fact be looked at as a kind of “continuous graph”.
So, how about using hyperbolic embeddings to index the RDF graph? Queries like “what is a short path between A and B?” could be turned into a geometric problem: the nodes close to the line between A and B will be candidates for the search. In fact, we may discover relationships that exist in real life but aren’t in the RDF database yet (“link prediction”).
I believe the Safe Network would allow for developing an embedding for the RDF nodes. In practice, these embeddings are built gradually by using positive and negative examples, such as “A and B are connected but A and C, A and D, A and E aren’t”. After each such step, A’s position (“embedding”) in the N-dimensional hyperbolic space is adjusted so that it would be closer to B but at the same time farther from C, D, and E. (It’s not necessary for the negative examples to be always correct because it’s a stochastic process anyway.)
So, the Safe node responsible for a given RDF entity would need to request a list of connected entities, each with their current embedding (position) and keep a cached list of random other RDF entities as negative examples, then update the entity’s embedding accordingly. This process would probably need to be piggy-backed on top of other sort of communication to be efficient.
Some basic references:
This paper sort of started it: https://arxiv.org/pdf/1705.08039.pdf, https://github.com/facebookresearch/poincare-embeddings
Their newer work uses a different model of hyperbolic space, one with better numerical qualities for optimization, but it doesn’t have code on GitHub (yet): https://mnick.github.io/publication/nickel2018learning/, https://arxiv.org/pdf/1806.03417.pdf