SAFE Network Explained: Architecture

Very complete description of safe network!!!

A few details to be corrected though:

  • What you call group is now called section. See the change of terminology in MaidSafe Dev Update – December 6, 2016. A group is now a subset of a section of exactly 8 nodes. It contains nodes that are the closest ones to a specific address.

  • What you call section is in fact called prefix (this is what you defined as “the leading bits of their identifier”)

  • Section size is >= 8 but is not necessarily <= 16, because a section is split in two only if both halves are greater than or equal to 11. This means that it splits when it reaches 22 if it is well balanced, but can grow above 22 if it is unbalanced. The margin (11 – 8 = 3) is a hysteresis factor added to avoid a merge quickly after a split.

  • Permissions are not associated with specific keys but are applicable to all keys of a Mutable Data.
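
The split rule described in the bullet on section size (both halves must hold at least 11 nodes) can be sketched as below. This is only an illustration under stated assumptions, not the routing code: node names are shortened to `u64`, and `MIN_SPLIT_SIZE`, `bit_at` and `should_split` are hypothetical names invented for this example.

```rust
/// Minimum number of nodes each half must hold before a split (11, per the post).
const MIN_SPLIT_SIZE: usize = 11;

/// Returns bit `index` of `name`, counting from the most significant bit.
/// `index` must be below 64 because names are shortened to u64 here.
fn bit_at(name: u64, index: u32) -> bool {
    (name >> (63 - index)) & 1 == 1
}

/// A section whose prefix is `prefix_len` bits long would split on the next
/// bit only if both resulting halves hold at least MIN_SPLIT_SIZE nodes.
fn should_split(members: &[u64], prefix_len: u32) -> bool {
    let ones = members.iter().filter(|&&n| bit_at(n, prefix_len)).count();
    let zeros = members.len() - ones;
    ones >= MIN_SPLIT_SIZE && zeros >= MIN_SPLIT_SIZE
}
```

With 22 nodes divided 11/11 on the next bit, `should_split` returns true; with 25 nodes divided 15/10 it returns false, which is how an unbalanced section can grow past 22.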

14 Likes

Amazing that you both picked up on a point that I really labored on while writing. For me, ‘data and communications’ seemed too broad to capture the reader’s interest, so using ‘storage and retrieval’ seemed focused and precise. Having seen this feedback I think it’s best changed to ‘data storage and communications’.

This is a good idea, but I think the document becomes too lengthy. I also think talking about what’s broken in existing systems doesn’t actually help in understanding the safe network in the context of this particular document.

Good one, I’ve updated this.

Datamap is defined in maidsafe/self_encryption as “Holds the information that is required to recover the content of the encrypted file.” The data it contains is just the list of chunks. There’s no indication of encryption for private data (although from memory this is not yet implemented).

Each datamap chunk “Holds pre- and post-encryption hashes” (L20), implying the chunk is what is encrypted (via self-encryption, not to differentiate public / private data).

So I’m not totally clear on the specific detail and would benefit from further clarification from maidsafe devs about how public / private data differs (preferably a link to an existing document).

I’d like the explanation in the document for private / public data to be a little clearer.

Yes I feel there’s not enough clarity in my use of the terms chunks vs resources vs files etc. I’m pretty sure the usage is consistent, but it’s not clear. I’ll try to improve it.

Yes, smart contracts are briefly mentioned in the Messaging section of the document, but I think it could benefit from a little more detail. I’ll see about adding some more info.

Good idea. I’ve added this.

Good catch on the typo. Fixed.

One detail I’m not clear about is how mutable data names are defined. The Mutable Data RFC doesn’t seem to specify this. Can anyone link to a document or code that clarifies how mutable data names are determined?

Thanks, I’ll fix this terminology. I tried to be consistent with the terms as used in Close Group Consensus vs Disjoint Sections… maybe some guidelines around when to use Group vs Section could be helpful to me.

Good catch. I’ve changed this.

I’ll update this as per your detail. Great use of the word hysteresis too!

My phrasing is probably a little ambiguous, but is still correct. The Permissions Section of the Mutable Data RFC says MD can have multiple users with multiple permissions. So permissions are applied per key, not to all keys of a MD.

To clarify via the code for MD permissions, permissions is a BTreeMap<User, PermissionSet>, which “Maps an application key to a list of allowed or forbidden actions”. This uses different terms for the same concept of User and Application Key.

Permissions as designed in the RFC allow multiple permissions per user with the definition permissions: BTreeMap<User, BTreeSet<Permission>> but as implemented in code is defined as BTreeMap<User, PermissionSet>. Just a subtle inconsistency in the definition of ‘multiple permissions’.

This also means permissions are allocated to users, not users allocated to permissions.

… but I’m getting stuck in the weeds here… I think the phrasing in the document is adequate to convey the concept! If you can think of a more suitable way to word it I’d be glad to know.


Thanks everyone for the feedback. This is such a great community and project to be part of.

21 Likes

Nice architecture overview! I tried to find discussion on the performance aspect of the SAFE network but couldn’t find any, or maybe I overlooked it (apologies in that case). Could someone explain how the performance would compare to the traditional client-server model? The files in the SAFE network are not only divided into multiple chunks and stored on multiple nodes, they are encrypted as well. The response to a resource request will have to grab all those chunks, decrypt, assemble and serve the resource to the client. Wouldn’t this whole process be a costly operation in terms of disk IO and therefore increase the response time significantly? I would appreciate some discussion/comments on this subject.

3 Likes

Ahh, that was an enjoyable read. Really outstanding work! Finally something perfect for showing around! Thanks thanks thanks!

3 Likes

The network doesn’t define it. It is chosen by the client app and can be anything, for example a random xor name or a hash of an application identifier.

While true, this doesn’t imply the following:

It only means that for example, user1 can have the right to create new entries and to delete any entries, while user2 can have the right to update any entries.
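
A minimal sketch of that reading, using the RFC-style shape `BTreeMap<User, BTreeSet<Permission>>` quoted earlier in the thread. The `Permission` variants, the `User` alias and the `MutableDataPermissions` type with its `grant` and `is_allowed` methods are assumptions made for this example, not the actual MaidSafe types.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Actions mentioned in the thread; the real set of variants is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Permission {
    Insert,
    Update,
    Delete,
}

/// Stand-in for an application key.
type User = String;

/// Permissions are keyed by user only. There is no entry key in this map,
/// so an allowed action applies to every entry of the Mutable Data.
struct MutableDataPermissions {
    permissions: BTreeMap<User, BTreeSet<Permission>>,
}

impl MutableDataPermissions {
    fn new() -> Self {
        MutableDataPermissions { permissions: BTreeMap::new() }
    }

    /// Adds one allowed action for `user`.
    fn grant(&mut self, user: &str, permission: Permission) {
        self.permissions
            .entry(user.to_string())
            .or_default()
            .insert(permission);
    }

    /// Checks whether `user` may perform `permission` on any entry.
    fn is_allowed(&self, user: &str, permission: Permission) -> bool {
        self.permissions
            .get(user)
            .map_or(false, |set| set.contains(&permission))
    }
}
```

Granting `Insert` and `Delete` to user1 and `Update` to user2 reproduces the example above: each user has several permissions, but none of them is tied to a particular entry key.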

2 Likes

Maybe Profiling vault performance could be a good start. It’s not exactly a comparison to traditional client-server.

The chunks are assembled by the client, so the network only needs to know how to deliver chunks. This allows for very simple caching rules that should give high performance, especially for popular chunks.

What happens if this is chosen deliberately by the client app to be the same as an existing xor name on the network?

Or slightly extending that idea, what if a user computes the xor name for an immutable data chunk before uploading to the network, creates an MD at that same xor name, then tries to upload the immutable data? Won’t this cause conflicts? I’m just not quite clear on how names can be set by someone other than the network but still be secure.

Thanks for clarifying about MD. I really appreciate your in-depth knowledge of the code.

5 Likes

For MDs it’s XOR name + type tag that’s used. I’m not sure of the details of how this works, but I think Immutable Data has one address space and Mutable Data has one address space for each type tag.

If you try to upload an MD with the same XOR name and type tag as an already existing MD I believe you’ll just get an error message.
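
To make the "name + type tag" idea concrete, here is a rough sketch of a store keyed by both values. It is not how vaults are implemented; `MutableDataStore`, `PutError` and `put` are hypothetical names, and the error-on-duplicate behaviour simply follows the belief stated above.

```rust
use std::collections::HashMap;

/// 256-bit name, as used for XOR addressing.
type XorName = [u8; 32];

#[derive(Debug, PartialEq)]
enum PutError {
    /// An MD with the same name and type tag is already stored.
    AlreadyExists,
}

/// Hypothetical store: the address of an MD is the pair (name, type tag).
#[derive(Default)]
struct MutableDataStore {
    entries: HashMap<(XorName, u64), Vec<u8>>,
}

impl MutableDataStore {
    /// Stores `data` unless the (name, type tag) address is already taken.
    fn put(&mut self, name: XorName, type_tag: u64, data: Vec<u8>) -> Result<(), PutError> {
        let key = (name, type_tag);
        if self.entries.contains_key(&key) {
            return Err(PutError::AlreadyExists);
        }
        self.entries.insert(key, data);
        Ok(())
    }
}
```

In this sketch two MDs can share a name as long as their type tags differ, while a second `put` with the same name and tag returns `PutError::AlreadyExists`.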

1 Like

Do you have a Maidsafe address for donations? :stuck_out_tongue:
That goes for you also @polpolrene

Great work gents, keep it up

3 Likes

Thank you @mav! I haven’t actually read the document yet, but I ran it through a simple spell-checker in LibreOffice and found at least the following:
eg -> e.g.
ie -> i.e.

I would recommend using another spell-checker to get rid of other possible mistypings. I think the one in MS Word may be decent for English. (I don’t have Word myself although I was on the team that created the checkers for e.g. Swedish.) Spelling and grammar checkers can be great tools for spotting mistakes, but never authorities on correctness.

2 Likes

Using eg and ie for e.g. and i.e. is perfectly acceptable - there’s no real right or wrong. I find Word always tries to Americanise words. For example Americanise -> Americanize! e.g. and i.e. are more common in the US where they tend to use more punctuation than Brits and Aussies do - for example U.S.

3 Likes

I mainly wanted to point out the usefulness of automatic checkers. I stick to American spelling, because I learned English in the US, but I guess documents for Safenet should be spelled according to British standard. Of course one has to choose the desired dictionary in Word manually. The problem with Word is so many functions are active by default. The problem is with the settings - not the checkers themselves, if used correctly.

(I also localized Clippy, the famous Office Assistant that everybody hated, but that doesn’t mean I like the implementation of it.) :wink:

“Enpoints” would also have been picked up automatically.

I’m no authority on English, but I still think those abbreviations should have periods (full stops). When it comes to spelling, the important thing is to be consistent, I think.

https://en.oxforddictionaries.com/definition/i.e.
http://dictionary.cambridge.org/dictionary/english/ie

1 Like

Good and logical choice to use what is already out there and proven thoroughly.
I don’t know the finer technical details of sha3_256 (Keccak), but I see that Belgians are involved in its creation, so it can’t be that bad :wink:
Probably better than making your own hash function, like IOTA: https://medium.com/@neha/cryptographic-vulnerabilities-in-iota-9a6a9ddc4367

1 Like

Exactly. This means that several mutable data having the same name but different tags can be uploaded in the network. They all will be stored in the same group of 8 vaults (the closest ones to the common name).

To avoid collisions with existing MDs having the same tag an application should use a random nonce concatenated to the source identifier and then hash the result to compute the name. This also renders the name unpredictable, which prevents an attacker or a competing app from squatting a name that the app will need in the future.

I share your concern about this. If all clients use a hash function to generate MD names then they will be uniformly spread in the xor namespace, which is good. But the problem is that apps are not forced to do that. I see a potential attack based on the creation of a set of MDs having a specific name to overload a group of nodes in the network, for example to get control of a section by eliminating these nodes.

5 Likes

Won’t this only mean that in the worst case someone might be able to fill the hard disk space shared by some node(s) by creating lots of MDs with almost the same names, maybe just increase it by 1 bit every time, but then some other random(?) nodes should come and take their place?

Americanisation and Britishification of text is a challenge, even for those of us with a couple of decades in both countries.

Using tap instead of faucet, or boot instead of trunk just takes some getting used to.

The real tricky situations come into play when someone uses a word common to both, that means different things. The spell checker won’t fail it and eyebrows will raise - especially if you use fanny when you mean butt (or buttocks or bottom). It’s not the same part of the anatomy in Britain.

Just remember to splash in plenty of extraneous “u’s” and remove all “z’s” and your American English becomes British English, largely :wink:

Automatic checkers of spelling and even “grammar”, i.e. syntax, cannot and are not supposed to replace human knowledge and education. Just like a chainsaw is not supposed to replace the human holding it. But if used correctly, which takes some practice and studying in itself, the automatic checkers as well as the chainsaw can help humans in their work. Automatic translators will not replace humans in the foreseeable future either, but that doesn’t make them useless in all situations.

I have now actually read the article and would like to thank @mav again. A very useful article and a good read!

I wasn’t proofreading, but I happened to spot the following mistakes:
it’s -> its
"This causes it to leave it’s existing group and become part of a new group."
“The relocated vault must now store chunks that are closest to it’s new identifier”

2 Likes

Can we link or pin this somewhere where people can see it? Really awesome job on this one!

4 Likes

Excellent resource, thank you very much. A more techy friend of mine is visiting and has some questions about:

“The network is comprised of a graph of independently operated nodes (called vaults) that validate, store and deliver data. Vault operators can contribute to the retention of network data and network performance by supplying disc space and bandwidth for use by the network. Vault operators may join or leave the network at any time without affecting the security of data stored on the network…
The network utilises SHA3-256 identifiers for vaults and data in combination with XOR distances between these identifiers to anonymise and globally distribute all data and traffic.”

Specifically he’s asking how vaults can go online and offline while the network stays fast, whether the geographical distance of the data from the person fetching it affects the speed, things like that. I think an understanding of how XOR works should clear it up for him, or am I missing something? Any links or relatively simple explanations appreciated. I already sent him the link to this architecture overview, which I hope helps.

1 Like

(The XOR concept: you could just think of it like vaults responsible for a specific piece of data will be anywhere geographically. In XOR space they are close. So, statistically, the most likely thing is that they are spread out among the most heavily populated regions that are also well connected to internet and so on.)
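
A small sketch of what "close in XOR space" means. Identifiers are shortened to `u64` for readability (the network uses 256-bit names), and `xor_distance` and `closest_group` are hypothetical helper names; only the group size of 8 comes from the thread.

```rust
/// Group size quoted in the thread.
const GROUP_SIZE: usize = 8;

/// XOR distance between two identifiers (u64 stands in for a 256-bit name).
fn xor_distance(a: u64, b: u64) -> u64 {
    a ^ b
}

/// Returns the GROUP_SIZE vault identifiers closest to `target` in XOR space,
/// nearest first.
fn closest_group(vaults: &[u64], target: u64) -> Vec<u64> {
    let mut sorted = vaults.to_vec();
    sorted.sort_by_key(|&v| xor_distance(v, target));
    sorted.truncate(GROUP_SIZE);
    sorted
}
```

Note that XOR closeness is unrelated to numeric or geographic closeness: for target 8, the identifier 15 (distance 7) is closer than 7 (distance 15). Since identifiers are hashes, the vaults in one group can sit anywhere in the world.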

It’s true that at “some point” it wouldn’t stay fast. The extreme example is that all nodes go offline, and then online again. The other end of the extreme is that no nodes go offline and no new ones come online.

At some ratio of leaving/joining to existing number of vaults, we begin to see latencies that are higher than “normally accepted”. It would of course be a very gradual onset, so the range of ratios qualifying for this is broad, both because the distribution of occurrences would be large, and because perception of “slow” is a bit subjective. There’s probably some current business standard on what’s an acceptable latency for a regular app, and if we choose that as a simplified target, then the range of ratios is a bit more confined.

OK, so those were a lot of words, but still a simple concept I hope.


There’s a constant flow of “data at rest”. So, this does not mean communication, but the stored data.
This is because there’s a constant relocation of vaults between sections, and this means that they drop their previously held data (for another vault to pick up), and themselves pick up data from the new section. The older the vault gets, the more seldom there’s relocation.
But just as there’s a (rough) doubling of programmers every 5 years, and so half of all programmers will “always” have less than 5 years’ experience (according to Uncle Bob), the same principle applies to the SAFE Network for as long as it is growing heavily. So at any given point there’s going to be a lot of data on the move. You could say that half of the network vaults will always be very young, and relocate often. (Very simplified of course, but just painting the picture with rough strokes here).


This is an animation of how devices are connected and disconnected to internet over the course of a day. The data is old now (2012 IIRC), but you see the principle. This is how the data would move as vaults join and leave. You can see the high concentration of data as red, and low concentration as blue. These are the geographical regions the data will be concentrated at, and this is how the data will move over the earth, given that devices are actually ever disconnected (or turning off the vault software). (I think that increasingly, devices will never be disconnected or shut down. But you could perhaps map this movement instead to join/leave activity, which just as other human activity would be more likely to happen when they are awake, and so it would follow this pattern as well. Very far down the road it’s maybe more scheduled/automated and connected to price movements / markets, and I guess at that point these trends blur out, and we increasingly get a stable coloring over the day. Also, in this image, you’d have to take into account that over the coming 30 years, population in Africa might double, and connectivity to internet improve dramatically, things like that. Principle the same though.)

OK. So this depicts some of the effects of vaults joining and leaving and so on.


But, what happens with that data that you wanted to request?
Well, the thing is, there’s going to be a number of vaults that step in right after one leaves. Not sure if they already have that data. But at least there will be 7 vaults able to respond to your GET request immediately after the 8th has been relocated. The work they are currently doing is syncing the data that the 8th one held to the replacement vault, or voting on its membership, or verifying its resources - that kind of stuff. And so this is where lag could come in, if this work related to the replacement vault slows down the serving of data on the GET request.
Not sure if Rust supports it, but if that work was happening on a thread with less priority than the serving of GETs, it would not interfere (significantly) with it, and thus all GET requests would be almost unaffected by the constant relocation, vaults joining and leaving and so on.
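
As far as I know, Rust’s standard library (`std::thread`) does not offer a portable way to set thread priority, so one alternative is to prioritise at the task level. The sketch below is only an illustration of that idea, not how vaults schedule work; `Task` and `WorkQueue` are hypothetical names. Pending GETs are always popped before maintenance work.

```rust
use std::collections::BinaryHeap;

/// Variant order defines priority: later variants compare as greater.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Task {
    /// Syncing data to a replacement vault, membership votes, and so on.
    Maintenance,
    /// Serving a client GET request.
    Get,
}

/// A single queue that hands out the highest-priority task first.
struct WorkQueue {
    heap: BinaryHeap<Task>,
}

impl WorkQueue {
    fn new() -> Self {
        WorkQueue { heap: BinaryHeap::new() }
    }

    fn push(&mut self, task: Task) {
        self.heap.push(task);
    }

    /// BinaryHeap is a max-heap, so any pending Get is returned before
    /// any Maintenance task.
    fn pop(&mut self) -> Option<Task> {
        self.heap.pop()
    }
}
```

A real design would also need to stop maintenance from starving under a constant stream of GETs, which this sketch does not attempt.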

Even more words… But I hope it does give insight into the matter in a comprehensible way.


So, as to answer the question: (Can vaults join and leave and still have a fast network?)
It depends how many join and leave in relation to existing vaults, as well as … I would say … how the work priority is managed at code level for the things that might affect GET latency. And basically, the better the performance the higher the join/leave ratio can be, but there will always be some upper limit.

10 Likes