[RFC] Data Hierarchy Refinement

Data Hierarchy Refinement

Note:

Bear in mind that this is very much a work in progress; there are inconsistencies as well as unfinished parts in this proposal. We have chosen to share it with the public at an early stage of iteration, for an earlier exchange of ideas. If the proposal turns out to be desired and accepted, the earliest time of implementation would be post-Fleming.

Summary

This is a proposal describing a system where all data types are built of chunks, which, together with decoupled metadata, gives uniform data handling for all types of data structures. This also solves the sustainability problem of the AppendOnlyData / MutableData types.

All content is built up from chunks.
The chunk storage is organised in various types of data structures (also referred to as data types) where the data held in the structure is a chunk or a set of chunks (via indirection using the chunk(s’) network address(es), wrapped as a Pointer).

The names of the data structures are the following:

  • Blob, whose structure is just a single blob of data.
  • Sequence, whose structure is a sequence of data.
  • Map, whose structure is a set of unique keys, and a set of values of data, where a key maps to a value.

A Shell instance holds information about a data type, such as the actual structure with the pointer(s) to the chunk(s), name, type tag, ownership and permissions.

Self-encryption could be applied to the chunks regardless of which data type they belong to.
In other words: a user can put any type of data in any structure; the chunking of the data works in the same way regardless of data type.

Background

To chunk or not to chunk

In the SAFE Network there are currently three data types, all handling data differently. In a distributed network for data storage, some solutions are more sustainable than others. Blob storage, for example, is inherently sustainable, as it splits the data up into smaller pieces - chunks - and distributes them over the nodes in the network. What is kept is a reference: a map of the chunk addresses. If the data map is too large, the process is applied recursively. This way, data is always spread out over the network, regardless of its size.

Map and Sequence (the result of splitting up AppendOnlyData and merging in MutableData) inherited from AD and MD the design whereby all the data is stored within the group of 8 nodes closest to the data in xor-space. This is, for obvious reasons, not a sustainable solution. Previously, a hard-coded limit on entry size and count acted as a forced distribution over multiple instances, by simply not allowing the user to store more in a single instance, at the cost of limiting the use cases and utility of the data type.
The most recent design removes the hard-coded limit (i.e. entry size and count are now unlimited), but does nothing to solve the problem: a group of 8 nodes will typically have a very limited storage size, which will act much the same as a hard cap on entry size and count.

The problems to solve include allowing Map and Sequence to have a sustainable data storage design, just like Blob.

Moving around data

As described above, data is stored with the nodes closest to the data, i.e. the ids of the nodes and the hash of the data are close in xor-space. As data in the network is only added or deleted, there is no change to the actual data. It either exists, or it does not exist. The older design of the data types complicated this concept.

Making a Blob Public would today change the XorName of the data (as private XorName is hash(owners + contents) while for public it’s just hash(contents)), i.e. require the data to move in the network.

An instance of AppendOnlyData or MutableData was more or less to be considered a chunk, since it was limited to 1 MiB in size. But these chunks were thus mutable, and in any case all the metadata was held in the chunks.

In a distributed network, where there are already uncertainties about how much time nodes will spend syncing data (forum discussions with estimates of the time required have not had entirely comforting results), it should be a high priority, and a clearly expressed goal, that data does not move around more than necessary. Anything else is negligence of the physical reality and a waste of resources which, considering the rudimentary analysis of sync times, could in the end also prove infeasible.

The first priority should be an acceptable UX, after which we can relax the requirements a bit, compromise and prioritize other values higher. The current design, however, did not put this problem at the forefront.
It was considered, and arguments have been formulated around the rationale for the current design, along the lines of the following:

I.e. problematic to adjust the rules for when and how xorname change?

Yeah, I believe so, because in case of Blob it is required because of the ability of an owner to delete it.
And now imagine we both have the same piece of data (which amounts to the same XorName), unbeknown to both of us, and I delete my piece of data which has the same XorName as yours. What would happen in this case? You won’t be able to access it anymore, hence we use a diff hashing mechanism for private Blob.

But the effort to solve the problem more or less ended there, which would be a natural result of not defining this problem as a high priority.

The idea of symlink is born

In the search for a simple unified way of handling data at the lowest level, and a way to decouple that from metadata changes, the focus was on extracting the structure of the data, and having everything at the lowest level be separate, as chunks. It would be more or less necessary if we were going to handle large amounts of data in all our data types, and keep the data truly distributed in the network.

Later, discussions led on to symlinks.

If instead of data moving, or indeed changing, references to data can be put in other namespaces. Like symlink, to a specific version of private data, (with whatever would be needed to access only that version).

[…]

How it works now: it’s stored at different locations, because hash(“hello” + Bob) results in 123 and hash(“hello” + Carol) results in 345.
I decide that I don’t need this file anymore, so I delete it (which is a fundamental property of private ImmutData - otherwise why have it in the first place?), so it’s not available at location 123 anymore.
However, your file stored at 345 remains unaffected since it’s your file stored at a diff location.
Now, imagine we use some sort of pointers or symlink.
I store my hello file and the location of actual data is now just hash(“hello”) (as it works with public data), so let’s say xorname 666.
I store a symlink data-loc: 666, owner: Bob at location 123 and you do the same, storing a symlink data-loc: 666, owner: Carol at loc 345.
You make your file public, and it’s all fine as it suffices to just move a symlink to a new location, from 345 to 678.
However, now I want to delete my file … and what happens then? Do I delete my pointer only? Do I delete the actual data at loc. 666, so that your symlink doesn’t work anymore? Or do I not delete the actual file at all, meaning that this data becomes undeletable? Which in the latter case does defeat the purpose of having private ImmutableData in the first place.

[…]

In combination with the symlink proposed we could have something like data is at place where it would be if published, each owner of private data has a symlink at hash(data + owner) pointing to it. When published data is not moved, but the xor_name used is no longer the unpublished one but the published one + ref_count is gone. If owner delete and refcount exists and reach 0 data deleted.
That would at least avoid having to move data when publishing, as well as deduplicate unpublished data.
But that probably have other issues.

[…]

I agree, reference counting is a common/usual way to solve this sort of problem. When last reference (symlink) is dropped, then delete the item it points to. btw, this ‘symlink’ proposal sounds to me more like a hardlink in filesystem layout, where it is hardlinks (filenames) that point to inodes, and symlinks point at hardlinks. info here

Metadata and chunks

The data stored to the network, i.e. the chunks, doesn’t change. Chunks are added or deleted. What changes is metadata. So, a specific chunk should only ever have one XorName. Basically, it is hash(chunk.payload), and so it resides in one place in xor-space regardless of Private / Public scope.
The metadata is basically a light-weight wrapper around the Private / Public scope, owners, permissions, what structure the data is organised in, and the Pointers to the chunks, i.e. the registry of where the chunks reside in the network, etc.
Metadata is what changes: the owner history is extended, the permissions history is extended (so those two, if we are nitpicky, are also only appended to), Private can change to Public, and the structure can be reorganized (adding, removing and swapping Pointers to chunks).

The conclusion of this is the following:

Data has no reason to move around, other than section membership changes that would require it to be copied over to new members.

Not only in code, but even physically, what changes often should be separated from what doesn’t. This calls for metadata and data to be separated.


Terminology

Word list

  • Gateway nodes: A category of nodes performing validation and acting as a barrier and gateway to further access into the core of System nodes.
  • System nodes: A category of nodes performing core functionality of the system, such as storage and retrieval of data.
  • Client nodes: The subset of Gateway nodes that connect to clients and manage their balances. Corresponds more or less to ClientHandler at Elders.
  • Shell nodes: The subset of Gateway nodes that holds Shell instances, and validate access to them and the chunks they point to. Corresponds more or less to DataHandler at Elders.
  • Chunk nodes: The subset of System nodes that hold data in the network. Corresponds more or less to DataHolder at Adults. Should only care about storage and retrieval of chunks, and be oblivious to anything else.
  • Chunk: A piece of data, with size of at least 1 KiB and at most 1 MiB.
  • RefCount: A number used for counting number of unique clients referencing an individual (private) chunk.
  • Data: Loosely defined as that which content is comprised of, and which chunks hold.
  • Content: Digital information input to the network by clients. On the network it is contained in a chunk or a set of chunks, or even in a structure of a combination of the aforementioned. The structure itself can be an essential part of the semantics of the content - even though strictly speaking the structure holds the content.
  • DataStructure: Comes in different types, DataTypes, that define the structure in which data is held, such as Sequence, Map or Blob. Accessed through a Shell, which is held by Gateway nodes. A DataStructure is light-weight as it only holds Pointers (as well as versions and keys), and not the actual Data.
  • Blob: A structure of a single Pointer. (Usually used for a single large file, which is stored in a chunk or a set of chunks, hence only a single Pointer is needed.)
  • Map: Key-value structure of Pointers.
  • Sequence: Append-only structure of Pointers.
  • Pointer: A pointer to data, i.e. an address to a single chunk, or a set of addresses to chunks.
  • Shell: Contains a higher layer representation of the data held by chunks, also all its metadata. Without the information in the Shell, it would be impossible to locate the chunks that build up a certain piece of content stored to the network. It is like a map to the chunks, and a blueprint for how to reconstruct those chunks into a meaningful representation. This map and blueprint is protected from unauthorized access, using the permissions specified by the owner/user, and held in the Shell. This means that the access to the chunks and the content they are part of, is protected by the Shell. (This is why the nodes holding a Shell are part of Gateway nodes, since they act as a gateway to the data.) A Shell is always located in the network using its Id, which is based on an arbitrary name specified by the creator (owner) in combination with a type tag. In case of private data, the owner is also included in the derivation of the Shell Id. A Shell is light-weight, as it only holds the light-weight metadata components, such as name, type tag, owner, permissions and pointers to data. When key components of the metadata change (those that its Id and location are derived from) it is therefore a light-weight operation to move the Shell from one group of nodes to another.
  • ClientNodes(id): The 8 Client nodes closest to the client_id.
  • ShellNodes(id): The 8 Shell nodes closest to the shell_id.
  • ChunkNodes(id): The 8 Chunk nodes closest to the chunk_id.
  • Scope: The Scope of the data is defined as Private or Public. Public data is always accessible by anyone, but permissions for modification can be restricted. Private data is initially only accessible by the owner. Permissions can be added for specific users or groups thereof. However, it is not possible to add permissions for the User::Anyone category, because the Private data instance would then be indistinguishable from Public data in that regard. Private data can be deleted.

Gateway nodes and System nodes

The Gateway nodes subspecialize in various validation areas, such as payment for operations, or permissions to data.

System nodes specialize in the actual core handling of the system functionality, such as storage and retrieval of chunks.

The distinction is meant to allow for additional subsets of gateway nodes or system nodes to follow the same architecture.

Quick explanation of the word list

All content is held in the network as data in one or more chunks.
The chunks are stored individually at the nodes. The references to the chunks are organised in data structures of various data types. These hold the network address(es) to the chunks, wrapped as Pointers.
The data structures are the following:

  • Blob, whose structure is just a single blob of data.
  • Sequence, whose structure is a sequence of data.
  • Map, whose structure is a relation between a set of keys and a set of values of data.

A Shell instance holds information about a data structure instance, such as the actual structure with the pointer(s) to the chunk(s), name, type tag, ownership and permissions.

Self-encryption could be applied to the chunks regardless of which data type they belong to.


The Shell holds the metadata

The Shell

The Shell for a Private data structure instance is found at the XorName address hash(owner + name + datatype + tag). The Shell for a Public instance is found at hash(name + datatype + tag).
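
A minimal sketch of this address derivation, not the actual implementation (DefaultHasher and the u64 name are stand-ins for the real 256-bit hash, and the datatype is passed as a string purely for illustration):

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type XorName = u64; // stand-in for the real 256-bit network name

fn public_shell_id(name: &str, datatype: &str, tag: u64) -> XorName {
    hash_parts(&[name.as_bytes(), datatype.as_bytes(), &tag.to_be_bytes()[..]])
}

fn private_shell_id(owner: &[u8], name: &str, datatype: &str, tag: u64) -> XorName {
    // Including the owner gives each owner their own Shell address for
    // otherwise identical name / datatype / tag.
    hash_parts(&[owner, name.as_bytes(), datatype.as_bytes(), &tag.to_be_bytes()[..]])
}

fn hash_parts(parts: &[&[u8]]) -> XorName {
    let mut hasher = DefaultHasher::new(); // stand-in for the network hash
    for part in parts {
        part.hash(&mut hasher);
    }
    hasher.finish()
}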

The Shell consists of:

  • Id: (hash(name + datatype + tag) | hash(owner + name + datatype + tag))
  • Name: (string, arbitrary name)
  • TypeTag: (u64, reserved ranges exist)
  • Scope: (Private | Public)
  • OwnerHistory: (Vec<Owner>)
  • PermissionsHistory: (Vec<Permissions>)
  • DataStructure: (Blob | Map | Sequence | Shell) Enum holding a value of Pointer in the case of the Blob variant, a structure of Pointers in the case of Map or Sequence (i.e. BTreeMap<Key, Vec<Pointer>> and Vec<Pointer> respectively), and a tuple of Vec<Pointer> and Box<DataStructure> in the case of Shell (where the Vec<Pointer> is for the previous Shells, and the DataStructure is for the data appended since the previous one was blobified).

Pointer holds either a ChunkMap or an XorName. It represents a Blob, Map or Sequence value, which can be either just a single chunk or, if the stored value was big enough, a set of chunks.

pub enum Pointer {
    /// Points directly to
    /// another Shell instance.
    Shell(XorName),
    /// From large content.
    ChunkSet {
        /// Locations of the chunks in the network
        /// (the same as chunk post-encryption hashes)
        chunk_ids: Vec<XorName>,
        /// An encrypted ChunkMap.
        chunk_map: Vec<u8>,
    },
    /// From small content.
    SingleChunk(XorName),
}
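
For illustration, a possible shape of the Shell given the field list above - a sketch only, building on the Pointer enum just shown; XorName, Owner, Permissions, Key and Scope are placeholder types here, not the actual definitions:

use std::collections::BTreeMap;

type XorName = [u8; 32];     // placeholder
type Owner = Vec<u8>;        // e.g. a serialized public key
type Permissions = Vec<u8>;  // a serialized permission set
type Key = Vec<u8>;

pub enum Scope { Private, Public }

pub enum DataStructure {
    /// A single Pointer, for a binary large object.
    Blob(Pointer),
    /// Keys mapping to (the versions of) their Pointers.
    Map(BTreeMap<Key, Vec<Pointer>>),
    /// An append-only list of Pointers.
    Sequence(Vec<Pointer>),
    /// Pointers to previous, blobified versions of this Shell,
    /// plus the structure holding the data appended since.
    Shell(Vec<Pointer>, Box<DataStructure>),
}

pub struct Shell {
    pub id: XorName,
    pub name: String,
    pub type_tag: u64,
    pub scope: Scope,
    pub owner_history: Vec<Owner>,             // only ever appended to
    pub permissions_history: Vec<Permissions>, // only ever appended to
    pub data: DataStructure,
}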

Any request on data goes through such a Shell, which is handled at Gateway nodes, ShellNodes(id), (essentially corresponding to the DataHandlers at section Elders).
The request is validated against owner and permissions, etc.
If the request is valid, it is forwarded to the Chunk nodes at ChunkNodes(id), for each chunk in the requested data, while a receipt of the request is returned to the client.
This is the Scatter-Gather pattern: the request is scattered over the set of ChunkNodes(id), and the aggregator is the client, which asynchronously receives the Chunk node responses and matches them by the correlation ids also present in the receipt it received from the ShellNodes(id) (a minimal sketch of this matching follows after the list below).

In other words:

  • ShellNodes(id) has the metadata and by that can send all the necessary requests to ChunkNodes(id).
  • ShellNodes(id) responds to the client, with a receipt of what responses it needs to expect from ChunkNodes(id), and how to restore that into the requested data.
  • ChunkNodes(id) receive information from ShellNodes(id) of where to send the requested chunks, but have no idea what Shell the chunk is part of (a chunk-address can exist in any number of Shell instances).
  • The recipient client could be the owner or could be someone with read permissions on the chunk, for all that ChunkNodes(id) knows.
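
A minimal sketch of the receipt idea, with purely hypothetical names: the ShellNodes(id) tell the client which correlation ids to expect, and the client aggregates the asynchronously arriving responses against that receipt:

use std::collections::{HashMap, HashSet};

type CorrelationId = u64;    // hypothetical
type XorName = [u8; 32];     // placeholder

/// Returned by ShellNodes(id) when the request has been validated and scattered.
struct Receipt {
    /// One correlation id per chunk request sent to ChunkNodes(id).
    expected: HashSet<CorrelationId>,
    /// How to restore the responses into the requested data (here: chunk order).
    order: Vec<XorName>,
}

/// Client-side aggregation: true once every expected response has arrived.
fn all_gathered(receipt: &Receipt, responses: &HashMap<CorrelationId, Vec<u8>>) -> bool {
    receipt.expected.iter().all(|id| responses.contains_key(id))
}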

Shell data growth

If the Shell size exceeds some pre-defined size, the Shell is blobified, i.e. stored as a Blob (with scope corresponding to the Shell scope), and the current Shell updated as follows:

  • Id: (no change)
  • Name: (no change)
  • TypeTag: (no change)
  • Scope: (no change)
  • OwnerHistory: (Vec<Owner> with only the last entry, new entries go here)
  • PermissionsHistory: (Vec<Permissions> with only the last entry, new entries go here)
  • DataStructure: (Set to Shell enum variant. The value is a tuple of a Vec<Pointer> and a Box<DataStructure>. The vector holds pointers to previous versions of the Shell, now stored as Blobs. The Box<DataStructure> will just be the Blob when structure is Blob. In case of Map/Sequence - the latest versions of the Pointers will be kept in the DataStructure, and new entries go here).

This Shell is now the current version Shell. Any changes to the metadata take place in this instance. Previous versions are now immutable, as they have been blobified.
Previous versions are kept as references in a vector, which point to the Blobs containing the serialized Shells.
In case of Map / Sequence, every previous version of Shell holds earlier versions of Map keys, or Sequence values.

Since a Blob doesn’t have a growing data structure, the only way such a Shell would exceed the max size is if the owner or permissions histories have grown beyond it.
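
A minimal sketch of the blobification step, reusing the sketched Shell and DataStructure types from above; store_as_blob is a placeholder for storing the serialized current Shell as a Blob (with the same scope) and returning a Pointer to it:

fn blobify(shell: &mut Shell, store_as_blob: impl FnOnce(&Shell) -> Pointer) {
    // 1. Offload the current state as an immutable Blob, getting a Pointer to it.
    let previous = store_as_blob(shell);

    // 2. Keep only the latest owner / permissions entries; new entries go here.
    let last_owner = shell.owner_history.last().cloned();
    shell.owner_history = last_owner.into_iter().collect();
    let last_permissions = shell.permissions_history.last().cloned();
    shell.permissions_history = last_permissions.into_iter().collect();

    // 3. Switch to (or extend) the Shell variant: previous versions as Pointers,
    //    the latest structure kept for new entries.
    let current = std::mem::replace(&mut shell.data, DataStructure::Sequence(Vec::new()));
    shell.data = match current {
        DataStructure::Shell(mut previous_versions, latest) => {
            previous_versions.push(previous);
            DataStructure::Shell(previous_versions, latest)
        }
        other => DataStructure::Shell(vec![previous], Box::new(other)),
    };
}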

“Modifying” data

In the current version of the Shell (the only one held at Gateway nodes; earlier versions have been offloaded as Blobs), you can append new data to a Sequence, or delete, insert or update in a Map (NB: internally this is also appending, regardless of Private / Public). This is done by storing to the network the data as a chunk or a set of chunks, and storing to the data structure held in the Shell the reference to the chunk (the XorName) or to the set of chunks (the ChunkMap) - i.e. a Pointer.
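
A minimal sketch of such operations on the current Shell, reusing the sketched types from above; the chunks themselves have already been stored, only the Pointer to them is appended or inserted here:

fn append_to_sequence(shell: &mut Shell, new_value: Pointer) {
    if let DataStructure::Sequence(pointers) = &mut shell.data {
        pointers.push(new_value);
    }
}

fn insert_into_map(shell: &mut Shell, key: Key, new_value: Pointer) {
    if let DataStructure::Map(entries) = &mut shell.data {
        // Internally also an append: earlier versions of the value are kept.
        entries.entry(key).or_default().push(new_value);
    }
}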

Deletion

In the case of Scope being Private, we allow deletion of the actual data, by refcounting the chunks and only deleting the actual chunk if the decrement results in 0. The end result will be exactly the same as if two copies of the chunk were maintained on the network. A client never accesses the raw data, it always accesses a Shell, so the fact that the network stores the copies of the same data from different users at the same location in the network is an implementation detail. Either way we go about it (multiple copies or deduplication), if more than one user has the data, the data still exists somewhere in the network when it is deleted by one user. The only difference between keeping two copies or one is where in the network it is kept, which is a pure technicality, completely opaque to the user, since they always access it through the Gateway nodes handling the Shell.

pub struct PublicChunk {
  payload: Vec<u8>
}

pub struct PrivateChunk {
  ref_count: u64, // Only private chunks have ref count, since they can be deleted.
  payload: Vec<u8>
}

pub enum Chunk {
  Public(PublicChunk),
  Private(PrivateChunk),
}

XorName of the chunk is hash(payload).

If an owner deletes a chunk, the ref_count of the chunk is decremented. If the ref_count thereby reaches zero, the chunk is deleted from the network.
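
A minimal sketch of that logic, reusing the PrivateChunk type above; remove_from_store is a placeholder for the Chunk node’s actual removal of the chunk:

fn delete_private_chunk(chunk: &mut PrivateChunk, remove_from_store: impl FnOnce()) {
    chunk.ref_count = chunk.ref_count.saturating_sub(1);
    if chunk.ref_count == 0 {
        // Last reference gone: the chunk is deleted from the network.
        remove_from_store();
    }
}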

If an owner deletes a data structure instance, delete is called on all of its chunks (using the Pointer(s) in the DataStructure) and then the Shell is deleted from the network (and all blobified versions of it). Such a process could be a longer running task, and we could store the deletion process state in the actual Shell, to allow for monitoring of the process.

Storing data

Clients do the chunking of content and organize the encrypted ChunkMaps (held in Pointers) in some data structure. They then upload the structure and the chunks to the Gateway nodes closest to the client_id - ClientNodes(id). These validate the balance, and then forward the chunks to the Chunk nodes - ChunkNodes(id). ClientNodes(id) then continue by updating the Shell specified in the request, at the corresponding ShellNodes(id). They could be updating a Map with an insert, a Blob with a create, or a Sequence with an append, all with the same kind of value - an address to data, a Pointer.

  1. Client chunks the data, and acquires a ChunkMap.
  2. Client stores to network by sending all chunks to ClientNodes(id).
  3. ClientNodes(id) subtracts a StoreCost_a.
  4. ClientNodes(id) sends the chunks to ChunkNodes(id).
  5. Client encrypts the ChunkMap, and calls ClientNodes(id) to store it in some data structure.
  6. ClientNodes(id) subtracts a StoreCost_b.
  7. ClientNodes(id) calls ShellNodes(id) to update the DataStructure held by its Shell, inserting/appending the encrypted ChunkMap.

If the content size is too small to undergo self-encryption, the client has to encrypt the data before producing the ChunkMap.
A wrapper around the self-encryptor could handle this case, to ensure that any data leaving the client is always encrypted, as per the fundamentals.
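
A minimal sketch of such a wrapper, under the assumption of a 3 KiB self-encryption threshold (mentioned in the discussion further down); self_encrypt and encrypt_small are placeholders for the actual self-encryption and client-side encryption steps, and ChunkMap is the type shown further down:

const MIN_SELF_ENCRYPTION_SIZE: usize = 3 * 1024; // assumed threshold

enum PreparedContent {
    /// Large content: self-encrypted into chunks, described by a ChunkMap.
    Chunked { chunks: Vec<Vec<u8>>, chunk_map: ChunkMap },
    /// Small content: encrypted client-side and stored as a single chunk.
    SingleChunk(Vec<u8>),
}

fn prepare_content(
    content: &[u8],
    self_encrypt: impl Fn(&[u8]) -> (Vec<Vec<u8>>, ChunkMap),
    encrypt_small: impl Fn(&[u8]) -> Vec<u8>,
) -> PreparedContent {
    if content.len() >= MIN_SELF_ENCRYPTION_SIZE {
        let (chunks, chunk_map) = self_encrypt(content);
        PreparedContent::Chunked { chunks, chunk_map }
    } else {
        // Too small for self-encryption: encrypt before anything leaves the client.
        PreparedContent::SingleChunk(encrypt_small(content))
    }
}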

Retrieving data

With a reference to a data structure instance, i.e. to its Shell, the actual data is retrieved and reconstructed by calling ClientNodes(id) for the relevant Pointer (a single one in the case of Blob, the one mapped by a key for a Map, or for example the current version of a Sequence). The ClientNodes(id) look up the Pointer and send a request to all relevant ChunkNodes(id) for the XorNames found in the chunk_ids field of the Pointer, asking them to retrieve the chunks. Finally, ClientNodes(id) return the encrypted ChunkMap held in the chunk_map field of the Pointer to the client.
At the client, the encrypted ChunkMap is decrypted, and used to reconstruct the content from all its chunks, which asynchronously drop in from the network as they traverse it from their respective data holders.

  1. Client calls ClientNodes(id), requesting to access some entry in some data structure instance identified by shell_id.
  2. ClientNodes(id) retrieve the entry from the ShellNodes(id) and then A. request the chunks from the data holders (using the chunk_ids part of the Pointer), and B. return the entry to the client.
  3. The client decrypts the chunk_map of the Pointer part of the entry and, using the decrypted content (the ChunkMap), reconstructs the content from all the chunks that come in as a result of the request (sketched below).
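
A minimal sketch of step 3, the client-side reconstruction; decrypt and decode stand in for the client’s ChunkMap decryption and the self-encryptor’s decoding, and ChunkMap / ChunkDetails are the types shown further down:

use std::collections::HashMap;

fn reconstruct(
    encrypted_chunk_map: &[u8],
    received: &HashMap<Vec<u8>, Vec<u8>>, // post-encryption hash -> chunk payload
    decrypt: impl Fn(&[u8]) -> ChunkMap,
    decode: impl Fn(&[ChunkDetails], &[Vec<u8>]) -> Vec<u8>,
) -> Option<Vec<u8>> {
    let chunk_map = decrypt(encrypted_chunk_map);
    // Gather the chunks in ChunkMap order; they arrive asynchronously, so if
    // any is still missing, there is nothing to reconstruct yet.
    let ordered: Option<Vec<Vec<u8>>> = chunk_map
        .iter()
        .map(|details| received.get(&details.hash).cloned())
        .collect();
    ordered.map(|chunks| decode(&chunk_map, &chunks))
}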

What we achieve

The above achieves the following:

  • All data is represented as chunks in the network.
  • All such chunks are deduplicated (depends on implementation details though).
  • Metadata is separated from data.
  • Modifying metadata (e.g. Private to Public) does not move around data.
  • We make it clear that we have two layers: one protecting validation layer, and one core system layer.
  • We sustainably store “unlimited” count of entries of “unlimited” size in Map and Sequence.
  • It unifies the data handling and solves the problem at the system design level instead of code level.
  • Any additional data types, and how they are handled, are implemented in the Shell nodes, while Chunk nodes are oblivious to them, and only deal uniformly in chunks.
  • Any additional validation roles are implemented as a subset of Gateway nodes.
  • Any additional core system roles are implemented as a subset of System nodes.

Oblivious nodes, Data flows

When storing data to the network the client sends

  • Chunks, which the client nodes forward to Chunk nodes.
  • Pointers (in some structure), which the client nodes forward to shell nodes.

It follows from this that:

A. It is the responsibility of the client to ensure that the chunks referenced in the Pointers are stored to the network. Otherwise, a GET request for the Pointers would give an error.
B. It is the responsibility of the client to ensure that Pointers are stored to a Shell in the network. Otherwise the client will not be able to retrieve the chunks again and restore the content.
C. Sharing of the contents requires encryption of the ChunkMap using a key which allows the other party to decrypt it.

pub enum Pointer {
    /// Points directly to
    /// another Shell instance.
    Shell(XorName),
    /// From large content.
    ChunkSet {
        /// Locations of the chunks in the network
        /// (the same as chunk post-encryption hashes)
        chunk_ids: Vec<XorName>,
        /// An encrypted ChunkMap.
        chunk_map: Vec<u8>,
    },
    /// From small content.
    SingleChunk(XorName),
}

pub type ChunkMap = Vec<ChunkDetails>;

pub struct ChunkDetails {
    /// Index number (starts at 0)
    pub chunk_num: u32,
    /// Post-encryption hash of chunk
    pub hash: Vec<u8>,
    /// Pre-encryption hash of chunk
    pub pre_hash: Vec<u8>,
    /// Size before encryption (compression alters this as well as any possible padding depending
    /// on cipher used)
    pub source_size: u64,
}

Client full flow

(This is a simplified draft)

  1. Client wants to store content.
  2. The content is self-encrypted (if large enough).
  3. Each chunk is uploaded to the client Elders (ClientNodes(id)), who in turn send them to the ChunkNodes closest to the chunk_id.
  4. Client produces Pointers, with chunk ids (from self-encryption step) and the encrypted ChunkMap (also from self-encryption step).
  5. Client requests a Shell to be created or updated with the Pointer(s), by sending them to the client Elders (ClientNodes(id)), which then interact with the ShellNodes closest to the shell_id to perform the corresponding operation on the Shell (this flow is sketched below).
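
A minimal sketch of this flow from the client’s side; the ClientApi trait and its methods are hypothetical placeholders, not an existing interface, while Pointer and XorName are reused from the RFC:

/// Hypothetical client-side interface, for illustration only.
trait ClientApi {
    /// Step 2: self-encrypt, returning the chunks, their ids and the encrypted ChunkMap.
    fn self_encrypt(&self, content: &[u8]) -> (Vec<Vec<u8>>, Vec<XorName>, Vec<u8>);
    /// Step 3: upload one chunk to the client Elders (ClientNodes(id)).
    fn send_chunk(&mut self, chunk: Vec<u8>);
    /// Step 5: ask ClientNodes(id) to create / update the Shell with a Pointer.
    fn update_shell(&mut self, shell_id: XorName, pointer: Pointer);
}

fn store_content(client: &mut impl ClientApi, content: &[u8], shell_id: XorName) {
    let (chunks, chunk_ids, encrypted_chunk_map) = client.self_encrypt(content);
    for chunk in chunks {
        client.send_chunk(chunk);
    }
    // Step 4: produce the Pointer from the self-encryption results.
    let pointer = Pointer::ChunkSet { chunk_ids, chunk_map: encrypted_chunk_map };
    client.update_shell(shell_id, pointer);
}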

(Diagram: Chunk data flow)

(Diagram: Shell data flow)

Unresolved questions

Caching

If Get requests go via Gateway nodes, and are then relayed to Chunk nodes, who finally call the client, we are most likely passing the network in a one-way circle, rather than on the same path back and forth. This complicates the caching.

If it came to be critical, it should be possible to ensure the response path is the same as the request path (although that would not be a step in the preferred direction, in terms of network design).
However, it is probably not required. Even if responses do not take the same path as requests, as happens now, when we talk about popular chunks they should be popular among people randomly distributed over sections, so basically a section on the response path of some will be on the request path of others.

Hidden chunks

(The hidden chunks property of this system has received some initial positive responses, but it remains an unresolved question until it has been examined deeper.)

The location of the chunks is known by the nodes holding a Shell (that’s how they can fan out to the ChunkNodes, to request for them to send the chunks to the requesting client). They are stored in the Pointer field chunk_ids. But how to combine them into the original content, is only known using the ChunkMap. The ChunkMap is stored encrypted in the Pointer field chunk_map.
This way, although intermediate nodes can correlate chunks and clients, only the client can read the raw content of chunks uploaded to the network.

Explanation

Let’s say we wanted the feature of Hidden chunks (not saying we in the end would, but in case we did)… (disregarding encryption of the content pre-self-encryption, as that would disable deduplication and complicate sharing).

…we would then:

A. Using self-encryption, we produce the chunks, get their Ids and the corresponding ChunkMap out of some content.
B. Upload the chunks to the network (through ClientNodes(id)).
C. Encrypt the ChunkMap locally.
D. Place the ids of the chunks, and the encrypted ChunkMap into a Pointer.
E. Choose a structure for the Pointer to be stored in.
F. Create or update a Shell instance with given Pointer (through call to ClientNodes(id)).

(The final update of a Shell can include a multitude of Pointer operations, in a transaction.)
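
A minimal sketch of steps C and D: only the chunk ids (needed for routing) stay visible to nodes, while the ChunkMap is encrypted client-side; serialize and encrypt_for_owner are placeholders for whatever encoding and client-held key would be used:

fn hide_chunk_map(
    chunk_ids: Vec<XorName>,
    chunk_map: &ChunkMap,
    serialize: impl Fn(&ChunkMap) -> Vec<u8>,
    encrypt_for_owner: impl Fn(&[u8]) -> Vec<u8>,
) -> Pointer {
    Pointer::ChunkSet {
        chunk_ids,
        // Nodes can route the chunks, but cannot reassemble the content.
        chunk_map: encrypt_for_owner(&serialize(chunk_map)),
    }
}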

Consequences

Since all interaction with data goes through Gateway nodes, by interacting with a Shell (referenced by its name and tag), checking for the existence of a chunk on the network will not be possible.
This means that it will not be possible to self-encrypt some file and check whether that file exists on the network.
This effect could be considered improved privacy.

Scenario: Let’s say you are suspicious of some leak of information, a whistle-blower for example, and so you plant some documents where you know they will be found by someone you suspect. Later on you could be polling the network for the existence of these documents, and once found you will have revealed the whistle-blower.

A pro SAFENetwork user would know to attach some minimal information to any data uploaded, so as to completely change the ChunkMap and protect against the above setup. Nonetheless, it is not unlikely that people would regularly fall into such traps, for some reason.

Storage cost payments

In the above examples, one payment is made when chunks are uploaded, another payment is made for operations on the Shell (metadata).
This allows for differentiating the pricing. Maybe the cost per chunk could be variable based on the chunk size. Or maybe it is good to have one cost per chunk. Regardless of that, the Shell operations could also be priced differently. It could be argued that updating the Shell should be a cheaper operation.

25 Likes

I’ve often wondered how some of this worked under the hood but never dared look. :wink: Well done to those brave enough to have wrestled with this.

It’s a lot to take in. A couple of things stuck out on first read…

This leaves me uncomfortable. I see that this would be rare, but I’m assuming it will happen, in which case what happens? Presumably it becomes impossible to do certain operations in the data. I’m guessing the data would effectively become frozen - no more changes.

This seems problematic because losing the ability to mutate a particularly crucial data structure - exactly the kind that might accumulate lots of ownership updates - could have large secondary impacts. Given this is the perpetual web, we need to carefully consider the very long term, whatever that might mean!

Perhaps the limit could be mitigated after the fact? If so how? If not, the repercussions need examining.

If as I think was explained in another topic, Private data needs to be encrypted by the client while Public data will necessarily be unencrypted, won’t that mean that when data is published one chunk will be deleted and another uploaded?

Even though the plaintext is the same, the encrypted and unencrypted data will differ and so have different xor address and storage location.

If not, won’t we still have the problem of Private data being visible on its way across the network?

4 Likes

I think you misunderstood what it meant to say. I’ll see if I can make that clearer in the RFC.
All Shells can “grow” endlessly. After a certain size is reached, it is distributed, so that the Shell itself is never an unwieldy piece of data. But Blob would be less common to do so, simply because it has no data growth.

The method used for this (blobifying) is likely to change, I plan to do something similar to the “ever expanding database” if you remember that one :slight_smile:

This is exactly what this RFC is about :slight_smile:

No, not necessarily. In case of non-selfencrypted (<3KiB) then maybe, but that is also a very lightweight operation.
In case of larger data, the chunk map is just replaced with its unencrypted version.

3 Likes

Nice work. There is a lot to dissect here after reading through it a few times. The only way for me to give feedback in a coherent manner is to pick through it line by line as a running commentary. Although crude it is the only way I can manage a response in the wee hours of the morning. Here goes:

Good. I would go one step further to require that the “chunk” datastructure is a base unit used in the construction of ALL other datatypes in the hierarchy following an OOP construction by assembly approach, including metadata.

Based on this RFC and past discussions in the other thread on Data Types Refinement you have convinced me that the term Blob is an absolutely horrible descriptor for what you are trying to accomplish. More on this below.

It is unclear here if you really mean data “type” vs. an instantiated data “object”.

Just chunk it. Chunk early and chunk often. :cowboy_hat_face:

Interesting insight. I thought that if the local group of 8 did not have enough storage then the nearest neighbor search radius is expanded to include more than 8 nodes?

I agree that there is an opportunity for improved deduplication here.

I like where this is going. Viewing SAFE as a big hard drive in the sky with analogous operations/functions to a common tried and true filesystem like ext4 or xfs will help speed development IMO since you already have a stable, working and well documented model of what you are trying to accomplish at a grander scale.

If not careful with the definitions this could lead to some circular dependencies since the metadata needs to be stored somewhere too. Consider as an example the EXT4 filesystem where we have data blocks and metadata blocks. Regardless of data block type (meta vs. actual) they are all stored in fixed block sizes on disk (typically 4kiB to match hardware sector size). I view the EXT4 block on disk to be analogous to a SAFE chunk. This indicates that your metadata should ultimately be stored as a “chunk” too if you want to keep the logical consistency and benefits of assembling a well defined object hierarchy.

Specific comments about terminology:

Nice. I like the differentiation here. This could also be generalized to N layers extending from core to boundary. A few synonyms that evoke different imagery for the case of N =3:

  • “Gateway Nodes”, “System Nodes”, “Kernel/Core Nodes” for a computer reference.
  • “Exterior Nodes”, “Boundary Nodes”, “Interior Nodes” for a spatial reference.
  • “Frontier Nodes”, “Border Nodes”, “Control Nodes” for a geographical/political reference.
  • “Peripheral Nodes”, “Passing Nodes”, “Principal Nodes” for a roles reference.

The term Shell is not used appropriately here and also later in the document. In computing the term ‘shell’ is synonymous with a user interface that allows access to an operating system, its programs and services. It is confusing to equate shell terminology with pure data and datatype constructs unless you are specifically building to a shell program like the SAFE CLI. I do like the simple and self-explanatory definition offered by Client Nodes

Programming wise, if chunks form the base object in an OOP hierarchy from which other types are assembled, then your metadata should also be stored as chunks. This means that all nodes would store and retrieve chunks, but nodes dedicated to dealing with metadata would store and retrieve metadata chunks, and nodes dedicated to data would store and retrieve data chunks. For this reason I would recommend using the terms Data and Meta Data. To maintain continuity with previously employed terminology I suggest using Data Vaults and Meta Vaults here.

You seem to really like this term, but it is not a good designation for what you are trying to achieve here. This is made more evident by the picture you drew below. Really what you are designating by your Blob and Sequence is an Unordered Set vs. an Ordered Set. The mathematical definition of an Ordered Set is essentially a Sequence (when duplicate entries are allowed) so you’ve got that one. Blob on the other hand evokes no intuition of a Set. So why not just keep it simple and call it a Set. You could also extend the Set terminology to a Collection where duplicate entries are allowed.

Everywhere else in the computing world this is called a File. The use of Files in the network is OK. KISS. “Everything is a File.” And like I said earlier, shell is usually reserved for user interaction with programs/services. Later on when SAFE has a computation layer, I could see “ShellNodes” as being the perfect description of an interface to this layer. These future ShellNodes would handle the running of SAFE programs and processes as part of a general SafeOS.

ClientNodes, FileNodes, DataVaults, MetaVaults.

I would be happier if you replaced the term “Shell” with “File” in this large section.

Seems inconsistent to have specialized ChunkSet and chunk_map types. Wouldn’t it be preferable to build these higher level types from lower level ones? That way a chunk_map is instead Map<Chunk>.

This is a problem. Nothing in this world should ever have the indignity of being blobified at this high level of abstraction. I suspect that it would only get too large due to the owner history and permission history. Better to change these constructs from a Vec to a Sequence so that all of the #blobification can happen under the hood.

Perfectly logical and follows standard filesystem practice. However, is there a chance for a security exploit here where the refcount could be decremented maliciously?

You forgot one of the best features that is possible with this approach. The chunks can be encrypted again by a subtype of Gateway nodes prior to being sent to a Data Vault (your ChunkNode) and decrypted when retrieved from the vault. These keys would only be known by the Gateway layer, not the Client nor the System Layers.

All nice to see. I also like your data flow diagrams. They really help make things easy to understand. A lot of possibilities here.

12 Likes

Fantastic feedback, @jlpell. Much appreciated :slight_smile:
I’ll be brewing on a response.

7 Likes

I need to correct myself here. Client node is actually ambiguous since that best describes the client/user computer that is actually connecting to the SAFE Network. So I would instead propose a rename of Client node to Shell node. Here Shell node is a better nomenclature since if I understand correctly these are essentially the network interface that handles client requests and does input/output to the clients.

Another correction needed here. Given the proposed rename above. These would be more appropriately called Process Nodes or Compute Nodes.

In summary, you end up with Shell Nodes, Process Nodes, and Data Nodes/vaults. An intermediate File Node aka Meta Vault to deal with your metadata needs might also be important to complete the picture and form an analogy to a general purpose computer system with typical levels of pointer indirection. After all, a general world computer is the end goal is it not?

6 Likes

Yes, the internet cables are simply the connections on the giant mother board. :slight_smile:


Where we differ on the view of chunks, is that I see all chunks as a write once immutable object.
You can write anything between 1KiB to 1MiB, but once written it can not be changed.

Metadata, since it is changing (you edit name, you change scope, modify permissions), would not go into chunks for that reason.

It should say “an instance of a data type” :+1:

Not AFAIK.

The metadata is stored on the ShellNodes. It’s a different structure on a different type of nodes. I don’t see that circular dependency that you mention.

Well, the intention is to keep things that change - metadata - separate from things that don’t (and can’t) change - chunks. The point is to separate them, so I don’t see why they should be made chunks with that in mind.
Metadata is a thin layer on top of the bulk of the storage that is chunks. It describes a set of chunks, and who can access them and what basic structure they are part of.
This thin layer can easily be handled and stored by the ShellNodes.

Chunks can only be accessed through this thin layer. So making the thin layer a chunk, short circuits the logic, since you have to access a chunk to get the metadata that is supposed to be the layer hindering access to chunks…

I’ve always preferred to use computer references when building IT stuff. I know all (or most) of them were once references to something out of our natural world. But that was before the new world existed. Now that it does exist, I very much prefer to use that language, and invent/bring in new only when absolutely necessary (i.e. there is no real equivalent in the domain, so you’ve invented something).

“Gateway Nodes”, “System Nodes”, “Kernel/Core Nodes” for a computer reference.

So, these I consider the best.

I agree. I basically hitch hiked on an existing use of the term today in MD and AD; it returns all except the data (so the permissions and owners :slight_smile: ). It is basically the same things that we already have, but organised a bit differently.
But again, there are better names than shell for this, I completely agree.

Not if you can only access chunks through metadata, which is one of the major features in this design. How are you going to load that metadata, when it is stored in a chunk?

Sure, you could say that they are stored in meta data chunks, but that to me only confuses. Which chunk was it now? Better with clear distinction. The meta data is information about chunks. It has another structure, another purpose, another lifetime. I think that is best modeled by letting it be something else than a chunk. It gets too goopy/muddy to try fit everything in that one concept. Better have them clearly defined and separate IMO.

I’m not modelling a set here though. I’m modelling a blob. (The picture isn’t great, I can see how you would get the wrong impression, so I will definitely update it.)
Also I think you look at that picture from the wrong side. The names of the structure are for those accessing it from the left (clients), and it is a layer of abstraction over the right (the sea of chunks).
When you have a binary large object, a bunch of data in no special structure, it is stored to the network as a Blob, and the result is that there is a single pointer (no set of pointers…).
A pointer is meant to point to data. It is not meant to point to a chunk.

A pointer is either of these three:

  • An address to a Shell instance.
  • An address to a single chunk.
  • A map to chunks.

Because your data could fit in one chunk, or it could be split into many chunks, or it could be put in another structure described by a Shell.

So, the pointer is to your data regardless of those underlying details.

  • So, when you have a Blob, you will have a single pointer.
  • When you have a Map, you will have pairs of keys and pointers.
  • With a Sequence, you will have an ordered set of pointers.

And I think this can be made clearer both in the picture and in the text.

In a way I like it better than Shell. But I think it comes with its own problems. Connotations that don’t quite apply.
For example, you could store a file to the network, and you could store it “freely floating” as a Blob, or in a special structure, such as in a Map, with a key to reference it. Not sure I think that thing is best described as File.

The Shell is an abstraction or a container, depending on how we see it, over/holding our meta data.
Our meta data I would say is of three types (out of 5 described here):

  • Descriptive metadata - descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords.
  • Structural metadata - metadata about containers of data and indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships and other characteristics of digital materials.
  • Administrative metadata - information to help manage a resource, like resource type, permissions, and when and how it was created

I’m not sure I think this abstraction or container is best described as a File, since it will collide with other notions of File, that are definitely not the same thing (such as the things you have on your local drive, that perhaps end up in a Map).

Metadata could be used, but I think the abstraction / container is more than that (also, it is not a very sleek name). The metadata is what you find within it. And moreover, this abstraction / container is actually the only way you can access the actual data. So, it is sort of a gateway, or a shell, to the data, holding all necessary information (i.e. the metadata) so as to fulfill that role.

I wrote this in a reply to happybeing, above, maybe you missed it:

Blobifying is basically about spreading large data as chunks, evenly over the network. Given that the Shell can grow indefinitely, there should be a way to spread that out as well, in a smart way with good operational complexity. I think there could be better ways than blobifying, and have planned to iron that out.

No, the actual structure, when Map or Sequence, can also grow. For every Key:Pointer pair you insert to the Map underlying structure, it grows. For every Pointer you append to a Sequence underlying structure, it grows. A Blob, only holding a single Pointer, would only grow by permissions and owner changes.

Not sure how you mean. Vec (the way mentioned in the RFC) is the rust data type we implement things with. Sequence is a concept in SAFENetwork.

Well, not more than a security exploit on anything else in the network (accessing someone else’s data, stealing coins or whatever). It all depends on the same security (Elders, node age, consensus etc.). As long as that holds, then only those with correct permissions can do these things. The chunks are only accessed through the Shells; if you don’t have the permissions in there, you cannot access the chunk. When the ShellNodes have verified the request, they contact the ChunkNodes and ask them to send you the chunks. You can’t request the chunks in any other way. So if in the Shell metadata there is no public key with permissions corresponding to your request, then you are not going to get to the chunks.

This is a cool one. But what problem does it solve?


Looking forward to hear your ideas :slight_smile:

8 Likes

@oetyng, replying here rather than from the dev forum, but in relation to the conversation there about the decoupling of metadata with the Shell layer, what would happen with metadata embedded in files, thinking exif or similar here. I’m assuming deduplication wouldn’t be effective in these cases, or can the chunking process be somehow ‘tailored’ for these type of files (which seems very complicated to me, but throwing it out there…)?

2 Likes

Shells and metadata

The metadata we talk about here in the context of Shells in this RFC, is very specifically for describing how your uploaded data is organized within SAFENetwork. That is something completely different from exif for example.

Details

So the metadata we’re talking about in the RFC is mostly orthogonal to any file metadata (or any other metadata you can conceive of out in the world). I say mostly, because you can still give it a name, and you can represent a structure that might have some correlation with whatever metadata you’re bringing in from your environment. But that is completely an app developer design decision by then.

Same goes with the metadata in a photo. Your app could just store that photo file to the network as is. The app could also read the metadata out, and store it in some structure, so that it’s easy to read only that info, through traversing the Shells in the network.

If I were to write an app handling photos on SAFENetwork, I would let it extract the exif data, then it would upload the photos into some structure of choice, and the exif data into another structure.
This means that when the app is browsing and searching the photos, it would traverse the structures created for the exif data, and when accessing the actual photos, it would then traverse the structures holding that.

Additional info

These lowest level structures of the network can of course be exposed directly in apps, and early on there might be many apps that reflect these structures to the user in some ways.
But as these things progress, I think we would just use a virtual file system for example, that looks and feels just like the file system on your computer today, and you would have no idea that there are things like Shells, or Blobs or Maps or Sequences, (and some day, perhaps not even a SAFENetwork…).

Conclusion

The metadata we talk about in the RFC is about the smallest parts of the storage structure in the network, and how apps want to organize them.
You can use that storage structure to hold any sort of data, such as photos or exif data.

Related questions

Deduplication

With regards to deduplication, that is a bonus effect which applies as far as possible when possible. So if you have a single app that uploads your photos, it would deduplicate as it would use the same algo for treating your file (assuming it only uses one). Another app might not. And naturally, unless you send your photos around, the likelihood someone uploads the exact same photo to the network is extremely low.

6 Likes

Been reading through this some more and have been struck with a few ‘eureka’ moments. Will need to share those another time. A few quick responses/comments below.

Yes true. What is really required is a third object to form a suitable basis for both.

Perhaps they should be the same thing and offer a uniform interface…

Maybe it shouldn’t grow indefinitely… just have a very large max size that you can quantify and make design decisions with.

A typical/standard Vec lives in volatile memory only. Eventually you will need to serialize this to disk for persistent storage. I figured that reuse of a SAFE datastructure could handle this for you automagically.

I recall long conversations past with @neo about reference counting and data deletion. The consensus view was that this was very cumbersome and inefficient. From experience I know that it has some serious performance penalties on local disk operations in Linux when you have lots of hard links to the same file. Do the chunks really need to be deleted? I’m not so sure. The previous standard method of letting a user delete the metadata to a private chunk (aka the “datamap”) but leaving the chunk on the network as garbage is probably fine. Copy on write is as safe as it gets. I know there was some pushback from the community when dirvine asked about this. To some extent I think dirvine was too accommodating in trying to keep the “hive” all happy, and that his first intuition about append only and copy on write is a preferable strategy. This was more of a concern when Safecoin was a unique data object and not a section balance. With the balance method, those concerns are likely unfounded.

It solves the “obfuscation at vaults” mentioned here.

1 Like

Nice! Looking forward to hear them :slight_smile:

But they have very different interfaces. The only thing that could yield is the Shell model, and it would seem completely contrived to force it into that shape (looking at the file descriptor interface, or what did you have in mind?).
A File mapping use case is rather supposed to be built on top of it I would say.

Well, strictly it’s not the Shell that is growing indefinitely, but the lifetime of people’s usage.
And in fact, in this proposal, a max size is suggested, which is quantified and used for design decisions (such as how to spread it out and access it in the network).
The main purpose is to solve the use case, which is indefinite lifetime.
Not sure what concrete aims or issues you are having in mind.

Ah, yes. Yeah, so for growing Shell, as I said I have other plan than the blobification, similar to the ever expanding database I posted about on this forum way back. That one builds a tree out of MDs, with O(log n) access time.
So, basically it’s about reusing Map and/or Sequence, like you suggest.

It was certainly not this model discussed.
There is a u64 associated with a private chunk, and it is simply incremented on adds and decremented on deletes. There is no other difference to not ref counting.

There is nothing resembling cumbersome or inefficient there, I must say.

Completely different thing. The only contention we’d see here is that of deduplication, which might mean many requests to a few nodes for very popular chunks. But that is orthogonal to ref counting, it comes with deduplication. Additionally that’s what caching is supposed to solve.

Considering that this proposal adds a new feature that didn’t exist back then - that chunks are accessed through Shells only - deleting chunkmap is perfectly fine for hindering any rubber-hose, hacked or other illicit access.

But that wasn’t the only argument.

With the very low overhead of refcounting, and the probably quite low ratio of occurrence (where naturally the work required is proportional to number of chunks being deleted, but still a perfectly distributed computation), as well as any potential benefit in at least in theory storing less junk, it is a very low hanging fruit to enable it, if only for the ability to let people feel that they can. People are not completely rational, and it’s a lot easier to just be able to say “Yes”, when they wonder if they can delete their data. That alone might be orders of magnitude more important for the future of the network, than any technical detail.

I did not perceive that discussion to have much to do with safecoin at all actually. It was mostly about key rotation, rubber hose and people’s desire to know that they could delete, as far as I remember.

Ah, this, yeah that’s nice. It should be completely redundant for anything self encrypted, but with data less than 3kb, that currently wouldn’t be self encrypted, it would give the protection at rest, even if clients circumvented any client side encryption. (I would think that if they did, then they probably knew the consequences as well, but yep would still be the bad apps of course.)

3 Likes

Memory isn’t clear, but the issue mentioned might have been with regard to the operations and checks needed to perform the delete when your refcount gets to zero.

You are not wrong.

3 Likes

May I suggest:

  • Triplet, whose structure is a set of tuples, each consisting of 3 keys, possibly representing two entities and their relationship.

I do recognize this could be stored as a number of Map entries, but it may make sense to have it as a separate data type, since it would be useful for RDF data and graphs in general, and it also doesn’t add much complexity since each entry has the same structure and size.
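
For illustration only, a minimal sketch of what such entries could look like (the Key placeholder follows the style of the RFC, and the names are just suggestions):

type Key = Vec<u8>; // placeholder, as in the RFC's Map keys

pub struct Triple {
    pub subject: Key,
    pub predicate: Key,
    pub object: Key,
}

/// The Triplet structure itself: a set of such fixed-size entries.
pub type Triplet = Vec<Triple>;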

8 Likes

Awesome @JoeSmithJr
As you already saw there, that’s exactly what this is designed to support; extending with new types.

I haven’t thought specifically about RDF (it’s an area I’ve not gone into yet).
I would have to think about how well it would play out implemented this way. @joshuef, @danda, @nbaksalyar and @bochaco might have ideas.

4 Likes

yeah, basically it sounds like a great idea to me. RDF is built of triples, so having a native data store should make RDF ops faster and more space efficient, as compared to XML or other text serializations.

3 Likes

I think having a Triples data type (Shell) could be the ultimate answer to natively supporting RDF, i.e. having effectively a native triplestore, the first thing that comes up to my mind is that we avoid the serialisation and probably inefficiencies introduced by it.

Nevertheless, some more thoughts/analysis would be needed just to understand how we deal with mutations and its history, it sounds like it could be similar to a Map but having the object to be the key, and the value to be a two-dimensional array for the predicate and subject, so that way we can handle mutations history of triples, and presumably then have the private and public (and perpetual) flavours of it?..not fully sure all this makes sense but just some thoughts.

I was refreshing my memory and looking up for some old research I was also doing in relation to this, also playing around with trees and luckily I found some diagrams I had made. The first one shows how you add pointers to what we are now calling a Shell:

…and a second diagram shows how you build up the tree with the older versions of Shells when they grow, making them immutable chunks (or blobified, as it’s being called here). I just thought it’s worth sharing it as it still fits in this discussion:

Back then I was also thinking that instead of serialising the Shell to make it immutable, we can simply have ImmutableShells, in that way we still can make use of whichever optimisation available for accessing the Shell’s entries, while they can still be immutable and even stored at the location calculated on its content, although I guess you’d anyway need to serialise it for calculating the hash.

5 Likes

Agreeing with @bochaco, @danda and @JoeSmithJr here.

Native triple storage will certainly make RDF more ‘first class’ (though we still have to get APIs to deal with that and probably offer some decent serialisation options to folk). But in terms of raw data storage, this is definitely ‘nice-to-have’ :+1:


Edit: Hmmm, thinking on this further (:coffee:) . If all the RDF data is triples, we lose some efficiency in terms of dereferencing entries. Or we’re allowing nested triples… but then how to easily reference those nested triplets…

Perhaps this is the reason that a lot of triple storage is done via ‘documents’, ( ttl , or jsonld ), which allows this flexibility

6 Likes

@bochaco I think native triple storage would have significant benefits for providing an RDF API. For example, typically you can ‘select’ triples with patterns using a term to match against each of subject, predicate, object. There are standards too for RDF datatypes and APIs in JavaScript, for example RDF/JS (as I’m sure you know). So these could be a part of the API for a triple datatype and eliminate the need to deserialise the data in order to perform the ‘query’ (as you note). Maybe this is exactly what you were thinking, but no harm spelling it out!

It may be difficult to do with a more compact form of RDF storage (I forget the name). I suspect not, but it needs to be considered.

3 Likes

I think this is what you were referring to, Technical Specification – RDF HDT, which I haven’t dug into myself, but it seems it also focuses on a separation between metadata and data, so perhaps there is a nice match between that approach and this proposal.

2 Likes

Wouldn’t they be searched by any of their three parts and also by (object, predicate) and (predicate, subject) and (object, subject)? That doesn’t necessarily fit into the original key-value model of Safe. While each triplet can be uniquely identified by the hash of the tuple, there would be a need for 6 separate lookup indexes as well.

I can see two approaches for that:

  • some sort of a document based indexing scheme implemented (and reimplemented again and again) at a higher level of the stack
  • a new type of data lookup built into the core

The first of these has the downside that the index would need to be maintained manually by the owner of the application (with all the costs involved), even if the entries themselves were added by other users. This may or may not be a good thing; I won’t argue for either, just thought to bring it up.

The second would make graphs a more native feel on Safe. On a second thought, if index/lookup isn’t handled by the core, I’m not sure the idea has any benefit over just implementing the entire thing in the application layer.

Would that be an issue when the search is distributed over so many nodes? Possibly, or maybe even more, or maybe not at all. Safe will be very different from centralized databases.

2 Likes