DataStore over AppendableData design


#21

Thanks for the detailed clarifications I really appreciate it, especially the detail around BTrees.

I’m just going to braindump here to see if it pokes any thoughts in a useful direction for anyone, ie a bit rhetorical.

Yes the sequence part is really the issue. For chronologically sequential data AD is fine enough, but for other sequences (eg sort by temperature ascending) AD is quite prickly. The concept of sequences seems fundamental to this problem.

Problem1 is “fast and efficient data storage and retrieval”. The proposal is to use indexes stored on the network. Problem2 is that indexes change regularly but on SAFE (especially with AD) old indexes can’t be discarded which creates waste and/or leads to performance issues.

And this topic is about problem2, with problem1 acting only as a motivator… I just can’t help coming back to bitcoin blockchain + bitcoin lightning network, this feels so similar regarding the performance and permanence compromises.


I’m curious with MD mutations, it leaves garbage, eg if I mutate my DNS entry to point from ID AB12CD... to ID 34EF56... it would leave ID AB12CD... unreferenced and unused and presumably subject to garbage collection under non-SAFE conditions (but in SAFE it’s ID so can’t be removed). AB12CD... becomes garbage. This doesn’t affect read or write throughput, but it does affect storage. AD also create garbage but it remains readable garbage, whereas MD creates unreadable garbage. How do you rate the magnitude of the MD garbage problem in relation to the AD garbage problem? I feel this is more a political question than performance.

Do I understand correctly that the main issue with LSM Trees is the quantity of garbage rather than the mere existence of it?


Another suggestion for a solution…

Solve indexes as a ‘communication’ problem rather than ‘storage’ problem, which is probably as you say to “cheat of course, and not store the data in the network”! The idea being vaults use the network structure to arrange non-SAFE data (ie the indexes) which are read and stored with much weaker guarantees than SAFE chunks. Not very satisfying… This form of indexing seems a bit like torrent trackers, acting merely as a gateway into the p2p data. Solves problem1 but not problem2.

Thanks again for your clarifications, you’ve put an amazing amount of thought into this problem and it’s a pleasure to read about it!


#22

Thanks @oetyng for the clarification. This was one area I was bad at articulating and is an important consideration for the use of ADs.

Although I would say that the types of applications affected are much more than you described. And even some day to day applications for personal use will be affected too, albeit much less, but multiplied across the desired userbase of the world amounts to a big amount. (eg account keeping for homes to multi billion dollar industries where data is always entered in no particular order)


#23

I agree. This here, how to efficiently maintain sequences in the immutable storage, when inserts are in any order.

We have the compromise between appending changes and storing current state. In the most extreme case, we have the entire current state of our application that we serialize and store it as an ImD - very impractical of course. And at the other end, every little change is inserted to network, which is impractical in its own ways (maximum number of network requests), but also has benefits, (gives no duplication and minimal serialization overhead for example).
So as you say, that’s basically unchanged whether AD or MD. And AD can be used almost exactly as an MD. Except for adding entries instead of overwriting. So if the value is only a pointer, that is not much size to talk about (even though, in the large scale, maybe it actually is, that I don’t know yet).

I guess that is partly an answer to your question. For the design where AD/MD is holding pointers, and all data is ultimately in ImDs, they have the same basic problem of append every change or store current state (in that ImD), and that would mean the magnitude of differences between AD/MD is the size of 1 pointer more in overhead for every change for AD vs MD.

But if we look at how MD are also used today, the difference is (?) bigger:

An MD can keep quite a lot of data, so it doesn’t need to be pointers to ImD. If we are comparing storing actual data in MD (chunked if larger files), to storing it in AD, then we’re really seeing a difference. Each entry in an MD could be up to 1 Kb which would be like a simple row, could be an instance of a simple entity, and storing that as an ImD might be a bit overkill. (I’m not that familiar yet with the optimal sizes, when it is more performant to store in ImD instead of in MD).
So in this latest case, we could store up to 1Kb of current state in an entry, and modify just 100 bytes say of that current state, then store it again, but not incur that storage overhead of duplicating all the other 900 bytes + leaving the 100 previous bytes as history.

I didn’t look it up in the source code, but I have interpreted it as that MD actually has been immutable all the time, only that it has been completely opaque to the end user. I.e. that we didn’t actually overwrite the values in the entries, just append. If that’s so, then there is only the difference in the first example. And it might well be completely tolerable. But, the question could still be asked of course, if that is wise in the large picture, vs actually overwriting, with regards to garbage build up.

I think that is the full picture… Let me know if I’m missing something.

Spot on. And two problems indeed.

That’s what I would say. Problem2 is about the mere existence. And perhaps Problem2 is even more specifically about potentially detrimental excess of the garbage, not only the existence of it.

I say this because I also have in mind the estimations you did on sync times of data between nodes when joining sections; how much time they would spend downloading data before they could be productive, and what bandwidths were needed. I think there wasn’t any conclusion at that time, but IIRC it looked a little bit worrisome.
So my thinking is, we should not be adding to that problem, and my impressions here was that we would risk adding big time to it. (In case that time spent syncing really was a problem, I don’t know now).

I think it’s a good suggestion for an avenue to explore, I’ve been thinking about it as well.
We might need to think about indexes as a more transient form of data. It then really is a problem of communication as you say. If we can collaborate on the indexes out of band, easily and securely, then we can minimize the garbage load on the network, and have performant read and write. Maybe the indexes could be gossiped between participants as well (CRDT graphs) if we need to distribute it.
And for a single client to the data (which is not the end goal use case, almost ever I think, with regards to SAFENetwork) that would be much simpler, since no collaboration needed on the indexes .

One small-ish problem is that the indexes might in many cases be the actual key to reaching the data at all. Just putting the data in the network and then not knowing where it is, is actually not at all making it secured. It would be equal to erasing it (in fact, that is how erasing of data in the network is suggested to be done). So, theoretically, you could say that the data is not at all stored and secured in the network, unless the indexes as well are. But, that line of thought is not perfect, since at some point you will always reach the situation when the key into the data is not stored in the network, but as the login credentials to the network. It’s maybe more a philosophical issue.

Thanks to you for your input, and it’s fantastic to know that you enjoy it!
It’s a pleasure to work with these problems :slight_smile: and it’s even better when others join in.


#24

I’m wondering if we are not considering that we will also have the transient data mechanisms, which may be more suitable for these type of things, once the data transmitted through that other type of comm channel and it’s fully processed, then it’s when it becomes stored, versioned, and immutable on the Network…?

I know that feeling :smiley:


#25

Not sure how you mean. If you mean the data transfer between clients then we are talking about different things. I am talking about large amounts of stored data, in OP about ways to insert it, and following posts more about the data structures used to efficiently access that data.

Indexes are normally kept in the database. If that database uses SAFENetwork as underlying storage, then the changing nature of indexes (they are constantly re-arranged) for one, but generally all data actually, leads to more duplicate data produced on 100% immutable storage (i.e. not even garbage collection). Append only storage (other than event stores) is not uncommon, but it always has garbage collection as far as I can see.

There is a reason we think immutable data deduplication is something cool, a great feature of the network. If we end up producing loads of duplicate data anyway, then that isn’t exactly coherent in aspirations.

If the benefits still are greater than the drawbacks (going full on immutable), then to me it seems minimum duplication is achieved with only storing changes to the network. The more granular the less duplication, but also the more network requests.
Then combining that with keeping current state and indexes locally, we would still be able to effectively manage large amount of data on the network. With more than one client needing that access, it becomes significantly more complicated I would guess. But CRDT graphs could maybe be used. Don’t know. (And for that the other comm capabilities of the network, that you mentioned, could be perfect.)

But we have introduced a small weak spot there, since the indexes themselves are often the only access to the data, and also contain sensitive information. So it needs to be replicated and backed up securely etc. some other way. So we introduce a need for another layer with high security and reliability, since the reason we wanted it for the main data, probably reflects at least a bit also on the indexes.


#26

I am wondering @oetyng

One of the goals is to not have so much garbage (wasted index ADs)

What if you push some of the work onto each and every instance of the APP. In other words the indexes are not fully formed/sorted in data store and when they are read from the network the APP does any needed sorting and building of the final form for the index.

A very simple example of what I mean.

  • records are added randomly and you want to process the records sequentially on some key
  • for this simple example just have one index AD
  • Each record is added to ImD and the key added (appended) to the index AD
  • When the APP wants to access the index it reads the MD and does a sort on the entries in the AD.
    • 32000 sort on most computers is probably much shorter than the potential lag time of access from the network.
    • so for one AD index its extra time, but for multiple AD index then it can be done while waiting for the next and so on merging/sorting the ADs as we wait for the next AD
  • Database engines were/are built with the view that the client PCs do not have great processing power and one powerful box was cheaper than having everybody in the company having powerful PCs. Also the security & other things, but SAFE itself solves much of that and the APP runs on the user PC anyhow. So decades later the PCs (and phones) are quite powerful and shoving some work onto them is not unreasonable.

For much larger databases we can have a system where the index ADs created with the initial data are only partly filled which allows plenty of additions (append style) in each AD before a rebuild of (a section of) the particular index is required. This method increases the number of index ADs existing within the index structure, but reduces the number of rebuilds of the index structure.

Thus even for large data bases with plenty of “mutations”, the indexes can be updated often without the need to rebuild them as often. This saves on the worthless garbage residing on the network.

tl;dr
Remove the requirement for particular index ADs or group of ADs to be sorted and place the sorting requirement onto the APP device. This saves creating a new AD just because you add an index entry.