Appendable Data discussion

Houston, we have a problem

Is this feasible / sensible without keeping MD as well?

1 Like

Exciting stuff! I am really curious to see an RFC outlining this, especially the backward compatibility with MD.

I assume AD must have XOR predictability for 'compatibility' with MD. E.g. DNS would be impossible without it. Without an option to pre-determine the XOR I see no way one can do high-level communication.


The crux is about whether we want to allow the network/user to delete data. I've slowly assumed that data would never be deleted; MD is to have versions and to be addressable per version. It's only natural that you would call that 'appendable', because data isn't actually deleted/mutated, thus 'immutable'. What's in a name?

The points made by several members so far lead me to have mixed feelings about it. @neo puts forward a good case for truly mutable data, and thus deleting data.

6 Likes

If published data were all or even in part included in an MD object that could be deleted or edited, then history and proof are broken. If Mutable Data is append-only then there is no place to hide: what you publish will stay and can easily be found. Like a built-in Internet Archive with no missing bits.

For small throwaway data, keep it locally and throw it away, but when you publish, it is publicly available forever. That is the proposal really.

6 Likes

I don't know, I thought that I would be the owner of MY data. So if I want to modify it or delete it, then I should be able to do it. It feels like a new Google to me.

2 Likes

If you can delete your data, then everyone can delete their data as well and we will have the ability to rewrite history then :wink: That goes for any evil dictator, unscrupulous business and more. So we say publish and it is forever. If you say bad things then bad things will stay forever.

Later in the network's evolution it may be that private data is editable, but it is not simple when it is shared private data etc.

3 Likes

If your data is private and encrypted, and you never access it again or you stop using the system, nobody will be able to read it. If it is public then it will persist forever.

2 Likes

For the avoidance of doubt here: "stay forever" means the data will stay forever, but the Appendable Data does grow. So if you have a website etc. you can of course update it. Your old version will be there for anyone to see by browsing the history of that site. So persistent data means the data itself persists; the representation of the data is still able to change over time, as you would expect.
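As a rough illustration of the "update the site, keep the history" idea, here is a minimal in-memory sketch. The `AppendableData` type and its methods are hypothetical names made up for this example, not the network's actual API; it only shows that appending a new version changes what `latest` returns while older versions stay addressable.

```rust
/// Hypothetical append-only container (not the real SAFE API):
/// every update adds a new version and earlier versions stay readable.
#[derive(Debug, Default)]
pub struct AppendableData {
    versions: Vec<Vec<u8>>,
}

impl AppendableData {
    pub fn new() -> Self {
        Self { versions: Vec::new() }
    }

    /// Append a new version and return its index. Nothing is overwritten.
    pub fn append(&mut self, content: &[u8]) -> usize {
        self.versions.push(content.to_vec());
        self.versions.len() - 1
    }

    /// The current representation, if anything has been published yet.
    pub fn latest(&self) -> Option<&[u8]> {
        self.versions.last().map(|v| v.as_slice())
    }

    /// Any earlier version, addressed by its index in the history.
    pub fn version(&self, index: usize) -> Option<&[u8]> {
        self.versions.get(index).map(|v| v.as_slice())
    }
}
```

A reader who only wants the current site calls `latest`; someone browsing the history walks the indices with `version`.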

6 Likes

So have a data type that is modifiable and cannot be depended upon, nor used for web sites.

Then databases and temp data can use it, preventing a massive blowout of useless data that has NO MEANING; the meaning is held in the files that result from the temp files. Appendable data will cause databases to slow down and multiply in size.

2 Likes

It is possible to do that, but hard to prove data is useless.

Databases are a weird thing here. Linked data is more in line with the thought process. So SOLID / RDF type structures just point to the latest data representation and data. They should not noticeably slow down with appendable data unless you are also reading the whole history of each link.

7 Likes

I agree with this, MD modifications need to be cheaper. In the past we had SDs which were free to update. The problem is that free updates open the doors to data spamming.

I think what could be done:

  • Creating an MD costs the same as creating an immutable data (this is current state)
  • Updating an MD costs a fraction of the price of creating an immutable data

For pre-safecoin networks like alpha 2 or a community one, a hard-coded constant could do the job (like 1/10 of the creation price for updating an MD). This would raise the current PUT balance limit without impacting stored data size.
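The pricing rule above could be sketched like this. Everything here is an assumption for illustration: the `Operation` enum, the `put_cost` function, and prices expressed as whole units of a PUT balance are invented names, and the divisor of 10 is just the 1/10 example from the post.

```rust
/// Hypothetical hard-coded constant: an update costs 1/10 of a creation.
pub const UPDATE_COST_DIVISOR: u64 = 10;

/// Hypothetical set of chargeable operations.
pub enum Operation {
    CreateImmutable,
    CreateMutable,
    UpdateMutable,
}

/// Cost of an operation given the current creation price.
pub fn put_cost(op: &Operation, creation_price: u64) -> u64 {
    match op {
        // Creating an MD costs the same as creating an immutable data.
        Operation::CreateImmutable | Operation::CreateMutable => creation_price,
        // Updates cost a fraction, rounded up so that a non-zero creation
        // price never yields a free update (free updates invite spam).
        Operation::UpdateMutable => {
            creation_price / UPDATE_COST_DIVISOR
                + u64::from(creation_price % UPDATE_COST_DIVISOR != 0)
        }
    }
}
```

Rounding up is a design choice of this sketch, not something stated in the proposal; it keeps a small deterrent in place even when the creation price is below the divisor.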

6 Likes

But this doesn't provide databases for keeping company data. The solid/rdf defines the data, but doesn't really store it as companies want. This is non-web data and proprietary data. They may convert to SAFE if it is easy, but if they have to completely redesign their data systems for append-only then they will not do it. Also, why convert if all that will happen is that their database operations get slower and slower over the days/months?

And we are talking about millions of business databases. This includes web backends and mostly non-web data.

EASY, the data type of the storage tells you that.

EDIT:

Also your reply has ignored the issue of apps being forced to store temp files on the device in an attempt to avoid data blowout, and the security issues that will cause.

Also your reply has not addressed the data blowout I outlined many posts back.

1 Like

In meeting so quick.

Apps will not be forced to store temp files at all; they may have some in-memory temp stuff happening etc. If the app was, say, a video rendering thing then it would probably wish to store temp info, but this should be cleaned up by the app (writing zeros etc.)

Data blowout as you have defined it, though, is using the SAFE network like a big SQL database. I am not sure I quite get that part. Do you mean businesses and people should be able to have Postgres/SQLite etc. APIs on SAFE so there is no rework and SAFE provides a SQL backend? Or do you include NoSQL things like Hadoop etc. in that scenario?

2 Likes

Surely, if there are use cases for temporary/changing data then it is one that should ideally be supported? It feels like it would be limiting the usefulness of the network to rule these potential use cases out.

2 Likes

Then the recovery feature would be lost from apps, since the data is only in memory. And many devices cannot keep all that in memory. So if it is kept on SAFE instead, you get data blowout, with SAFE storing meaningless data.

  • edit some data
  • temp files are created on SAFE to store intermediate changes; they can range from about 1/10 of the size of the data being modified to a few times its size
  • modified file is saved

In all this you still have the original data in immutable store and the new data in immutable data. So you have the 2 versions.

But with appendable data you have the temp files still stored and occupying from 20% or so up to 300% of the data that was modified.

This is the data blowout: up to 300% of useless data, with no extra information retained, since the 2 versions of the data exist.
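To put numbers on that claim, here is a small sketch of the arithmetic. The function name and parameters are made up for this example, and it assumes the modified version is the same size as the original; the 20% and 300% figures are the ones quoted above.

```rust
/// Bytes kept on the network after one edit, as (useful, temp).
/// Assumes (for illustration only) the new version is as large as the original.
pub fn retained_bytes(data_size: u64, temp_percent: u64, temps_deletable: bool) -> (u64, u64) {
    // The original and the modified version are both kept either way.
    let useful = 2 * data_size;
    // With deletable storage the temp files vanish; with append-only they stay.
    let temp = if temps_deletable {
        0
    } else {
        data_size * temp_percent / 100
    };
    (useful, temp)
}
```

For a 1,000,000-byte file, append-only storage keeps between 200,000 and 3,000,000 bytes of temp data on top of the 2,000,000 bytes for the two versions, while deletable temp storage keeps none.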

And this is the reason app developers will opt for local store of temp files otherwise SAFE will fill up with all the changed data that could be 3 times the real data being stored. And thus they are being channelled into the unsafe practice of storing temp files on the devices.

Not in particular. Just how databases will be implemented using SAFE. And appendable data will mean that highly active "databases" will cause the size of the data stored to be many times the real data after a short while of operating. And the reconstruction of the actual data every time data is retrieved will take a lot longer, since all the changes have to be processed in order to know what the data should be (especially when the appends extend beyond one MD).
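The reconstruction cost being described can be sketched as replaying a change log. This is only an illustration of the concern, with invented names (`Change`, `replay`); it assumes a naive design in which the current state is not cached or pointed to directly, so every read has to walk the whole history.

```rust
use std::collections::HashMap;

/// Hypothetical change record in an append-only "database".
#[derive(Debug, Clone)]
pub enum Change {
    Set(String, String),
    Remove(String),
}

/// Rebuild the current state by processing every change in order.
/// The work grows with the length of the log, not with the size of the state.
pub fn replay(log: &[Change]) -> HashMap<String, String> {
    let mut state = HashMap::new();
    for change in log {
        match change {
            Change::Set(key, value) => {
                state.insert(key.clone(), value.clone());
            }
            Change::Remove(key) => {
                state.remove(key);
            }
        }
    }
    state
}
```

A log of four changes can leave a state of one key, which is the gap between "data stored" and "real data" in the post above. Whether the network would actually require a full replay depends on the design, for example on whether entries point straight at the latest value.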

5 Likes

Ok, you want to implement a new feature (appendable data) but why not keep existing useful feature (mutable data) to reintroduce it later?

Or better, why not design a single data structure that can do both? For example, adding an optional signing key field in each MD entry. This new field could have multiple usages (like transfer of ownership of a single entry), and one of them could be the special key value NO_OWNER_PUB_KEY to prove that an entry cannot be modified (and hasn't been modified since origin if the entry version is still 0).

This solution would keep compatibility with existing MD and allow non-modifiable entries. Of course, this needs careful thought, but the main point is to not throw the baby out with the bathwater.

Edit: An even simpler solution that would keep strict compatibility with existing MDs, would be to define entry version value u64::MAX as meaning this entry is not modifiable. This value can only be set when the entry is created.
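A minimal sketch of that sentinel idea follows. The `Entry` type, `UpdateError`, and the rule that an update must carry the current version plus one are assumptions for illustration, not the actual MD implementation; only the use of `u64::MAX` as the "not modifiable" marker comes from the suggestion above.

```rust
/// Sentinel from the suggestion above: this version means "never modifiable".
pub const IMMUTABLE_VERSION: u64 = u64::MAX;

#[derive(Debug, PartialEq)]
pub enum UpdateError {
    EntryNotModifiable,
    BadVersion,
}

/// Hypothetical MD entry: a value plus its entry version.
#[derive(Debug)]
pub struct Entry {
    pub value: Vec<u8>,
    pub version: u64,
}

impl Entry {
    pub fn new_mutable(value: Vec<u8>) -> Self {
        Entry { value, version: 0 }
    }

    /// The sentinel can only be set here, at creation time.
    pub fn new_immutable(value: Vec<u8>) -> Self {
        Entry { value, version: IMMUTABLE_VERSION }
    }

    pub fn update(&mut self, value: Vec<u8>, new_version: u64) -> Result<(), UpdateError> {
        if self.version == IMMUTABLE_VERSION {
            return Err(UpdateError::EntryNotModifiable);
        }
        // Assumed rule: versions increase by exactly one, and an update
        // may never move an entry onto the sentinel value.
        if new_version == IMMUTABLE_VERSION || new_version != self.version + 1 {
            return Err(UpdateError::BadVersion);
        }
        self.value = value;
        self.version = new_version;
        Ok(())
    }
}
```

Existing entries keep working unchanged under this scheme, since no ordinary entry would realistically reach version `u64::MAX` through single-step increments.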

12 Likes

Yes, like everything else, SOLID is not going to be adopted by everyone, and potentially only by a segment of the world. Also, people are not going to accept blockchain-style growth of their data because data changes can only be appended.

9 Likes

This seems a lot of perspective issues. So alpha 2 did not have delete. i.e. it was appendable in essence (it did nullify some items but never removed/deleted them). It did not prevent apps from being created and used. So keeping MD as is with the extra stuff that currently is not used or clearly defining a simpler appendable data item is quite similar with the latter being more specific.

This is not removing existing functionality but should enhance it. For example I expect the user sig to be included in the data item as with BLS now we can simply have network based or client based multisig with no need for many round trips etc.

3 Likes

So it is more calling the current functionality for what it is: mutable, but only in that things can be appended, not that what is already written can be changed.
I'm more open to this for practical reasons atm (e.g. a functionality freeze to get alpha 2 done) than for 'ideological' ones.

3 Likes

For a start it gives CRDT data types, which is of course a valuable asset to the network.

I do not think that will be the case though? I agree that if the network charged 1 MB for every 32-byte entry in an appendable data item the network would likely fail.

I think you are assuming every entry to appendable data (which will be 32 bytes) will cost 1 MB; it won't.

In your example for instance, the likes could be a single appendable data item where people add a like by inserting a 32-byte entry into the appendable data item. There are more ways to handle much of this of course.
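As a rough sketch of the likes example: the `Likes` type below is hypothetical, and what the 32 bytes actually contain is not stated in the thread (here it is treated as an opaque identifier for the liker). The point is only that storage grows by 32 bytes per like rather than by a full chunk.

```rust
/// One like: an opaque 32-byte entry (contents assumed, e.g. some identifier).
pub type LikeEntry = [u8; 32];

/// Hypothetical single appendable data item holding all likes for one post.
#[derive(Debug, Default)]
pub struct Likes {
    entries: Vec<LikeEntry>,
}

impl Likes {
    pub fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Adding a like appends one entry; existing entries are never touched.
    pub fn add_like(&mut self, entry: LikeEntry) {
        self.entries.push(entry);
    }

    pub fn count(&self) -> usize {
        self.entries.len()
    }

    /// Payload bytes held by this item: 32 per like.
    pub fn stored_bytes(&self) -> usize {
        self.entries.len() * std::mem::size_of::<LikeEntry>()
    }
}
```

A thousand likes held this way come to 32,000 bytes of entries in one item, compared with a thousand separate 1 MB charges under the pricing being worried about.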

14 Likes