Appendable Data discussion

appendable-data
immutable-data
mutable-data

#21

Exactly that and many more. Without an option to store cheap temp data this whole project is just another super expensive blockchain-like database. This will also kill anonymity, forcing devs to keep footprint data on client devices. If I need to change a single byte of data and I have to pay for 1 MB of permanent storage for that, then good luck with the story about a cheap alternative. Even a blockchain with sharding will be cheaper, since on a blockchain I pay for transaction bytes, not for a whole 1 MB block. I also can’t imagine how to create dynamic web services on such a network. For me this is a disastrous idea, killing anonymity and usability and cutting the possible use cases of the network by an order of magnitude. This whole concept of storing everything forever at any cost is a nightmare.


#22

This would be helpful. I understand that keeping data forever is one of the big ideas, but already it’s not an absolute rule since metadata and messaging won’t be kept - presumably that includes all the machine-to-machine sensor data that will grow exponentially as the IoT comes online. So there’s already a dividing line between data that will be kept and data that won’t, and as @neo points out having only the immutable data option could potentially be restrictive.

From an end-user’s point of view this would seem to be the ideal scenario. What would be the downsides? More complexity presumably … any others? Interested in your thoughts @nbaksalyar.


#23

Houston, we have a problem


#24

Is this feasible / sensible without keeping MD as well?


#25

Exciting stuff! I am really curious to see an RFC outlining this, especially the backward compatibility with MD.

I assume AD must have XOR predictability for ‘compatibility’ with MD. E.g. DNS would be impossible without it. Without an option to pre-determine the XOR I see no way one can do high-level communication.
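To illustrate what XOR predictability buys you, here is a minimal sketch of a client deriving an address from a human-readable name with no prior communication. The use of SHA3-256 and the names `address_for` and `xor_distance` are assumptions made for illustration, not the network's actual API.

```python
import hashlib


def address_for(public_name: str) -> bytes:
    """Derive a 256-bit address deterministically from a name (assumed scheme)."""
    return hashlib.sha3_256(public_name.encode("utf-8")).digest()


def xor_distance(a: bytes, b: bytes) -> int:
    """XOR distance between two equal-length addresses, as an integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


# Two independent clients compute the same address for the same name,
# which is what makes a DNS-style lookup possible.
addr_a = address_for("example-site")
addr_b = address_for("example-site")
```

If the address could not be computed up front like this, the publisher would need some other channel to tell readers where the data lives.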


The crux is about whether we want to allow the network/user to delete data. I’ve slowly assumed that data would never be deleted; MD is to have versions and to be addressable per version. It’s only natural that you would call that ‘appendable’, because data isn’t actually deleted/mutated — thus ‘immutable’. What’s in a name?

The points made by several members so far lead me to have mixed feelings about it. @neo puts forward a good case for truly mutable data — and thus deleting data.


#26

If published data were, in whole or in part, held in an MD object that could be deleted or edited, then history and proof are broken. If Mutable Data is append-only then there is no place to hide: what you publish will stay and can easily be found. Like a built-in Internet Archive with no missing bits.

For small throwaway data, keep it locally and throw it away, but when you publish, it is publicly available forever. That is the proposal really.


#27

I don’t know, I thought that I would be the owner of MY data. So if I want to modify it or delete it, then I should be able to do it. It feels like a new Google to me.


#28

If you can delete your data, then everyone can delete their data as well and we will have the ability to rewrite history then :wink: That goes for any evil dictator, unscrupulous business and more. So we say publish and it is forever. If you say bad things then bad things will stay forever.

Later in the network’s evolution it may be that private data is editable, but it is not simple when it is then shared private data etc.


#29

If your data is private and encrypted, then if it is never accessed or you stop using the system, nobody will be able to read it. If it is public then it will persist forever.


#30

For the avoidance of doubt here. Stay forever means the data will stay forever, but the Appendable Data does grow. So if you have a website etc. you can of course update it. Your old version will be there for anyone to see by browsing the history of that site. So persistent data means the data stays, while the representation of the data is still able to change through time as you would expect.
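A rough sketch of that idea, purely as an illustration: the class `AppendOnlyData` and its methods are hypothetical names, not the network's data type. Updating a site appends a new version, and the old ones stay readable.

```python
class AppendOnlyData:
    """Hypothetical append-only container: versions are added, never removed."""

    def __init__(self):
        self._versions = []

    def append(self, content: bytes) -> int:
        """Store a new version and return its version number."""
        self._versions.append(content)
        return len(self._versions) - 1

    def latest(self) -> bytes:
        """Current representation, i.e. the most recently appended version."""
        return self._versions[-1]

    def at(self, version: int) -> bytes:
        """Any earlier version remains addressable by its number."""
        return self._versions[version]

    def history(self) -> list:
        """Copy of every version ever published, oldest first."""
        return list(self._versions)


site = AppendOnlyData()
site.append(b"<h1>v0</h1>")
site.append(b"<h1>v1</h1>")
```

Note there is deliberately no delete or overwrite method here; that is the whole point of the type.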


#31

So have a data type that is modifiable data and cannot be depended upon, nor used for web sites.

Then databases and temp data can use it and prevent massive blowout of useless data that has NO MEANING and the meaning is held in the resulting files from the temp files. Appendable data will cause databases to slow down and multiply in size.


#32

It is possible to do that, but hard to prove data is useless.

Databases are a weird thing here. Linked data is more in line with the thought process. So SOLID / RDF type structures just point to the latest data representation and data. They should not noticeably slow down with appendable data unless you are also reading the whole history of each link.


#33

I agree with this, MD modifications need to be cheaper. In the past we had SDs which were free to update. The problem is that free updates open the doors to data spamming.

I think what could be done:

  • Creating a MD costs the same as creating an immutable data (this is current state)
  • Updating a MD costs a fraction of the price of creating an immutable data

For pre-safecoins networks like alpha 2 or community one, a hard-coded constant could do the job (like 1/10 of the creation price for updating a MD). This would raise current PUT balance limit without impacting stored data size.
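As a quick sketch of that pricing rule: the function names are made up for illustration, and the 1/10 constant is just the example value mentioned above, not a decided figure.

```python
from fractions import Fraction

# Hard-coded update discount, using the 1/10 example value from the post.
UPDATE_FRACTION = Fraction(1, 10)


def create_cost(immutable_price):
    """Creating an MD costs the same as creating an immutable data."""
    return Fraction(immutable_price)


def update_cost(immutable_price):
    """Updating an MD costs a fixed fraction of the creation price."""
    return Fraction(immutable_price) * UPDATE_FRACTION


def total_cost(immutable_price, creates, updates):
    """Total spend for a number of creations and updates."""
    return creates * create_cost(immutable_price) + updates * update_cost(immutable_price)
```

With a creation price of 1 PUT, one creation plus ten updates would cost 2 PUTs in total instead of 11.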


#34

But this doesn’t provide databases for keeping company data. The SOLID/RDF approach defines the data, but doesn’t really store it as companies want. This is non-web data and proprietary data. They may convert to SAFE if it is easy, but if they have to completely redesign their data systems for append-only they will not do it. Also why convert if all that will happen is their database operations get slower and slower over the days/months.

And we are talking of millions of business databases. This includes web backends and mostly non-web data.

EASY, the data type of the storage tells you that.

EDIT:

Also your reply has ignored the issue of apps being forced to store temp files on the device in an attempt to avoid data blowout, and the security issues that will cause.

Also your reply has not addressed the data blowout I outline many posts back


#35

In meeting so quick.

Apps will not be forced to store temp files at all, they may have some in-memory temp stuff happening etc. If the app was say a video rendering thing then it would probably wish to store temp info, but this should be cleaned up by the app (writing zeros etc.)

Data blowout as you have defined it, though, is using the SAFE network like a big SQL database. I am not sure I quite get that part. Do you mean businesses and people should be able to have Postgres/SQLite etc. APIs on SAFE so there is no rework and SAFE provides a SQL backend? Or do you include NoSQL things like Hadoop etc. in that scenario?


#36

Surely, if there are use cases for temporary/changing data then it is one that should ideally be supported? It feels like it would be limiting the usefulness of the network to rule these potential use cases out.


#37

Then the recovery feature would be lost from apps since it is in memory. But many devices cannot keep all that in memory. Thus if it is kept on SAFE, you get data blowout on SAFE from storing meaningless data.

  • edit some data
  • temp files are created on SAFE to store intermediate changes; they can range from as little as 1/10 of the data being modified to a few times its size.
  • modified file is saved

In all this you still have the original data in immutable store and the new data in immutable data. So you have the 2 versions.

But with appendable data you have the temp files still stored and occupying from 20% or so up to 300% of the data that was modified.

This is the data blowout. Up to 300% of useless data, with no extra information retained, since the 2 versions of the data exist.
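To put numbers on it, a small back-of-the-envelope sketch. The function `stored_bytes` and its parameters are invented for this illustration, and the percentages are the rough range quoted above.

```python
def stored_bytes(original: int, modified: int, temp_percent: int, temp_deletable: bool) -> int:
    """Bytes left on the network after one edit session.

    temp_percent is the size of the temp files as a percentage of the
    modified data (roughly 20 to 300 per the estimate above).
    """
    # With deletable temp storage nothing is left behind; with
    # append-only storage the temp files stay forever.
    temp = 0 if temp_deletable else modified * temp_percent // 100
    return original + modified + temp


one_mb = 1_000_000
worst_case = stored_bytes(one_mb, one_mb, 300, temp_deletable=False)
with_delete = stored_bytes(one_mb, one_mb, 300, temp_deletable=True)
```

In the worst case the network keeps 5 MB to hold 2 MB of meaningful data, versus 2 MB when the temp files can be removed.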

And this is the reason app developers will opt for local store of temp files otherwise SAFE will fill up with all the changed data that could be 3 times the real data being stored. And thus they are being channelled into the unsafe practice of storing temp files on the devices.

Not in particular. Just how databases will be implemented using SAFE. And appendable data will mean that highly active “databases” will cause the size of the data stored to be many times the real data after a short while of operating. And the reconstruction of the actual data every time data is retrieved will take a lot longer, since all the changes have to be processed in order to know what the data should be (especially when the appends extend beyond one MD).
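A toy sketch of the retrieval cost being described, assuming the record is stored as a base plus a log of appended changes. The function `reconstruct` and the record layout are hypothetical.

```python
def reconstruct(base: dict, changes: list) -> tuple:
    """Replay every appended change over the base record.

    Returns the current state and the number of changes processed.
    """
    state = dict(base)
    steps = 0
    for key, value in changes:
        state[key] = value  # a later change supersedes an earlier one
        steps += 1
    return state, steps


# A busy record: one field rewritten 1000 times.
changes = [("balance", n) for n in range(1000)]
state, steps = reconstruct({"balance": 0}, changes)
```

This models the worst case, where the current value is only known after walking the whole log. A layout that keeps a pointer to the latest representation, as suggested in #32, would not pay this cost on every read, though the stored size would still grow.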


#38

Ok, you want to implement a new feature (appendable data), but why not keep the existing useful feature (mutable data) instead of having to reintroduce it later?

Or better why not design a single data structure that can do both? For example, adding an optional signing key field in each MD entry. This new field could have multiple usages (like transfer of ownership of a single entry), and one of them could be the special key value NO_OWNER_PUB_KEY to prove that an entry cannot be modified (and hasn’t been modified since origin if entry version is still 0).

This solution would keep compatibility with existing MD and allow non-modifiable entries. Of course, this needs careful thought, but the main point is to not throw the baby out with the bathwater.

Edit: An even simpler solution that would keep strict compatibility with existing MDs, would be to define entry version value u64::MAX as meaning this entry is not modifiable. This value can only be set when the entry is created.
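A minimal sketch of how that sentinel could behave. `Entry`, `update`, and the `frozen` flag are names invented for this illustration, and `U64_MAX` stands in for Rust's `u64::MAX`.

```python
U64_MAX = 2**64 - 1  # Python stand-in for Rust's u64::MAX


class Entry:
    """Hypothetical MD entry with a value and a version counter."""

    def __init__(self, value: bytes, frozen: bool = False):
        self.value = value
        # The sentinel can only be chosen here, at creation time.
        self.version = U64_MAX if frozen else 0


def update(entry: Entry, new_value: bytes) -> bool:
    """Apply an update unless the entry carries the sentinel version."""
    if entry.version == U64_MAX:
        return False  # entry was created as non-modifiable
    if entry.version + 1 == U64_MAX:
        return False  # never let a normal update reach the sentinel
    entry.value = new_value
    entry.version += 1
    return True
```

Existing entries start at version 0 and keep working as before, so only entries explicitly created with the sentinel become append-style permanent.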


#39

Yes, like everything else SOLID is not going to be adopted by everyone, and potentially only by a segment of the world. Also people are not going to accept blockchain-style growth of their data because data changes can only be appended.


#40

This seems largely a matter of perspective. Alpha 2 did not have delete, i.e. it was appendable in essence (it did nullify some items but never removed/deleted them). It did not prevent apps from being created and used. So keeping MD as is, with the extra stuff that currently is not used, or clearly defining a simpler appendable data item is quite similar, with the latter being more specific.

This is not removing existing functionality but should enhance it. For example I expect the user sig to be included in the data item as with BLS now we can simply have network based or client based multisig with no need for many round trips etc.