Appendable Data discussion

appendable-data
immutable-data
mutable-data

#61

I think it is possible, but I hesitate until we detail exactly how that would work. It is not simple and has subtle issues. For example, at the moment all data is public, but some is encrypted and private, and published data can also be encrypted (shared data etc.). So it is doable but not simple.


#62

Many ways to do this. The owner of a safecoin is identified by a key; apart from that it is basically an empty appendable data list (just an owner). The owner part of appendable data is metadata/management data and can be replaced. As that is not CRDT-like, it is done via PARSEC, which is responsible for reducing one balance (changing owners) while increasing another.
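
A minimal sketch of the idea above, assuming a coin is a record whose only content is its owner key. The names `Coin` and `transfer` are hypothetical, and the consensus step (PARSEC in the post) is reduced to a plain function call, so this shows the shape of the data rather than how the network would actually do it.

```python
from dataclasses import dataclass


@dataclass
class Coin:
    """Hypothetical coin: an otherwise empty appendable data item with an owner."""
    address: str   # fixed network address of the coin
    owner: str     # public key of the current owner (replaceable metadata)


def transfer(coin: Coin, sender: str, recipient: str) -> None:
    """Replace the owner field; stands in for an agreed (consensus) owner change."""
    if coin.owner != sender:
        raise PermissionError("only the current owner can transfer the coin")
    coin.owner = recipient


coin = Coin(address="coin-001", owner="alice-pubkey")
transfer(coin, "alice-pubkey", "bob-pubkey")
```

Only the owner field changes here; nothing is appended to the coin itself, which is the point being made about metadata being replaceable.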

Hope that helps.


#63

I know, and I accept the argument of simplicity. But I am 100% sure the general public will be scared to use a service where they can’t have the illusion of deleting history. People want a delete button for their FB chat, lol. They love GDPR nonsense because it gives them the illusion of data deletion. The right to be forgotten is an illusion of control. And safenet with mutable private data is a way to make that illusion reality. You can see how strong the disagreement is just in this forum. Common people will not accept technical arguments, nor will they accept that their private chat, mail or home porn is on the network forever.


#64

“Yes, I think we really must frame this better and explain it in some more depth”, he says breathing out :slight_smile:


#65

It always seemed to me that MutableData would be harder for the network to properly manage than ImmutableData. If MutableData is a liability to the network for fundamental technical reasons, then I’m in favour of replacing it with AppendableData, provided no other functionality other than deletion and mutation is lost. At a minimum we need to have predictable addressing like @bzee mentioned earlier, plus flexibility in GETting only parts of the history of the AD.

One argument in favour of being able to delete your private data: You never know if your login credentials will ever be stolen. If you have a lot of sensitive private history stored on SAFE, the only way to prevent it from potentially being compromised in the future (without deletion) is getting a new account and never again accessing your old one. Alternately, you can use additional passwords that are never stored on SAFE to encrypt your sensitive private data in the first place, but this conflicts with other goals of SAFE.

But like I said, if MD compromises the network then it needs to go. It’s better to have a limited product that works properly than something that has more use cases but is half broken.


#66

No, not if I can trace the movement of those funds from ID to ID, just like we can trace movement on the blockchain. The original premise of SAFEcoin was that its history would be deleted after two (three?) ownership transfers. That would allow the step you just described to actually provide anonymity.

If that history is not deleted, then funds can be tracked from ID to ID to ID, just like bitcoin can now be tracked, no matter how many addresses you funnel the funds through.

I think David has indicated in another post that he believes anonymity could still be maintained due to the ability to replace metadata. Since I thought Appendable Data could not be changed, only appended, I am not clear what can and cannot be changed, but I am delighted to hear that some of it can be, to provide anonymity.


#67

I agree with the sentiment, but I’m not sure it has to be a part of the official API. We’ll have SPARQL for queries on Linked Data, and it is closely resembling SQL, by design. So I’m thinking there should be existing translators from SQL into SPARQL, or even ODBC drivers that can be easily integrated into Excel and other software.

So it’s a matter of tools & wrappers around the API we provide really. Personally, I think this is a very cool idea and it can be one of the first ‘killer apps’ for the SAFE Network, along with SAFE Drive :slight_smile:


#68

I have to say I agree with neo and the others here about the switch to appendable data. This seems like a bad idea in my mind. I also have issues with the app development residing on SAFE where one can never delete anything and just append in perpetuity. My real gripe, though, is with the ideological push.

While I understand the ideology behind the change, in practicality, it would crush the perception of the SAFE Network to the general public. One needs to keep in mind their userbase when designing anything. Sometimes you have to compromise the best solution in theory for the best solution for your users.

There will be no way to make this ideological argument to the larger public, or to change people’s minds in any significant way. There is really one chance to make a first impression, and you don’t want to see the talking heads on a news station talking about the “new Internet” where you can never delete anything you put on it. You know how many people regret that picture they posted to Facebook, or a mean tweet they sent? Whether or not that is actually truly deleted from the Internet is beside the point. People have to believe that it is for their own sanity…


#69

Seems too early to come to any conclusions, and there is no need for anyone to worry. Maidsafe always think things through from many angles and also present important areas such as this for in-depth discussion before making decisions.

I’m curious. The only thing that I feel concerned about at this stage is the point @Seneca makes about everything in your private account being at risk. This would make it increasingly attractive for targeted attacks, so the ability to forget private data seems important from that respect. It need not be deleted as such, but I’d want the ability to make certain data inaccessible by anyone including the owner for this reason.


#70

Maidsafe never had a consistent view on data deletion, going from one extreme to the other throughout the years.

A brief recap of history:

  • Initially an SD could be deleted but everyone would know that the SD existed and was deleted and if someone recreated it then its version was incremented. This was IMO a reasonable implementation.
  • At one time Maidsafe made a change that removed these controls, and an SD could be deleted by removing it from the network. It could then be recreated at version 0 by anyone. This was what I would qualify as an extreme implementation where anybody is free to do what they want with their data, including completely rewriting history.
  • I had a hard time convincing Maidsafe to go back to the initial implementation, see these 2 long threads: unsuccessful one Deletion of SD objects and then successful one Transparency or opacity of SD modifications.
  • Current MD implementation is the reasonable initial one at the entry level (an entry can be deleted but everyone knows that, and history cannot be rewritten).
  • What is proposed now is another extreme implementation where data cannot be deleted at all by users. This ensures that not only can history not be rewritten, it also cannot be erased. The price is less freedom for users.
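
The "reasonable" behaviour in the first and fourth bullets can be sketched roughly as follows: deletion leaves a visible marker and the version counter never resets, so a recreated item cannot start again at version 0. The class and method names are hypothetical and are not taken from any Maidsafe API.

```python
from typing import Optional


class VersionedEntry:
    """Hypothetical entry whose deletion is visible and whose version never resets."""

    def __init__(self, value: bytes) -> None:
        self.value: Optional[bytes] = value
        self.version = 0

    def update(self, value: bytes) -> None:
        self.value = value
        self.version += 1

    def delete(self) -> None:
        # The content is dropped, but the entry and its version counter remain,
        # so everyone can see that something existed here and was deleted.
        self.value = None
        self.version += 1

    @property
    def deleted(self) -> bool:
        return self.value is None


entry = VersionedEntry(b"first")
entry.delete()
entry.update(b"recreated")  # recreation continues from the old version
```

The "extreme" variants in the list correspond to dropping the counter entirely (second bullet) or dropping `delete` entirely (last bullet).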

To politicize Maidsafe position on this specific problem (data deletion) I would say the evolution is: center party -> right wing -> center party -> left wing.

This may be the right thing to do, but in the past Maidsafe was arguing against it and now they are going even further than what I proposed. This has been a big waste of time.


#71

We started the project calling it perpetual data in 2006 with perpetual coin in 2007 as well.


#72

So, the full circle is going to be completed!


#73

I think there is great value in looking at various mechanisms though. Deletable public data is not good and was never wanted, but there is scope/space for editing metadata. The answer is in there somewhere for sure. The aim of ensuring public data is never deleted has never changed, but working out how to achieve that while allowing mutations of metadata or management data is tricky for sure. To start with on launch, though, it makes sense to be simple: an API that is rich and allows the apps we have seen so far, and hopefully much more, will be a good start.


#74

But it’s not forced as the clearnet still exists … which as @neo has pointed out will drive them to stay on the clearnet. Unless of course you are secretly working for the government and will use their guns to make us all use the Safe Network :rofl:

It won’t lose data you choose for it not to lose. I’m fairly certain that most people don’t think having a choice is being in a weird place.

Wait wait wait … I thought we were talking about MD for private temp data - for apps and personal use … Why would this idea need to be expanded to the whole of the network for public data?

I don’t think many are opposed to appendable data at all … IMO the question is whether or not we have private temp data that can be erased.

Think about it this way … when you are working on an idea - say you are writing a paper on something that is rather political in nature … but since you are ‘working’ on it, you write a few things at first that you later come to understand are wrong … but if people down the track find your earlier views they will mud sling it all over the place and ruin your reputation … all because you started out with a brainstorm and wrote a bunch of nutty things down that you later regretted …

People have the right to privacy - so they need a right to delete.

You make my point exactly. Plus hacking is still possible either through deceit or coercion to gain access to someone’s accounts.

Appendable data seems great for collaborative projects and website backups, etc. but for private data it is not a substitute for deletable data.


#75

This I’ve suggested many times and solves a lot of the issue.

MD data allowed this, and the idea of never actually deleting the containing MD, with its version number, provided a way to protect against recreating history. Or, as @tfa points out, delete the contents and set the owner to an owner with no known private key.


The biggest issue I see with append only data is the growth of individual records.

For instance

  • a record with simple info takes up 100KB of data (yea a couple of blobs of data in that)

  • the data is encrypted so only the app can read it. (thus discarded data is encrypted and meaningless to anyone)

  • A change of fields occurs about once a month. The average change size is 25KB

  • There are about 1 billion records in this collection of data (some sort of database by another name)

  • For appendable data this results in 25TB of wasted (to anyone else) encrypted data

  • But worse, the app must trace through all the changes to a record to reconstruct it when retrieved

    • This means that after one month the updating app has to reconstruct the 1 billion records during the next month and process 25TB of extra data due to reconstruction of each record (25KB avg/record)
      • the next month it is 50TB of extra data to process
      • after a year it’s 300TB of extra data to process.
      • and this is just one of the 1000s and 1000s of massive databases that are being used
    • After 2 years the users accessing data have to process an extra 600KB of unused data just to reconstruct the record they are reading
  • And even worse is if the data is organised in a relational manner and to present one set of data many records have to be read

  • This represents a massive waste of processing and energy worldwide that will rival today’s blockchain mining in scale. Blockchain mining is minor compared to the databases of the world.

  • this is the real barrier to adoption of SAFE as the storage medium if appendable data is the only way to store collections of data (database by another name)
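
The figures in the list above can be checked with a short calculation. This sketch assumes decimal units (1KB = 1,000 bytes, 1TB = 10^12 bytes) and exactly one 25KB change per record per month; the function names are made up for illustration.

```python
KB = 1_000                # decimal units, matching the round figures in the post
TB = 1_000_000_000_000

records = 1_000_000_000   # 1 billion records
change_size = 25 * KB     # average change per record per month


def extra_bytes_total(months: int) -> int:
    """Superseded data accumulated across all records after `months` months."""
    return records * change_size * months


def extra_bytes_per_record(months: int) -> int:
    """Superseded data a reader must replay for one record."""
    return change_size * months


for months in (1, 2, 12):
    print(months, "months:", extra_bytes_total(months) // TB, "TB")
print("per record after 24 months:", extra_bytes_per_record(24) // KB, "KB")
```

Both numbers grow linearly with the age of the data, which is the core of the objection: the cost of a read depends on how long the record has existed, not on how big it currently is.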

Hey, they are stored in immutable files and thus cannot be deleted or changed, so you are cheating here :sweat_smile:

Let’s look at the idiot who stores a 4GB video in MDs. If it’s stored in a deletable data type (non-versioned-kept MD) then people KNOW it’s temporary and they will copy it if they want it, and then the copies can be perpetual.

No your mutable data type was the answer. Yea we know it was not in alpha 2 but it was the plan

Actually they are not the same thing, they are two different concepts. Perpetual data can be achieved with fewer issues by version-keeping MDs. By this I mean keeping a copy of each version of the MD.

Version-keeping MDs, where a change to an MD causes the previous version to be retained, would solve the problem outlined above (all the extra processing required to reconstruct a record).
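
To make the difference concrete, here is a rough sketch comparing the two read paths. It assumes, as the posts above do, that an append-only record stores only the changed fields each time, while a version-kept MD stores a complete copy per version. The function names and the sample data are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]


def read_append_only(log: List[Record]) -> Record:
    """Rebuild the current record by replaying every appended change."""
    current: Record = {}
    for change in log:      # cost grows with the number of changes
        current.update(change)
    return current


def read_version_kept(versions: List[Record]) -> Record:
    """Each stored version is a full copy, so the latest one is the answer."""
    return versions[-1]     # cost is independent of history length


changes = [{"name": "Ann", "city": "Ayr"}, {"city": "Troon"}, {"name": "Anne"}]

# Build the version-kept history: every mutation stores a complete snapshot.
versions: List[Record] = []
state: Record = {}
for change in changes:
    state = {**state, **change}
    versions.append(state)
```

Both reads return the same record; the difference is that the first touches the whole history and the second touches one item. The trade-off is that full snapshots use more storage per version than deltas do.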

ALSO for data bases this is not always a good idea because of the mutation rates that can occur in certain databases and how they already don’t lose data (journals and the like), but appendable data will multiply the time to process data through the databases. At least keeping a version copy of the MDs will not multiply the time.

If you version MDs by keeping a copy of the old MD upon change, then you can have a temp file type very simply by not keeping copies of each version of that particular MD

The browser app can refuse to display websites stored in non versioned-kept MDs thus forcing web page versions to be kept.

And all apps can do the same for data that should be kept perpetual. Remember this is only MDs and immutable data files already are kept perpetual

Exactly anonymity will be removed and all transactions will be visible.

Exactly temp files and databases able to operate at full speed not slowed down (by processing many times more data than the actual data) is an absolute necessity.

Exactly and agree with this.

  • the idea of temp files being deletable agrees with this
  • databases being able to mutate data agrees with this (even if versioned-kept MDs are used)
    • appendable data causes any collection of records that are being updated to grow in size and access times.

No, that’s the cop-out answer. People WILL reuse IDs for various reasons, e.g. so family know who sent the payment. If appendable data or always-version-kept MDs are used then those transactions can be traced. All that is needed is one of the family’s IDs and then all transactions can be traced.

Also, once I have one ID I can, as with blockchains, follow the transactions, gaining a lot of information along the way even if throwaway IDs are used. If I know 2 IDs then I’ve got you.
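
The tracing concern can be illustrated with a plain graph search. This is a sketch only: it assumes transfer history is readable as (from, to) pairs, which is exactly the property being argued about, and the function name and sample IDs are invented.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Transfer = Tuple[str, str]  # (from_id, to_id), as visible in a kept history


def linked_ids(transfers: Iterable[Transfer], known_id: str) -> Set[str]:
    """Return every ID reachable from `known_id` by following transfers either way."""
    neighbours: Dict[str, Set[str]] = {}
    for src, dst in transfers:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)

    seen = {known_id}
    queue = deque([known_id])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


history: List[Transfer] = [("mum", "throwaway-1"), ("throwaway-1", "son"), ("x", "y")]
```

Starting from the single known ID `"mum"`, the search reaches `"son"` through the throwaway ID, while the unrelated pair `"x"`/`"y"` stays out of the result.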

Nope no help in the case of payments - see just above. Not all cases will be use once only IDs for so many reasons and family scenario above is just one.


tl;dr

I am not against perpetual data. I am against append only data types replacing MDs (mutable data types)

  • append only data types require extra processing to reconstruct the data.
    • waste of energy and processing. Particularly bad for database style of record keeping where it keeps multiplying the problems month after month.
  • The fundamentals only say public/published (implied shared with others) data is to be kept perpetual.
  • Mutable data does not go against the fundamentals if optional keeping of MD versions is used. Various methods can be used to prevent non-versioned-kept MDs from being used for websites and other general applications.
  • The big problem with an append-only datatype is the growth of the records/data, and the reprocessing of all that old changed data just to reconstruct the actual state of the data. That is OK if the data is small, but data that is large (even 1MB) will force users to read mountains of data just to get to the data they are after. Keeping a copy of versioned MDs solves this problem.

#76

Thanks for that detailed explanation Neo, I didn’t understand the full benefit of MDs. Versioning of MDs seems like a nice solution to replace the idea of appendable data.


#77

This is what I thought was meant by the fundamentals aside from the obvious immutable data.

I definitely agree that this is a major issue that truly goes against what has long been said on this forum for years.

I’m open to the RFC but I think the community is making solid points.


#78

I just looked and there is append only in it. But that comes after all the language that does not suggest append only. So it’s contradictory language, and that is the reason why so many missed it.


#79

Alternative idea to append only data type.

Keep the current commonly understood idea of MD - mutable data type

  • Allow a copy of previous versions to be kept as the default
  • allow MDs to be created with version copies set off
    • Once the MD is set to keep version copies, that cannot be turned off unless no mutations have been done since it was turned on
    • With version keeping off, an optional version-keep flag can be set on a mutation to keep a version on a case-by-case basis.
  • allows applications to access the data without the need to reconstruct the said data.
    • this allows collections of data records (ie database of any sort) to run at maximum speed without the increasing slowdowns caused by reconstruction of data and the associated lag time when reading additional MDs caused in time by append only data types
  • applications including the browser know the status of the version keeping flag in the MD and can reject such MDs if desired or flag them to the user as temporary data.
  • applications using collections of records (ie any sort of database) can either use version-keeping or not depending on the type of application the records serve
    • eg a private database of one’s music collection does not need to have versions kept, and it is up to the person keeping the collection records to decide.
    • eg a health record database would definitely have version keeping turned on.
    • in both cases neither is slowed down by the having to trace through previous changes to reconstruct the current data.
  • allows the concept of private temp files that can be deleted and the MDs reused without multiplying the data stored on the network
  • Coin MDs are always not keeping copies and its optional.
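
A rough sketch of how the flag rules in the list above might fit together. Everything here is hypothetical (class name, method names, the choice of `PermissionError`); it only models the rules as stated: keeping is the default, it can be off at creation, it cannot be switched off once mutations have happened under it, and with keeping off a single version can still be kept on request.

```python
from typing import List


class MutableData:
    """Hypothetical MD with the version-keeping flag described in the list above."""

    def __init__(self, value: bytes, keep_versions: bool = True) -> None:
        self.value = value
        self.keep_versions = keep_versions  # keeping copies is the default
        self.kept: List[bytes] = []         # retained previous versions
        self._mutations_since_on = 0

    def mutate(self, value: bytes, keep_this_version: bool = False) -> None:
        # With keeping off, a single version can still be kept case by case.
        if self.keep_versions or keep_this_version:
            self.kept.append(self.value)
        if self.keep_versions:
            self._mutations_since_on += 1
        self.value = value                  # current data is read directly

    def set_keep_versions(self, on: bool) -> None:
        if on and not self.keep_versions:
            self._mutations_since_on = 0
        elif not on and self.keep_versions and self._mutations_since_on > 0:
            raise PermissionError("version keeping cannot be turned off after mutations")
        self.keep_versions = on


temp = MutableData(b"draft", keep_versions=False)
temp.mutate(b"draft 2")                          # nothing retained
temp.mutate(b"draft 3", keep_this_version=True)  # "draft 2" retained on request
```

An app such as a browser could inspect `keep_versions` and refuse or flag data where it is off, as the proposal suggests. Reading `value` never requires replaying `kept`.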

Immutable files (chunks) are a separate data type and not covered by the above

  • They will be used for the main file storage and fulfil the perpetual data requirement for that data type
  • Many web pages will be stored as immutable files and thus kept anyhow

#80

@neo Love the proposal. What about storage cost (and cost implementation/mechanism) for MD? Any thoughts on those problems? It seems like it would be complicated EDIT: but maybe current farming mechanism works for both immutable and MD?

I wonder how much of global data would be MD versus immutable.

Mods - can we have a thread for all of this?