EPHEMERALITY and data-persistence?

isafe · May 28, 2018, 2:25pm

Ok, I’m still getting acquainted with this platform that I just discovered.
I’m wondering: the distributed character of SAFE certainly assures the persistence of the data (no one can really bring it down). That’s great for information you always want to have access to: it simply can’t be deleted.
But there’s a flip side to this coin: what about data I want to be volatile, like private IM messages? Not only do I want to avoid eavesdropping, but I would also like a way to control the lifetime of my exchanges on the receiver’s side.
I wouldn’t want each and every message sitting forever on a mobile device or laptop that can be stolen or hacked (or screenshotted).
So, is there a way to bake ephemerality into these exchanges? Some IM apps, like Wickr or Signal, do this by setting either an expiration time or burn-on-read (BOR).

DavidMc0 · May 28, 2018, 2:56pm

As I understand it, there is a type of data called ‘mutable data’ on the network, which is a data type that can be edited. If you use this type of data, a message can be stored, and later changed to a different value (e.g. zero) if that’s what you want (someone could develop an app to reset data at specific intervals if you’d like).

There is also ‘immutable data’, which cannot be edited. You can use whichever type suits your purpose.

So using mutable data would achieve what you want in terms of ephemerality of data.
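To make the "reset at specific intervals" idea concrete, here is a minimal sketch of what such an app could do. It uses an in-memory dictionary as a stand-in for a mutable-data container; `MutableStore`, `Entry`, `put`, `get` and `sweep` are hypothetical names for illustration and are not the SAFE API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    value: bytes
    expires_at: float  # seconds on whatever clock the app uses


@dataclass
class MutableStore:
    """In-memory stand-in for a mutable-data container (hypothetical, not the SAFE API)."""
    entries: Dict[str, Entry] = field(default_factory=dict)

    def put(self, key: str, value: bytes, now: float, ttl: float) -> None:
        # Store the message together with the time at which it should expire.
        self.entries[key] = Entry(value, now + ttl)

    def get(self, key: str, now: float) -> Optional[bytes]:
        # Expired or missing entries read as None.
        entry = self.entries.get(key)
        if entry is None or now >= entry.expires_at:
            return None
        return entry.value

    def sweep(self, now: float) -> int:
        """Overwrite expired values with empty bytes; return how many were reset."""
        reset = 0
        for entry in self.entries.values():
            if now >= entry.expires_at and entry.value != b"":
                entry.value = b""
                reset += 1
        return reset


store = MutableStore()
store.put("msg-1", b"meet at noon", now=0.0, ttl=60.0)
before = store.get("msg-1", now=30.0)   # still readable
reset_count = store.sweep(now=61.0)     # the app's periodic reset
after = store.get("msg-1", now=61.0)    # no longer readable
```

Note that a reset like this only changes the stored entry. It does nothing about a copy the recipient has already read or saved elsewhere.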

On this point, your data on the SAFE network would never sit as a whole on any mobile device / laptop that could be stolen or hacked - it will only be viewable when you’re logged into the SAFE network, unless you specifically want to make an offline backup, which is your choice. So by default, this shouldn’t be a problem for you.

Traktion · May 28, 2018, 4:09pm

A slightly tangential answer, but you don’t have to persist transient data. Using Crust directly to send messages between two nodes may be of use here.

neo · May 29, 2018, 1:46am

Yes, as the above posters say - you have the option of many types of data.

Persistent file storage for files that need to be persistent
Mutable Data for things like posts, messages, etc. that can be changed or deleted.

But rather like the current web, you need to use an APP that does what you want it to do. If you want editable/deletable messages/blogs/notes then use an APP that stores them in mutable data and provides a suitable UI to change/delete them. If you want to store files then upload them or use an APP that does this for you.

A note about private file uploads. If you want to lose any connection to the file then simply delete the datamap and then it’s gone. The chunks remain, but without a datamap no one can put those chunks together, even if they could find any of them. Public files are different because, well, they are public and anyone can read them anyway. But of course no one knows who uploaded the public files in the first place.

isafe · May 29, 2018, 10:11am

ok.

It’s crucial to get a precise idea of how these different types of data are handled, so thanks for helping me on this one.

The question I had in mind was this: would it be possible to set ephemerality before putting the data out, as some kind of baked-in killswitch?

What you propose would be more like a post-editing feature, resembling but not identical to what certain IM apps call RECALLING a message, i.e. remotely wiping the message you sent from the receiver’s device. The difference is that you not only sever the link between you and the scattered data chunks, but effectively wipe out the original message itself. It just gives you more control over your information: what type of info you want (mutable or not), its longevity (BOR or expiration date), and even its utter destruction on the recipients’ devices it was sent to in the first place.

This type of privacy control (thinking of Wickr and similar apps here) seems to offer that last byte of control over your information that I don’t see SAFE offering: control not only over the intelligibility of your information, but over its sheer existence, even once it has been sent out into the open. Even then, you still own the information you gave out and can still call it back.

This question of « recallability » of the very existence of the message arises not only once the receiver has received the message, but also while the message is still « in transit ». Let’s say I send a confidential mail to a girlfriend who decides to call it quits with me and refuses any communication, blacklisting me (will this be possible with SAFE’s mail? Some kind of spam regulations or other rules?). She will never receive my confidential message, which will sit there in the open, scattered and unintelligible if I delete the datamap. Still, I may prefer those bits and bytes to be « zapped » completely, not just the datamap. Will SAFE allow such control? It surely seems a legit concern to me.

Another thing linked to that is what I would call « JUNK » chunks, or « ZOMBIE » chunks, i.e. the 1 MB pieces of my original message that I want to delete after or during delivery. If I can only delete the datamap, those chunks stay alive: they keep sitting on someone’s HD for nothing, filling up valuable space, because I can no longer recall or delete them. Is it imaginable that a vault’s HD space will get choked with these zombie chunks that will never be delivered and that I cannot delete?

JPL · May 29, 2018, 10:18am

There is a looooong debate on this topic elsewhere if you have a few days spare, eg:

neo · May 29, 2018, 10:54am

I for one have suggested a new type of data that is essentially temporary files. You upload files, the files are tagged as temporary, and the system deletes them after “X” number of system events. So one could set them to last, say, 12 months’ worth of system events. It won’t be accurate, but at least the data will be deleted somewhere between, say, one month early and two months late.
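A rough sketch of how counting events instead of wall-clock time could work is shown below. This is only an illustration of the suggestion, not an existing network feature; `EventClock`, `store_temporary` and `on_event` are made-up names, and what counts as a "system event" is left open.

```python
from typing import Dict, List


class EventClock:
    """Counts network events instead of wall-clock time (hypothetical model)."""

    def __init__(self) -> None:
        self.count = 0
        # chunk name -> event count at which the chunk expires
        self.temp_chunks: Dict[str, int] = {}

    def store_temporary(self, name: str, lifetime_events: int) -> None:
        # The lifetime is expressed in events, not in seconds or months.
        self.temp_chunks[name] = self.count + lifetime_events

    def on_event(self) -> List[str]:
        """Advance the counter by one event and drop chunks whose lifetime has passed."""
        self.count += 1
        expired = [n for n, deadline in self.temp_chunks.items() if deadline <= self.count]
        for name in expired:
            del self.temp_chunks[name]
        return expired


clock = EventClock()
clock.store_temporary("a", lifetime_events=3)
clock.store_temporary("b", lifetime_events=5)
dropped = [clock.on_event() for _ in range(5)]
```

Because the number of events per month varies with network activity, a lifetime chosen as "12 months’ worth of events" only approximates 12 calendar months, which is the inaccuracy mentioned above.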

Your other choice is to have an APP that deletes datamaps according to your schedule. That way the files are lost to everyone including yourself.

isafe · May 29, 2018, 11:19am

thanks for the link, I’ll give it a read.

Well, why aren’t the SAFE mods clear about this in a FAQ section or so? I mean, it’s all about privacy protection, so data handling is crucial here. It’d be really contradictory if one couldn’t delete one’s own data!

So correct me if I’m wrong : deleting the datamaps (i.e. 50% of the container) also deletes the raw contents itself, and not just its readability ?

Another aspect of too much junk data sitting around is that nobody will ever ask for it, which could depreciate the value of that specific vault (the sitting data is never requested and, as the disk fills up, it can’t receive new data), meaning no safecoins or being kicked out …

JPL · May 29, 2018, 11:53am

It’s actually quite a complex issue. One school of thought is that no data should ever be deleted and that the growth in storage capacity will more than keep up with stored data, particularly because data on SAFE is automatically de-duplicated on upload. Another is that users should be able to delete their own data. Then there are various points in between, such as what @neo suggests above. AFAIK a final decision on this has not been made, hence the lack of concrete info.

So correct me if I’m wrong : deleting the datamaps (i.e. 50% of the container) also deletes the raw contents itself, and not just its readability ?

No, just its readability. The data map, which only the data owner can use, is a map of where all the encrypted chunks that make up a file are stored on the network, allowing the file to be reassembled. Without the data map, the file is just a bunch of useless chunks spread around the world.
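As a toy illustration of the relationship between chunks and the data map, the sketch below splits data into chunks, stores each chunk under the hash of its bytes, and returns the ordered list of addresses as the data map. It is deliberately simplified: real self-encryption also encrypts the chunks, which is left out here, and `ChunkStore`, `upload`, `download` and the tiny `CHUNK_SIZE` are assumptions made for the example.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # tiny on purpose, for illustration only


class ChunkStore:
    """Toy content-addressed store: chunks are keyed by the SHA-256 of their bytes."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def upload(self, data: bytes) -> List[str]:
        """Store data as chunks and return the data map (ordered chunk addresses)."""
        data_map = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            address = hashlib.sha256(chunk).hexdigest()
            self.chunks[address] = chunk  # identical chunks land on the same address
            data_map.append(address)
        return data_map

    def download(self, data_map: List[str]) -> bytes:
        # Reassembly needs the data map: it gives both the addresses and their order.
        return b"".join(self.chunks[address] for address in data_map)


store = ChunkStore()
data_map = store.upload(b"hello world!")
restored = store.download(data_map)

store.upload(b"hello world!")    # uploading the same data again adds no new chunks
chunk_count = len(store.chunks)

data_map = None                  # "deleting the data map": the chunks are still stored
remaining = len(store.chunks)
```

In the sketch, throwing away the data map leaves all three chunks in the store, but nothing records which chunks belonged together or in what order, which is the "readability" that is lost.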

isafe · May 29, 2018, 1:28pm

How is transient data, like say a voice call or video chat, handled? First of all, there’s immediate delivery, so no need to store data on any vault. How does that work? Do these calls still go through the vaults, which are then like dumb tubes, kind of like Tor nodes? Once a voice packet reaches its destination, I mean, it’s immediately deleted, no? Both the datamap and the contents? So these apparently have some kind of killswitch option built in?

I also wonder how they will chop up voice and video data. 1 MB chunks are large for that (latency, packet loss). Will each chunk be hashed with a new round of AES encryption? Do we know anything about the codecs supported?

dirvine · May 29, 2018, 6:44pm

We have demoed webrtc, so this is all between clients. The SAFE network, in that case, provides a secure signaling layer for clients. Therefore all data is between them, but encrypted.

isafe · May 30, 2018, 9:04am

Hi, and thanks for your reply. I’m just diving into some other posts containing webrtc.
Just a question though: the following URL http://webrtc-security.github.io/ does seem to point to certain weaknesses in WebRTC’s security setup, such as

4.3.5. A Weakness in SRTP
SRTP only encrypts the payload of RTP packets, providing no encryption for the header. However, the header contains a variety of information which may be desirable to keep secret.
One such piece of information included in the RTP header is the audio-levels of the contained media data. Effectively, anyone who can see the SRTP packets can tell whether a user is speaking or not at any given time. Although the contents of the media itself remains secret to any eavesdropper, this is still a scary prospect. For example, Law enforcement officials could determine whether a user is communicating with a known bad guy.

Does the fact that we’re using WebRTC over SAFE make that concern superfluous?

Any reason for having preferred WebRTC over ZRTP, for example?

Anything decided concerning the audio-video codecs : opus, av1 ?

Also pondering whether it’d be useful to use the DTLS layer of WebRTC for direct p2p file transfer. Could this solve the problem of deleting data on SAFE vaults once it’s been put out? As far as I understand, only the datamap can be deleted, not the contents themselves. Any thoughts on that?

dirvine · May 30, 2018, 9:56am

Yes, the more worrying part is signaling. So normally you use NAT traversal servers. We remove those servers. This means we will not be using STUN/TURN as per those specs, but instead, use our own encrypted versions of a slimmed down version of those specs.

No, none at all, we just wished to show a secure person to person communications. The webrtc is an example and for sure we can extend that. I have not used/read up on ZRTP so always keen to know more.

No, as of yet this is likely a client-side choice, although I would love to have as much client interoperability as possible.

It’s a double-edged thing: if you want to transfer a file then it’s easy to transfer the data map. So the content is already stored. If it is a file, it is likely you have it (to send it), and therefore deduplicating the storage is OK. I am not sure these files will be temp, but more valuable than that, so stored?

isafe · May 30, 2018, 11:21am

Thanks for clearing that up. I was wondering about the codecs because of bandwidth usage. Some apps (like LINPHONE) have multiple audio codecs available with different bitrates, adapting to the bandwidth available.

For meshing ZRTP with webrtc : https://tools.ietf.org/id/draft-johnston-rtcweb-zrtp-02.xml

https://secushare.org/ has a mesh-like E2E approach for GNUnet. They developed CADET, a new transport protocol for confidential and authenticated data transfer in decentralized networks. This transport protocol is designed to operate in restricted-route scenarios such as friend-to-friend or ad-hoc wireless networks. Maybe it’s worth checking out?

Could you elaborate a bit more about the editing possibilities for mutable data ?

Can a user delete from the SAFE network not only the pointers to the scattered chunks of the data (datamap), but the actual contents themselves (all 8 copies)? If so, I admit it’d be a huge relief for me. This could be done manually or, as with some IM apps (and even secure webmail, such as ProtonMail), with a built-in “time-bomb” feature: an expiration time for the message or a burn-on-read (BOR) time, set before the data gets sent out.

Secondly, as I stated, some IM apps even allow for recalling sent data from the recipients’ devices once sent.
This is really fine-tuning the ownership of your information and data, and it just seems so fitting to have these options in such an AIO solution as SAFE. Thanks for any comments on that.

neo · May 30, 2018, 12:16pm

No for immutable data. That is part of the immutable data specification. Only the datamap can be deleted. Once that is deleted, it is impossible to retrieve that immutable file. For one, there is no reference to the chunks, so they cannot even be found; and then there are no self-encryption keys to retrieve. It would be easier to find a 1" piece of string in a 1000-foot-high haystack than to find any of your chunks.

Now if things change then whatever.

isafe · May 30, 2018, 12:55pm

OK, and how about the mutable data ?

I’m also wondering how XORing voice and video chats through multiple, potentially geographically remote, hops will affect latency …

neo · May 30, 2018, 1:04pm

You can add, delete, edit, create

The APP you use will do these for you. Obviously you choose the APP that suits your needs. There will be so many APPs that the problem will be choosing.

isafe · May 30, 2018, 1:26pm

That’s for sure? I can delete the container itself and not just the datamap?

Another possibility for enhancing WebRTC security would be PERC (Privacy Enhanced RTP Conferencing): https://datatracker.ietf.org/wg/perc/about/ and https://fr.slideshare.net/alexpiwi5/perc-webrtc-e2e-media-encryption-with-sfu

oetyng · May 30, 2018, 6:57pm

Just this weekend I was facing a situation where I found myself wishing for deletable data, while designing a reliable dictionary over SAFENetwork.

The problem comes from our use of such structures today, where we store projections in them. The projections are of arbitrary size, and are fetched and stored again for every incremental change in the form of domain events. So what that gives is that we have almost identical mass of data, with just a small change on a property, that would require an entirely new ImD.

There is MD, but they have fixed size. The ImD will allow arbitrarily sized data storage.
I guess I could craft a way to split data up over n MDs, but I think it’s fair to say that it wouldn’t scale well.

So, maybe this is not a problem at all, considering data storage capacity increase, but at a certain percentage of network applications using this kind of storage there will be a noticeable effect. We would see a duplication of data at the same rate as new events coming in (minus the occasional deduplication if the projections are simple).

We can assume it won’t happen (other types of applications will be more common) and do some estimations, but it is hard to predict how data storage solutions will be used when nothing with a direct negative effect on the user limits uses that are not beneficial for the network.

I would just like a way to solve this problem without creating new (sometimes) useless data at an insane speed.

And then there is the philosophy of not removing data, which I am very keen on too.

neo · May 30, 2018, 11:22pm

Back in the day when storage was expensive and at a premium and a lot of it on Mag Tape we also had to devise ways to add/change small bits to large data on tape. Actually people did something similar in books where you could not reprint in an instant (margin notes in pencil/pen)

While Tape is not immutable data it was very expensive time wise and machine time wise to be rewriting whole tapes because you add a word in the middle of your data.

So what was done, for say your dictionary, was to store the current “known” facts on the tape and then have a secondary store for amendments on fast storage (e.g. Drum/Disk/DECtape, similar to, say, MD). The application would then look up the tape first and then check for amendments on disk/drum/DECtape in order to create the actual record.

In your case, I would say store your current version of the dictionary in immutable data; then, as changes come in, you have MDs to hold the changes. You could derive the MD address loosely from which record is being changed, so that you are not searching linearly through a hundred MDs but only a few.

Then, when the time is right, create a new immutable version of the data(base).
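The snapshot-plus-amendments pattern could look roughly like the sketch below. It uses plain dictionaries as stand-ins for the immutable snapshot and the MD entries; `OverlayDictionary`, `amendment_bucket`, `compact` and the bucket count are hypothetical names and choices, not anything from the SAFE libraries.

```python
import hashlib
from typing import Dict, Optional


def amendment_bucket(key: str, bucket_count: int) -> int:
    """Derive which amendment container a record's changes go to from the record key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % bucket_count


class OverlayDictionary:
    def __init__(self, snapshot: Dict[str, str]) -> None:
        self.snapshot = dict(snapshot)                   # stands in for the immutable version
        self.amendments: Dict[str, Optional[str]] = {}   # stands in for MD entries; None marks a deletion

    def set(self, key: str, value: str) -> None:
        self.amendments[key] = value

    def delete(self, key: str) -> None:
        self.amendments[key] = None

    def get(self, key: str) -> Optional[str]:
        # An amendment, if present, overrides whatever the snapshot says.
        if key in self.amendments:
            return self.amendments[key]
        return self.snapshot.get(key)

    def compact(self) -> Dict[str, str]:
        """Fold the amendments into a new snapshot and clear them."""
        merged = dict(self.snapshot)
        for key, value in self.amendments.items():
            if value is None:
                merged.pop(key, None)
            else:
                merged[key] = value
        self.snapshot = merged
        self.amendments = {}
        return merged


d = OverlayDictionary({"apple": "a fruit", "pear": "a fruit"})
d.set("apple", "a red fruit")
d.delete("pear")
looked_up = d.get("apple")
deleted = d.get("pear")
new_snapshot = d.compact()
bucket = amendment_bucket("apple", bucket_count=16)
```

The lookup here checks the amendments before the snapshot rather than the tape-first order described above, but the resulting record is the same. Only `compact` creates a new full copy, so small changes do not each produce a whole new immutable version.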
