Immutability - does it change everything?


#21

How this pans out, I think, largely depends on how the safecoin economy works. If you pay to store something and there is no time limit, what is the motivation to clean up, unless of course you get some of that coin back?

To the point of IoT sensors: I think they are really not creating a lot of data. To store “the light is off” every 100 seconds is really not that much data. It’s just one yes-or-no variable. Can you think of an example of an IoT sensor that would be creating a lot of junk?
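As a rough back-of-envelope check on that claim, here is a minimal sketch. It assumes each reading is stored as a single byte, which is my assumption; a real record would also carry addressing and metadata overhead.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
SAMPLE_INTERVAL_S = 100   # "every 100 seconds", from the post above
BYTES_PER_SAMPLE = 1      # assumption: one byte per on/off reading


def yearly_bytes(interval_s=SAMPLE_INTERVAL_S, bytes_per_sample=BYTES_PER_SAMPLE):
    """Raw payload bytes produced by one sensor in a year."""
    samples = SECONDS_PER_YEAR // interval_s
    return samples * bytes_per_sample


print(yearly_bytes())  # 315360 bytes, roughly 0.3 MB per sensor per year
```

Under those assumptions a single on/off sensor stays well under one megabyte of payload per year.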


#22

I believe the assumption was mutable data would be cheaper because it is something that doesn’t require guaranteed access.

The problem I see with any sort of flagging mechanism is that the network is not time aware. If you set it to remove flagged data when churning elders, or something similar, that could be a minute after you put the data there, or weeks (I’m actually not sure what the average time on this is?). If the network charges a user to upload a file they were temporarily storing to share with a friend, and it gets deleted a minute after upload even though their friend has yet to access it, that causes a problem. With no time awareness, there can be no guarantee of availability for any given period of time, so no one could trust the network to upload such data.

Perhaps a fix would be a two-bit flag: the user sets the first bit to mark the data as deletable, and the network sets the second bit once the data has gone through one elder churn pass. Data would only be removed when both bits are set.
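A minimal sketch of how such a two-bit flag might behave. The names `ChunkFlags` and `on_elder_churn` are hypothetical and not part of any SAFE codebase; the rule that a chunk is removed only on the churn pass after the one that set the second bit is my reading of the proposal.

```python
from enum import IntFlag
from typing import Tuple


class ChunkFlags(IntFlag):
    NONE = 0
    USER_DELETABLE = 1   # set by the uploader
    SURVIVED_CHURN = 2   # set by the network after one elder churn pass


def on_elder_churn(flags: ChunkFlags) -> Tuple[ChunkFlags, bool]:
    """Return (new_flags, delete_now) for a single churn pass."""
    if not flags & ChunkFlags.USER_DELETABLE:
        return flags, False                      # never touch unflagged data
    if flags & ChunkFlags.SURVIVED_CHURN:
        return flags, True                       # second pass: safe to remove
    return flags | ChunkFlags.SURVIVED_CHURN, False  # first pass: only mark
```

This guarantees the data survives at least one full churn pass, but it still gives no wall-clock guarantee, since the time between passes is not fixed.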


#23

Is there any reason you could not buy a block of mutable space (probably for a higher price to cover the extra costs of it changing)? Then it’s just like OneDrive or whatever… you can just keep reusing the space you bought as many times as you like, but if you just can’t bring yourself to delete any of those bad selfies you need to buy more space :stuck_out_tongue:


#24

Also, why would you not just buy this cheaper mutable space, never mark anything as deleted, and hence have normal space for a cheaper price?


#25

I think I misspoke when I said it is cheaper because it doesn’t require guaranteed access. What I meant by that was changes can be made to the data without having to pay the full cost to re-upload that data. With an immutable file, if you find you need to make changes, you have to then upload and pay for the full size of the file a second time. This could get pretty expensive, even if you only needed one small change. Whereas with mutable data, you could change one byte and only have to pay one PUT. The result is, with immutable data, you have all copies of the file in their original form. In mutable data, you just have the new file. The new possible approach other than mutable data is appendable data, which is kind of a mix of these two.
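To make the cost difference concrete, here is a minimal sketch. It assumes 1 MB chunks and one PUT per chunk, neither of which is confirmed in this thread, and it models only the two cases described above: a full re-upload for an immutable file versus a single PUT for a mutable change.

```python
CHUNK_SIZE = 1_000_000  # bytes; assumed chunk size


def immutable_update_cost(file_size: int) -> int:
    """PUTs needed to re-upload a whole immutable file after any change."""
    return -(-file_size // CHUNK_SIZE)  # integer ceiling division


def mutable_update_cost(changed_objects: int = 1) -> int:
    """PUTs needed when only the changed mutable objects are rewritten."""
    return changed_objects


# A one-byte fix to a 50 MB file:
print(immutable_update_cost(50_000_000))  # 50
print(mutable_update_cost())              # 1
```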

The thought of having a chunk of mutable data space that is purchased in large blocks is interesting to me. Perhaps a future release could add such a feature if mutable data ends up being the way they go. For the time being, they are just trying to get to a final release as quickly as possible.


#26

Oh OK, I understand now. Thanks for putting this in caveman speak for me! lol I totally agree this is a good idea. With something like a OneDrive with unlimited changes, there is the potential that the number of changes users make is too large for farmers to be fairly compensated. I like the idea of having another class of data where, if you write over it, you pay what that costs to do but don’t need to buy brand new frontier space per se.


#27

I forgot why that idea would not be (easily) possible. There still has to be some cost to upload mutable data to the network to avoid someone creating an attack where they flood the network with upload requests by constantly making small changes to files.

Perhaps there is some way around that, but I can’t think of any off the top of my head.


#28

Well, even if it’s a legit user that just has some use case that requires a billion changes, they should have to pay their share of the cost of using the network this much. I think there is no way around the physics of it: you want someone to do something for you, and therefore you must pay them for the effort.

Like, I am curious if you could make Second Life 2 using this network and have an unlimited sandbox. This would obviously entail a lot of changes, but not infinitely many. I would hope you could use the mutable space solution to reserve enough for the action to happen but be constantly changing it as the alternate universe unfolded.


#29

Yes, I’ve mentioned that idea many times and it still has no real traction, unfortunately. And I mention it as an ordinary member of the forum, not as a moderator.

So MD is mutable data, which is just an AD with the modify (delete) flag set, allowing fields to be changed. We could require that modified fields cannot change in size once written, if that simplifies the codebase. Alternatively, it can be implemented as keeping a copy of each version of the “MD” for perpetual data, or as optional for deletable data (<-- this is probably the easiest for tracing history - perpetual)

It would have to be clear to the software that reads the chunk OR MD that it is marked as deletable.

And I’ve suggested that once the user changes it to non-temporary it cannot be reversed, which would maintain the perpetual data. Also, if another user uploads a duplicate file then the chunk would be marked as non-temporary, even if uploaded as temporary.

The onus is on the uploader to see if their chunk(s) already exist as temporary. And of course if it was already there as temporary then the uploader has no way to know if it will be deleted anytime soon.
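One reading of that rule, as a minimal sketch: the temporary status is a one-way switch, and a duplicate uploaded as permanent promotes an existing temporary chunk. `ChunkStore` and its methods are hypothetical names for illustration only, and the exact promotion trigger is my assumption.

```python
class ChunkStore:
    """Tracks only the temporary/permanent status of chunks, by address."""

    def __init__(self):
        self._is_temporary = {}  # address -> bool

    def put(self, address: str, temporary: bool) -> bool:
        """Record an upload; return True if the chunk is temporary afterwards."""
        if address not in self._is_temporary:
            self._is_temporary[address] = temporary
        elif not temporary:
            # A permanent duplicate promotes the chunk. There is deliberately
            # no code path that turns a permanent chunk back into a temp one.
            self._is_temporary[address] = False
        return self._is_temporary[address]

    def make_permanent(self, address: str) -> None:
        """One-way switch available to the owner of an existing chunk."""
        if address in self._is_temporary:
            self._is_temporary[address] = False
```

Note that a second temporary upload of an existing temporary chunk leaves it temporary, which is why the uploader cannot know how long it will last.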

So the purpose of temp files (chunks) is 99.9% as personal temporary files. The concept of public files having this property is an extremely specialised one, rarely used except in very special situations.

And the purposes of MD would include mainly database style applications, personal data collections etc

But still to get to further alphas and early beta then its best to implement AD and fix the other things first.

David already suggested that there would be different amounts for storing different things/sizes. This is rather easy to do with the current concept and Frazer’s concept for safecoin


#30

I think there’s a significant element missing here: is it easier for people (as in non technical people) to accept a transition from AD to MD, or MD to AD? If we implement one particular technical solution and then realise from real usage that actually the other solution is better, then the network would need a change which is always hard in a live distributed network (mostly politically and socially hard, rather than technically hard).

Maybe @SarahPentland or @dugcampbell could pitch in with their expertise from a PR perspective, would you prefer to manage a transition from temporary-to-permanent data or permanent-to-temporary data? I think this aspect is extremely important to consider.


#31

In my opinion that is a no-brainer.

AD to “MD” is the easiest transition.

MD to AD has a lot of implications that affect both technical and social. Whereas AD to MD would allow more specialised use cases to be implemented (without the waste). It would be like an addon.

Also, AD to MD allows the MDs to be implemented as defaulting to AD functionality, with MD as a temporary data concept rather than the norm. Then applications could reject MDs as unreliable for the long term, and database applications would not spew waste ADs/data everywhere when updating indexes and data stores. If copies of each version are kept, then speed is not affected for AD-style use. Most databases keep a log of changes anyhow, so a very efficient form of perpetual data is already built in, and much better than a generalised AD/MD could do.


#32

Well, you have this traktion on board! :wink:

It seems a very sensible idea to me. Similar to the likes of IPFS and Freenet, and actually a pretty capable format for short/medium term storage. Obviously, the degree of permanence would depend on vault implementation, but even the standard caching design would see popular data remaining present.

Just letting throw away data rot seems like a good foundation for garbage collection. Surely we want the network to keep everything that people ask for, but arguably it is the reverse for stuff people don’t want to keep.

Additionally, temporary storage could be free, much like IPFS. That is a good use case and marketing angle to push.


#33

Disagree here since people would simply do “refreshing” where they copy their temp data before it disappears. And do it all for free. Then there is the spam of data problem. Just keep storing temp data till it all overflows.

No, temp data needs to be paid for, and priced within an order of magnitude of full-sized chunk PUTs. Maybe 1/2 the cost or 1/4 the cost. Back-of-envelope maths tells me no less than about 1/4. Keep it too costly for spamming but cheap enough for temp files created by many apps (e.g. editors, spreadsheets, etc.) to provide recovery. It’s important for recovery to work even if you change machines etc. For instance, you might be in the middle of a massive edit session when the computer dies. So in-memory recovery temp files are not good enough.
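To illustrate why a price floor blunts the “refreshing” trick, here is a minimal sketch. The break-even framing is my own, not something stated in the thread: if a temp upload costs a fixed fraction of a permanent PUT, then re-uploading the same data that many times costs as much as storing it permanently once.

```python
import math
from fractions import Fraction


def refreshes_until_breakeven(temp_fraction: Fraction):
    """Smallest number of temp uploads whose total cost reaches one permanent PUT.

    Returns None when temp storage is free, because refreshing then never
    catches up with the cost of permanent storage.
    """
    if temp_fraction <= 0:
        return None
    return math.ceil(1 / temp_fraction)


print(refreshes_until_breakeven(Fraction(1, 4)))  # 4
print(refreshes_until_breakeven(Fraction(1, 2)))  # 2
print(refreshes_until_breakeven(Fraction(0)))     # None
```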

If browsers and publishing systems require the ADs/MDs to be perpetual then there is no reason people will publish using temp ADs/MDs. Also Apps will know if the MD/AD is temp or not and use it accordingly.

Maybe even charge MORE for the creation of a mutable data object so that it is discouraged for normal use. Then the normal modifications charged at the same rate as for appending to an AD. EDIT: And of course APPs will reuse the MDs when creating their temp files to save creation costs.
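A minimal sketch of how that pricing could play out. The 2x creation multiplier and the cost units are made-up numbers for illustration; only the shape of the scheme (a surcharge on creation, the normal append rate for later modifications) comes from the paragraph above.

```python
PUT_COST = 1.0              # cost of one normal PUT / AD append, arbitrary units
MD_CREATE_MULTIPLIER = 2.0  # assumed surcharge for creating a mutable object


def reused_md_cost(n_writes: int) -> float:
    """Create one MD (surcharged), then modify it in place for later writes."""
    if n_writes < 1:
        raise ValueError("n_writes must be at least 1")
    return PUT_COST * MD_CREATE_MULTIPLIER + PUT_COST * (n_writes - 1)


def perpetual_cost(n_writes: int) -> float:
    """Store every write as ordinary perpetual data."""
    return PUT_COST * n_writes


print(reused_md_cost(1), perpetual_cost(1))      # 2.0 1.0
print(reused_md_cost(100), perpetual_cost(100))  # 101.0 100.0
```

With these numbers a one-off store costs double as an MD, so it is a poor substitute for normal storage, while an app that reuses the same MD a hundred times pays about 1% extra.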

For immutable files that are stored temporarily, also charge enough to discourage their use for cheap storage.

Also, at the network level you can prevent the datamap for a temp file from being stored in a perpetual AD, thus forcing temporary status throughout.


#34

Excellent ideas @neo. I really like where this is going.

I think this is a great use case (and it is just a fragment of the possible ones like it).

Important data processing but unreliable physical setting? I sure can imagine that.

The network is so much more than security and privacy. It is also the most reliable storage, ever.
It would be a great missed opportunity not to be able to use it like that too (among all the other things). It doesn’t need to be the whole full-fledged scenario of eternal data storage™ needs and ultimate encryption and security™ needs. It might just be the need for ultra-reliable storage. That magnifies the possible use cases tremendously.

I see a problem with this.
The garbage production is not immediately apparent for users. It’s like the plastic in the oceans. People don’t see it. If we don’t make it more expensive to produce garbage than NOT to do it… well, then we’re going to get it there.
I.e. people will use perpetual option for garbage rather than temp option.

But a very interesting twist on the idea anyway, since it would minimize the use, and we don’t actually want temp data in the network (we just might need to be able to have it…).

I was thinking if it could be more expensive in return for getting some of it back on actively deleting it…? Makes it more complicated for sure, but could perhaps kill the fly and keep it alive (ehm :stuck_out_tongue_winking_eye:).


#35

You might have missed my edit. The MDs would be reused when the app needs temp storage again. So the extra cost is a one-off to get the temp storage objects, and then the reuse is at the normal rates. The extra cost prevents people from using temp storage as a cheap permanent store, and encourages perpetual storage over temp/mutable, since they would have to pay extra to store their permanent files that way.

Also for databases the indexes would reuse any discarded parts.

Yes this would support the idea of paying (slightly) more for creating the temp storage objects

NOTE: This is for mutable data objects. Temp immutable chunks have a different set of conditions.


#36

Very good point and one that increases the use cases for the SAFE network.

Surely the world knows the difference between perpetual published documents/sites/data and temp private data (or database records change).


#37

I’m not certain this would be the case. If it were, IPFS would surely be unsustainable as it would be a fight for free space. We are fortunate to have a working example of this to monitor.

We have to bear in mind that temporary data is also sacrificial. It will be the first data to be purged when more popular data needs storage and/or cache space to persist it. They would always run the risk of their data disappearing and there would be an overhead in maintaining sufficient “fake” demand to keep it in the caches.

Edit: interesting IPFS thread here about permanence: https://news.ycombinator.com/item?id=10437044


#38

I don’t think this could ever fly, unless it was so sensitive that content destruction became the incentive.

Personally, I don’t think we should be turning away temporary data. I don’t think it is a bad thing at all. Idle resources can be used to host and deliver it, delivering a benefit to everyone who wishes to access that data.

We have to consider that this temporary data may be valuable to those other than the uploader. Providing value is surely what the network has to be about. Of course, we have to ask ‘at what cost’, but the retention time for both storage and caching of temporary data may simply depend on what a vault owner is willing to voluntarily donate.

IMO, having a default where temporary data is cached until it is replaced by more popular or immutable data would make sense. I also suspect that a vault with space would also store it until something else (newer or immutable) needed that space.
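A minimal sketch of that default, assuming a vault with a fixed capacity that evicts its least-requested temporary items to make room and never evicts permanent ones. `Vault`, `Item` and the `hits` popularity counter are hypothetical names for illustration, not an existing vault API.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    size: int
    temporary: bool
    hits: int = 0  # how often the item has been requested


class Vault:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = {}

    def used(self) -> int:
        return sum(i.size for i in self.items.values())

    def store(self, item: Item) -> bool:
        """Store item, sacrificing the least popular temp data if needed."""
        permanent = sum(i.size for i in self.items.values() if not i.temporary)
        if permanent + item.size > self.capacity:
            return False  # would not fit even after purging all temp data
        while self.used() + item.size > self.capacity:
            victims = [i for i in self.items.values() if i.temporary]
            victim = min(victims, key=lambda i: i.hits)
            del self.items[victim.name]
        self.items[item.name] = item
        return True
```

In this model temporary data can never fill the vault for good, because it is always the first thing cleared when something else needs the space.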

Given there are options in the wild that encourage or only provide this sort of temporary data, it is clear there is demand for it. Do we really want to turn people away from doing this on SAFENetwork, when the tech stack is fully capable of it (with relatively small changes, I suspect)? Personally, I don’t want to be storing temporary stuff on IPFS and permanent stuff on SAFENetwork - I want both on the latter.


#39

Then it’s not fit for the purpose.

Cannot express that enough.

You cannot then use MDs since they could disappear on you. No good for databases & indexes etc

Now as temp files in immutable chunks then it has some use, but this is separate to deletable/(truly)mutable data (MDs). And as temp immutable files then charging less could be valid.

Again I think I have made this a little more confusing than it should be. I was referring to deletable data in MDs (see NOTE)

The idea of paying more is to have deletable (mutable) data objects that are reusable by the user. Such uses include, but are in no way limited to:

  • Temp files (stored in true MDs) used by applications for recovery if app crashes, PC crashes, PC dies and is replaced etc. Most text files being edited are one MB max, but MDs can be chained together if needed.
  • Database mutability
  • Indexes for databases and other applications
  • personal use in applications. EG star mapping. Collection of information in MDs but not strictly database.
  • IoT data collecting when not in a pure log type
  • IoT temp files needed for applications (ie NOT the normal stated applications) These files are often addressed by MD address and bytes in MD

For all these deletable data objects the idea of paying slightly more is good since it discourages using these deletable objects as normal file storage. The reason for wanting that is

  • Perpetuity of (published & other) files and data is much needed in the world and should be encouraged
  • Temp files encourage making SAFE sites from them, and we will end up with what we have today: dead links all over the place. It would be cheaper to put up a SAFE site using temp files and take the risk that one day (likely years away) some pages disappear. Then just upload them again when it is noticed, or just let the site die out.
  • database applications NEED to know their records can be truly mutated and will not be removed just because a section runs low on space sometime in the next 10 years.

@Traktion It seems we are actually talking of two different concepts here. You are talking of truly temp files that can disappear at the whim of the sections and have some use. I am talking of temp files at the control of the user (apps).

These are two very different parameters even though they are similar on the surface. The best use case to illustrate this is the recovery files that MUST remain under the control of the user (app) and cannot be allowed to disappear if space is low. If that happened, then recovery files would not really be secure recovery files, would they, since they would still allow data loss.

So I see it as 2 aspects of being able to delete files.

My main idea in the past and now has been that of temp files at the control of the user and also temp files that have a guaranteed lifetime (expressed in say events rather like blocks in btc)
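A minimal sketch of a lifetime measured in events rather than wall-clock time. What counts as an “event” is left open in this thread; here it is assumed to be a counter the network increments (for example once per churn), and `TempChunk` and `purge` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TempChunk:
    address: str
    stored_at_event: int   # network event counter when the chunk was stored
    lifetime_events: int   # guaranteed number of events it must survive

    def expired(self, current_event: int) -> bool:
        return current_event - self.stored_at_event >= self.lifetime_events


def purge(chunks, current_event: int):
    """Keep only the chunks whose guaranteed lifetime has not yet elapsed."""
    return [c for c in chunks if not c.expired(current_event)]
```

This sidesteps the network not being time aware: the guarantee is stated in units the network can actually count.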


EDIT: The argument that IPFS is an example of how free temp files will work is invalidated by the example of the testnet prior to invites. Remember how account spamming filled up the testnet.

People created accounts, each giving them more PUTs, to fill up the storage capacity of the testnet.

This is why I know that free temp files on SAFE will be abused and fill up the network.


#40

But data on the prior test net was not temporary and sacrificial. If we are talking about temporary in the sense that the network clears them when space is needed, it should be impossible to fill up; old and unpopular stuff will just be cleared to make way for new stuff.