SAFE Network Dev Update - August 20, 2020

Sorry, I edited my post since I first posted it. I hope it makes more sense now.

1 Like

No worries chap, interesting convo.

The current state of the art says that it's not currently known how to work in this way. There are things like posterior updates (increased knowledge over time to re-order) and there is causal order. It's a very deep area, and this is what I meant by "easily disregarded".

A good thought experiment here is to imagine a single Actor and several replicas. Design the semi-lattice with last-write-wins or some such scheme and then link back to the token change "time". That alone is hard, but then add concurrent Actors and it becomes really interesting, so much so that leaders in CRDT data types are working hard to try and solve that one. It is where the magic of natural updating happens though.
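To make the single-Actor starting point concrete, here is a minimal last-write-wins register sketch in Rust. The `LwwRegister` type, the (timestamp, actor_id) tie-break and the example values are illustrative assumptions only, not the actual network data types:

```rust
// Minimal last-write-wins (LWW) register, for illustration only.
// The (timestamp, actor_id) tie-break and the type names are assumptions,
// not the actual SAFE Network data types.

#[derive(Clone, Debug, PartialEq)]
struct LwwRegister<T: Clone> {
    value: T,
    timestamp: u64, // Lamport-style logical clock
    actor_id: u64,  // breaks ties between equal timestamps deterministically
}

impl<T: Clone> LwwRegister<T> {
    fn new(value: T, timestamp: u64, actor_id: u64) -> Self {
        Self { value, timestamp, actor_id }
    }

    /// Join on the semi-lattice: the write with the higher (timestamp, actor_id)
    /// wins, so the merge result is the same regardless of delivery order.
    fn merge(&mut self, other: &Self) {
        if (other.timestamp, other.actor_id) > (self.timestamp, self.actor_id) {
            *self = other.clone();
        }
    }
}

fn main() {
    // Two replicas of the same register diverge, then converge after merging.
    let mut replica_a = LwwRegister::new("owner: alice", 1, 1);
    let mut replica_b = replica_a.clone();

    let write_bob = LwwRegister::new("owner: bob", 2, 1);
    let write_carol = LwwRegister::new("owner: carol", 2, 2); // concurrent write

    replica_a.merge(&write_bob);
    replica_b.merge(&write_carol);

    replica_a.merge(&replica_b);
    replica_b.merge(&write_bob);

    assert_eq!(replica_a, replica_b); // both end up with "owner: carol"
}
```

The merge alone is the easy part; the difficulty described above is that the token/ownership change lives in a second semi-lattice, so a join like this does not by itself tell you which data writes were made under which permissions.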

I mean it sounds easy, but getting down and dirty with it (which is massively interesting) makes your head spin, though in a great way.

2 Likes

I don't think ownership change and tokens arriving later etc. are a problem with CRDTs.

Parallel operations could settle on one of a number of possible valid states:

  1. change owner, late request refused
  2. request accepted, change owner

One of these would then be picked randomly but deterministically so that everyone is on the same page.
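Roughly what I have in mind, as a sketch only (the `Op` type and the use of a non-cryptographic hash are just placeholders for illustration):

```rust
// Sketch of a deterministic tie-break between concurrent operations:
// every replica hashes both ops the same way, so every replica picks the
// same winner without coordination. The Op type and DefaultHasher are
// illustrative; a real system would use a proper content/cryptographic hash.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Hash, Debug)]
enum Op {
    ChangeOwner { new_owner: String },
    SpendRequest { token_id: u64 },
}

fn op_hash(op: &Op) -> u64 {
    let mut hasher = DefaultHasher::new();
    op.hash(&mut hasher);
    hasher.finish()
}

/// Order two concurrent ops: the one with the smaller hash is applied first.
/// All replicas compute the same hashes, so all converge on the same order.
fn deterministic_order<'a>(a: &'a Op, b: &'a Op) -> (&'a Op, &'a Op) {
    if op_hash(a) <= op_hash(b) { (a, b) } else { (b, a) }
}

fn main() {
    let change = Op::ChangeOwner { new_owner: "bob".into() };
    let spend = Op::SpendRequest { token_id: 42 };
    let (first, second) = deterministic_order(&change, &spend);
    println!("apply {:?} then {:?}", first, second);
}
```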

But I'm probably missing something if that's not how it goes :slight_smile:

2 Likes

That is last write wins (or first write wins, etc.), but it does not alone solve this issue. The issue is that you have two semi-lattice data types you are syncing. LWW etc. works well when it's an individual semi-lattice-capable data type, but syncing two is a whole new area.

3 Likes

Can't it be "smaller hash wins" or similar? Then timing is no longer a problem. Is there an issue with updates arriving so late that rollback becomes too expensive?

Or is it just bad style to invalidate a previously accepted update? Though I don't see how that's avoidable, e.g. with temporary network splits… As far as I can tell, CRDTs do conflict resolution in two possible ways: a) allow forks and make them explicit, or b) pick a winner and piss off the rest.

I'm still not sure I'm getting how this problem is worse for token-based authorization than for permissions stored on the data itself. Isn't it about the same problem for both?

4 Likes

There is no deterministic measure for the two-data-type sync that we know of. Desperate to find one though :wink:

I think it is very related, however.

Our solution right now is forking. Say the data type is a list or something, let's call it X. The permissions are another type, let's call it Y.

What we do is keep the whole lot in one blob. We start with Y0 → X0, X1, X2 etc., so X updates while Y stays the same.

Then we change Y and we "fork":

So we have:
Y0 → X0, X1, X2
Y1 → X2, X3, X4 etc. [edit: X2 here is the same value]

But it can also have:
Y0 → X0, X1, X2, X3, X4 (so the Y0 permission got two more updates that cannot be applied to Y1, so Y0(X3) and Y1(X3) are not the same value).

This way that single container holding both CRDT types mimics syncing them, but by forking.

[EDIT: the critical thing above is that Y0->Y1 is causally ordered, as are all the changes in X, i.e. X0->X1 etc. What happens is that the X values are actually also causally linked with Y. So you can read each value above as Y0Y1(X0), Y0Y1(X1), Y0Y1(X2), Y0(X3), Y1(X3) and so on.]
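If it helps to see the shape of that in code, here is a very rough sketch; every name and type in it is made up for illustration and it is not the actual data type:

```rust
// Rough sketch of one container holding both CRDT types:
// permission versions (Y) and data ops (X). A permissions change opens a
// new fork branch; data ops append to the branch they were created under.
// All names and types are illustrative only, not the actual SAFE types.

#[derive(Debug, Clone)]
struct Permissions {
    version: u64,        // Y0, Y1, ... (causally ordered)
    owners: Vec<String>,
}

#[derive(Debug)]
struct Fork {
    permissions: Permissions, // the Y this branch is written under
    data_ops: Vec<String>,    // X0, X1, ... applied under that Y
}

#[derive(Debug, Default)]
struct Container {
    forks: Vec<Fork>,
}

impl Container {
    /// Change permissions: causally ordered Y0 -> Y1 -> ..., each opening a new branch.
    fn change_permissions(&mut self, permissions: Permissions) {
        self.forks.push(Fork { permissions, data_ops: Vec::new() });
    }

    /// Append a data op under the currently latest permissions version.
    fn append(&mut self, op: &str) {
        if let Some(fork) = self.forks.last_mut() {
            fork.data_ops.push(op.to_string());
        }
    }

    /// A late op that was created under an older Y lands on that older branch,
    /// which is how Y0(X3) and Y1(X3) end up holding different values.
    fn append_under(&mut self, permissions_version: u64, op: &str) {
        if let Some(fork) = self
            .forks
            .iter_mut()
            .find(|f| f.permissions.version == permissions_version)
        {
            fork.data_ops.push(op.to_string());
        }
    }
}

fn main() {
    let mut c = Container::default();
    c.change_permissions(Permissions { version: 0, owners: vec!["alice".into()] }); // Y0
    c.append("x0");
    c.append("x1");
    c.append("x2");
    c.change_permissions(Permissions { version: 1, owners: vec!["bob".into()] });   // Y1
    c.append("x3");
    c.append_under(0, "x3'"); // late op created under Y0 forks onto the Y0 branch
    println!("{:#?}", c);
}
```

The point is only that a permissions change opens a new branch, and an op created under an older Y stays on the older branch rather than being rebased onto the new one.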

Hope that helps. I did not want to tell you that though, as it can steer folk toward a design that is not always the best. I prefer not to know answers myself and to find new, better solutions. However, this probably describes the problem more fully.

So you end up needing the permissions in the same container as the data.

Another thing is that data republish is a PITA, as you cannot republish the data until you can find the permissions for it as well. If they are contained in the single blob then you can.

7 Likes

I should add, my absolute rule in design like this is: if the answer is not simple, then it's not a good answer. CRDT work is very much like that when you dive into it. The problem is very, very deep, but the solution is extremely simple.

12 Likes

Yes, imagine it was a safecoin transfer; then double-spend etc. could happen. It's a bit deeper again, but the semi-lattice means it must merge cleanly. As I say, this is a fascinating area and developing quickly, but extremely solidly. I personally feel we could use a form of inference (Bayesian-like). So currently I would not say it absolutely cannot be done, I am just saying we need to handle these things with extreme caution. A great approach is as simple as possible first: get the solution provably correct and secure, then add a wee tweak, but each wee tweak can give an avalanche of issues.

tl;dr I love this stuff as it forces deep thought.

11 Likes

Random thought: if the keys and permissions are integral to the data, won't the network address change when the metadata is updated? How will that be handled?

5 Likes

It will, but the section chain (part of the old data chains) handles that nicely. So data accepted as network valid is signed by the section. To confirm a signature we just find the key in the section chain and check the sig with that.
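As a toy illustration only (these types and the verify step are placeholders, not the real section-chain API; the real check is a proper cryptographic signature check against the section's keys):

```rust
// Toy sketch of confirming network-signed data against a section chain.
// All types here are stand-ins: a real implementation verifies an actual
// cryptographic signature with the section's public keys.

#[derive(Debug, Clone, PartialEq)]
struct SectionKey(u64); // stand-in for a real section public key

#[derive(Debug)]
struct Signature {
    signed_by: SectionKey, // stand-in: really a signature verified against a key
    payload_hash: u64,
}

/// The section chain: the history of section keys the section has held.
struct SectionChain {
    keys: Vec<SectionKey>,
}

impl SectionChain {
    /// Data counts as network valid if its signature was produced by a key
    /// that appears somewhere in the section chain.
    fn verify(&self, payload_hash: u64, sig: &Signature) -> bool {
        self.keys.contains(&sig.signed_by) && sig.payload_hash == payload_hash
    }
}

fn main() {
    let chain = SectionChain { keys: vec![SectionKey(1), SectionKey(2), SectionKey(3)] };
    let sig = Signature { signed_by: SectionKey(2), payload_hash: 0xABCD };
    // Even after the section has moved on to SectionKey(3), older data signed
    // with SectionKey(2) can still be confirmed against the chain.
    assert!(chain.verify(0xABCD, &sig));
    println!("data confirmed as network valid");
}
```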

[edit: I think I missed your point. This is not immutable data (a blob) but mutable data (a sequence etc.), and the address of that is fixed, i.e. not dependent on content.]

4 Likes

Why is PARSEC being completely removed? I probably missed it since I'm not up to date anymore, but I remember everyone was very positive when it was achieved and it was marked as the most important piece.

5 Likes

OK, so in the case of, say, a photo, the metadata is in the data map, not the chunks themselves?

4 Likes

In theory it was great; in practice it made things unnecessarily slow, and more elegant solutions were found.

6 Likes

It was probably great in a strict-order-of-everything world. Many in the team saw that as critical and I supported them. It did not perform as well as thought, and strict order of everything controlled in one place is actually a terrible idea IMO. However, we did try, and try hard, to make it work, but the team who built it are gone now and there was no point flogging a dead horse in my opinion. It was a smashing piece of work for that mindset and as a thought experiment it was good. Tested in a mock environment with no real data it looked great; in the real world it was not a fit for SAFE. It might be good for a money-only network though.

Now we have significantly more elegant and efficient solutions that are also much simpler, with much less code.

31 Likes

Having a working network gets more important by the week.

Keep building team.

12 Likes

One thing I think would be healthy at this stage with SAFE is to evaluate again which aspects of the network are considered ready for MVP (the one that proves this network can be done without being as advanced as the v1 flagship release) and go from there. If any stuff is hardened enough for v1 (meaning good enough for the full limelight and needing no further investigation until way down the road, think ETH v1 present vs ETH v2 PoS), that's worth a shout too, as more stuff in a v1-ready state means fewer iterative improvements post-MVP to get MaidSafe's flagship network out there for the masses.

It would also be good to understand what's currently still being "pondered" for concrete solutions, basically what stuff is still in research mode without a concrete solution in sight, as these are good to bring up in community topics for group debate to hopefully drive thinking in a good direction.

Some features may also be so bulky that some components are solved while other pieces are unsolved, so they may fall into multiple categories and need further explanation breaking down what's been solved versus what hasn't, to help paint a proper picture and maybe get feedback from the smarter folks in the community.

I think simple snapshots of this done every 3-4 months are healthy, to see if your feature/work list is changing and to ensure progress is being made in the core areas needed for MVP and ultimately a v1 release down the road. The dev updates are nice because they give us some insight into what the devs are battling on a week-to-week basis, but sometimes a simple higher-level statement of the features of SAFE deemed necessary for launch, with notes on whether each is MVP-ready or v1-ready, why, what it's lacking, and what still needs study, would be good.

This doesn't need to be pretty UI work on the website either, as most don't want to spend time updating that, nor are those pretty UI tables really detailed enough. Rather, I'd want something quick and dirty that takes 30 minutes, just a forum text post of jotted-down notes keeping that higher-level view aligned, without going too deep into the weeds, but with enough that someone technical can dig down into the dev updates and see how they tie into those higher-level features we need to birth a network.

20 Likes

I see this as analogous to some strategies used in scientific programming for HPC. Sometimes you need strict ordering or barriers to ensure that race conditions don't occur between parallel agents, but it kills your parallel performance via Amdahl's Law. If all network section operations can be made not to require strict ordering, then that's great. The way forking is implemented is an interesting strategy.
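For reference, Amdahl's Law gives the bound on that:

```latex
% Amdahl's Law: speedup with N parallel agents when a fraction p of the work
% can run in parallel and the remaining (1 - p) must stay strictly ordered.
S(N) = \frac{1}{(1 - p) + \frac{p}{N}}
```

With, say, p = 0.95 and N = 64 agents the speedup is only about 15x, and the ceiling is 1/(1 - p) = 20x no matter how many agents are added, which is why removing the strictly ordered fraction matters so much.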

9 Likes

Yeah, I was also wondering whether the minimal is still minimal. Like maybe a network with only public data or something like that. But I think the coin is essential to prevent spam, and that I guess means private data then, as the coin is private data in itself.

3 Likes

When can we see the farming part on GitHub?

1 Like

I'm also wondering if anything is going on with the publication of the PARSEC scientific paper. I know it is not up to MaidSafe, but up to reviewers etc. But have you heard anything about it?

Even now, when PARSEC is removed, it would be one good starting point for discussions about what MaidSafe is aiming for, how, why…

2 Likes