Introduction to MaidSafe: what it is, how it works, and how it compares to Bitcoin


#21

(I think I can help clarify this and will be corrected if I err at all, I hope. The system docs are good and getting better, but the system is so intricate that a lot of ins and outs are fleshed out through various threads on the forum. Also, final decisions as to exactly how things will work are being made along the way. Your summary is really kickass overall.)

I’m pretty sure the following is solid and won’t change:

Neither PUT nor GET is a one-for-one cost or reward.

On the PUT side: One safecoin will cover a certain amount of data PUT by a particular user (like 1 safecoin for x number of megabytes PUT/stored to the network). This resource is purchased from the network at large. The amount of storage one safecoin will buy will certainly rise over time.

On the GET side: Successfully executed GETs will give the farmer the opportunity to request safecoin, which will be awarded on sort of a lottery basis. So not every GET fulfillment will result in an earned safecoin. More safecoin are available early in the network’s running and so will be easier to farm at first. The odds per request will be better. Over time they will get harder. Vault rank also factors in somehow.

Safecoin paid to the network for the ability to PUT data will be taken out of circulation and made available for refarming. Thus safecoin will act as a continual lubricant, even after all the original safecoin have been farmed. As safecoin gets more valuable it will buy more resources, so less will be recycled. This will have the effect of farmable safecoin continually getting rarer, and more valuable. But there will always be some there to farm. Pretty slick.
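To make the recycling idea concrete, here is a toy sketch in Python. It is not how the network implements anything; the class name `CoinLedger`, the `mb_per_coin` parameter, and the rounding up to whole coins are all assumptions made for illustration. It only shows the bookkeeping: farming moves coins into circulation, and paying for a PUT moves them back into the farmable pool.

```python
import math


class CoinLedger:
    """Toy ledger: every coin is either in circulation or waiting to be farmed."""

    def __init__(self, total_supply, mb_per_coin):
        self.farmable = total_supply    # nothing has been farmed yet
        self.circulating = 0
        self.mb_per_coin = mb_per_coin  # hypothetical: storage one coin buys

    def farm(self, coins=1):
        """A successful farming attempt moves coins into circulation."""
        coins = min(coins, self.farmable)
        self.farmable -= coins
        self.circulating += coins
        return coins

    def pay_for_put(self, megabytes):
        """Paying for storage removes coins from circulation and recycles them."""
        cost = math.ceil(megabytes / self.mb_per_coin)  # assumption: whole coins only
        if cost > self.circulating:
            raise ValueError("not enough safecoin in circulation")
        self.circulating -= cost
        self.farmable += cost
        return cost
```

If `mb_per_coin` rises over time, the same upload costs fewer coins, so fewer coins flow back into the farmable pool, which is the "continually getting rarer" effect described above.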

It’s hard to imagine how such a decentralized network can manage to be fair in all this, but David Irvine has been pretty convincing that it can. Every now and then a new “Ah-Ha!” knocks me on my rear.


#22

I have 2 questions on this point.

1.) If the MaidSafe software on the computer that is putting the file onto the network is compromised, then the data “could be” stolen before it even goes onto the network, right? I realize other viruses and bugs can be running on the computer or device outside of the MaidSafe software, and not much can be done about that if a virus scan misses those rogue elements. So if the data could be stolen before even being put into MaidSafe, that creates a need for secured apps (safe from being compromised) to create the data in the first place, e.g. a MaidSafe-secured camera app, MaidSafe-secured document creation software, MaidSafe-secured messaging, etc. But don’t all these third-party apps that are MaidSafe-secured from the point of file creation open up more vulnerabilities for data intercepts and potentially compromised accounts? I guess I am wondering: if a MaidSafe-secured camera app was used, what is to prevent the app creator from taking a copy of the data before it is chunked?

2.) Is the table that is created as a data map available anywhere besides the computer that uploaded the file? I would assume all devices you log into get the table, and the table is stored on the network as well (encrypted), so you can access your data from any device you authenticate, correct? It was not made clear in the summary that the table would be made available to other authenticated devices besides the computer that uploaded the file. Obviously if that computer dies and/or is replaced you would not want the data uploaded from that device lost forever. But what if it is a shared device? Does the table used as a data map disappear when you log out?

Again, if you log into a compromised MaidSafe software app, and the data table is downloaded, is all your data that is found from the data map in the table now available to whoever compromised the software on the device? Or is there a layer of security to prevent the app from accessing that data map table? Again, take a MaidSafe-secured camera app that you log into from a friend’s tablet to capture and store a video event. What are the vulnerabilities if they have a camera app that might be rogue? What if you want to browse or access your files, so that you are seeing the entire data map? If that is from a compromised MaidSafe app, then you have just given up everything, correct? What would prevent legitimate secure software from being intercepted and replaced with a spoofed copy when it was originally downloaded (if the original download occurs outside MaidSafe)?

Forgive my questions, as I have not read all the documentation. I have just read this summary which was very helpful. Thank you.


#23

Really thorough explanation here and glad to see entrepreneurs looking at this software! And especially glad for the fact that this paper was reviewed and that I finally had a chance to read it :slight_smile: thanks @eblanshey


#24

Yes, so it was emphasized that it’s a summary.

A careful person will have several different accounts and segregate their personal and confidential stuff in a “cold” MaidSafe account.

No amount of engineering will prevent careless people from losing their data. If you use a crappy device which you bought on the cheap because the h/w vendor likes to save engineering costs on firmware upgrades, why should MaidSafe spend their engineering resources to make more than reasonable efforts to protect your data?

Of course many precautions are and will be in place, but it’s a tad unrealistic to expect that you can be careless to the point where you have everything compromised and yet somehow have your data remain secure.


#25
  1. Yes, an individual computer that has a keylogger or other malware can still mess you up, just like now.
    The difference is that instead of hoovering up everyone’s data and then narrowing the target for further attack, someone would have to attack each individual machine. This is definitely a concern to the Maidsafe team, but it’s not the first priority, I think. First is to get the network functional and secure as a network, then spread the love to individual computer security (like maybe a dedicated MaidSafe OS?).
  2. Your “account” is actually a state that is maintained by the network, that keeps track of your virtual drive, etc. So you can log into it from any network client, as long as you can self-authenticate, and be just as if you were on the same machine.

#26

TL;WR: Client-side security isn’t MaidSafe’s problem.

MaidSafe’s main objective is securing data in (cloud) storage and in transit over networks (including the internet). Client-side security is an even harder (probably impossible) problem to solve, due to the fact that a general purpose operating system (such as Windows) needs to provide the freedom to run any application the user wishes.

That freedom is a double edged sword, since it can also be abused by malware, which can infect the system either through technical vulnerabilities in legitimate software, or simply because the user authorizes access by mistake.

The only way to have a truly secure client is to use dedicated hardware with a dedicated operating system with dedicated software with severely limited freedom for the user. And even then it’s very hard to patch all potential holes.


#27

Of course, but being device (and hardware) agnostic and resilient to hardware failures makes longevity of data more sustainable. The point of the cloud is to have back ups when hardware failures (or upgrades) occur.

Correct, I understand the pitfalls of the systems today, which is why I raised the question I did. Of course sucking up data in transport is a lot more “efficient” than individual machine compromises. However, the latter could become easy with backdoors into the core OS. That is why I posed the question: to think about potentially making it 100% safe, end to end.

That would be great… with trusted hardware / drivers.

TL;WR


#28

Device-agnostic and resilient to h/w failures doesn’t help here. If I show you an on-screen keyboard and on the screen driver level I record the location of your strokes and then send that in a text message to my bot HQ’s, now I have your MaidSafe credentials and can access your data from anywhere.

There’s nothing MaidSafe needs to or has to do about that. Everyone should take care of their own problems.

Google and Hotmail have “useful” alerts about “suspicious” logins from different locations based on GeoIP and similar stuff, but they can be “useful” only because they track you (and not just for the purpose of being helpful).

EDIT: Fixed typos from the original post


#29

@eblanshey this is awesome. I’ve only just found time to read it through carefully and it is a great resource. It’s all good, but I really found the detailed explanation of how SAFE works helpful, pulling all the bits together like that (esp. as I’ve not gone into the docs at all, just been gleaning info from conversations).

Thanks very much :smile:

I have two queries:

In point 4:

Is this so? I haven’t read the docs, it just sounds to me like the data manager might know one of the four vaults for each chunk it manages, or do all 32 of the managers for a chunk know about and manage the same four vaults holding that chunk? It may well be so, I just want to clarify. EDIT: @dirvine says you have it spot on, all 32 know about/manage the same four vaults holding a given chunk.

In point 6:

I believe the vault doesn’t earn for every successful GET. What I think happens is that all vaults holding a chunk are sent a request for the chunk. One of them will be first to deliver the chunk, so the others don’t earn even if they send it. Furthermore, I think that the one which is first gets a chance to earn Safecoin, which may not in fact succeed.
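As I understand it, the flow looks roughly like the sketch below. This is only my reading of the description above, not the real protocol: the function name `serve_get`, the `response_times` mapping, and the fixed `win_probability` are all made up for illustration.

```python
import random


def serve_get(response_times, win_probability, rng=random.random):
    """Pick the vault that delivers first, then decide if its farming attempt wins.

    response_times: dict mapping a vault id to its delivery time in seconds.
    win_probability: hypothetical chance that the attempt earns a safecoin.
    rng: callable returning a float in [0, 1); injectable so runs can be repeated.
    """
    # Only the first vault to deliver the chunk gets any chance to earn.
    fastest = min(response_times, key=response_times.get)
    # Even the fastest vault only gets an attempt, which may not succeed.
    earned = rng() < win_probability
    return fastest, earned
```

For example, `serve_get({"a": 0.3, "b": 0.1, "c": 0.2}, 0.5)` always picks vault `"b"`, but whether `"b"` earns anything depends on the draw.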

Also, we say there are four copies of each chunk, but I believe it varies and is at least four, and that it’s just simpler to say four. You might mention that in a footnote. It’s something David mentioned [here on the forum.][1]

EDIT: @dirvine elaborates on Safecoin earning below…
[1]: What exactly happens when a particular file or safecoin address gets DOSed?


#30

Yes

And again yes :slight_smile: , the number of chunks is 2-6 (when DM sees <3 then 4 of them create new stores). There are up to 16 offline, as well as cache.

Safecoin farming is controlled by a network-agreed (this means group-agreed in reality) rate, and this rate changes to encourage or discourage safecoin attempts. So when there is a ton of space the rate decreases, and vice versa. The rate per vault is a sigmoid-like curve, so it is small to begin with, increases rapidly near the network average, and then smooths off again. This picture shows that function, hope it helps.

[edit] imagine the Y axis is the farming rate of your vault (depends on rank) so higher rank is further along the X axis.
This is then multiplied by the network average rate (the network wants more space or less space so alters this rate globally)

So everything smooths out.
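A rough sketch of that shape in Python, for anyone who wants to play with it. The logistic function is just one sigmoid-like curve picked for illustration, and `average_rank`, `steepness`, and `network_rate` are hypothetical parameters, not values taken from the network.

```python
import math


def vault_rate(rank, average_rank, steepness=1.0):
    """Sigmoid-like curve: small at low rank, steep near the average, flat at the top.

    Assumption: a plain logistic function centred on the network average rank.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (rank - average_rank)))


def farming_rate(rank, average_rank, network_rate, steepness=1.0):
    """Per-vault curve multiplied by the globally agreed network rate."""
    return vault_rate(rank, average_rank, steepness) * network_rate
```

A vault sitting exactly at the average rank gets half of the network rate under this choice of curve. Lowering `network_rate` (the network has plenty of space) scales every vault's rate down together, and raising it does the opposite.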


#31

After having “won” a SafeCoin farming attempt as described by @dirvine above, I believe the attempt itself may also fail. If I understand correctly, a random SafeCoin ID is generated. If no SafeCoin with that ID already exists, then that SafeCoin is created and awarded to the farmer. If a SafeCoin with that ID does exist, the farming attempt fails.


#32

Yes, this is the case. The address space of the safecoins limits the supply and ensures more randomness in selecting successful attempts. It is like bitcoin hash attempts really, but without burning electricity beyond a few cycles and network messages, which is negligible.
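The attempt described in the last two posts can be sketched like this. It is a toy model only: a Python set stands in for the network's record of existing coins, and the function name `attempt_farm` and the `id_bits` size are assumptions.

```python
import secrets


def attempt_farm(existing_ids, id_bits=32):
    """Draw a random coin ID; the attempt succeeds only if that ID is unclaimed.

    existing_ids: set of coin IDs already issued (stand-in for network state).
    id_bits: hypothetical size of the coin address space, in bits.
    Returns the new coin ID on success, or None if the ID was already taken.
    """
    coin_id = secrets.randbits(id_bits)
    if coin_id in existing_ids:
        return None               # ID already exists: the attempt fails
    existing_ids.add(coin_id)     # ID is free: the coin is created and awarded
    return coin_id
```

Because the address space is fixed, the chance of hitting an unclaimed ID shrinks as more coins exist, which is how the supply stays capped and why farming gets harder over time.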


#33

I’ve corrected some errors in the post, including the bit about getting paid for GET requests.

Thanks for the kind words everyone!


#34

So the chunks (software), data managers (managing hardware) and safecoins (value) all have their own XOR-space?


#35

Sorta. They all have unique ID’s that can be routed to through a distributed hash table that uses XOR-space.
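To show what "routed to through XOR-space" means in practice, here is a minimal sketch. The tiny integer IDs and the function names are made up for illustration; real IDs are much longer, and this is only the distance metric, not the routing layer itself.

```python
def xor_distance(a, b):
    """Distance between two IDs in XOR-space: the bitwise XOR, read as an integer."""
    return a ^ b


def closest_nodes(target_id, node_ids, count):
    """Return the `count` node IDs closest to `target_id` by XOR distance."""
    return sorted(node_ids, key=lambda node: xor_distance(node, target_id))[:count]
```

With nodes `[0b0001, 0b0100, 0b0111, 0b1000]` and a target ID of `0b0101`, the two closest nodes are `0b0100` and `0b0111`. The same lookup works whether the target ID belongs to a chunk, a manager, or a safecoin, which is why they can all share one address space.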


#36

Awesome thread! Great reading while settling a restless toddler! :slight_smile:


#37

@eblanshey I’m still seeing this, did you correct the OP or something else?

6 The vault receives the chunk and gets paid in safecoin for every GET request on the chunk. There are now 4 copies of each chunk distributed throughout the network.


#38

Very nice, thank you. Gotta love this:

It is for this reason that you’d need to control 88% of the network in order to reliably attack it (compared with Bitcoin’s 51% attack). The larger the network, the stronger it becomes.


#39

Oops, missed that one. I had corrected another section. Updated.


#40

Thanks for the great overview.

Questions not yet addressed that come to my mind:

Will data ever be deleted from the network? Can it even be deleted? Say someone tries out MaidSafe, uploads a few Gigabytes of videos and then decides he doesn’t want to use MaidSafe, never returning. Will this content linger on the network forever, not accessible to anyone because the original owner deleted his keys, being replicated over and over again as nodes are thrown off the network due to some outages, DSL reconnects or whatever? If not, that presents a scalability issue over time, doesn’t it?

Connected to that, in case I know I won’t need some data anymore, can I as the owner delete it from the network so that it isn’t taking up space and especially being replicated anymore? Especially for the use case of a key/value database where the contents might change rapidly, it might be important not to store every data point forever?

Also connected: Say my node is a vault and has currently stored terabytes of chunks. Then there’s a network outage. The vault managers lose connection to my node and forget it, replicating “my” stuff somewhere else. Then my node gets connected again. Did I understand correctly that now it will get a completely new ID and be in a completely different group of nodes with no connection to the previous vault managers, i.e. having to earn rank first, then maybe becoming a vault for others again, and that all the chunks I had before are now useless and can be deleted from my file system, as they have been replicated elsewhere and my copy will never be accessed anyway?

Another question: Each chunk is broken into 32 pieces. What about very small chunks, i.e. I’m not saving a file in MaidSafe, I’m using it as a key/value database, and the whole chunk is only, say, 12 bytes in size?

EDIT: And another one: Messaging/streaming etc. is mentioned, but how is that implemented with that concept of data chunks that are stored on the network?

Thanks in advance for answers. This whole concept is very fascinating. :slight_smile: