Proof Of Storage prototype


#1

Proof Of Storage prototype

This post outlines a bit of a tinker inspired by this post by @JoeSmithJr (plus this clarification).

Background

Knowing how much spare space is available to the network helps match the supply of resources with the demand and minimize disruption during periods of high demand.

Mechanisms to prevent lying about spare space are subject to time inversions, where only a small portion of the claimed space is consumed and any ‘gaps’ are computed on the fly to pretend that space exists. This process is entirely hidden from the tester making proof of space an unreliable tool for measuring supply.

I previously considered this problem too easy to cheat for it to be useful, but maybe this isn’t the case…?

Protoype algorithm

The prototype presented here aims to provide a reliable measure of spare space. It stores a sequence of proofs that consume disk space.

A test can be periodically applied by asking for the proof at a specific point in the sequence. The user looks up the data and sends it back within a certain amount of time to prove they really are storing that data. Because the user doesn’t know which proof will be requested, they must store all proofs in order to respond in the required time.

If the user is trying to cheat by deleting some proofs, they might be missing the proof requested by the tester. The user needs to recalculate that proof (which requires data from some previous proofs). If any of those previous proofs are also missing there becomes a cascading effect and the calculation may take a long time. The more gaps in proofs the stronger the cascade effect.

Since all previous proofs have an equal chance of being required in the calculation, it’s not possible for a cheater to choose which ones are ‘safest’ to delete.

Implementation

See https://github.com/iancoleman/proof_of_storage. The code aims to be well documented via comments so I won’t duplicate the details of the mechanism here. Happy to answer specific questions of course.

The algorithm fills the drive up using a common seed value (eg the first block in the datachain for this section) and a derivation scheme that relies on sha3_256 hashes and ed25519 signatures (both are critical operations used in the normal operation of vaults so the filling doubles as a cpu test for those aspects of the network).

The algorithm can be ‘stretched’ via several parameters to make it take longer to calculate each proof. This means a cheater who needs to recalculate data can be detected because they would take much longer than a simple lookup. But making it too long per proof means the initial filling could take prohibitively long. So there’s a balance between easy-to-generate but hard-to-cheat.

The Honest User

How does the network (or any tester) test a user for how much spare space they have? First they supply the seed that allows the user to fill their drive up with a sequence of proofs. The user calculates consecutive proofs until they’ve filled up the space on their drive.

Once the proofs are calculated the tester can periodically ask for a random proof to ensure the space is still consumed by those proofs.

For example, the tester may request the proof at index 1004. The user looks up their proof at that point in the sequence and supplies the data to the tester. The supplied data should be the same value the tester has for proof 1004.

Since the data for 1004 could only have been calculated if all previous proofs exist, the tester can assume the user has 1004 proofs (each proof is 1 MB in size so they have at least 1004 MB of spare space). Since the user didn’t know which proof would be requested, the tester can assume all of the proofs up to 1004 must currently be avaiable on the users drive.

Running the prototype proof of storage code in real life shows an honest user can fill their drive at a rate of approx

39 MB per second (using an intel i7-7700 desktop with m.2 ssd; 26s for 1 GB of proofs)

24 MB per second (using an intel i7-4500U laptop with sata ssd; 42s for 1 GB of proofs)

To get an idea of what this is like compared to just plain writing to disk, the desktop can store 212 MB of random data per second. So it takes about 5 times longer to fill a disk with this algorithm than plain random data.

If the user takes more than a certain time to respond (lookup + network transfer), they were probably spending that time recalculating a missing proof and can be suspected of cheating.

If the user takes just slightly longer than the maximum allowable response time, were they cheating or did they just have a bad network connection? Ideally this doubt can be removed by making any cheating take a long time to respond, far longer than any variation in time to do a look+network.

The Cheating User

Consider a cheater that stores only even-numbered proofs and discards all odd-numbered proofs. They might claim to have 100 GB but in this case would only be storing 50 GB.

If the cheater is lucky and is requested to supply proof 1004 (which they have not discarded) they send it and appear to be honest.

But if they are requested to supply 1005, the cheater would be missing that proof. They would need to recalculate proof 1005, which depends on the data in some previous proofs, leading to a cascading calculation of any odd-numbered-proofs in the dependency chain.

Taking data from the example algorithm:

1005 depends on 711, which would be missing so they must calculate it.
    711 depends on 389, which would also be missing, so they must calculate it.
        389 depends on 88, which they'd have.
        389 depends on 80, which they'd have.
        389 depends on 260, which they'd have.
        389 depends on 94, which they'd have.
        389 depends on 238, which they'd have.
        389 depends on 71, which would be missing, so they must calculate it.
            71 depends on 52, which they'd have
            71 depends on 28, which they'd have
            71 depends on 58, which they'd have
            etc... until 71 is fully calculated from 10 dependencies
        389 depends on 321, which would be missing etc...
        389 depends on 359, which would be missing etc...
        etc... until 389 is fully calculated from 10 dependencies.
    711 depends on 148, which they'd have.
    711 depends on 43, which would also be missing, so they must calculate it.
    etc... until 711 is fully calculated from 10 dependencies
1005 depends on 382, which they'd have.
1005 depends on 612, which they'd have.
1005 depends on 177, which would be missing so they must calculate it...
etc... until 1005 is fully calculated from 10 dependencies.

After fetching 10 dependencies (and their nested dependencies if any are missing) the proof of 1005 has been calculated and can be returned to the tester.

The nested dependencies means there’s a lot of work to do to recalculate the missing proof. If each proof takes a short while, the cheater will take too long to respond compared with just a simple fast lookup had they been honest and stored the proof like they claimed.

Concepts

If a user is 50% honest the network has a 50% chance of catching them on each test. If there are 10 tests then the chance of catching them is 99.9% (1-0.510).

The frequency of tests relates to the degree of honesty that should be expected.

The degree of honesty may change through time, possibly becoming extremely dishonest in a very short time.

Questions

Security

How much stretching should be applied (ie additional hashing / signing operations)?

What ratio of hashing:signing should be used for stretching? It probably depends on typical network load, since there will be more of one than the other.

Does verification of hashes / signatures have a role to play in the algorithm? Hard to see how it would since it doesn’t create any data so can’t be used for filling. But it would be nice to have since a lot of network activity is performing verifications.

How much depth should be used for the scavenging? (apologies for using lingo only appearing in the code).

Is it ok to have vaults prove a smaller amount intially and over time increase the amount they have available?

Can / should the stretching factor be dynamic?

Cheating

What degree of time / memory inversion is possible?

How can a cheater create an the optimum cheat-table? Something about the distribution of dependencies…

How much advantage do they get from this?

Does advantage from cheating compound over time?

Future proofing

How does the introduction of signature / hashing ASICs affect the algorithm?

How does this scale for users with a lot of spare space?

Can testers test vaults larger than themselves?

Broader context

How does this combine with measuring used storage to create a model for economics and incentives?

Is this any better than using sacrificial chunks?

Is consuming space with ‘useless’ data a good idea? (related question about bitcoin mining).

Intuition

My feeling is that network timing will totally dominate the response time, even if cheating happens. I think exploring what happens when the various consts at the top of the code are changed is important to better appreciate how to make cheating computationally expensive without compromising the time it takes for the initial generation.

What Next

I think the algorithm works well enough to be useful. I want to create a cheater and see how much tweaking it takes to be able to differentiate between honest and cheating.

Any ideas or mistakes would be really interesting to hear about.


#2

Well I guess it is in that regards, that less bandwidth is used, because the network only needs to send the seed to the vault-to-be.


#3

This would depend on the effects to the network verses the amount of overstatement. I suggest that a cheat who claims to have 100MB extra is not going to hurt the network anywhere as much as one who claim to have 1TB extra.

Thus I suggest the timing is based on a CPU that is 1000x the reasonable fastest today.

I suggest that 3 proofs are asked each time and spread across the claimed space. EG first is in first 30% second is in the 30-70% and third is in the last 30% range. This way you will reduce the statistical possible situation where your 10 tests (over a period of time) are all in the low part where it is easy.

Thus to get 3 right even with a 1000x CPU while overstating by a significant amount is going to be difficult. So the stretching only needs to fulfil the above. If you stay with the one proof required per test then you will need larger stretching.


#4

I think this would be good.

On that note if you fill the vault from the top down with the proofs then as the vault fills then the proofs will still be there. Proof #1 being the last before vault is “full”

The main advantage from cheating is disruption.

Another potential advantage is by cheating (a lot) is that the section potentially will raise the PUT cost later than it should giving some cheaper storage in that section. The true benefit is questionable and potentially cannot be taken advantage of in a practical way.

If you have multiple algorithms then this makes ASICs harder to make and more expensive. I’d worry about GPUs though.

Still remember that one expects at least 2/3s of vaults will be running non-cheating code. If we have more than 2/3s then spare cheating won’t be the thing we need to worry about. And with 2/3s non-cheating vaults then the effects of a cheater who uses a 10000 core super computer will basically be wasting his money big time.

If you fill the proofs from the last chunk towards the first chunk in the vault then the used space will be from the 1st chunk onwards. This also allows the vault to store the proofs once and the testing is only asking for proofs from #1 proof (last chunk) to amount of claimed free space.

And the used space + free space == vault space.

Only for checking free space for non-full vaults. In other words all vaults since a vault is only given 1/2? of its chunk space chunks to store, the other 1/2 is to cover merges etc so that the network does not become full just because a merge occurred.

Also it is better than sacrificial chunks because the bandwidth is greatly reduced. Sacrificial chunks requires the storage of the chunks which have to be sent to the vaults.

The principle is that the user supplies spare resources, so the space is spare and can be filled with anything the network deems fit. Related to this is the question of whether we should allow vaults to increase their size. I think that should be allowed and obviously a new set of proofs would need to be stored. But allowing vaults to increase space pretty much nullifies the negative issue of filling disk with useless data.

My gut feeling is that correct stretching and appropriate timeout setting then the cheater will be 10 or more times slower than slower response times. That is the cheater with significant overstatement of space and 1000x times faster CPU than the fastest today.


#5

Lots of little cheats are still as dangerous as one large cheat. Remember virtualisation is a double edged sword, it can be used by attackers too.


#6

Did you read all of what I said.

Who cares about VM. It doesn’t matter if its one VM host with 1000 VMs or 1000 PIs. Also you still don’t understand the issues with too many VMs on one machine. This was talked about just in the last couple of days, go read @mav’s first post that this one comes from.

Also your 10 or 100 or 1000 or 10000 little cheating vaults are spread across all the sections. They don’t all end up in one section. Also the load on your host computer will ensure that even the tiny cheater cannot respond quick enough ever.


#7

Great job! You do the picking of the “scavenged” blocks just like I imagined, by using the remainder of some pseudorandom series, seeded by the current block’s value. Wouldn’t it be better to use the end of the block for the seed though? Using the first few bytes may open it up for an attack. But I may be missing something about your algo and it may not be a problem.

Let me add one comment. As you may have noticed from my terminology, I like to distinguish between chunks and storage blocks; they probably shouldn’t coincide. For example, it would be wasteful to reserve a whole chunk’s worth of space for a 1 KB MD.


#8

The first storage block is seeded by something related to the section, so you can’t cheat by using more vaults only if you can put at least two in the same section. This may or may not be feasible, depending on the number of sections in the live network.

Done well, the algo can be storage bound enough that fast calculations, ASICs, etc can’t do much; one will need the actual space to cheat, but if they have the actual space it’s not cheating anymore.


#9

I don’t think it matters (see this comment for the reason). But perhaps using the end is safer because it ensures the client has stored the full proof and not just the start. But then you could argue all it shows is they stored the end of the proof.

Yes the terminology needs tightening. I think just ‘proof’ is a good noun for the piece of data being stored. Chunk is what a client stores on the network, block is what each part of the datachain is made from, and proof is neither one of those two things.


#10

I’d say you need to pick bytes throughout the since they potentially need to only store that portion for later proofs. Need to be somewhat random too so that they cannot store certain bytes either.

I think at this stage proof is fairly good, as what is needed for a “proof” may evolve as the discussions move on.


#11

Even better, the blocks could depend on random parts of random blocks determined by randomly chosen parts of the current block. It would bring up the space required for cheating closer to the space actually reported. EDIT I just noticed I basically repeated @neo’s point.

We’re talking about storage blocks in a very real sense here. Free storage blocks must always form a continuous list, even after deleting MD or cached chunks at random spots, and that requires managing allocation sort of like how a file system does it.

Is there a reason why not use really small storage blocks? Other than having to deal with fragmentation. MD chunks are expected to be real small and real numerous, so storing each in a block hundreds of times its size would be inefficient.


#12

I was under the impression that currently each chunk is stored as a file and the file system handles the actual storage.


#13

I know. I’m wondering if that’s still sufficient with this PoS proposal. May well be.


#14

Well you only have to prove that the vault software will actually store that amount of data in the claimed space. A cheater can always supply the space but at some stage just stop doing it. And that is what resource proof and punishing vaults that do not retrieve a requested chunk is about.

So I’d say that using the file system to store chunks as files is certainly good enough.


#15

I was thinking more along the lines of efficiency. File systems have additional overhead and that’s why we have a single swap file and not thousands of tiny swap fragments for example. But, as I said, it may or may not be an issue.


#16

When talking about proof of storage and randomness, the easiest solution what came to my mind is when validator asks for proof by sending ordered list of block IDs, and to prove ownership I have to compute something like, take first X bytes of first block B1 as a number(starting from position P0, first byte of block). Make modulo by size of next block( 1MB or less) => P2 (position of next X bytes in block B2), than take X bytes of B2 starting from P2, and repeat the process for P3. Return last X bytes from last block as a proof.

This way computation does not require to read all the blocks, just X bytes from each block. Random order of IDs of the block makes it impossible to cheat. There is nothing to cache, the only way is to store whole block. There is not cryptography involved, so it is really fast and requested list of IDs can be very long to prevent cheating by downloading those blocks on the fly.


#17

Edit: Tl;Dr Below more complicated proof of storage ideas which also requires little bandwidth. Also extra efficient because less work for testers and is even more difficult to cheat.


I haven’t (yet) read everything concerning these ‘proofs’, but I don’t get it completely.

What I had in mind: use the group consensus characteristics to prove spare space.

Have at group/section level a list of seeds (i call it seed numbers below) derived from datachain or random numbers.
If a vault has 200 GB spare space: divide this in 200 parts of 1GB and couple each part on 1 such ‘group’ seed number. But prevent that 1 such seed number is used twice for 1 vault. This 1GB I choose here out of convenience: the point is it doesn’t have to match with the size of the 1MB of the chunks in the ‘used’ space. Whatever is the size that strikes the best balance between a time consuming ‘filling’-algorithm and having checking flexibility.
If a vault needs extra space to store new chunks: take 1 of these 200 GB parts, so that only 199 GB testable free space remains. That the free part of the 1GB isn’t tested is ok: it is not much in comparison with the 199 and it will soon decrease because of new chunks that are created.
What the testers (e.g. the elder nodes) have to know: the ‘part index-seed’-table of each vault to check.
When a new vault enters the section, the vault says to the group how much space it has and gets such a ‘part index - seed’-table back from the group.
Now the vault can start filling the spare space using the seed for each part. The algorithm to fill should take enough time so that later, during size check tests, it can’t be used to cheat by quickly creating a part a vault claims to have, because it takes too much time. And of course the resulting data on the spare disk space part after the filling algorithm is done, has to be always the same.
Maybe an extra ‘fill algorithm type’-field could be added to this table to mix different algorithms or for future, improved algorithms.
The group can make it so that the different vaults share parts with the same seed number: e.g. vault number 1 has seed number = x for part with index = 1, vault number 2 and 3 have also seed x for part 1 (or another part), but vault number 4 has no part with seed x.
Why all this trouble draw, you could ask yourself by now (if you are still reading this). Well this way a tester (e.g an elder node) could check multiple vaults at once, without having the data to check himself.
A tester could send a ‘check space’-command at a certain time to all/a lot of other, to be tested vaults of his section, so that they will calculate a hash for each (or at least a lot) of the spare space parts they have.
This command will also contains a random ‘salt’. Each tested vault has to calculate a separate hash for each spare part it has. To be clear no ‘linking’: not using the hash result of other parts to the hash calculation: only the ‘salt’ and the contents of the part.
When the tester gets the hash results back it can compare e.g. the hash of spare part with index = 1 and seed number = x of vault number 1,2 and 3. If this hash of this part is the same for vault 1 and 2, but for 3 it is different, then vault 3 has failed the spare space test. This way each part that is present at multiple vaults (because it has the same seed number) can be checked.
This is also scalable: the more free space a vault has, the more parts it has to provide a hash for in 1 reply to a ‘check space’-command, which also means only 1 round trip network message latency to take into account when calculating the maximum allowable time to respond…
You could make it so that the more spare parts a vault has to ‘proof’ this way, the more time a vault gets to respond to the ‘check space’-command.
If there are tested Vaults with unique free spare parts (with seed number that no other vault in the section has): no problem.
Make sure the tested Vault doesn’t know which parts will be checked and which will not.
Worst case (like in the example above if the hash of the part with the seed number x of the Vaults 1, 2 and 3 are all different) a tester can generate a to be tested part himself by generating it with the seeding algorithm and doing the hash check. Of course the tester gets the necessary time to generate the part to be tested.
Why the arrangement that not every vault’s spare space has exactly the same seed number for e.g. the part with index = 0 ? --> To prevent some cheating vaults in 1 section that could fake spare space by sharing 'proof ’ via a backchannel connection. If this a too unrealistic scenario, this part can be left out of course, to keep it as simple as possible.
My idea also has the advantage to cope with a constant in- and outflow (churn) of vaults.
I hope this has made a bit of sense :wink:


#18

As usual, thank you mav for presenting your ideas so succinctly and help us focus on important issues… lot’s of very interesting ideas here. However, I don’t think the timing of proofs are as appealing as the good old fashioned “sacrificial chunks” from yesteryear…

I don’t think so.

I think the notion of spare or “free space” in the network is something that should be shied away from, (just like the notion of free time :wink:) In other words, a network with lot’s of free space is a network that is not maximizing its use of available resources. Instead, why not use this excess capacity by filling it up with many sacrificial chunks for maximum redundancy rather than special “proofs” or other objects whose only purpose is to check free space. When a vault joins the network, the vault managers can begin sending it sacrificial chunks until some stopping criteria is reached or an out of space condition occurs and the vault cannot store any more. This proves the the vault’s storage capacity to the network. Sourcing those sacrificial chunks will require older nodes to prove their own storage is valid. The number of sacrificial chunks serves as a distributed indicator of available network resources. Assuming that 33% of the nodes are corrupt, then the network will need a minimum of 12 copies of a chunk to ensure that 8 copies are valid. Giving a little extra margin, elders could just refuse to let new vaults join when there are 16 or 24 copies of all the primary chunks in a given section, which is an easy way to control the growth of the network and control GET rewards and PUT costs. When new PUTs are made to the section, some of the sacrificial chunks get overwritten, which would trigger the section to accept new vaults if the total number of copies for a chunk drops below the 16 chunk threshold. Sacrificial chunks have a lot going for them.

In my view, the network should never be idle. Section managers would continuously be requesting chunks from vaults to ensure the chunk is being properly stored/managed by the vault even if there were no human users doing so. The sacrificial chunks would continuously swirl and flow between nodes in a section, and I would presume that any vault/node who is unable to deliver would quickly be found out and removed via standard network error checking mechanisms already planned during the chunk retrieval process. I think this process should be transparent to the vaults and look no different than standard user puts or gets for data. Perhaps this automated verification rate would occur at some inverse multiple of human get/put requests in order to maximize the user experience while also maintaining maximum network load? Perhaps there was a reason that sacrificial chunks fell out of favor that I am unaware of?


#19

That would be nice if bandwidth was unlimited. Once I dedeicated HDD (or part of it) to be a vault, I dont care what is there and how full it is, but with bandwidth it is diffrent. With most ISPs you will run in problems when you use your line 24/7 at 100% even the “unlimited” plans are often “not so unlimited”.
I think many vault hosts will run into problems with their ISPs even without this and as much i like the idea of increased redundancy, we should try to keep the bandwidth required to run a vault low. If running a vault will need too much bandwidth, it will lead to centralization to datacenter hosts, where the connection is cheap and close to unlimited.


#20

I think the biggest issue with this is that it’s hard to generate and hard to validate afterwards. (bitcoin mining is hard to generate, as you need to brute force the block hash, but easy to validate as you only need to hash the block once.)

How would this work? Each section would have it’s own seed (thus all vaults in that group would at least start generating the same proof’s). And the tester would download/“request” a proof from the testee?

  • Would that imply that only a node with more “free” space could test a node with less “free” space? resulting in not testable proofs. Or you could store the proofs in a decentralized manner but that would be just a reimplementation of the “normal” SAFE PUT/redundancy procedure and would generate the same amount of traffic, but store useless data instead.
  • How would this prevent that the tested vault could ask another vault for that specific proof? That’s not an issue for GETs because of the timing, just the fastest vault gets the reward, thus there’s no time to proxy request data from another vault. I guess this could be mitigated by using a deterministic pseudo random source (eg. hash of the last block) to determine the testee and the needed proof, thus we could identify invalid requests.