New data type: Free Data with PoW signature


#21

I can’t follow this. How exactly could the bad guy steal the hash from the good guy? It’s different for each piece.

It can be made valid for just a few minutes (as I outlined previously), thus pre-computing it or replaying it is impossible.

Checking if a PoW is valid is cheap. Creating it is expensive.
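To make the asymmetry concrete, here is a minimal hashcash-style sketch. It is only an illustration, not the scheme from the proposal: the use of SHA3-256, the difficulty, the validity window and all function names are assumptions. The timestamp is part of the hashed payload, so a proof is tied to one blob and goes stale after a few minutes.

```python
import hashlib

DIFFICULTY_BITS = 12    # hypothetical: number of leading zero bits required
VALIDITY_SECONDS = 300  # hypothetical: "a few minutes"


def pow_digest(data: bytes, timestamp: int, nonce: int) -> int:
    """Hash the blob, its timestamp and a nonce into one big integer."""
    payload = data + timestamp.to_bytes(8, "big") + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha3_256(payload).digest(), "big")


def create_pow(data: bytes, timestamp: int, bits: int = DIFFICULTY_BITS) -> int:
    """Expensive side: try nonces until the digest falls below the target."""
    target = 1 << (256 - bits)
    nonce = 0
    while pow_digest(data, timestamp, nonce) >= target:
        nonce += 1
    return nonce


def verify_pow(data: bytes, timestamp: int, nonce: int, now: int,
               bits: int = DIFFICULTY_BITS) -> bool:
    """Cheap side: one freshness check plus a single hash."""
    fresh = 0 <= now - timestamp <= VALIDITY_SECONDS
    return fresh and pow_digest(data, timestamp, nonce) < (1 << (256 - bits))
```

Creating a proof takes about 2**DIFFICULTY_BITS hash attempts on average, while checking one takes a single hash, whatever the difficulty.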


#22

Because the bad guy has insider information (they run nodes too).

But the free PoW method still needs more code in the network and more code in the client, and every client has to spend extra energy for every blob stored. This will represent a lot of real $$$ worldwide, and for what? A system that opens up a new attack vector that didn’t exist before, one that could see millions or billions of data blobs stored over time (never deleted, since it’s an attack) in selected sections, causing massive problems for the sections trying to maintain those blobs as nodes go offline and online. And all done from relatively few desktops or datacentre machines.

If you just pay for the blob, you save that extra code and only need the smaller code that subtracts the PUT cost from the user’s PUT balance (very quick, and not a safecoin transaction). NOTE: the consensus to store the blob also covers reducing the PUT balance. I showed above how PoW requires more network-side code time (i.e. the consensus to send a PoW request, the consensus to check the hash, and the consensus to actually store the blob). This is without considering the extra code time for the client.

It’s unnecessary complexity in pursuit of the dream of free (not really free in $$$ terms) blob storage.


#23

I agree with this … if adding complexity isn’t more than compensated for by a strong benefit then it’s probably not worth it. It’s arguable whether free puts are a game changer in terms of extra adoption, but that can’t be proven one way or another except in a live test IMO … so … but even if free isn’t the best option, the other parameters of your new data type are still interesting, Joe.


#24

Very much so and I realised I had not said so in direct terms.

I think data blobs could be developed into a very useful data store. We need to discuss whether they would provide a better store than a plain MD, which can be very small if desired; the differences are that MDs cannot be refunded when “deleted” and that they have additional functionality that data blobs lack.


#25

Already lots of discussion that duplicates my own thinking but this is my take on the proposal.

This is why I think it’s so important to design vaults in a way that allows everyone to run a vault. Aren’t vaults an ideal proof of work and safecoin an ideal encapsulation of it? Why would we ‘just add proof of work’ to prevent spam which is really a big step backward from ‘just run a vault’. I would like (in an ideal world) to see ‘run a vault’ be the canonical answer to spam problems. I think it’s possible.

I don’t like the high-complexity-to-store-forever option. This would almost definitely create strange dynamics between the proof mechanism and safecoin farming.

Part of this feature that appeals to me is the impermanence. I think the permanent nature of all data on safe is not something people are ready for yet, but I also think it’s inevitable and we have to learn to deal with it, so why not now.

I would be interested to see more examples of use cases. I’m not really sure how this would be used so I’m not really sure how to reason about it. I guess there are loads of intended uses, unintended uses, and malicious uses. I’m especially curious about unintended uses.

I’m curious what your intuitive sense is for how much cache most vaults will have available? I think there could be quite a lot of caching (maybe 100x chunk store size) since cached responses will ‘block’ farm attempts by other sections. But maybe it won’t be worth the cost? Add freedata space as well… it might begin to really distort the purpose of vaults. Always good to think about this from different perspectives and I think you offer a really valuable perspective with this idea.

My gut feeling is safecoin faucets in the early days will be a better mechanism. Early adopters have a lot of reasons to get more people involved, so the cost benefit of giving vs keeping a coin to grow the ecosystem is (to me) pretty clear. It’d be great to have a technical rather than altruistic/capitalist solution but this particular freedata proposal is for some reason not getting me there.

This is based on research from 80000 hours - from the Multiplier Effect section: “For whichever actions are highest-impact, it’s always even more effective if you can mobilise more people to take them.”

Maybe the point of the attack is simply to make the cost high? If this data type is used by many apps then it may potentially be a way for one app to raise the costs for competitors.

Maybe the vaults can communicate between themselves to organise an out-of-band (oob) solution to this problem. Not all vaults would necessarily participate, but I can imagine vaults sharing a list of oob options they’re currently offering. One of those oob options might be ‘free storage’. Even though it’s not a formal part of the safe network, the oob layer piggybacks on the existing safe trust layer to deliver a second set of functions at a different (lesser) level of trust and guarantees.

This could be a good way to run experiments if the protocol for sharing oob solutions was standardised.

Broken record time: EXPENSIVE_HASH should be derived from the common operations for vaults (e.g. sha3-256 or ed25519 signatures) so any optimization can be used directly for improving the end user experience via better vaults.

I guess this can be generalized to ‘PoW must work with a spread of 10,000 times difference in computational power’ - and this spread is only going to grow in the future. How much spread do we tolerate and how is it managed? That’s a tough question. Do we exclude the ‘poorest’ users? Do we empower the ‘richest’ users? IMO proof of work is the wrong solution to the spam problem mainly because of inequality of compute power across the user spectrum, which is only going to get worse into the future. Is safecoin a better equalizer?
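A rough sketch of what that spread means in practice, assuming a plain hash-based PoW where a difficulty of n bits needs about 2**n attempts on average. The two hash rates are hypothetical, chosen only so that they differ by a factor of 10,000.

```python
def expected_pow_seconds(difficulty_bits: int, hashes_per_second: float) -> float:
    """Average time to find a valid nonce: about 2**bits attempts are needed."""
    return 2 ** difficulty_bits / hashes_per_second


# Hypothetical hash rates, 10,000x apart (not measured on real devices).
PHONE_RATE = 1_000.0
SERVER_RATE = 10_000_000.0

phone_seconds = expected_pow_seconds(20, PHONE_RATE)
server_seconds = expected_pow_seconds(20, SERVER_RATE)
```

With these made-up rates, a 20-bit difficulty that costs the server about a tenth of a second costs the phone over 17 minutes, so no single difficulty suits both ends of the spectrum.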

Everything can be implemented on an ASIC, but the amount of improvement varies depending on the algo. Given enough incentive, ASICs are inevitable. I’m 99% confident the crypto primitives of SAFE will be implemented in ASICs within 2 years of launch.

What is the cost of managing a 10,000x difference in compute power to make the hash algorithm broadly useful for all users, vs the cost of PARSEC consensus on an equally available safecoin token mechanism? The complexity is not just algorithmic or consensus oriented, it’s also a sociodemographic / governance / equality complexity. Maybe PARSEC is insanely efficient and the equality it brings is much better than any alternative approach…?

I’m not convinced yet that we can assume a hash check is simple compared to a payment transaction.

Hmmm. I think you’re overstating the simplicity of the PoW option. If sections have different difficulties then don’t they need to communicate and agree on that difficulty between each other to know if the freedata is valid? Just a minor niggle.


#26

Actually the PUT cost is subtracted from the user’s account PUT balance, and I believe this is part of the consensus for actually storing the data chunk. So the free PoW system has 2 more consensus parts (one for asking for PoW and one for checking the result). It may be possible to combine the PoW check with the storing consensus, which would still mean at least one extra consensus for a free PoW store.

Also a PUT cost is NOT a safecoin transaction either. As I said above it is simply subtracting the PUT cost from the PUT balance as part of the store procedure/consensus.

So the divide in complexity is even greater for PoW versus charging a PUT cost.


#27

I did a little spreadsheet to work out the maximum time the PoW function can take before the energy cost in $$$ exceeded the charged amount. And from that the minimum number of blobs that can be created by an attacker per period.

Some of the assumptions/parameters I used. The cost of storage should actually be reasonably lower than I suggested since a large portion of the storage is spare capacity and not new gear.

  • Storage based on the cost for a high performance desktop 10TB drive. I would reasonably expect the fiat cost of storage to work its way to be lower than this as the farmers will be happy for lower amounts due to them using spare resources.
  • Power costs based on my local cost. Others may be lower or higher. This is after all a ball-park calculation and not precise, although it should be close.
  • The maximum time for PoW to run is calculated so that the energy for PoW equals the cost to store.
  • Since the blobs are very small, it would be reasonable to expect a special charging rate appropriate for the maximum data size of 1/2K as specified in the OP.
  • Then the number of blobs being stored per second is based on the maximum time allowed for PoW and the energy cost to be the same or less than the blob-put cost.
  • From the calculations it is seen that the server time gives the lowest of these maximums
  • I did not consider the time required for a phone IF the PoW used is tailored for the server rather than for a low end phone. It will be much longer than the equivalence point and thus be more costly in energy than just paying for it. The point of this exercise was to determine the rate at which blobs could be generated for storage.
  • The actual PoW calculation is not needed, since we only need to specify the maximum time it can take for energy equivalence to the blob-put cost.
  • If the PoW time is actually made faster than determined in the calculations for equivalence then the rate of blob generation will be faster making the attack stronger.

From this simple calculation it would not be difficult to overload the network with blobs that are free to store, while trying not to cost the users more in real $$$ (energy cost) than blob-put cost expected.
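The break-even logic from the list above can be sketched in a few lines. This is not the spreadsheet itself: the function names are made up for illustration, and any prices, wattages and worker counts fed in are placeholders rather than the figures I used.

```python
def max_pow_seconds(put_cost_usd: float, watts: float, usd_per_kwh: float) -> float:
    """Longest PoW run whose energy cost stays at or below the PUT cost."""
    usd_per_second = (watts / 1000.0) * usd_per_kwh / 3600.0
    return put_cost_usd / usd_per_second


def attacker_blobs_per_second(pow_seconds: float, workers: int) -> float:
    """Blob rate when each worker finishes one PoW every pow_seconds."""
    return workers / pow_seconds
```

The shorter the break-even time, the more blobs each attacking worker can push out per second, which is why keeping PoW cheap for honest users also keeps it cheap for the attacker.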

EDIT: This is the sort of logistics that needs to be considered with “free” data. Using PoW to make it free may seem like a good idea at first, but without even knowing the details of the PoW we can see that there is no sweet spot for the amount of PoW that makes it work without allowing attacks to succeed.


#28

That’s one reason. Others I had in mind:

  • Some data only matter when fresh (yes, everything can be logged and preserved for eternity, yet some stuff is just not worth the space after some time). This is handled by explicit expiry.
  • The value of some data is unknown at creation time, as it gets established by people actively requesting it (and then it may fade away the same way). This is handled by the cache-only storage model.

No attacker could make the cost significantly higher without making the attack very costly for themselves as well. You need to recognize that the attacker’s cost grows faster than linearly: he can only raise the cost by generating higher volume, but then he pays not only a higher price but also for a higher volume. This becomes prohibitively expensive very quickly, enough that the very idea of launching such an attack becomes stupid.
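A toy model of that argument, assuming (purely for illustration) that the unit price rises linearly with the volume being pushed into the network. The pricing function, the base price and the capacity figure are all hypothetical and not part of the proposal.

```python
def attack_cost(volume: float, base_price: float = 1.0,
                capacity: float = 1000.0) -> float:
    """Total cost when the unit price rises linearly with volume (assumed)."""
    unit_price = base_price * (1 + volume / capacity)
    return unit_price * volume
```

Under this assumption, doubling the volume from 1000 to 2000 blobs triples the total cost (2000 to 6000 units), because the attacker pays the higher price on every blob, not just on the extra ones.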

You can make the hash memory bound enough to make ASICs infeasible, because the kind of memory (SRAM) that can be integrated on an ASIC takes up a lot of space.
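As a sketch of what a memory-bound hash could look like, here is the same nonce search built on scrypt from Python’s standard library (available when Python is built against OpenSSL 1.1 or later). Choosing scrypt, the parameters and the difficulty are my assumptions for illustration, not something specified in the proposal.

```python
import hashlib


def memory_hard_digest(data: bytes, nonce: int, n: int = 2 ** 10, r: int = 8) -> bytes:
    """One scrypt evaluation; needs roughly 128 * n * r bytes (about 1 MiB here)."""
    salt = nonce.to_bytes(8, "big")
    return hashlib.scrypt(data, salt=salt, n=n, r=r, p=1, dklen=32)


def find_nonce(data: bytes, bits: int = 4) -> int:
    """Search for a nonce whose digest has `bits` leading zero bits."""
    target = 1 << (256 - bits)
    nonce = 0
    while int.from_bytes(memory_hard_digest(data, nonce), "big") >= target:
        nonce += 1
    return nonce
```

Raising n raises the memory needed per attempt, which is the knob that would make on-chip SRAM the bottleneck, at the price of also excluding devices with little free memory.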

Another thing to realize here is that, unlike in crypto, this PoW is not about making money, so the only incentive to make these ASICs is launching a DoS attack. That may not be enough to justify a multi-million dollar development whose cost will never be recovered, since there is no demand behind your circuits that would let you scale up.

Here’s the thing. If Safecoin microtransactions will indeed be available at an insanely unimaginable rate and at a similarly low network/etc cost, then there’s really no reason for PoW.

The thought process behind PoW was that paid PUTs work with chunks ranging from several KB (pebbles) to one (or more?) MB (rocks), and maybe the same elaborate orchestration is overkill for 200-500 bytes (sand). Maybe it isn’t, but it’s worth thinking about.

How exactly would you do that? The balance is not floating in the air, accessible from wherever; it’s also handled by a section, different from the section where the data will be stored (unless by accident).

So, we have 2, not 1, sections to work with. PoW, on the other hand, would need only one section to deal with.

Nobody asks for PoW, it’s part of the original request. There’s no checking for the result either, as it’s part of the response exactly the same way as (I suppose) any other storage request gets a response.


So much for now. I’m terribly sorry but I’m quite busy these days and I could only skim through the developments on this thread. Thanks for all the great contributions, I’ll try to go through them in detail once I can breathe again.


#29

I can help a little here, @JoeSmithJr. When a user charges up their account, they give a safecoin to the account holder group (or a proof of burn). The account holder group then reduce that safecoin cost in the users account by the price of a PUT in real time with every PUT. This is made simpler by the fact that in the near future clients will connect to the client manager group instead of a proxy node. This is more secure, but the network can then monitor activity as well, so spammy clients may get banned etc. It also allows push notifications and more, such as deducting fractions of a safecoin even when no small safecoin denominations are possible elsewhere in the network.
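A toy sketch of that bookkeeping, to show how little is involved on the payment side. The class, the method names and the exchange rate are hypothetical, and consensus between group members is left out entirely.

```python
class AccountHolderGroup:
    """Toy stand-in for the group that tracks each client's PUT balance."""

    def __init__(self, puts_per_safecoin: int) -> None:
        # Hypothetical exchange rate: how many PUTs one safecoin buys.
        self.puts_per_safecoin = puts_per_safecoin
        self.put_balance = {}

    def charge_up(self, client: str, safecoins: int) -> None:
        """Convert whole safecoins into PUT credit for the client."""
        credit = safecoins * self.puts_per_safecoin
        self.put_balance[client] = self.put_balance.get(client, 0) + credit

    def pay_for_put(self, client: str, put_cost: int = 1) -> bool:
        """Deduct the PUT cost; refuse the PUT if the balance is too low."""
        if self.put_balance.get(client, 0) < put_cost:
            return False
        self.put_balance[client] -= put_cost
        return True
```

Because the balance is kept in PUT units, each deduction is effectively a fraction of a safecoin without needing small coin denominations.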


#30

Complete goodbye to the proxy-nodes???


#31

As proxies, yes, that is likely. There will be bootstrap nodes though, as it is quite possible that your whole client manager group has changed since your last visit and you will need a way to get back online. So bootstrap nodes will still be there, but they will do very little.


#32

So then it’s simply a matter of working out shortcuts to the PoW, seeing as the attacker can choose all their input data. PoW is hard when the one doing the PoW does not control too much of the input data. If the attacker is allowed to generate all the input data, shortcuts can be found to shorten the time; reusing partial results across multiple blobs is just one way.

Then you exclude the many devices that do not have much free memory. An ASIC product can be a PCI board with one ASIC per CPU core on it, using the desktop’s memory, which lets that simple desktop run as many blob-generating threads as it can.

ASICs can be cheaply made in China. Making custom ASIC chips is really not all that hard for anyone who has worked with them before and who uses existing ones as a starting point. Personally I could not be bothered, and I haven’t worked with crypto ASICs.

Since this attack is not resource heavy (i.e. a few machines can destroy the network), if a large organisation wants to destroy SAFE’s performance then the cost to develop an ASIC is nothing. It puts a new meaning on the “Google” Attack topic.

Also if you looked at my analysis, it does not matter one iota how the PoW works: the moment you want free blobs to cost no more in real $$$ (energy) than paying for them, the attack is simple and highly destructive. If you make the PoW very difficult (long to generate) then it would single-handedly become the largest energy user of any single PoW in the world once SAFE became global. This is because the time required to prevent attacks makes the PoW a heavy load, multiplied across 2+ billion users and all the 1000s to 100000s of APPs that use blobs.

Read David’s post: PUT transactions are as simple as subtracting a value from a balance. “The account holder group then reduce that safecoin cost in the users account by the price of a PUT in real time with every PUT. This is made simpler by the fact that in the near future clients will connect to the client manager group instead of a proxy node.”

So not correct because paying for blobs is cheaper than the extra consensus to validate the PoW.


Logically though, a PoW that does not cost more in energy $$$ than paying to PUT will allow massive attacks on the network.

Just pay for the blobs upfront (user or APP) with refunds on delete and then it becomes a very interesting and potentially useful data type.


#33

I have two pictures in my mind, so one of them is definitely incorrect: :sweat_smile:

  1. My client is connecting directly to the group that holds my account (i.e. client manager group == account holder group). So, when I PUT a chunk, the request goes right through the account holder group and they can adjust my balance before they forward my data towards the section that would store it.

  2. My client is connecting to a group (referred to as the client manager group) but my account is on another group (referred to as the account holder group). When I PUT a chunk, the client manager group either just forwards the request “as is” to the account holder group to handle it, or it coordinates with it and, once it receives notice that the balance has been adjusted, sends the chunk towards its section.

In either case, I may have to concede that payments are not that bothersome.

That’s not actually true. At the risk of going a bit off-topic, let me try to dispel the myth that ASICs are the cheap magic tool people think they are.

Their cost is made up of a freakishly high base cost for the design and the masks, and a much lower variable cost for the wafers. The only way to get cheap ASICs is by having enough demand to offset the base cost. The application we’re talking about isn’t such.

The first half of the base cost, the design, is basically just software development, but on multiple levels, each requiring a specific set of tools and skills:

  • an ASIC at its core is just a program like any other, with some limitations implied by having to be translated to a physical thing, and it’s usually written in one of the two popular languages for this purpose;
  • a logic node list generated from the above (gates and the like),
  • mapping the above to physical components such as transistors (you can’t even do this without having a contract with a foundry, complete with NDAs and the like, to get the library),
  • verification for all these steps,
  • a bunch of other stuff I forgot.

Much of the above needs not only expensive software but also expensive talent.

The masks are the hardware component of the design, and they will set you back by tens of thousands (super old tech) to hundreds of thousands (not so old tech) to millions (current tech) of dollars. And that is after you already have a design, which again can cost in the range of hundreds of thousands.

You can of course get a few circuits for a few thousand dollars through “shuttle” services (e.g. by TSMC) or at the edge of other production wafers, a bit like cube sats. These options are often used for prototyping and for verification before the final tapeout, which again may give you an idea of just how expensive merely designing an ASIC is.

We haven’t addressed the problem about memory bound calculations. SRAM is big (costly) and you can’t put DRAM on an ASIC because it’s a different technology.


#34

And it still boils down to one of logistics in that if you keep PoW cheaper than energy $$$ cost then the attack is easy as pie to do. You have to make the PoW very expensive and then storing Blobs is no longer free. The attack is still easy even if PoW $$$ energy cost is many times blob PUT cost, so it has to be a lot higher in $$$ cost. Free as in no safecoin but costly in energy cost.