Rewardable Cache

This is an idea that sort of combines Perpetual Auction Currency (PAC) and algorithmic approaches to the reward mechanism. It allows users to express some choice (like in PAC) but within a totally determined and fixed reward environment (as with algorithmic mechanisms).

Cache (and sacrificial chunks) is replaced by the concept of derived chunks. A derived chunk is named something like hash(chunk_name) for derivation depth 1, hash(hash(chunk_name)) for derivation depth 2, etc, derived as far as you like.
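A minimal sketch of the derivation, assuming SHA-256 stands in for the network's hash function (the real name and hash types would come from the routing layer):

```rust
use sha2::{Digest, Sha256};

/// Derive a cache name by hashing the chunk name `depth` times.
/// Depth 0 is the original chunk name itself.
fn derive_name(chunk_name: &[u8; 32], depth: u32) -> [u8; 32] {
    let mut name = *chunk_name;
    for _ in 0..depth {
        let digest = Sha256::digest(&name);
        name.copy_from_slice(&digest);
    }
    name
}
```

Verifying a claimed derivation is then just re-running this up to the claimed depth and comparing names.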

If the derived name is close to your name then you can (but may choose not to) cache that chunk under the derived name for possible future reward. (You can cache it even if it's not close to you, but you can't be rewarded unless it's close to you.)

The node may use that derived chunk to respond to requests (thus shortening the route). This is the same as the currently designed behaviour of cache.

But the advantage to the node is that they may be rewarded for this action. For example, if signature(original name + derived name + depth) crosses some difficulty threshold, they may be eligible for reward.
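The exact construction is open, but as a sketch (hashing the signature and counting leading zero bits is my assumption for the difficulty test):

```rust
use sha2::{Digest, Sha256};

/// Hypothetical lottery-style eligibility test: hash the claim signature
/// and require at least `difficulty` leading zero bits.
fn meets_difficulty(signature: &[u8], difficulty: u32) -> bool {
    let digest = Sha256::digest(signature);
    let mut zeros = 0u32;
    for &byte in digest.iter() {
        if byte == 0 {
            zeros += 8;
        } else {
            zeros += byte.leading_zeros();
            break;
        }
    }
    zeros >= difficulty
}
```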

The PAC element of this design is that the reward amount is proportional to the depth. So you may get the full reward for depth 0, half the reward for depth 1, a quarter for depth 2, etc.
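ie something like (assuming the reward is in indivisible integer units):

```rust
/// The halving schedule described above: full reward at depth 0,
/// half at depth 1, a quarter at depth 2, and so on.
fn reward_for_depth(base_reward: u64, depth: u32) -> u64 {
    base_reward.checked_shr(depth).unwrap_or(0) // base_reward / 2^depth
}
```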

This means nodes can choose the depth that works best for them. If they have lots of spare resource then they can cache deep; if not much, then cache shallow. And every reward that's issued also contains the depth, so the network knows how much spare resource there is by taking stats on the depth of all rewards. Nodes are essentially voting with the depth. (OK, this isn't really that much like PAC, but it does have some voting aspect to it; the vote is just not a competitive one.)

If there are lots of nodes being rewarded with high derivation then spare space is abundant and perhaps fewer nodes should be allowed to join. If rewards are mostly very low derivation then maybe more nodes should be allowed to join. Perhaps a target depth of 2 for the 75th percentile can be used for the allow / disallow rules, ie aim for 75% of rewards to be from depth 0 or 1 and aim for 25% of rewards to be from depth 2 or more.
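A sketch of that rule, assuming the section keeps a recent sample of reward depths (the function name and the sampling are mine, not part of any design):

```rust
/// Allow new nodes to join only while rewards are mostly shallow,
/// ie while the 75th percentile of recent reward depths is below 2.
fn allow_new_nodes(mut recent_reward_depths: Vec<u32>) -> bool {
    if recent_reward_depths.is_empty() {
        return true; // no signal yet, so default to allowing joins
    }
    recent_reward_depths.sort_unstable();
    let idx = (recent_reward_depths.len() * 3) / 4;
    let p75 = recent_reward_depths[idx.min(recent_reward_depths.len() - 1)];
    // Deep rewards dominating means spare space is abundant: disallow joins.
    p75 < 2
}
```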

Advantages:

  • Data is even more widely distributed, not in an arbitrary way but in a cryptographically verifiable way.
  • A measurement of spare space is available, and also a measure of the value of that spare space.
  • Nodes can decide how much they value their spare space by how far they derive. They can also decide whether they derive nothing, a little, or a lot. They get to decide how much is useful and how much becomes wasteful. This can change over time and be detected by the network, because the reward distribution to that node would also change.
  • Invalid derivations or reward claims can be detected and punished.

Considerations:

Is on-the-fly derivation damaging? I think not, but it's worth considering. The intention is to derive-then-cache, not to derive-then-discard.

What info should be used in the derivation? Should it be the chunk name or the chunk data or both? Should it be unique per node (eg add the node name to the derivation) or should it be universal?

What is the deepest derivation allowed? If it's very deep then it could waste time when verifying whether the derivation is valid. I'd say probably 10 is deep enough…?

Is it robust against replay attacks? How can that be achieved? (Which is a good question in general for the reward mechanism.)

Because nodes get to decide whether or not to claim a reward, for high derivation depth they are deciding between a) getting a small individual reward, which is good-for-me, while making it easier for others to join, which is bad-for-me, vs b) forgoing an individual reward, which is bad-for-me, while making it harder for others to join, thus keeping my 'monopoly' position, which is good-for-me. This is possibly a messy mixed double-function combining reward-vs-resource and reward-vs-exclusion-rights. I'm cautious about how closely to link the disallow rule with the decisions of existing operators (ie the decision to be rewarded or not). I think in reality it's not too big a problem, but I'm being very cautious in my approach. This is probably impossible to avoid, since all events are ultimately triggered by human decisions.

What quantity of safecoin to reward?

Does this sound like a useful feature? I'm keen to hear your thoughts.

17 Likes

Thinking of just an ordinary user, say grandma, how does this choice present itself? I assumed if it's algorithmic then, say, the depth could just be set appropriately to the proportion of dedicated space you select for your vault. Is it more complex than that, or could it be that simple and still be effective?

Which, I feel, answers most of my question.

I like this. Provably more physically secured network data.

This seems useful, and I know I've seen you and David talk about network health metrics before; I reckon this would be a huge part of that.

This sounds messy. Maybe I'm missing something, but why have the choice of being able to claim the reward or not? Why not just enforce claiming the reward, so it's easier for others to join, which is better for the network's health and everyone's data? It almost seems like choosing between positive reinforcement for being good and a lack of guidance for being selfish. I get that you aren't being rewarded for being selfish, but you aren't being punished either, and you may still be benefiting in some way, which seems exploitative and a drain on others.

Not sure how helpful this feedback is, but it sounds interesting. By the way, this is meant to accompany or complement the farming reward mechanism, correct?

2 Likes

I've always liked the idea of rewardable cache. Interesting ideas above, but to start it might be easier to simply weight the farming reward for all chunks. A weighted distance metric based on the xor distance between a chunk and the vault that supplies it would work nicely. (Client wait time or another latency measure could also be considered in the weighting metric.) For example, consider a GET request for a chunk where the 8 storage vaults return it to the requester, in addition to 8 cache nodes. These 16 vaults would all share the current GET reward for a single chunk. The closer the vault is to the chunk and/or the client in "xorspace-time", the higher its relative share. A rough sketch of the weighting is below.

The practical application of this would likely require some extra signing and extra encryption, like you mention above.
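Something like this, with hypothetical names throughout, using the length of the shared xor prefix as the closeness weight:

```rust
/// Split a GET reward between the responding vaults, weighted by how
/// close each vault's name is to the chunk's name in xor space.
fn reward_shares(chunk_name: &[u8; 32], vaults: &[[u8; 32]], total_reward: u64) -> Vec<u64> {
    // Closeness score: number of leading zero bits of (vault XOR chunk),
    // ie the length of the shared name prefix. More shared bits => closer.
    let weights: Vec<u64> = vaults
        .iter()
        .map(|v| {
            let mut zeros = 0u64;
            for (a, b) in v.iter().zip(chunk_name.iter()) {
                let x = a ^ b;
                if x == 0 {
                    zeros += 8;
                } else {
                    zeros += x.leading_zeros() as u64;
                    break;
                }
            }
            zeros + 1 // +1 so even the farthest vault gets some weight
        })
        .collect();
    let total_weight: u64 = weights.iter().sum();
    weights
        .iter()
        .map(|w| total_reward * w / total_weight)
        .collect()
}
```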

2 Likes

I always thought the cache was covered in the farming rewards.

We must be mindful that creating multiple separate rewards will mean a higher network load compared to incorporating them into the farming rewards as just one reward system.

4 Likes

That is exactly what I was recommending above.

2 Likes

Yea, and I agree the vaults with the chunk should share in the rewards if they successfully retrieved the chunk or were able to. I have suggested this in the past: 1/2 shared between nodes and the other 1/2 to the successful node (actually two parts to the equation).

But I doubt it's good to include the nodes in the pathway, because it is a potential problem for anonymity: the return path is then mapped out in the system, which poses a potential avenue for hacking to disclose who requested what.

2 Likes

I think this is a good point. Probably the most normal thing to do will be to start as many vaults as I can with the spare resources I have, and then use whatever is left over for cache.

For example, if you could start 4 vaults you'd be better off starting 4 vaults rather than starting 1 vault and using the remaining '3 vaults worth of resources' as cache. But if you can only start 1 vault, with some resource left over that couldn't run a whole other vault, it would be good to contribute the leftover resource as cache. The opt-in nature of it means you can run 'half vaults', if that makes sense.

Although with the current idea of having full vaults with redirects for chunks that can't be stored in the closest vault (because it's full), maybe this doesn't matter so much; your 'last vault' would be full rather than used for cache.

So the choice this proposed type of cache would present is 'should I use my leftover scraps of resources for cache?'

Grandma probably just sets the total storage to allocate and the vault software would then work out how much to use as cache based on how many chunks are actually allocated by the network to that node.

But I'm not sure how the choice to cache will be made in reality (using this cache idea or any other idea).

This idea hoped to make the cache 'obviously worth doing'.

Elders (who are presumably issuing the reward to the node) can't know if the vault is actually storing the cached chunk or not, since it's a choice. So the node must submit their claim for reward.

In other words, elders should ideally not have to 'know' anything about the cache of nodes in their section. So that leaves it up to nodes to claim the reward in that case.

Might be misguided here, but that was my thinking.

Yes, this sounds pretty simple. Nodes can still choose whether to cache or not but based on distance rather than derived name.

If 8 cache nodes return the chunk, why would the request continue further? Doesn't it stop there? I guess with RMD there's no way to know if someone else in the delivery group for your section has the chunk cached, so you'd send it on to the next hop, which may lead to the situation you describe.

I don't think cache is mentioned in rfc 0005, 0012 or 0057, so I have always assumed cache is not rewarded. Happy to be corrected though.

I thought routes were deterministic…? Is the request 'onion style' encrypted from the client? Or does each section look at the final destination and calculate the next hop from the final destination (ie the destination is revealed at all hops)?

Seems like, to do onion-style encryption from the client, the client would need to know all keys along the route, which seems like a lot of info required before being able to make a request.

2 Likes

For anonymity, the section with the chunk does not know the path; it knows the destination, chooses the closest section to the final address, and passes it there. Then rinse and repeat.

That applies both ways: storing the chunk and returning the chunk.

2 Likes

Nice @mav, this feels very close to a more formal mechanism of what we way back called deterministic cache. Where now we have probabilistic cache, the one you describe would be good for very popular data that changes quickly. As this is deterministic, it means original holders can push out updates to the data as well. In this sense, the deeper holders respond to client requests, giving the tier 1 holders time to process writes to the data.

7 Likes

I should add some emphasis to the original post: I feel one of the main benefits of the idea is that it gives a measurement of spare space. The reward motivates the particular behaviour that leads to that measurement.

I'm still fairly dubious about having full vaults as per rfc0057 (but I remain open-minded). It seems to add too much fuzziness to the idea of 'close xor distance'. By introducing some gradient of reward, like in rewardable cache, it hopefully allows for a more granular / accurate measurement of spare space than full / not full.

So I agree there are multiple benefits of this type of cache, but I wanted to bring some emphasis to the measurement aspect.

9 Likes

It is important to verify that vaults responsible for storing a chunk actually do hold it. I guess if the caching nodes 'intercepted' the request for a chunk, a mischievous vault would be at risk of being found out if there happened to be no nodes caching that chunk. If the request continues to the section, the vaults responsible for holding the chunk would prove they have it on each request, but the benefit of lower network load offered by caching would not be realized (though the client making the request could potentially still benefit from the reduced latency offered by caching).

1 Like