That is true; however, it's a bit more efficient. With disjoint sections you have neighbours (sections that differ in one bit). So it is not one bit per hop in practice: you select the closest neighbour and send it there (you are connected to all neighbours).
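The closest-neighbour selection can be sketched like this, assuming 4-bit section prefixes and a toy model where each section connects to every prefix differing in one bit (illustrative only, not the actual routing code):

```python
# Greedy XOR routing sketch: at each hop, forward to the connected
# one-bit neighbour whose prefix is closest (in XOR distance) to the
# destination. In this toy model each hop flips exactly one bit;
# real sections may have richer connectivity.

def xor_distance(a: int, b: int) -> int:
    return a ^ b

def route(source: int, dest: int, bits: int = 4):
    """Hop greedily towards `dest`.

    Choosing the closest neighbour clears the most significant
    mismatched bit, so the XOR distance strictly shrinks each hop
    and the loop terminates."""
    path = [source]
    current = source
    while current != dest:
        # All sections whose prefix differs from ours in one bit.
        neighbours = [current ^ (1 << i) for i in range(bits)]
        current = min(neighbours, key=lambda n: xor_distance(n, dest))
        path.append(current)
    return path

print(route(0b1101, 0b0110))  # [13, 5, 7, 6], i.e. 1101 -> 0101 -> 0111 -> 0110
```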
This process/example cannot cache.
Nobody on the request path saw the returning data oO
PS: oh, not true - its own section does, so a DoS wouldn't work here / a simple rate limit would probably do the job… And caching would be for others…
PPS: actually - I can just spawn 10k nodes and reconnect until all of them end up in the same section, and a simple rate limit wouldn't stop me from keeping ln(network size) sections busy - so a local cache in the section comes with benefits.
Technically it can cache, but the benefit or incentive becomes a bit weaker because the vault is not on the return path, so it ends up using more bandwidth by caching.
E.g. if the second-to-last hop responds from cache, the paths become:
Request hops: 1101 => 0101 => 0111 ( => 0110 hop is never completed)
Response hops: 0111 => 1111 => 1101
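A rough bandwidth comparison for the caching vault (0111) in the paths above, assuming a ~1 KB request message and a ~1 MB chunk (illustrative numbers, not protocol constants):

```python
# Why caching costs the caching vault bandwidth: without a cache,
# vault 0111 only forwards the small request onward and is not on
# the return path; with a cache, it uploads the whole chunk itself.

REQUEST_BYTES = 1_000      # assumed size of a forwarded GET request
CHUNK_BYTES = 1_000_000    # assumed size of one data chunk

no_cache_cost = REQUEST_BYTES   # just forward the request to 0110
cache_cost = CHUNK_BYTES        # answer from cache: send the chunk

print(cache_cost // no_cache_cost)  # 1000: ~1000x more bytes sent when caching
```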
Reasons to cache (all are ‘external’ or ‘indirect’ benefits)
- client gets the data faster (because there are fewer hops), so it's a better user experience (which is really important, especially if farmers are also users, which I assume most would be)
- possibly removes an opportunity for other vaults to farm coins, meaning a better chance for this vault to farm coins
- improves the overall efficiency of the network
- might lower the overall requirement for running a vault so increases overall vault diversity and therefore security
Reasons not to cache
- might be able to avoid handling the response so bandwidth would be saved by not caching
- might increase workload for other vaults which is an advantage in a competitive setting
Good summary
From the perspective of this node, I do not see the connection. The caching node is not storing the chunks it caches, so caching does not affect its own chance of farming either way. With network-wide caching, the vault has less chance of farming.
So in fact this is not a reason to cache.
Cache does both: a) increases my vault's chance of reward and b) decreases other vaults' chance of reward. (But cache is not always an advantage; let's dig deeper below.)
This thought experiment is an extreme example:
My vault is the only one with cache in the network and it caches every single chunk.
Any request coming via my vault prevents a reward for some other vault.
But no other vault can prevent me from getting rewards.
So my vault with cache has an advantage for the total reward it gets.
Change the extreme example so every other vault caches just one chunk: the advantage is still there. Change it to two chunks, etc. The advantage of a large cache remains but diminishes as others grow their caches.
But overall, the largest cache gets the most advantage (although the advantage may be fairly small in the live network).
I dig deep on this because I'm cautious that even a 1% advantage can have compounding effects over time, which centralises power and resources and undermines the security of the network. I'm sure there will be countering advantages for other vaults with smaller caches and it'll be swings and roundabouts… we'll see.
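The compounding worry can be made concrete with a toy calculation, assuming (purely for illustration) that a vault reinvests a 1% higher reward rate into proportionally more capacity each cycle:

```python
# Back-of-envelope for compounding: a 1% per-cycle edge, reinvested,
# grows geometrically relative to a baseline vault. The cycle count
# and reinvestment model are illustrative assumptions.

periods = 100              # e.g. 100 reward/reinvestment cycles
advantaged = 1.01 ** periods
baseline = 1.00 ** periods

print(round(advantaged / baseline, 2))  # -> 2.7, i.e. ~2.7x the baseline share
```

So even a small persistent edge can snowball, which is why it is worth checking whether the edge is real and how large it is.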
But if caching is the norm, then the (a) option cancels out, if it ever existed, and the (b) option is the result.
It is still not an advantage: it does not increase the number of GETs. It only reduces the other vaults' GETs.
This means that the sole caching vault increases its proportion of GETs compared to others, but gains nothing on its own GETs, as they remain the same whether it caches or not.
Yes this is the advantage that I’m talking about. I think it’s important to consider since it’s better to have a diverse set of vaults. If one operator can increase their reward compared to other vaults then it might lower the diversity of vault operators and affect the security of the network.
Seems that we're on the same page about this.
It is still not an advantage to the number of GETs, and if caching is used by every node (since they don't want to be "disadvantaged"), then the question boils down to: what size of cache?
Memory caching has a practical limit of around 128GB, since anything larger than that requires big money and destroys any benefit from the advantage.
Now if you change the code so that you use disk space, then you have to ask whether it's better to use that space for a vault or for a cache.
Without quantifying any figures, which would be hard at this time, my impression is that for a reasonably sized network, the influence your one node (with TBs of caching) has on other vaults will be minimal compared to the whole network. And as soon as many vaults do this, that small advantage will keep shrinking as more of them adopt very large caches.
If you were the only vault with an ultra-large cache, would the advantage (GETs compared to others) even be noticed when compared to the whole network, and is the advantage greater than just using that space for the vault itself?
My question is whether the advantage of more GETs compared to other vaults would even be measurable above the statistical noise level; I doubt you could ever measure it for a reasonably sized network.
In essence, this advantage to your vault is not the way to look at it. It is a disadvantage to a number of other vaults further down the paths you are on, and the ones most affected are the downstream vaults. Even if you were the only cache and cached every chunk from them, the most affected vaults would only see a reduction on requests that route through your path. Now if those most affected vaults are in a section connected to 8 other sections, and the section size is say 64, then on average you cache 1/512 of their requests: 0.2%. And 0.2% is the worst case, since two sections away there are many more paths through the sections (those 8 neighbours have 8 neighbours themselves; it is just unknown how many are common neighbours).
Now if other vaults also cache, that percentage effect drops dramatically, since every vault, including yours, sees fewer requests anyway, and likely by more than 0.2%. If other nodes cache on average 5% of chunk requests, and we use the 8 neighbour sections with 64 vaults per section, then every vault's requests drop by 5%. The 0.2% by which your one "infinite cache" node disadvantages the most affected vaults is now basically noise, and it shrinks further as more nodes run "infinite" caches and that 5% grows.
A 5% cache rate would easily vary by +/- 2%, so the 0.2% is noise. And your infinite-cache node is now getting fewer requests itself, since caching reduces its GET rate by that same 5%, which makes the 0.2% maximum insignificant.
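The arithmetic in the last few posts, spelled out (8 neighbour sections and 64 vaults per section are the assumed figures from the example, not network constants):

```python
# Worst-case share of a downstream vault's requests that one
# all-caching node can intercept, and why it sits below the noise
# floor once everyone caches a little.

neighbour_sections = 8
vaults_per_section = 64

# One "infinite cache" node sits on 1 of 8 * 64 = 512 inbound paths.
intercepted = 1 / (neighbour_sections * vaults_per_section)
print(f"{intercepted:.2%}")  # 0.20%

# If every vault already caches ~5% of requests, with natural
# variation of roughly +/- 2%, the 0.2% effect is swamped.
average_cache_rate = 0.05
variation = 0.02
print(intercepted < variation)  # True: below the assumed noise level
```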
Could we have some kind of light vault with cache storage only? Most mobile phones probably will not run 24/7, but they could help with the most wanted content using their own cache.
Yeah I agree with this. I think cache is going to be fine and isn’t really a large opportunity for attacks.
One last thought experiment:
I run a vault, but it's not using that much of my drive or CPU or bandwidth, so I start a second vault on that computer. They're in different sections (most likely) so they have different data. They should be able to use each other's chunk stores as cache (it just makes good sense to do that). Then I start a third vault… and so on until this computer is working at the desired capacity. Then I decide to extend my operations further into my LAN, pushing my vault/cache stores even further.
It wouldn’t surprise me if ‘multi-vault’ is the norm and every computer will be storing data from several different sections. It makes sense to make the most of all that, which means any co-processor vaults should be able to pick the shared chunks out for usage.
This changes chunk store from a per-process idea into more of a per-computer idea, or even a per-LAN idea.
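A minimal sketch of that per-computer idea, assuming (hypothetically) that co-located vault processes can share one chunk store; all class and method names here are illustrative, not SAFE Network APIs:

```python
# One shared chunk store per machine (or LAN); each vault process
# checks it before going to the network, so any chunk stored by a
# co-located vault doubles as a cache entry.

class SharedChunkStore:
    """Per-computer store that all local vaults read and write."""

    def __init__(self):
        self._chunks = {}  # chunk name -> chunk bytes

    def put(self, name, data):
        self._chunks[name] = data

    def get(self, name):
        return self._chunks.get(name)

class Vault:
    """A vault process; the network fetch is stubbed out here."""

    def __init__(self, store):
        self.store = store

    def fetch(self, name):
        cached = self.store.get(name)
        if cached is not None:
            return cached               # served by a co-located vault
        data = b"<fetched from network>"  # stub for a real network GET
        self.store.put(name, data)        # keep it for the neighbours
        return data

store = SharedChunkStore()
vault_a, vault_b = Vault(store), Vault(store)
vault_a.store.put(b"chunk1", b"payload")  # vault A stores a chunk
print(vault_b.fetch(b"chunk1"))           # b'payload': vault B serves it locally
```

Here vault B answers from vault A's store without a network round trip, which is the per-computer (rather than per-process) chunk store in miniature.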
I'm not going anywhere in particular with this as far as "gaming the rewards" is concerned, but I'm trying to make the notion of vaults and storage and access and cache a bit less rigid and a bit more fluid to different ways of using and optimizing.
For now I’m more than happy to work with the ‘vanilla’ ideas and intentions for launch, but I reckon the norms will become pretty whacky pretty quickly once the optimums can be found.
David mentioned that in the live network the vault will not know what it is storing, due to some additional encryption. So you could not use chunks from one vault in another vault's cache.