Farming Pools and InfiniteGB Nodes

Looking for feedback on whether this scenario is feasible and/or likely.

Some background info:

Currently (based on rfc0057) new nodes are only allowed to join when more than 50% of existing nodes are full [1]. If less than 50% are full, the existing nodes keep getting filled until the 50% threshold is crossed. Maybe the 50% figure will change, maybe the disallow rule will change, but the farming pool idea remains relevant in any situation where new farming nodes are not allowed to join the network.
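
To make the rule concrete, here's a minimal sketch of that join gate, assuming the simple reading above (admit a new node only once more than half of the section's nodes are full). The function and its inputs are illustrative, not taken from the actual vault code:

```rust
/// Illustrative only: a join gate in the spirit of rfc0057's "50% full" rule.
/// `full_nodes` and `total_nodes` are hypothetical inputs; the real vault
/// code may track this differently.
fn allow_new_node(full_nodes: usize, total_nodes: usize) -> bool {
    if total_nodes == 0 {
        return true; // an empty section always needs nodes
    }
    // Only admit a new node once more than half of the existing nodes are full.
    full_nodes * 2 > total_nodes
}

fn main() {
    assert!(!allow_new_node(40, 100)); // 40% full: joins disallowed, existing nodes keep filling
    assert!(allow_new_node(51, 100));  // >50% full: a new node may join
    println!("join gate examples hold");
}
```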

Here's the farming pool idea:

I try to join the network but my node is not needed, so it is disallowed. I really want to start farming, and I come across something called a 'farming pool', a service offered by an existing node on the network. That node lets me join its pool right now: it will give me some chunks to look after and a proportional share in the rewards (it takes a small fee, but I'd prefer a small fee to not farming at all). Great, I can start farming right now! I join the pool, and in the background I'll keep trying to join the network as a real farming node. The pool lets me earn rewards while I wait.

This makes the pool operator node appear to the network as a very large node (the network has no idea it's a pooled resource). Ironically, the pool participant has reduced their chance of being able to take part as a 'real' farmer, because now it's become even harder to cross the 50% full nodes threshold.

Does this sound like a feasible situation? Would it be a problem?

A second, similar thing: if datacenters start taking part they could do the same, appearing as one massive node (or more likely a maximally-viable number of not-full nodes). There's an incentive to do this since full nodes earn less reward (rfc0057 says the reward is halved for full nodes [2]), so two full nodes are rewarded less than one not-full node. The incentive to not be full is very strong.
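
For illustration, a rough sketch of the age-halving from [2], assuming the reward share is weighted by age; the function and numbers are made up for the example, not the actual rfc0057 formula:

```rust
/// Illustrative age-weighted reward share, loosely following the rfc0057 note
/// quoted in [2]: a full node's age is halved when weighting its reward.
fn reward_weight(age: u32, is_full: bool) -> f64 {
    if is_full { age as f64 / 2.0 } else { age as f64 }
}

fn main() {
    let age = 10;
    println!("not-full weight: {}", reward_weight(age, false)); // 10
    println!("full weight:     {}", reward_weight(age, true));  // 5
    // A node that lets itself fill up immediately halves its reward weight,
    // so an operator with lots of capacity has a strong incentive to keep
    // presenting itself as not-full.
}
```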

I feel like using full nodes as the measure for when and how to take action is potentially dangerous to the health of the network.

Just brainstorming here, would be interested in other views on this.

I feel like rather than using full nodes as the measure, we could use degree-of-redundancy as the way to decide network actions (like the sacrificial chunks idea in rfc0012, but a little more flexible). There can be a fixed minimum redundancy, and any amount of extra redundancy floating above that can be used to measure how abundant or how stressed resources are.

For example, enforce a minimum of 8 redundant chunks. If measured redundancy is 20, there's a lot of spare resources and the network can maybe start rewarding less to weed out inefficient resources. If measured redundancy is 8, the network keeps the reward where it is. If measured redundancy is 7, the network takes immediate action to bring it back up to 8 by increasing the amount of reward.
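
A minimal sketch of that control loop, using the example numbers above (minimum 8, "plenty" at 20); the adjustment percentages and thresholds are assumptions:

```rust
/// Illustrative control loop for the floating-redundancy idea: keep a hard
/// minimum of 8 copies per chunk and nudge the farming reward based on how
/// much redundancy floats above that.
const MIN_REDUNDANCY: u32 = 8;

fn adjust_reward(current_reward: f64, measured_redundancy: u32) -> f64 {
    if measured_redundancy < MIN_REDUNDANCY {
        // Below the minimum: raise the reward to attract resources immediately.
        current_reward * 1.10
    } else if measured_redundancy > MIN_REDUNDANCY + 10 {
        // Lots of spare copies (e.g. 20): lower the reward to weed out
        // inefficient resources.
        current_reward * 0.95
    } else {
        // At or modestly above the minimum: leave the reward alone.
        current_reward
    }
}

fn main() {
    println!("{:.3}", adjust_reward(1.0, 20)); // 0.950 - plenty of spare redundancy
    println!("{:.3}", adjust_reward(1.0, 8));  // 1.000 - hold steady
    println!("{:.3}", adjust_reward(1.0, 7));  // 1.100 - stressed, reward goes up
}
```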

Allowing a floating amount of redundancy means there is no disallow rule: any node can start farming at any time (onboarding may take time due to bandwidth constraints, but there is never a disallowed node). The lack of a disallow rule means there's no incentive for pooled farming; all nodes may as well join the real network.

The disallow rule is seen as a necessary security mechanism to reduce the chance of the network being flooded with new nodes. But I feel the disallow rule also has other side effects which are potentially quite dangerous (e.g. farming pools). Is the disallow rule a net positive? Tough question…

[1] rfc0057: "if that ratio of good nodes drops or remains below 50%, vaults will ask routing to add a new node to the section."

[2] rfc0057: "if flagged as full { node's age/2 }". Age is used to weight the reward, so the effect of halving the age is not exactly to halve the portion of reward, but is close to that.

14 Likes

Can you look at scenarios in which there is more than one Safe Network? Let's say this farming pool farms simultaneously in all available Safe networks and moves the resource to wherever it pays the most…

2 Likes

Wondering what the consequences would be for the pool if one or more of the nodes misbehave. Does the network punish the main pool node? I would assume node age is affected, but what about getting booted by the network entirely, which I believe has been mentioned before? If so many nodes are part of this pool, how does the pool operator guarantee that he won't get punished when a bunch of careless or malicious nodes can join the pool?

Perhaps pool operators will have to have their own malice detection?

5 Likes

I like the concept of maximizing redundancy. Imo unused disk space is just a wasted resource. Better to fill it with extra copies of data from N nearest neighbors. Intuition tells me an algorithm that is consistently trying to maximize chunk redundancy would be more robust than one that seeks to maintain a fixed number or enforce a minimum. The communications overhead is the main drawback though.

For this reason I don't see pools as a technical problem for the network. However, their existence would represent a failure in the realm of user experience.

Not sure if this really applies anymore with node age at play. The nodes could be brought in at age zero and assigned a variety of grunt-work tasks or serve as cache. So they can join, they just wouldn't be full-fledged nodes or become a real vault until they are older.

7 Likes

Farming pools are mainly a discussion about managing incoming resources; I think you are trying to bring in a different topic about how to manage departing resources. Departing resources is definitely a worthwhile topic, but it is probably better to discuss it in the multiple Safe Networks topic or in a new topic about managing the departure of resources.

edit: I think you have raised important points about the departure of resources, and I'm not trying to sweep them under the rug here; I just want to keep this topic on track and focused on how to best manage incoming resources. If you feel I've misunderstood your point please do clarify further.

Yeah, I feel your latter question here is probably what would happen: pool operators would mainly end up managing and judging their underlings and would act as a kind of meta-node in a lot of ways.

Yes, certainly farming pools would not be a problem in the technical sense. However they increase centralization and undermine the benefits that node participation brings to the governance of the network. So to me farming pools seem to be a 'suboptimal solution' rather than a 'technically incorrect solution', and one that hopefully can be avoided if possible.

8 Likes

Agreed. There's no need for middlemen between a vault and the network.

6 Likes

In my opinion, it is reasonable for such a pool to maintain nodes in all available Safe networks and to direct incoming resources to whichever network pays the most, in order to make the most money.

This does not mean it will withdraw resources from the less popular Safe networks; that would not be optimal, because there is no way to know whether the price will be highest there tomorrow.

2 Likes

I wonder what the extra lag time might do to chances of being rewarded.

If all nodes (not considering pools here) participating successfully in retrieving a chunk are given something in reward, then a pool node might still make a reward anyhow.

But if it is only the node (or nodes responding quickly*) that "wins", then the pool may miss out on many chances to collect a reward. For instance, if I join a pool in the USA, the lag is a round trip, since the request has to be sent to me and the response sent back.

But this is an interesting idea and eventually may have success. It also may be a way for a home user with spare drives and SBCs to pool, rather than have multiple nodes. No lag.

3 Likes

Somewhat surprised there is a limit to joining. Even if, for all the risk, those joining see only a small return in coin, that enthusiasm surely adds to variety, which lends to stability and perhaps improves speed.

Is 50% arbitrary? Why not 20%… the more the merrier.

I suppose there's a risk of volatility, but it's important that there is a fast response to growth… and those using the network initially will tend to be those hosting vaults??

1 Like

I'm not sure, but I think the reason was efficiency :wink:

2 Likes

Iirc it's a defense mechanism. Consider an adversary that spins up 1M vaults. If those were all accepted immediately, they could swamp a network section. Instead they go into a waiting pool. This gives time for many other potential vault operators to join the waiting pool/queue. When the network actually needs a new vault resource, enough time has passed for the pool to have accumulated another 1M non-malicious nodes. The probability that it will randomly select a malicious node from the pool is now only 50%, instead of 100% under the direct-join scenario.
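
A tiny sketch of the dilution argument, assuming the network picks uniformly at random from the queue and using the 1M figures above:

```rust
/// Illustrative: how a waiting queue dilutes a flood of attacker nodes.
/// The uniform-random selection model is an assumption.
fn p_malicious(malicious_in_queue: u64, honest_in_queue: u64) -> f64 {
    malicious_in_queue as f64 / (malicious_in_queue + honest_in_queue) as f64
}

fn main() {
    // Direct join: the attacker's 1M vaults are accepted as-is.
    println!("direct join: {:.0}%", 100.0 * p_malicious(1_000_000, 0));
    // Queued join: by the time a slot opens, 1M honest nodes have also queued.
    println!("queued join: {:.0}%", 100.0 * p_malicious(1_000_000, 1_000_000));
}
```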

4 Likes

I thought that's the case to make the network resilient against vaults trying to game the network by storing no data at all: making GETs when they need to serve a GET themselves and acting as a mere "proxy" to other nodes.

1 Like

This pokes at a lot of different ideas…

Let's say there's no join limit. Anyone can join any time.

This doesn't mean joining is instant. It still takes time to redistribute chunks to the new nodes.

Let's say a section would split if the two new sections would have 100 nodes each (so probably splitting when there are about 200-250 nodes). If there's a section with 150 nodes and suddenly 1000 new nodes join the section all at once, should the section split now and then redistribute chunks, or should it split only after all chunks have been redistributed?
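
To make the tension concrete, a minimal sketch of the split condition under the assumptions above (split only when each half would have at least 100 nodes; the prefix-based halving is assumed):

```rust
/// Illustrative split check: a section splits only if each of the two
/// resulting sections would have at least `MIN_SECTION_SIZE` nodes.
const MIN_SECTION_SIZE: usize = 100;

fn would_split(nodes_with_prefix_0: usize, nodes_with_prefix_1: usize) -> bool {
    nodes_with_prefix_0 >= MIN_SECTION_SIZE && nodes_with_prefix_1 >= MIN_SECTION_SIZE
}

fn main() {
    // 150 nodes roughly evenly spread across the two halves: no split yet.
    assert!(!would_split(75, 75));
    // 1000 new nodes arrive at once: the membership condition is met immediately,
    // but chunk redistribution to the two halves has barely started.
    assert!(would_split(575, 575));
    println!("split condition met before chunks have settled");
}
```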

I'm not going for a binary 'this or that' answer on this, I'm just trying to conceptualize the relationship between chunk stability and section membership stability. When is a node "a node"? When is a node "queued"? When is a chunk "stored"? When is a chunk "at risk" or "lost"?

Looking at the waiting pool idea by @jlpell, which I'll call a queue: would I be able to get a hundred new nodes from my laptop into the queue? Or a million? Or only a few? What's the limit? Presumably being 'in a queue' is not a resource-intensive action, otherwise it's not queuing, it's joining. Having more nodes in the queue increases my chance to be selected for the next join. So I'm not really too clear on how the queue mechanic would function as opposed to simply joining. Maybe I'm not looking carefully enough into the queue and there's a simple way for it to work?

(If we call the queue a membership pool we can stir up some confusion by calling it mempool, which in bitcoin is short for memory pool! No, let's not do that :slight_smile: )

A disallow rule or a join limit etc. is sorta naturally going to happen anyway, since chunk redistribution isn't instantaneous. But a disallow rule is also sorta naturally not going to happen, because resources spent managing a queue are a wasteful type of joining.

Perhaps this all just adds a lot of mud to an already murky pool…


Let's look at the idea of chunk redistribution when new nodes join, since this seems like a key part of whether or not new nodes are disallowed.

Maybe chunks do not need to be redistributed? This is sorta natural anyhow, since if vaults can be full then chunks must be 'near' their xor address, not 'closest' to it.

We could have nodes be part of the network without them having all the closest chunks. The new node missing a chunk would not be surprising since it happens anyway with full nodes. (I find this idea unappealing but it seems like a necessary consequence of allowing full nodes; I prefer the strictness of all redundant chunks being at the actual closest nodes, not just nearby).

This raises a question of how new nodes might be filled. Some options for how to do the filling:

  • The node doesn't store any historical chunks, only new chunks, and fills up as new PUTs arrive. There is no redistribution process when a node joins (redistribution only happens when nodes depart or sections split). I'm not sure if this is feasible or not; I'd need to explore it further.
  • Elders give the new node a list of all the close chunks for that node (i.e. all the chunk names that the node is required to store to satisfy the minimum chunk redundancy). The node is responsible for filling itself by doing a normal GET for all those chunks; periodic audits ensure the vault has done the work and the level of redundancy is correct. (There's a rough sketch of this pull-based approach after this list.)
  • Nearby nodes could push chunks to the new node, rather than have the new node pull them.
  • Maybe some other ways are possible?
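
Here's the sketch referred to in the second option: a hypothetical pull-based fill where elders hand over a list of chunk names and the new node GETs them itself. All of the types and the Network trait are made up for illustration:

```rust
use std::collections::HashSet;

/// Hypothetical 32-byte chunk name (xor address).
type ChunkName = [u8; 32];

/// Hypothetical network interface; only the GET we need for the sketch.
trait Network {
    fn get_chunk(&self, name: &ChunkName) -> Option<Vec<u8>>;
}

struct NewVault {
    stored: HashSet<ChunkName>,
}

impl NewVault {
    /// Fetch every chunk on the elders' list; anything that can't be fetched
    /// is reported back so redundancy accounting stays honest.
    fn fill<N: Network>(&mut self, network: &N, assigned: &[ChunkName]) -> Vec<ChunkName> {
        let mut missing = Vec::new();
        for name in assigned {
            match network.get_chunk(name) {
                Some(_bytes) => {
                    // A real vault would persist the chunk; here we just record it.
                    self.stored.insert(*name);
                }
                None => missing.push(*name),
            }
        }
        missing
    }
}

/// Mock network so the sketch runs standalone.
struct MockNetwork;
impl Network for MockNetwork {
    fn get_chunk(&self, _name: &ChunkName) -> Option<Vec<u8>> {
        Some(vec![0u8; 1024]) // pretend every chunk is retrievable
    }
}

fn main() {
    let mut vault = NewVault { stored: HashSet::new() };
    let assigned = vec![[0u8; 32], [1u8; 32]];
    let missing = vault.fill(&MockNetwork, &assigned);
    println!("stored {}, missing {}", vault.stored.len(), missing.len());
}
```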

Zooming way out, it seems like the disallow rule, join queue etc. originate from 'responsibility for redundancy'. How can the network ensure redundancy and detect failures of redundancy? If joining is too rapid the degree of redundancy becomes unclear and might put data at risk when chunks are poorly distributed (there could be any number of reasons for chunks being poorly distributed - maybe lots of nodes are suddenly departing, maybe some nodes become overloaded and laggy, maybe redirection due to full nodes becomes extreme).

It feels to me like elders are in the best position to manage the redundancy. Maybe that's not necessarily true; perhaps redundancy can be managed in a less strict or controlled way?

Maybe instead of looking at the potential damage from farming pools and how to avoid that damage, we can ask: if farming pools are a natural consequence of unavoidable network friction, how can we incorporate them into an intentional network mechanism? (By the way, I don't think farming pools are inevitable.)

6 Likes

Don't attribute this one to me :sweat_smile:. It's just my current understanding of what the intent was for a defense mechanism, based on bits and pieces I picked up here on the forum from dirvine's descriptions and others. The 'queue' might just be nodes at age zero, but I'm just guessing.

1 Like

This is actually how it works now, since earlier this year.

I'm also a bit ambivalent about it. On one hand chunks are not at the closest node; on the other hand there's no need to sync data when joining.

Wrt the join queue, naively it looks to me like the rate of inflow to the network is controlled, but it does not prevent the queue from being flooded, which means the network is deprived of an inflow of good nodes; anything it adds will be an attacker.
I haven't looked at that area for detailed solutions, though.

6 Likes

The idea of allowing new nodes only when X% of nodes are full seems like it could give us some grief. Filecoin has provided us with a very useful experiment demonstrating why there may be trouble.

In the first 50 days of the filecoin network ~1 PiB has been stored (source). ~1 EiB is available for storage (source).

There are currently 786 filecoin nodes.

This would give about 1.3 TiB of stored data per node (1024 TiB / 786, assuming the 1 PiB figure includes all redundancy).

745 out of 786 (94.8%) filecoin nodes have more than 1.3 TiB of storage (there's a list of node sizes here).

217 out of 786 (27.6%) of filecoin nodes have more than 1 PiB storage and could store the whole of filecoin data.

If we took filecoin storage distribution as it is now and applied Safe Network rules to it, it would take a very long time before any new nodes would be allowed to join.

There are some considerations for this comparison though…

Storage on Safe Network could be cheaper than filecoin, so it would fill the spare space faster and reach an equilibrium sooner. This is fine and I accept the reasoning, but filecoin is already usually 20x cheaper than major cloud storage, so why does filecoin see only 0.1% storage utilisation? I'm not convinced that being cheaper means we'll achieve better distribution.

If the top filecoin nodes broke their single massive nodes into many smaller ones, most of those smaller nodes would not be allowed onto the network. I'm not exactly sure how the logic goes, but with 99.9% unused space virtually all nodes on the network can continue to accept new data for a very long time. Doing a maybe over-simplified analysis: 0.1% full in 50 days means it would take another ~50,000 days to fill the remaining 99.9% of storage (assuming no additional storage came online). That's about 136 years of spare capacity.
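
Spelling out the arithmetic (using the snapshot figures quoted above, and the same naive "no new capacity, same fill rate" assumption):

```rust
/// The back-of-envelope numbers from the post, spelled out.
/// Figures (1 PiB stored in 50 days, 1 EiB available, 786 nodes) are the
/// quoted snapshot values, not live data.
fn main() {
    let stored_tib = 1024.0;             // ~1 PiB stored so far
    let available_tib = 1024.0 * 1024.0; // ~1 EiB of capacity
    let nodes = 786.0;
    let days_so_far = 50.0;

    let per_node_tib = stored_tib / nodes;
    let utilisation = stored_tib / available_tib;
    // Naive extrapolation: same fill rate, no new capacity added.
    let days_to_fill = (available_tib - stored_tib) / (stored_tib / days_so_far);
    let years_to_fill = days_to_fill / 365.25;

    println!("stored per node: {per_node_tib:.1} TiB");     // ~1.3 TiB
    println!("utilisation: {:.2}%", utilisation * 100.0);   // ~0.10%
    println!("days to fill remaining: {days_to_fill:.0}");  // ~51,150
    println!("years of spare capacity: {years_to_fill:.0}"); // ~140, same ballpark as
                                                             // the ~50,000 days / 136 years
                                                             // quoted above (rounded 0.1%)
}
```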

Filecoin has separate mechanisms for storage space, uploading and pricing, which allows a lot of spare space to come online very quickly. Safe Network doesn't have this; it links pricing, spare storage space and uploading all together. So I'm not sure how the difference in the pricing / storage / utilization functions of these two networks will show itself in the real world.

We're really lucky that filecoin has shown us the utilization rate. If we'd only seen the uploads of 1 PiB in 50 days we'd say 'nice work filecoin', but we are lucky to also be able to see the 1 EiB of unused space, which gives us some real head-scratching to do. In our network we won't get to see how much spare space there is, so we have no idea how long it might be until the 50% full nodes mark is reached.

My main worry is (to use an exaggerated example) that if we end up with the top 10 nodes of filecoin as the first 10 nodes in Safe Network (between 23 PiB and 71 PiB in size), we'll be waiting a very long time for new nodes to be allowed to enter the network, because it would take a very long time to fill 50% of those nodes.

Should we be worried about the huge amount of spare storage out there hindering growth and node membership? I'm not sure, but filecoin makes me feel we should consider things carefully; they have a really huge amount of spare space.

13 Likes

This data supports what we see in other decentralized storage networks.

There is no demand from end users for such a product. There is no demand from business customers for such a product.

Of course we have a better product and we will show much better results. But there is one serious 'but': our results will not change the fact that there is no demand for such a product from millions of users.

This means that we have to prepare for years, if not decades, before our network replaces the old internet. During these years, the greatest danger will not be someone putting big farms into our network…

The biggest danger will be someone making a copy of our network without us. As Mav said, the unique information in our network is our tokens. Copies of Safe will try to steal the value stored in the tokens.

How?

An easy way is by using part of the inflation in their network to allow specific groups of people to upload for free - YouTube content creators, Spotify copies, etc. This will allow their networks to be used by people and attract new farmers.

If we want to be competitive, we must provide an option to use part of our inflation for free uploading of data in our network so that new farmers can be attracted to us and not to the foreign networks.

2 Likes

The other unique thing is the personal data we all will be keeping about ourselves that cannot be copied to a new network by any copycat.

This is your

  • personal backups
  • App data, such things as games, assignments, documents, preferences, ledgers, etc.
  • Attached (mounted) drives that are actually Safe Network data. This is useful for using different devices yet having the same "drive" attached/mounted. No longer is it a limited-size USB drive you have to lug around and mount on one device at a time; instead it is open-ended in size, limited only by the tokens to store more data.

It is this sort of data that apps (including programs used on PCs/Apple devices now) will be storing in your private area.

This cannot be copied by a 3rd party

7 Likes

Of course, but this information is not expensive and can easily be copied to another network.

Even more so if, as the data shows, it is so small in size and therefore cheap for the owner to copy…

2 Likes

Would it help to limit the maximum volume of a vault? In the beginning it could be something like 100GB - just to throw a number out there, I don't know what would be good in reality. Then it could be increased in proportion to the network size so that we can allow larger nodes later.
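
Purely as an illustration of "a cap that grows in proportion to network size", here's one way it could look; the 100GB base is the number thrown out above, and the reference network size is an arbitrary assumption:

```rust
/// Illustrative vault-size cap that grows in proportion to the network.
fn max_vault_gb(network_nodes: u64) -> u64 {
    const BASE_GB: u64 = 100;           // starting cap from the example above
    const REFERENCE_NODES: u64 = 1_000; // network size at which the base cap applies (assumed)
    // Proportional growth, never dropping below the base cap.
    BASE_GB * (network_nodes / REFERENCE_NODES).max(1)
}

fn main() {
    println!("{} GB", max_vault_gb(1_000));     // 100 GB at the reference size
    println!("{} GB", max_vault_gb(10_000));    // 1000 GB when the network is 10x bigger
    println!("{} GB", max_vault_gb(1_000_000)); // 100000 GB at 1000x
}
```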

I'm not sure if this is effective, since storage seems to be so cheap that the price of upload may not be a significant factor in any case. Or maybe it is, but I'd like to see some calculations first. On the other hand, the act of paying for uploading is in itself going to be some degree of a barrier, so if you could make that barrier invisible, it could have a real effect, I suppose. I think free vs. very cheap is in this case a far bigger difference than very cheap vs. very expensive.

Also, I think that Safe Network should not be marketed as (perpetual) storage in the beginning. At least I personally would not do it, just because it will take some time to see if this thing flies or not. I definitely would not recommend it to anyone as the "only backup you will ever need".

From a technical point of view the storage aspect is the main thing, but as a use case for the end user it is not, at least not in the beginning. (Well, OK, the data has to be somewhere, but I hope you know what I mean. I don't consider the server holding my homepage files to be "storage" in any serious capacity.)

By the way @mav, I agree that those Filecoin numbers sound huge, but I don't really have anything to compare them with. Would you care to dig up some numbers, maybe something like the volume of torrents, the volume of the whole internet, or the volume of onion sites…

3 Likes