Next step of safecoin algorithm design


#1

This post is about a possible safecoin algorithm that could be a next step in design following RFC-0004 Farm Attempt and RFC-0012 Safecoin Implementation (both from 2015).

The main new idea it introduces is ‘target parameters’.

Summary

Putting the end at the start, here are the targets I propose:

100GB target vault size, doubling every 4 years (but gradually, not stepwise like bitcoin halvings)

14 days between relocations (on average)

50% of coins issued (still 2^32 max, but always moving towards 2^31)

Targets In The Real World

There are some fundamental ceilings on the ability of vaults to run on the SAFE network.

For example a very large average vault size of 1 PB is not a beneficial size because right now it would take a specialised computer to run a vault that large, and extremely sophisticated networking to initialise and relocate vaults.

However vaults that are extremely small are also not beneficial since this decreases the performance due to increased overheads.

Likewise issuing safecoins at a rate that’s extremely rapid or extremely slow is not helpful to the growth of the network.

And making it very expensive or very cheap to store data is going to cause adverse side-effects on the sustainability of the network.

This post explores some realities of growth that may feed into the design for the incentive structure of safecoin.

Background and Assumptions

This topic requires the consideration of a value system, as in, what do we value and care about? I don’t believe it’s possible to create an unbiased network - any design will reflect (intentionally or unintentionally) some sort of value system.

  • Who should be able to run a vault?

    • 90% of internet users?
    • 50% of the whole world population?
    • 80% of all governments and companies?
    • The top 5 chip manufacturers?
  • What commitment should be considered the minimum for viability?

    • 50% of the spare space on a consumer grade laptop and internet connection running 8 hours overnight?
    • 90% of the resources of a commercial datacenter?
    • An internet connection that’s in the top 30% fastest in the world?
    • A single high-end desktop PC?
    • A mobile phone?
  • What is a satisfactory client experience on the network?

    • Download a 4K 2h movie within 1h?
    • Upload all my photos for $50?
  • What is the likely future growth rate?

    • Of supply of bandwidth, storage etc? eg will Moore’s Law continue, how rapidly will developing nations improve their infrastructure, how will the technology inequality gap grow or shrink over time…
    • Of demand to store data?
    • Of demand to retrieve data?
    • Of demand for computation rather than storage?
    • Of improvements to the security and performance of cryptographic algorithms?
    • Of participation in the network as a vault and / or a client?
  • How are changes to network parameters managed?

    • Is it managed by setting a predictable growth rate (eg bitcoin supply) and fixed targets (eg bitcoin block time)?
    • Is it managed by voting (eg monero dynamic block size)?
  • What security is acceptable?

    • What is punishable and how harsh should the punishment be?
    • How do we compare the desirability of geographical distribution vs performance due to latency?
    • What error rate is acceptable, for chunks, for vaults and for the network as a whole?
    • What are the various aspects of centralization and what is an acceptable degree of centralization?
    • What degree of confidence should we have in various aspects of the network?

There is not necessarily any one right answer to this. It depends what we value and which aspects best express those values. There are always trade-offs. However some solutions express our values better than others. This thread tries to eliminate some of the obviously ‘wrong’ solutions that are outside the scope of current physical constraints. From within those constraints we can try to shape a useful network incentive structure and debate what we value and how to create it.

In this post the main assumption used is a 50% inclusion rate of normal internet users, but that’s just to work the numbers and is totally open to debate and change.

Growth Items

These aspects of vaults are expected to improve over time and the network should be able to adapt to these changes.

  • Storage
  • Bandwidth
  • Computation
  • Algorithmic performance (ie software improvements)

Storage

Affects: Maximum vault size

I’m confident that neglecting to set a target on vault size will lead to very large vaults, centralization of farming operations and, most importantly, unintentional exclusion of participants. Allowing vault size to float freely would be dangerous: the network would eventually reach an unstable point where it no longer corrects itself, leading to unstoppable centralization.

Currently the maximum ‘normal’ desktop computer storage is 240 TB (10 TB drives × 24 SATA ports).

Currently the average ‘normal’ computer storage is 1 TB (a median priced laptop from a retail consumer electronics shop).

Not all 1 TB would be allocated to SAFE vault storage, so conservatively let’s say 10% or 100 GB is acceptable for the user to commit to a vault. (Also the granularity means users can run 5 vaults to consume 50% of their drive if that’s what they want to do, but a 500 GB target size excludes any users that only want to use 100 GB of their drive).

So I propose a starting point for the targeted vault size of 100 GB.

We can expect storage to increase at a rate doubling every 15 months. This has been true for decades and seems reasonable to assume will continue for decades more. The network may automatically adjust the target vault size every so often to stay in line with the changing availability of technology. More discussion is needed about how this target would be updated.
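
A rough sketch of what a gradual (rather than stepwise) adjustment could look like is a smooth exponential in elapsed time. The 100 GB start and 4-year doubling period are the figures from the summary; the function name and the use of elapsed years as input are assumptions for illustration only.

```python
def target_vault_size_gb(years_elapsed, start_gb=100.0, doubling_years=4.0):
    """Smoothly growing target: doubles every `doubling_years`, with no steps."""
    return start_gb * 2 ** (years_elapsed / doubling_years)


# The target creeps up continuously instead of jumping at fixed dates.
for years in (0, 1, 2, 4, 8):
    print(years, round(target_vault_size_gb(years), 1))
```

How the network would agree on "elapsed time" is a separate open question that this sketch does not address.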

Control over vault size can be achieved by loosening the restrictions for new vaults entering the network when average size is above 100 GB and tightening restrictions when the average size is below 100 GB. The exact mechanism is open for discussion, but I think there are several ways to achieve this which fit into the existing design.
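
One possible shape for this control, sketched under assumptions: compare the average vault size with the target and nudge a hypothetical "admission difficulty" value down when vaults are too full and up when they are too empty. The names `admission_difficulty` and `gain`, the clamp range and the proportional rule itself are all illustrative; the real mechanism is still open.

```python
def adjust_admission_difficulty(difficulty, avg_vault_gb, target_gb=100.0, gain=0.5):
    """Return a new (hypothetical) admission difficulty, clamped to [0.01, 100].

    Average above target -> vaults are too full -> lower difficulty (loosen entry).
    Average below target -> vaults are too empty -> raise difficulty (tighten entry).
    """
    deviation = (avg_vault_gb - target_gb) / target_gb
    new_difficulty = difficulty * (1.0 - gain * deviation)
    return max(0.01, min(100.0, new_difficulty))


print(adjust_admission_difficulty(1.0, 120.0))  # vaults too full: easier to join
print(adjust_admission_difficulty(1.0, 80.0))   # vaults too empty: harder to join
```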

I would love to hear other perspectives about the idea of targeted vault sizes.

From this premise of vault size targeting there are quite a few natural effects that arise.

Bandwidth

Affects: Churn rate

The median internet speed is currently around 7.5 Mbps (source) with the 90% point of the market being around 3 Mbps.

We can expect bandwidth to increase at a rate doubling every 48 months.

Consider the previous target vault size of 100 GB. Joining or relocating a single vault would take an average user 32 hours (using a 7.5 Mbps connection to download 100 GB). Is this acceptable? It depends on the desired proportion of time the vault spends doing churn vs not doing churn.

If it takes 32h to complete a churn event and if we aim for 90% time-not-relocating it would mean relocating roughly every 320h (approx 14 days). Depending on the desired portion of time allocated to relocating we can work out a desired churn rate. (Factor in less frequent relocations due to age etc and it becomes more complex, but it’s just a matter of managing the maths, the idea remains the same.) The point is the ‘value system’ of inclusiveness underlying the relocation mechanism should be based on what is desired and practical both today and into the future based on a) desired inclusiveness b) bandwidth availability and c) desired portion of time spent relocating.
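
The arithmetic above can be checked with a few lines. This is only a sketch: it assumes the link runs at its full rated speed with no protocol overhead, and it treats 100 GB as 100 × 2^30 bytes (with 10^9 bytes per GB the download is closer to 29.6 h). Note that 320 h is about 13.3 days, rounded here to 14.

```python
def transfer_hours(size_bytes, mbps):
    """Hours to move `size_bytes` over a link of `mbps` megabits per second."""
    return size_bytes * 8 / (mbps * 1_000_000) / 3600


def relocation_interval_hours(transfer_h, fraction_not_relocating=0.9):
    """Cycle length such that the transfer takes up the remaining fraction of it."""
    return transfer_h / (1.0 - fraction_not_relocating)


GIB = 2 ** 30
hours = transfer_hours(100 * GIB, 7.5)     # roughly the 32 h quoted above
interval = relocation_interval_hours(32)   # hours between relocations
print(round(hours, 1), round(interval), round(interval / 24, 1))
```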

If relocation is too frequent it may unintentionally mean excluding participants due to bandwidth constraints and this could affect the growth of the network.

But considering the rate of increase of storage (fast) vs the rate of increase in bandwidth (slow) the target size of vaults cannot grow too fast or churn will take longer and longer into the future. So the target vault size discussed above should probably also factor in bandwidth and time to relocate.
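
To put a rough number on that divergence, here is a sketch that assumes vault size tracks storage growth (doubling every 15 months) while bandwidth doubles every 48 months, starting from the 32 h relocation time. The function and parameter names are illustrative.

```python
def relocation_hours_after(years, base_hours=32.0,
                           storage_doubling_months=15.0,
                           bandwidth_doubling_months=48.0):
    """Relocation time if vault size grows with storage but bandwidth grows slower."""
    months = years * 12
    size_factor = 2 ** (months / storage_doubling_months)
    speed_factor = 2 ** (months / bandwidth_doubling_months)
    return base_hours * size_factor / speed_factor


for years in (0, 2, 4, 8):
    print(years, round(relocation_hours_after(years)))
```

Under these assumptions relocation takes roughly 4.6 times longer after four years, whereas tying vault growth to the bandwidth rate keeps it flat.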

How is this rate controlled? Some churn is uncontrolled, eg unexpected vault departures. Some churn is controlled, eg allowing or disallowing new vaults, punishing / evicting vaults, design of the relocation algorithm. I don’t exactly know how the control mechanism would be designed to aim specifically for 14 days but I’m sure there are ways.

For further consideration:

  • What is the experience for the slowest 10% of viable participants going to be like?
  • How does relocation affect the ability to earn safecoin? Would relocation be considered ‘downtime’ or not? Is the 32h forfeited time or merely turbulent time?
  • How might 32h of continuous maximum bandwidth consumption affect the participation incentives and dropout rates due to inconvenience?
  • How do we get an accurate idea of bandwidth availability and distribution?
  • How do cascading relocations affect this calculation?
  • How does the reduced frequency of relocation due to ageing affect this calculation?
  • How does the rate affect the security of the network, eg sybil attacks?
  • How do different vault capabilities factor in to the target relocation rate, eg archive nodes?

Computation

Affects: Simultaneous vaults per machine

If the maximum ‘normal’ desktop computer storage is 240 TB and the average vault target size is 100 GB this means a power user may run up to 2400 vaults on a single machine. However, is this feasible considering their need to verify signatures and perform other computations to maintain their value to the network?

For now I’m not going to consider this. We know that ASICs can provide orders of magnitude increase in performance and efficiency and that it can happen in relatively short timeframes. But it’s worth putting here since it’s an avenue to centralization and opens the door to unintentional exclusion of participants.

Price To Store

Affects: Coin recycling

We have a desire to keep vaults at a specific size and to churn at a specific rate to ensure hard drive space limitations and bandwidth limitations remain within some inclusive range.

There are some issues with this though.

Imagine if uploads suddenly increased. This would put a strain on the network by requiring either faster churn rate (so more vaults are allowed and vault sizes can stay relatively stable), or it requires larger vault sizes (so churn rate can stay relatively stable).

In that scenario, where vaults are churning more rapidly than the target or are getting larger than the target, the price to store can be adjusted to reduce the upload rate. It would be preferable to increase supply rather than reduce demand, but the limitations on storage and bandwidth mean supply cannot increase instantly (unlike price which can increase very rapidly if needed!). So there must be some control over demand, and that is done by adjusting prices. Eventually supply catches up and prices can be lowered again.

Likewise if the upload rate slowed dramatically either there would be excess bandwidth available, or there would be excess storage space available (or both), so the price to store can be reduced to encourage more participation.

This should lead to efficient price discovery for storage based on demand, and should allow growth to happen at the most efficient rate for the desired degree of inclusion.

Storage price is set by the degree of variation from the target vault size and relocation rate.
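
As a minimal sketch of that idea, assume the price is a base price times a multiplier built from the two deviations. The multiplicative form, the function name and the parameters are assumptions made up for illustration; nothing here is a specified algorithm.

```python
def store_price_multiplier(avg_vault_gb, avg_relocation_days,
                           target_vault_gb=100.0, target_relocation_days=14.0):
    """Hypothetical multiplier applied to a base store price.

    Vaults larger than target, or relocating more often than target,
    both signal strain, so each pushes the price up.
    """
    if avg_relocation_days <= 0:
        raise ValueError("avg_relocation_days must be positive")
    size_pressure = avg_vault_gb / target_vault_gb
    churn_pressure = target_relocation_days / avg_relocation_days
    return size_pressure * churn_pressure


print(store_price_multiplier(150.0, 14.0))  # vaults 50% over target size
print(store_price_multiplier(100.0, 7.0))   # relocating twice as often as target
print(store_price_multiplier(50.0, 28.0))   # spare space and spare bandwidth
```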

Reward Schedule

Affects: Farming

The network uses safecoin to ensure the sustainable operation of the network by manipulating the behaviour of participants. Growing too fast is harmful due to the overhead of churn. Growing too slowly limits the usefulness of the network.

Since safecoin can be both created and destroyed, the maximum flexibility for control via safecoin is when half of all coins are issued. At this point there is the most potential to both create and destroy coins to drive future behaviours. Operating the network with very few coins to spend or very few coins to reward reduces the ability to motivate behaviour in the restricted direction.

Just as bitcoin tends toward 21M coins I propose that safecoin should tend towards 2^31 coins (retaining the maximum of 2^32). However, unlike bitcoin which increases at a predictable rate, safecoin will fluctuate around the target amount depending on the magnitude and imbalance of activity on the network throughout time. If someone needs to do a lot of uploading, there’s lots of coins around to facilitate that. If there’s a spike in participation there’s lots of coins to issue rewards and make it worthwhile. But if the network has supplied 5% or 95% of coins there’s less ability to cater to changes in supply or demand.
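
A sketch of how supply might be pulled back towards 2^31, assuming a simple linear rule: rewards are scaled up and store prices scaled down when fewer than half the coins are issued, and the reverse when more than half are issued. The linear form and the names are assumptions; how much correction to apply is exactly what is up for discussion.

```python
MAX_COINS = 2 ** 32
TARGET_COINS = 2 ** 31


def reward_and_price_bias(issued_coins):
    """Return (reward_scale, store_price_scale) for a given issued supply.

    Below target: larger rewards and cheaper storage, so net supply rises.
    Above target: smaller rewards and dearer storage, so more coins are recycled.
    """
    if not 0 <= issued_coins <= MAX_COINS:
        raise ValueError("issued_coins out of range")
    issued_fraction = issued_coins / MAX_COINS     # 0.0 .. 1.0
    reward_scale = 2.0 * (1.0 - issued_fraction)   # 1.0 at the 50% target
    store_price_scale = 2.0 * issued_fraction      # 1.0 at the 50% target
    return reward_scale, store_price_scale


print(reward_and_price_bias(TARGET_COINS))
print(reward_and_price_bias(MAX_COINS // 20))  # only 5% issued
```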

This mechanism is very similar to the existing idea of Farm Rate but differs in one important way. The existing method tries to measure and control spare resources, whereas this method tries to measure and control existing coins. I don’t see measuring spare resources as possible or desirable, especially considering this proposal is structured around a fixed vault size (chia network is the only network I know of that’s seriously trying to address the problem of measuring spare resources, and so far they haven’t come out with a solution although I am really closely following for when they do release something).

Some questions for discussion are:

  • when there’s a deviation from 50% how much correction should be applied?
  • should the return back to 50% aim to be within a certain time (eg within 1000 blocks)?
  • should it be possible to always be at a certain supply, eg 60%, or should there always be corrective measures?
  • how much normal variation in the supply-to-demand ratio is to be expected? What’s a once-a-month peak event look like? Once-a-year? Once-a-decade? How do we design for these?
  • should the changes to rewards (supply) and recycling (demand) be directly connected to each other or be allowed to float?
  • is 50% truly the ‘most effective’ portion? Or are the demand spikes greater than the supply spikes so it should be more like 70%?
  • can the current supply of coins actually be known and how much error is there likely to be?
  • how does the initial state of the network (15% initially allocated) work?
  • which behaviours should be rewarded and which should be punished?

Unused but still interesting ideas

Voting by vaults for changes to network parameters. It’s such an interesting mechanism and has a lot of game theory to consider but I can’t see how it doesn’t lead to voters using their power to cause centralization and exclude participants.

Totally freeform floating parameters. I like this for the elegance but think it will be used to constantly and gradually push the bottom participants out.

Enforced geographical distribution. Is it possible? I think so. Is it worth the overhead? I doubt it.

Network Health Metrics (spreadsheet and discussion). These are still important but it would be nice if they didn’t have to be explicit in the algorithm, rather happen as a natural effect of the algorithm.

Is degree-of-inclusion the right basis for this? I believe so, considering decentralization is largely a question of inclusion. If decentralization is removed then the safe network is just a complex centralized solution. We could say that performance or efficiency is a better basis to design the safecoin algorithm around, but it’s inevitable that it would reward centralized solutions since they provide the best performance and efficiency. So there’s a lot of room to debate the basis, but my belief is the design should be primarily around the desired degree of inclusion.

Summary (repeated)

The targets I propose are:

100GB target vault size, doubling every 4 years (but gradually, not stepwise like bitcoin halvings)

14 days between relocations (on average)

50% of coins issued (still 2^32 max, but always moving towards 2^31)


#2

FYI Moore’s law never applied to rotating disk growth nor to bandwidth. Also it was never intended to describe their growth rates. And now that 3D chip design is coming in it will no longer describe transistor densities.

While transistor density growth is slowing down the same cannot be said for rotating disk or SSD and certainly not bandwidth. While SSD is transistor based it has only in the last 2 years been focused on for major development and thus major advancements are due this year and next. It is also moving towards 3D chips, where Moore’s law cannot apply.

Anyhow thought I might mention that. No charge :rofl:

Gotta say I’d love to see 240TB on a “normal” desktop :thinking:

At 12TB for the largest drive for desktops that would be 20 drives in your “normal” desktop.


Good post and plenty of food for thought there. My only comment at this time is Keep It Simple should be an underlying theme in any future proposal.


#3

ASRock X99 has 18 SATA ports, then add expansions… it was more of a ‘rough maximum’ than a strict upper limit!


#4

Brilliant stuff @mav This is going to be a huge discussion/debate and this is perfect timing as well. I feel this one cannot be rushed and should take a long time with a lot of thought. We need to think deeply about this again and make sure we give the network the best start. Thanks again

I wonder if these two are not more inextricably linked. The old method did try and measure spare resources, by creating backup chunks (sacrificial) and then ensuring they were stored. When nodes started to not be able to store them, or deleted them to store the primary copies, then the group would start to adjust farm rate. So as far as disk goes it did measure spare space, to an extent, in a way we could be pretty sure of. With penalties in place, a vault not delivering chunks would be penalised to help the network see what the more realistic stress level of the network is. This delivery of chunks takes into account the machine’s space (or capacity to have held the chunk) as well as CPU and bandwidth. The nice thing though is we do not measure each independently, but instead we just measure the goal: could the vault do its job?

In terms of coins supply, this floated via increased put costs (higher recycle rate) when space was low to encourage new farmers by increasing the farm rate.

So if we look more at coin supply more as the metric then we need to still have a measure of how stressed is the network and increase put costs when that level is high and reduce when it is not stressed.

I cannot see it from the side of coin availability just yet, but like the notion of diving very deeply into this part. I suspect as we make the algorithm as simple as possible we will see it takes care of multiple measurements at once. So I feel the solution will involve a grab net of stuff we want vaults to do, then finding a single measurement that tells us it is doing all of the stuff we want.

Again Ian, really nice timing here.

I am struggling with this part if we consider node age as it will mean the relocation time is exponentially increased as a vault ages. So I feel we need to define more what we feel on average could mean in this respect.


#5

As far as coin supply was concerned then this was actually in the RFC. As the existing coins increased then the success rate of coin issuance attempts decreased. And was done in a real simple way.

I had a thought of how to measure if the vault was being truthful about the capacity and that was to have each vault in the section store the same randomly generated chunks and then request hashes off the vaults of randomly selected parts out of each chunk and if any vault disagreed with the consensus of the hashes then that disagreeing vault is penalised. This way it could be done at various times to keep tabs on the vaults and if any are misbehaving in relation to their storage.

Obviously there has to be a way to handle varying sizes of the vaults but this idea allows for once only bandwidth to the vaults and very low bandwidth for checking hashes.

So then the section can know what storage is reliably available.
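
A toy sketch of that check, assuming SHA-256 and a simple majority as the "consensus" (both are stand-ins, and the vault names are made up). Each vault hashes the same randomly chosen slice of the shared test chunk, and any vault whose hash differs from the most common answer is flagged.

```python
import hashlib
import os
import random
from collections import Counter


def spot_check_hash(chunk, offset, length):
    """Hash of a randomly selected part of a stored test chunk."""
    return hashlib.sha256(chunk[offset:offset + length]).hexdigest()


def find_disagreeing_vaults(responses):
    """Given {vault_name: hash}, return vaults that differ from the most common hash."""
    consensus, _ = Counter(responses.values()).most_common(1)[0]
    return sorted(name for name, digest in responses.items() if digest != consensus)


# Demo: three vaults hold the real chunk, one answers from different data.
chunk = os.urandom(1024)
offset = random.randrange(0, len(chunk) - 64)
honest = spot_check_hash(chunk, offset, 64)
responses = {"a": honest, "b": honest, "c": honest,
             "d": spot_check_hash(os.urandom(1024), offset, 64)}
print(find_disagreeing_vaults(responses))
```

Only the offset, length and hash cross the network per check, which is where the low ongoing bandwidth comes from.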


#6

Yes, I agree. This is similar to the sacrificial chunk idea, where the existence of sacrificial chunks can be confirmed and they are also useful (kinda like pre archive nodes) as they meant the network had more copies, but importantly in different sections. Your idea and this are similar in that they measure “was space available” as opposed to “can you promise space is available” and I feel the former is the correct route, so that is good :+1:.


#7

Another excellent post. I’m always impressed by the way @mav is able to gather up all the loose ideas, vague concerns, forum flame wars and cans being kicked down the road and combine them into a framework of concrete proposals and considerations to move the project forward.

Managing the trade-offs between eligibility to earn Safecoin, network performance, economic viability, security, the desired wish to be as inclusive as possible and avoidance of centralization is like wrestling jelly. My own vague concern is that once the autonomous machine is in motion it will be hard to reset the parameters should any of these factors go out of whack as the proving ground will be the live network and hard to replicate in a testnet.


#8

This is, basically, the old zero-knowledge proof. Use a nonce and a chunk, and the vaults must answer with hash(nonce + chunk). Any vaults that fail are punished. With this proof we make sure that the vaults have the data.
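
The nonce-and-hash check can be sketched as a simple challenge and response. SHA-256, the 16-byte nonce and the function names are assumptions for illustration; the checker here is assumed to hold its own copy of the chunk.

```python
import hashlib
import os


def answer_challenge(nonce, chunk):
    """What a vault that really holds `chunk` would return."""
    return hashlib.sha256(nonce + chunk).hexdigest()


def verify(nonce, chunk, answer):
    """The checker recomputes the hash from its own copy and compares."""
    return answer == answer_challenge(nonce, chunk)


chunk = b"example chunk contents"
nonce = os.urandom(16)  # fresh per challenge, so an old answer cannot be replayed
good_answer = answer_challenge(nonce, chunk)
bad_answer = answer_challenge(nonce, b"some other data")
print(verify(nonce, chunk, good_answer), verify(nonce, chunk, bad_answer))
```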

A couple of old ideas. Some time ago I proposed to use the velocity of money to calculate the FR. @Seneca reached similar conclusions by another way.


#9

Blimey @mav you have put together the mother of all brain crushers. I like both your ‘here are lots of ways to think about this, and the things we want to try and optimise’ approach and David’s ‘try to distill it all down into something that does what we want and can be measured’. Is distilling things a Scottish trait I wonder? :tumbler_glass:

They seem contradictory but I think are both important.

Anyway, the only comments I have atm are:

  • what uptime do we target (down for no more than a minute in any day, down for no more than one hour in a week etc)?
  • targets are all a bit either or, and so I see much of this post about getting us to think about the problem rather than as meaning we need to come up with a design that has these factors in it. Imagine the regression tests to be written! :wink: For example, we could forget about the role of mobile phones and miss out on their potential contributions (eg fast local short term cache, or as offloading routing tasks from vaults for the short periods of the day they are online, etc)

I really like what you bring to this project @mav. Thanks. I hope you will keep doing what you do :slight_smile:


#10

I really like what you bring to this project @mav. Thanks. I hope you will keep doing what you do

Seconded - Hear hear!


#11

Kryder’s law does.
But actually it is irrelevant, both describe similar growth rate in both industries.


#12

Agreed. The ideas in this post came out of at least a year’s worth of thinking for me. I’ve pondered this topic really heavily! It deserves a lot of careful and deep discussion and is one of those really fascinating ideas that’s hard to drop once it takes hold.

I really liked the sacrificial chunk idea. But the punishment design is quite tricky. I’ll try to think of how to express my ideas on it but right now they’re a bit too muddy to write about. It feels gameable to me.

One thing that sticks out for me in RFC0012 is there’s no description of the punishment for not storing sacrificial chunks. So to maximise farm rate all vaults would simply store no sacrificial chunks. There’s no punishment for it.

I think the measure of stress using difference-from-target-vault-size and/or difference-from-target-relocation-rate is more honest and simpler to measure than ‘spare resources’. But I look forward to the discussion and maybe changing my opinion.

This is definitely the right way to go, but my bias is going to be ensuring simplicity doesn’t cause unintentional exclusion (ie centralization).

Sometimes ‘less parameters’ is not actually ‘more simple’. Especially if any one parameter is coupling several different incentives into the one control mechanism, it becomes extremely complex to analyse despite the simplicity of implementation. Ideally one or two mechanisms could control it all, I’m sure we’ll find a good design, but so far I don’t feel like it’s possible to house all incentives under one roof.

Relocate every 14 days is an oversimplified placeholder for the broader concept of Relocate vaults aged A every N days:

  • At some age the relocation rate is targeted to be 14 days, let’s say at age 4.
  • Therefore age 5 is ideally relocated every 28 days, 6 every 56 days etc.
  • Age 3 ideally relocates every 7 days, age 2 every 3.5 days and age 1 every 1.75 days.

This means that the newest users are expected to join a section (ie download 100 GB) and then 1.75 days later do it again, then 3.5 days later again etc. If it takes most users 1 day to download 100 GB then infancy will be a pretty brutal time for new users.
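
The schedule above is just a doubling per age step around a reference point. A sketch, with the reference age of 4 and the one-day download both taken from the text and the function names assumed:

```python
def relocation_interval_days(age, reference_age=4, reference_days=14.0):
    """Target days between relocations: doubles with each step in age."""
    return reference_days * 2 ** (age - reference_age)


def fraction_of_time_relocating(age, download_days=1.0):
    """Share of each interval spent downloading, capped at 100%."""
    return min(1.0, download_days / relocation_interval_days(age))


for age in range(1, 7):
    print(age, relocation_interval_days(age),
          round(fraction_of_time_relocating(age), 2))
```

With a one-day download, an age 1 vault would spend over half of its time relocating, against about 7% at age 4.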

The overhead of relocating (especially during the infant period) should not be too high that it excludes users.

Having very high vault size makes infancy very difficult. So does having a very high relocation rate. So I think they should be targeted to a sensible value. But on the flip side, constraining these two parameters is in conflict with what a rapidly growing network would need. So we need to make a compromise, and I think deliberate targeting is the best way to do it.

Whether the churn rate is decided to be fairly slow or fairly fast is a matter of engineering the numbers; what I want to convey is the principle of not making infancy so hard that we accidentally exclude participants we would have liked to include. I think a totally flexible non-targeted relocation period is dangerous.

But… if infants are allowed to store less than the full amount of chunks then churn rate is less of an issue. I’m not sure how that would work though.

Yes this is a really elegant mechanism. The proposal in OP doesn’t do away with it, just makes it more explicit.

This is a big topic. I’m not sure if varying sizes of vaults is feasible. I suppose the algorithm for deciding precisely ‘which chunks live in which vaults’ is the main thing that needs tweaking.

I’m not sure about this. If the sacrificial chunk is requested from the primary section then presented as ‘being available’, it’s not a good test. This is the same problem as the grinding attack / time inversion as explained in chia network (see Talks and Papers). You can put a time limit on the response so they can’t fetch it and pretend to have it, but the variation in network performance across users means faster vaults may still be able to cheat (possibly counteracted by having to traverse the network which should average the speed out and remove individual advantages). Maybe I’m being too binary / fixated, but I’m not convinced that sacrificial chunks can measure resources / stop cheating.


#13

Since the 1980s though the rate has been rather constant at approx 10 times every 5 years, only slowing down in more recent times. SSD though is going through a massive rate increase right now, and if industry insiders are to be believed we will see a 10 to 40 times increase over the next year or two in SSD. Also SSD is becoming a 3D technology and as such those observational rates will be meaningless.

Although it does have relevance for this discussion since we need to be mindful of the future potential growth in these areas.

Bandwidth is more difficult to apply any observational laws to, even though some have tried. I have been in telecommunications since the 70’s and there are different aspects to the industry, and different sections have seen different growth. This is partly for commercial reasons and the desire to milk a particular technology for all it’s worth rather than progress. But the technology is moving ahead much faster than storage and has been for a long time. The advantage is that packing more channels in a single “buried” cable is improving all the time and then each channel is improving. Even undersea cables (eg the southern cross) are upgraded from time to time with massive improvements. When the residential bandwidth usage starts increasing then we will see the industry move to improve their bandwidth.

While it might be difficult, I think it is of great importance. We cannot expect people to be providing a set size if we expect to have vaults in all sorts of devices so that people can use spare resources.

The 8 copies can be allocated to 8 vaults even though there are 30 or more vaults in a section. This would mean that various sizes of vaults is possible since you can have more of a mix-n-match situation. i.e. when a chunk is to be stored then you only need 8 vaults to have enough room for the chunk to be stored and a new vault might actually be one of those.

Maybe it might be better to have a “unit size” and vaults are made up of multiples of those units. This might make the allocation of which vaults are to store a chunk easier. In other words there is a minimum vault size, but it is still relatively modest. Maybe in terms of today, 2GB might be the “unit size”.

One way might be that a (sacrificial) chunk is stored with a crypto value unique to that vault and is used as salt for the chunk. That way it cannot grab the chunk from another vault and present it as if it came off their vault. A shared crypto value that requires the section to solve.


#14

Thanks @mav that now makes more sense.

I agree, but also there is a tension there as relocation could also be part of the ongoing resource proof. i.e. can the node handle this section it is moving to?

I think this is something that we actually do need to solve at some stage. It could be that infants only store the gossip graph, the 5-year-olds only add in handling client requests, then 6-year-olds add in mutable data mutations etc. (this is a simple, unworkable notion, but an idea similar to this could help a lot).

Again I agree to an extent, I suspect we all will with many aspects of this one. The issue with target-vault-size that I currently feel less comfortable with, is us setting a size. It may be currently unavoidable (for launch), but would love to think more on this part. The beauty of it is that gives us targets for resource proofs etc. that are more accurate. I wonder if a parallel thread would help with this part, as it is key to the safecoin RFC really, but maybe an RFC in its own right?

Cannot agree more here and expect this one to go on for several months with quiet periods, well for me anyway, as these are the really tough parts to get right, but will be seen as oh they just guessed and that part is simple :wink:

I wonder at times if this is not a situation where there is a multivariant existing algorithm though or if it really is separated concerns. I am attracted to a single measure that allows us to say the vault is good enough, even if that measure is continuous monitoring of a node’s ability to do as clients ask.


#15

One other thing I’m thinking a lot about (but making little headway) is ‘why not have tiny vaults’? I’m biased toward thinking we should aim to make vaults larger if possible, not smaller if possible, and they should grow with time instead of shrink, but that bias is not based on any hard facts, only on assumptions.

What are the real reasons not to have hundreds of 50 MB vaults per machine?

A network with many small vaults would have longer routes (which probably decreases end-user performance), but is it decreased by much? Maybe it’s not?

How much extra overhead is there to run many tiny vaults rather than one mega vault? In theory the number of chunks is the same so the overhead may not actually be that much more. Routes become longer but if they’re only 0.1% of workload and chunk delivery is 99.9% then you could run very long routes without much consequence.

The benefit is that relocations would be extremely fast, failure of a vault is almost meaningless to operators and the network, and disk consumption of vault operators can be very accurately controlled.

Not suggesting we try to make tiny vaults work, just trying to challenge the status quo and think from genuine first principles.


#16

What if vaults could come in a few fixed sizes? When a new vault is allowed to join, it would start out small (as the section decided, for example 16GB) and at each step it aged it would be allowed to double the size. This way, young vaults would not be given much responsibility but older and more reliable vaults could hold huge amounts of data.
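
In code the idea is a one-liner. The 16 GB starting size is the example figure above; treating age 1 as the starting age and the function name are assumptions.

```python
def max_vault_size_gb(age, start_gb=16, start_age=1):
    """Largest vault a node of this age may run: doubles with each step in age."""
    if age < start_age:
        raise ValueError("age is below the starting age")
    return start_gb * 2 ** (age - start_age)


for age in range(1, 8):
    print(age, max_vault_size_gb(age))
```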


#17

One is that for the home user the router box they have will most likely not handle many hundreds of vaults. Just the PARSEC + Vault functions will likely have a few open connections per vault/node and many routers today go slow with over 500 open connections and die at some higher value of connections.

Another is that there is caching being done by the node/vault and this equates to bandwidth usage.

Another while related to caching is the hopping of chunks through the node/vault and if you have hundreds then the bandwidth required for the home user is very much higher than for just one or few vaults/nodes.

And need I mention the messages between nodes in the section that each vault connects to, each of which uses connections and bandwidth and so adds to the load on open connections and bandwidth.

Just like the “unit size” I mentioned above, and good idea of using age to limit the vault size while young.


#18

Sorry, I didn’t get time to catch up on everything in the thread yet. I mentioned something like this on the weekly update thread as well but I think nobody noticed haha.

Exactly my thoughts about Safe vaults this morning when I had to restart my cable modem again. I left bittorrent on overnight and DNS packets stopped going through the modem. It does this all the time. We’re well into the 21st century and yet the state of consumer networking appliances is still hovering between “terrible” and “useless”.


#19

A non-binary way of thinking about it is turning it into a race. Nodes take turns requesting a chunk and whoever responds first gets a point. Keep the last N results and punish whoever gets much below the average win rate. Maybe analyze the results with a Dirichlet distribution for more sound statistical properties.
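
A bare-bones sketch of the race bookkeeping, leaving out the Dirichlet analysis. The window size, the "below half the average win rate" threshold and the class name are all made-up parameters for illustration.

```python
from collections import Counter, deque


class RaceTracker:
    """Keeps the winners of the last `window` races and flags slow nodes."""

    def __init__(self, nodes, window=100, threshold=0.5):
        self.nodes = list(nodes)
        self.results = deque(maxlen=window)  # oldest results drop off automatically
        self.threshold = threshold           # fraction of the average win count

    def record_winner(self, node):
        self.results.append(node)

    def nodes_to_punish(self):
        """Nodes whose wins fall below `threshold` times the average win count."""
        if not self.results:
            return []
        wins = Counter(self.results)
        average = len(self.results) / len(self.nodes)
        return sorted(n for n in self.nodes if wins[n] < self.threshold * average)


tracker = RaceTracker(["a", "b", "c", "d"])
for winner in ["a", "b", "c"] * 10 + ["d"] * 2:
    tracker.record_winner(winner)
print(tracker.nodes_to_punish())
```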

This would punish nodes with high latencies (maybe it would even push sections to be more geographically correlated? could be a problem) and it wouldn’t solve cheating either, but it is not so important as long as they are pushed to the fringes enough that network could keep functioning well.


#20

maybe it would even push sections to be more geographically correlated? could be a problem

Indeed. Forced, guaranteed geographical separation could be too difficult or not wanted (poor effort from me here). But the algorithm that ‘reviews’ the latency should at least try not to promote geographical correlation too much.