Fixed target vault size

Fixed Target for Vault Size

What would be the benefits and drawbacks of having a fixed target for vault size?

This post aims to explore the idea. I’m personally mildly in favour of it over a floating vault size, but am really just looking to explore the topic.

The Idea

Vaults have a fixed target size; I’m going to use 50 GB throughout this post as the example target.

The target is not a strict boundary. It’s used as a yardstick by which other operations are performed, such as how to adjust storecost.

Why

Network size increases proportional to the amount of data stored.

This may not seem important, but consider the reverse of this, where network size has no relation (or very little relation) to the amount of data on it. The amount of overhead (mainly from routing messages) would depend on some other constraint of the network (eg maybe the amount of spare space, maybe the economic algorithm etc). There is probably a very long and complex conversation to be had about this point.

Joining and relocating has a fixed cost (in terms of bandwidth and storage, not necessarily dollars).

This is important because if vault size becomes very large then it restricts who can take part in the network. It could get to the point where joining or relocating is impossible unless you have an industry-grade network connection or disk storage. This might be acceptable or even desirable (mainly for end-user performance reasons) but I think running a vault should be possible for many users. One way to ensure this is by making the startup conditions achievable by setting a fixed size. The difficulty changes of bitcoin mining and constant increases in hash efficiency have made mining ‘industry only’; an unconstrained vault size (probably tending toward large vaults because of demand for ‘efficiency’) would lead to industry domination of farming in a similar way. Again, a very complex conversation could be had here.

It improves the network structure.

I’m not sure about this… Having a fixed vault size, like having a fixed chunk size, allows better understanding of the effect of events that affect the structure of the network. Having very large vaults compared to having very small vaults would require different responses when a relatively infrequent event happens, say, 10% of the network dropping out. Having a fixed vault size allows the potential responses to be better understood and accounted for.

It has a better ‘worst case scenario’.

The worst case scenario if bandwidth and storage continue to grow exponentially but vault size remains fixed is a bit like having a very small fixed vault size today, eg only 100 MB. The consequence is more routing overhead than the ‘optimum’ vault size that suits the current group of vault operators. I think this non-optimal-routing cost is ok, since a too-small-vault-size has the effect of expanding the potential set of vault operators. This also forces vault operators to focus on improving the routing efficiency (which is an open / common / communal problem) rather than vault efficiency (which is a closed / individual / private operator problem).

The consequence would be that individual operators would be required to run multiple vaults per machine. I think that’s better than having a floating vault size that leads to industrialised farming.

Fixed vault size suits the fixed chunk size better.

A fixed chunk size makes assumptions about the current state of networking and computing and storage mediums, but does not account for what may be ‘optimum’ in the future. This is an acceptable compromise, since many small chunks is not seen to be significantly problematic compared to few large chunks (possibly incorrectly). So the question is, if a fixed vault size is not desirable, why does the same not apply for a fixed chunk size?

I know there’s a pretty strong mindset here against magic numbers, and I agree on the most part with that. But I’ve never quite reconciled that with the ‘engineered’ magic numbers of, say, 1 MB chunk or 8 redundant copies or X section size etc. It’s a tricky situation no doubt about it.

It simplifies the economy.

The economy is loosely attached to the concepts of supply and demand of resources such as bandwidth and storage space. The ability to increase supply of network storage when vaults are 1 TB in size is much more constrained than if vaults are 1 GB in size. Fewer people can address the supply shortage for the large vault scenario. Adding flexibility in the supply improves the economics of the network.

It simplifies the membership rules.

Currently the proposed disallow rule for joining the network is based on the portion of full vaults. This is a very coarse measurement and during high stress events may not provide enough buffer. Alternative disallow rules based on the amount of spare space require pretty complex algorithms to measure the spare space. With a fixed vault size both these problems are greatly reduced. This is a bit of a brief explanation and there’s a lot of detail to dive into here, which may end up showing this perspective to be too simplistic. It’d be great to see more thoughts on this particular point.

Operational expectations.

Having many small vaults per operator requires them to approach vaults as ‘cattle’ rather than ‘pets’ (see this technology parable). This requires a certain degree of tooling and failure management from the start. Hopefully proprietary tooling and operations don’t become an advantage to large operators. But if we come to have a mix of pet vaults from small operators and cattle vaults from industry it might mean industry has a big advantage because they’re motivated to have superior tooling and failure modes than the pet level operators. Why not just make everyone work with cattle?

I admit I may be overdoing it here… There will always be areas for improving operational efficiency no matter if the network is pets or cattle or a mix. I think the topic of operations is a risk worth mentioning. It’s easy to take for granted how much work it is to run a bitcoin miner or safe vault. Lots of small vaults makes the operations even harder perhaps, but at least that difficulty is seen for what it is up-front rather than years down the track.

Proposed Mechanism

A network section allows a new node to join at any time, but only one after the other, not multiple at the same time (or maybe this could be some fixed number of simultaneous joins?). Any node trying to join while a new node is being accepted is turned away. This sets an approximate join rate depending on the bandwidth of the joining node (see modelling below). Note that the overall joining rate may be slower if potential new nodes decide not to join the network and the ‘join queue’ is empty.

StoreCost is adjusted every time a new vault is allowed to join, calculated by the difference between the average vault size and the target vault size. If the average is larger than the target size storecost goes up to try slowing down the upload rate. If the average is below the target size storecost is reduced as there is spare space to fill. If the average is close to the target, storecost remains the same. The exact amount of the adjustment is open for debate!
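To make the storecost rule above a bit more concrete, here is a minimal sketch. The 5% tolerance band and the 1% step are placeholders I made up for illustration (the exact amount of the adjustment is open for debate), and `adjust_storecost` is a hypothetical name, not anything from the vault code.

```python
TARGET_VAULT_SIZE_GB = 50.0  # example target used throughout the post
TOLERANCE = 0.05             # assumption: within 5% of target counts as "close"
STEP = 0.01                  # assumption: storecost moves 1% per join event


def adjust_storecost(storecost: float, vault_sizes_gb: list[float]) -> float:
    """Return the new storecost after a vault has been allowed to join."""
    average = sum(vault_sizes_gb) / len(vault_sizes_gb)
    deviation = (average - TARGET_VAULT_SIZE_GB) / TARGET_VAULT_SIZE_GB
    if deviation > TOLERANCE:
        # Vaults are fuller than the target: raise the cost to slow uploads.
        return storecost * (1 + STEP)
    if deviation < -TOLERANCE:
        # There is spare space to fill: lower the cost to encourage uploads.
        return storecost * (1 - STEP)
    # Close enough to the target: leave storecost alone.
    return storecost
```

For example, `adjust_storecost(1.0, [60, 62])` returns a slightly higher cost, while `adjust_storecost(1.0, [50, 51])` leaves it unchanged.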

Rewards are set by the rate of nodes being turned away. During the period while a new node is joining, the number of nodes turned away is counted. When the node has finished joining, the rate of join attempts is calculated. If there are more nodes trying to join now than previously, rewards are slightly decreased. If there are fewer nodes trying to join now than previously, rewards are slightly increased. If it’s about the same rate, rewards are kept the same. There’s a natural tension here for existing operators: they will naturally want to have more nodes join, but trying to join too aggressively will reduce the reward for their existing nodes. It also gives new operators the ability to be more aggressive in joining than existing operators.
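The reward rule can be sketched the same way. Again this is only an illustration: the 10% band for "about the same rate" and the 1% step are assumptions, and the function names are hypothetical.

```python
TOLERANCE = 0.10  # assumption: within 10% counts as "about the same rate"
STEP = 0.01       # assumption: rewards move 1% per completed join


def join_attempt_rate(turned_away: int, join_duration_hours: float) -> float:
    """Join attempts per hour, counted while one node was joining."""
    return turned_away / join_duration_hours


def adjust_reward(reward: float, previous_rate: float, current_rate: float) -> float:
    """Return the new reward once a joining node has finished joining."""
    if current_rate > previous_rate * (1 + TOLERANCE):
        # More nodes are queueing than before: rewards can come down a little.
        return reward * (1 - STEP)
    if current_rate < previous_rate * (1 - TOLERANCE):
        # Fewer nodes are queueing than before: nudge rewards up.
        return reward * (1 + STEP)
    # Roughly the same demand to join: keep rewards where they are.
    return reward
```

Note that nothing here tells real join attempts apart from cheap ones, so the sketch inherits whatever weakness the rule has to queue stuffing.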

In effect, rewards are intended to control the join queue size and storecost to control the upload rate. This is all sitting within a ‘temporal framing’ of the join rate, which is determined approximately by the average bandwidth of new nodes.

Open Questions

Does having a fixed target vault size affect the security of the network? Is a bigger network (more nodes) inherently more or less or equally secure than a small network? Is there an optimum network size for a given level of technological development?

Is there such a thing as ‘too small’ for vaults? Is this true always or only true once we have a certain level of technology?

What are the primary drivers for how a floating vault size evolves over time? Is it primarily economic? Technological? Evolving toward exclusion or inclusion? Evolving toward high performance or low performance?

What are the costs and benefits of a floating vault size and is it preferable compared to the costs and benefits of a fixed vault size?

Is there a good hardcoded value to use for a fixed vault size?

How does it affect the economic model for safecoin storecost and rewards? My first thoughts would be a) use reward amount to manage the queue size of vaults waiting to join (not too big not too small) b) use storecost to encourage more uploads but not too much that it’s stressful.

Modelling

Network Size

This modelling is based on 50 GB fixed vault size, 100 vaults per section, 8 copies of each chunk.

Small network, 1 PB of data (ie 8 PB of chunks), requires 160K nodes, 1600 sections, maximum 11 hops.

Medium network, 1 EB in size, requires 160M nodes, 1.6M sections, maximum 21 hops.

Large network, 1 ZB in size, requires 160B nodes, 1.6B sections, maximum 31 hops.

Those numbers seem reasonable to me. Maybe slightly on the high side but not a complete show stopper.
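For anyone who wants to replay the figures above, here is a small sketch. One assumption to flag: the post doesn't say how the hop counts were derived, so the code takes max hops as ceil(log2(number of sections)), which happens to reproduce 11, 21 and 31. `network_shape` is a made-up name.

```python
import math

VAULT_SIZE_BYTES = 50 * 10**9  # 50 GB fixed vault size
VAULTS_PER_SECTION = 100
COPIES = 8                     # redundant copies of each chunk


def network_shape(data_bytes: int) -> tuple[int, int, int]:
    """Return (nodes, sections, max_hops) for a given amount of unique data."""
    chunk_bytes = data_bytes * COPIES          # total stored, including copies
    nodes = chunk_bytes // VAULT_SIZE_BYTES
    sections = nodes // VAULTS_PER_SECTION
    # Assumption: worst-case hops grows as log2 of the number of sections.
    max_hops = math.ceil(math.log2(sections))
    return nodes, sections, max_hops


for label, size in [("1 PB", 10**15), ("1 EB", 10**18), ("1 ZB", 10**21)]:
    print(label, network_shape(size))
```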

Variance

Because of the random distribution of xornames for chunks, not all vaults will store the exact same amount of data.

This is covered in the topic Chunk distribution within sections.

For 50 GB average storage, expecting a reasonable max/min size ratio of 1.5, the worst variation would be about ±10 GB (ie 40 to 60 GB), with the majority of vaults being between about ±4 GB.

Joining Requirements

50 GB download time (a relevant link here is List of countries by Internet connection speeds)

| connection | download time |
|---|---|
| 1 Mbps | 5 days |
| 10 Mbps | 12 h |
| 100 Mbps | 1 h |
| 1 Gbps | 7 m |
| 10 Gbps | 43 s |
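The download times can be recomputed with a couple of lines. As far as I can tell the figures match 50 GiB (2^30 bytes per GiB) over decimal megabits per second, so that is what this sketch assumes; with 50 decimal GB the results come out about 7% lower. `download_seconds` is just an illustrative helper.

```python
GIB = 2**30  # assumption: the table's 50 GB is counted as 50 GiB


def download_seconds(size_gib: float, link_mbps: float) -> float:
    """Time in seconds to download size_gib over a link of link_mbps."""
    bits = size_gib * GIB * 8
    return bits / (link_mbps * 1_000_000)


for mbps in (1, 10, 100, 1_000, 10_000):
    seconds = download_seconds(50, mbps)
    print(f"{mbps} Mbps: {seconds:.0f} s = {seconds / 3600:.2f} h")
```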

Growth Rate

Consider a network with 1600 sections × 100 nodes per section × 50 GB per node = 8 PB of chunks or 1 PB of data.

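There can be up to one new node joining per section at a time. Rounding the 1600 sections above down to 1000 to keep the numbers simple, that is up to 1000 new nodes joining at once.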

They will all need to download 50 GB which on a, let’s say, 10 Mbps average connection, takes 12h each.

So storecost and reward rate is adjusted approximately twice a day in every section.

The network would grow at 50 GB × 1000 sections = 50 TB every 12h, or 100 TB per day, or 1.25% per day, or nearly 100-fold in a year.

This sounds ok to me. Maybe the growth rate is a little too high, but if the join queue reduces then the growth slows, so fastest growth of 1.25% per day is roughly in line with some ballpark intuition for reality. Would be nice to explore this growth rate under various conditions a bit more deeply.
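Here is the growth arithmetic as a sketch, so the inputs are easy to vary. It uses the same round numbers as above (1000 joining sections, 50 GB per join, two joins per section per day, 8 PB of chunks) and then simply compounds the daily rate for a year, which assumes the join queue never empties. The function name is hypothetical.

```python
def daily_growth_fraction(sections: int, vault_gb: float,
                          joins_per_section_per_day: float,
                          network_tb: float) -> float:
    """Fraction by which stored chunks can grow in one day."""
    added_tb = sections * vault_gb * joins_per_section_per_day / 1000
    return added_tb / network_tb


rate = daily_growth_fraction(1000, 50, 2, 8000)  # 100 TB added to 8000 TB
yearly_factor = (1 + rate) ** 365                # compounding every day

print(f"{rate:.2%} per day, roughly {yearly_factor:.0f}x per year")
```

Compounding 1.25% daily gives a factor of roughly 93, which is where "nearly 100-fold" comes from.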

The larger the network the faster it can grow (more sections means more simultaneous joining possible).

The faster the average network connection the faster the network can grow.

Summary

I think fixed vault size makes a lot of sense and would create a more sustainable network than a floating vault size. The fixed size means it’s easy to understand the joining requirements and it’s less likely to lead to centralization and exclusion due to very large vaults.

There are some additional overheads introduced as a result, but I think the cost is not prohibitive and is outweighed by the likely benefits.

One of the main benefits of a fixed vault size is it would force vault operators to focus on addressing routing efficiency (which is an open / common / communal problem) rather than vault efficiency (which is a closed / individual / private operator problem).

As I said at the start I’m only mildly in favour of this idea so would love to hear your thoughts. This long post makes it seem like a well-formed idea, but really I’m just wanting to discuss it and am open to all responses.

30 Likes

I’m not educated enough on the topic to probably give valuable feedback but this stood out to me.

I think anything that prioritizes speed/efficiency and the overall network rather than single operators is good. Obviously we need single operators, many many of them but farming algo takes care of them and if this helps store cost then that would simplify things on that end too I would reckon, yeah?

Also a huge plus and absolutely necessary for maximum security and redundancy of data imo.

It seems to me like it would make much more sense to know metrics as opposed to not knowing so the real question to me would be is there any benefit at all to a floating vault size?

4 Likes

Arbitrary limits are like good intentions… where do you draw the line.

If limit size, then limit speed, then limit connection… safe is for everyone, is a great maxim.

The network should prefer what works best overall… and at different times, different capability may well be useful.

I don’t know if for example larger nodes might be very useful for any recovery action relative to random kinds of damage. The ability to recover from worst case is most important, for a network that promises forever.

How the network views different nodes is for optimising but there should be no absolutes. :thinking:

Edit: just thinking the counterpoint to this is how the network chooses to distribute data; so, perhaps it might choose to distribute its principal copy as widely as possible. That in effect limits the size of each node’s hosting of principal data relative to the size of the network. Secondly, there are some suggestions that data will move towards where it is wanted, to save time on popular content, and that might flux rapidly, so wanting responsive nodes. Thirdly, archive nodes perhaps would be larger.

8 Likes

Some nice and interesting effects of these ideas.

A few thoughts that came up when reading:

Attacks on reward with botnets. Do not need to be capable machines just increase the queue size.

Correlate internet connection densities with country internet speeds, to get a rough estimate of network performance.

On max network growth:
(Still simplified, but more accurate estimate I’d say.)
Start 1k sections of 100 nodes (splitting occurring at 2x section size),
2 joining nodes per section per day for 100 days

1k sections, 100 days, then split
= +10k TB
2k sections, 100 days, split
= +20k TB
4k sections, 100 days, split
= +40k TB
8k sections, 65 days, split
= +52k TB

1 year = +122k TB, or 122 PB = 15.25 times network size at year start.

1525% growth (rather than the ~9300% estimate of extrapolating 1.25% per day)
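The stepwise estimate above can be replayed with a short sketch. It assumes 50 GB per joining node and takes the 15.25 ratio against the 8 PB starting size from the original post; `stepwise_growth_tb` is a made-up helper name.

```python
def stepwise_growth_tb(phases, vault_gb=50, joins_per_section_per_day=2):
    """Sum the data added over phases of (sections, days) between splits."""
    total_tb = 0.0
    for sections, days in phases:
        total_tb += sections * days * joins_per_section_per_day * vault_gb / 1000
    return total_tb


# (sections, days) for each phase; sections double at every split.
phases = [(1_000, 100), (2_000, 100), (4_000, 100), (8_000, 65)]

added_tb = stepwise_growth_tb(phases)
print(added_tb, added_tb / 8000)  # added TB, and ratio to an 8 PB start
```

The difference from the compounding estimate is that the joining capacity here only steps up when sections split, rather than growing a little every day.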

5 Likes

I like the simplicity of Storecost being influenced by the number of nodes seeking to join and being turned away, but as @oetyng points out, this leaves the network economics open to manipulation by botnets and ‘join farms’. Is there an example in other areas of technology where such an attack can be prevented without resorting to CAPTCHA’s etc.?

Secondly, why would managing multiple small vaults be harder than managing one big one? What are the processes that would add complexity?

5 Likes

I don’t know if it can be an issue, but what is the overhead of running N vault instances instead of one (CPU, RAM, open connections)?

I can see the benefits of fixed size vaults, but I feel like the fewer hardcoded numbers in the network the better for the future. I can imagine a network with fixed size vaults now, but there should be a way to change it if the future shows it’s needed.

5 Likes

i was assuming that vault scaling will be/needs to be provided by the original vault software in any case. cause of:

  • cache sharing between vault instances: there’s only one cache storage used by all vaults running on the same machine
  • cache/storage sharing between vaults: vaults use other vaults’ storage as cache. when vault A receives a request for dataset F00 and vault B is the owner of that dataset, A responds with F00 as if it was stored in A’s cache
  • share routing between vaults: A can pass messages to B if B’s xor-distance is lower than the distance of each routing entry in A’s routing table. (would this even work with reliable-message-delivery? i think it should cause there’s no need for the path to be deterministic for rmd to work?!)

3 Likes

Lately I was thinking about a similar system, using the Join Queue, for the StoreCost calculation. But I added another variable called the Dropout Ratio (the number of nodes that voluntarily* leave the network). Together with the Join Queue it could give, in an indirect but simple way, the health of the network, and so allow the optimal StoreCost to be calculated quite precisely.

Adding farm bots would only reduce the StoreCost so it does not seem economically beneficial. We would have to consider the possibility of dumping, trying to eliminate competition, although, if the node age determines the profit, it does not seem to be an attack that should concern us.

BTW, in this link we have more recent information about connection speed. Fixed Broadband has increased by 50% during the last year.

*Of course a node can leave the network for a multitude of reasons but I think we can safely assume that it does so because it is no longer cost-effective.

4 Likes

Really nice presentation on a challenging topic @mav. I very much like the idea of fixed vault sizes, but have trouble reconciling it with the different rates of progress in bandwidth vs. storage capacity and their relation to network efficiency. Stick with me for a moment while I consider a few “back of the envelope” numbers…

  1. Expected rate of storage capacity increases in the next 5 years is approximately 5X.
    https://hexus.net/tech/news/storage/123953-seagates-hdd-roadmap-teases-100tb-drives-2025/

  2. Expected rate increase for broadband in the next 5 years (assuming a previous 5 year trend continues) is 1.9X:
    https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white-paper-c11-741490.html#_Toc532256804

  3. Latency will essentially be the same (speed of light) over the next 5 years.
    https://www.igvita.com/2012/07/19/latency-the-new-web-performance-bottleneck/

Compound these changes out to 10 years and the picture changes even more in favor of ensuring that vault sizes are as large as feasible/possible with respect to bandwidth.

I think it’s important to look at the extremes for this discussion.

  • Case A : Single large vault per public IP address
    – Minimizes hops, minimizing routing complexity, minimizes communications overhead
    – Minimizes operator management complexity.
    – Maximizes “pain” if the vault goes offline unless the network allows it to rejoin and continue serving the chunks it once did.
    – Maximizes network logic required to allow a large vault to rejoin the network with the same chunks it had prior to going temporarily offline.
  • Case B : A subnet of up to 255 vaults per public IP address
    – Maximizes hops, maximizes routing complexity, maximizes communications overhead
    – Maximizes operator management complexity.
    – Minimizes “pain” if a single vault goes offline in the case that the vault data is flushed and downloaded again. However, any serious situation at the site (ex. extended power loss) will just as likely take the entire subnet of 255 vaults offline simultaneously.

So when considering “fixed” vault sizes, I think it is also important to consider the ratios between compute capability per vault and bandwidth capability per vault. And with that I think we may want to also consider bandwidth and compute when it comes to “fixed vault sizes”.

Bandwidth Considerations

The download time alone for your hypothetical 50GB vault doesn’t really matter in the grand scheme of things. Nothing stops big vaults from vacuuming up chunks at constant download bandwidth, non-stop, 24/7, for months to years. IMO the only real limit on vault size is when the rubber meets the road and the vault must prove its ability to serve the chunks in its possession at the “upload” bandwidth offered by the ISP. In other words, there is a point where if a vault is too large relative to its upload bandwidth, it won’t be able to service all its GET requests and is therefore too big for its own good. For example, assuming a 10:1 download vs. upload speed ratio, your 10Mbps example can serve about 1/8 of a chunk per second. Here’s a similar table to yours from that perspective, where the total vault upload bandwidth needs to match the aggregate client download bandwidth…

| down connection | up connection | 1 MB chunks per second (Cps) |
|---|---|---|
| 1 Mbps | 100 kbps | 0.0125 |
| 10 Mbps | 1 Mbps | 0.125 |
| 100 Mbps | 10 Mbps | 1.25 |
| 1 Gbps | 100 Mbps | 12.5 |
| 10 Gbps | 1 Gbps | 125 |
| 10 Gbps | 10 Gbps | 1250 |
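The chunks-per-second column is just upload bandwidth divided by chunk size. A quick sketch, assuming 1 MB means one million bytes and using a made-up helper name:

```python
def chunks_per_second(upload_mbps: float, chunk_mb: float = 1.0) -> float:
    """How many chunks per second a vault can serve at a given upload rate."""
    megabytes_per_second = upload_mbps / 8  # 8 bits per byte
    return megabytes_per_second / chunk_mb


for up in (0.1, 1, 10, 100, 1_000, 10_000):
    print(f"{up} Mbps up: {chunks_per_second(up):g} chunks/s")
```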

Another constraint on the system is data integrity. IMO the SAFE Network needs to periodically check to see if chunks are being well taken care of, even if no client has requested the data. This internal auditing is key to protecting against bit rot and vaults that cheat and lie. I am unaware if this is currently planned in the code, but it should be. To facilitate this, I would propose that the GET rate to vaults is maxed out 24/7 at whatever upload rate the vault can manage. Interspersed between client GET requests should be audit GET requests by the vault’s manager within the section. So the question becomes, “At what frequency should all the chunks in a vault be audited?”

Let’s assume for a moment that a 24 hour audit cycle is the maximum time allowed between testing a chunk for validity. In other words, every 24 hours all chunks within a vault are served to either a client or an auditor and verified at least once. The upper bound on vault sizes then becomes…

| down connection | up connection | max allowable vault size for 24-hr audit cycle |
|---|---|---|
| 1 Mbps | 100 kbps | 1 GB |
| 10 Mbps | 1 Mbps | 10 GB |
| 100 Mbps | 10 Mbps | 100 GB |
| 1 Gbps | 100 Mbps | 1 TB |
| 10 Gbps | 1 Gbps | 10 TB |
| 10 Gbps | 10 Gbps | 100 TB |
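The upper bounds above follow from asking how much data a vault can push out of its uplink in one audit cycle. A sketch under that assumption (the helper name is mine, and the whole uplink is assumed to be available for serving chunks):

```python
def max_vault_gb_for_audit(upload_mbps: float, cycle_hours: float = 24) -> float:
    """Largest vault (GB) whose every chunk can be served once per cycle."""
    bytes_per_second = upload_mbps * 1_000_000 / 8
    return bytes_per_second * cycle_hours * 3600 / 1e9


for up in (0.1, 1, 10, 100, 1_000, 10_000):
    print(f"{up} Mbps up: {max_vault_gb_for_audit(up):,.2f} GB")
```

The exact results are about 8% above the table values (1.08 GB rather than 1 GB at 100 kbps), so the table looks like it was rounded down to tidy numbers.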

There are likely some rather clever ways to optimize the audit process and significantly boost the allowable vault size under this scenario. For example, the vault manager could precompute a fingerprint for each chunk that consisted of a set of 256-bit hashes starting at random locations within the chunk. The destination vault can’t guess or cache this fingerprint and must be in possession of the chunk to return the correct sequence when audited. Instead of returning the whole chunk to the vault manager, only the 256-bit hash of the requested starting location needs to be returned, which yields a 96.8% decrease in audit bandwidth requirements. Under this scenario the entire set of chunks would only need to be passed to the vault manager roughly three times per year and could be queried for validity at 1 hour intervals. Presuming a 25% audit to 75% client GET ratio, and a 0.1 fingerprint size ratio (FS), I figure the upper bounds on vault size (VS) under different bandwidth (BW) scenarios become:

VS = (0.25 * (BW / 8 bits per byte) * 120 days * 24 hrs/day * 3600 s/hr) / (1 + FS)

where BW is the upload bandwidth in bits per second.

| down connection | up connection | max allowable vault size for optimized audit cycle |
|---|---|---|
| 1 Mbps | 100 kbps | 29.46 GB |
| 10 Mbps | 1 Mbps | 294.6 GB |
| 100 Mbps | 10 Mbps | 2.946 TB |
| 1 Gbps | 100 Mbps | 29.46 TB |
| 10 Gbps | 1 Gbps | 294.6 TB |
| 10 Gbps | 10 Gbps | 2946 TB |
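Here is the VS formula as a small function, in case anyone wants to play with the audit share, the full-pass interval or the fingerprint ratio. It assumes BW is the upload bandwidth, which is what reproduces the table; the function and parameter names are illustrative only.

```python
def optimized_max_vault_gb(upload_mbps: float,
                           audit_share: float = 0.25,
                           full_pass_days: float = 120,
                           fingerprint_ratio: float = 0.1) -> float:
    """Upper bound on vault size (GB) under the optimized audit scheme."""
    bytes_per_second = upload_mbps * 1_000_000 / 8
    seconds = full_pass_days * 24 * 3600
    audited_bytes = audit_share * bytes_per_second * seconds
    return audited_bytes / (1 + fingerprint_ratio) / 1e9


for up in (0.1, 1, 10, 100, 1_000, 10_000):
    print(f"{up} Mbps up: {optimized_max_vault_gb(up):,.1f} GB")
```

At 100 kbps this gives about 29.45 GB, within rounding of the table. Dividing the 10 Gbps result across 32 vaults gives about 92 TB each.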

(Thinking about this a little more, the audit procedure has the capability of naturally limiting the sizes of vaults until more bandwidth is made available. For example, if the vault is unable to return the audit requests according to the audit frequency, the network has a clear indication that uplink bandwidth, or cpu, or storage read/write speed is lacking and the vault should not be allowed to receive new chunks until it can keep up. This essentially serves, indirectly, as a type of ongoing resource proof.)

Processor Core Considerations

Each vault needs at least 1 processor core. However, not all cores are created equal. Perhaps this has the biggest effect on our ability to run multiple vaults on a single machine. Processor constraints place a typical lower bound on storage size for high bandwidth systems. For example, consider a server with 32 cores and a 10 Gbps uplink. Based on the above consideration we’d like to have each vault with about 92 TB of storage. In contrast a mobile user is looking at a single core with 1 Mbps and 256 GB. The bandwidth ratio for these systems is much higher than the processor core ratio. The only way to reconcile this is to oversubscribe vault processes to processor threads.

Given the above discussion, can we reconcile the various hardware capabilities to yield a standard fixed size vault?

Yes, I think so. All you need is to have something in the code that caps the vault bandwidth to a specific value. Once bandwidth is capped/throttled, a fixed vault size is determined by the reasoning above with regard to audit rate. In consideration of all these factors I would formally propose the following fixed vault properties.

  • Fixed vault specification 1.0
    – Bandwidth = 8 Mbps ( = 1 MB/s which is equivalent to 1 chunk per second IO).
    – 1+ processor cores (64 bit ) per vault .
    – 1 TB fixed vault size ( 1 Chunk = 1 Million Bytes, 1 Vault = 1 Million Chunks).

Note that for this specification we can achieve an audit overhead of 10.6%, which is rather respectable. Also, with current storage prices at around $18 per TB, the cost is not too unreasonable.

17 Likes

Thanks @mav for another very deep thinking exercise that you’ve both set out very understandably, and explored so comprehensively!

What stands out to me is the difficulty of coming to any conclusions with such a complex set of interoperable questions. I can read this, understand most of or all of the small points but frankly not feel I can grasp such a complex set of questions well enough to do more than listen to my gut, which has no conclusive remarks to make at this stage :slightly_smiling_face: Some things appeal, others make me scratch my head and others make me anxious.

One thing for clarification, it occurs to me you are talking about homogenous versus heterogenous vault size rather than fixed versus variable. We could for example imagine a network that can vary the homogenous vault size over time, so maybe the terminology should be adjusted?

My gut is now suggesting that this approach to such a complex problem is useful: thinking deeply and having discussions which can clarify, raise and hopefully answer some questions. Yet it may not be sensible to then try to engineer something this complex to such a fine degree. At least not without some underlying framework that tries to keep the system within acceptable limits, given the behaviour of such a complex system will be unpredictable. But maybe that is the real aim here anyway - to understand what limits can be set and how to keep the system in bounds rather than to control it in fine detail. Yes, I think that’s how I see this.

Anyway, thanks again Ian, I really appreciate the time and energy you give to thinking about these issues and how you involve the community in them. Now I have a headache, cheers! :crazy_face:

17 Likes

:clap: :clap: :clap:
From my perspective it is important to keep the bandwidth/vault size ratio below some limits, to allow fast loading times and make relocation possible in no more than a few days. Btw, the upload bandwidth even in quite industrially developed countries like Israel is only 10 Mbps for mobile and 14 Mbps for fixed internet connections. And a DL/UL ratio of 10/1 is still quite common.

If there is no barrier to becoming a farmer, the network will be bigger and safer. Also the price for users should remain low if farmers do not have high maintenance costs.

Finding the balance between no limits and some limits is the key.

3 Likes

Yes, this is an important distinction. A time varying homogeneous size is essentially a middle ground between fixed (forever) homogeneous and a completely arbitrary and heterogeneous floating vault size.

Again, I think it is important to stress what vault “size” really means since it’s a multi-valued property. A vault will have a certain size of storage capacity, number of processor cores, processor frequency, memory, chunk cache, AND bandwidth.

5 Likes

Great post, @mav!

I had wondered in the past what would be optimal and there are some interesting outcomes of this approach.

  1. Reduces complexity. It provides a single vector for scaling the amount of storage, which is the number of vaults hosted. This allows the network to optimise for this topology and reduces the logic paths to implement it.

  2. Considering 1 above, for the initial implementation it would make sense to try this first and see how it scales. More complexity can be added later if necessary.

  3. Simulating changes to the network become simpler as vaults cannot vary in size. This will help to further optimise performance in the future.

5 Likes

Very interesting topic and proposal. I think this can be considered not only for the long term but perhaps also for the very short term, for the first test-nets; as said, having “fixed” sized vaults could make things easier to understand and to check that they work as expected. I say “fixed” since I’m also imagining, like others here, that the size should not be hard-coded or adjusted on a time basis but perhaps based on other variables of the network and/or sections, like section size or section capacity; maybe network size should also influence such a number. The thing that comes to my mind though is, if this number varies, how will we manage reducing it if a vault is already storing that amount of data? Do we relocate data from that vault to others in the section so we can reduce such a dynamic size?..nice discussion again!

10 Likes

Maybe the size does not need to be fixed, but more like economically optimal at a certain size. We or the network decide somehow what the optimal vault size is. Make vaults with a certain preferred size earn more coins per GB than those with a larger size. Problem solved. Any edge case will not harm the network much, since vaults can have more than 50GB with a small penalty. It does not enforce anything but motivates towards such a preferred size. Which is IMHO much better.

Talking about an artificial optimum size has many drawbacks. If it is really necessary, then maybe the best approach is to use some statistics from the network’s past history. The network can measure its own vault size history, and try to keep the preferred size at a bounded median. Penalize those with too high or too low a size with a lower mining reward.
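One way this could look, purely as a sketch: take the preferred size as the median of observed vault sizes, clamped to some bounds, and scale the reward down linearly with distance from it. The bounds, the linear penalty and both function names are my own placeholders, nothing here is a worked-out proposal.

```python
from statistics import median


def preferred_size_gb(size_history_gb, lower=10.0, upper=1000.0):
    """Median of observed vault sizes, kept within assumed bounds."""
    return min(max(median(size_history_gb), lower), upper)


def reward_multiplier(vault_gb, preferred_gb, penalty=0.5):
    """1.0 at the preferred size, falling linearly as the vault deviates."""
    deviation = abs(vault_gb - preferred_gb) / preferred_gb
    # Assumption: a vault at double the preferred size earns half rate.
    return max(0.0, 1.0 - penalty * deviation)


preferred = preferred_size_gb([40, 50, 60])
print(preferred, reward_multiplier(55, preferred))
```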

6 Likes

Plus, add a factor by which a vault age can allow it to still earn the max ratio at a larger size, i.e. an infant vault earns more at the preferred size, but an elder can earn as much also with a larger size if the vault prefers to add capacity.

3 Likes

I’d say there is nothing wrong with fixing to a reasonable size until a point where it isn’t ideal. While we may want the network to scale optimally, having a sub-optimal working network trumps this, imo. Not only because it gives us something useable, but also because it allows us to learn from its limitations.

7 Likes

I presume mapping of which chunk belongs to which vault is easier when all vaults are same size. Would it also help if the size was 2^x number like 32 GB or 64 GB?

2 Likes

Let’s see if I can distill down what I’m hearing from comments:

  1. Set a max_vault_size hard limit, across all vaults.
  2. Have a rule that max_vault_size can never decrease.
  3. Have a rule that max_vault_size can be increased, but only by consensus of elders when certain conditions are met. Exact conditions: TBD.

1 Like

No, set the vault size to be fixed to the same value for every vault (ex. 1 TB). No min max range but an actual specific size that can be memmapped, allows for good optimizations, and allows for easy network decisions based on vault count…

No, have a rule that the vault must preallocate the storage to the designated fixed size (ex. 1 million 1MB chunks in a mem map). This satisfies an initial proof of resource/work. The chunks are all there from t=0, just initialized to null, analogous to formatting a HDD or SSD to a set size.

With a fixed vault size of 1 TB there is really no need for this. Vault sizes of 1 TB will likely be good for the next 15 to 25 years. At that point just do an overnight code update that transitions all vaults to a fixed size of 1 PB. (Slight exaggeration here but you get the idea.)

3 Likes