That was a nice set of approximations to see played out, thanks mav.
A more recent set of figures can be found in IDC’s Data Age 2025 study, sponsored by Seagate, April 2017. They estimate that the current total amount of digital data is somewhere around 25ZB and will go to 160ZB in 2025. It will probably go higher. I wonder if these figures include redundancy and data duplication? Probably no to redundancy and yes to duplication. So 8x redundancy coupled with an 8x reduction due to deduplication roughly cancel each other out, which probably makes this a reasonable estimate.
jlpell’s other guesstimations:
Most dedicated desktop users who want to get involved will throw 500GB to 10TB at SAFE on launch. Prosumers will throw about 10TB to 100TB, and mobile, timid, or just curious users will be at about 10GB to 100GB. Business ventures will be in PB. Depending on timelines, these numbers might be higher by a factor of 2. I don’t think storage will be the issue, and the surplus storage will allow for extra redundancy to help spread out bandwidth load. My hypothesis is that working with ISPs and forming new ones or mesh networks in order to get low latency and stable connectivity in an evolving regulatory landscape will be more difficult as popularity rises.
Yes, from a basic user’s point of view no one wants to sit and wait for a 500GB download. I think about 1 hour is a psychological limit for most, before they need to start seeing some kind of safecoin flow their way no matter how small. Current typical broadband speeds allow for a single 10GB vault to be filled in about an hour. Multiple 10GB vaults could be run in series to fill up a 1TB drive, but there are limits to going the multi-vault route based on the number of processor cores and computational requirements for each node process.
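For anyone who wants to check the arithmetic behind the one-hour figure, here is a rough sketch. It assumes decimal units (1GB = 10^9 bytes), a link that is fully saturated for the whole transfer, and no protocol overhead; the function names are just made up for illustration.

```python
def required_mbps(vault_gb: float, hours: float) -> float:
    """Sustained link speed (megabits/s) needed to fill vault_gb in the given hours."""
    bits = vault_gb * 8e9              # decimal gigabytes -> bits
    return bits / (hours * 3600) / 1e6


def fill_time_hours(vault_gb: float, link_mbps: float) -> float:
    """Hours needed to fill vault_gb over a link running at link_mbps megabits/s."""
    bits = vault_gb * 8e9              # decimal gigabytes -> bits
    return bits / (link_mbps * 1e6) / 3600


# Speed implied by "a 10GB vault in about an hour".
speed = required_mbps(10, 1)
print(round(speed, 1))                 # 22.2 (Mbps)

# At that same speed, a single 500GB vault would take about 50 hours.
print(round(fill_time_hours(500, speed)))   # 50
```

So the one-hour claim works out to roughly 22 Mbps sustained, and at that same rate a 500GB vault would be more than two days of continuous downloading.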
Tiered groups ranked by performance level might alleviate this, i.e. smaller groups of higher performance nodes vs. large groups of lower performance nodes. This keeps Google competing with Amazon, and us competing with each other. Kind of like weight classes in sumo wrestling. Computation is going to need nodes that are clustered by performance level anyway, although nothing says that computation nodes and storage nodes need to coincide on the same machine.
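To make the weight class idea a bit more concrete, here is a toy sketch of bucketing nodes into tiers by a single performance score. Everything in it is hypothetical: the score, the tier boundaries, and the node names are invented for illustration, and nothing here reflects how the network actually forms groups.

```python
from bisect import bisect_right
from typing import Dict, List


def assign_tiers(scores: Dict[str, float], boundaries: List[float]) -> Dict[int, List[str]]:
    """Group node names by tier; tier 0 is the lowest weight class.

    `boundaries` must be sorted ascending. A node whose score is >= a
    boundary moves up into the next tier. (Hypothetical scheme.)
    """
    tiers: Dict[int, List[str]] = {}
    for name, score in scores.items():
        tier = bisect_right(boundaries, score)   # number of boundaries <= score
        tiers.setdefault(tier, []).append(name)
    return tiers


# Hypothetical performance scores; the units are arbitrary.
nodes = {"phone": 5, "desktop": 50, "prosumer_nas": 400, "datacenter": 5000}
print(assign_tiers(nodes, boundaries=[10, 1000]))
# {0: ['phone'], 1: ['desktop', 'prosumer_nas'], 2: ['datacenter']}
```

With boundaries like these, nodes would only compete with others in their own tier, which is the point of the sumo comparison.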