Next step of safecoin algorithm design


#161

Sorry I think I need to clarify. I don’t mean that large sections cause centralization as in lots of data centralizes into a single section. I mean large sections cause centralization as in only a few centralized people are able to run vaults so consensus becomes more centralized. My meaning was that large sections pose a risk to consensus and power centraliztion, not data centralization.

This poses a bit of a conundrum. Extremely cheap prices attract Jane Smith as an uploader but discourage Jane Smith as a vault operator because of the large vault sizes (which are required to have cheap prices). The pricing algorithm has a natural tendency for uploaders to say ‘I’m just going to use these sweet cheap uploads and let the big operators worry about the vaults so they can make my uploads even cheaper by getting even bigger’.

I’m sure some interesting analysis could be done about how hard it would be for a group of ‘communist’ vault operators to combat large vaults and keep sizes relatively small and consensus distributed…


One other change that may be helpful in achieving one of the general aims of rfc-0012: “the farming rate decreases as the network grows”

Currently the farm rate decreases as the section grows, since it depends on the size of TP and TS which are specific to each section.

I think a better way to capture whether the network has grown is to include section prefix length in the calculation of farm rate. That way the overall size of the network can be calculated which better achieves the goal.

To illustrate why this matters, consider two networks with very different sizes but the same farm rate:

10 sections and 100K:90K TP:TS chunks per section (rfc-0012 gives a farm rate 0.1)
vs
1000 sections and 1M:900K TP:TS chunks per section (rfc-0012 gives a farm rate 0.1)

The second network is overall 1000 times larger than the first (100 times more sections and 10 times more chunks per section) but has the exact same farm rate. So farm rate has not ‘decreased as the network grows’.

I think it’s a mistake to incentivise increasingly large sections. Including section prefix length would allow farm rate to decrease as the network grows without also needing sections to get large at the same time.


#162

But perhaps my reply would still have some application here. Consider these points.

  • Data being stored is expected to be in a fairly random distribution.
  • Thus sections should get a fairly even spread of chunks.
  • if using rfc0012
    • then if too many farmers join one section then yes price really low and farmers will pull out if they end up there.
      • and therein is the expected solution, farmers will be pulling out and rejoining to get a better section.
      • problem expected to solve itself
    • ELSE if we adopt your FD = TP or @JoeSmithJr’s idea then is there a problem of “centralisation”? The price will not be too small.

For that to be truly successful then the “big” operators have to be in a large majority of sections otherwise any uploads will either be marginally cheaper for large files or randomly cheaper for small files.

Also the large operators are also getting small rewards and from the other topic energy costs alone are not insignificant when trying to do scale. Add to that cost of operations (see drop box figures) will mean that very large operators will want bigger rewards than a home user just to cover costs. So a problem for the large supplier of vaults who would cover a large percentage of sections because they need to recover costs.

Actually this is in the RFC. coin scarcity, but maybe not enough for your purposes. After the initial growth period when the network starts to mature it is expected that the number of coins will be increasing and thus the farming reward success rate reduces proportionally to the number of coins existing. And since it is expected that the number of coins existing is increasing then the effective (not actual) farming rate decreases.

Thus compared to the actual farming rate the effective rate is

  • early 15% of coins exist EFR = 85% of FR
  • say 1 year 20% coins exits EFR = 80% of FR
  • say 5 Year 40% coins exit EFR = 60% of FR
  • say at 10 year 60% coins exist EFR = 40% of FR
  • say at 20 Year 80% coins exist EFR = 20% of FR

Yes currently the calculations assume that a section is representative of the whole network and thus the figures can come purely from the section.

I am not sure that the section prefix is much better since again it is assuming the section is representative of the whole network. For instance one section prefix maybe 20 long yet others who have not seen anywhere as much splitting might be 10 long. Maybe these are two sections at the extremes of the average prefix length. Pretty much the same sort of thing that happens with other variables of the section.

I am not so sure that a large section will just keep increasing

  • the section size increase due to spare space increasing since the storage of data is to be assumed fairly randomly distributed across all sections.
  • If the section grows due to more spare space then FR decreases discouraging farmers from remaining in the section (ie they just restart)
  • Node churning moves vaults around anyhow so would you ever get a section remaining so large its an issue?
  • Basically the larger the section, the more spare space it has the lower the FR and thus the lower the desire to remain farming. Thus a positive force reducing the section size.

#163

Nice catch! Something that isn’t obvious.


#164

Yeah this is a good point and is one of the variables I neglected.

But effective farm rate only affects coin rewards.

Farm rate is also used to set price, but there’s no ‘effective farm rate’ for pricing, only for farming rewards.

So the statement ‘farm rate decreases as the network grows’ can be reframed as
‘rewards are reduced as the network grows’ (due to effective farm rate)
but not as
‘prices are reduced as the network grows’ because there’s no equivalent ‘effective farm rate’ for price.

Maybe the role of Number Of Clients is intended to serve a similar purpose and create an ‘effective farm rate’ for storecost?


#165

Good point about effective FR (rewards) and about number of clients.

Suppose we need to keep in mind that sometimes simple solutions have some good effects without too much negative edge cases.


#166

These pointers are fantastic. I’m currently working on a kind of ‘vector map’ that shows the various forces at play within the parameter-space of rfc-0012. It will hopefully give some idea of how behaviour could play out and what motives and incentives push and pull in various directions within the economy. So thanks for the points they all help fill in the gaps here and there.


#167

General idea of activities that cause change to the farm rate and coins remaining:

The effect of each type of client and farmer activity can be summed together to give an overall magnitude and direction to the change (image below). This will fluctuate through time as spare storage and GETs and PUTs naturally fluctuate, so the arrow could point in any direction and vary between large/unsteady and small/steady.

I think there’ll be a natural tendency of farmers to always be pushing slightly toward the right, since it’s easier to have less spare storage and they’ll find it more desirable to have faster reward rate.

I think there’ll be a tendency of clients to always be pushing slightly toward the top since it’s easier to browse than to upload.

So I think the overall natural tendency of rfc-0012 will be toward the top right. Does that sound reasonable?

Farmers that are also uploaders are important to the network since they have incentive to push toward the bottom left (cheaper uploads and more coins remaining) which counteracts and balances the ‘natural’ or ‘lazy’ tendencies toward the top right. Hopefully most farmers are uploaders, but I’m sure some will be there just for the economic activity.


#168

This I think will be one of the incentives to farm, to get coin and because they realise the network needs farmers if it is to survive and maintain their data.

Honestly it is only a tendency and not as major as it might seem. Once the network is accepted as operational and safe to store their data then I expect that “Need” will become a driving force. When social media, blogs, forums, etc are being used on SAFE then the desire to PUT will have greater forces at play than the immediate desire to get PUTs as cheap as possible. People who need to upload (their cat vid to social media, their holiday photos for their circle of friends) will change that tendency and vault size, spare space, PUT cost will not be the immediate concern, but the need to store the data will be. Obviously if the price is outrageous then they won’t But I expect that the price will tend more to an acceptable cost due to the fact people will tend not to store when price rises (like in your diagrams) but that range of acceptable will be reasonably large for the majority since they store medium amounts and not like the ones “archiving the internet” will be.


#169

The ratio of client download vs upload (ie GETs vs PUTs) is significant since it gives an idea of the rate that safecoins are issued and spent.

What is a realistic GET:PUT ratio? I gathered some data to try to get some ballpark understanding.

safenetforum: GET:PUT is 16
youtube: GET:PUT is 70

For safenetforum this is calculated from the 30 latest topics as views/(posts+likes)

Likes is calculated using the average likes per post from the about page multiplied by the number of posts in each topic.

For youtube this is calculated for a popular music playlist of 10 songs as views/(comments+likes+dislikes)

Spreadsheet of data is here: views_per_content.ods

Some data will have a lower GET:PUT ratio, eg periodic backups.

Some data will have a higher GET:PUT ratio, eg pornography.

But it looks like a reasonable range of expectation for most public data would be between 10 to 100 GETs per PUTs.

I imagine this ratio will have significance for the farmrate and thus also for the storecost.

It may also have significance for optimizing vault caching and maximum network size.


#170

Except I will throw a spanner in this one and youtube also will have some of this.

When you view a topic there is one/more record updated on the server that keeps track of what you have seen and not seen. So in fact for each forum content GET done there is a GET for that usage record and an update of that usage record. So its more like 2 or 3 GETs to one PUT.

Youtube also keeps global and personal usage data so its not going to be 70 to one but lower.

The point is that even for SAFE sites they will likely keep some personal usage data so that your experience is better and since that data is your own (owned by you) and not shared then its not an issue for anonymity etc.

Where you can use that data is for files that are uploaded for others to use and do not have specific APPs for. How many files are popular to unpopular, how long are they popular etc. For this your youtube figure might be closer to the mark.

But then youtube automatically plays the next vid so how many devices are just playing vids and no one is really watching them after a while (ie wasted gets) whereas they would not download file after file automatically. So then maybe your youtube ratio is a very very high figure for category of file downloads (incl your suggestion of porn).

Then mix the different types of usage and its probably a lot better than those figures. Mainly because of the fact that good sites will store in your personal data some info about your site usage so that you can come back to the APP and know what you’ve read and what you have not etc etc.