Randomness and Clustering

Hi all, does anyone know what factors influenced the decision on how much redundancy gets built into network resources and storage? I was thinking about the nature of randomness, and how the elements of a truly random set will typically exhibit clustering across the dimensions of the set. Wikipedia has a good explanation…


The illusion here is that clustered data looks as if it isn’t truly random. In fact, clustering does occur in truly random data, and to avoid it the distribution would have to be artificially managed, which would make it LESS random.
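To make that concrete, here is a minimal sketch that drops items into bins uniformly at random and reports how uneven the result is. The function name `bin_counts` and the sizes (1000 items, 100 bins) are just my own choices for illustration and have nothing to do with any particular network.

```python
import random
from collections import Counter


def bin_counts(n_items, n_bins, rng):
    """Place n_items into n_bins uniformly at random; return the count per bin."""
    counts = Counter(rng.randrange(n_bins) for _ in range(n_items))
    # Counter returns 0 for bins that never received an item.
    return [counts[b] for b in range(n_bins)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    counts = bin_counts(1000, 100, rng)
    print("mean per bin:", sum(counts) / len(counts))
    print("fullest bin: ", max(counts))
    print("emptiest bin:", min(counts))
```

With a perfectly even spread every bin would hold exactly 10 items. A uniform random placement will generally leave some bins noticeably fuller and others noticeably emptier than that, which is the clustering being described.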

As I understand it, SAFENet uses XOR to assign responsibility for data chunks to farmers, so that the distribution is effectively random but remains deterministic. I’d guess that the number of copies should be set high enough to keep the probability of accidental clustering extremely low? I haven’t looked into it further yet, but I’ll add anything useful I find here.
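Here is a rough sketch of how that guess could be checked. It is not SAFENet's actual code, and everything in it is an assumption for illustration: the 32-bit IDs, the idea of "failure groups" (nodes that might go down together, e.g. by sharing a location), and the names `closest_nodes`, `same_group_probability` and `simulate`. Each chunk is held by the k nodes whose IDs are closest to the chunk ID by XOR, and a chunk counts as "clustered" if all k holders fall in the same group. If copies land in groups independently and uniformly, the chance of that is G * (1/G)^k = G^(1-k) for G equal-sized groups.

```python
import heapq
import random

ID_BITS = 32  # hypothetical ID width, kept small for illustration


def closest_nodes(chunk_id, node_ids, k):
    """Return the k node IDs closest to chunk_id by XOR distance, nearest first."""
    return heapq.nsmallest(k, node_ids, key=lambda n: n ^ chunk_id)


def same_group_probability(n_groups, k):
    """Chance that k independent, uniformly placed copies all share one group."""
    return n_groups ** (1 - k)


def simulate(n_nodes, n_groups, k, n_chunks, seed=0):
    """Estimate the fraction of chunks whose k holders all sit in one group."""
    rng = random.Random(seed)
    node_ids = rng.sample(range(2 ** ID_BITS), n_nodes)
    # Assumption: each node belongs to one randomly chosen failure group.
    group_of = {n: rng.randrange(n_groups) for n in node_ids}
    clustered = 0
    for _ in range(n_chunks):
        chunk_id = rng.randrange(2 ** ID_BITS)
        holders = closest_nodes(chunk_id, node_ids, k)
        if len({group_of[n] for n in holders}) == 1:
            clustered += 1
    return clustered / n_chunks


if __name__ == "__main__":
    for k in (2, 3, 4, 6):
        est = simulate(n_nodes=500, n_groups=10, k=k, n_chunks=2000)
        print(k, "copies: simulated", est, "formula", same_group_probability(10, k))
```

Under these assumptions each extra copy divides the chance of all copies clustering by the number of groups, so the probability drops off quickly as k grows. How close a real network gets to that depends on how independent the holders actually are, which this sketch does not model.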