Safecoin possible attack vector

There are a few ifs here, but with several vaults (and my understanding of the system which may be flawed) I believe this could be pulled off.

(Assertions)
*GETs on SAFE are free.
*GETs generate a potential income for the first to respond with the correct data.
*Chunks are put in cache once received.
*Users can see what chunks they hold.
*Chunks are vulnerable to a known-plaintext attack (talking public data).
*Chunks are requested from the closest data managers in XOR space.
(/assertions)

It seems that one could compile a list of all known data that they hold (by scraping public data, converting it to chunks, and comparing those to what they hold in their vaults) and then request that data from different places all around the world (thinking cheap droplet servers/botnets). If they also have vaults stashed all over the globe, and a list of the chunks each holds, it seems they could generate a nice profit for themselves, limited only by their bandwidth. If there are only 4 copies of each chunk, you have a 1 in 4 chance (plus cache) of getting your own.

You could also do the same by having a large private store of data and making the request as soon as the cache has been replaced (depending on how fast data moves, maybe once a day?), so you would always have a 1 in 4 chance of generating income/playing the safecoin-generating lottery, assuming you know you have at least 1 chunk of your own data on a vault you control. If not, back to obscure public data (less likely to be cached).

Thoughts?

The testnet will be small enough to try this on :slight_smile: It’s like the Google attack or similar that we have gone over previously. I cannot see it working myself, especially as the network grows. The thing with decentralised or distributed approaches is that we tend to think linearly: grab all copies, etc. If the network is of any reasonable size, the data you will get close to will be tiny, I would imagine, and then you need to figure out how to make sure you provide it when asked.

It boils down to getting a huge % of vaults really, the bigger the more money you will make, but that’s good as long as you behave :slight_smile:

Well, the way I envisioned it happening, you wouldn’t need a huge number of vaults, just the ability to correlate what’s on your own vault with known data. More vaults would just make it more profitable. I mentioned the requests from all over the world, but that wouldn’t affect XOR space, so nix that. Just a bunch of clients with different IDs so their close groups are spread out in XOR space.
If I know what data my vaults have, be it my own data or public data (known because of the plaintext attack), I can request that data at no cost to myself and have a 1/4 chance of attempting a safecoin creation.

Not sure this invalidates your scenario, but consider this:

You can’t request a chunk you hold because you don’t know what its XOR chunk ID is. The local chunk ID that you see in your vault is different from the XOR chunk ID a user needs to request the chunk. Only your Data Manager knows the real XOR chunk ID. It hashes it before giving it to you and keeps a dictionary mapping the real ID to your local one.
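
A minimal sketch of that mapping idea, in Python. The post only says the real ID is hashed and a dictionary is kept; the class name, the per-manager `secret`, and the choice of SHA-512 here are assumptions for illustration, not the actual vault code.

```python
import hashlib


class DataManagerSketch:
    """Hypothetical stand-in for a Data Manager that hides real chunk IDs."""

    def __init__(self, secret: bytes):
        # Assumption: the hash is keyed with something the vault never sees,
        # so the vault cannot simply re-hash a known chunk name.
        self._secret = secret
        self._local_to_real = {}

    def local_id_for(self, real_id: bytes) -> str:
        """Return the ID handed to the vault and remember the mapping."""
        local_id = hashlib.sha512(self._secret + real_id).hexdigest()
        self._local_to_real[local_id] = real_id
        return local_id

    def real_id_for(self, local_id: str) -> bytes:
        """Only the manager can translate a local ID back to the real one."""
        return self._local_to_real[local_id]
```

Under these assumptions the vault only ever stores `local_id`, so matching a scraped chunk name against the vault's file listing would not work directly.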

You can find more info in this discussion.

Maybe an example is the best way. I wouldn’t need to know its XOR ID, just its name and that I have it.

I go to Wikipedia and find the longest but least accessed page. More data, more chance I get a piece but less likely to be cached somewhere.

I pull down the file, chunk it locally and grab all the filenames (SHA-512 of content). I’ll see those filenames locally on my vault with a file manager (I believe). So I write a script that, whenever one of those chunk names shows up on my vault, generates a boatload of clients spread out through XOR space to request the full file. There’s a good chance one of those will get pulled off my vault.
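
A rough sketch of the matching step, in Python. It assumes, as this post does, that a chunk's name is the SHA-512 of its content and that those names show up as filenames in a vault directory; the real network's chunking and vault naming may well differ. `CHUNK_SIZE`, `chunk_names` and `held_chunks` are made-up names for illustration.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # assumed fixed chunk size, purely illustrative


def chunk_names(data: bytes, chunk_size: int = CHUNK_SIZE) -> set:
    """Split data into fixed-size pieces and name each by its SHA-512 hex digest."""
    return {
        hashlib.sha512(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


def held_chunks(known_names: set, vault_dir: Path) -> set:
    """Return the known chunk names that appear as filenames in vault_dir."""
    if not vault_dir.is_dir():
        return set()
    on_disk = {p.name for p in vault_dir.iterdir() if p.is_file()}
    return known_names & on_disk
```

The script would call `held_chunks` periodically and only fire off requests when the result is non-empty.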

As I browse the network and grow my list of filenames, it’s no skin off my back to request any known file I have a chunk for, as it costs me nothing.

I see. Not sure it would work well when the network gets big and your vaults represent only a microscopic share of all available space.

But if on day one someone invested in a massive amount of HDD space, he might be able to control a dangerous share of the network and might be able to game it. It wouldn’t last long, but it could do some serious damage.

I’m starting to think that maybe the safest way to go about it would be to delay the activation of Safecoin rewards until the network reaches a certain threshold, to avoid any day-zero attack like that.

Thoughts?

Safety is in numbers, not too many but for sure the more the better.

In the early days, it wouldn’t be too hard to have a large portion of the network under your control. That’s bad, but for more reasons than just making some extra safecoin. It can lead to fraud and lost data, among other things.

What I’m proposing will work just as well with the few TB I plan to bring online as for someone who has a few PB, regardless of the size of the network. A chunk stored in only 4 locations means you’ll always have a 1/4 chance of fulfilling a non-cached request, regardless of network size. It scales well: more chances while holding more data, but it would work just as effectively (though not as often) for a small-timer as for a big guy. You don’t need much storage; you just need to know when you have a known file’s chunk, then ask for that file.

Think of an ant colony with 6 ants in it. Easily defeated. Start adding ants and it gets stronger, though not linearly. It’s the basis of all we do really.

That’s the hard part when the network gets massive. Your chance of finding a known chunk diminishes as the network grows. Also, the effort (CPU, bandwidth, electricity cost) you spend building your list of known chunks pays off less and less as the network gets bigger, because each file you scan has a smaller chance of landing in your vaults.

If someone has the time and some math skills, it would be interesting to see the math behind this.

EDIT: Btw I’m not saying it’s an invalid attack vector, but I suspect it doesn’t scale well.
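
A back-of-the-envelope model of that scaling, in Python. It assumes chunks are placed uniformly at random, so the share of a vault's chunks that the attacker recognises equals the share of all network chunks on their scraped list; the function and parameter names are hypothetical and the real placement rules may behave differently.

```python
def expected_known_chunks_held(vault_chunks: int, known_chunks: int,
                               total_chunks: int) -> float:
    """Expected number of chunks in the attacker's vaults that are on their list.

    vault_chunks: chunks the attacker's vaults store
    known_chunks: distinct network chunks the attacker has scraped and hashed
    total_chunks: distinct chunks stored on the whole network
    """
    return vault_chunks * known_chunks / total_chunks


def expected_wins(vault_chunks: int, known_chunks: int, total_chunks: int,
                  copies: int = 4) -> float:
    """Expected requests served from the attacker's own vault per round,
    requesting each recognised chunk once with a 1-in-`copies` chance each."""
    held = expected_known_chunks_held(vault_chunks, known_chunks, total_chunks)
    return held / copies
```

With the scraped list held fixed, multiplying `total_chunks` by 100 divides the expected wins by 100, which matches the intuition that the scraping effort has to grow with the amount of data on the network.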

Ah, I see the points you guys were trying to make. We were using “size of the network” as two different things. I meant nodes, you meant data.

I concede your point. My idea will indeed scale with nodes, but not data. Severe diminishing returns as data starts to pile up.

You can see that every address sees its own network in XOR space; no two nodes or pieces of data see the same network. All paths are different lengths (distance metrics) from every other point in the XOR space. Very confusing, but a very powerful effect.
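
A tiny Python illustration of that effect, using 4-bit addresses. The XOR metric itself is standard; the node list and the helper name `closest` are made up for the example.

```python
def xor_distance(a: int, b: int) -> int:
    """Distance between two addresses in XOR space."""
    return a ^ b


def closest(target: int, nodes: list, k: int = 2) -> list:
    """Return the k nodes nearest to target under the XOR metric."""
    return sorted(nodes, key=lambda n: xor_distance(n, target))[:k]


nodes = [0b0001, 0b0100, 0b0111, 0b1000, 0b1110]

# The same set of nodes looks completely different from two addresses.
print(closest(0b0000, nodes))  # [1, 4]
print(closest(0b1111, nodes))  # [14, 8]
```

Each address gets its own ordering of the whole network, so the close group for one chunk tells you little about the close group for another.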

I think this is still a valid attack, but not a serious concern.

Network size makes it harder to know which chunks you hold so you can request them, but if you manage to grab a few early (while the network is small), you could in theory live off them for a while, until they were moved to another vault.

Not likely worth the effort I think, because as pointed out, it becomes very hard to get a foothold once the network becomes large.

To mitigate this, why not have a mass installation event? Have a countdown on maidsafe.net and have the client ready to download.

That may help slightly, but it still wouldn’t prevent someone from starting to scrape data as soon as data starts getting put on the network. First public file they see, start seeing if they have any of those chunks, or chunks of their own private data.

I don’t know what the value of SafeCoin will be early on but if it’s near null reward to hosts, then it wouldn’t be worthwhile to leech before it became too hard. Perhaps one option would be to gear the reward relative to the network size… give it some logarithmic warp. That would have some effect on SafeCoin early on but would prevent such a leech taking advantage.
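
One way the "logarithmic warp" could look, sketched in Python. The formula, the `target_size` threshold and the function name are all assumptions for illustration; nothing here reflects how SafeCoin rewards are actually computed.

```python
import math


def scaled_reward(base_reward: float, network_size: int, target_size: int) -> float:
    """Scale the reward up logarithmically until the network reaches target_size.

    network_size: current number of vaults (assumed to be observable)
    target_size:  hypothetical size at which the full reward applies
    """
    if network_size < 0 or target_size < 1:
        raise ValueError("network_size must be >= 0 and target_size >= 1")
    factor = math.log(1 + network_size) / math.log(1 + target_size)
    return base_reward * min(1.0, factor)
```

With a target of a million vaults, a 100-vault network would pay roughly a third of the full reward under this sketch, so early leeching earns less while honest early farmers are still paid something.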

One way to make this harder would be to not enable public files for a time.

That’s true, but why bother? If public files are going to be attacked, why not earlier rather than later?
When, precisely, would you enable public files?
Why would you want to force farmers to sell their capacity for less?

Earlier people were saying that this would be a viable attack mainly before the network starts to scale in terms of data.

So the response was: if the attack only works until a certain scale of PUT data is reached, then just don’t enable public files until then.

Right, so the best way is to get enough farmers quickly. And you do that by paying well, which is done by allowing public data right away, not by restricting demand for SAFE.