Massive scale maidsafe farm

Hi,

Coming from a Unix/storage/datacenter background, I could easily build a massive farm (if I had the hard drives, I could have a 72 TB storage box by tomorrow morning just with the stuff I have lying around my house).

So let’s do some dry shopping.

I’d use refurbished HP enterprise hardware and a bit of sorcery to glue it all together.

I’d rent a full rack in a datacenter (not that expensive in the Amsterdam region) and put in the following hardware:

Main storage node: Mr Fatboy

1x HP DL580 G5 (4 CPUs, a buttload of RAM, 10 or so PCIe slots). I’d put a couple of LSI 3801 SAS cards in it; each card can connect 8x HP MSA 60 (2U, 12 bays, max 2 TB per slot). Let’s say 2 cards, so I can hook up 16 drive trays supporting 192 TB of data (I’m just counting raw capacity without any redundancy). I’d put in a nice QLogic Fibre Channel card in target mode and hook it up to a SAN switch. Sounds expensive, but it really isn’t anymore if you use the older 4 Gbit hardware…

This node will provide Fibre Channel storage to a couple of pizzaboxes that will act as MaidSafe farming nodes, all using the same storage from the Fatboy mentioned above. Using thin provisioning, every pizzabox will have the full 192 TB at its disposal, and every pizzabox will connect directly to a couple of different carriers. Let’s say I use 4 pizzaboxes with 4 × 1 Gbps each, all to different carriers. To the untrained eye, it then seems there are 16 farms.
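A quick sanity check of those numbers, as a rough sketch. The constants come straight from the description above; the drive size is a parameter, because 192 bays only add up to 192 TB with 1 TB drives, while the stated maximum of 2 TB per slot would give 384 TB raw. One apparent farm per uplink is an assumption based on the "16 farms" remark.

```python
# Back-of-the-envelope numbers for the Fatboy setup described above.
SAS_CARDS = 2
TRAYS_PER_CARD = 8       # MSA 60 trays per SAS card, as stated in the post
BAYS_PER_TRAY = 12
PIZZABOXES = 4
UPLINKS_PER_BOX = 4      # 1 Gbps each, one per carrier
GBPS_PER_UPLINK = 1


def raw_capacity_tb(tb_per_drive):
    """Raw capacity in TB, without RAID or any other redundancy."""
    return SAS_CARDS * TRAYS_PER_CARD * BAYS_PER_TRAY * tb_per_drive


trays = SAS_CARDS * TRAYS_PER_CARD
bays = trays * BAYS_PER_TRAY
# Assumption: each uplink looks like a separate farm from the outside.
apparent_farms = PIZZABOXES * UPLINKS_PER_BOX
total_uplink_gbps = apparent_farms * GBPS_PER_UPLINK

print("trays:", trays, "bays:", bays)
print("raw TB with 1 TB drives:", raw_capacity_tb(1))
print("raw TB with 2 TB drives:", raw_capacity_tb(2))
print("apparent farms:", apparent_farms, "total uplink Gbps:", total_uplink_gbps)
```

Whatever the drive size, all of that capacity sits behind a single storage head, which is the point of the question that follows.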

So, farming away and away and away… everybody lives happily ever after (especially me), but then Mr Fatboy breaks and all my 16 virtual nodes die.

Could my greedy setup cause data loss on the MaidSafe network? All copies of the same data might end up stored on Mr Fatboy, since there’s actually no way of telling where the data physically resides.

And would a massive farm setup like this help or screw the MaidSafe network?

I hope I’m explaining this well…

3 Likes

I wish I could actually witness something like that… :smiley:

Why, want to see a grown man cry :smiley: ?

I hope it does.
Please try to do this when a beta version comes out.

1 Like

It’s all about probability. Anything is possible… but your setup would not likely have any effect.

It would help since it’s providing resources but you might be better off splitting the storage into many vaults.

2 Likes

That’s the beauty of the backend SAN storage. It will be many, many small vaults, but they’ll share the storage. Maybe even share bandwidth, if I choose not to peer them with x number of ISPs.

I can’t see any other way to make storage cheaper than Amazon S3 or a VPS/dedicated server of some kind. S3 will deliver, as long as you pay for what you use, but every VPS/dedicated server provider will cap your output in some way or charge you for the data transfer.

Even my internet provider at home will probably kick me if I congest my 3 Mbit uplink 24/7 (fair use policies and blahblah). I do live in the Netherlands, where fibre-to-the-home is getting more and more common, except in the tiny village I live in. I can get max 30 down, 3 up on a DSL line here…

If I understand correctly, your reputation in the network goes up if you can always deliver. So to monetize, I would need my babies to be busy all day long, with the best latency possible.

2 Likes

@ioptio is right, to elaborate slightly…

It depends on:
the amount of data stored on the network
the amount of storage on the network
the size of your FatBoy system in nodes and storage

To affect the network, FatBoy would need to amount to a large proportion of the data-storing nodes. So as the network grows, the size of such a service would have to be ever larger to pose a risk. I think the network will be big enough quite quickly for such systems to pose little risk, but obviously it could be vulnerable depending on how much storage comes online on day one, and whether such a service was able to gain a lead in terms of rank (which will affect how much data it is trusted to store).
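To put rough numbers on that dependence, here is a minimal sketch. It assumes every chunk is stored as a fixed number of copies on distinct vaults picked uniformly at random, which is a simplification: the real network places data by address and weighs rank, and the copy count of 4 used below is only an illustrative assumption. The function names are made up for this example.

```python
from math import comb


def p_all_copies_on_operator(total_vaults, operator_vaults, copies):
    """Chance that every copy of one chunk lands on a single operator's vaults.

    Assumes the copies go to distinct vaults chosen uniformly at random
    (hypergeometric draw without replacement).
    """
    if operator_vaults < copies:
        return 0.0
    return comb(operator_vaults, copies) / comb(total_vaults, copies)


def expected_chunks_at_risk(total_chunks, total_vaults, operator_vaults, copies):
    """Expected number of chunks whose copies all sit with that one operator."""
    return total_chunks * p_all_copies_on_operator(
        total_vaults, operator_vaults, copies
    )


# 16 apparent farms backed by one storage box, in networks of growing size.
for network_size in (100, 10_000, 1_000_000):
    p = p_all_copies_on_operator(network_size, 16, copies=4)
    print(f"{network_size:>9} vaults: p = {p:.3e}")
```

Under these assumptions the probability falls off roughly with the fourth power of the operator's share of vaults, which matches the intuition that such a farm matters less and less as the network grows.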

I don’t know enough to put numbers to any of this, but I agree with @janitor, I’d love to see you try and break it!

3 Likes

I have a good idea for the beta or testnet: run 300 nodes, gain some rank, and then take them offline and see what happens. That is, run 300 nodes storing 120 gigabytes, with 1 CPU and 512 gigs of RAM and 1.6 gigs of network throughput, each node running its own vault.

I wonder whether there will be any impact, and how small the network has to be before 300 nodes like this cause a problem when they go down.

Then try running 300 nodes in just one geographical location and doing the same thing.

And then try running 300 nodes in 4 different geographical locations and suddenly taking them offline.
We shall see.

I hope that these tests succeed in proving the resilience of the network (meaning that the tests demonstrate that the network continues to function even if this thing that I described was to occur)

3 Likes

Seems like this would be the equivalent of a 51% attack on Bitcoin.

Also, one should test what would occur if the connections between clusters of nodes were cut off for some time period and then brought back together.

Not sure if that is even realistic, but assume that all satellite links and intercontinental links were broken for some time period, then re-established.

Some time after typing the previous post, I’m not so sure even 300 nodes is enough to prove anything, except maybe in a testnet or during the beta. If the entire planet is joining this network, I don’t think it will be possible to establish enough centralization to interfere with the entire network. Even if I tried with 90,000 of these nodes (if only the guys upstairs would let me do that), I don’t think it would really make a difference if the entire population of the city were to run a node. So: my 90,000 nodes for a few hours vs. 3,000,000 static home computers and laptops… Consider this.

Plus all of the centralization resistance photons being built into the core. @dirvine

1 Like

We already simulate millions (billions in maths) of nodes to get answers. This week Fraser and I will be simulating some edge cases, with tests that run for several weeks and cover hundreds of millions of nodes. The real proof, though, will be when humans and their computers interact. That part we cannot even begin to simulate, which is a shame. We do go a bit mental though: at the moment there are three separate teams (2 university groups and us) simulating attack vectors, so it’s relentless, but it has to be. This is why the code quality is so, so important.

6 Likes

Is there any way I can assist? I can launch not hundreds of millions, but many tens of nodes in 4 separate geo locations on 3 continents. Otherwise I’ll keep mining Monero until human trials of the network. :stuck_out_tongue:

2 Likes

Absolutely. Initially we will produce some code you can run on a powerful simulator computer (check out MaidSafe-Common/address_space_tool.cc at next · maidsafe-archive/MaidSafe-Common · GitHub), which will be upgraded with many more tests. This is a fallacy test right now, but with 10,000 nodes and 3 consensus chains an attack of over 200,000,000 nodes is still not enough (be aware this test takes nearly a month to run).

In testnet2 all the nodes and churn (creep churn and sudden mass loss) we can throw at this will be great, @Viv and the visualiser will be analysing events and allowing us to monitor closely the effects. It’s hard otherwise as many attacks just disappear as the network swallows them up, so we have left out some security features to allow us to attack in other ways.

2 Likes

In this case I could only run 25 nodes in each of 4 places on 3 continents for a whole month without asking for more. I could set this up right now with some guidance; does this help?

It will for sure, best in testnet2 though. You can play around with the simulations right now, though. They are based on a Kademlia/XOR-like routing table but do not use our nodes, for speed. It’s very interesting actually and will form the basis of an IEEE paper very soon.
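For anyone unfamiliar with the XOR part, here is a minimal sketch of the general Kademlia-style idea: the nodes responsible for an address are the ones whose IDs are closest to it by XOR distance. This is not the MaidSafe simulator code; the ID derivation, the group size of 4, the node labels and all function names are assumptions made up for illustration.

```python
import hashlib


def make_id(label):
    """Hypothetical helper: derive a 256-bit ID by hashing a text label."""
    return int.from_bytes(hashlib.sha256(label.encode()).digest(), "big")


def xor_distance(a, b):
    """Kademlia-style distance between two IDs."""
    return a ^ b


def close_group(target, node_ids, k):
    """Return the k node IDs closest to target by XOR distance."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


# 1000 independent nodes plus 16 nodes belonging to one big farm.
honest = {make_id(f"home-{i}") for i in range(1000)}
farm = {make_id(f"fatboy-{i}") for i in range(16)}

chunk_address = make_id("some chunk of data")
group = close_group(chunk_address, honest | farm, k=4)
farm_members = sum(1 for n in group if n in farm)
print(f"{farm_members} of {len(group)} close-group members belong to the farm")
```

Because the IDs are effectively random, a farm cannot choose which addresses it ends up close to, so in this toy model its share of any one close group tends to track its share of the whole network.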

1 Like

An event will take place 11 days from now at Rackspace http://www.rackspace.com/
with MaidSafe Technology as the VIP. I expect many additional Rackspacers to be there, and I hope to gain the support needed to more easily expand resources towards this endeavor.

Here is the event planning document:
https://pad.riseup.net/p/maidsafe_wwww

2 Likes

Brilliant news. If there’s any help we can provide, please shout.

2 Likes

An hour with you to go over the technical presentation would help, as I am gathering everything I’ve reviewed regarding MaidSafe over the past month. That is literally 60% or more of my life, about 350 hours :smiley:, spent studying things I’ve never heard of before or seen or ever considered or knew could be valuable to humanity!

4 Likes

There are a bunch of video FAQs being published now (@ioptio and @frabrunelle did them, with @Shona editing) that will help. Then let’s jump on Skype or some other snooping network and let everyone hear :smiley:

4 Likes

Feel free to edit the planning doc with any tasks you feel are important. I’ve a couple of hours before some supporters arrive to meet with me.
So after the weekend will be the best time, or even Sunday evening. I work ultra late and as much as I can handle, so time differences are not much of a factor here.

1 Like