Low-level filesystem integration

I mentioned it here. I think it’s a pretty decent idea, if somebody is willing to pick it up :heart_eyes_cat:

One could modify a filesystem (or the Linux VFS itself, if it knows which physical areas are used and which are vacant) to lend its free blocks as low-level opportunistic storage space for SAFE.

If we needed to reclaim some space (e.g. for a new file), the FS would just do its job, but it would also compile a list of the chunk copies that got deleted in the process and report them to the storage manager at the first opportune moment.
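Just to make the idea concrete, here's a rough sketch of the bookkeeping such a hook might do. This is purely illustrative Python, not actual SAFE code: the `report_lost_chunks` callback, the chunk/block mapping and every name here are my own assumptions.

```python
from collections import defaultdict

class OpportunisticChunkStore:
    """Parks chunk copies on blocks the filesystem currently doesn't use."""

    def __init__(self, report_lost_chunks):
        # report_lost_chunks: assumed callback that tells the storage manager
        # which chunk copies no longer exist locally.
        self.report_lost_chunks = report_lost_chunks
        self.block_to_chunk = {}                  # free block -> chunk id parked on it
        self.chunk_to_blocks = defaultdict(set)   # chunk id -> blocks holding it
        self.pending_lost = set()

    def park_chunk(self, chunk_id, free_blocks):
        """Store a chunk copy on blocks the FS is not using right now."""
        for b in free_blocks:
            self.block_to_chunk[b] = chunk_id
            self.chunk_to_blocks[chunk_id].add(b)

    def fs_reclaimed(self, blocks):
        """Called by the FS after it overwrote these blocks with real file data."""
        for b in blocks:
            chunk_id = self.block_to_chunk.pop(b, None)
            if chunk_id is None:
                continue
            # Losing any block ruins the whole chunk copy: forget its other
            # blocks and mark the chunk as lost.
            for other in self.chunk_to_blocks.pop(chunk_id, set()):
                self.block_to_chunk.pop(other, None)
            self.pending_lost.add(chunk_id)

    def flush_reports(self):
        """At the first opportune moment, tell the storage manager what was lost."""
        if self.pending_lost:
            self.report_lost_chunks(sorted(self.pending_lost))
            self.pending_lost.clear()

if __name__ == "__main__":
    store = OpportunisticChunkStore(report_lost_chunks=print)
    store.park_chunk("chunk-abc", free_blocks=[100, 101, 102])
    store.fs_reclaimed([101])   # the FS needed block 101 for a real file
    store.flush_reports()       # -> ['chunk-abc']
```

The point being that the filesystem never blocks on the network: it just queues the IDs of the ruined chunk copies and reports them in a batch whenever it gets around to it.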

I understand stuff like this is of insignificant priority at this point, but it sounds kinda cool, so I thought why not throw it out there.

I was actually thinking about this in the last week or so. What about personal encrypted files that don’t need to be saved in the SAFE network? I did some studying and found that ZFS or Btrfs would be a suitable choice, the reason being that they offer a “pooling” system. That particular pool would be encrypted and require a local password to access it, not the SAFE network password.

But again, why do this when you could just use ZFS / Btrfs with LUKS and be done with it? Btrfs and ZFS already have compression built in. When you save files, they are compressed and put into storage; when you read them back, they are decompressed on the fly and don’t need to be saved again. Take a good look at the decompression speed, it is really well done. Could self-encryption serve data at a faster rate than Btrfs or ZFS? I doubt it. I did a couple of tests, and we need to do more testing on that, but as of right now it doesn’t. It took me around 3-ish minutes to decompress a 90 MB file, and the decompressed output gets saved somewhere, whereas Btrfs doesn’t need that; Btrfs can decompress a 90 MB file within seconds…
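For what it's worth, here's roughly how I'd time the two paths to make the comparison measurable. It's a Python sketch with zlib standing in for the compressor; `testfile.bin` is a hypothetical ~90 MB test file, and none of this says anything about SAFE's actual self-encryption code.

```python
import tempfile
import time
import zlib

def time_explicit_roundtrip(path):
    """Decompress in userspace and write the result out, like a manual pipeline."""
    with open(path, "rb") as f:
        data = f.read()
    compressed = zlib.compress(data, 6)
    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(restored)          # the decompressed output has to land somewhere
        tmp.flush()
    return time.perf_counter() - start

def time_plain_read(path):
    """On Btrfs/ZFS with compression enabled, this read is decompressed transparently."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(1 << 20):
            pass
    return time.perf_counter() - start

if __name__ == "__main__":
    path = "testfile.bin"  # hypothetical test file
    print("explicit decompress + write:", time_explicit_roundtrip(path), "s")
    print("plain read (FS decompresses):", time_plain_read(path), "s")
```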

Something you should think about…

I’m not sure we’re talking about the same thing, though. Filesystems store stuff on top of a block device. Some of the blocks are occupied by data, some of them are not. What if we could borrow the blocks that don’t store data?

If we actually reserved those blocks, that would be detrimental to the performance of the filesystem (full drives don’t perform well). However, if those blocks are just opportunistically borrowed, then we can look at that space as a somewhat unreliable backend for immutable chunks (unreliable because they can be overwritten by actual files at any moment).

This way, we could utilize 100% of the free space on the filesystem. Of course, the filesystem should notify the SAFE server about the deleted chunks so it can report them as lost to the storage managers.
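To give a feel for how much space that is, something like this could size the borrowable pool from whatever the filesystem currently reports as free. Illustrative only; the 1 MiB chunk size is an assumption.

```python
import os

CHUNK_SIZE = 1 << 20  # assumed chunk size (1 MiB), just for illustration

def borrowable_chunk_slots(mount_point="/"):
    """How many chunk copies could be parked on currently free blocks."""
    st = os.statvfs(mount_point)
    free_bytes = st.f_bavail * st.f_frsize  # space available to ordinary files
    return free_bytes // CHUNK_SIZE

if __name__ == "__main__":
    print(f"{borrowable_chunk_slots('/')} chunk slots currently borrowable on /")
```

Of course the number changes constantly; the whole point is that the pool shrinks whenever real files need the blocks back.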

If you did this, then the network would mark your vault as unreliable and exclude it from future use.

A vault is expected to be near 100% reliable while it is operating. If it isn’t, then it loses rank quickly and is marked as bad if enough losses occur. A loss is the inability to serve up a chunk when requested, to do caching, etc.

When a vault is powered off it basically gets removed due to rank because it is no longer responding. When the vault is powered up, it is given a new ID and its rank is basically restored.
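Not the actual ranking algorithm, obviously, just an illustration of the behaviour described above, with every number invented:

```python
import uuid

class VaultRank:
    FAIL_PENALTY = 10    # invented: failing to serve a chunk hurts a lot
    SUCCESS_REWARD = 1   # invented: serving correctly earns rank back slowly
    BAD_THRESHOLD = 0

    def __init__(self):
        self.vault_id = uuid.uuid4()  # fresh ID on every start
        self.rank = 50                # invented starting rank

    def record_request(self, served_ok):
        """Returns False once the vault would be marked bad and excluded."""
        self.rank += self.SUCCESS_REWARD if served_ok else -self.FAIL_PENALTY
        return self.rank > self.BAD_THRESHOLD

    def restart(self):
        """Power-cycling gives the vault a new ID and a restored rank."""
        self.__init__()

if __name__ == "__main__":
    v = VaultRank()
    # A vault that silently deleted chunks starts failing GET requests:
    for _ in range(6):
        still_good = v.record_request(served_ok=False)
    print("still considered good?", still_good)  # False after repeated misses
```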

But you want to have a vault remain online while no longer serving up the chunks the OS had to delete to reclaim the space. That is going to get the vault removed from the network by the other nodes.

If you were to try something like this, it would have to allocate the space when needed and not reclaim it until the vault is turned off and restarted.

People who run virtual machines can have their disk grow as needed, but once again the vault cannot be deleting chunks without suffering the consequences.

Even if the vault voluntarily reported chunks when they are deleted?

Even if there were a mechanism, it would still amount to admitting that your vault is faulty.

This is sad :crying_cat_face: such a fun idea.

I thought it was possible because it seems like a useful feature to be able to reduce the storage space I’m willing to provide, so I was sure it would be possible to say “sorry guys, I can no longer account for this chunk, but no hard feelings.”

I don’t understand yet how everything works; was this left out because it would complicate things too much?

I think the reason is that if you allowed it and it were used, the security of the network would be diminished. If it became widespread, the flux within the network could become high, with a lot of unnecessary bandwidth spent reallocating (copying) the lost chunks.

Why do it? Just to allow someone to get a few more chunks into their vault.

There is a reason why the farmer is paid, and that is to provide a stable vault that stores, retrieves reliably, caches, etc. while it is online. This keeps the network’s dynamics more stable and reduces the overall bandwidth across the network.
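Back-of-envelope, with completely made-up numbers, just to show why the bandwidth adds up: every chunk copy a vault silently drops has to be re-replicated from the surviving copies to a new holder somewhere else on the network.

```python
CHUNK_SIZE_MB = 1                 # assumed chunk size
DROPPED_CHUNKS_PER_VAULT = 1_000  # made up
VAULTS_DOING_THIS = 10_000        # made up

# Each lost copy triggers roughly one chunk-sized transfer to restore redundancy.
relocation_traffic_tb = CHUNK_SIZE_MB * DROPPED_CHUNKS_PER_VAULT * VAULTS_DOING_THIS / 1_000_000
print(f"~{relocation_traffic_tb:.0f} TB of copy traffic just to heal the losses")  # ~10 TB
```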

Yea, that makes sense. I’m still sad :scream_cat:, my little dream just died; I’ll deal with it :joy_cat:


Been trying to explain this to you on the Amazon farming example where you said vault space could be dropped at will…

No.

[quote=“janitor, post:10, topic:6799, full:true”]
Been trying to explain this to you on the Amazon farming example where you said vault space could be dropped at will…
[/quote]

I can’t recall you mentioning anything about vaults being marked unreliable as a result of discarding data :pouting_cat: But go ahead, act all victorious for no reason.

This doesn’t apply to the Amazon thing anyway; they always have enough extra capacity to run millions of vaults (my estimate) and if they need to dump a few complete vaults sometimes because of an unforeseen requirement, well who cares.

Here: “Another thing is you’d not only have to come online and have your vaults filled up, but also stay online for a while to gain reputation to get data and later read requests.”

As Neo said, you can’t just stand up new vaults and disappear when convenient. New vaults won’t even get any data uploaded until they’ve been online for a while, so as I said on that other topic, you’d have to pay for (reserve) vault capacity up front, wait till the vault gets sufficient reputation to start getting data, and then wait till you get the first GET requests (and this last step is where you earn). That’s also why I was saying you’d have to pay operating expenses up front and start earning SAFE coins incrementally maybe a week or two after you put the vault online.

Then if some Amazon customer wants to buy more storage at $30/TB/month and you dropped the vault, you’d lose 1-2 weeks of power and bandwidth invested in making the vault reputable enough to be used by the network.
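A toy break-even calculation for that point, with entirely invented figures (none of this comes from the actual network economics):

```python
RAMP_UP_DAYS = 14       # assumed time to build enough reputation to start earning
DAILY_OPEX = 0.05       # assumed power + bandwidth per vault, in $
DAILY_EARNINGS = 0.10   # assumed earnings once GET requests start, in $

sunk_cost = RAMP_UP_DAYS * DAILY_OPEX
days_to_recover = sunk_cost / (DAILY_EARNINGS - DAILY_OPEX)
print(f"ramp-up cost thrown away if you drop the vault: ${sunk_cost:.2f}")
print(f"days of earning needed just to recover it: {days_to_recover:.0f}")
```

Dropping a vault means eating the sunk ramp-up cost and then starting the whole reputation climb again from zero.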

I stand corrected: you did mention that.

It’s not as relevant for Amazon, Rackspace, etc. as it is here. There’s always a huge, constant leftover capacity for companies like that. It’s just a tiny percent of their whole, but their whole is enormous. As for how much we’re talking about: Amazon installs enough storage every day to have powered their whole retail business back in the day.

Vaults would stay online not just for a while but for years. Yes, they would need to turn off some of their vaults every once in a while when lightning hits or when they get an unexpected big customer (I’m sure they would pick the newer, less valuable ones, obviously), but it would still be worth using that capacity. Let’s say you can add 100 vaults a day but lose 50 every week because stuff happens. You’re still keeping 650 a week to grow old and valuable.
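Just the arithmetic behind those numbers:

```python
added_per_week = 100 * 7   # 100 new vaults a day
lost_per_week = 50         # vaults dropped because stuff happens
print(added_per_week - lost_per_week)  # 650 vaults a week left to grow old and valuable
```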

Also, I’m not sure how much extra electricity it takes to read/write 1% more on an already spinning drive. I’m sure, however, that they’re not paying for their lines by bandwidth.

All this added together suggests it’s at least something to think about.

No problem, there’s been a lot of comments in these topics in recent days.

I’ve no doubt it’s possible to integrate SAFE with existing services in ways that make use of stranded capacity, but at this point I just can’t speculate on what will be possible and what will be economical. So I’m neither upbeat nor negative on that right now.

We’ll find out soon.
