Process Separation

I haven’t gone through the entire codebase yet, so if I am incorrect please steer the conversation in the correct direction.

It appears that the main application is the “vault” that runs on the user’s machine. I believe that this single process handles the P2P connections (the routing), provides the FUSE capabilities, and serves the REST API (although I’m not sure how this factors into the rest of the program yet).

I do not recommend decrypting user data in the same process that is handling the P2P connections. The P2P connections will be to peers that might be malicious. If a peer finds an exploit in the routing layer (never assume that testing has you covered), then the attacker can easily see decrypted data, because all of the data lives within a single process. Think of OpenSSL’s Heartbleed here - one exploit allowed an attacker to see other data “owned” by the process. At a minimum, the routing and the part exposed to the local user should be in separate processes so that the information is in separate virtual memory spaces. This will allow the operating system to provide some protection against data being stolen.
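
To make the idea concrete, here is a minimal POSIX sketch of what I mean by separate address spaces. Everything in it is hypothetical (the function names, the echo behaviour, the 256-byte limit); it is not MaidSafe code. The child process stands in for the routing layer and only ever receives opaque ciphertext over a socket pair. Note that fork() copies the parent's memory as it is at that moment, so a real design would fork (or exec a separate binary) before any plaintext is loaded.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <string>

// Hypothetical stand-in for the routing layer. It runs in a child process
// and only handles the opaque bytes it is sent; here it simply echoes them.
static void routing_worker(int fd) {
    char buf[256];
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0) {
        ssize_t written = write(fd, buf, static_cast<size_t>(n));
        (void)written;
    }
    close(fd);
    _exit(0);
}

// Sends ciphertext through a separate "routing" process and returns the reply.
// Returns an empty string on failure or if the input is empty or too large.
// Assumption: the caller forks before holding any plaintext, because fork()
// duplicates whatever is in the parent's memory at that point.
std::string send_via_routing_process(const std::string& ciphertext) {
    if (ciphertext.empty() || ciphertext.size() > 256) return std::string();

    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return std::string();

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return std::string();
    }
    if (pid == 0) {
        close(fds[0]);
        routing_worker(fds[1]);  // never returns
    }

    close(fds[1]);
    std::string reply;
    if (write(fds[0], ciphertext.data(), ciphertext.size()) ==
        static_cast<ssize_t>(ciphertext.size())) {
        char buf[256];
        ssize_t n = read(fds[0], buf, sizeof(buf));
        if (n > 0) reply.assign(buf, static_cast<size_t>(n));
    }
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return reply;
}
```

A memory-disclosure bug in routing_worker could then only leak what was sent over the socket, plus whatever the parent held at fork time.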

As a secondary measure I recommend running the P2P portion in a container (e.g. LXC or a BSD jail) so that if an attacker finds a major fault in MaidSafe code or one of the libraries it uses, the attacker will be stuck in a limited environment.

If there is interest in this idea, I can expand on it more.

This is only the vault; it cannot decrypt data and has no FUSE section. This is possibly the best diagram to start with: http://maidsafe.net/core-developers and then the docs here: http://maidsafe.net/documents. I think they will help you get the architecture quickly.

I apologize, this was a little sloppy of me. I think I understand the role that vaults play in the system. However, what about the user applications? It appears that each user application will have its own P2P connections, is that the case? That would be a good way to provide anonymity across applications. But won’t the user applications have decrypted data and P2P connections in the same process space too? In particular, I’m thinking about protecting against worms that infect peers and steal data while doing so. No special privileges on the network or the host system should be needed. It’s late, maybe I’m going off the rails here.

Yes… for now. However, we’re looking at running the Routing/NFS client as a separate process and having apps communicate with this via IPC. That comes with its own set of security headaches of course, and would probably only be viable on non-mobile platforms.

Shouldn’t the routing and NFS be process separated too? Otherwise user data and the P2P connection are still in the same process space. This complicates things even further, but the P2P connection seems really bothersome - am I the only one with these concerns? There seems to be little information available on P2P networks and worms, but this seems like an obvious attack. Traditional client/server systems have different implementations for each portion, but this network will have symmetric implementations (this implementation should be the dominant one for people in the network). Other P2P networks have managed without too much difficulty, but this one could face higher scrutiny.

Was Protobuf used to mitigate the concern of parsing external sources of data? Protobuf will reduce the attack surface since the parsing code should be near rock solid, but its usage will increase memory use and latency while lowering throughput. Tough tradeoff.

Yes, we go for type safety here, so we parse into validatable types when possible, i.e. immutable data must hash, etc. It’s all a tradeoff of many things. It would be interesting if you did a write-up on why separating processes improves security and what other security holes are opened up by doing so. Only if you want to, of course; it’s just interesting if you have a particular itch to scratch, and perhaps we can all get involved a bit. It is a minefield and nearly every solution has issues, and my concern is that many security enhancements actually make things worse (RFID etc.).
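
For anyone following along, a rough sketch of what “immutable data must hash” could look like as a self-validating type. This is not the MaidSafe API: ValidatedChunk, toy_hash and is_valid_chunk are made-up names, and the 64-bit FNV-1a hash is only a stand-in for a real cryptographic hash. The point is that an object of the type cannot exist unless the claimed name matches the content.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Toy 64-bit FNV-1a hash; a placeholder for a real cryptographic hash.
inline std::uint64_t toy_hash(const std::string& bytes) {
    std::uint64_t h = 14695981039346656037ULL;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        h ^= static_cast<unsigned char>(bytes[i]);
        h *= 1099511628211ULL;
    }
    return h;
}

// Hypothetical self-validating type: construction fails unless the claimed
// name equals the hash of the content, so every instance is already checked.
class ValidatedChunk {
public:
    ValidatedChunk(std::uint64_t claimed_name, const std::string& content)
        : name_(claimed_name), content_(content) {
        if (toy_hash(content_) != name_)
            throw std::invalid_argument("content does not hash to claimed name");
    }
    std::uint64_t name() const { return name_; }
    const std::string& content() const { return content_; }

private:
    std::uint64_t name_;
    std::string content_;
};

// Returns true only if the untrusted (name, content) pair forms a valid chunk.
inline bool is_valid_chunk(std::uint64_t claimed_name, const std::string& content) {
    try {
        ValidatedChunk chunk(claimed_name, content);
        (void)chunk;
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

Code further down the stack can then take a ValidatedChunk by reference and skip re-checking, which is where the type carries the guarantee.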

Currently the NFS API deals with the encrypted chunks in the form of ImmutableData, and pointers to ImmutableChunks in the form of StructuredDataVersions. Neither of these should hold unencrypted data; they’re returned from NFS in exactly the same form as they’re held by the Vaults on the network.

Having said that, we’ve discussed adding more user-friendly functions to the NFS API which range from just a very basic WebDAV interface to a full(ish) POSIX filesystem API. If we do that, then the encryption and decryption would happen behind such an API, and it would then make use of our existing “low-level” NFS API to handle the encrypted content. So, long story short, I don’t think we need to separate Routing off into a different process. Maybe NFS as it is now, but not NFS and Routing.

No :slight_smile: We have to try and close the door on as many attacks as we possibly can, but we’ve also got to balance that with releasing robust code in a reasonable timescale. As @dirvine said, it’d probably be good to get a more fleshed-out discussion going, so we can think through the implications and try to find the best possible solution.

Well, the implementations are similar, but not symmetric. Even Routing handles client connections slightly differently to vault ones.

@dirvine’s already answered this, but so you’re aware, we’re watching the progress of Cap’n Proto with an eye to replacing Protobuf.

After reading your post on NFS, I think I need to become more familiar with the actual implementation before saying anything further. Also, is there any data on throughput performance I can look at?

I’m excited now. Kenton is an expert in this very specific field, so using his work is always a good idea. The project is also very new, so that comes with stability concerns. The Cap’n Proto encoding format appears to be more complex than Protobuf, so there are obvious security concerns in that too.

Agreed, this is important to watch. I think Kenton has some conservative opinions there, which is great to see. Our goal is strong type safety with self-confirming types all the way through the system. At this time all systems need to use various forms of type erasure, and this is where I see a ton of danger in the industry. The sooner a strong validating type system can be used throughout, the sooner we get to strong types, cheaper code, simpler code and much stronger security. I believe it’s very possible, but it is not focussed on nearly enough, and weak types are not the answer.

This area is a constant pain for us, and due to the nature of what we do it exposes a weakness in software engineering when data transport is involved (either to disk or across the wire). Nobody talks about this enough, but we will for sure.

Firstly, we exclusively use C++11, which very considerably reduces the attack space over C or legacy C++. The main vulnerabilities will be where we drop into C style code in RUDP and other low level layers. Secondly, we run daily passes of memory correctness tools over the unit tests, something most other code bases even today don’t do. Thirdly, we have the very latest stack smashing protection turned on, a feature that only came into the just-released GCC 4.9. And fourthly, release builds ship with the undefined behavior sanitizer turned on, which auto repairs the effects of any exploit of undefined behaviors. On everything other than Windows we are an exemplar of modern C++ practice, and on Windows we shall continue to go as fast as the toolset allows.

Process separation is of course better again, and in the post-launch refactor, when we go to work on eliminating some of the eleven memory copies we do between network sockets and the user, it will be done in a way where the GPU’s memory could be used to hand off data between privilege separated processes. Certainly for me this is much more a performance fix than a security fix - the current design is light years ahead of anything shipping in BlackBerry 10 for example. Obviously all new code will always have security defects, but I expect this code to be significantly better than average for new code.

Niall

This post is going to be extremely pessimistic. But you have to understand I saw some “raw” code and bugs at my last job, and even ran straight into a compiler optimizer/code generation bug which was fun to track down. I was nearly convinced that C++ could never produce a bug free program entirely, but yet I keep coming back.

How does it reduce the attack space? Do you mean the additional language features and libraries will reduce the number of bugs that developers write? That might be true, but you still need highly competent developers for C++. The rules for template programming are so complex in C++11, I’m not sure how to even explain them to someone new to the language. And what about the integer promotion and conversion rules? They are still complex and confusing. Changing an interface from void foo(long) to void foo(short) will still implicitly convert at any call sites without compiler errors. Which reminds me, you’ve probably done this already, but I suggest enabling the gcc/clang warning flags for sign-conversion and the other one (whatever it was called). Some developers find it extremely annoying because unsigned short i = 0; i += 10 will (on slightly older gcc’s anyway) produce a warning. Of course, learning that (unsigned short) += 10 results in an int being added to an unsigned short surprises some developers too. Various arithmetic overflow issues are also fun.
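
A small sketch of the conversion rules I’m describing. The function names are made up, and it assumes a typical platform where short is 16 bits and int is 32 bits. As far as I know, -Wsign-conversion and -Wconversion are the real gcc/clang flags that complain about code like this, though I’m not certain -Wconversion is the second flag I was thinking of.

```cpp
#include <type_traits>

// Imagine this was originally declared as: short foo(long).
// After narrowing the parameter, existing callers still compile.
inline short foo(short value) { return value; }

inline short call_with_long() {
    long big = 70000;  // does not fit in a 16-bit short
    return foo(big);   // implicit long -> short conversion, no error by default
}

// unsigned short is promoted to int before the addition takes place
// (true wherever int can represent every unsigned short value).
static_assert(
    std::is_same<decltype(static_cast<unsigned short>(0) + 10), int>::value,
    "unsigned short + int yields int");

inline unsigned short add_ten(unsigned short i) {
    i += 10;  // computed as int(i) + 10, then narrowed back to unsigned short
    return i;
}
```

With a 16-bit unsigned short, add_ten(65530) wraps around to 4, and the value returned by call_with_long() is no longer 70000.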

And what about all the compiler bugs? I still need to report a bug for gcc 4.4 (which is ancient now), because it won’t let base and derived pointers alias when the derived is templated. Some very funky code got generated when I used boost::iterator_facade once, resulting in a segfault inside of boost::asio code (lucky it wasn’t worse). This is why boost::move uses the MAY_ALIAS attribute - a gcc developer incorrectly stated why the boost::move code was invalid (it was invalid for a different reason than stated), leaving valid code at the mercy of the gcc optimizer. What do you think is going to happen with the rapid C++ changes? That means more bugs in my mind; that’s the side effect of using new code. Gcc might be worse than Clang in this regard, because Clang seems better tested and easier to modify. This project cannot assume that all the libraries and tools it’s using have zero bugs. Agustín Bergé (K-ballo) from boost might be a better person to talk to about this; he’s been finding bugs in VS, gcc, and clang with C++11 code. I think most were odd template corner cases, but he may have found some code generation issues too, which is always fun.

In theory various testing techniques should identify compiler issues; it’s possible that I’ve just been abused as a programmer.

How do you figure? If you compile in C++ mode, you’ll get better type checking than strict C. And you won’t have exceptions. There’s a reason why Go promoted the use of return code and defer statements over C++ style exceptions. You get the RAII without the difficulty of understanding the exception guarantees that a class can provide - recovering from an exception often leads to invalid states in classes.

I’m a HUGE fan of C++, but I will never argue with someone who chooses C for its predictable code generation. The only thing I will trash is the printf style interfaces - templates (boost::format or boost::spirit::karma) destroy that design by a large margin in safety and performance.

I thought this only reported issues, but it reacts to issues as well? I don’t see any information that these techniques will help you in release code, but I’ve never used them before either.

That’s why I suggested process separation. Unfortunately, we have to assume there is an issue in the codebase or in one of the libraries. I think Google got it correct with Chrome (process per tab), because once memory gets overwritten or an incorrect pointer address gets used, pretty much anything can happen. At least this project shouldn’t be mapping hardware devices to its process space. Taking out a driver and hardware card because of some buffer overflow in a supposedly user space process is not a good time.

I partially started this discussion thinking that Protobuf was going to be detrimental on a network like this (slow performance). The system could take some more risks by having Cap’n Proto or a custom protocol run in its own sandboxed process, and then push information relevant to the particular node to the next process using Protobuf, which is a simple protocol design (with possibly overly complex auto-generated code).

Are there going to be calculations done on the GPU?

As far as bugs in compilers go, I think you will see a load of reports from MaidSafe engineers being posted; we could list a ton of them, and a lot are very frustrating. You will see workarounds and fixes in our code for many of them, from static initialisation in MSVC to avoidance of broken algorithms in some compilers (copy_n for instance). This is just the territory we live in. I can assure you we are extremely aware of many bugs and try hard to report them (MS, for instance, will almost certainly ignore many bugs and may or may not tell you that). Howard and Marshall in Clang are very responsive. Jonathan Wakely and co. from gcc are likewise on the ball when bugs are found, even in other libs such as cryptopp.

I think this is all general info on C++ and programming in general. You do need highly competent developers, well thought out algorithms and, if possible, a bug free developer environment (kernel, C libs etc.). C++11/14 dramatically reduces the attack space, as the number of calls required changes and the language itself takes on new meaning in a lot of places. RAII, smart pointers etc. all help, and the removal of the need for new/delete helps as well. Very high warning levels (with warnings treated as errors) help, and Clang helps (but even libc++ had 47000 bugs when it was checked against the sanitisers). The sanitisers help, CI helps, code review helps and much more. We do all this and still fight like mad to ensure code is clean, exception safe (do not look at routing right now) and, where possible and required, has a strong exception guarantee.
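
As a minimal illustration of the RAII and smart pointer point (a sketch with made-up names, assuming C++14 for std::make_unique, not code from our tree): the buffer below is released during stack unwinding even though the function exits via an exception, and nobody writes new or delete.

```cpp
#include <memory>
#include <stdexcept>

// Counts live Buffer objects so the effect of RAII is observable.
static int live_buffers = 0;

struct Buffer {
    Buffer() { ++live_buffers; }
    ~Buffer() { --live_buffers; }
};

// Fails halfway through; the unique_ptr still destroys the Buffer
// while the exception propagates.
inline void process_and_fail() {
    auto buf = std::make_unique<Buffer>();
    throw std::runtime_error("simulated failure");
}

// Returns the number of Buffer objects still alive after the failure.
inline int live_after_failure() {
    try {
        process_and_fail();
    } catch (const std::runtime_error&) {
        // Swallowed on purpose; we only care about the cleanup.
    }
    return live_buffers;
}
```

live_after_failure() returns 0, whereas the equivalent code with a raw new and a delete after the throwing line would leak the buffer.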

Separating poorly written processes will not always help (Chrome can crash all tabs on my Linux box, and does so frequently), but it is a good step. We have 5 processes to run up a vault/client so far and that helps, but interprocess communication can open up further security holes.
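
One example of the kind of hole I mean: once processes talk over IPC, each side has to treat the other’s messages as untrusted input. A sketch of defensive frame parsing follows, assuming a hypothetical framing of a 4-byte big-endian length followed by the payload; the names and the 64 KiB cap are invented for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Arbitrary upper bound on a single message; an assumption for this sketch.
const std::size_t kMaxPayload = 64 * 1024;

// Parses one frame: 4-byte big-endian length, then exactly that many bytes.
// Returns true and fills payload only if the frame is complete, within the
// size cap, and the claimed length matches the bytes actually received.
inline bool parse_frame(const std::vector<std::uint8_t>& in, std::string& payload) {
    if (in.size() < 4) return false;
    std::uint32_t len = (std::uint32_t(in[0]) << 24) | (std::uint32_t(in[1]) << 16) |
                        (std::uint32_t(in[2]) << 8) | std::uint32_t(in[3]);
    if (len > kMaxPayload) return false;     // reject oversized claims
    if (in.size() - 4 != len) return false;  // claimed length must match exactly
    payload.assign(in.begin() + 4, in.end());
    return true;
}
```

Trusting the length field without these checks is how a compromised sibling process gets to read or write past a buffer in its neighbour.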

Protobuf or any serialisation is in itself tricky. We have a dev on board who has their own serialisation library (a boost dev) and another boost dev who writes thread safe concurrent objects and filesystem components (heard of AFIO? :slight_smile: Yes, that’s the one). We are interviewing another few devs from the boost world who have written a filesystem abstraction and a bounded integer library, and one of our in-house guys did another lib which went to boost. The author of ASIO has worked with us, the author of Just Threads and C++ Concurrency has also done work on the code base, as have a few other names in the industry. You should check out the interview questions; they set an exceptionally high bar.

The key is to get the best people and manage the process of development in a manner that review is solid and tested as well as dev discussions which cover much of what you have said here. Strong focus on the code in place is also important.

My personal goals are strongly tested code with the very minimum raw loops and sound core algorithms. As for bugs, then there will always be bugs, but we are at a low level here and will have fewer than higher level abstracted languages at the coal face at least, by definition.

I think being abused as a programmer is a way we all feel at times. In MaidSafe we try our very hardest to provide a creative space and allow people to operate at their best with minimum frustration. This helps, and being led by an Engineer who understands the complexity and the requirement for the best helps too. Being pre-launch, as we are just now, means less of that and a focus on deliverables in a manner that is very hard for really creative Engineers to abide, but we all do at least 18 hours a day and many of us 7 days a week. Times are hard, but that is also a good thing.

So long story short, it’s hard, there are bugs, there always will be, but we can strive for the most solidly written algorithms we can. We will always improve, but looking forward is the best route. All review, suggestions etc. are fantastic and I lap them up so fire away. But improving what code is there and helping identify specific fixes etc. would be amazing.

Anyway, I need to get back to those pesky bugs we are fixing in Pivotal Tracker. Please help out if you want :slight_smile: You will see how we handle the critters: MaidSafe Open Source - Pivotal Tracker

All patches and comments welcomed of course.

Personally I believe it is impossible to write a bug free program in C or C++ where by impossible I mean that the economics of doing so are not feasible. If you want economically bug free, start with Ada or one of the formal spec languages.

It’s mainly the vastly improved STL in C++ 11. Those have been audited for buffer overflow attacks and such, unlike most third party libraries you’d have to fall back onto (e.g. Boost) if you didn’t have them.

I think it was Howard Hinnant who said that Apple’s internal testing had found orders of magnitude improvements in the exploitability of newly written C++ 11 code to the extent that as a general rule you simply shouldn’t write in C any more if you want a secure program. Of course an excellent C programmer will not write insecure C programs, but we’re worried about average programmers here. Average programmers are crap with pointers and managing memory securely. An average programmer need never touch pointers or manage memory in C++ 14, though of course an average programmer won’t know that and will still be manually calling new/delete etc. and using pointers directly.

Same goes for an Ada or Haskell compiler. We’re all always at the mercy of toolsets. I’d hazard though that you’ll see fewer bugs using a toolset than hand writing everything in assembler.

Little of the MaidSafe code base uses exceptions currently. In the longer run we’ll be deploying std::expected<T, E> in a gather-collapse style, where exception throws are gathered and collapsed into return values. We’re currently blocked on Visual Studio for std::expected<> though, but expected<> lets you retrofit exception unsafe code very easily.
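
To show the general shape of collapsing throws into return values, here is a hand-rolled sketch. IntResult and parse_int are invented for illustration and are only a crude stand-in for expected<T, E>, not its actual interface.

```cpp
#include <stdexcept>
#include <string>

// Minimal result type: either a value or an error message.
// A simplified stand-in for expected<int, std::string>.
struct IntResult {
    bool ok;
    int value;
    std::string error;
};

// Collapses the exceptions thrown by std::stoi into a return value,
// so callers never need a try/catch of their own.
inline IntResult parse_int(const std::string& text) {
    try {
        return IntResult{true, std::stoi(text), std::string()};
    } catch (const std::invalid_argument&) {
        return IntResult{false, 0, "not a number"};
    } catch (const std::out_of_range&) {
        return IntResult{false, 0, "out of range"};
    }
}
```

Callers check result.ok before touching result.value, which is the property that makes retrofitting exception unsafe code tractable.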

Of the eleven memory copies, the following could be pushed onto the GPU with huge benefits:

  1. SHA512 round.
  2. gzip deflate round.
  3. AES256 round.
  4. XOR round.
  5. SHA512 round.

Right now, after those, the present design requires that we bring the data back onto the CPU for some Routing processing and then a further AES256 round as part of the RSA encrypt round before it heads out to RUDP for 8Kb chunking. But pushing the above five memory copies onto the GPU should be relatively easy with the current design, and would have enormous effects on mobile device battery life if the device has hardware OpenCL support (right now that’s only the Tegra K1).
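
To illustrate what a “memory copy” per round means here, a toy sketch (not our pipeline: the single-byte XOR and the function names are placeholders, and it runs on the CPU). A stage written in the copying style allocates a fresh buffer for its output, while the in-place style transforms the caller’s buffer, which is the property you want if the data is to stay resident in one memory region across rounds.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Copying style: the stage returns a fresh buffer, costing one copy per round.
inline Bytes xor_round_copy(const Bytes& in, std::uint8_t key) {
    Bytes out(in);
    for (std::size_t i = 0; i < out.size(); ++i) out[i] ^= key;
    return out;
}

// In-place style: the stage mutates the caller's buffer, so no copy is made.
inline void xor_round_in_place(Bytes& data, std::uint8_t key) {
    for (std::size_t i = 0; i < data.size(); ++i) data[i] ^= key;
}
```

Both produce the same bytes; only the second avoids the extra allocation and copy.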

Niall