Parallelisation of Transfers?

I’ve tried searching for mentions, but haven’t located any, so excuse me if this has been covered elsewhere…

One of the massive advantages of distributing data across multiple nodes is the speed gained by parallelising transfers. This is how DrFTPD became a dominant choice among FTP daemons. How can a single-threaded option, limited by the speed of a given node, possibly compete against one that runs 6, 8, 10+ transfers at once? It was the clear winner in the war for the fastest transfers.

Aria2 has long been a favourite transfer client for the same reason: it can fetch multiple parts of a file from different sites simultaneously.

Given that the internet is littered with machines with varying upload speeds, and that transfers are constrained by trans-Atlantic hops among other limitations, parallel transfers are a must in some cases.

Queuing, buffering, and other considerations obviously play a part, but to what extent is parallelisation implemented, planned, or at least catered for in some respects?

It is already there. When you request a file, the chunks are retrieved in parallel from their random locations, and reassembled into the file (or streaming video) as they reach your PC / tablet / phone etc.
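As a rough illustration of the idea (not the network's actual code), here is a minimal Python sketch of fetching chunks in parallel and reassembling them. The `store` dictionary and `fetch_chunk` are hypothetical stand-ins for chunks held at scattered nodes and for a real network fetch, and the tiny chunk size is only for demonstration.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # tiny on purpose; real chunks would be far larger


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the data into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fetch_chunk(store: dict[int, bytes], index: int) -> tuple[int, bytes]:
    """Hypothetical network fetch: sleeps briefly to mimic varying lag."""
    time.sleep(random.uniform(0.001, 0.01))
    return index, store[index]


def fetch_file(store: dict[int, bytes], max_parallel: int = 8) -> bytes:
    """Request every chunk concurrently, then join them in index order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(lambda i: fetch_chunk(store, i), store))
    # Sorting by index makes the reassembly order explicit.
    return b"".join(chunk for _, chunk in sorted(results))


original = b"The quick brown fox jumps over the lazy dog"
store = dict(enumerate(split_into_chunks(original)))
assert fetch_file(store) == original
```

Raising `max_parallel` lets more chunk requests be in flight at once, which is where the speed gain comes from when individual sources are slow.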


Thanks for the speedy, clear, answer!

Is there a limitation of any kind, other than the obvious maximum download speed of the requester or the IP stack limitations of their OS?

If there is no network-imposed limitation, how would a DDoS be prevented, especially one where legitimate-looking requests are made, never indicated as completed, and then quickly requested again, in parallel, from multiple nodes to multiple nodes?

You’re welcome (to the rabbit hole :slight_smile:).

There are, and probably will be more, measures to prevent DDoS. One is to cache chunks near where they are being requested, another is to penalise a node that is making frequent requests for the same data (by throwing it off the network).
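To make the second measure concrete, here is a small Python sketch of one way a node could spot repeated requests for the same data. Everything in it is an assumption for illustration: the class name `RepeatRequestMonitor`, the thresholds, and the sliding-window approach are not taken from the network's design.

```python
from collections import defaultdict, deque


class RepeatRequestMonitor:
    """Hypothetical tracker: bans a node that re-requests one chunk too often."""

    def __init__(self, max_repeats: int = 3, window_seconds: float = 10.0):
        self.max_repeats = max_repeats
        self.window = window_seconds
        # (node_id, chunk_id) -> timestamps of recent requests
        self._history = defaultdict(deque)
        self.banned = set()

    def allow(self, node_id: str, chunk_id: str, now: float) -> bool:
        """Return True if the request may proceed, False if the node is banned."""
        if node_id in self.banned:
            return False
        times = self._history[(node_id, chunk_id)]
        # Forget requests that have fallen out of the sliding window.
        while times and now - times[0] > self.window:
            times.popleft()
        times.append(now)
        if len(times) > self.max_repeats:
            self.banned.add(node_id)  # "thrown off the network"
            return False
        return True


monitor = RepeatRequestMonitor(max_repeats=3, window_seconds=10.0)
answers = [monitor.allow("node-a", "chunk-1", now=t) for t in (0, 1, 2, 3)]
assert answers == [True, True, True, False]
```

A node that spaces its requests out beyond the window is never penalised in this sketch, so ordinary re-reads of the same file are unaffected.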


Your requests are passed through relay nodes, and in those nodes the traffic to each IP address is proportioned, so yes, there is a bandwidth limit of sorts. But again, you have multiple relay nodes servicing you (to prevent any one node knowing what your traffic patterns are), so there is still a lot of parallelism.
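How a relay actually proportions traffic is not spelled out here, so the following Python sketch simply assumes max-min fair sharing, a common textbook scheme, to show what a per-IP limit "of sorts" could look like. The function name `fair_share` and the example addresses and figures are made up.

```python
def fair_share(capacity: float, demands: dict[str, float]) -> dict[str, float]:
    """Max-min fair split of one relay's capacity among requesting IPs.

    Small demands are met in full; whatever is left is divided equally
    among the IPs that want more than an equal share.
    """
    allocation = {ip: 0.0 for ip in demands}
    remaining = capacity
    unsatisfied = dict(demands)
    while unsatisfied and remaining > 1e-9:
        share = remaining / len(unsatisfied)
        satisfied = {ip: d for ip, d in unsatisfied.items() if d <= share}
        if not satisfied:
            # Everyone left wants more than an equal share: cap them all.
            for ip in unsatisfied:
                allocation[ip] = share
            return allocation
        for ip, demand in satisfied.items():
            allocation[ip] = demand
            remaining -= demand
            del unsatisfied[ip]
    return allocation


# Assumed figures: a relay with 100 Mbit/s serving three IP addresses.
demands = {"10.0.0.1": 10.0, "10.0.0.2": 40.0, "10.0.0.3": 80.0}
print(fair_share(100.0, demands))
# {'10.0.0.1': 10.0, '10.0.0.2': 40.0, '10.0.0.3': 50.0}
```

Because a client is served by several relays at once, its total throughput in this model would be the sum of its shares across those relays, which is why a per-relay cap still leaves plenty of parallelism.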

But it is expected that most people with a link slower than 1 Gbit/s will receive the file chunks much faster than any internet server could hope to deliver files.

Obviously this will need to be tested, and it also assumes a network that has grown well beyond its initial baby size.

Another DDoS protection is a side effect of the security measure of sending chunks via multiple hops to reach you: each chunk arrives with some lag. This affects the initial access time, but since the chunks can be requested in parallel, it is only an initial delay. That delay will, however, hamper a DDoS attempt.
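A back-of-the-envelope Python sketch shows why the lag is only paid once when chunks are requested in parallel. It assumes a simple model in which all requests go out together and the requester's own link is the bottleneck; the lag and per-chunk transfer times are invented numbers.

```python
def sequential_time(n_chunks: int, lag: float, transfer: float) -> float:
    """Each chunk is requested only after the previous one has arrived."""
    return n_chunks * (lag + transfer)


def parallel_time(n_chunks: int, lag: float, transfer: float) -> float:
    """All chunks are requested at once; the multi-hop lag overlaps.

    Assumes the requester's download link is the bottleneck, so the
    transfers themselves still queue up on that link.
    """
    return lag + n_chunks * transfer


# Assumed figures: 100 chunks, 0.5 s multi-hop lag, 0.25 s per chunk.
print(sequential_time(100, lag=0.5, transfer=0.25))  # 75.0
print(parallel_time(100, lag=0.5, transfer=0.25))    # 25.5
```

In this model the requester waits one lag period up front and then receives data at full link speed, whereas anything that has to wait on each response before sending the next request pays the lag every time.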

As @happybeing says, caching also occurs whenever a chunk is requested, and the more it is requested, the greater the number of nodes that will be caching the chunk. So even with an inadvertent (legit) DDoS, where someone posts a great cat video and everyone plays it straight away, the network would become awash with cached copies.
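The snowball effect can be sketched with a toy Python simulation. It assumes, purely for illustration, that each request travels over a random route of relay nodes and that every relay on the route keeps a copy of the chunk; the node counts, hop count, and function name are all hypothetical.

```python
import random


def simulate_popular_chunk(n_nodes: int = 200, hops: int = 4,
                           n_requests: int = 50, seed: int = 1):
    """Toy model: every relay on a request's route caches the chunk.

    Returns the number of caching nodes after each request, plus how
    many requests met an existing cached copy somewhere on their route.
    """
    rng = random.Random(seed)
    caching_nodes: set[int] = set()
    cache_hits = 0
    history = []
    for _ in range(n_requests):
        route = rng.sample(range(n_nodes), hops)
        if any(node in caching_nodes for node in route):
            cache_hits += 1  # served from a cache instead of the origin
        caching_nodes.update(route)
        history.append(len(caching_nodes))
    return history, cache_hits


history, hits = simulate_popular_chunk()
print(f"caching nodes after first request: {history[0]}")
print(f"caching nodes after last request:  {history[-1]}")
print(f"requests that met a cached copy:   {hits}")
```

The number of caching nodes can only grow as requests arrive in this model, so later requests are increasingly likely to be answered from a nearby copy rather than from the chunk's original holders.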


Makes sense, thanks!

The relay nodes will hopefully detect multiple requests from the same IP/node and throttle them via incremental delays, a bit like Ethernet devices backing off when collisions are detected. That approach may not be required, but it would apply everywhere and be relatively easy to implement.
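For what it's worth, the kind of incremental delay I have in mind could look something like this Python sketch. The class name `RequestThrottle` and all the numbers are hypothetical, and doubling the delay on each rapid repeat is just one possible choice.

```python
class RequestThrottle:
    """Hypothetical relay-side throttle with doubling delays per requester."""

    def __init__(self, base_delay: float = 0.5, max_delay: float = 4.0,
                 quiet_period: float = 10.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.quiet_period = quiet_period
        # requester -> (consecutive request count, time of last request)
        self._state = {}

    def delay_for(self, requester: str, now: float) -> float:
        """Return how long to hold this request before serving it."""
        count, last_seen = self._state.get(requester, (0, None))
        # A requester that has been quiet long enough starts afresh.
        if last_seen is not None and now - last_seen > self.quiet_period:
            count = 0
        self._state[requester] = (count + 1, now)
        if count == 0:
            return 0.0
        return min(self.base_delay * 2 ** (count - 1), self.max_delay)


throttle = RequestThrottle()
delays = [throttle.delay_for("198.51.100.7", now=t) for t in range(6)]
assert delays == [0.0, 0.5, 1.0, 2.0, 4.0, 4.0]
```

The first request is never delayed, and the cap keeps a busy but legitimate requester from being locked out indefinitely.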

I’d be more concerned with 1000 cat videos each being requested by 100 different nodes (or 10000 videos requested by 10 nodes), such that cache hits would be minimal but end users would be taxed heavily in trying to keep up with those 10 or 100 gigabit requests, which repeatedly fail and get re-requested.

I guess a big part of this comes down to how sophisticated the relay nodes are in handling multiple requests from the same requestor.

Interesting stuff - definitely.
