Pre-Dev-Update Thread! Yay! :D

Not exactly. The prefix-map file recently evolved: it now contains not just a map of prefixes to SAPs but also the complete section keys DAG/tree, so the file is likely to be renamed soon. A node joining a network, or a client bootstrapping to connect to one, uses that same file since it contains the SAPs they can connect to, so from their perspective it is simply a network contacts list/file. I'm not sure whether clients and nodes will support other types of files/formats in the future beyond the prefix-map file we currently produce; either way, they need a network contacts list.

12 Likes

Chop chop MaidSafe, it’s nearly PIMMs o’clock! :cocktail:

9 Likes

Anyone expecting anything exciting?

It’s been months :cry:

2 Likes

Sod yer poncy Pimms - used to think that was some obscure form of DRAM packaging btw - it’s weissbier and grilled bratwurst time out my back garden.

My laptop HDD is rubber-ducked so I will not be competing in the first-post race this time. I’ll read later when it finally drops and respond then.

I have to go out the back anyway, the agogometer is throbbing noisily and distracting me.

4 Likes

11 August Update has dropped!

2 Likes

The sectionTree !

15 Likes

No chains to see here, move along now. :policeman:

13 Likes

It’s quiet on GitHub and the forum, too quiet. Maybe I have more influence than I thought…

9 Likes

Missed your calling. I know I’d fear if you had a Bobby stick!

4 Likes

There were fairly big changes last night when I did a git pull. Built that and fired up baby-fleming. Put a couple of 200-300MB dirs without issue then fed it my usual 3.2GB of photos.

That put process crapped out after a couple of minutes but the nodes kept running. Successfully stored another couple of small dirs then fed it a dir of ~1000 small thumbnails, mostly <20KB. Nearly 12 hours later, it's still running. Tailing the logs shows a lot of the msgs are like:


[2022-08-18T12:27:32.624010Z DEBUG sn_node::node::flow_ctrl] checking for q data, qlen: 0
[2022-08-18T12:27:32.624014Z DEBUG sn_node::node::flow_ctrl] data found isssss: None

and one node has had nothing in its logs since this, at 02:07 UTC:

[2022-08-18T02:07:58.898654Z DEBUG sn_dysfunction] Adding a new issue to 2198ea(00100001).. the dysfunction tracker: PendingRequestOperation(OpId-e24ab5..)
[2022-08-18T02:07:58.953496Z DEBUG sn_node::node::flow_ctrl] checking for q data, qlen: 0
[2022-08-18T02:07:58.953505Z DEBUG sn_node::node::flow_ctrl] data found isssss: None
[2022-08-18T02:07:58.986814Z DEBUG sn_node::comm] Cleanup peers , known section members: {NodeState { peer: Peer { name: 0fad97(00001111).., addr: 127.0.0.1:39126 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 2198ea(00100001).., addr: 127.0.0.1:41898 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 362cae(00110110).., addr: 127.0.0.1:45000 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 555e64(01010101).., addr: 127.0.0.1:33964 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 55bf5a(01010101).., addr: 127.0.0.1:37662 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 75ac26(01110101).., addr: 127.0.0.1:36726 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: b42af9(10110100).., addr: 127.0.0.1:43950 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: b6afc9(10110110).., addr: 127.0.0.1:41838 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: d74072(11010111).., addr: 127.0.0.1:48182 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: f60630(11110110).., addr: 127.0.0.1:48768 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: f703af(11110111).., addr: 127.0.0.1:57683 }, state: Joined, previous_name: None }}
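
A quick way to spot a node that has gone quiet is to list the logs by when they were last written; something like this does it, though the baby-fleming log directory is an assumption and may live elsewhere on your setup:

# Show when each baby-fleming node last wrote to its log, oldest first;
# a node whose log stopped hours ago is the one to look at.
# ~/.safe/node/baby-fleming-nodes is an assumed location - adjust to suit.
find ~/.safe/node/baby-fleming-nodes -name '*.log*' \
    -printf '%TY-%Tm-%Td %TH:%TM  %p\n' | sort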

The agogometer is barely flickering this afternoon.

7 Likes

Now this is interesting…

Safe Network v0.9.0/v0.10.0/v0.70.0/v0.66.0/v0.68.0/v0.61.0
Repository: maidsafe/safe_network · Tag: 0.10.0-0.9.0-0.70.0-0.66.0-0.68.0-0.61.0 · Commit: 43fcc7c · Released by: github-actions[bot]

This release of Safe Network consists of:

Safe Node Dysfunction v0.9.0
Safe Network Interface v0.10.0
Safe Client v0.70.0
Safe Node v0.66.0
Safe API v0.68.0
Safe CLI v0.61.0

New Features

  • remove ConnectivityCheck
    Now we have periodic health checks and dysfunction, this
    check should not be needed, and can cause network strain
    with the frequent DKG we have now
  • new cmd to display detailed information about a configured network
  • include span in module path
  • simplify log format to ’ [module] LEVEL ’
  • make AntiEntropyProbe carry a current known section key for response
13 Likes

This is cool

willie@gagarin:~$ safe networks sections
Network sections information for default network:
Read from: /home/willie/.safe/network_contacts/default

Genesis Key: PublicKey(0aa9..928c)

Sections:

Prefix ''
----------------------------------
Section key: PublicKey(06dc..6260)
Section keys chain: PublicKey(0aa9..928c)->PublicKey(1211..cda6)->PublicKey(0854..1e80)->PublicKey(0b66..272a)->PublicKey(0bc3..4074)->PublicKey(06dc..6260)

Elders:
| XorName  | Age | Address         |
| d4a18f.. |  86 | 127.0.0.1:39521 |
| 370cfc.. |  88 | 127.0.0.1:57347 |
| b63782.. |  90 | 127.0.0.1:48025 |
| 75c634.. |  92 | 127.0.0.1:57335 |
| f56d64.. |  94 | 127.0.0.1:33643 |
| 0f704a.. |  96 | 127.0.0.1:55864 |
| 15c8f2.. | 255 | 127.0.0.1:52482 |
17 Likes

Running stably?

1 Like

Stable? yeees…

I didn't crash it, but when I tried to load 3GB of pics the box slowed to a crawl and peaked at 99% RAM and 80% swap, so I killed the process and could continue to add small files. I didn't try anything over 100MB after that.

It's still hogging RAM and refusing to release it after the files are put.
However, I'm thinking the tests I am doing are perhaps irrelevant in the real world. In production I doubt any sane person would try to run 15 nodes simultaneously and put 3GB of files in one shot. Run one or two nodes and try to put 3GB, though, and that's another story…

Any clown can run out of RAM if they hammer the box hard enough, whether it's 15 sn_node processes or watching 30 YouTube vids. I'm only exploring the limits of my box, not the code.
Perhaps a more useful test, until we have a comnet, would be to run the 15 nodes (since baby-fleming is the only option) and put a series of smaller files whilst keeping a very close eye on the RAM consumed after each put. It's the fact that each sn_node process continues to hog memory after the job is done that I think is concerning.

Rather than safe files put ~/Pictures/2016/ ← ~3.2GB, I will make a set of test dirs, each with say 1GB of total content, and put them sequentially.
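
Roughly what I have in mind, as a sketch (paths and sizes are placeholders):

# Split the photo collection into ~1GB test dirs, then put them one at a time.
SRC=~/Pictures/2016
i=1
mkdir -p ~/sn_test/set$i
for f in "$SRC"/*; do
    cp "$f" ~/sn_test/set$i/
    # Start a new set once the current one reaches roughly 1GB.
    if [ "$(du -sm ~/sn_test/set$i | cut -f1)" -ge 1024 ]; then
        i=$((i + 1))
        mkdir -p ~/sn_test/set$i
    fi
done

safe files put -r ~/sn_test/set1 && \
safe files put -r ~/sn_test/set2 && \
safe files put -r ~/sn_test/set3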

10 Likes

I got busy with other stuff and have yet to make a structured set of dirs, so I just fed it some more ~/Pictures subdirs.
Putting larg(ish) dirs one after another showed me this:

RAM usage is up approximately 10%, BUT some of it appears to be getting released now.
The command was:

willie@gagarin:~$ safe files put -r ~/Pictures/2011 && \
safe files put -r ~/Pictures/2008 && \
safe files put -r ~/Pictures/2019 && \
safe files put -r ~/Pictures/2018

2011 ← 1.4GB
2008 ← 534MB
2019 ← 354MB
2018 ← 265MB

It's almost like there is a certain threshold of put size above which the RAM is not released. This sounds crazy to me but it's all I can deduce from what I have seen over the past couple of weeks. With small uploads it tends to work as expected, but go above some limit (1GB?) and the sn_node processes do not release the memory once the put is complete.
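
If it is a threshold, a before/after snapshot of each sn_node process should show it. A rough sketch of what I mean (the directory is just whichever large put triggers it):

# Snapshot per-process resident memory of the sn_node processes before and
# after a single large put, to see whether memory is actually held onto.
snapshot() { ps -C sn_node -o pid=,rss= | awk '{ printf "%s %d MB\n", $1, $2/1024 }'; }

snapshot > /tmp/rss_before.txt
safe files put -r ~/Pictures/2011
sleep 60    # give the nodes time to settle after the put completes
snapshot > /tmp/rss_after.txt
diff -y /tmp/rss_before.txt /tmp/rss_after.txt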

@joshuef @chriso @dirvine @qi_ma do you want logs for this?

PS @mods can we get a @devs address that will send to all the devs?

14 Likes

Repro case is probs more useful than the logs atm. I haven’t tried this locally yet, but I will be soon. All being well.

@southside perhaps you can have a go at running the churn example and tweaking these values: safe_network/churn.rs at main · maidsafe/safe_network · GitHub

Right now it does 400MB and 27 nodes in total. If you have the time and the inclination, perhaps dial in a quantity of data there that causes this; that'd be awesome :muscle: No worries if not, I'll be aiming to do this to try and get a test case up before the end of the week (hopefully :stuck_out_tongue: )
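
Something roughly like this should do for that; where churn.rs is wired up in the repo (test vs example) is an assumption on my part, so adjust the cargo invocation to match:

# Clone the repo, tweak the data-size / node-count constants in churn.rs,
# then run it. The filter assumes churn.rs is built as a test whose name
# contains "churn"; adjust the package/target if it lives elsewhere.
git clone https://github.com/maidsafe/safe_network
cd safe_network
# edit the constants in churn.rs (currently ~400MB of data, 27 nodes), then:
cargo test --release churn -- --nocapture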

10 Likes

Note the current usage of RAM and swap.

Fire up baby-fleming with the latest release:
RUST_LOG=DEBUG safe node run-baby-fleming --nodes 15

Put a largish (>2GB) file or dir and watch the RAM and swap usage as it starts and after the chunks are stored.
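
To watch it, something like this left running in another terminal does the trick (a rough sketch):

# Poll total sn_node resident memory plus overall RAM/swap every 10 seconds
# while the put is running (Ctrl-C to stop).
while true; do
    ps -C sn_node -o rss= | awk '{ sum += $1 } END { print "sn_node total:", int(sum/1024), "MB" }'
    free -m | grep -E 'Mem|Swap'
    sleep 10
done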

UPDATE: I ran my grab_logs script, which copies the baby-fleming nodes to /tmp then does a find -exec to extract the log files. This was a daft move as I only have 16GB allocated for /tmp, so I quickly ran out of space. I tried clearing /tmp and trying again but that also ran out of space. However, now my RAM/swap usage looks like this…

Could my issue be related to the sn_node processes caching some logging in RAM?

I can't remember ever seeing swap > used RAM.

Since you are talking about RAM usage in nodes, it should not matter much if you run the upload on a separate PC; the RAM usage of the client is a small fraction of the RAM usage of 15 nodes.
Then you can assume that one node will require 1/15 of your RAM, which would be the real-world case.
Now look at how much RAM it uses and consider whether that usage is adequate for the task the node performs.

Since I know that torrent clients can download tens of GB of data while using much less than 1GB of RAM, I expect that an SN node can do the same. If a node for some reason needs more, please show the proof.
Just to note, my torrent client is configured to use 192MB of RAM for its read/write cache; experimentally I found that is enough for it to constantly serve 10MB/s uploads. The total amount of RAM it uses now is ~400MB.

3 Likes

Looks like it…

I cleared out /tmp again before grabbing some logs and the RAM usage went down again.
Here's the screenshot.

Running a baby-fleming is something only a few users are ever likely to do, so best not to get hung up about this.

I’ll look at this tonight.

7 Likes

There is a lossy logging option… it is not enabled, nor is there an option to enable it from the sn_node bin… but I do wonder. Can you try with logs disabled and see how you go?
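
The quickest approximation is probably to launch with the log level turned right down; this assumes the nodes respect RUST_LOG the same way as in the DEBUG run above:

# Same launch as before but with logging effectively silenced, to see whether
# the memory growth tracks the volume of log output.
RUST_LOG=off safe node run-baby-fleming --nodes 15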

1 Like