Community-Test (oct6) Offline

Indeed. We’re working on that.

What we’re currently seeing is more/longer AE than we’d expect at the client side. Nodes send things back quick sharp, but the client still only sees each every 5-10 secs… So actually doing any one thing is taking wayyyy longer than it should.


Well, it’s the main diff between working safe_network tests and cli, so it’s a possibility.

No, but the client giving up before time isn’t indicative of a node issue either.

Now, “before time”: 15 secs should be enough. Yep, sure. But you’re also playing with something that isn’t fully working yet.

As I stated above, it’s looking like AE processing issues in the client to me. That’s not confirmed, but it’s what I think is an issue at the mo.

Couple that with a change to how timeouts work. Previously they were not being applied as expected, so it was more like “retry after X seconds” for a query, whereas now it’s more “this is as long as you should take to try a query”, and retries happen within that time. So 15 secs for all retries is what we’re saying there, whereas the working default in the client is 90s, and I’ve had many tests going against that on both standard droplets w/ 45 nodes and the reduced ones @Josh was using.
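To make the timeout change concrete, here is a minimal sketch of the "total budget" behaviour described above, where retries happen inside one overall deadline. It is not the safe_network client code: `query_with_budget`, `FakeNetwork` and `run` are hypothetical names, the 7 secs per attempt is simply picked from the 5-10 secs range mentioned earlier, and needing 3 attempts is an arbitrary assumption.

```python
import time


def query_with_budget(attempt, budget_secs, clock=time.monotonic):
    """Retry `attempt` until it yields a result or the whole budget is spent.

    Returns (result, tries). `result` is None if the budget ran out, including
    the case where an answer arrived only after the deadline had passed.
    """
    deadline = clock() + budget_secs
    tries = 0
    while clock() < deadline:
        tries += 1
        result = attempt()
        if clock() > deadline:
            # The answer (if any) came back too late to count.
            return None, tries
        if result is not None:
            return result, tries
    return None, tries


class FakeNetwork:
    """Hypothetical stand-in for a slow network (not a real API).

    Every attempt costs `step_secs` of simulated time and only the
    `needed`-th attempt returns data.
    """

    def __init__(self, step_secs, needed):
        self.now = 0.0
        self.step_secs = step_secs
        self.needed = needed
        self.calls = 0

    def clock(self):
        return self.now

    def attempt(self):
        self.calls += 1
        self.now += self.step_secs
        return "chunk" if self.calls >= self.needed else None


def run(budget_secs, step_secs=7.0, needed=3):
    """Run one simulated query against a fresh FakeNetwork."""
    net = FakeNetwork(step_secs, needed)
    return query_with_budget(net.attempt, budget_secs, clock=net.clock)
```

Under these made-up numbers, `run(15)` gives up during the third attempt while `run(90)` gets the data, which is the kind of difference a longer query timeout would show if slow responses at the client are the real problem.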

There’s always the chance it’s all something else too :man_shrugging: Going off what we’re seeing and what’s been tried, it’s worthwhile to give it a go w/ a different query timeout. We’ll at least learn something more there.


I saw discussion about different indicators for network health.
But looking at my logs I did not find anything like this.
Lots of safe_network::routing::core::msg_handling:Handling system message lines tell me nothing.
Maybe I should tweak logging levels, but it is better for you to tell which tweaks are optimal.

To see what?

You can check for ServiceMsg coming in to nodes if you want to see how/when client stuff comes in and is handled?

To see if network is dead or not.
If it is doing something useful.
As far as I know, no chunk stores can mean both things: the network is broken or no one is uploading files.

I do not see such text in my logs.
Wrong logging level?

I use RUST_LOG=info,safe_network=debug,quinn=off
But it may be outdated.
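For anyone unsure what a filter string like that actually enables, here is a rough sketch of how such directives are usually resolved. It assumes the common Rust logging convention (as in env_logger): a bare level sets the default, `target=level` overrides it, and the longest matching target prefix wins. It is a simplified illustration, not the parser safe_network really uses, and all function names are made up.

```python
LEVELS = ["off", "error", "warn", "info", "debug", "trace"]


def parse_filter(spec):
    """Split a RUST_LOG-style string into (default_level, {target: level})."""
    default = "error"  # assumed fallback when no bare level is given
    per_target = {}
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            target, level = part.split("=", 1)
            per_target[target.strip()] = level.strip().lower()
        elif part.lower() in LEVELS:
            default = part.lower()
        else:
            # A bare target name enables every level for that target.
            per_target[part] = "trace"
    return default, per_target


def effective_level(spec, target):
    """Return the most verbose level allowed for `target` (longest prefix wins)."""
    default, per_target = parse_filter(spec)
    best, level = "", default
    for prefix, lvl in per_target.items():
        if target.startswith(prefix) and len(prefix) > len(best):
            best, level = prefix, lvl
    return level


def is_enabled(spec, target, level):
    """True if a message at `level` from `target` would pass the filter."""
    return 0 < LEVELS.index(level) <= LEVELS.index(effective_level(spec, target))
```

With `info,safe_network=debug,quinn=off`, anything safe_network emits at trace level is filtered out, so a message logged only at trace would never show up with that setting.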

Yeh, could be. Maybe try trace. I use RUST_LOG=safe_network=trace normally.

This is with you having a local testnet on the go, right? How many nodes?

If you’re not seeing nodes doing anything and the client tests don’t pass against your network then yeh, something is wrong for sure.


No, it is from version 1 testnet by @Josh.


Ahhh, your node against that network? Right, right :ok_hand:

Yes, my node was successfully connected to the first network.
Then the network stopped working and I stopped my node afterwards.

But I was only able to tell that something was wrong with it by the abrupt stop of kv_store messages appearing in the logs and by the low activity of the cat command for a long time.

These indicators are not reliable enough.
Will try safe_network=trace next time, maybe it will show more clues.
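The "messages abruptly stopped" check can at least be automated. Below is a small sketch that assumes the log has already been parsed into (timestamp in seconds, text) pairs, since the real log format is not shown here. `seconds_since_last` and `looks_stalled` are made-up helper names, and `kv_store` is just the marker mentioned above.

```python
def seconds_since_last(entries, marker, now):
    """Seconds between `now` and the newest entry containing `marker`.

    `entries` is an iterable of (timestamp_secs, text) pairs.
    Returns None if the marker never appears.
    """
    last = None
    for ts, text in entries:
        if marker in text and (last is None or ts > last):
            last = ts
    return None if last is None else now - last


def looks_stalled(entries, marker, now, max_quiet_secs):
    """True if the marker has been absent for longer than `max_quiet_secs`."""
    quiet = seconds_since_last(entries, marker, now)
    return quiet is None or quiet > max_quiet_secs
```

This only flags silence. It still cannot tell a broken network from one where nobody is uploading, so it is no more reliable than eyeballing the logs, just less manual.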


Cool effort. Well done.

AE quite a bit away it seems? Hopefully stable tests soon👍

Maybe, maybe not. Could be it just needs a wee tweak in the right place. That’s what the testnets are for, to find the issues and flag up problems.
I suspect that we need much greater participation in the testnets and to get a larger number (and variety) of error msgs. This should help determine much faster whether it’s a problem with the code itself or the environment of the tester. So everything that can be done to make it easier for the community to join in and contribute is very welcome.


It’s there and doing its job.

I am thinking about streaming the genesis node with the next iteration so that folk who have not run a node can appreciate the beauty of it… or of course they may think that I am a nutter for thinking that it is a thing of beauty.


Beauty is in the eye of the beholder…

So you intend streaming the detailed logs of the genesis node?
I’d pay money to see that.
Not a lot of money and probably in a currency you wouldn’t want, but it’s still money… Or beer.


Thinking about. :slightly_smiling_face: but yeah I think it would be cool.


That would be very awesome and really appreciated!!


What do you mean by “streaming” in this context please?

Just the terminal/logs live on twitch or something like that. Need to figure it out (hence thinking about :upside_down_face:)


Ah ok, thanks.

If the collection of logs is something that’s really important, I actually think we should seriously consider bringing ELK into the picture so that all the logs can be forwarded and collected by an ELK setup.


Ahh, I was just thinking of something visual for people who can’t currently participate (in my context).
Don’t know about ELK, I’ll take a look.


ELK is Elasticsearch, Logstash and Kibana. It’s an industry-standard way of forwarding logs to one place.

Maidsafe could potentially have a few VMs that run an ELK stack and then any testnet that gets spun up can forward all their logs there.


Seeing as I have a DO droplet sitting doing nothing right now …

Hmm, I’m going to need a bigger droplet: it needs 2 vCPUs and 4GB RAM.
