Fleming-testnet - Error: Unable to send Message to Peer: MissingSecretKeyShare

Yeah, I have this error.

Will check later if I do see any chunks.

1 Like

I have similar errors starting from 11:17 this morning

[sn_node] INFO 2021-04-10T10:37:47.285569687+01:00 [src/node/handle.rs:36] Handling NodeDuty: No op.
[sn_node] INFO 2021-04-10T10:57:43.120940818+01:00 [src/event_mapping/mod.rs:46] Handling RoutingEvent: MessageReceived { content: "000500..", src: Section(a76176..), dst: Node(f7183e..) }
[sn_node] INFO 2021-04-10T10:57:43.148048927+01:00 [src/node/handle.rs:36] Handling NodeDuty: WriteChunk
[sn_node] INFO 2021-04-10T10:57:43.148108323+01:00 [src/chunks/chunk_storage.rs:73] TRYING TO STORE BLOB
[sn_node] INFO 2021-04-10T10:57:43.148175832+01:00 [src/chunk_store/mod.rs:100] Writing chunk
[sn_node] INFO 2021-04-10T10:57:43.157169825+01:00 [src/chunk_store/mod.rs:104] consumed space: 1048604
[sn_node] INFO 2021-04-10T10:57:43.157213245+01:00 [src/chunk_store/mod.rs:105] max : 2147483648
[sn_node] INFO 2021-04-10T10:57:43.157236115+01:00 [src/chunk_store/mod.rs:106] use space total : 45650917
[sn_node] INFO 2021-04-10T10:57:43.228927029+01:00 [src/chunk_store/mod.rs:125] Writing chunk succeeded!
[sn_node] INFO 2021-04-10T10:57:43.229288353+01:00 [src/node/handle.rs:36] Handling NodeDuty: No op.
[sn_node] INFO 2021-04-10T11:17:24.387039308+01:00 [src/event_mapping/mod.rs:46] Handling RoutingEvent: MessageReceived { content: "000500..", src: Node(cff019..), dst: Node(f7183e..) }
[sn_node] INFO 2021-04-10T11:17:24.387278981+01:00 [src/node/handle.rs:36] Handling NodeDuty: ReplicateChunk
[sn_node] INFO 2021-04-10T11:17:24.387350508+01:00 [src/chunks/mod.rs:80] Creating new Message for acquiring chunk from current_holders
[sn_node] INFO 2021-04-10T11:17:24.387403372+01:00 [src/chunks/chunk_storage.rs:136] Sending NodeSystemQuery::GetChunk to existing holders
[sn_node] INFO 2021-04-10T11:17:24.387479497+01:00 [src/node/handle.rs:36] Handling NodeDuty: SendToNodes [ targets: {dd3111.., e8a8f8.., fcdffe..}, msg: NodeQuery { query: System(GetChunk { new_holder: f7183e.., address: Public(f003fb..), current_holders: {} }), id: MessageId(0905d7..), target_section_pk: None } ]
[sn_node] ERROR 2021-04-10T11:17:24.387696749+01:00 [src/node/messaging.rs:68] Unable to send Message to Peer: MissingSecretKeyShare
[sn_node] ERROR 2021-04-10T11:17:24.387817355+01:00 [src/node/messaging.rs:68] Unable to send Message to Peer: MissingSecretKeyShare
[sn_node] ERROR 2021-04-10T11:17:24.387973383+01:00 [src/node/messaging.rs:68] Unable to send Message to Peer: MissingSecretKeyShare
[sn_node] INFO 2021-04-10T11:17:24.388451814+01:00 [src/event_mapping/mod.rs:46] Handling RoutingEvent: MessageReceived { content: "000500..", src: Node(cff019..), dst: Node(f7183e..) }
[sn_node] INFO 2021-04-10T11:17:24.398505732+01:00 [src/node/handle.rs:36] Handling NodeDuty: ReplicateChunk
[sn_node] INFO 2021-04-10T11:17:24.398580584+01:00 [src/chunks/mod.rs:80] Creating new Message for acquiring chunk from current_holders
[sn_node] INFO 2021-04-10T11:17:24.398637264+01:00 [src/chunks/chunk_storage.rs:136] Sending NodeSystemQuery::GetChunk to existing holders
[sn_node] INFO 2021-04-10T11:17:24.398706752+01:00 [src/node/handle.rs:36] Handling NodeDuty: SendToNodes [ targets: {dd3111.., e8a8f8.., fcdffe..}, msg: NodeQuery { query: System(GetChunk { new_holder: f7183e.., address: Public(f176cf..), current_holders: {} }), id: MessageId(3d241b..), target_section_pk: None } ]
[sn_node] ERROR 2021-04-10T11:17:24.398856689+01:00 [src/node/messaging.rs:68] Unable to send Message to Peer: MissingSecretKeyShare
[sn_node] ERROR 2021-04-10T11:17:24.398976298+01:00 [src/node/messaging.rs:68] Unable to send Message to Peer: MissingSecretKeyShare
[sn_node] ERROR 2021-04-10T11:17:24.399093889+01:00 [src/node/messaging.rs:68] Unable to send Message to Peer: MissingSecretKeyShare
[sn_node] INFO 2021-04-10T11:17:24.399155067+01:00 [src/event_mapping/mod.rs:46] Handling RoutingEvent: MessageReceived { content: "000500..", src: Node(cff019..), dst: Node(f7183e..) }
[sn_node] INFO 2021-04-10T11:17:24.399266873+01:00 [src/node/handle.rs:36] Handling NodeDuty: ReplicateChunk
[sn_node] INFO 2021-04-10T11:17:24.399336874+01:00 [src/chunks/mod.rs:80] Creating new Message for acquiring chunk from current_holders
[sn_node] INFO 2021-04-10T11:17:24.399401768+01:00 [src/chunks/chunk_storage.rs:136] Sending NodeSystemQuery::GetChunk to existing holders

and lots of similar messages.
But here is the strange bit: although there are no more entries in $HOME/.safe/node/local-node/sn_node.log after 11:17, when I look at $HOME/.safe/node/local-node/chunks/immutable I see there are several chunks that were stored or modified(?) since…

willie@gagarin:~/projects/maidsafe/local_sites$ ls -alt ~/.safe/node/local-node/chunks/immutable/
total 598612
drwxrwxr-x 2 willie willie   86016 Apr 10 13:15 .
-rw-rw-r-- 1 willie willie  790524 Apr 10 13:15 010000002c9e181d0dce0045d5e8f2a9fb4a2706cb8930fdadd2e36941ec20d5d1490f37
-rw-rw-r-- 1 willie willie       8 Apr 10 13:15 used_space
-rw-rw-r-- 1 willie willie  790572 Apr 10 13:14 01000000122886b1ae2783e28c798b655012f53ae0242753ba412d07814e8ac0886adcbd
-rw-rw-r-- 1 willie willie  790556 Apr 10 13:14 01000000f0f54b5f643d7fe071836e8b26bdecf101aeadfb01d8a3888a069b2eca0edf29
-rw-rw-r-- 1 willie willie  790572 Apr 10 13:14 01000000f61fd9e5747698bce2835bf37d402529b88867ffb5927d4d369f7748ebdf4733
-rw-rw-r-- 1 willie willie  790588 Apr 10 13:14 01000000227bdc5f975f883566b75f722da0242b0ab5e5c2955544f9e4fcf6d77a206481
-rw-rw-r-- 1 willie willie  790572 Apr 10 13:13 010000002e1efdbd4dd3c84198f45699bd5af75637876be6ed66016afbdd0d113dbf92c7
-rw-rw-r-- 1 willie willie  790540 Apr 10 13:13 010000003b44e395a3813740b878e7d21bf875f621889e3ce261d1d54a43e362c1802d58
-rw-rw-r-- 1 willie willie  790556 Apr 10 13:13 010000002602fd08ce3bcd5835a24953a2231ef20b7cb109302c894219611ba70d3bbaa3
-rw-rw-r-- 1 willie willie  790556 Apr 10 13:13 01000000003b24acdcd25d277d02efc8c8808216bfeac2aed137a8528c2627032680c880
-rw-rw-r-- 1 willie willie  790572 Apr 10 13:12 010000000fe05d8aa41884b9be40c682451c7a974b403d4ee6e7c7a33d79fb6d972f63fb
-rw-rw-r-- 1 willie willie  790556 Apr 10 13:12 010000003c9f0336992b0dbb28e49da85bff81df7f367cbcf461fef2b8e012df1e918d6c
-rw-rw-r-- 1 willie willie  790540 Apr 10 13:11 0100000037f9c59d8cd1405a613bd252ade811e7dc3de0a72595118a39de12b7f3ae9cb4
-rw-rw-r-- 1 willie willie  790588 Apr 10 13:11 0100000008c73d29f429a22bd9b22613c28dd734b87676cd18542bd9f32661db57b0e160
-rw-rw-r-- 1 willie willie  790556 Apr 10 13:11 01000000fa09964cb21d0177fef02aaf69f47870048788cb9fecf098153648d488e9ad44
-rw-rw-r-- 1 willie willie  790556 Apr 10 13:11 01000000f28fe067aa2ffc83f360dc22a5f02f8d361467a5965820cd91414b71111c76cd
-rw-rw-r-- 1 willie willie  790556 Apr 10 13:11 010000001bdf74154d185eb8d32a210d414fe1037b4bc2d29721edb7280e652607d412d1
-rw-rw-r-- 1 willie willie  790556 Apr 10 13:10 01000000f25b76ee998a8db1996756118fdf62ab6f87af3ef5cd70ae06102fd64da760d8
-rw-rw-r-- 1 willie willie  790572 Apr 10 13:10 010000001043a974ce20a03fa77d4a71e22fc31ca5132f3a63528a99028fa1ac4eb68de0
-rw-rw-r-- 1 willie willie  790572 Apr 10 13:09 01000000369397686092edba258da6376eb6627c2fbf7580fc885bd04f486c4414c9a464
-rw-rw-r-- 1 willie willie  790540 Apr 10 13:09 010000002b1ce4f649bb9308fde30a3571277b5d70fc79bce618acc7f86f63d4755a65ed
-rw-rw-r-- 1 willie willie  790572 Apr 10 13:09 01000000fd33a974d7c4064cc48163d531451a80e7657cca80cbea2fcdb59c3c21891833
-rw-rw-r-- 1 willie willie 1048604 Apr 10 11:17 01000000133fee71b8c2173a2bb28e35f53563d8066dbb34345fce0362fe7ef551510068
-rw-rw-r-- 1 willie willie 1048604 Apr 10 11:14 01000000279ff5103c9ca6167a7aa3a158a4ee88c05d5fc32b6dcbc7ea885ee5d7c2c063
-rw-rw-r-- 1 willie willie 1048604 Apr 10 11:11 01000000328ee001f66239f70ef7e51304922013186c6ee3311d95eb0c3ac4348e4c3111
-rw-rw-r-- 1 willie willie 1048604 Apr 10 11:10 01000000bd9495de9cd9f1c8c1cdb696dfec7dd0013d7fd601b052190c4bd3875e3b3ff9
-rw-rw-r-- 1 willie willie 1048604 Apr 10 11:08 01000000bfb855b4ae52f0cfb8d3457b60e2e9c3b5ca8bb1b686d22b6c8e8cf8118603d9
-rw-rw-r-- 1 willie willie 1048604 Apr 10 11:05 0100000006982c63931d3d3beca6cc34eedfe4490a346e0b1c6fe801337ef5f6b2862bb9
-rw-rw-r-- 1 willie willie 1048604 Apr 10 11:04 0100000016e6484e6bebe5a84959ce2b8f1c7263574d60b5de48116ffb1f6a7db4587f54
-rw-rw-r-- 1 willie willie 1048604 Apr 10 11:01 01000000082b1a8e7f5a84f41cec853a23b05e40f139b26947814f5f33789b1795edad7b
-rw-rw-r-- 1 willie willie 1048604 Apr 10 10:58 010000000b1d17c8adbbc797660b6316697734579c5c3805b6c2aec79d2abc740693c339
-rw-rw-r-- 1 willie willie 1048604 Apr 10 10:57 01000000fd99de91fbe1623c72ab7aedf9b63c7fb537575ef4c7f9c99b6a517961d1e732
-rw-rw-r-- 1 willie willie 1048604 Apr 10 10:54 010000003f4f7461ad855e33083acaf457ba909b4d4c93e786ba62f79001d1e9e87bbfcf

Note the change in size of the chunks after 11:17, though.
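
For keeping an eye on this, a minimal sketch (same default paths as above, assuming standard coreutils and watch are available):

# Re-list the newest chunk-store entries every minute
watch -n 60 'ls -alt ~/.safe/node/local-node/chunks/immutable/ | head -n 15'

That way new or changed chunks stand out without tailing the log.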

@bochaco @joshuef @StephenC do you want the logs and dir listings in their entirety, or have I spammed enough?

2 Likes

This will go away. It takes some time. The number of MissingSecretKeyShare errors goes down until it’s gone. Then … your first chunk!
(For me it took 2 hours from the node getting accepted to the first chunk being stored.)
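
If you want to watch for that turning point, here is a minimal sketch against the default local-node log path used earlier in this thread (adjust the path if yours differs):

# Count the recoverable errors vs. successful chunk writes in the node log
LOG="$HOME/.safe/node/local-node/sn_node.log"
echo "MissingSecretKeyShare errors: $(grep -c 'MissingSecretKeyShare' "$LOG")"
echo "Chunks written: $(grep -c 'Writing chunk succeeded' "$LOG")"

When the first count stops climbing and the second starts moving, the node has begun storing chunks.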

1 Like

So my node was not dead? Glad I did not shut it down.
I see 5 chunks now.
And I see corresponding lines in logs:

[sn_node] ERROR 2021-04-10T13:17:42.871929900+03:00 [src\node\messaging.rs:68] Unable to send Message to Peer: MissingSecretKeyShare

(after 2 hours of doing nothing…)

[sn_node] INFO 2021-04-10T15:09:27.191312100+03:00 [src\event_mapping\mod.rs:46] Handling RoutingEvent: MessageReceived { content: "000500..", src: Section(993623..), dst: Node(b5b3a9..) }
[sn_node] INFO 2021-04-10T15:09:27.225312100+03:00 [src\node\handle.rs:36] Handling NodeDuty: WriteChunk
[sn_node] INFO 2021-04-10T15:09:27.227312100+03:00 [src\chunks\chunk_storage.rs:73] TRYING TO STORE BLOB
[sn_node] INFO 2021-04-10T15:09:27.227312100+03:00 [src\chunk_store\mod.rs:100] Writing chunk
[sn_node] INFO 2021-04-10T15:09:27.230312100+03:00 [src\chunk_store\mod.rs:104] consumed space: 790556
[sn_node] INFO 2021-04-10T15:09:27.231312100+03:00 [src\chunk_store\mod.rs:105] max : 50000000000
[sn_node] INFO 2021-04-10T15:09:27.231312100+03:00 [src\chunk_store\mod.rs:106] use space total : 0
[sn_node] INFO 2021-04-10T15:09:27.379312100+03:00 [src\chunk_store\mod.rs:125] Writing chunk succeeded!
[sn_node] INFO 2021-04-10T15:09:27.380312100+03:00 [src\node\handle.rs:36] Handling NodeDuty: No op.

Ah, but mine was working fine beforehand. I may have been the first community member to get chunks, and all was going well. Then I fired up a node on the laptop and that eventually got chunks after the usual “try later” messages.
I shut them both down with a total of about 200 MB of chunks to see if I could break anything, and all seemed fine. I updated the node to 0.35.2 (it’s just a trivial change) and started again. As before, it took a few hours of “try later” and then it started storing chunks as expected, until 11:17 this morning when something happened.

If these “errors” are potentially expected as part of the syncing process, they should probably be “warnings.” I have no idea if this is expected behavior or not unfortunately.

In this case, it is good to know that getting these errors is not a reason to kill the node. It is similar to cases where retrying a command once more would solve a problem, or knowing that executing safe commands can take minutes even for the simplest ones.
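
One way to take the sting out of that is a small retry wrapper around the CLI (just a sketch; the retry count and delay are arbitrary):

# Retry a safe command a few times before giving up
retry() {
  for attempt in 1 2 3; do
    "$@" && return 0
    echo "Attempt $attempt failed, retrying in 10s..." >&2
    sleep 10
  done
  return 1
}

retry safe keys balance

This keeps a flaky command from being mistaken for a dead node or a broken account.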

1 Like

Mine is taking a good few seconds rather than minutes to run commands. Still cannot put a single small .css file even with 600+ coins, though…

willie@gagarin:~/projects/maidsafe/local_sites$ time safe keys balance
Checking balance of CLI's assigned keypair...
SafeKey's current balance: 625.997359750

real	0m16.356s
user	0m2.461s
sys	0m0.544s
willie@gagarin:~/projects/maidsafe/local_sites$ time safe files put starwarsintro.css 
Error: NetDataError: Failed to PUT Public Blob: Transfer(InsufficientBalance)

real	0m23.922s
user	0m1.077s
sys	0m0.106s
willie@gagarin:~/projects/maidsafe/local_sites$ 

After doing nothing for a good few hours, I appear to be storing chunks again.

Same thing for me.
Tried many times in a loop.
Will try to make a new account.
Maybe it will help.

1 Like

With full Anti Entropy in place these errors are all recoverable. They are still errors, but recoverable ones. What is likely is that they will drop to a lower logging level, so not visible to ordinary users, but they will be to folk wanting to poke further down the stack.

7 Likes

Now I am seeing these again: ERROR … src/node/messaging.rs:68] Unable to send Message to Peer: MissingSecretKeyShare. Also, safe commands such as files put and keys balance just hang indefinitely now.

An attempt to fix it:
safe networks check … the current network setup does not match any of the networks in CLI config. (!?? strange, I have not touched anything.)

safe networks switch fleming-testnet
safe auth restart
safe auth unlock … Error: AuthdError: ClientError: Requested data not found :frowning:
safe auth unlock … Error: AuthdError: ClientError: Requested data not found :frowning:
safe auth unlock … Error: AuthdError: ClientError: Requested data not found :frowning:
safe auth unlock … Safe unlocked successfully :slight_smile:
safe auth lock
safe auth unlock --self-auth … Error: AuthdError: ClientError: Requested data not found :frowning:
safe auth unlock --self-auth … Safe CLI app was successfully authorised :slight_smile:
safe keys balance … MessageId MessageId(68a3b3…) was interrupted due to infrastructure updates. :frowning:
safe keys balance … hangs. :frowning:

No progress. I guess restart the node then?

safe node killall
safe node join … Routing(TryJoinLater) :frowning:
safe keys balance … hangs. :frowning:
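
For reference, the same recovery attempt as one small script (a sketch only; it repeats the commands above and retries the unlock, which sometimes needs several attempts):

# Recovery sequence: switch network, restart authd, unlock with retries
safe networks switch fleming-testnet
safe auth restart
until safe auth unlock --self-auth; do
  echo "unlock failed, retrying in 10s..." >&2
  sleep 10
done
safe keys balance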

Hmm, it could be so many things, but I guess it would be better for me to test from a machine in the cloud to rule out issues with NAT. The first goal is to get response times down to the order of seconds…

A machine located in New York did not work any better, timing out at safe auth create --test-coins.

1 Like

Same here.

The node join occurred last night; I have had no chunks since the MissingSecretKeyShare errors in sn_node.log at 3.33am.

safe networks check at 7.30am:
Current network setup in your system doesn't match any of your networks in the CLI config. Use 'networks switch' command to switch to any of them

safe networks switch fleming-testnet
~/.safe/authd/sn_authd restart

I can now run safe auth create --test-coins without a timeout error.

Monitoring to see if the node starts working again.

Try restarting authd using its path instead of the CLI.

1 Like

Something broke at 3.33am; the node was working and then stopped.

Here is a snip of the sn_authd.log between 00.22 and 3.33

1 Like

13 posts were split to a new topic: Testnet timeout issues

The node started working again at 1.48pm with no intervention.

I have put this down to Anti Entropy.

2 Likes

My chunks (look at date and time):
[image: chunk directory listing showing dates and times]

This is what happens (I think): the network is lagging → not many people are able to upload files → a low and irregular number of chunks arriving.

2 Likes

Local time is UK BST - are you 2 hours ahead?

1 Like

I am at UTC+3.

1 Like

(UK BST = UTC+1)
Is it unusual that our chunks update at the same time? The expectation would be random timestamps.

1 Like

I suspect you have hit the nail on the head here. Also, when nodes relocate they should not delete chunks, but upload them to the network again. That can have this effect, and the same goes for splits. Nodes restarting should also do this.

So a rule is that a node never deletes data without first saving it again to the network. It’s a good rule, but it can “hide” errors at this stage; however, we will find those errors. I am unconvinced that, with anything short of full AE, these things will happen.

We have mostly AE in place, but rushed this baby out to prove a lot more, and that has been great. The next one (soon) will be significantly better, and on we go.

7 Likes

This topic was automatically closed after 60 days. New replies are no longer allowed.