Looking for some help with debugging one of the issues with test-12b. We think node
a20694.. was provided with two GetNodeName responses when it started up, due to a bug. To confirm this, or to see why/who provided this invalid response, it’d be a great help to get hold of the log file for this node, if you guys still have it.
So if anyone who ran a node could check their log file (if you still have it), and it turns out it was node
a20694.., it’d be great to get hold of that log file.
If you could also search for
Expecting candidate a20694.. and get a hit, that log file would also be worth getting hold of.
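For anyone checking, a quick way to scan a folder of logs is a recursive grep. The path below is hypothetical; point it at wherever your vault writes its logs:

```shell
LOG_DIR="$HOME/safe_vault"   # hypothetical location; adjust to your install
# List any log files that mention the candidate line from this thread.
grep -rl "Expecting candidate a20694" "$LOG_DIR" 2>/dev/null || true
```

If the command prints a file name, that's the log worth sharing.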
I suggest that from now on we hold onto our logs for every TEST network until a new TEST network becomes available. I think (please correct me if I’m wrong) this means moving the folder that holds your vault software and configuration files to a new location every time you shut down your vault.
After you have relocated and secured those logs, just extract the originally downloaded vault file the way you normally would and start again.
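A minimal sketch of that routine, assuming the vault lives in a single folder. The paths here are hypothetical; adjust them to your own setup:

```shell
VAULT_DIR="$HOME/safe_vault"                                   # hypothetical install folder
ARCHIVE_DIR="$HOME/vault_archive/test-12b-$(date +%Y%m%d-%H%M%S)"

# Move the whole folder (binary, config, logs) aside before restarting.
if [ -d "$VAULT_DIR" ]; then
    mkdir -p "$(dirname "$ARCHIVE_DIR")"
    mv "$VAULT_DIR" "$ARCHIVE_DIR"
fi
# Then re-extract the originally downloaded vault archive as usual, e.g.:
# tar -xzf safe_vault.tar.gz -C "$HOME"
```

The timestamp in the archive folder name keeps runs from the same testnet from overwriting each other.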
It might be a little inconvenient for the impatient among us, like myself, but it could potentially help speed up development.
THAT is more than enough to justify such practice IMHO.
Hey @Viv, would it be trivial for you guys to program the vaults to create separate logs for each node ID?
This way, every time nodes are restarted, whether by the running vault or manually by the user, a neatly packaged log file could be generated and made available for diagnostics. Maybe in a subfolder called “logs”?
This would make things a ton simpler for both the community and the team, I think.
Logging right now comes from a different crate, so we could either check for the presence of a previous log file before initialising the logger and move it to a unique name (append a timestamp or something), or update the upstream crate to support dynamic file name specification from the log.toml config files for file loggers. There are also more states to consider than just the node name: a bunch of things occur before a node gets its name (bootstrap and relocation requests), and even then the resource proof process needs to succeed before approval, at which point the node name is kinda confirmed. Pre resource proof, the process simply restarts. Also, during runtime, if a node has its RT (routing table) collapse, then it restarts again too.
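For illustration, the “move the old file aside before the logger opens it” idea might look something like this. This is only a sketch; `rotate_old_log` and the file names are hypothetical, not the actual vault code:

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: before the logger is initialised, move any
/// log file left over from a previous run to a unique, timestamped
/// name so the old session's log survives a restart.
fn rotate_old_log(log_path: &str) -> io::Result<()> {
    if Path::new(log_path).exists() {
        let ts = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock before Unix epoch")
            .as_secs();
        // e.g. "Node.log" -> "Node.log.1486239471"
        fs::rename(log_path, format!("{}.{}", log_path, ts))?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Demo: fake a leftover log from a previous session, then rotate it.
    fs::write("Node.log", "old session\n")?;
    rotate_old_log("Node.log")?;
    assert!(!Path::new("Node.log").exists());
    // ...initialise the logger here; it now creates a fresh Node.log.
    Ok(())
}
```

A per-node-ID filename would still need the extra-states handling described above, since the name isn't known until after relocation and resource proof.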
Ideally, a log collector on an opt-in basis may be beneficial, allowing testnet-only logs to be submitted on destruction/startup/manually to something like the visualiser, or even a log accumulator.
Sounds good. Consider me opt in if you choose to make it happen.
Wish I could check my logs but I am away for 10 days. Any point in running the same test a second time?
Nah, we’re addressing some stuff based on assumptions about the behaviour we expect might have happened here, and we’ll see as part of test-12c when that gets out. Logs are also getting tweaked to indicate this more clearly if it occurs, so it should be possible to trace it down a lot more easily if it happens again, and hopefully, even if it does, it won’t cause a breaking issue for that network.
Checked my logs; it wasn’t either of these nodes, and there are no messages related to them.