Issue connecting nodes via podman containers. "Cannot start node (log path: unknown)"

I created two nodes, one root/genesis node and one child node, both placed in a container.
The child node is unable to connect.

Containers:

$ podman ps -a
CONTAINER ID  IMAGE                                COMMAND     CREATED             STATUS                 PORTS                                                             NAMES
26d4b5bd0d9a  localhost/rootnode-ipv4_test:latest              About a minute ago  Up About a minute ago  192.168.178.29:12000->12000/tcp, 192.168.178.29:12000->12000/udp  test_rootnode-ipv4
769531ffe08d  localhost/joinnode-ipv4_test:latest              About a minute ago  Up 1 second ago        192.168.178.29:12001->12001/tcp, 192.168.178.29:12001->12001/udp  joinnode-ipv4_1

Child node log:

$ podman logs joinnode-ipv4_1
Switching to 'lan-ipv4' network...
Fetching 'lan-ipv4' network connection information from '/home/admin/.safe/cli/networks/lan-ipv4_node_connection_info.config' ...
Successfully switched to 'lan-ipv4' network in your system!
If you need write access to the 'lan-ipv4' network, you'll need to restart authd (safe auth restart), unlock a Safe and re-authorise the CLI again
Starting logging to stdout
 INFO 2021-11-23T13:05:47.037591Z [sn/src/routing/routing_api/mod.rs:L152]:
	 ➤ d5b9aa.. Bootstrapping a new node.
 ERROR 2021-11-23T13:06:47.050533Z [/cargo/registry/src/github.com-1ecc6299db9ec823/qp2p-0.27.2/src/endpoint.rs:L290]:
	 ➤ bootstrap {bootstrap_nodes=[192.168.178.29:12000]}
	 ➤ Failed to bootstrap to the network, last error: timed out
 WARN 2021-11-23T13:06:47.050670Z [/cargo/registry/src/github.com-1ecc6299db9ec823/qp2p-0.27.2/src/endpoint.rs:L418]:
	 ➤ bootstrap {bootstrap_nodes=[192.168.178.29:12000]}
	 ➤ Could not determine better public IP than local IP (0.0.0.0)
 WARN 2021-11-23T13:06:47.050705Z [/cargo/registry/src/github.com-1ecc6299db9ec823/qp2p-0.27.2/src/endpoint.rs:L443]:
	 ➤ bootstrap {bootstrap_nodes=[192.168.178.29:12000]}
	 ➤ Could not determine better public port than local port (46785)
Error:
   0:
Cannot start node (log path: unknown). If this is the first node on the network pass the local address to be used using --first
   1: Routing error:: Could not connect to any bootstrap contact
   2: Could not connect to any bootstrap contact

Location:
   sn/src/bin/sn_node.rs:216

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

At least with Docker containers, 127.0.0.1:12000 doesn't refer to the host but to the running container itself.

You could try to get around it by using the public IP of your container host (or its hostname, if your router resolves that locally to the right IP address). Alternatively, if podman has the same flags as Docker, the additional argument --net=host might help route the communication correctly.
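As a rough sketch of the --net=host idea: podman does accept `--network host` (with `--net` as an alias), which makes the container share the host's network stack, so the published port mappings are no longer needed. The helper name `start_join_node_hostnet` and the `RUN` switch are made up for illustration, and the image name is taken from the `podman ps` output in the original post. Whether the image's entrypoint needs extra arguments is not known from this thread.

```shell
# Hypothetical helper: builds the podman command for a host-networked join node.
# By default it only prints the command; set RUN=1 to actually execute it.
start_join_node_hostnet() {
    image="$1"
    # --network host: the container uses the host's interfaces directly,
    # so 192.168.178.29:12000 is reachable exactly as it is from the host.
    set -- podman run --rm --network host "$image"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Dry run with the image name from this thread (an assumption about your setup):
start_join_node_hostnet localhost/joinnode-ipv4_test:latest
```

Printing first makes it easy to check the command before running it with `RUN=1`.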

(just two thoughts I had when reading your post - no clue if that’s really your problem)

That is the problem.
I just discovered it at the same time you wrote that message.
I've already dropped localhost for the containers and switched to the LAN address instead, but as you can see in the updated original post, I still have the same issue.

Hmmm - just to be sure… You could start an interactive Ubuntu container, install nmap, and run 'nmap -Pn -p 11999-12002 192.168.178.29' to verify that the started podman container can reach the host's port (and that it's not blocked, e.g. by the ufw firewall).

Maybe as step 0 you could do a port scan on the host, scanning itself, to see whether there is something on port 12000 (and that it is not filtered by a firewall).
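If installing nmap inside the container is inconvenient, a similar TCP reachability check can be sketched with bash alone. This assumes the container image ships bash and the coreutils `timeout` command (a stock Ubuntu image does). The function name `check_tcp` is made up for this example.

```shell
# Returns 0 if a TCP connection to host:port succeeds within 2 seconds,
# non-zero if it is refused, filtered, or times out.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
check_tcp() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage, with the address and port range from this thread:
# for port in 11999 12000 12001 12002; do
#     if check_tcp 192.168.178.29 "$port"; then
#         echo "$port open"
#     else
#         echo "$port closed or filtered"
#     fi
# done
```

Like the nmap command above, this only tests TCP, so it says nothing about UDP traffic.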

$ nmap -Pn -p 11999-12002 192.168.178.29
Starting Nmap 7.92 ( https://nmap.org ) at 2021-11-26 13:30 CET
Nmap scan report for Rezosur-zot.fritz.box (192.168.178.29)
Host is up (0.00056s latency).

PORT      STATE  SERVICE
11999/tcp closed unknown
12000/tcp open   cce4x
12001/tcp open   entextnetwk
12002/tcp closed entexthigh

Nmap done: 1 IP address (1 host up) scanned in 0.22 seconds

Hmmmm - that does indeed look like there are 2 services listening 🧐 🤨

That of course then raises the question of why communication fails nonetheless…

Did you try starting one of the nodes directly on the host and one in the container, to see if the problem 'has a direction' (later-added nodes vs. the genesis node…?)

I tried a rootnode in the container with a joinnode on the host, and a rootnode on the host with a joinnode in the container.
Neither works.

For the latter option, the port on the host shows as closed somehow?

$ nmap -Pn -p 11999-12002 192.168.178.29
Starting Nmap 7.92 ( https://nmap.org ) at 2021-11-26 18:24 CET
Nmap scan report for Rezosur-zot.fritz.box (192.168.178.29)
Host is up (0.00045s latency).

PORT      STATE  SERVICE
11999/tcp closed unknown
12000/tcp closed cce4x
12001/tcp open   entextnetwk
12002/tcp closed entexthigh

Nmap done: 1 IP address (1 host up) scanned in 0.18 seconds
[folaht@Rezosur-zot joinnode-ipv4]$ ps -ef | grep sn_node
folaht    996835  917707  0 18:09 pts/0    00:00:01 sn_node -vv --idle-timeout-msec 5500 --keep-alive-interval-msec 4000 --skip-igd --public-addr 192.168.178.29:12000 --first
10999     998377  998374  0 18:24 ?        00:00:00 /home/admin/.safe/node/sn_node --skip-igd
folaht    998392  917707  0 18:24 pts/0    00:00:00 grep --colour=auto sn_node
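One possible explanation for 12000/tcp showing as closed here: qp2p is built on QUIC, which runs over UDP, so a node started directly on the host would not be expected to open a TCP port at all. The open TCP ports in the earlier scan probably came from podman's port forwarding rather than from sn_node itself. A minimal sketch for checking whether something has bound a UDP port, assuming Linux and IPv4 only (`udp_port_bound` is a made-up name; IPv6 sockets live in /proc/net/udp6 and are not checked):

```shell
# Returns 0 if some process in the current network namespace has bound the
# given UDP port on IPv4, non-zero otherwise.
# /proc/net/udp lists local addresses as HEXIP:HEXPORT in the second column.
udp_port_bound() {
    hexport=$(printf '%04X' "$1")
    awk -v p="$hexport" '
        NR > 1 { split($2, a, ":"); if (a[2] == p) found = 1 }
        END { exit !found }
    ' /proc/net/udp
}

# Usage (run on the host for a host node, or inside the container for a
# containerised node, since each has its own network namespace):
# if udp_port_bound 12000; then echo "UDP 12000 is bound"; else echo "UDP 12000 is not bound"; fi
```

From another machine, `nmap -sU -Pn -p 12000 192.168.178.29` (which needs root) would be the UDP counterpart of the scans above.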

It looks like I'm not understanding --public-addr and --local-addr.

It doesn't seem to want to connect when I enter a local address that isn't localhost, but looking at the sn_node help menu, it describes the local address as the address to be used for the node.

Silly me, I never tested anything but localhost on the host, and it seems like I need to use both.

sn_node -vv --idle-timeout-msec 5500 --keep-alive-interval-msec 4000 --skip-igd --local-addr 192.168.178.29:12000 --public-addr 192.168.178.29:12000 --first &
RUST_LOG=safe_network=info,qp2p=info ~/.safe/node/sn_node --skip-igd --local-addr 192.168.178.29:12001 --public-addr 192.168.178.29:12001 --root-dir ~/.safe/node/joinnode-ipv4_1 &
...
	 ➤ cd92b0.. Joined the network!
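Since the working commands pass the same LAN address to both --local-addr and --public-addr, a small helper can keep the two in sync. This is only a convenience sketch: `node_addr_flags` is a made-up name, and it assumes, as in this thread, that the local and public address are identical.

```shell
# Prints matching --local-addr and --public-addr flags for one IP and port.
node_addr_flags() {
    printf '%s %s:%s %s %s:%s\n' --local-addr "$1" "$2" --public-addr "$1" "$2"
}

# Usage (left unquoted on purpose so the shell splits the flags into words):
# sn_node --skip-igd $(node_addr_flags 192.168.178.29 12000) --first &
# ~/.safe/node/sn_node --skip-igd $(node_addr_flags 192.168.178.29 12001) \
#     --root-dir ~/.safe/node/joinnode-ipv4_1 &
```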