Rootless podman container node sees the wrong IP address and port for the host node when the two attempt to connect to each other via slirp4netns

Safe network versions: 0.52.0, 0.47.0, 0.40.0

I’m trying to run an app called the safe network inside a rootless podman container and then connect another node to it.

It fails because the host node appears to the container node as 10.0.2.100:<random port>, so the container node tries to reach it there, when it should be reachable at 192.168.178.29:12001.

The IPs I’ve chosen

Root node inside container
- Local: tap0 address:12000
- Public/Published: LAN address or public address:12000

Host
- Local: LAN address:12001
- Public: LAN address:12001
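
To verify the container really gets the default slirp4netns address (10.0.2.100) on tap0, you can run something like the following (assuming iproute2 is installed in the image):

podman exec test_rootnode-ipv4 ip -4 addr show tap0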

host logs

 INFO 2021-12-29T11:20:26.259319Z [sn/src/routing/routing_api/mod.rs:L152]:
	 ➤ 6da530.. Bootstrapping a new node.
 INFO 2021-12-29T11:20:26.276224Z [sn/src/routing/routing_api/mod.rs:L166]:
	 ➤ 6da530.. Joining as a new node (PID: 202926) our socket: 192.168.178.29:12001, bootstrapper was: 192.168.178.29:12000, network's genesis key: PublicKey(0de4..893d)
 INFO 2021-12-29T11:20:26.276595Z [sn/src/routing/core/bootstrap/join.rs:L505]:
	 ➤ join {network_genesis_key=PublicKey(0de4..893d) target_section_key=PublicKey(0de4..893d) recipients=[Peer { name: 6da530(01101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }]}
	 ➤ send_join_requests {join_request=JoinRequest { section_key: PublicKey(0de4..893d), resource_proof_response: None, aggregated: None } recipients=[Peer { name: 6da530(01101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }] section_key=PublicKey(0de4..893d) should_backoff=false}
	 ➤ Sending JoinRequest { section_key: PublicKey(0de4..893d), resource_proof_response: None, aggregated: None } to [Peer { name: 6da530(01101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }]
 INFO 2021-12-29T11:20:27.451718Z [sn/src/routing/core/bootstrap/join.rs:L367]:
	 ➤ join {network_genesis_key=PublicKey(0de4..893d) target_section_key=PublicKey(0de4..893d) recipients=[Peer { name: 6da530(01101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }]}
	 ➤ Setting Node name to 8d1c11.. (age 98)
 INFO 2021-12-29T11:20:27.451765Z [sn/src/routing/core/bootstrap/join.rs:L374]:
	 ➤ join {network_genesis_key=PublicKey(0de4..893d) target_section_key=PublicKey(0de4..893d) recipients=[Peer { name: 6da530(01101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }]}
	 ➤ Newer Join response for us 8d1c11(10001101).., SAP SectionAuthorityProvider { prefix: Prefix(), public_key_set: PublicKeySet { public_key: PublicKey(0de4..893d), threshold: 0 }, elders: {Peer { name: 2d82ba(00101101).., addr: 192.168.178.29:12000, connection: None }} } from Peer { name: 2d82ba(00101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }
 INFO 2021-12-29T11:20:28.011536Z [sn/src/routing/core/bootstrap/join.rs:L505]:
	 ➤ join {network_genesis_key=PublicKey(0de4..893d) target_section_key=PublicKey(0de4..893d) recipients=[Peer { name: 6da530(01101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }]}
	 ➤ send_join_requests {join_request=JoinRequest { section_key: PublicKey(0de4..893d), resource_proof_response: None, aggregated: None } recipients=[Peer { name: 2d82ba(00101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }] section_key=PublicKey(0de4..893d) should_backoff=true}
	 ➤ Sending JoinRequest { section_key: PublicKey(0de4..893d), resource_proof_response: None, aggregated: None } to [Peer { name: 2d82ba(00101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }]
 ERROR 2021-12-29T11:21:28.031312Z [sn/src/routing/core/bootstrap/join.rs:L164]:
	 ➤ join {network_genesis_key=PublicKey(0de4..893d) target_section_key=PublicKey(0de4..893d) recipients=[Peer { name: 6da530(01101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }) }]}
	 ➤ Node cannot join the network since it is not externally reachable: 10.0.2.100:53100

container logs

 INFO 2021-12-29T10:19:25.359368Z [sn/src/routing/routing_api/mod.rs:L85]:
	 ➤ 2d82ba.. Starting a new network as the genesis node (PID: 1).
 INFO 2021-12-29T10:19:25.477464Z [sn/src/routing/routing_api/mod.rs:L128]:
	 ➤ 2d82ba.. Genesis node started!. Genesis key PublicKey(0de4..893d), hex: ade4e34061a1b0f3d1e81999bf075c0af6046f02bd81178e186a5d593957ca948dc7682d6ce729aa924aeb1a8d6ef97c
 INFO 2021-12-29T10:19:25.477680Z [sn/src/routing/routing_api/dispatcher.rs:L87]:
	 ➤ Starting to probe network
 INFO 2021-12-29T10:19:25.477867Z [sn/src/routing/routing_api/dispatcher.rs:L115]:
	 ➤ Writing our PrefixMap to disk
 INFO 2021-12-29T10:19:25.477911Z [sn/src/routing/core/mod.rs:L212]:
	 ➤ Writing our latest PrefixMap to disk
 INFO 2021-12-29T10:19:25.482805Z [sn/src/node/node_api/mod.rs:L87]:
	 ➤ Node PID: 1, prefix: Prefix(), name: 2d82ba(00101101).., age: 255, connection info: "192.168.178.29:12000"
 INFO 2021-12-29T11:21:28.021787Z [sn/src/routing/core/comm.rs:L183]:
	 ➤ Peer 10.0.2.100:53100 is NOT externally reachable: Send(ConnectionLost(TimedOut))
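
On the host, the slirp4netns process that podman starts for the container shows which tap device and subnet are in use (a quick sanity check; output varies by setup):

pgrep -a slirp4netns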

host command

safe networks switch lan-ipv4 && \
  RUST_BACKTRACE=full ~/.safe/node/sn_node -vv \
  --clear-data \
  --skip-auto-port-forwarding \
  --local-addr 192.168.178.29:12001 \
  --public-addr 192.168.178.29:12001 \
  --root-dir=/home/folaht/.safe/node/joinnode-ipv4_12001 \
  --log-dir=/home/folaht/.safe/node/joinnode-ipv4_12001 &
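
To confirm the joining node is actually bound to that address (sn_node speaks QUIC over UDP, hence the -u; ss is part of iproute2):

ss -ulpn | grep 12001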

The IP address and port the host node tries to connect to

[folaht@Rezosur-zot joinnode-ipv4_12001]$ safe networks
+----------+--------------+------------------------------------------------------------------------+
| Networks |              |                                                                        |
+----------+--------------+------------------------------------------------------------------------+
| Current  | Network name | Connection info                                                        |
+----------+--------------+------------------------------------------------------------------------+
| *        | lan-ipv4     | /home/folaht/.safe/cli/networks/lan-ipv4_node_connection_info.config   |
+----------+--------------+------------------------------------------------------------------------+
|          | local-ipv4   | /home/folaht/.safe/cli/networks/local-ipv4_node_connection_info.config |
+----------+--------------+------------------------------------------------------------------------+
|          | local-ipv6   | /home/folaht/.safe/cli/networks/local-ipv6_node_connection_info.config |
+----------+--------------+------------------------------------------------------------------------+
[folaht@Rezosur-zot joinnode-ipv4_12001]$ cat /home/folaht/.safe/cli/networks/lan-ipv4_node_connection_info.config
["8520343e22464270f10ef2408902ab87b1186e7410cf57d5390c9a44b8953097e0b36e2bd29d1169eaa446a82966b12d",["192.168.178.29:12000"]]

container node status

[folaht@Rezosur-zot rootnode-ipv4]$ podman ps -a
CONTAINER ID  IMAGE                                COMMAND     CREATED         STATUS             PORTS                                                             NAMES
f4c6a137b93d  localhost/rootnode-ipv4_test:latest              37 minutes ago  Up 37 minutes ago  192.168.178.29:12000->12000/tcp, 192.168.178.29:12000->12000/udp  test_rootnode-ipv4
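
The container was published roughly like this (a simplified sketch; volume mounts and other flags omitted):

podman run -d --name test_rootnode-ipv4 \
  -p 192.168.178.29:12000:12000/tcp \
  -p 192.168.178.29:12000:12000/udp \
  localhost/rootnode-ipv4_test:latest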

container node PID

[folaht@Rezosur-zot rootnode-ipv4]$ ps -ef | grep sn_node
10999      51780   51777  0 18:13 ?        00:00:01 sn_node -vv --idle-timeout-msec 5500 --keep-alive-interval-msec 4000 --skip-auto-port-forwarding --clear-data --local-addr 10.0.2.100:12000 --public-addr 192.168.178.29:12000 --log-dir /home/admin/.safe/node/rootnode-ipv4_12000 --root-dir /home/admin/.safe/node/rootnode-ipv4_12000 --first
folaht     52185    4230  0 18:50 pts/1    00:00:00 grep --colour=auto sn_node

It seems that the root node cannot connect to the joining node; it is trying to connect using the 10.0.2.100:53100 address instead of 192.168.178.29:12001. Perhaps confirm in the joining node logs whether it started with the 192.168.178.29:12001 address?

10.0.2.100 is the local address, so I don't think that's the issue, but 12000 is its local port.

You can see in the joining node log that it's connecting to 192.168.178.29:12000:

recipients=[Peer { name: 6da530(01101101).., addr: 192.168.178.29:12000, connection: Some(Connection { id: 944978480, remote_address: 192.168.178.29:12000, .. }

I have no idea why it then jumps from port 12000 to port 53100 or some other random port.

P.S. I’ve tried it again: same error, same IP, different port (10.0.2.100:36303).

I think I’ll file it as a bug.

I think that can still be the issue, since your joining node is not listening on that IP but on 192.168.178.29, which you specified with --local-addr/--public-addr. Something is going on with the joining node: it should presumably be listening on 192.168.178.29:12001, yet the root node sees its join requests coming from 10.0.2.100:<random port> instead. The root node then tries to connect back to the joining node using the latter IP/port, and that's why it fails.
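
You could demonstrate that rewrite with plain netcat, independent of sn_node: publish a spare UDP port (say 9999, a hypothetical choice) from the container the same way as 12000, run a bare listener on it, and see what source address it reports. A sketch, assuming OpenBSD netcat:

# inside the container:
nc -u -l -v 9999
# on the host:
echo ping | nc -u 192.168.178.29 9999

If the listener reports the sender as 10.0.2.100:<random port> rather than 192.168.178.29:<port>, the rewriting is happening in podman's port forwarding, before sn_node ever sees the packet.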

I think what you need to confirm is which address the joining node is connecting from, as that's the address the root node will try to connect back to.

I misunderstood your first post.
I understand it now.

I think I’ll have a chat with the podman people.

Or maybe I need to file this as a MaidSafe bug first, because they'll be best able to tell whether this is a podman or a Safe Network issue.
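
From skimming the podman docs (so this is an assumption on my part, not something I've verified yet): rootless podman forwards published ports through rootlesskit by default, which is documented to replace the source IP of incoming connections with an address from the container's network namespace, while slirp4netns's own port handler is supposed to preserve it. Something like this may be worth trying:

podman run -d --name test_rootnode-ipv4 \
  --network slirp4netns:port_handler=slirp4netns \
  -p 192.168.178.29:12000:12000/udp \
  localhost/rootnode-ipv4_test:latest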

You should be able to see a log entry in the joining node logs similar to the one below, which tells you which IP and port it is using to connect to the root node:

Joining as a new node (PID: 318345) our socket: 127.0.0.1:35687, bootstrapper was: 127.0.0.1:52074, network's genesis key: PublicKey(143b..47dd)

The address right after “our socket:” should in your case be 192.168.178.29:12001; if it isn't, then the issue is probably that the joining node is not picking up the values passed with --local-addr/--public-addr for some reason.

$ cat sn_node.log.2021-12-29-13  | grep Joining
	 ➤ 88df89.. Joining as a new node (PID: 203959) our socket: 192.168.178.29:12001, bootstrapper was: 192.168.178.29:12000, network's genesis key: PublicKey(0de4..893d)

It’s correct.

That’s good then.
Perhaps you can then check on the root node for a sequence of log entries similar to the following:

         ➤ Handling msg: JoinRequest from 8c0a0c.. at 127.0.0.1:35687 (connected)
 DEBUG 2021-12-29T21:42:55.063695Z [sn/src/routing/core/msg_handling/join.rs:L51]:
         ➤ handle_command {name=b11092.. prefix=() age=255 elder=true cmd_id=280799152.0 section_key=PublicKey(143b..47dd) command=HandleSystemMessage MessageId(f6bf..53f1)}
         ➤ Received JoinRequest { section_key: PublicKey(143b..47dd), resource_proof_response: None } from 8c0a0c.. at 127.0.0.1:35687 (connected)
 TRACE 2021-12-29T21:42:55.063730Z [sn/src/routing/core/msg_handling/join.rs:L168]:
         ➤ handle_command {name=b11092.. prefix=() age=255 elder=true cmd_id=280799152.0 section_key=PublicKey(143b..47dd) command=HandleSystemMessage MessageId(f6bf..53f1)}
         ➤ our_prefix Prefix() section_members 1 expected_age 98 is_age_invalid false
 INFO 2021-12-29T21:42:55.067171Z [sn/src/routing/core/comm.rs:L187]:
         ➤ handle_command {name=b11092.. prefix=() age=255 elder=true cmd_id=280799152.0 section_key=PublicKey(143b..47dd) command=HandleSystemMessage MessageId(f6bf..53f1)}
         ➤ Peer 127.0.0.1:35687 is NOT externally reachable: ......

Obviously with your IP; this is just to see whether the IP in all those messages is still the one the joining node used to connect, and the one the root node then uses for the reachability test.

Alright. Here is the equivalent part from my logs:

DEBUG 2021-12-31T00:00:50.906108Z [sn/src/routing/core/msg_handling/join.rs:L51]:
	 ➤ handle_command {name=c9a8fa.. prefix=() age=255 elder=true cmd_id=2935419944.0 section_key=PublicKey(0d39..ae26) command=HandleSystemMessage MessageId(bb51..7345)}
	 ➤ Received JoinRequest { section_key: PublicKey(0d39..ae26), resource_proof_response: None, aggregated: None } from 8ffdf7.. at 10.0.2.100:37675 (connected)
 TRACE 2021-12-31T00:00:50.906158Z [sn/src/routing/core/msg_handling/join.rs:L184]:
	 ➤ handle_command {name=c9a8fa.. prefix=() age=255 elder=true cmd_id=2935419944.0 section_key=PublicKey(0d39..ae26) command=HandleSystemMessage MessageId(bb51..7345)}
	 ➤ our_prefix Prefix() section_members 1 expected_age 98 is_age_invalid false
 ...
 INFO 2021-12-31T00:01:50.908759Z [sn/src/routing/core/comm.rs:L183]:
	 ➤ handle_command {name=c9a8fa.. prefix=() age=255 elder=true cmd_id=2935419944.0 section_key=PublicKey(0d39..ae26) command=HandleSystemMessage MessageId(bb51..7345)}
	 ➤ Peer 10.0.2.100:37675 is NOT externally reachable: Send(ConnectionLost(TimedOut))

I’ve got the fully traced log here.

Right, it looks like the address is being translated from 192.168.178.29:12001 into 10.0.2.100:37675 before it reaches the root node.
Could it be that podman is doing that translation to traffic coming from the host? (I'm not familiar at all with podman.)

Should it have stayed at 192.168.178.29:12001?

Yes, as that's the address the joining node is listening on.
