Plot of Real Test Network Data

People here might be interested in this little project that I’ll run on the Node.log from the next public test by Maidsafe. It updates every minute, which you should see if you hit refresh in your browser:

http://91.121.173.204/

There is a testing/development function that currently uses a random number generator to create a mock Node.log, adding some IP addresses to it every minute.
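
For anyone who wants to try this without a live vault, here is a minimal sketch of such a mock generator. It is not the function actually used here: the name `append_mock_ips` and the `INFO mock peer` line prefix are made up for illustration, and the random source is awk's `rand()`.

```shell
#!/bin/sh
# Sketch only: append COUNT mock log lines, each holding a random IPv4 address.
# Usage: append_mock_ips FILE COUNT   (hypothetical helper name)
append_mock_ips() {
    awk -v n="$2" 'BEGIN {
        srand()   # seed from the current time
        for (i = 0; i < n; i++)
            printf "INFO mock peer %d.%d.%d.%d\n",
                1 + int(rand() * 223), int(rand() * 256),
                int(rand() * 256), 1 + int(rand() * 254)
    }' >> "$1"
}
```

Run from cron once a minute, something like `append_mock_ips ~/mock/Node.log 3` would grow the mock log a few addresses at a time.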

The script then filters the mock Node.log for IP addresses, reduces them to unique addresses, counts them, and appends a timestamped entry to a data file that is processed by the Gnuplot plotter script.
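
The filter, de-duplicate, count, and append steps can be sketched roughly as follows. The function names are hypothetical, and the pattern simply matches anything shaped like a dotted quad, so it would also pick up things like four-part version strings.

```shell
#!/bin/sh
# Sketch only: count the distinct IPv4-looking strings in a log file.
# Usage: count_unique_ips LOGFILE   (hypothetical helper name)
count_unique_ips() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" | sort -u | wc -l | tr -d ' '
}

# Append one "HH:MM count" line to the data file read by the plotter.
# Usage: append_sample LOGFILE DATAFILE   (hypothetical helper name)
append_sample() {
    printf '%s %s\n' "$(date +%H:%M)" "$(count_unique_ips "$1")" >> "$2"
}
```

A cron job could then call `append_sample /path/to/Node.log ~/bin/output.dat` once a minute.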

I have no idea why there are gaps in the plot!

Obviously, the only value of this plot as it stands is to present an upper bound on the IP addresses on the network, since a live Node.log would contain “false positives” of various kinds, such as IPs that had dropped off, or ones that a vault was trying that had never been on the network. Still, it’s a start that can be refined.

I’ll post more detail here in a bit.

It might be of limited use right now, but such live metrics of testnets and production nets, along several dimensions, all in one place for taking in at a glance, would be way cool. SAFE network porn for addicts. :smile: Suggestions welcome for things to measure.

safenet-statistics :heart_eyes: this will become my first bookmarked site (ever)

I like this idea, and it has me wondering what each node knows by default.

One obvious option is to see the current Routing Table size

Routing Table size:  27

but there must be more that’s possible. …thinking…

I’m running it right now on the current test network and it is not finding any IP addresses in the live Node.log.

Funny… I’m sure I’ve seen IP addresses in there in the past.

Most of the lines in Node.log begin like this

     INFO 17:11:45.787311970

That long number isn’t an IP, as far as I can tell, even though it has nine digits. If it were, I would expect to be able to search for the IPs from the config file (minus dots), but nothing is found. Is that just an arbitrary message identifier? If the vault knows its group by their IPs, then where is that information kept?

It’s a timestamp :wink:

Oh well, for now I’ll have to revamp it to count the first four digits of the PeerId names (which is a more useful metric of the number of vaults) instead of IPs.

The lines that say “added ‘blah’ to routing table” can be filtered for ‘blah’ and totalled. With network churn that should eventually encompass every other node. Then there could be a sliding time window to drop the ones that don’t show up after some time.

Also, the lines that are marked “stats”
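
The sliding time window idea could be sketched like this. It assumes each line carries its time of day as the second field (as in the `INFO 17:11:45.787311970` example) and contains `Added xxxx.. to routing table`; both are assumptions about the log format. The name `count_recent_nodes` is hypothetical, and the sketch ignores wrap-around at midnight.

```shell
#!/bin/sh
# Sketch only: count node IDs whose most recent "Added" line falls within
# WINDOW_SECS before NOW_SECS (both in seconds since midnight).
# Usage: count_recent_nodes LOGFILE NOW_SECS WINDOW_SECS   (hypothetical name)
count_recent_nodes() {
    awk -v now="$2" -v window="$3" '
        /Added [0-9a-f][0-9a-f][0-9a-f][0-9a-f].. to routing table/ {
            # Assumed layout: field 2 is HH:MM:SS.fraction
            split($2, t, ":")
            secs = t[1] * 3600 + t[2] * 60 + int(t[3])
            # The node ID is the first four characters of the field after "Added"
            for (i = 1; i < NF; i++)
                if ($i == "Added") {
                    id = substr($(i + 1), 1, 4)
                    last[id] = secs   # later lines overwrite earlier ones
                }
        }
        END {
            n = 0
            for (id in last)
                if (now - last[id] >= 0 && now - last[id] <= window)
                    n++
            print n
        }
    ' "$1"
}
```

For example, `count_recent_nodes Node.log 64800 900` would count the nodes last added in the 15 minutes before 18:00.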

This is what I have it doing now… a bit more useful than the zero flatline of before:

http://91.121.173.204/plot.svg

This is the code:

First the shell script:

#! /bin/sh
# plot the number of unique nodes added to the routing table over time

# Count the number of nodes that have been added in Node.log
# and add that number to a data file.

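# Create an x-y data set ready for plotting: write the timestamp (x) first
printf '%s ' "$(date +%H:%M)" >> ~/bin/output.dat

# Note that if you are going to use UTC as your timezone then
# you will need to set it beforehand with:
# $ sudo cp /usr/share/zoneinfo/UTC /etc/localtime
# (Alternatively, `date -u` prints UTC without changing the system timezone.)

# Then append the count of unique node names (y). The ID is extracted with
# sed rather than a second grep, because a bare four-hex-digit pattern also
# matches the "dded" in "Added" and would inflate the count by one.
grep -o 'Added [0-9a-f]\{4\}.. to routing table' "$1" |\
 sed 's/^Added \([0-9a-f]\{4\}\).*/\1/' | sort -u | wc -l >> ~/bin/output.dat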

# Generate the plot image from output.dat
~/bin/plot.gnu

# Copy the plot image to the web folder.
# Note that for this to work you will need to
# grant the user write privileges there.
cp -f ~/bin/plot.svg /var/www/html/

This is the plotter script that is called by the shell script:

#! /usr/bin/gnuplot
# (gnuplot version: 4.6 patchlevel 6)
#
# Plotting the data of output.dat

set key noautotitle

# svg
set terminal svg size 1024,768 fname 'Verdana, Helvetica, Arial, sans-serif'\
 fsize '10'
set output '~/bin/plot.svg'

# Axes label
set xlabel 'Time (UTC)'
set ylabel 'No. of Unique Nodes Added to Routing Table'

set title "SAFE TEST 3, 2016-05-17: Unique Nodes Added to Routing Table"\
 font "Arial,14"

# color definitions
set border linewidth 1.5
# set style line 1 lc rgb '#0060ad' lt 1 lw 2 pt 7 ps 1.5 # --- blue

set grid nopolar
set grid xtics nomxtics ytics nomytics noztics nomztics \
 nox2tics nomx2tics noy2tics nomy2tics nocbtics nomcbtics
set grid layerdefault   lt 0 linewidth 0.500,  lt 0 linewidth 0.500

set ytics 10
# I haven't set xtics because gnuplot takes care of it well enough.

set tics scale 1

set xdata time
set timefmt "%H:%M"
set format x "%H:%M"
set xrange ["18:00":"23:59"]
set yrange [50:250]

# set the grid lines
set style line 12 lc rgb 'blue' lt 1 lw 1.5
set style line 13 lc rgb 'blue' lt 1 lw 0.5
set grid xtics ytics mxtics mytics ls 12, ls 13

# set the frequency of the minor tics
set mxtics 6
set mytics 2

plot "~/bin/output.dat" using 1:2

This is what the output.dat file looks like after a few cycles:

18:21 64
18:22 64
18:23 64
18:24 64

This is the crontab:

    */1 * * * * /home/user/bin/run /home/user/safe_vault1/Node.log

Manual usage is (although the cron job normally takes care of it):

    $ ./run ../safe_vault1/Node.log

If you are running safe_vault as a service, as I describe here, then make the argument in the cron job, and the manual usage, point to /opt/safe_vault1/Node.log, or wherever you have installed it. That’s the only change you would need to make, since neither of the two scripts hard-wire that path.

Still getting gaps, due to something stopping and starting, and I just noticed that the time scale is a bit wonky, since there are not 80 minutes in an hour, for example.

EDIT: Both the gaps and the incorrect minutes are due to the x axis being formatted as integer rather than time. The time formatting is a bit arcane but I’ll have it sorted shortly.

I don’t know how you’re creating the .svg. I’m aware of d3, though less capable with it, and I have some sense that its output can be put out as .svg, which might be easier and more flexible. One step beyond would be for users to run d3 themselves and be passed the data for it, but that’s beyond me.

It’s working correctly now!

It’s alive!! Mwahaha! :smile:

hahaha you can see how new datapoints are added :slight_smile: that’s awesome!

I’ve updated the code posted above, to incorporate the latest pretty formatting. If anyone can’t view the plot or there’s any flaw then please let me know. I’m not sure what will happen once it hits midnight UTC.

If you go to the main page then you can view today’s plot as well as access a link to yesterday’s plot:

http://91.121.173.204/

The vault was hung and I restarted the server a few minutes ago, which is why you see the spike.

Actually, I’m not sure if it was hung; the connection was not responding, so I restarted the machine from the provider’s control panel. It might simply have been network issues.

But the restart certainly picks up new connections, so I might include that in a future iteration.

This is great! Thanks for putting this up :thumbsup:.

Here’s something significant: Even though it has added 60 vaults in the last few minutes, the spike (in unique node names) is only 10 vaults or so, so the other 50 are already in the log.

There are only 6 hard-coded contacts (IP addresses) in the config, and three mapper servers.

I’m not sure if that can be used as a metric of network size.
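
The comparison above, between vaults added after the restart and names already in the log, can be sketched as a set difference. This assumes you keep a copy of the log from before the restart; the function names are hypothetical. The ID is pulled out with sed because a plain grep for four hex digits would also match the "dded" in "Added".

```shell
#!/bin/sh
# Sketch only: list the unique four-character node IDs from "Added" lines.
# Usage: extract_ids LOGFILE   (hypothetical helper name)
extract_ids() {
    grep -o 'Added [0-9a-f]\{4\}.. to routing table' "$1" |
        sed 's/^Added \([0-9a-f]\{4\}\).*/\1/' | sort -u
}

# Print how many IDs appear in NEW_LOG but not in OLD_LOG.
# Usage: count_new_ids OLD_LOG NEW_LOG   (hypothetical helper name)
count_new_ids() {
    old_ids=$(mktemp)
    new_ids=$(mktemp)
    extract_ids "$1" > "$old_ids"
    extract_ids "$2" > "$new_ids"
    # comm -13 keeps only the lines unique to the second (sorted) file
    comm -13 "$old_ids" "$new_ids" | wc -l | tr -d ' '
    rm -f "$old_ids" "$new_ids"
}
```

With 60 "Added" lines after the restart and a result of 10 from `count_new_ids`, the remaining 50 would be names the log had already seen.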

A comment on the “Test 3” topic reminds me (duh) that tools other than the safe_vault log can be used to gather IP addresses of connecting machines. In the case of the comment it was client machines (launchers) but it should be possible to do that for vaults.

But today I have to catch up on work and leave this fun for a while.

Here’s a thought:

I’m all for Maidsafe shutting down the current test network once they’re finished testing.

However, in other contexts a vault operator might do the following:

  1. Use Wireshark or some other packet sniffer to collect the IPs of all the vaults that connect to his vault.

  2. Add those IPs as hard-coded contacts to a customized vault config. This would be much more robust than having one or two seeds.

  3. Restart the vault using the customized config.

  4. Voila! The unkillable test network. :slight_smile:

This is functionality we already have; it is just disabled in these tests. It’s called the bootstrap cache. Every vault checks its connections, and those that are “directly connectable” are stored in this cache. Up to 1500 entries are allowed, in order of last seen.
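
To make the "up to 1500 entries in order of last seen" idea concrete, here is a toy sketch of a capped, most-recent-first list kept in a plain text file. This is not MaidSafe's implementation: the file format and the name `cache_touch` are assumptions, and the "directly connectable" check is left out entirely.

```shell
#!/bin/sh
# Sketch only: move ENDPOINT to the top of CACHE_FILE (one endpoint per line,
# most recently seen first) and trim the file to MAX entries.
# Usage: cache_touch CACHE_FILE ENDPOINT [MAX]   (hypothetical helper name)
cache_touch() {
    cache=$1
    endpoint=$2
    max=${3:-1500}
    tmp=$(mktemp)
    [ -f "$cache" ] || : > "$cache"   # start with an empty cache if needed
    {
        printf '%s\n' "$endpoint"
        # Drop any older copy of the same endpoint (exact whole-line match)
        grep -vxF -- "$endpoint" "$cache"
    } | head -n "$max" > "$tmp"
    mv "$tmp" "$cache"
}
```

Each time a peer is seen, `cache_touch ~/bootstrap.cache 10.0.0.1:5483` would refresh its position, and the least recently seen entries fall off the end once the cap is reached.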
