Network Statistics

I regularly monitor my network and CPU usage. After running into performance problems with my Apache web server, I decided to update the network statistics process I was using to try to catch performance problems before they happen.

My old network process just captured statistics from /proc/net/dev. I wanted to add a count of the number of httpd processes running. I also wanted to track the state of every port 80 (web traffic) TCP/IP session as reported by the netstat command.

The number of httpd processes running should stay at or above the MinSpareServers setting in httpd.conf and go no higher than the MaxClients setting. If I see that the number of httpd processes is zero then Apache is probably not even running, and if it reaches or even gets close to the MaxClients setting I may want to increase that setting.

I also wanted to check various TCP/IP session status counts. From these I should be able to see the number of established (active) sessions plus the number of sessions stuck in various wait states. Having some idea of what is normal should help when something goes wrong.

I started by creating a little table called lastPND. It stores the send and receive byte counters from the last time my new program ran. I used to download the raw numbers, but my spreadsheet had to compare each row to the previous one to calculate the bytes sent per minute, and that didn't work if the server shut down or if rec_bytes or send_bytes exceeded its maximum value and started over from zero.

DROP TABLE IF EXISTS `lastPND`;
CREATE TABLE `lastPND` (
  `tablekey` varchar(5) NOT NULL default 'ROW1',
  `dt` datetime NOT NULL default '0000-00-00 00:00:00',
  `rec_bytes` bigint(20) NOT NULL default '0',
  `send_bytes` bigint(20) NOT NULL default '0',
  PRIMARY KEY  (`tablekey`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

My program starts out by reading the only row in this table (with tablekey = "ROW1"). If that row is more than 90 seconds old, the program just saves the current values from /proc/net/dev and exits. The row has to exist before the first run, since the program only ever updates it.
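As an illustration of reading the counters out of /proc/net/dev, here is a minimal sketch that parses text in that layout. The helper name iface_bytes and the sample numbers are made up for this example; it assumes the usual layout in which the first field after the interface name is bytes received and the ninth is bytes sent.

```perl
use strict;
use warnings;

# Hypothetical helper: return (received_bytes, sent_bytes) for one
# interface from the text of /proc/net/dev, or an empty list if absent.
sub iface_bytes {
    my ($text, $iface) = @_;
    foreach my $line (split /\n/, $text) {
        # Each interface line looks like "  name: f1 f2 ... f16".
        next unless $line =~ /^\s*\Q$iface\E:\s*(.*)$/;
        my @f = split ' ', $1;
        # Field 1 is bytes received, field 9 is bytes sent.
        return ($f[0], $f[8]);
    }
    return;
}

# Sample text in the /proc/net/dev layout; the numbers are made up.
my $sample = <<'END';
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1200      10    0    0    0     0          0         0     1200      10    0    0    0     0       0          0
venet0:98765432  123456    0    0    0     0          0         0 12345678   65432    0    0    0     0       0          0
END

my ($rec, $send) = iface_bytes($sample, 'venet0');
print "received=$rec sent=$send\n";
```

Splitting on whitespace after the colon works whether or not the first counter is padded with spaces, which is the case a single fixed regular expression can trip over.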

Assuming there's an entry about 1 minute old, the program computes the actual number of bytes sent and received in the past minute. This allows for the server being down, or for me shutting down the regular running of the program; the program just starts recording again when it's able to. I didn't bother dealing with the /proc/net/dev numbers getting too big. They simply roll over to zero, and a single entry with zero bytes sent/received is saved in my table. It's something I could fix, but it's not that important.
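For reference, a minimal sketch of how the rollover could be handled. It is not part of the program below. The helper name counter_delta is made up, and the wrap point of 2**32 is an assumption that only holds for 32-bit counters. A counter reset caused by a reboot would also look like a wrap, so the result is only trustworthy when the server has stayed up between readings.

```perl
use strict;
use warnings;

# Hypothetical helper: bytes transferred between two readings of a
# counter that wraps to zero at $wrap (assumed to be 2**32 here).
sub counter_delta {
    my ($cur, $last, $wrap) = @_;
    $wrap = 2**32 unless defined $wrap;
    my $delta = $cur - $last;
    # A negative difference means the counter wrapped once since $last.
    $delta += $wrap if $delta < 0;
    return $delta;
}

print counter_delta(5000, 2000), "\n";        # normal case: prints 3000
print counter_delta(1000, 4294967000), "\n";  # wrapped once: prints 1296
```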

The program goes on to determine the number of httpd processes running and saves that count.

Finally the program issues a netstat command to get all the port 80 network sessions and save a count of how many sessions are in various states.
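The state counting can also be sketched without relying on fixed column positions. This is an illustrative alternative, not the code used in the program below: the helper name count_states and the sample lines are made up, and it assumes the netstat layout in which the local address is the fourth whitespace-separated field and the state is the sixth. It counts only sockets whose local port is 80, which is slightly stricter than grepping for ":80".

```perl
use strict;
use warnings;

# Hypothetical helper: count TCP states for sockets whose local port is 80.
sub count_states {
    my @lines = @_;
    my %count;
    foreach my $line (@lines) {
        my @f = split ' ', $line;
        # tcp lines: Proto Recv-Q Send-Q Local-Address Foreign-Address State ...
        next unless @f >= 6 && $f[0] =~ /^tcp/;
        next unless $f[3] =~ /:80$/;   # local address must end in :80
        $count{$f[5]}++;
    }
    return %count;
}

# Made-up sample lines in the layout printed by netstat -an.
my @sample = (
    "tcp        0      0 0.0.0.0:80        0.0.0.0:*          LISTEN",
    "tcp        0      0 10.0.0.5:80       192.0.2.10:51234   ESTABLISHED",
    "tcp        0      0 10.0.0.5:80       192.0.2.11:40000   TIME_WAIT",
    "tcp        0      0 10.0.0.5:80       192.0.2.12:40001   TIME_WAIT",
    "tcp        0      0 10.0.0.5:8080     192.0.2.13:40002   ESTABLISHED",
);

my %count = count_states(@sample);
foreach my $state (sort keys %count) {
    print "$state=$count{$state}\n";
}
```

With the sample above, the port 8080 line is skipped, so the counts come out as one ESTABLISHED, one LISTEN and two TIME_WAIT.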

It saves all these numbers in a new table called netstats.

DROP TABLE IF EXISTS `netstats`;
CREATE TABLE `netstats` (
  `dt` timestamp NOT NULL default CURRENT_TIMESTAMP,
  `rec_bytes` bigint(20) unsigned NOT NULL default '0',
  `send_bytes` bigint(20) unsigned NOT NULL default '0',
  `httpd_num` smallint(5) unsigned NOT NULL default '0',
  `time_wait` smallint(5) unsigned NOT NULL default '0',
  `fin_wait2` smallint(5) unsigned NOT NULL default '0',
  `established` smallint(5) unsigned NOT NULL default '0',
  `syn_recv` smallint(5) unsigned NOT NULL default '0',
  `listen` smallint(5) unsigned NOT NULL default '0',
  `fin_wait1` smallint(5) unsigned NOT NULL default '0',
  `syn_sent` smallint(5) unsigned NOT NULL default '0',
  `closing` smallint(5) unsigned NOT NULL default '0',
  `closed` smallint(5) unsigned NOT NULL default '0',
  `close` smallint(5) unsigned NOT NULL default '0',
  `close_wait` smallint(5) unsigned NOT NULL default '0',
  `last_ack` smallint(5) unsigned NOT NULL default '0',
  `unknown` smallint(5) unsigned NOT NULL default '0',
  `other` smallint(5) unsigned NOT NULL default '0',
  `other_text` varchar(11) NOT NULL default '',
  PRIMARY KEY  (`dt`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Next I have a little cron entry to run my program every minute. At the end of 24 hours I have 1440 entries. Here's the program:

#!/usr/bin/perl
use DBI;
use strict;
my $user = 'some_user';
my $password = 'some_pw';
my $dsn = 'DBI:mysql:cpu_stat:localhost';
my $dbh = DBI->connect($dsn, $user, $password,
                      { RaiseError => 1, AutoCommit => 0 })
   or die "Couldn't connect to database: " . DBI->errstr;
my $select_last =
            $dbh->prepare_cached("select UNIX_TIMESTAMP(dt), rec_bytes, send_bytes, now(), UNIX_TIMESTAMP(now()) from lastPND where tablekey = ?");
my @data;

$select_last->execute("ROW1")
        or die "Couldn't select from lastPND" . DBI->errstr;

@data = $select_last->fetchrow_array();
my $last_ut = $data[0];
my $last_rec = $data[1];
my $last_send = $data[2];
my $cur_dt = $data[3];
my $cur_ut = $data[4];

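# Read the interface counters; venet0 is the network interface of this VPS.
my $pnetdev = `cat /proc/net/dev`;
my $cur_rec = 0;
my $cur_send = 0;
# Field 1 after the colon is bytes received, field 9 is bytes sent.
# \s* allows for padding after the colon and /m lets ^ match at any line,
# so venet0 does not have to be the last line of the file.
if ($pnetdev =~ m/^\s*venet0:\s*(\d+)(?:\s+\d+){7}\s+(\d+)/m) {
    ($cur_rec,$cur_send) = ($1,$2);
} else {
    die "Couldn't find venet0 in /proc/net/dev";
}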

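my $rec_bytes = ($cur_rec - $last_rec);
my $send_bytes = ($cur_send - $last_send);
# A counter that rolled over gives a negative difference; record zero
# explicitly rather than relying on MySQL to clamp the unsigned column.
$rec_bytes = 0 if $rec_bytes < 0;
$send_bytes = 0 if $send_bytes < 0;
# [h]ttpd keeps grep from counting its own command line.
my $httpd_num = `ps aux | grep -c '[h]ttpd'`;
chomp($httpd_num);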

if (($cur_ut-$last_ut) < 90) {

# The trailing space keeps ports such as 8080 from matching.
my @netstat = `netstat -alnp | grep ':80 '`;
my $established=0;
my $syn_sent=0;
my $syn_recv=0;
my $fin_wait1=0;
my $fin_wait2=0;
my $time_wait=0;
my $closed=0;
my $close=0;
my $close_wait=0;
my $last_ack=0;
my $listen=0;
my $closing=0;
my $unknown=0;
my $other=0;
my $other_text = "";
my $line = "";

foreach $line (@netstat) {
        # The state is the sixth whitespace-separated field on a tcp line;
        # splitting avoids depending on netstat's column widths.
        my $status = (split ' ', $line)[5];
        $status = "" unless defined $status;
           if ($status eq "TIME_WAIT") {$time_wait++}
        elsif ($status eq "FIN_WAIT2") {$fin_wait2++}
        elsif ($status eq "ESTABLISHED") {$established++}
        elsif ($status eq "CLOSE") {$close++}
        elsif ($status eq "LISTEN") {$listen++}
        elsif ($status eq "SYN_RECV") {$syn_recv++}
        elsif ($status eq "FIN_WAIT1") {$fin_wait1++}
        elsif ($status eq "CLOSING") {$closing++}
        elsif ($status eq "CLOSED") {$closed++}
        elsif ($status eq "CLOSE_WAIT") {$close_wait++}
        elsif ($status eq "SYN_SENT") {$syn_sent++}
        elsif ($status eq "LAST_ACK") {$last_ack++}
        elsif ($status eq "UNKNOWN") {$unknown++}
        # Keep the unexpected text, trimmed to fit the other_text column.
        else  {$other++; $other_text=substr($status,0,11)}
}
my $insert_netstats =
            $dbh->prepare_cached("insert INTO netstats VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)");
$insert_netstats->execute($cur_dt, $rec_bytes, $send_bytes, $httpd_num, $time_wait, $fin_wait2, $established, $syn_recv, $listen, $fin_wait1, $syn_sent, $closing, $closed, $close, $close_wait, $last_ack, $unknown, $other, $other_text )
        or die "Couldn't insert netstats" . DBI->errstr;
}
my $update_last =
            $dbh->prepare_cached("update lastPND set dt = ?, rec_bytes = ?, send_bytes = ? where tablekey = ?");
$update_last->execute($cur_dt, $cur_rec, $cur_send, "ROW1")
        or die "Couldn't update lastPND" . DBI->errstr;

$select_last->finish();
$dbh->disconnect();

Finally I have another little shell script, run just after midnight, that unloads the previous day's entries, deletes them, and then optimizes the table to recover the deleted space.

mysql --html --batch --silent --user=some_user --password=some_passwd --database=cpu_stat --execute="select * from netstats where date(dt)<date(now()) order by dt;" > /home/someuser/daily/netstats.html
mysql --batch --user=some_user --password=some_passwd --database=cpu_stat --execute="delete from netstats where date(dt)<date(now()) ;"
mysql --batch --user=some_user --password=some_passwd --database=cpu_stat --execute="OPTIMIZE TABLE netstats;"

The unloaded data gets bundled up with my daily backup process so that it can be downloaded to my PC. From there I can load the data into a spreadsheet and check out the numbers and do some graphing.

And here's what the chart and summary page of my spreadsheet look like:

[Image: March 20 2008 Network Statistics]

I compute the maximum and average values for each statistic. This lets me see things like the maximum number of httpd child processes created so I can tell if my Apache MaxClients setting is being reached. I also calculate the maximum KB/sec rate (for a 1 minute period), and the total GB/day.
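The same summary numbers could be computed outside the spreadsheet. The following is a minimal sketch under stated assumptions: the helper name summarize and the sample values are made up, each input value is the total bytes transferred in one minute, and 1 KB is taken as 1000 bytes and 1 GB as 1000**3 bytes.

```perl
use strict;
use warnings;
use List::Util qw(max sum);

# Hypothetical helper: summarize a list of per-minute byte counts.
# Assumes 1 KB = 1000 bytes and 1 GB = 1000**3 bytes.
sub summarize {
    my @bytes_per_min = @_;
    # The busiest minute, expressed as an average rate over its 60 seconds.
    my $max_kb_sec = max(@bytes_per_min) / 60 / 1000;
    # Everything transferred over the whole period.
    my $total_gb   = sum(@bytes_per_min) / 1000**3;
    return ($max_kb_sec, $total_gb);
}

# Made-up sample: three one-minute byte totals.
my ($max_kb_sec, $total_gb) = summarize(600_000, 3_000_000, 1_200_000);
printf "max %.1f KB/sec, total %.4f GB\n", $max_kb_sec, $total_gb;
```

For a full day the list would hold the 1440 per-minute values, and the second number becomes the GB/day figure.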

I graph the number of httpd processes running to see if the maximum reached is part of sustained traffic.

I also graph the number of Established TCP/IP connections to port 80. I'm not sure of the value of this number, since more than one session can be opened by a client IP address, but it does give a count of active and concurrent web requests.

I also graph the average KB/sec transfer rate for my VPS. This includes all network traffic and usually shows at least one spike when I download backup files. I went with a 5 minute average since I'm interested in sustained data transfer rates. My backup download can run at between 200 and 300 KB/sec, which really skews the graph when plotted alongside the other numbers. The 5 minute average usually doesn't exceed 100 KB/sec, so it fits better on the graph.
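A 5 minute average of this kind can be sketched as a trailing moving average. Whether the spreadsheet uses a trailing or a centered window is not stated, so the trailing window here is an assumption, and the helper name moving_average and the sample rates are made up.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: trailing moving average over $window samples.
# The first $window - 1 positions average over the samples seen so far.
sub moving_average {
    my ($window, @values) = @_;
    my @avg;
    for my $i (0 .. $#values) {
        my $start = $i - $window + 1;
        $start = 0 if $start < 0;
        my @slice = @values[$start .. $i];
        push @avg, sum(@slice) / @slice;
    }
    return @avg;
}

# Made-up per-minute KB/sec rates with one backup-download spike.
my @rates  = (20, 20, 20, 270, 20, 20, 20, 20, 20);
my @smooth = moving_average(5, @rates);
print join(", ", @smooth), "\n";
```

In this sample the one-minute spike of 270 KB/sec is flattened to at most 82.5 KB/sec in the averaged series, which is why the averaged line fits on the same graph as the other numbers.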

From the graph of the httpd process count it looks like a MaxClients setting of 60 is OK for now. I'll have to check my numbers the next time we send out a newsletter. Also, with a 10 Mbit/s connection (or about 1250 KB/sec), our VPS should suffice for some time.


© 2016 Mike Silversides

About

This page contains a single entry from the blog posted on March 29, 2008 7:02 PM.