Server:housekeeping:stats

From Linux How-To Repository

Gathering Statistics

Apart from Webmin, there are several ways of gathering server statistics that can be viewed in a browser.

PHPSysInfo

PHPSysInfo is a customizable PHP script that parses /proc and presents the information in a readable format that you can view in a browser. It displays system facts such as uptime, CPU, memory, PCI devices, SCSI devices, IDE devices, network adapters, disk usage, and more.

Downloads and further information: http://sourceforge.net/projects/phpsysinfo/

Apache's Server Status Mod

The following is from: http://httpd.apache.org/docs/2.0/mod/mod_status.html

To enable status reports only for browsers from the foo.com domain, add this code to your httpd.conf configuration file (the Order/Deny/Allow syntax shown is that of Apache 2.0 and 2.2):

   <Location /server-status>
      SetHandler server-status
      Order Deny,Allow
      Deny from all
      Allow from .foo.com
   </Location>

You can now access server statistics by using a Web browser to access http://your.server.name/server-status

You can get the status page to update itself automatically if you have a browser that supports "refresh". Access http://your.server.name/server-status?refresh=N to refresh the page every N seconds.

A machine-readable version of the status file is available by accessing http://your.server.name/server-status?auto
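As a rough sketch of how the machine-readable output could be used from a shell script, the function below pulls a single field out of the ?auto output. The function name status_field is made up for this example, and the field name BusyWorkers is assumed to be what your Apache version reports, so check the actual output first.

```shell
#!/bin/sh
# Print the value of one "Key: value" field from mod_status ?auto output.
# Reads the status text on stdin; $1 is the field name (hypothetical helper).
status_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}

# Example (assumes curl is installed and the status URL is reachable):
#   curl -s "http://your.server.name/server-status?auto" | status_field BusyWorkers
```

Keeping the parsing separate from the download makes it easy to try the function on a saved copy of the status page first.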

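Another option is ExtendedStatus On|Off, which controls whether extended status information is tracked for each request. It applies to the whole server, so it is placed on its own line outside the Location block.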

Webalizer

To install Webalizer, run:

apt-get install webalizer

Then edit the config file so that OutputDir points to the directory where the reports should be written:

gedit /etc/webalizer/webalizer.conf
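To double-check which directory Webalizer will write to, something like the following could be used. It is only a sketch: the helper name webalizer_outputdir is made up here, and it assumes the config file uses the usual "OutputDir /path" form with one directive per line.

```shell
#!/bin/sh
# Print the first active (uncommented) OutputDir setting from a Webalizer
# config file given as $1 (hypothetical helper, not part of Webalizer).
webalizer_outputdir() {
    awk '$1 == "OutputDir" { print $2; exit }' "$1"
}

# Example:
#   webalizer_outputdir /etc/webalizer/webalizer.conf
```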

Give it a day to gather information. If your stats directory is somewhere your Apache server can serve it, you need only point your browser at that directory, for example:

http://localhost/stats

Munin

Source: http://tomdryer.com/blog/index.php/2007/07/16/monitor-your-linux-server-with-munin/

Munin creates graphs for just about everything going on in your system. It is simple to install and configure, and is perfect for checking if your server is overloaded. The following steps should work on any Debian-based distribution running Apache.

First, if you are on Ubuntu and have not already enabled the Universe repository, you need to do so now. Open your /etc/apt/sources.list file and uncomment the Universe lines. Then run apt-get update to download the new package lists.

Install the required packages for Munin:

apt-get install munin munin-node 

Next you need to decide where to put Munin’s output. You want it in a directory where Apache will serve the files, but not somewhere obvious that anyone can find. Create the directory you would like to use and give ownership of it to Munin (replace mydir with the directory you are using):

mkdir -p mydir
chown munin:munin mydir

Now Munin needs to be configured. Open /etc/munin/munin.conf in a text editor. Change the value of htmldir to the directory you created, and change localhost.localdomain to your server’s name (such as mysite.example.com).

After you make changes to Munin’s configuration you need to restart it. Do so with the following command:

/etc/init.d/munin-node restart 

Wait a few minutes and Munin will have created some output in the directory. Navigate to it in a web browser and you will see the new graphs that will be filled over time, e.g.:

http://mysite.example.com/munin

or

http://localhost/munin

Remember to restrict access to this directory (for example with Apache access control or password protection) if you don't want others to see the stats.
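One way to limit access, sketched under the assumption that you use the same Apache 2.0/2.2 access-control syntax as in the server-status example earlier on this page. The function name munin_access_block is made up for this example. It only prints a Directory block for you to review and paste into your Apache configuration; it does not change any files.

```shell
#!/bin/sh
# Print an Apache 2.0/2.2 style access-control block.
# $1 = directory holding the Munin output, $2 = domain allowed to view it.
# (Hypothetical helper; both values are placeholders you must supply.)
munin_access_block() {
    cat <<EOF
<Directory $1>
   Order Deny,Allow
   Deny from all
   Allow from $2
</Directory>
EOF
}

# Example:
#   munin_access_block /var/www/munin .foo.com
```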

Monit

Source: http://www.howtoforge.com/server_monitoring_monit_munin_p2

First, install monit:

apt-get install monit

Now, edit the config file:

sudo gedit /etc/monit/monitrc

In this example, monit is set up to monitor proftpd, sshd, mysql, apache, and postfix. The monit web interface listens on port 2812 over https and requires a username and password. Monit also sends email alerts to root@localhost:

set daemon  60
set logfile syslog facility log_daemon
set mailserver localhost
set mail-format { from: monit@server1.example.com }
set alert root@localhost
set httpd port 2812 and
   SSL ENABLE
   PEMFILE  /var/certs/monit.pem
   allow admin:test

check process proftpd with pidfile /var/run/proftpd.pid
   start program = "/etc/init.d/proftpd start"
   stop program  = "/etc/init.d/proftpd stop"
   if failed port 21 protocol ftp then restart
   if 5 restarts within 5 cycles then timeout

check process sshd with pidfile /var/run/sshd.pid
   start program  "/etc/init.d/ssh start"
   stop program  "/etc/init.d/ssh stop"
   if failed port 22 protocol ssh then restart
   if 5 restarts within 5 cycles then timeout

check process mysql with pidfile /var/run/mysqld/mysqld.pid
   group database
   start program = "/etc/init.d/mysql start"
   stop program = "/etc/init.d/mysql stop"
   if failed host 127.0.0.1 port 3306 then restart
   if 5 restarts within 5 cycles then timeout

check process apache with pidfile /var/run/apache2.pid
   group www
   start program = "/etc/init.d/apache2 start"
   stop program  = "/etc/init.d/apache2 stop"
   if failed host www.example.com port 80 protocol http
      and request "/monit/token" then restart
   if cpu is greater than 60% for 2 cycles then alert
   if cpu > 80% for 5 cycles then restart
   if totalmem > 500 MB for 5 cycles then restart
   if children > 250 then restart
   if loadavg(5min) greater than 10 for 8 cycles then stop
   if 3 restarts within 5 cycles then timeout

check process postfix with pidfile /var/spool/postfix/pid/master.pid
   group mail
   start program = "/etc/init.d/postfix start"
   stop  program = "/etc/init.d/postfix stop"
   if failed port 25 protocol smtp then restart
   if 5 restarts within 5 cycles then timeout
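Monit is strict about its control file: it refuses to run if the file is accessible by group or others. A small pre-flight check could look like this. It is a sketch that assumes GNU stat (as found on Debian and Ubuntu); the function name monitrc_perms_ok is made up for this example.

```shell
#!/bin/sh
# Succeed only if the file given as $1 grants no permissions to group or
# others (for example mode 600 or 700). Hypothetical helper; assumes GNU stat.
monitrc_perms_ok() {
    mode=$(stat -c %a "$1") || return 2
    # Interpret the mode as octal and test the group/other bits.
    [ $(( 0$mode & 077 )) -eq 0 ]
}

# Example: check the permissions, then let monit syntax-check the file.
#   monitrc_perms_ok /etc/monit/monitrc && monit -t
```

If the check fails, "chmod 600 /etc/monit/monitrc" is the usual remedy.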

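Apache is tested by requesting a dummy file called token. Set it up under your document root (the path below assumes that /var/www/admin is the document root of www.example.com, so that the file is served as /monit/token):

mkdir -p /var/www/admin/monit
echo "hello" > /var/www/admin/monit/token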

Next create the pem cert for the SSL-encrypted monit web interface:

mkdir /var/certs
cd /var/certs

Create the OpenSSL configuration file monit.cnf in that directory:

sudo gedit /var/certs/monit.cnf

Here's what it'll look like:

# create RSA certs - Server

RANDFILE = ./openssl.rnd

[ req ]
default_bits = 2048
encrypt_key = yes
distinguished_name = req_dn
x509_extensions = cert_type

[ req_dn ]
countryName = Country Name (2 letter code)
countryName_default = MO

stateOrProvinceName             = State or Province Name (full name)
stateOrProvinceName_default     = Monitoria

localityName                    = Locality Name (eg, city)
localityName_default            = Monittown

organizationName                = Organization Name (eg, company)
organizationName_default        = Monit Inc.

organizationalUnitName          = Organizational Unit Name (eg, section)
organizationalUnitName_default  = Dept. of Monitoring Technologies

commonName                      = Common Name (FQDN of your server)
commonName_default              = server.monit.mo

emailAddress                    = Email Address
emailAddress_default            = root@monit.mo

[ cert_type ]
nsCertType = server

The certificate is then created with:

openssl req -new -x509 -days 365 -nodes -config ./monit.cnf -out /var/certs/monit.pem -keyout /var/certs/monit.pem
openssl dhparam 2048 >> /var/certs/monit.pem
openssl x509 -subject -dates -fingerprint -noout -in /var/certs/monit.pem
chmod 600 /var/certs/monit.pem
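Because the commands above write both the private key and the certificate into the same file, a quick sanity check is to confirm that both parts are really there. The following is a minimal sketch; the function name pem_has_key_and_cert is made up for this example, and it only looks for the PEM header lines, it does not validate the contents.

```shell
#!/bin/sh
# Succeed if the PEM file given as $1 contains both a certificate header and
# a private key header (hypothetical helper; header check only).
pem_has_key_and_cert() {
    grep -q -- '-----BEGIN CERTIFICATE-----' "$1" &&
    grep -q -- '-----BEGIN .*PRIVATE KEY-----' "$1"
}

# Example:
#   pem_has_key_and_cert /var/certs/monit.pem || echo "monit.pem is incomplete"
```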

Finally, enable the monit daemon:

sudo gedit /etc/default/monit 

Change startup to 1 and set the intervals:

# You must set this variable to 1 for monit to start
startup=1

# To change the intervals which monit should run uncomment
# and change this variable.
CHECK_INTERVALS=60

Start Monit with this:

sudo /etc/init.d/monit start

Then, see if it's working:

   https://www.example.com:2812/

If it doesn't work, make sure that port 2812 is not blocked by a firewall and that the Apache token file has the right permissions and ownership.
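To narrow the problem down, it helps to first confirm on the server itself that monit is actually listening on the port. The sketch below assumes the ss tool from iproute2 is available and that the local address is the fourth column of "ss -ltn" output; the function name listening_on is made up for this example.

```shell
#!/bin/sh
# Read "ss -ltn" style output on stdin and succeed if some socket is
# listening on the TCP port given as $1 (hypothetical helper).
listening_on() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Example:
#   ss -ltn | listening_on 2812 && echo "monit web interface is listening"
```

If the port is listening locally but the page still cannot be reached from another machine, the firewall is the more likely culprit.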

Here are more examples for configuring Monit:

Source: formerly http://www.tildeslash.com/monit/doc/manual.php (no longer available).

It can be useful to look at the examples to see how a service is run, where it puts its pidfile, how to call the start and stop methods for a given service, etc.
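Since every process check below hinges on a pidfile, it can save time to verify that a pidfile exists and points at a live process before writing the monit entry. A minimal sketch follows; the function name pidfile_alive is made up for this example.

```shell
#!/bin/sh
# Succeed if the pidfile given as $1 is readable and names a process that
# the current user can signal (hypothetical helper). Note that kill -0 also
# fails for processes owned by other users unless you run it as root.
pidfile_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Example:
#   pidfile_alive /var/run/sshd.pid && echo "sshd pidfile looks good"
```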

Index
    * System Services
          o Cron (program timer)
          o Gdm (gnome desktop manager)
          o Inetd (internet service manager)
          o Syslogd (system logfile daemon)
          o Xfs (X font server)
          o YPBind (Yellow page bind daemon)
          o Net-SNMP (SNMP agent)
    * FTP Services
          o Proftpd
    * Login Services
          o SSHD
    * WWW Services
          o Apache (web server)
          o Zope (application server)
          o Squid (http/ftp proxy)
          o Privoxy (spamfilter proxy)
    * Mail Services
          o Postfix (mail server)
          o sendmail (mail server)
          o Qpopper (pop3 server)
          o Spamassassin daemon (spam scan daemon)
          o Amavis-new (mail virus scanner)
    * Virus Scanner
          o Sophie (virus scan daemon)
          o Trophie (virus scan daemon)
          o Clamavd (virus scan daemon)
    * Printing Services
          o LPRng (printer daemon)
    * Database Services
          o MySQL Server
          o OpenLDAP Server
    * File Services
          o Samba (windows file/domain server)
    * Sun ONE Services
          o iPlanetDirectoryServer (Sun ONE)
          o iPlanetMessagingServer processes (Sun ONE)
          o iPlanetCalendarServer processes (Sun ONE)
    * Misc Services
          o apcupsd (APC ups daemon)
          o Webmin (remote admin service)
          o STunnel (SSL tunnel)

------------------------------------------------------

System Services
Cron (program timer)

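The following command writes the PID of the root-owned cron daemon to the pidfile that the check below expects:

 /usr/bin/pgrep -x -u 0 -P 1 cron > /var/run/cron.pid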

 check process cron with pidfile /var/run/cron.pid
   group system
   start program = "/etc/init.d/cron start"
   stop  program = "/etc/init.d/cron stop"
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on cron_rc

 check file cron_rc with path /etc/init.d/cron
   group system
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Gdm (gnome desktop manager)

 check process gdm with pidfile /var/run/gdm.pid
   start program = "/etc/init.d/gdm start"
   stop program = "/etc/init.d/gdm stop"
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

Inetd (internet service manager)

 check process inetd with pidfile /var/run/inetd.pid
   start program = "/etc/init.d/inetd start"
   stop program = "/etc/init.d/inetd stop"
   if failed host 192.168.1.1 port 25 protocol smtp then restart  # e.g. exim 
   if failed host 192.168.1.1 port 515 then restart               # e.g. cups-lpd
   if failed host 192.168.1.1 port 113 then restart               # e.g. ident
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

Syslogd (system logfile daemon)

 check process syslogd with pidfile /var/run/syslogd.pid
   start program = "/etc/init.d/sysklogd start"
   stop program = "/etc/init.d/sysklogd stop"
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

 check file syslogd_file with path /var/log/syslog
   if timestamp > 65 minutes then alert # Have you seen "-- MARK --"?

Xfs (X font server)

 check process xfs with pidfile /var/run/xfs.pid
   start program = "/etc/init.d/xfs start"
   stop program = "/etc/init.d/xfs stop"
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

YPBind (Yellow page bind daemon)

 check process ypbind with pidfile /var/run/ypbind.pid
   start program = "/etc/init.d/nis start"
   stop program = "/etc/init.d/nis stop"
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

Net-SNMP (SNMP agent)

 check process snmpd with pidfile /var/run/snmpd
   start program = "/etc/init.d/snmpd start"
   stop program = "/etc/init.d/snmpd stop"
   if failed host 192.168.1.1 port 161 type udp then restart
   if failed host 192.168.1.1 port 199 type tcp then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

FTP Services
Proftpd

 check process proftpd with pidfile /var/run/proftpd.pid
   start program = "/etc/init.d/proftpd start"
   stop program  = "/etc/init.d/proftpd stop"
   if failed port 21 protocol ftp then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

Login Services
SSHD

 check process sshd with pidfile /var/run/sshd.pid
   start program  "/etc/init.d/sshd start"
   stop program  "/etc/init.d/sshd stop"
   if failed port 22 protocol ssh then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

WWW Services
Apache (web server)

Hint: It is advisable to have monit request a dedicated token file. This makes it easy to filter monit's requests out of the access logs.

In some cases the init scripts for apache and apache-ssl are separate, e.g. on Debian Linux.

 check process apache with pidfile /opt/apache_misc/logs/httpd.pid
   group www
   start program = "/etc/init.d/apache start"
   stop  program = "/etc/init.d/apache stop"
   if failed host 192.168.1.1 port 80 
        protocol HTTP request /monit/token then restart
   if failed host 192.168.1.1 port 443 type TCPSSL 
        certmd5 12-34-56-78-90-AB-CD-EF-12-34-56-78-90-AB-CD-EF
        protocol HTTP request /monit/token then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on apache_bin
   depends on apache_rc

 check file apache_bin with path /opt/apache/bin/httpd
   group www
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file apache_rc with path /etc/init.d/apache
   group www
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Zope (application server)

 check process zope with pidfile /opt/Zope/var/zProcessManager.pid
   start program = "/etc/init.d/zope start"
   stop  program = "/etc/init.d/zope stop"
   group www
   if failed host 192.168.1.1 port 8080 protocol HTTP then restart
   if 5 restarts within 5 cycles then timeout
   every 5
   alert foo@bar
   alert 123456@sms on { timeout }

Squid (http/ftp proxy)

 check process squid with pidfile /opt/squid/logs/squid.pid
   group www
   start program = "/etc/init.d/squid start"
   stop  program = "/etc/init.d/squid stop"
   if failed host 192.168.1.1 port 3128  then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on squid_bin
   depends on squid_rc

 check file squid_bin with path /opt/squid/bin/squid
   group www
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file squid_rc with path /etc/init.d/squid
   group www
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Privoxy (spamfilter proxy)

 check process privoxy with pidfile /opt/privoxy/var/privoxy.pid
   group www
   start program = "/etc/init.d/privoxy start"
   stop  program = "/etc/init.d/privoxy stop"
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.1 port 8118  then restart
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on privoxy_bin
   depends on privoxy_rc

 check file privoxy_bin with path /opt/privoxy/sbin/privoxy
   group www
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file privoxy_rc with path /etc/init.d/privoxy
   group www
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Mail Services
Postfix (mail server)

 check process postfix with pidfile /var/spool/postfix/pid/master.pid
   group mail
   start program = "/etc/init.d/postfix start"
   stop  program = "/etc/init.d/postfix stop"
   if failed port 25 protocol smtp then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on postfix_rc

 check file postfix_rc with path /etc/init.d/postfix
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Sendmail (mail server)

 check process sendmail with pidfile /var/run/sendmail.pid
   group mail
   start program = "/etc/init.d/sendmail start"
   stop  program = "/etc/init.d/sendmail stop"
   if failed port 25 protocol smtp then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on sendmail_bin
   depends on sendmail_rc

 check file sendmail_bin with path /usr/lib/sendmail
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file sendmail_rc with path /etc/init.d/sendmail
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Qpopper (pop3 server)

 check process qpopper with pidfile /var/run/popper.pid
   group mail
   start program = "/etc/init.d/qpopper start"
   stop  program = "/etc/init.d/qpopper stop"
   if 5 restarts within 5 cycles then timeout
   if failed port 110 type TCP protocol POP then restart
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on qpopper_bin
   depends on qpopper_rc

 check file qpopper_bin with path /opt/sbin/popper
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file qpopper_rc with path /etc/init.d/qpopper
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Spamassassin daemon (spam scan daemon)

 check process spamd with pidfile /var/run/spamd.pid
   group mail
   start program = "/etc/init.d/spamd start"
   stop  program = "/etc/init.d/spamd stop"
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on spamd_bin
   depends on spamd_rc

 check file spamd_bin with path /usr/local/bin/spamd
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file spamd_rc with path /etc/init.d/spamd
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Amavis-new (mail virus scanner)

 check process amavisd with pidfile /opt/virus/amavis-new/var/run/amavisd.pid
   group mail
   start program = "/etc/init.d/amavis-new start"
   stop  program = "/etc/init.d/amavis-new stop"
   if failed port 10024 protocol smtp then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on amavisd_bin
   depends on amavisd_rc

 check file amavisd_bin with path /opt/virus/amavis-new/bin/amavisd
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file amavisd_rc with path /etc/init.d/amavis-new
   group mail
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Virus Scanner
Sophie (virus scan daemon)

 check process sophie with pidfile /var/run/sophie.pid
   group virus
   start program = "/etc/init.d/sophie start"
   stop  program = "/etc/init.d/sophie stop"
   if failed unixsocket /var/run/sophie then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on sophie_bin
   depends on sophie_rc

 check file sophie_bin with path /opt/virus/sophie/sophie
   group virus
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file sophie_rc with path /etc/init.d/sophie
   group virus
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Trophie (virus scan daemon)

 check process trophie with pidfile /var/run/trophie.pid
   group virus
   start program = "/etc/init.d/trophie start"
   stop  program = "/etc/init.d/trophie stop"
   if failed unixsocket /var/run/trophie then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on trophie_bin
   depends on trophie_rc

 check file trophie_bin with path /opt/virus/trophie/trophie
   group virus
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file trophie_rc with path /etc/init.d/trophie
   group virus
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Clamav (virus scan daemon)

 check process clamavd with pidfile /var/run/clamd.pid
   group virus
   start program = "/etc/init.d/clamavd start"
   stop  program = "/etc/init.d/clamavd stop"
   if failed unixsocket /var/run/clamd then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on clamavd_bin
   depends on clamavd_rc

 check file clamavd_bin with path /opt/virus/clamavd/clamavd
   group virus
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file clamavd_rc with path /etc/init.d/clamavd
   group virus
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Database Services
MySQL Server

The name of the pidfile usually consists of the fully qualified domain name with .pid as the extension.

check process mysql with pidfile /opt/mysql/data/myserver.mydomain.pid
   group database
   start program = "/etc/init.d/mysql start"
   stop program = "/etc/init.d/mysql stop"
   if failed host 192.168.1.1 port 3306 then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on mysql_bin
   depends on mysql_rc

 check file mysql_bin with path /opt/mysql/bin/mysqld
   group database
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file mysql_rc with path /etc/init.d/mysql
   group database
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

OpenLDAP slapd (Debian package)

check process slapd with pidfile /var/run/slapd.pid
   group database
   start program = "/etc/init.d/slapd start"
   stop program = "/etc/init.d/slapd stop"
   if failed host 192.168.1.1 port 389 protocol ldap3 then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on slapd_bin
   depends on slapd_rc

 check file slapd_bin with path /usr/sbin/slapd
   group database
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file slapd_rc with path /etc/init.d/slapd
   group database
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

File Services
Samba (windows file/domain server)

Hint: For finer control over the service, it is handy to split the samba init file into two pieces, one for smbd (the file service) and one for nmbd (the name service).

 check process smbd with pidfile /opt/samba2.2/var/locks/smbd.pid
   group samba
   start program = "/etc/init.d/smbd start"
   stop  program = "/etc/init.d/smbd stop"
   if failed host 192.168.1.1 port 139 type TCP  then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on smbd_bin

 check file smbd_bin with path /opt/samba2.2/sbin/smbd
   group samba
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check process nmbd with pidfile /opt/samba2.2/var/locks/nmbd.pid
   group samba
   start program = "/etc/init.d/nmbd start"
   stop  program = "/etc/init.d/nmbd stop"
   if failed host 192.168.1.1 port 138 type UDP  then restart
   if failed host 192.168.1.1 port 137 type UDP  then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on nmbd_bin

 check file nmbd_bin with path /opt/samba2.2/sbin/nmbd
   group samba
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Printing Services
LPRng (printer daemon)

 check process lprng with pidfile /var/run/lpd.515
   group printer
   start program = "/etc/init.d/lprng start"
   stop  program = "/etc/init.d/lprng stop"
   if failed host 192.168.1.1 port 515 type TCP  then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on lprng_bin
   depends on lprng_rc

 check file lprng_bin with path /opt/lprng/sbin/lpd
   group printer
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file lprng_rc with path /etc/init.d/lprng
   group printer
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

Sun ONE Services
iPlanetDirectoryServer slapd

 check process ldap-master
  with pidfile /usr/iplanet/ldapmaster/slapd-master-1/logs/pid
   start program  "/usr/iplanet/ldapmaster/slapd-master-1/start-slapd"
   stop program  "/usr/iplanet/ldapmaster/slapd-master-1/stop-slapd"
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.1 port 389 protocol ldap3 then restart
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetMessagingServer MTA dispatcher

 check process mta-dispatcher 
  with pidfile /usr/iplanet/msg-ims-1/config/pidfile.imta_dispatch
   start program  "/usr/iplanet/msg-ims-1/imsimta start dispatcher"
   stop program  "/usr/iplanet/msg-ims-1/imsimta stop dispatcher"
   group messaging
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.1 port 25 protocol smtp then restart
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetMessagingServer MTA job controller

 check process mta-job_controller 
  with pidfile /usr/iplanet/msg-ims-1/config/pidfile.imta_jbc
   start program  "/usr/iplanet/msg-ims-1/imsimta start job_controller"
   stop program  "/usr/iplanet/msg-ims-1/imsimta stop job_controller"
   group messaging
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.1 port 28442 then restart
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetMessagingServer stored

 check process store with pidfile /usr/iplanet/msg-ims-1/config/pidfile.store
   start program  "/usr/iplanet/msg-ims-1/start-msg store"
   stop program  "/usr/iplanet/msg-ims-1/stop-msg store"
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timestamp timeout }
   group messaging

 check file stored.ckp with path /usr/iplanet/msg-ims-1/config/stored.ckp
   if timestamp > 10 minutes then alert
   group messaging

 check file stored.lcu with path /usr/iplanet/msg-ims-1/config/stored.lcu
   if timestamp > 15 minutes then alert
   group messaging

 check file stored.per with path /usr/iplanet/msg-ims-1/config/stored.per
   if timestamp > 70 minutes then alert
   group messaging

iPlanetMessagingServer mshttpd

 check process webmail with pidfile /usr/iplanet/msg-ims-1/config/pidfile.http
   start program  "/usr/iplanet/msg-ims-1/start-msg http"
   stop program  "/usr/iplanet/msg-ims-1/stop-msg http"
   group messaging
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.1 port 80 protocol http then restart
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetMessagingServer popd

 check process pop3 with pidfile /usr/iplanet/msg-ims-1/config/pidfile.pop
   start program  "/usr/iplanet/msg-ims-1/start-msg pop"
   stop program  "/usr/iplanet/msg-ims-1/stop-msg pop"
   group messaging
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.1 port 110 protocol pop then restart
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetMessagingServer imapd

 check process imap4 with pidfile /usr/iplanet/msg-ims-1/config/pidfile.imap
   start program  "/usr/iplanet/msg-ims-1/start-msg imap"
   stop program  "/usr/iplanet/msg-ims-1/stop-msg imap"
   group messaging
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.1 port 143 protocol imap then restart
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetMessagingServer madmand (SNMP subagent)

 check process snmp-subagent with pidfile /usr/iplanet/msg-ims-1/config/pidfile.snmp
   start program  "/usr/iplanet/msg-ims-1/start-msg snmp"
   stop program  "/usr/iplanet/msg-ims-1/stop-msg snmp"
   group messaging
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetMessagingServer MMP (POP3/IMAP4/SMTP proxy)

 check process mmp with pidfile /usr/iplanet/mmp-ims2/pidfile
   start program  "/usr/iplanet/mmp-ims2/AService.rc start"
   stop program  "/usr/iplanet/mmp-ims2/AService.rc stop"
   group messaging
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.2 port 110 protocol pop then restart
   if failed host 192.168.1.2 port 143 protocol imap then restart
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetCalendarServer csadmind

 check process calendar-admin with pidfile /usr/iplanet/SUNWics5/cal/bin/config/pidfile.admin
   start program  "/usr/iplanet/SUNWics5/cal/bin/csstart service admin"
   stop program  "/usr/iplanet/SUNWics5/cal/bin/csstop service admin"
   group calendar
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetCalendarServer cshttpd

 check process calendar-http with pidfile /usr/iplanet/SUNWics5/cal/bin/config/pidfile.http
   start program  "/usr/iplanet/SUNWics5/cal/bin/csstart service http"
   stop program  "/usr/iplanet/SUNWics5/cal/bin/csstop service http"
   group calendar
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.3 port 80 protocol http then restart
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetCalendarServer csdwpd (database wire protocol)

 check process calendar-dwp with pidfile /usr/iplanet/SUNWics5/cal/bin/config/pidfile.dwp
   start program  "/usr/iplanet/SUNWics5/cal/bin/csstart service dwp"
   stop program  "/usr/iplanet/SUNWics5/cal/bin/csstop service dwp"
   group calendar
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.3 port 9779 protocol dwp then restart
   if cpu usage > 2% for 5 cycles then restart   # There's a leak in csdwpd
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetCalendarServer csnotifyd

 check process calendar-notify with pidfile /usr/iplanet/SUNWics5/cal/bin/config/pidfile.notify
   start program  "/usr/iplanet/SUNWics5/cal/bin/csstart service notify"
   stop program  "/usr/iplanet/SUNWics5/cal/bin/csstop service notify"
   group calendar
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

iPlanetCalendarServer enpd (event notification service broker)

 check process calendar-ens with pidfile /usr/iplanet/SUNWics5/cal/bin/config/pidfile.ens
   start program  "/usr/iplanet/SUNWics5/cal/bin/csstart service ens"
   stop program  "/usr/iplanet/SUNWics5/cal/bin/csstop service ens"
   group calendar
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.3 port 7997 then restart
   alert foo@bar
   alert 123456@sms on { timeout }
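
A port test without a protocol keyword, as in the ens check above, amounts to checking that a TCP connection can be opened. The Python sketch below does the same by hand, which can be useful for confirming a host and port before writing them into a rule. The function name port_is_open and the 5-second default timeout are assumptions, and Monit's own test may differ in detail.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

For example, port_is_open("192.168.1.3", 7997) would probe the same address as the ens rule.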

Misc Services
Apcupsd (APC UPS daemon)

 check process apcupsd with pidfile /var/run/apcupsd.pid
   group ups
   start program = "/etc/init.d/apcupsd start"
   stop  program = "/etc/init.d/apcupsd stop"
   if 5 restarts within 5 cycles then timeout
   if failed host 192.168.1.3 port 7000 type TCP  then restart
   alert foo@bar
   alert 123456@sms on { timeout }
   depends on apcupsd_bin
   depends on apcupsd_rc

 check file apcupsd_bin with path /opt/apcupsd/sbin/apcupsd
   group ups
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file apcupsd_rc with path /etc/init.d/apcupsd
   group ups
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar
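
The file checks above compare a file's checksum, permission bits, owner and group against expected values. A minimal Python sketch of the same four comparisons follows, assuming a SHA-256 baseline recorded earlier. The function name failed_checks and the choice of hash are assumptions, not Monit's internals.

```python
import hashlib
import os
import stat


def failed_checks(path, expected_sha256, mode=0o755, uid=0, gid=0):
    """Return the names of the checks that fail (an empty list means no change)."""
    failures = []
    info = os.stat(path)
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    if digest != expected_sha256:
        failures.append("checksum")
    # S_IMODE keeps only the permission bits, e.g. 0o755.
    if stat.S_IMODE(info.st_mode) != mode:
        failures.append("permission")
    if info.st_uid != uid:
        failures.append("uid")
    if info.st_gid != gid:
        failures.append("gid")
    return failures
```

With the defaults, failed_checks("/etc/init.d/apcupsd", baseline) mirrors the apcupsd_rc rules: mode 755, uid root (0) and gid root (0), where baseline is a digest recorded while the file was known to be good.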

Webmin (remote admin service)

 check process webmin with pidfile /var/webmin/miniserv.pid
   group webmin
   start program = "/etc/init.d/webmin start"
   stop  program = "/etc/init.d/webmin stop"
   if failed host 192.168.1.3 port 10000 then restart
   if 5 restarts within 5 cycles then timeout
   alert foo@bar
   alert 123456@sms on { timeout }

 check file webmin_rc with path /etc/init.d/webmin
   group webmin
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

STunnel (SSL tunnel)

 check process stunnel_pop3 with pidfile /opt/var/stunnel/stunnel.110.pid
   start program = "/etc/init.d/stunnel start_pop3"
   stop  program = "/etc/init.d/stunnel stop_pop3"
   if failed host 192.168.1.1 port 995 type TCPSSL protocol POP then restart
   group stunnel
   alert foo@bar
   depends stunnel_rc
   depends stunnel_bin

 check process stunnel_swat with pidfile /opt/var/stunnel/stunnel.901.pid
   start program = "/etc/init.d/stunnel start_swat"
   stop  program = "/etc/init.d/stunnel stop_swat"
   if failed host 192.168.1.1 port 901 type TCPSSL then restart
   group stunnel
   alert foo@bar
   alert 123456@sms on { timeout }
   depends stunnel_bin
   depends stunnel_rc

 check file stunnel_bin with path /opt/sbin/stunnel
   group stunnel
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar

 check file stunnel_rc with path /etc/init.d/stunnel
   group stunnel
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor
   alert foo@bar