Todd
What I do is have cron periodically kick off a script that runs the ovstatus
command and greps its output for the word NOT (as in NOT RUNNING). If it is
found, the output from nvstat and ovstatus is emailed to me for action.
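For the cron side, an entry along these lines would run the check every 15
minutes. This is only a sketch: the 15-minute interval and the /home/root
location of check_nv_daemon.sh are assumptions, so adjust both to suit. The
minutes are listed explicitly because some older cron implementations may not
accept the */15 shorthand.

```
# root's crontab (edit with: crontab -e)
# min        hour dom mon dow  command
0,15,30,45   *    *   *   *    /home/root/check_nv_daemon.sh >/dev/null 2>&1
```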
The script looks like this.....
#!/usr/bin/ksh
# check_nv_daemon.sh
#
# Author : Gavin Newman
# Date : 19th October, 2000
# Platform : AIX UN*X
#
# This script is called by CRON. It checks whether any of the
# Netview daemons are in a NOT_RUNNING state. If so it emails
# a copy of the nvstat output to the named userids.
#
# A lock file (nvdaemon_down.lock) is written once the email has
# been sent so no more emails are sent to the user. It is expected
# that the user will delete the lock file once the daemons have been
# restarted. The lock file is chown'ed to the user id & group so the
# user does not have to su - to delete it
#
# Reads email addresses from nvdaemon_down.addresses.
#
# FILES : nvdaemon_down.addresses - email results to each name in this file
#
# ------------------- Maintenance History --------------------
# -----------------------------------------------------------
#
# Define the file from which we get our email addresses.
FILE=/home/root/nvdaemon_down.addresses
LOCK=/home/root/nvdaemon_down.lock
#
# If the address file does not exist there is no point in going on!
if [[ ! -f $FILE ]]
then
exit 1
fi
#
# If the lock file exists then bail out. The user already knows.
if [[ -f $LOCK ]]
then
exit 0
fi
#
# run the ovstatus command, grep the output looking for the
# phrase NOT. If found then we have problems!
/usr/OV/bin/ovstatus | grep NOT > /dev/null 2> /dev/null
if [[ $? = 1 ]]
then
# grep did not find the phrase NOT - all daemons are up.
exit 0
fi
# A daemon is cactus. Create the lock file and email the user(s)
touch $LOCK
#
# Open the address file using descriptor 3
exec 3< $FILE
#
# Iterate through each name in the file sending them the good/bad news
while read -u3 NAME
do
echo "A Netview daemon appears to be down on host `hostname`.\nReview the output from nvstat and ovstatus below and correct.\n\n*** Once the error has been corrected please delete the lock file ***\n****** $LOCK ******\n\n`date`\n\n`/usr/OV/service/nvstat`\n\n`/usr/OV/bin/ovstatus`" |
mail -s "NV Daemon Alerter" $NAME
done
#
# Close the descriptor & file
exec 3<&-
#
# Exit with good result
exit 0
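If you would rather have the email name the failed daemons, something like the
following could be bolted on. It is a minimal sketch and not part of the script
above: the function name list_down_daemons is made up for illustration, and it
assumes ovstatus prints one block per daemon containing an
"object manager name:" line followed by a "state:" line. Check that against the
real output on your server before relying on it.

```shell
#!/bin/sh
# list_down_daemons (hypothetical helper): reads ovstatus-style text on
# stdin and prints the name of each daemon whose state line contains NOT.
# ASSUMPTION: each daemon block has an "object manager name:" line
# followed by a "state:" line.
list_down_daemons() {
    awk '
        /object manager name:/ { name = $NF }   # remember the current daemon
        /state:/ && /NOT/      { print name }   # state is NOT_RUNNING or similar
    '
}

# Typical use on the NetView server:
#   /usr/OV/bin/ovstatus | list_down_daemons
```

The result could be captured with DOWN=`/usr/OV/bin/ovstatus | list_down_daemons`
and added to the subject line or body of the mail.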
Cheers - Gavin
>>> netview@toddh.net 14/03/2001 16:02:14 >>>
Greetings all,
I'm new to the group, and thought this might be a common question, but
my search through the archives has not yielded any fruit on this
query. I'm curious how/whether other folks have tackled this
availability issue.
I'm pondering the best definition of "NetView server health."
I'd like to write a script to check the status of the netview server
and send email to a pager in the event that the Netview application
failed. This script would run on the server itself (AIX)--a separate
script takes care of verifying that the Netview machine is up from an
OS and networking standpoint.
To verify Netview health, what process(es) should be monitored? Or
should ovstatus messages be used or the output of the ps command? Is
monitoring netmon enough (I've been told that it's dependent on
several other daemons, and would die if any one of the others failed)?
Or are there separate processes that need to be watched to ensure both
monitoring and ruleset processing are alive and well?
TIA for insight you can share!
Best Regards,
--
Todd
_________________________________________________________________________
NV-L List information and Archives: http://www.tkg.com/nv-l