We have seen an issue with our paging company where, if two alerts are
sent within 2 seconds, the second alert is dropped. I think they may be
using some sort of "dupe" detection, even though the messages may be
different. When we checked the log file for a script we use for
nvpagerd, there were entries for all of the alerts, so we knew that
they were making it through the rules and the paging process.
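One way to work around a provider-side window like this is to keep a minimum gap between pages. The following is only a sketch: the function name page_wait and the 3-second gap used in the usage note are assumptions chosen to exceed the 2-second window described above, not anything taken from NetView or the paging company.

```shell
#!/bin/sh
# Sketch: work out how long to wait so that two pages are never sent
# closer together than a minimum gap. All names here are hypothetical.

# Print the number of seconds to wait before the next page may go out.
#   $1 = time the last page was sent (epoch seconds)
#   $2 = current time (epoch seconds)
#   $3 = minimum gap between pages, in seconds
page_wait() {
    elapsed=$(( $2 - $1 ))
    if [ "$elapsed" -ge "$3" ]; then
        echo 0
    else
        echo $(( $3 - elapsed ))
    fi
}
```

A paging wrapper script could call page_wait with a gap of 3 and sleep for the number of seconds it prints before handing the message to the pager.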
Scott Bursik
James_Shanks@tivoli.com
03/12/2001 02:02 PM
Please respond to IBM NetView Discussion
<nv-l@tkg.com>@SMTP@Exchange
To: IBM NetView Discussion <nv-l@tkg.com>@SMTP@Exchange
cc:
Subject: RE: [NV-L] nvcorrd queue
Well, I'm not sure, but that is certainly possible. That's why I
recommended the traces. I don't know how to diagnose this unless we do
it one step at a time. Note that you could also have a problem with the
pager company losing pages. I have seen that occur before if you send
some of them too many pages one right after another.
James Shanks
Team Leader, Level 3 Support
Tivoli NetView for UNIX and NT
"Westphal, Raymond" <RWestphal@erac.com>@tkg.com on 03/12/2001
10:08:39 AM
Please respond to IBM NetView Discussion <nv-l@tkg.com>
Sent by: owner-nv-l@tkg.com
To: "'IBM NetView Discussion'" <nv-l@tkg.com>
cc:
Subject: RE: [NV-L] nvcorrd queue
Thanks James,
I had an opportunity to install 6.0.2 this weekend, but I haven't tried
my script yet. However, the cron job that runs similar commands works
at the scheduled time (11 AM CST). It only runs one cycle of node
down/node up. I think I was possibly cycling the test too fast. The
reset on match node in the ruleset was set for 1 minute, so a page
would occur for each cycle of the script. Here's the script:
##################################
#echo "Start of script"
/usr/OV/bin/nv6000_smit run_event -n'1' -e'AA_EV' -h'tnvcorp02' -s'N' \
  -d'1. Ruleset testing is about to begin.' >/dev/null 2>&1
sleep 1
##################################
/usr/OV/bin/nv6000_smit run_event -n'1' -e'AA_EV' -h'tnvcorp02' -s'N' \
  -d'SIMULATE 9816 DOWN FOR MORE THAN 1 MINUTE(S). PAGE SHOULD OCCUR.' \
  >/dev/null 2>&1
# Send the node down event, then wait past the 1-minute reset window.
/usr/OV/bin/nv6000_smit run_event -n'1' -e'NDWN_EV' -h'blah.blah.com' -s'N' \
  -d'This is a test.' >/dev/null 2>&1
sleep 70
# Send the matching node up event.
/usr/OV/bin/nv6000_smit run_event -n'1' -e'NUP_EV' -h'blah.blah.com' -s'N' \
  -d'This is a test.' >/dev/null 2>&1
/usr/OV/bin/nv6000_smit run_event -n'1' -e'AA_EV' -h'tnvcorp02' -s'N' \
  -d'TEST CYCLE COMPLETED.' >/dev/null 2>&1
##################################
Do you think my test was too hard on the patient, Doc?
Ray.
-----Original Message-----
From: James_Shanks@TIVOLI.COM [mailto:James_Shanks@TIVOLI.COM]
Sent: Sunday, March 11, 2001 7:26 PM
To: IBM NetView Discussion
Subject: RE: [NV-L] nvcorrd queue
I've given this some more thought, and my advice to call Support is
still the same, especially if nvcorrd fails when you ovstop actionsvr.
But what you want to do about your pages not occurring is to try to
determine where this is failing. If you cannot find error messages,
then you will need to set up some tracing and try a
"divide-and-conquer" approach.
First, trace nvcorrd activity with "nvcdebug -d all" and look at the
nvcorrd.alog/blog. You will see each trap as it is processed. Look for
"Receive a trap" and "Finished with a trap". Everything in between is
nvcorrd processing that trap. You should see the action, whether paging
via the paging node or a script of your own, being kicked off as an
action sent to the actionsvr. If that looks OK, then look in the
nvaction.alog/blog and see what happens when that action gets there.
You should see a line where it is launched and then any error messages
that come back. No error messages? Then I would start tracing the pager
daemon. You can do that by adding -d to his ovsuf entry, as has been
explained on this list many times. He will start writing voluminous
output to nvpagerd.alog so you can watch every page being sent. You
will probably need help somewhere along the line, so I'd recommend that
call to Support.
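As a rough first pass over the nvcorrd trace, one could count the two marker strings quoted above and see whether every trap that was received was also finished. This is only a sketch: the function name trap_balance is hypothetical, the exact layout of the trace lines is not shown in this thread, and the location of nvcorrd.alog is assumed to be site-specific.

```shell
#!/bin/sh
# Sketch: compare how many traps nvcorrd started and finished in a trace
# log. The two marker strings are the ones quoted in the message above;
# the function name and the idea of counting them are illustrative only.

trap_balance() {
    # $1 is a trace file, for example a copy of nvcorrd.alog.
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
    # grep -c exits non-zero when the count is 0, so ignore its status.
    received=$(grep -c 'Receive a trap' "$1" || true)
    finished=$(grep -c 'Finished with a trap' "$1" || true)
    echo "received=$received finished=$finished"
    # Succeed only when every received trap was also finished.
    [ "$received" -eq "$finished" ]
}
```

A mismatch does not say what went wrong. It only suggests which stretch of the trace to read by hand, and a trap that is still in progress will also show up as a mismatch.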
James Shanks
Team Leader, Level 3 Support
Tivoli NetView for UNIX and NT
James_Shanks@TIVOLI.COM@tkg.com on 03/09/2001 11:02:57 PM
Please respond to IBM NetView Discussion <nv-l@tkg.com>
Sent by: owner-nv-l@tkg.com
To: IBM NetView Discussion <nv-l@tkg.com>
cc:
Subject: RE: [NV-L] nvcorrd queue
Ray -
You need to call Support and open a problem. The help you need is much
too detailed to be given on a forum like this. Someone needs to examine
your ruleset and set up some tracing, and if you are getting a core,
then we need a core report, and that will have to be analyzed as well.
James Shanks
Team Leader, Level 3 Support
Tivoli NetView for UNIX and NT
"Westphal, Raymond" <RWestphal@erac.com>@tkg.com on 03/09/2001
04:18:38 PM
Please respond to IBM NetView Discussion <nv-l@tkg.com>
Sent by: owner-nv-l@tkg.com
To: "'IBM NetView Discussion'" <nv-l@tkg.com>
cc:
Subject: RE: [NV-L] nvcorrd queue
James,
I've been testing these rulesets all day. I get about an 80% success
rate on paging. When the breaks occur, nothing happens in the nvcorrd
log or nvaction log. I know that nothing is received at the paging
server either. The next time I run my script it may work. A restart of
daemons is not necessary.
I hope to install 6.0.2 Monday evening. I know there is a fix for an
nvcorrd core dump problem. I encounter the problem every time I stop
actionsvr. Is there an nvcorrd problem in 6.0.1 that could cause this
symptom?
Thanks.
Ray Westphal
Enterprise Rent-A-Car
-----Original Message-----
From: James_Shanks@TIVOLI.COM [mailto:James_Shanks@TIVOLI.COM]
Sent: Friday, March 09, 2001 9:00 AM
To: NV-L@tkg.com
Subject: [NV-L] nvcorrd queue
Ray -
netstat -a will show you what I was talking about, which aren't really
queues but messages on the sockets of various UNIX processes. If
everything is working right, you should see the send and receive queues
at zero or near zero, indicating that the processes involved are
getting to every new message right away.
Since netstat uses the /etc/services file to identify the port owners,
you should see some part of the word "nvcorrd" as part of the named
process owner. If your hostname is too long and you don't get enough
displayed, you can try "netstat -an", which gives a numeric display.
nvcorrd's port is 1666. Sockets are assigned in pairs for interprocess
communication, so when you have rulesets running you will always see
another process shown whose information is the obverse of nvcorrd's --
the receive and send amounts will be swapped.
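The check described above can be sketched as a small filter over "netstat -an" output. It assumes the common column layout (Proto, Recv-Q, Send-Q, local address, foreign address, state). The function name nvcorrd_queues and the BACKLOG/ok labels are made up for illustration.

```shell
#!/bin/sh
# Sketch: list socket queue depths for connections involving nvcorrd's
# port. Assumes the usual "netstat -an" columns:
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State
# Port 1666 comes from the message above; the names are hypothetical.

nvcorrd_queues() {
    # Reads netstat -an output on stdin; $1 is the port (default 1666).
    awk -v port="${1:-1666}" '
        $1 ~ /^tcp/ {
            # Some systems print addr.port, others addr:port; accept both.
            pat = "[.:]" port "$"
            if ($4 ~ pat || $5 ~ pat) {
                flag = ($2 + $3 > 0) ? "BACKLOG" : "ok"
                print $4, $5, "recv=" $2, "send=" $3, flag
            }
        }'
}
```

It would be run as "netstat -an | nvcorrd_queues". Any line flagged BACKLOG has a non-zero receive or send queue, which the message above treats as a sign that a process is not keeping up with its messages.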
Assuming you are doing paging in the standard fashion, either from an
action node in a ruleset which calls your own script, or from the pager
icon, both result in the actionsvr actually doing the page. He logs
everything, so go check out /usr/OV/log/nvaction.alog and .blog for
clues.
James Shanks
Team Leader, Level 3 Support
Tivoli NetView for UNIX and NT
"Westphal, Raymond" <RWestphal@erac.com>@tkg.com on 03/09/2001
09:06:56 AM
Please respond to IBM NetView Discussion <nv-l@tkg.com>
Sent by: owner-nv-l@tkg.com
To: "NV List (E-mail)" <nv-l@tkg.com>
cc:
Subject: [NV-L] nvcorrd queue
Hello James,
Back in '99 you wrote a nice Ruleset Performance document. In it you
talk about the nvcorrd daemon and the queue (32K) filling up. I'm
having a problem with an intermittent paging ruleset. I was wondering
what to look for in the nvcorrd.?log file to indicate a queue problem.
Thanks in advance.
Ray Westphal
Enterprise Rent-A-Car
_________________________________________________________________________
NV-L List information and Archives: http://www.tkg.com/nv-l