
RE: nvcorrd queue

To: nv-l@lists.tivoli.com
Subject: RE: nvcorrd queue
From: "Bursik, Scott" <Scott.Bursik@fritolay.com>
Date: Tue, 13 Mar 2001 07:49:34 -0600
        We have seen an issue with our paging company where, if 2 alerts are
sent within 2 seconds, the second alert is dropped. I think they may be
using some sort of "dupe" detection, even though the messages may be
different. When we checked the log file for a script we use for nvpagerd,
there were entries for all of the alerts, so we knew that they were making it
through the rules and the paging process.

        Scott Bursik




        James_Shanks@tivoli.com
        03/12/2001 02:02 PM
        Please respond to IBM NetView Discussion <nv-l@tkg.com>@SMTP@Exchange
        To:     IBM NetView Discussion <nv-l@tkg.com>@SMTP@Exchange
        cc:      
        Subject:        RE: [NV-L] nvcorrd queue

        Well, I'm not sure, but that is certainly possible. That's why I
        recommended the traces.  I don't know how to diagnose this unless we
        do it one step at a time.  Note that you could also have a problem
        with the pager company losing pages.  I have seen that occur before
        if you send too many of them one right after another.

        James Shanks
        Team Leader, Level 3 Support
         Tivoli NetView for UNIX and NT



        "Westphal, Raymond" <RWestphal@erac.com>@tkg.com on 03/12/2001 10:08:39 AM

        Please respond to IBM NetView Discussion <nv-l@tkg.com>

        Sent by:  owner-nv-l@tkg.com


        To:   "'IBM NetView Discussion'" <nv-l@tkg.com>
        cc:
        Subject:  RE: [NV-L] nvcorrd queue



        Thanks James,

        I had an opportunity to install 6.0.2 this weekend, but I haven't
        tried my script yet. However, the cron job that runs similar commands
        works at the scheduled time (11AM CST). It only runs 1 cycle of node
        down/node up. I think I was possibly cycling the test too fast. The
        ruleset "reset on match" node was set for 1 minute. So a page would
        occur for each cycle of the script. Here's the script:

        ##################################
        #echo "Start of script"
        # Announce the start of the test.
        /usr/OV/bin/nv6000_smit run_event -n'1' -e'AA_EV' -h'tnvcorp02' -s'N' \
                -d'1. Ruleset testing is about to begin.' >/dev/null 2>&1
        sleep 1
        ##################################
        /usr/OV/bin/nv6000_smit run_event -n'1' -e'AA_EV' -h'tnvcorp02' -s'N' \
                -d'SIMULATE 9816 DOWN FOR MORE THAN 1 MINUTE(S). PAGE SHOULD OCCUR.' \
                >/dev/null 2>&1
        # Send node down, wait longer than the 1-minute reset window, then
        # send node up.
        /usr/OV/bin/nv6000_smit run_event -n'1' -e'NDWN_EV' -h'blah.blah.com' -s'N' \
                -d'This is a test.' >/dev/null 2>&1
        sleep 70
        /usr/OV/bin/nv6000_smit run_event -n'1' -e'NUP_EV' -h'blah.blah.com' -s'N' \
                -d'This is a test.' >/dev/null 2>&1
        /usr/OV/bin/nv6000_smit run_event -n'1' -e'AA_EV' -h'tnvcorp02' -s'N' \
                -d'TEST CYCLE COMPLETED.' >/dev/null 2>&1
        ##################################


        Do you think my test was too hard on the patient, Doc?

        Ray.



        -----Original Message-----
        From: James_Shanks@TIVOLI.COM [mailto:James_Shanks@TIVOLI.COM]
        Sent: Sunday, March 11, 2001 7:26 PM
        To: IBM NetView Discussion
        Subject: RE: [NV-L] nvcorrd queue


        I've given this some more thought, and my advice to call Support is
        still the same, especially if nvcorrd fails when you ovstop actionsvr.
        But what you want to do with the issue of your pages not occurring is
        to try to determine where this is failing.  If you cannot find an
        error message, then you will need to set up some tracing and try a
        "divide-and-conquer" approach.

        First trace nvcorrd activity with "nvcdebug -d all" and look at the
        nvcorrd.alog/blog.  You will see each trap as it is processed.  Look
        for "Receive a trap" and "Finished with a trap".  Everything in
        between is nvcorrd processing that trap.  You should see the action,
        whether paging via the paging node or a script of your own, being
        kicked off as an action sent to the actionsvr.  If that looks OK, then
        look in the nvaction.alog/blog and see what happens when that action
        gets there.  You should see a line where it is launched and then any
        error messages that come back.  No error messages?  Then I would start
        tracing the pager daemon.  You can do that by adding -d to his ovsuf
        entry, as has been explained in this list many times.  He will start
        writing voluminous output to nvpagerd.alog so you can watch every page
        being sent.  You will probably need help somewhere along the line, so
        I'd recommend that call to Support.
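
The first tracing step above can be checked mechanically. What follows is only a rough sketch. It assumes the trace log contains the marker phrases exactly as quoted ("Receive a trap" and "Finished with a trap"), each on its own line. The function names trap_counts and show_trap_blocks are made up for illustration and are not part of NetView.

```shell
#!/bin/sh
# Sketch only: the marker strings are taken from the message above; the real
# log wording and layout may differ, so adjust the patterns as needed.

# Print how many traps nvcorrd started and finished processing in a log file.
trap_counts() {
    log=$1
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    received=$(grep -c 'Receive a trap' "$log" || true)
    finished=$(grep -c 'Finished with a trap' "$log" || true)
    echo "received=$received finished=$finished"
}

# Print every line from a "Receive a trap" marker through the matching
# "Finished with a trap" marker, with a separator after each completed trap.
show_trap_blocks() {
    awk '
        /Receive a trap/       { inblock = 1 }
        inblock                { print }
        /Finished with a trap/ { inblock = 0; print "----" }
    ' "$1"
}
```

Running trap_counts against the nvcorrd trace log (assumed here to sit alongside the other logs in /usr/OV/log) after a test cycle gives a quick hint: if the two counts differ, a trap was still in flight or never finished. If a test event is not counted at all, it may never have reached nvcorrd.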

        James Shanks
        Team Leader, Level 3 Support
         Tivoli NetView for UNIX and NT



        James_Shanks@TIVOLI.COM@tkg.com on 03/09/2001 11:02:57 PM

        Please respond to IBM NetView Discussion <nv-l@tkg.com>

        Sent by:  owner-nv-l@tkg.com


        To:   IBM NetView Discussion <nv-l@tkg.com>
        cc:
        Subject:  RE: [NV-L] nvcorrd queue



        Ray -

        You need to call Support and open a problem.  The help you need is
        much too detailed to be given on a forum like this.  Someone needs to
        examine your ruleset and set up some tracing, and if you are getting
        a core, then we need a core report and that will have to be analyzed
        as well.

        James Shanks
        Team Leader, Level 3 Support
         Tivoli NetView for UNIX and NT



        "Westphal, Raymond" <RWestphal@erac.com>@tkg.com on 03/09/2001 04:18:38 PM

        Please respond to IBM NetView Discussion <nv-l@tkg.com>

        Sent by:  owner-nv-l@tkg.com


        To:   "'IBM NetView Discussion'" <nv-l@tkg.com>
        cc:
        Subject:  RE: [NV-L] nvcorrd queue



        James,
        I've been testing these rulesets all day. I get about an 80% success
        rate on paging. When the breaks occur, nothing shows up in the nvcorrd
        log or nvaction log. I know that nothing is received at the paging
        server either. The next time I run my script it may work. A restart of
        daemons is not necessary.

        I hope to install 6.0.2 Monday evening. I know there is a fix for an
        nvcorrd core dump problem. I encounter the problem every time I stop
        actionsvr. Is there an nvcorrd problem in 6.0.1 that could cause this
        symptom?

        Thanks.

        Ray Westphal
        Enterprise Rent-A-Car



        -----Original Message-----
        From: James_Shanks@TIVOLI.COM [mailto:James_Shanks@TIVOLI.COM]
        Sent: Friday, March 09, 2001 9:00 AM
        To: NV-L@tkg.com
        Subject: [NV-L] nvcorrd queue


        Ray -
        netstat -a  will show you what I was talking about, which aren't
        really queues but messages on the sockets of various UNIX processes.
        If everything is working right, you should see the send and receive
        queues be at zero or near zero, indicating that the processes involved
        are getting to every new message right away.
        Since netstat uses the /etc/services file to identify the port owners,
        you should see some part of the word "nvcorrd" as part of the named
        process owner.  If your hostname is too long and you don't get enough
        displayed, you can try "netstat -an" which gives a numeric display.
        nvcorrd's port is 1666.  Sockets are assigned in pairs, for
        interprocess communication, so when you have rulesets running you will
        always see another process shown whose information is the obverse of
        nvcorrd's -- the receive and send amounts will be swapped.
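
The check described above can be narrowed to just the nvcorrd sockets. This is a sketch under stated assumptions: the numeric netstat output is taken to have the protocol in column 1, Recv-Q and Send-Q in columns 2 and 3, and the local and foreign addresses in columns 4 and 5, with the port as the last ".NNNN" or ":NNNN" component. The function name queued_on_port is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads `netstat -an` style output on stdin and prints the TCP
# lines for the given port whose Recv-Q or Send-Q is non-zero.
# Assumed columns: proto, Recv-Q, Send-Q, local address, foreign address.
queued_on_port() {
    port=$1
    awk -v port="$port" '
        # Keep TCP sockets where either end of the connection uses the port.
        $1 ~ /^tcp/ && ($4 ~ ("[.:]" port "$") || $5 ~ ("[.:]" port "$")) {
            # Report only sockets with data still waiting in a queue.
            if ($2 + 0 > 0 || $3 + 0 > 0) print
        }
    '
}
```

A typical call would be "netstat -an | queued_on_port 1666". No output means both queues are at zero on every nvcorrd socket. Lines that keep reappearing with large values suggest that one side is not reading its messages promptly.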

        Assuming you are doing paging in the standard fashion, either from an
        action node in a ruleset which calls your own script, or from the
        pager icon, both result in the actionsvr actually doing the page.  He
        logs everything, so go check out /usr/OV/log/nvaction.alog and .blog
        for clues.

        James Shanks
        Team Leader, Level 3 Support
         Tivoli NetView for UNIX and NT



        "Westphal, Raymond" <RWestphal@erac.com>@tkg.com on 03/09/2001 09:06:56 AM

        Please respond to IBM NetView Discussion <nv-l@tkg.com>

        Sent by:  owner-nv-l@tkg.com


        To:   "NV List (E-mail)" <nv-l@tkg.com>
        cc:
        Subject:  [NV-L] nvcorrd queue



        Hello James,

        Back in '99 you wrote a nice Ruleset Performance document. In it you
        talk about the nvcorrd daemon and the queue (32K) filling up. I'm
        having a problem with an intermittent paging ruleset. I was wondering
        what to look for in the nvcorrd.?log file to indicate a queue problem.

        Thanks in advance.

        Ray Westphal
        Enterprise Rent-A-Car

        
_________________________________________________________________________
        NV-L List information and Archives: http://www.tkg.com/nv-l


        



Archive operated by Skills 1st Ltd
