
Re: Node/interface UP/Down Reset on Match

To: nv-l@lists.tivoli.com
Subject: Re: Node/interface UP/Down Reset on Match
From: "Stoner, Raymond" <raymond.stoner@SPCORP.COM>
Date: Mon, 9 Nov 1998 15:38:29 -0500
Reply-to: Discussion of IBM NetView and POLYCENTER Manager on NetView <NV-L@UCSBVM.UCSB.EDU>
Sender: Discussion of IBM NetView and POLYCENTER Manager on NetView <NV-L@UCSBVM.UCSB.EDU>
Joel & James, thanks, I did not realize that would happen. I have
adjusted our international links to a 9-second timeout and 4 retries,
and they seem to be behaving better. Our polling interval for these
links is 10 minutes (the global default is 5). Should I leave this as is?

-----Original Message-----
From: Joel A. Gerber [mailto:joel.gerber@usaa.com]
Sent: Monday, November 09, 1998 2:42 PM
To: NV-L@UCSBVM.UCSB.EDU
Subject: Re: Node/interface UP/Down Reset on Match


James is right.  You need to be careful when increasing retries.
Timeout/retries are not unique to the NetView application; they simply
control what happens at the lower TCP/IP layers in the protocol stack.
The most common implementation on all platforms is to double the timeout
value for every retry, which is exactly what AIX does.  A timeout/retry
combination of 30/40 will result in a total timeout of a million years!!
(Try the math yourself: take 2 to the 40th power times 30 seconds.)  You
need to be especially careful when increasing retries, but you should be
careful with the timeout value, too.  For example, changing the timeout
from 1 to 10 seconds with 5 retries means you increase the total timeout
from 63 seconds (1 + 2 + 4 + 8 + 16 + 32) to 630 seconds.

We use a global default of a 5.0 second timeout and 3 retries.  For
resources that need a longer timeout we use 9.0 seconds and 4 retries.
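
The doubling arithmetic described above can be sketched in a few lines of
Python. This is only an illustration, not anything NetView ships: the
function names are made up, and the model (one initial attempt plus the
configured retries, each wait twice the previous one) is an assumption
chosen because it reproduces the 63 and 630 second figures quoted above.

```python
def total_timeout(timeout_s, retries):
    """Worst-case wait when the timeout doubles on every retry.

    Assumes one initial attempt plus `retries` retries, so the waits are
    timeout, 2*timeout, 4*timeout, ... (retries + 1 terms in total).
    """
    return timeout_s * (2 ** (retries + 1) - 1)


def fits_in_poll_cycle(timeout_s, retries, poll_interval_s):
    """True if a full timeout sequence ends before the next poll starts."""
    return total_timeout(timeout_s, retries) < poll_interval_s


if __name__ == "__main__":
    # (timeout in seconds, retries, polling interval in seconds)
    settings = [(5, 3, 300), (9, 4, 600), (30, 40, 600)]
    for timeout_s, retries, poll_s in settings:
        total = total_timeout(timeout_s, retries)
        ok = fits_in_poll_cycle(timeout_s, retries, poll_s)
        print(f"{timeout_s}s x {retries} retries: {total}s total, fits: {ok}")
```

Under this model, 5/3 and 9/4 both finish well inside their 5 and 10
minute polling intervals, while 30/40 never does.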

        -----Original Message-----
        From:   James_Shanks@TIVOLI.COM [SMTP:James_Shanks@TIVOLI.COM]
        Sent:   Friday, November 06, 1998 15:14
        To:     NV-L@UCSBVM.UCSB.EDU
        Subject:        Re: Node/interface UP/Down Reset on Match

        40 retries?  That cannot be right.  You should not increase the
        retries like that.  It would mean that netmon would never be
        finished with the polling cycle for this device.  The retry count
        is how many times netmon should try the device before he considers
        it down.  With a high timeout, he would still be waiting on
        timeouts from one cycle when it is time to begin the next, which
        will lead to very strange results.  Drop that back to where it
        was.  What you want is longer timeouts but few retries.

        There are sample rulesets for Node Down/UP and Interface Down/UP.
        Have you looked at those?

        James Shanks
        Tivoli (NetView for UNIX) L3 Support



        "Stoner, Raymond" <raymond.stoner@SPCORP.COM> on 11/06/98 03:38:54 PM

        Please respond to Discussion of IBM NetView and POLYCENTER Manager on
              NetView <NV-L@UCSBVM.UCSB.EDU>

        To:   NV-L@UCSBVM.UCSB.EDU
        cc:    (bcc: James Shanks)
        Subject:  Re: Node/interface UP/Down Reset on Match





        I have changed, and continue to increase, the polling values for
        these devices as you suggested; maybe my values are no good. I
        currently have (just for these specific devices) the timeout at
        30, retries at 40, and a polling interval of 10 minutes. We
        started at 8, 5, and 5.  I'll do some netmon tracing on Monday.

        I probably do not have the rule structured properly (NetView
        rookie).  Not quite sure how to match up the events.

        -----Original Message-----
        From: James_Shanks@TIVOLI.COM [mailto:James_Shanks@TIVOLI.COM]
        Sent: Friday, November 06, 1998 2:49 PM
        To: NV-L@UCSBVM.UCSB.EDU
        Subject: Re: Node/interface UP/Down Reset on Match


        Normally, I would recommend you look at polling intervals and
        timeouts, since those control when netmon decides that an
        interface is down and sends the traps.  I would suggest a separate
        entry in the SNMP Configuration for these devices with a longer
        timeout.  If that's not working, perhaps you might try a netmon
        trace to see what is happening here.  If you need help with that,
        I'd call Support and ask for it.

        The ruleset issue is more puzzling to me, because in principle,
        this is just the sort of thing Pass/Reset-On-Match should do well.
        The problem may be your timing, however.  Ten seconds is way too
        fine an increment for the daemon to handle.  The heartbeat
        mechanism for checking the threshold is set at 15 seconds, so it
        would be impossible to get good results lower than that.  Why not
        have him hold it for a minute or two?  Then if there is going to
        be an UP event, you are sure not to miss it.
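
The hold-and-reset idea in the paragraph above can be sketched in Python.
This is a hypothetical model only and does not use any NetView ruleset
API: the function names and the (time, node, state) event tuples are
invented for illustration, and rounding a requested hold up to a whole
number of 15-second heartbeat ticks is an assumption about how the check
interval limits the usable hold time.

```python
import math

HEARTBEAT_S = 15  # threshold check interval quoted in the message above


def effective_hold(requested_hold_s, heartbeat_s=HEARTBEAT_S):
    """Round a requested hold up to a whole number of heartbeat ticks.

    Assumption: hold windows are only examined on heartbeat ticks, so a
    hold shorter than one tick cannot be honored.
    """
    ticks = math.ceil(requested_hold_s / heartbeat_s)
    return max(heartbeat_s, ticks * heartbeat_s)


def forwarded_downs(events, hold_s):
    """Return the Down events that survive the hold window.

    `events` is a list of (time_s, node, state) tuples sorted by time,
    with state "down" or "up". A Down is dropped when an Up for the same
    node arrives within `hold_s` seconds; otherwise it is forwarded.
    """
    forwarded = []
    for i, (t, node, state) in enumerate(events):
        if state != "down":
            continue
        matched = any(
            n == node and s == "up" and t <= t2 <= t + hold_s
            for t2, n, s in events[i + 1:]
        )
        if not matched:
            forwarded.append((t, node))
    return forwarded
```

With a 60 second hold, a Down followed two seconds later by an Up for the
same node is suppressed, while a Down whose Up arrives minutes later is
still forwarded.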

        James Shanks
        Tivoli (NetView for UNIX) L3 Support



        "Stoner, Raymond" <raymond.stoner@SPCORP.COM> on 11/06/98 02:02:14 PM

        Please respond to Discussion of IBM NetView and POLYCENTER Manager on
              NetView <NV-L@UCSBVM.UCSB.EDU>

        To:   NV-L@UCSBVM.UCSB.EDU
        cc:    (bcc: James Shanks)
        Subject:  Node/interface UP/Down Reset on Match





        Sometimes we receive a Node/Interface Down event and, a second or
        two later, the Node/Interface Up event, especially on our
        international links. I have tried to adjust the timeout and retry
        intervals for these nodes, but the problem still occurs. I would
        like to hold the Down messages for about ten seconds to see if the
        Up message is received and, if not, forward the Down event on to
        our T/EC console. A ruleset using Reset on Match might be the way
        to go, but I'm having trouble getting that to work. Any
        suggestions on dealing with the rule or this situation are greatly
        appreciated.

        We are running NetView V4.1 on AIX 4.1.5

        Raymond Stoner
        Technical Advisor
        Schering Plough Corporation
        1011 Morris Ave. Union NJ 07083-7120
        Phone : (908)-820-6268 Fax : (908)-820-6102
        email: raymond.stoner@spcorp.com
        iloviT

