
Re: Node up / down suppression

To: nv-l@lists.tivoli.com
Subject: Re: Node up / down suppression
From: Ray Schafer <schafer@TKG.COM>
Date: Thu, 11 Jun 1998 01:00:55 -0400
In-reply-to: <8625661F.00060971.00@I1.genam.com>
Reply-to: schafer@tkg.com
Sender: Discussion of IBM NetView and POLYCENTER Manager on NetView et alia <NV-L@UCSBVM.UCSB.EDU>
Hi John,

You are right.  Netmon will try the ping, and there are several reasons a
ping may be missed: network congestion, for example, or a transmit or
receive queue overflow on a network card, whether on the NetView machine,
the destination machine, or any device along the route.  The configuration
file ovsnmp.conf controls the timeout value and the retries for polling.
You can get to it via the xnmsnmpconf command.  However, I think (but am
not certain) that the retry count only applies to SNMP packets.  I don't
think netmon retries pings, but I could be wrong...  The timeout definitely
does apply to pings.
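
On the "page only if down longer than 10 minutes" question, the logic has
to be keyed per node, so that a Node Up from one machine cannot cancel the
pending page for another.  Below is a minimal Python sketch of that
hold-down idea.  It is not NetView or ruleset editor code, and I am not
claiming the ruleset editor works this way internally: the HoldDown class,
its method names and the 600-second threshold are all made up for
illustration, and the caller is assumed to feed it Node Down / Node Up
events with timestamps in seconds.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

HOLD_DOWN_SECONDS = 10 * 60  # assumed threshold, taken from the question


@dataclass
class HoldDown:
    """Hypothetical per-node hold-down timer (illustration only)."""
    hold_seconds: float = HOLD_DOWN_SECONDS
    down_since: Dict[str, float] = field(default_factory=dict)
    paged: Set[str] = field(default_factory=set)

    def node_down(self, node: str, now: float) -> None:
        # Keep the time of the FIRST Node Down; repeats do not reset the clock.
        self.down_since.setdefault(node, now)

    def node_up(self, node: str, now: float) -> bool:
        # Cancel the pending page for this node only.
        # Returns True if a "down" page already went out, so an "up" page
        # is worth sending; a short blip returns False and stays silent.
        self.down_since.pop(node, None)
        if node in self.paged:
            self.paged.discard(node)
            return True
        return False

    def due(self, now: float) -> List[str]:
        # Nodes that have been down for at least hold_seconds and have not
        # been paged yet.  Call this periodically, e.g. once a minute.
        ready = sorted(
            node for node, since in self.down_since.items()
            if now - since >= self.hold_seconds and node not in self.paged
        )
        self.paged.update(ready)
        return ready


# The "three go down, two come back up" case from the question:
hd = HoldDown()
for name in ("a", "b", "c"):
    hd.node_down(name, now=0)
hd.node_up("a", now=120)   # back within the window, no page
hd.node_up("b", now=180)   # back within the window, no page
still_down = hd.due(now=600)  # only "c" is left to page for
```

Because each node carries its own timer, the source of the page is always
the node that stayed down, no matter how many others bounced in the
meantime.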

--
Ray Schafer     The Kernel Group     Network Computing Consulting
schafer@tkg.com http://www.tkg.com   +1 212 880 6444

> -----Original Message-----
> From: Discussion of IBM NetView and POLYCENTER Manager on NetView et
> alia [mailto:NV-L@UCSBVM.UCSB.EDU]On Behalf Of John Mutrux
> Sent: Tuesday, June 09, 1998 9:14 PM
> To: NV-L@UCSBVM.UCSB.EDU
> Subject: Node up / down suppression
>
>
> We have the Event Configuration set to page us when a node goes down and
> again when it comes back up.  The problem is when the node goes down for
> only a few minutes, due to a missed poll, etc., and then comes right back
> up - we don't want these 'false alarms'.  Is it a ping sweep that
> determines whether the node is down (I think it's set to every 5
> minutes)?
> Can we increase the number of tries in the ping that is determining if the
> node is down to decrease the sensitivity?  We want to be paged if the node
> is down longer than 10 minutes for instance.  I noticed the ruleset editor
> has some built-in function for this (it is obviously a common problem /
> complaint); however, how is the logic built to page when it is down longer
> than 10 minutes?  It seems that the logic does not include a specific
> source - only a node down matched by a node up correlation.  What if, say,
> three nodes go down and two come back up?  What will be the source of the
> page?
>
>  What is a good way to be paged only when a node is down longer than a
> given time (say 10 minutes)?
>

