Re: Trap Filtering ??

To: nv-l@lists.tivoli.com
Subject: Re: Trap Filtering ??
From: Bill Evans <wvevans@attglobal.net>
Date: Sat, 08 Sep 2001 16:23:01 -0400
The critical missing detail is the host system type, although I guess it's 
Unix from the use of "ovxbeep".  If it's Unix then Todd's advice is on the 
mark.  If it's Windows, things are a bit more difficult and the approach in 
a "life critical" situation is slightly different: Windows does not have 
the Ruleset Editor, but it will run rules created on a Unix system.

False alarms are normal in a NetView network; remember that all NetView 
can really tell you is that "Interface eth0 (or eth1 or Fddi0)" did not 
respond to a series of three pings (or SNMP queries) during the 
polling cycle.  The cycle typically spans only a few seconds on a normal 
network.  There are a number of possible reasons for a missed response: 
busy routers, saturated links, and busy processors on the systems being 
monitored.  Someone should be looking into the reason for these false 
alarms along the way.

The normal ruleset solution is to delay the notice until the next 
polling cycle (the node up/node down sample rule, which delays five 
minutes on Unix) to see if it's still bad, and that may be too much delay 
in your situation.  I would suggest an automated response instead: modify 
the shell script that issues ovxbeep so that it validates the NetView 
event first.  Add a slight timeout to let system and network spikes pass 
(maybe sleep one minute), then issue an SNMPWALK against the operational 
status MIB variable for the interface (ifOperStatus in the standard 
interfaces MIB) before accepting the NetView notification as accurate.
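
As a rough illustration of that validation step, here is a minimal sketch in Python rather than shell.  It makes several assumptions that are not from the original setup: that the Net-SNMP "snmpget" command is installed on the workstation, that the agent answers SNMPv1 with the community string "public", and that the interface index is known.  The function names are hypothetical.  An agent that does not answer at all is treated as still down, which seems the safer choice in a life critical system.

```python
import subprocess
import time

IF_OPER_UP = 1  # ifOperStatus value up(1) in the standard interfaces MIB


def query_oper_status(host, if_index, community="public"):
    """Return ifOperStatus as an integer, or None if no usable answer came back.

    Assumes the Net-SNMP 'snmpget' CLI; -Oqve prints only the value,
    with the enumeration shown numerically (1 = up, 2 = down).
    """
    cmd = ["snmpget", "-v", "1", "-c", community, "-Oqve",
           host, "IF-MIB::ifOperStatus.%d" % if_index]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    try:
        return int(result.stdout.strip())
    except ValueError:
        return None


def still_down(host, if_index, delay=60, query=query_oper_status,
               sleep=time.sleep):
    """Wait for spikes to pass, then re-check the interface.

    Returns True unless the agent positively reports the interface as up.
    """
    sleep(delay)
    return query(host, if_index) != IF_OPER_UP
```

The alarm script would then issue ovxbeep only when still_down() returns True for the host and interface named in the event.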

Hope this helps.  I'd also second Todd's advice to get the one week 
NetView training for administrators from the IBM Education troops.

Jeff CTR Dennison wrote:

>      I have inherited maintenance of a series of small (25 IP addresses) 
>      systems with a NetView based monitoring and control workstation at 
>      each of 20 sites. Approximately 5 or 6 times each day at each site 
>      NetView displays a message such as "Interface eth0 (or eth1 or Fddi0) 
>      down" for some processor or router port. These traps are set up using 
>      ovxbeep to post an alarm that must be manually acknowledged. However, 
>      the processor or port is not actually down.  The system being 
>      monitored is a real-time, life critical system with personnel manning 
>      the monitoring position 24/7. The problem is that the monitoring 
>      personnel have become used to ignoring the alerts and just clicking 
>      them off because they are almost always false alarms and they are now 
>      complaining about the high number of false alarms. This is a bad 
>      situation in a life critical environment.
>      
>      Although I have very little actual NetView expertise I have been told 
>      that the traps can be filtered so that the trap would only be 
>      displayed if it happened twice during some time period or twice in a 
>      row, or something similar.
>      
>      I'm hoping someone can either help in how this filtering might be done 
>      or point me to a good reference source.
>      
>      I would appreciate it very much if you could respond to my email 
>      address since I have just joined the mailing list and am not fully set 
>      up yet.
>      
>      Thanks in advance for any help.
>      
>      Jeff Dennison
> _________________________________________________________________________
> NV-L List information and Archives: http://www.tkg.com/nv-l
> 
> 


-- 
Bill Evans  --  Consultant in Enterprise Systems Management
reply-to: wvevans@prodigy.net  (or Bill_Evans@sra.com)
Phone: 919-696-7513

