Hi Ray,
I use another approach for this issue: if you can wait for three minutes before
getting the IF_DOWN event, you can modify the polling parameters so that
"down" only occurs after netmon has been pinging for three minutes without
success. Go to "Options -> SNMP Configuration" and change the "Timeout" value
and the "Retry Count". Remember that the timeout period is doubled for every retry.
For example, with Timeout 2, the number of retries sets the total time until
the IF_DOWN event:

  attempt         wait (seconds)   total time until IF_DOWN event (seconds)
  initial ping          2                    2
  retry 1               4                    6
  retry 2               8                   14
  retry 3              16                   30
  retry 4              32                   62
  retry 5              64                  126
  retry 6             128                  254
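
The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch of the doubling rule described above, not anything netmon itself exposes; the function names are made up for illustration.

```python
def seconds_until_if_down(timeout, retries):
    """Total wait before IF_DOWN, assuming the wait doubles on every retry."""
    total = 0.0
    wait = timeout
    for _ in range(retries + 1):  # the initial ping plus each retry
        total += wait
        wait *= 2
    return total


def min_retries_for(timeout, hold_seconds):
    """Smallest retry count whose total wait reaches hold_seconds."""
    retries = 0
    while seconds_until_if_down(timeout, retries) < hold_seconds:
        retries += 1
    return retries
```

With a timeout of 2 seconds, `min_retries_for(2, 180)` gives the retry count needed to cover a three-minute hold.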
Let's look at the probable reasons for those fake downs. One possibility is
that the timeout/retry combination specified in NetView is not appropriate.
If the logs of those devices show interfaces going down and up for short
periods, then there is a real problem, and it's better to solve it than to
tell NetView to ignore it.
If you look at the logs of the devices with fake downs and they show no
interface down, then either they were too busy to answer the netmon ping
request or there was a network problem between your NetView and the device.
My experience, with timeout 1.8 and retry 5 in a local network and a
frame-relay network with a maximum of 3 hops, is that recurring "fake"
interface downs point to devices that will become critical within the next few
weeks because they are permanently overloaded (speaking of Cisco devices). You
may not see this by looking at the CPU load of these devices or the load on
single interfaces, but that's my experience. Another cause of "fake" interface
downs is routing problems, for example updates of large routing tables
recurring very quickly due to a flapping interface. So I would play with the
timeout and retry parameters a bit and then look for the reasons if those fake
downs still exist.
Hope this helps
Michael Seibold
>>> Ray.Foss@motorola.com 01.09. 2.03 Uhr >>>
Forgive me if this is a FAQ, my ears are still wet when it comes to NV6K.
I'm running NetView 6.0 and forwarding OV_IF_Down and OV_IF_Up events to a
3.6.2 TEC based on a NetView ruleset. My question is:
Can I use the RESOLVE template, as in the sample (sampcorrIuId.rs) ruleset,
to suppress quick down/up indications that are not real outages? Here is
the pseudo-code for my desired results:
if you get an OV_IF_Down
    Hold it for a while (3 minutes)
if you get an OV_IF_Up
    if you have an OV_IF_Down in hold from the same IP
        resolve this event
if the Hold timeout expires
    forward the OV_IF_Down event
I also don't know if I can cascade rule sets. Any help is greatly
appreciated. Thanks.
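
The hold-and-resolve logic in the pseudo-code above can be sketched in plain Python. This is not NetView ruleset or TEC syntax, only an illustration of the intended behaviour; the class name, method names, and the 180-second default are assumptions taken from the "3 minutes" in the pseudo-code.

```python
class HoldDownBuffer:
    """Hold down events per IP and drop them if an up arrives in time."""

    def __init__(self, hold_seconds=180.0):
        self.hold_seconds = hold_seconds
        self.held = {}  # ip -> time the down event arrived

    def on_down(self, ip, now):
        # Keep the earliest down for this IP; do not restart the hold.
        self.held.setdefault(ip, now)

    def on_up(self, ip, now):
        """Return True if this up resolved a down that was still on hold."""
        started = self.held.get(ip)
        if started is not None and now - started < self.hold_seconds:
            del self.held[ip]
            return True
        return False

    def expire(self, now):
        """Return the IPs whose hold has run out; these downs get forwarded."""
        due = [ip for ip, t in self.held.items()
               if now - t >= self.hold_seconds]
        for ip in due:
            del self.held[ip]
        return due
```

A caller would feed each event into `on_down` or `on_up` with a timestamp and call `expire` periodically, forwarding a down event only for the IPs it returns.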
--
~~~~~~~~
Ray Foss
_________________________________________
mailto:Ray.Foss@motorola.com
Office: 480-441-1093 Mobile: 602-721-4792
Pager: 800-759-8352 PIN: 1244994
FAX: 480-441-5455
Motorola, Inc.
Global Computing & Telecommunications
8111 East McDowell Road - AZ33-H1780
Scottsdale, Arizona 85257 USA
_________________________________________