John,
We ran into the same problem. Setting the timeout intervals
to the following helped:
5 20 7
The 1st timeout is long enough for normal WAN/LAN devices.
The 2nd one buys some time for network congestion to clear.
The 3rd could be 3, 5, or 7; it is really the 2nd one that counts.
These settings reduced our false-down reports by about 80%.
In addition, I put in a script (Perl) that does a little
"event-smoothing". Basically, it does the following:
a) Starts-up at the 1st interface-down (for any given node),
creates a file, and puts an entry in the file
(filename=nodename) of "interface-address status"
b) Sleeps for 2 minutes.
c) Wakes up, reads all of the interface entries, and figures
out whether there were interface down/up pairs.
d) For any interface down/up pair, it does nothing. For
interfaces that are in fact still down, it reports them (page).
Note: One nice thing... if you write the code
correctly... you'll only get one page for
an entire router that goes down.
e) Finally, it deletes the temp "node-file".
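Steps c) and d) can be sketched roughly as follows. This is not the
original Perl script; it is a small Python illustration, and the names
interfaces_still_down and page_text, along with the wording of the page,
are made up for the example. It assumes each node-file entry has already
been parsed into an (interface-address, status) pair in arrival order.

```python
def interfaces_still_down(entries):
    """Return the interfaces whose most recent status is 'down'.

    entries: iterable of (interface_address, status) pairs in arrival
    order, where status is 'down' or 'up' (assumed format).
    """
    last_status = {}
    for address, status in entries:
        # A later 'up' overwrites an earlier 'down', cancelling the pair.
        last_status[address] = status
    return sorted(a for a, s in last_status.items() if s == "down")


def page_text(node, entries):
    """Build one page for the whole node, or None if nothing is down."""
    down = interfaces_still_down(entries)
    if not down:
        return None
    return f"{node}: {len(down)} interface(s) down: {', '.join(down)}"
```

Because the page is built per node rather than per interface, a router
that drops every interface at once still produces a single page.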
----------
The program also does the following... to make the
above logic work:
a) Starts up on each subsequent interface-down (or up) event.
b) If it discovers that the node-file already exists... then
it just adds an entry to the file... and exits (no sleep).
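The "first invocation sleeps, later invocations just append" behaviour
described above could look something like the sketch below. Again, this is
Python rather than the original Perl, and record_event, spool_dir, and
wait_seconds are hypothetical names chosen for the example. It assumes the
script is launched once per event with the node name, interface address,
and status.

```python
import os
import time


def record_event(spool_dir, node, address, status,
                 wait_seconds=120, sleep=time.sleep):
    """Record one event in the node-file (filename = node name).

    Returns None if an earlier invocation already owns the node-file.
    Otherwise waits, then returns all collected (address, status)
    entries and deletes the node-file.
    """
    path = os.path.join(spool_dir, node)
    line = f"{address} {status}\n"
    try:
        # O_EXCL makes creation atomic: only one invocation becomes owner.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        # Node-file already exists: add an entry and exit (no sleep).
        with open(path, "a") as f:
            f.write(line)
        return None
    with os.fdopen(fd, "w") as f:
        f.write(line)
    sleep(wait_seconds)  # later events for this node accumulate meanwhile
    with open(path) as f:
        collected = [tuple(entry.split()) for entry in f if entry.strip()]
    os.remove(path)
    return collected
```

One caveat with this sketch: an event that arrives between the owner's
read and the delete could be lost, so a production version would want some
form of locking around those two steps.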
----------
The program does add two minutes to the detection sequence,
but I now get no (or very few) down/up pages. The people who
use this code happily accept the extra two minutes in exchange
for uninterrupted sleep.
Regards,
Gary Boyles, Intel
-----Original Message-----
From: John D. Westmoreland [mailto:jwestmor@dc-is.org]
Sent: Friday, August 25, 2000 9:18 AM
To: nv-l@tkg.com
Subject: [NV-L] Bogus Interface Down Events
I am running NetView for AIX V6R0 (AIX oslevel 4.3.1.0).
I keep receiving Interface Down events from various routers, hubs,
switches during netmon's polling cycle. During the next polling cycle,
the corresponding Interface Up events are received.
I can ping the interfaces, and the devices in question do not record any
outages in their internal error logs.
I have tried adjusting the timeout and retry parameters in SNMP Config.,
but I really don't want to change them too drastically.
Any suggestions???