RE: Bogus Interface Down Events

From: "Boyles, Gary P" <gary.p.boyles@intel.com>
Date: Fri, 25 Aug 2000 09:47:34 -0700
We ran into the same problem.  Setting the timeout intervals
to the following helped:
5  20  7

The 1st timeout is long enough for normal WAN/LAN devices.
The 2nd one buys some time for network congestion to go away.
The 3rd could be 3/5/7... it really is the 2nd one that counts.

The above settings cut false-down reports by about 80%.

In addition... I put in a script (perl) that does a little
"event-smoothing".  Basically, it does the following:
a)  Starts-up at the 1st interface-down (for any given node),
    creates a file, and puts an entry in the file
    (filename=nodename) of "interface-address   status"
b)  Sleeps for 2 minutes.
c)  Wakes up and reads all of the interface entries in the
    file... and figures out which ones form down/up pairs.
d)  For any interface down/up pair... it does nothing.  For
    interfaces that are in fact still down... it reports them (page).
        Note:  One nice thing... if you write the code
               correctly... you'll only get one page for
               an entire router that goes down.
e)  Finally, it deletes the temp "node-file".
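
Steps c) and d) can be sketched roughly as below. This is a minimal
Python illustration, not the original perl script. The function names,
the "down"/"up" status strings, and the message format are all
assumptions made for the example.

```python
def interfaces_still_down(events):
    """Return the interfaces whose most recent status is "down".

    events: (interface_address, status) tuples in arrival order,
    where status is "down" or "up" (assumed format).
    """
    last_status = {}
    for address, status in events:
        # A later "up" overwrites an earlier "down", which cancels the pair.
        last_status[address] = status.lower()
    return sorted(a for a, s in last_status.items() if s == "down")


def build_page(node, events):
    """Build one page message per node, or None if nothing is still down."""
    down = interfaces_still_down(events)
    if not down:
        return None
    return f"{node}: {len(down)} interface(s) down: {', '.join(down)}"
```

Because the message is built per node rather than per interface, a
router that loses every interface still produces a single page.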
The program also does the following... to make the
above logic work:
a)  Starts-up at each subsequent interface-down (or up) event.
b)  If it discovers that the node-file already exists... then
    it just adds an entry to the file... and exits  (no sleep).
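
The "first event creates the file, later events just append" logic
might look something like this. Again this is a hedged Python sketch,
not the original perl. The spool directory, function names, and entry
layout are hypothetical. It uses an exclusive create so that only one
invocation sees itself as the first.

```python
import os


def record_event(spool_dir, node, interface, status):
    """Append one entry to the node-file.

    Returns True if this call created the file (first event for the
    node), False if the file already existed.
    """
    path = os.path.join(spool_dir, node)  # filename = nodename
    try:
        # O_EXCL makes the create fail if another invocation got there first.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL | os.O_APPEND, 0o600)
        first = True
    except FileExistsError:
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)
        first = False
    with os.fdopen(fd, "a") as fh:
        fh.write(f"{interface}   {status}\n")
    return first


def read_and_remove(spool_dir, node):
    """Read all (interface, status) entries, then delete the node-file."""
    path = os.path.join(spool_dir, node)
    with open(path) as fh:
        events = [tuple(line.split()) for line in fh if line.strip()]
    os.remove(path)
    return events
```

An invocation that gets True back would sleep for the two minutes,
call read_and_remove(), and page on whatever is still down. One that
gets False back simply exits.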

Now the program does add an additional two minutes to the
detection sequence, but I don't get any (or very few) down/up
pages.  The people who use this code consider the two-minute
sleep a fair trade for fewer false pages.


Gary Boyles, Intel

I am running NetView for AIX V6R0 (AIX oslevel

I keep receiving Interface Down events from various routers, hubs,
switches during netmon's polling cycle. During the next polling cycle,
the corresponding Interface Up events are received.

I can ping the interfaces, and the devices in question do not record any
outages in their internal error logs.

I have tried adjusting the timeout and retry parameters in SNMP Config.,
but I really don't want to change them too drastically.

Any suggestions???


