When configuring the SNMP polling parameters we decreased
the polling interval to 2 min, keeping the default timeout (2 sec)
and retry count (3). Our network includes approx. 500
interfaces on different routers.
On some links we have found that link down events appear although
the link/interface is actually up (verified by a manual ping test).
This can occur when a poll times out because no ICMP reply arrives
within the time limit (slow link, highly utilized router, ICMP handled
at low priority, routing tables being recalculated, etc.).
We have now increased the number of retries to 10 for some of
our routers, to be really sure that the link is down when the event
is triggered (the event starts other processes that create enterprise
error messages to the helpdesk, etc.).
That, however, seems to result in a very slow update time (10-15 min)
for links/routers that come up after a down state.
My question is about the polling process.
When the retry count is increased, the time to flag an interface
'down' will of course increase. Does that also delay the polling
of the other nodes in the polling list?
(Or is every poll a separate process that does not depend on the
previous one?)
If the retries do delay the other polls, that would seriously affect
the polling cycle and the time it takes to detect a new interface state.
If, for example, we have 10 down interfaces, that would result in a
10*10*2 sec = 200 sec delay, which would hold back the polling cycle
for every other node/interface.
Is this correct?
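
The estimate above can be written out as a small calculation. This is only a sketch under the assumption that polls run strictly one after another with a fixed timeout per attempt; whether NetView really polls that way is exactly what is being asked. The function name and parameters are made up for illustration.

```python
def serial_poll_delay(down_interfaces, retries, timeout_sec):
    """Worst-case seconds spent waiting on dead interfaces in one cycle.

    Assumption (not confirmed for NetView): polling is strictly serial
    and every attempt waits the same fixed timeout. `retries` is counted
    as the number of attempts per interface, as in the estimate above.
    """
    return down_interfaces * retries * timeout_sec


# 10 down interfaces, retry count 10, 2 sec timeout
print(serial_poll_delay(10, 10, 2), "sec")  # 200 sec

# Same 10 down interfaces with the default retry count of 3
print(serial_poll_delay(10, 3, 2), "sec")   # 60 sec
```

If the poller counts the first attempt separately from the retries, or lengthens the timeout on each retry, the real figure would be higher than this.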
In that case one should really keep the retry count low and the
polling interval above 2 min, so that every interface can be checked
within the polling interval.
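
To put a number on that trade-off, the sketch below counts how many unreachable interfaces it would take for their timeouts alone to use up one polling interval. It again assumes strictly serial polling with a fixed timeout per attempt, and it ignores the time spent on interfaces that do answer; the function name is hypothetical.

```python
def down_interfaces_filling_interval(interval_sec, retries, timeout_sec):
    """How many dead interfaces consume a whole polling interval.

    Assumption (not confirmed for NetView): serial polling, fixed
    timeout per attempt, `retries` attempts per dead interface.
    """
    return interval_sec // (retries * timeout_sec)


# 2 min interval, 2 sec timeout
print(down_interfaces_filling_interval(120, 10, 2))  # 6
print(down_interfaces_filling_interval(120, 3, 2))   # 20
```

Under these assumptions, a retry count of 10 leaves room for only 6 down interfaces per 2 min cycle, against 20 with the default retry count of 3.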
Have I got this right or wrong?
Any recommendations?
(BTW, AIX 4.2.1, NetView 5.1)
Erik Nilsson (erik@netman.se)
Network Management tcpip AB
Stockholm