In addition, both you and the customer should remember that "Node down"
means that the node is unusable on the network. It could be up and have a
bad or unresponsive network card. Or the network could be so choked that
the ping gets lost. When you get your node down, try a ping, a demandpoll,
or an snmpwalk to see what connectivity problems might be happening. Try a
traceroute or a checkroute also. And don't increase the timeout value beyond
the ping cycle. For example, if you are polling by ping every three
minutes, and you have a 60 second timeout, with three retries, you may
never finish.
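
A rough sketch of that arithmetic, in Python. It assumes "three retries"
means one initial attempt plus three retries, and it shows two timing
models because the exact retry behavior is not stated here: a fixed
timeout per attempt, and a timeout that doubles on each retry. The
function name worst_case_wait is made up for illustration.

```python
def worst_case_wait(timeout_s, retries, doubling=False):
    """Seconds spent waiting on one silent node before giving up.

    Assumes one initial attempt plus `retries` further attempts.
    With doubling=True the timeout doubles on every retry.
    """
    attempts = retries + 1
    if doubling:
        return sum(timeout_s * 2 ** i for i in range(attempts))
    return timeout_s * attempts


poll_interval_s = 3 * 60  # ping every three minutes

for label, doubling in (("fixed timeout", False), ("doubling timeout", True)):
    wait = worst_case_wait(60, 3, doubling=doubling)
    verdict = "overruns" if wait > poll_interval_s else "fits within"
    print(f"{label}: {wait} s {verdict} the {poll_interval_s} s ping cycle")
```

Under either model the wait for a single unresponsive node is longer than
the 180-second cycle, so the next poll comes due before the last one has
finished.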
James Shanks
Team Leader, Level 3 Support
Tivoli NetView for UNIX and NT
"Leslie Clark" <lclark@us.ibm.com>@tkg.com on 02/13/2001 08:39:30 AM
Please respond to IBM NetView Discussion <nv-l@tkg.com>
Sent by: owner-nv-l@tkg.com
To: IBM NetView Discussion <nv-l@tkg.com>
cc:
Subject: Re: [NV-L] IBM_NVNDWN_EV events Received in Error
Connie, a 60-second timeout is a bad idea. This is a knob that you
turn by degrees. If the default 2/3 is not enough, try 3/3 or 3/4. It
depends on what the problem is. Sometimes certain hubs just lose their IP
for a while, although they are still operating. Sometimes pings get lost,
in which case more retries will help more than longer timeouts. The
real network people can explain better than I can the various things
that can cause false alarms. But from an overall NetView standpoint,
the approach you should take is:
1) Tiny increases in the default setting
2) Slightly larger increases for specific nodes or IP wildcards over slow
links.
3) Look for other problems with, or on the path to, devices that take an
inordinate amount of time to respond.
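
To illustrate why more retries help more than longer timeouts when pings
are simply lost, here is a small Python sketch. It assumes each ping is
lost independently with a fixed probability, which is a simplification
(loss on a choked link tends to come in bursts). The function name and
the 20% loss rate are made up for illustration.

```python
def false_alarm_probability(loss_rate, retries):
    """Chance that every attempt to a healthy node is lost.

    Assumes independent loss per ping and one initial attempt plus
    `retries` further attempts. The timeout does not appear at all:
    a ping that is lost never arrives, however long you wait.
    """
    attempts = retries + 1
    return loss_rate ** attempts


loss_rate = 0.2  # hypothetical 20% ping loss on a slow link
for retries in (3, 4, 5):
    p = false_alarm_probability(loss_rate, retries)
    print(f"{retries} retries: false node down chance {p:.4%}")
```

In this model each extra retry cuts the false alarm rate by the loss rate
again, while a longer timeout only helps with replies that are late
rather than lost.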
Cordially,
Leslie A. Clark
IBM Global Services - Systems Mgmt & Networking
Detroit
"Bresson, Connie" <connie.bresson@eds.com>@tkg.com on 02/12/2001 09:47:25 AM
Please respond to IBM NetView Discussion <nv-l@tkg.com>
Sent by: owner-nv-l@tkg.com
To: "'nv-l@tkg.com'" <nv-l@tkg.com>
cc:
Subject: [NV-L] IBM_NVNDWN_EV events Received in Error
> Subject: IBM_NVNDWN_EV events Received in Error
>
> Operating System: Solaris 2.6
> Product Group: NetView for UNIX
> .
> Environment:
> NetView 6.0
> Solaris 2.6
> Tivoli 3.6.2
>
> Problem Description:
> We are receiving IBM_NVNDWN_EV node down events in error. There are
> cases when we receive the node down event, but when we check the node we
> find that it was never down and was operating normally. The network links
> to these devices are 56 kb to 128 kb. Is it possible that the timeout values
> and/or the retry values in the SNMP configuration need to be adjusted?
> The customer did adjust the values to 60.0 for timeout and left the retry
> count at 3 but still had problems.
> Would using a value of 60.0 for the timeout value cause other problems?
> Should we look at using MLMs at the remote sites to do the polling? Will
> MLMs also have any problems communicating with the NetView server over slow
> links?
> Any help or advice will be appreciated.
> Regards,
> Connie Bresson
> Global Delivery -Enterprise Framework Capabilities
> connie.bresson@eds.com
>
_________________________________________________________________________
NV-L List information and Archives: http://www.tkg.com/nv-l