RE: [nv-l] Windows Clusters/Router issue

To:	<nv-l@lists.us.ibm.com>
Subject:	RE: [nv-l] Windows Clusters/Router issue
From:	"Barr, Scott" <Scott_Barr@csgsystems.com>
Date:	Thu, 23 Oct 2003 16:27:00 -0500
Delivery-date:	Thu, 23 Oct 2003 22:37:55 +0100
Envelope-to:	nv-l-archive@lists.skills-1st.co.uk
Reply-to:	nv-l@lists.us.ibm.com
Sender:	owner-nv-l-digest@lists.us.ibm.com
Thread-index:	AcOZqwhoGj/yx4JVSQK5z8bO+vJdHwAAOGvw
Thread-topic:	[nv-l] Windows Clusters/Router issue

I use the defaults (3 retries at 2.0 seconds). If I have a problem, I usually push the timeout from 2 seconds up to 3 or 4 seconds. Increasing retries is less desirable in my estimation, but if a link is really slow, I might bump the retries as high as 7. Again, these changes have some undesirable side effects if you apply them to a lot of devices.
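
As a rough illustration of why raising retries costs more than raising the timeout, the sketch below adds up how long a poller could wait on one unresponsive device. It assumes "3 retries" means one initial attempt plus three more, and it shows both a fixed timeout and a doubling timeout. Whether netmon backs off between retries is an assumption here, not something stated in this thread.

```python
def worst_case_wait(timeout_s: float, retries: int, backoff: float = 1.0) -> float:
    """Seconds spent waiting on one unresponsive device.

    Assumes one initial attempt plus `retries` further attempts, each
    waiting timeout_s * backoff**attempt seconds. The backoff factor is
    an illustrative assumption (1.0 = fixed timeout, 2.0 = doubling).
    """
    return sum(timeout_s * backoff ** attempt for attempt in range(retries + 1))


# Settings mentioned in the message: defaults, a longer timeout, more retries.
for timeout_s, retries in [(2.0, 3), (4.0, 3), (2.0, 7)]:
    fixed = worst_case_wait(timeout_s, retries)
    doubling = worst_case_wait(timeout_s, retries, backoff=2.0)
    print(f"timeout={timeout_s}s retries={retries}: "
          f"fixed={fixed:.0f}s doubling={doubling:.0f}s")
```

Multiplied across many slow or down devices in one polling cycle, either change lengthens the cycle, which is the side effect mentioned above.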

It's interesting that changing the default polling interval fixed it. I use a 3-minute polling interval. I think you need to look at the -V option on netmon for the HSRP issue.

-----Original Message-----
From: owner-nv-l-digest@lists.us.ibm.com [mailto:owner-nv-l-digest@lists.us.ibm.com]On Behalf Of CMazon@commercebankfl.com
Sent: Thursday, October 23, 2003 4:11 PM
To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] Windows Clusters/Router issue

I ran it, and they have a date and time just like the example you gave me; however, the TOPOLOGY POLL states MAXIMUM TIME. These routers are Cisco 2620s. We do have a Cisco 2500 in Panama, and that particular router doesn't have this up/down issue but is always deleting and re-adding the HSRP address.

What I have done to get the router to switch to SNMP polling (without rediscovering it) is to change the default status polling interval. That seems to help netmon accept the changes. Can I ask what your default SNMP poll interval is? And what more forgiving timing do you use for your slow links? I am monitoring routers in Curacao, Panama, Zurich, and Houston from Miami.

Thanks.

"Barr, Scott" <Scott_Barr@csgsystems.com>
Sent by: owner-nv-l-digest@lists.us.ibm.com
10/23/2003 04:24 PM
Please respond to nv-l

To: <nv-l@lists.us.ibm.com>
cc: <owner-nv-l-digest@lists.us.ibm.com>
Subject: RE: [nv-l] Windows Clusters

Do an ovtopodump -rl on the router name and look for this field:

SNMP STATUS POLL: Thu Oct 23 21:16:15 2003

If that line reads "MAXIMUM TIME" then you are still polling with ICMP. My guess is you need to yank it out of topology and rediscover it. (Once you have added the $ flag to the seed file, it doesn't automatically convert to SNMP polling. Other folks may tell you it does, but I have never seen it switch without being rediscovered.)
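
If you want to run this check over many nodes, a minimal sketch like the one below can scan saved ovtopodump -rl output for that field. The field name and the "MAXIMUM TIME" marker come from this thread; the rest of the output layout, and the sample text, are assumptions for illustration.

```python
import re

# Field name and "MAXIMUM TIME" marker are taken from the message above;
# the surrounding output layout is an assumption.
FIELD = re.compile(r"^\s*SNMP STATUS POLL:\s*(.+?)\s*$", re.MULTILINE)


def polled_with_snmp(dump_text: str):
    """Return True/False from the SNMP STATUS POLL field, or None if absent."""
    match = FIELD.search(dump_text)
    if match is None:
        return None
    # A real timestamp suggests SNMP polling; "MAXIMUM TIME" suggests ICMP.
    return match.group(1).upper() != "MAXIMUM TIME"


# Illustrative sample only, not real command output.
sample = """\
TOPOLOGY POLL: MAXIMUM TIME
SNMP STATUS POLL: Thu Oct 23 21:16:15 2003
"""
print(polled_with_snmp(sample))
```

The function only looks at the SNMP STATUS POLL line, so a "MAXIMUM TIME" value on another field such as TOPOLOGY POLL does not affect the result.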

Assuming that you do see them being polled with SNMP properly, you may have to adjust the SNMP timing (i.e. retries / timeouts). If your routers are particularly heavily used, they may take longer to respond. I work in a 99% Cisco shop and we never have issues with SNMP intermittently not responding. SNMP is the lowest-priority process on the router, so if they are very busy, I could envision SNMP requests sometimes going unanswered.

To combat this, like I said, you can adjust the polling parameters to be a little more forgiving, or (as I do) have a ruleset that requires two consecutive status polls to fail before I page anyone. This is especially helpful for routers on the far side of slow/saturated links and for Cisco 2500s, which often lack the CPU power to handle SNMP queries and a full T-1 circuit.
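
The "two consecutive failures before paging" idea can be sketched in a few lines. This is not a NetView ruleset, just the counting logic written in Python; the class name, node name, and poll results are hypothetical.

```python
from collections import defaultdict


class ConsecutiveFailureGate:
    """Signal an alert only after `threshold` failed polls in a row."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self.failures = defaultdict(int)

    def record(self, node: str, poll_ok: bool) -> bool:
        """Record one poll result; return True exactly when a page is due."""
        if poll_ok:
            self.failures[node] = 0  # any success resets the streak
            return False
        self.failures[node] += 1
        # Page once, on the poll that reaches the threshold, not on every
        # later failure of the same outage.
        return self.failures[node] == self.threshold


gate = ConsecutiveFailureGate(threshold=2)
results = [False, True, False, False, False]  # hypothetical poll outcomes
pages = [gate.record("router-a", ok) for ok in results]
print(pages)  # [False, False, False, True, False]
```

A single missed poll followed by a good one never pages; only the second failure in a row does.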

There are probably some other ways to address/investigate this - but SNMP status polls are very reliable, so something unusual must be going on.

-----Original Message-----
From: owner-nv-l-digest@lists.us.ibm.com [mailto:owner-nv-l-digest@lists.us.ibm.com]On Behalf Of CMazon@commercebankfl.com
Sent: Thursday, October 23, 2003 2:56 PM
To: nv-l@lists.us.ibm.com
Cc: 'nv-l@lists.us.ibm.com'; owner-nv-l-digest@lists.us.ibm.com
Subject: RE: [nv-l] Windows Clusters

Thank you all for your help... enabling SNMP polling on those nodes worked. I went ahead and enabled SNMP polling for everything that is configured for SNMP as well, and I now have a new problem and was wondering if you have seen this too. The routers that I enabled SNMP polling for are now constantly reporting as down and then up again, repeatedly. I checked and they do have the correct SNMP settings. When these routers were ICMP polled, they did not have an issue. I had to exclude them from being SNMP polled to prevent this from happening. Any insight on this?

Carlos
(Win2k/NV 7.1.3 FP 1)

"Bursik, Scott {PBSG}" <Scott.Bursik@pbsg.com>
Sent by: owner-nv-l-digest@lists.us.ibm.com
10/23/2003 02:19 PM
Please respond to nv-l

To: "'nv-l@lists.us.ibm.com'" <nv-l@lists.us.ibm.com>
cc:
Subject: RE: [nv-l] Windows Clusters

We have the same exact issue here. I turned on the duplicate IP address notification and have it write out to a log file whenever a dup IP trap comes in, and I was amazed at how many servers out there are using "private" networks and how many of them are using 192.168.x.x for the address scheme. It is hard to get teams to understand that these interfaces can be seen with NetView. People processes are hard to change.

Scott Bursik
Enterprise Systems Management
PepsiCo Business Solutions Group
scott.bursik@pbsg.com
(972) 963-1400

________________________________________
From: Barr, Scott [mailto:Scott_Barr@csgsystems.com]
Sent: Thursday, October 23, 2003 1:05 PM
To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] Windows Clusters

I am assuming the issue you have is that SNMP discovery finds the second non-pingable interface. What is probably happening is that you have more than one server with the 192 address (in my experience 192.168.254.253 seems to pop up a lot). You unmanage the interface on one box, and when a second box is discovered that also has the 192.168 interface, it deletes the first one. The config polls then find it again, delete it from the second box, and add it to the first box again, in a managed state rather than unmanaged.

I would recommend two things. First, use SNMP polling, not ping polling. This way, the status of the second interface can be obtained. Second, force your server administrators to put a different address on each of the servers that have one of these interfaces. I am struggling with the same thing here with our Dell servers.

-----Original Message-----
From: owner-nv-l-digest@lists.us.ibm.com [mailto:owner-nv-l-digest@lists.us.ibm.com]On Behalf Of CMazon@commercebankfl.com
Sent: Thursday, October 23, 2003 10:42 AM
To: nv-l@lists.us.ibm.com
Subject: [nv-l] Windows Clusters

Win2k/Netview 7.1.3. FP1 / SQL2000,

Hi list,

Maybe someone can shed some light for me. We have 3 Microsoft clusters with several NICs. One NIC in each server is configured with an IP address that is not pingable (192.168.X.X) for the cluster heartbeat. Is there a way to prevent NetView from discovering these interfaces? I have them in the exclude list of the seed file and I tried to unmanage them, but somehow NetView continues to manage these interfaces on its own. Has anyone come across this problem before?

Also, if there is any consultant on this list located in Miami, FL, please email me directly. (Sorry for posting this here.)

Carlos
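
To find which servers share a heartbeat address before asking the administrators to renumber them, something like the following sketch could be run against an interface inventory. The host names and the inventory format are hypothetical; only the 192.168.254.253 address comes from the message quoted above.

```python
import ipaddress
from collections import defaultdict


def shared_private_addresses(interfaces):
    """Map each private IP seen on more than one host to those hosts.

    `interfaces` is an iterable of (hostname, ip_string) pairs.
    Note: is_private covers the RFC 1918 ranges plus some other
    reserved ranges such as loopback and link-local.
    """
    hosts_by_ip = defaultdict(set)
    for host, ip_text in interfaces:
        ip = ipaddress.ip_address(ip_text)
        if ip.is_private:
            hosts_by_ip[ip].add(host)
    return {str(ip): sorted(hosts)
            for ip, hosts in hosts_by_ip.items() if len(hosts) > 1}


inventory = [  # hypothetical hosts for illustration
    ("cluster1-a", "192.168.254.253"),
    ("cluster2-a", "192.168.254.253"),
    ("cluster3-a", "192.168.10.1"),
]
print(shared_private_addresses(inventory))
# {'192.168.254.253': ['cluster1-a', 'cluster2-a']}
```

Each address in the result is a candidate for the delete-and-re-add behavior described in the thread, since more than one node claims it.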
