One warning about retries.
Each time netmon retries an SNMP request or a ping, it appears to double
the timeout value. So, if you set 7 retries with a 1-second timeout, you
get timeout values of 1, 2, 4, 8, 16, 32, and 64 seconds.
If you have a lot of nodes configured this way, that can cause more
problems than it solves.
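To put rough numbers on that, here is a small sketch in plain sh. It only
models the doubling as I described it, which I have not verified against
current netmon, and it does not call netmon at all. The function name
backoff_schedule is made up for illustration.

```sh
#!/bin/sh
# Sketch only: models the doubling behavior described above.
# Assumption: every wait is exactly double the previous one.
# backoff_schedule TIMEOUT COUNT prints COUNT waits and their sum.
backoff_schedule() {
    timeout=$1
    count=$2
    total=0
    schedule=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Append the current wait, space-separated after the first one.
        schedule="${schedule}${schedule:+ }${timeout}"
        total=$((total + timeout))
        timeout=$((timeout * 2))
        i=$((i + 1))
    done
    echo "waits: ${schedule}"
    echo "total: ${total}"
}

backoff_schedule 1 7
# prints:
# waits: 1 2 4 8 16 32 64
# total: 127
```

Under that assumption, a single unreachable node ties up about two
minutes of waiting before it is declared down. Running
backoff_schedule 2 5 (5 retries with a timeout of 2) gives a total of 62.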
One caveat: I've been doing TEC/Framework/ITM for a while now, so the way
netmon behaves may have changed since I last worked with it.
-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com]
On Behalf Of Kumar Vanka
Sent: Friday, June 24, 2005 2:28 PM
To: nv-l@lists.us.ibm.com; nv-l@lists.us.ibm.com
Subject: RE: [nv-l] Status Polling
Thanks, Leslie, Bill.
I'll try your suggestions and let you all know.
- Kumar Vanka
Enterprise Architect
Invenio, Inc.
>-- Original Message --
>To: nv-l@lists.us.ibm.com
>Subject: RE: [nv-l] Status Polling
>From: Leslie Clark <lclark@us.ibm.com>
>Date: Fri, 24 Jun 2005 14:49:56 -0400
>Reply-To: nv-l@lists.us.ibm.com
>
>
>I agree with Bill. The timeouts and retries are your best bet for tuning
>out false alarms. Depending on your network, it may be the retries rather
>than the timeouts that work best for you. Say 5 retries with a timeout of
>2, if pings are getting lost.
>
>Cordially,
>
>Leslie A. Clark
>IBM Global Services - Systems Mgmt & Networking
>(248) 552-4968 Voicemail, Fax, Pager
>
>"Evans, Bill" <Bill.Evans@hq.doe.gov>
>Sent by: owner-nv-l@lists.us.ibm.com
>06/23/2005 09:11 PM
>Please respond to: nv-l
>To: "'nv-l@lists.us.ibm.com'" <nv-l@lists.us.ibm.com>
>cc:
>Subject: RE: [nv-l] Status Polling
>
>I've done it. Not hard at all but expensive. Demand Poll takes a lot of
>cycles. This script is executed out of the ESE.Automation when an event
>indicating a failed poll is received. A ruleset kicks it off as a
>background action.
>
>goshawk2#cat RouterDP.sh
>#!/bin/ksh
>Hostname=${1}
>Date=`date`
>echo ${Date} function off >>/opt/webmon/RouterDP.log
>/usr/OV/bin/nmdemandpoll ${Hostname} >>/opt/webmon/RouterDP.log &
>
>One problem is that SNMP doesn't really have any better priority or
>architectural power than ICMP. I actually used the process when SNMP
>polling had a problem with late arriving responses on a slow and
>overloaded processor. It's an architectural fact that ICMP and SNMP are
>low priority and allowed to be thrown away. NetView compensates by its
>geometrically increasing waits on retries and the ability to customize
>retries and wait time by device.
>
>I quit using the script once we had the problem figured out. The overhead
>of Demand Poll actually made things a bit worse.
>
>I'd go for solving the root cause. Manipulate the timeouts and retries
>for ICMP. Make sure your NetView box has enough resources. Check the
>delays at the routers and switches to see if there's a bad card tying up
>traffic. Etc.
>
>The other alternative is to look into the IBM Tivoli Switch Analyzer. It
>automates the follow-up of failed polls and its slightly delayed follow-up
>to the failed ICMP often clears the condition.
>
>Using an inline action is a VERY BAD idea. Your entire rules processing
>waits for the demand poll to finish. The system can totally bog down;
>note that my background script spins the demand poll off as an independent
>process because it was single threading the background action processing.
>
>
>
>Bill Evans
>
>-----Original Message-----
>From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On
>Behalf Of Kumar Vanka
>Sent: Thursday, June 23, 2005 8:48 PM
>To: nv-l@lists.us.ibm.com
>Subject: [nv-l] Status Polling
>
>I'm using ICMP for status polling in our environment. However, due to
>several factors, we're getting many false positives. One of these factors
>is that ICMP has a low priority in our environment. Is it possible to
>configure netmon so that if the ICMP status poll shows that a node is
>down, it can then do a demand poll using SNMP?
>
>Based on my research, it appears this is not possible. So, I'm considering
>modifying my ruleset to use an inline action to run nmdemandpoll. Is this
>a good option? Or, are there other options that I'm not considering?
>
>Thanks.
>
>- Kumar Vanka
>ESM Architect
>Invenio, Inc.
>