RE: [nv-l] Status Polling

To: <nv-l@lists.us.ibm.com>
Subject: RE: [nv-l] Status Polling
From: "Barr, Scott" <Scott_Barr@csgsystems.com>
Date: Fri, 24 Jun 2005 15:27:33 -0500
Delivery-date: Fri, 24 Jun 2005 21:28:48 +0100
Envelope-to: nv-l-archive@lists.skills-1st.co.uk
Reply-to: nv-l@lists.us.ibm.com
Sender: owner-nv-l@lists.us.ibm.com
Thread-index: AcV48tiUCHq/pieMQf+093VVOpm3vgAB9xKQ
Thread-topic: [nv-l] Status Polling
One warning about retries.

Each time you retry the SNMP request or ping, netmon appears to double
the timeout value. So, if you set 7 retries with a 1-second timeout, you
get timeout values of 1, 2, 4, 8, 16, 32 and 64 seconds.

If you have a lot of nodes this way, that can cause more issues than it
solves.
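
To put a number on that, here is a minimal sketch of the arithmetic,
assuming the doubling behaviour described above still holds. The
total_wait function is a hypothetical helper for illustration only; it
is not part of NetView and does not read netmon's configuration.

```shell
#!/bin/sh
# Sum the per-attempt timeouts when each attempt doubles the previous one.
# total_wait is a hypothetical helper, not a NetView command.
total_wait() {
    timeout=$1    # initial timeout in seconds
    attempts=$2   # number of attempts made before giving up
    total=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        total=$((total + timeout))
        timeout=$((timeout * 2))
        i=$((i + 1))
    done
    echo "$total"
}

total_wait 1 7   # 1+2+4+8+16+32+64
total_wait 2 5   # 2+4+8+16+32
```

Under that assumption, 7 retries starting at 1 second add up to 127
seconds of waiting on a single unreachable node, and the "5 retries with
a timeout of 2" setting suggested in the quoted message below adds up to
62 seconds.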

One caveat: I've been doing TEC/Framework/ITM for a while, so the way
netmon behaves may have changed since I last worked with it.

-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com]
On Behalf Of Kumar Vanka
Sent: Friday, June 24, 2005 2:28 PM
To: nv-l@lists.us.ibm.com; nv-l@lists.us.ibm.com
Subject: RE: [nv-l] Status Polling

Thanks, Leslie, Bill.

I'll try your suggestions and let you all know.

- Kumar Vanka
Enterprise Architect
Invenio, Inc.

>-- Original Message --
>To: nv-l@lists.us.ibm.com
>Subject: RE: [nv-l] Status Polling
>From: Leslie Clark <lclark@us.ibm.com>
>Date: Fri, 24 Jun 2005 14:49:56 -0400
>Reply-To: nv-l@lists.us.ibm.com
>
>
>I agree with Bill. The timeouts and retries are your best bet for tuning
>out false alarms. Depending on your network, it may be the retries rather
>than the timeouts that work best for you. Say 5 retries with a timeout of
>2, if pings are getting lost.
>
>Cordially,
>
>Leslie A. Clark
>IBM Global Services - Systems Mgmt & Networking
>(248) 552-4968 Voicemail, Fax, Pager
>
>
>
>
>"Evans, Bill" <Bill.Evans@hq.doe.gov> 
>Sent by: owner-nv-l@lists.us.ibm.com
>06/23/2005 09:11 PM
>Please respond to: nv-l
>To: "'nv-l@lists.us.ibm.com'" <nv-l@lists.us.ibm.com>
>Subject: RE: [nv-l] Status Polling
>
>I've done it.  Not hard at all but expensive.  Demand Poll takes a lot of
>cycles.  This script is executed out of the ESE.Automation when an event
>indicating a failed poll is received.  A ruleset kicks it off as a
>background action.
> 
>goshawk2#cat RouterDP.sh 
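>#!/bin/ksh
># Run a NetView demand poll for the node passed as the first argument.
>Hostname=${1}
>Date=`date`
># Log the time of the request, then start the poll in the background
># so the ruleset action returns immediately.
>echo ${Date} function off >>/opt/webmon/RouterDP.log
>/usr/OV/bin/nmdemandpoll ${Hostname} >>/opt/webmon/RouterDP.log &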
> 
>One problem is that SNMP doesn't really have any better priority or
>architectural power than ICMP.  I actually used the process when SNMP
>polling had a problem with late arriving responses on a slow and
>overloaded processor.  It's an architectural fact that ICMP and SNMP are
>low priority and allowed to be thrown away.  NetView compensates by its
>geometrically increasing waits on retries and the ability to customize
>retries and wait time by device.
> 
>I quit using the script once we had the problem figured out.  The overhead
>of Demand Poll actually made things a bit worse.
> 
>I'd go for solving the root cause.  Manipulate the timeouts and retries
>for ICMP.  Make sure your NetView box has enough resources.  Check the
>delays at the routers and switches to see if there's a bad card tying up
>traffic.  Etc.
> 
>The other alternative is to look into the IBM Tivoli Switch Analyzer.  It
>automates the follow-up of failed polls and its slightly delayed follow-up
>to the failed ICMP often clears the condition.
> 
>Using an inline action is a VERY BAD idea.  Your entire rules processing
>waits for the demand poll to finish.  The system can totally bog down;
>note that my background script spins the demand poll off as an independent
>process because it was single threading the background action processing.
> 
> 
> 
>Bill Evans
> 
>-----Original Message-----
>From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com]
>On Behalf Of Kumar Vanka
>Sent: Thursday, June 23, 2005 8:48 PM
>To: nv-l@lists.us.ibm.com
>Subject: [nv-l] Status Polling
> 
>I'm using ICMP for status polling in our environment. However, due to
>several factors, we're getting many false positives. One of these factors
>is that ICMP has a low priority in our environment. Is it possible to
>configure netmon so that if the ICMP status poll shows that a node is
>down, it can then do a demand poll using SNMP?
> 
>Based on my research, it appears this is not possible. So, I'm considering
>modifying my ruleset to use an inline action to run nmdemandpoll. Is this
>a good option? Or, are there other options that I'm not considering?
> 
>Thanks.
> 
>- Kumar Vanka
>ESM Architect
>Invenio, Inc.
> 




