nv-l

RE: Confirmation of Netview pinging

To: nv-l@lists.tivoli.com
Subject: RE: Confirmation of Netview pinging
From: "Treptow, Craig" <Treptow.Craig@principal.com>
Date: Fri, 1 Jun 2001 07:58:30 -0500
Thanks Leslie!

Just to clarify, 1700 was not the number of interfaces, it was the number of 
hubs/routers/switches/servers.  I used the script you provided and had to 
increase the sleep time to 20 seconds.  Anything less just resulted in the 
"Netmon is too busy" message.  At 20 seconds, I typically get:

Netmon is  3427 behind in status pinging

I don't really understand what I'm looking at, though. 

The box itself is currently located on one of the backbone switches.  I haven't 
taken any other action, because I wanted to understand what I was looking at.  
In netmon.trace I see things such as:

1043: 162.131.203.57 () list = 0x202aa858 
or
-21138: 162.131.115.1 (tower1-feth-5-0.net.principal.com) list = 0x202aa7b8

Can you explain these any more?

Thanks!

Craig

> -----Original Message-----
> From: Leslie Clark [mailto:lclark@us.ibm.com]
> Sent: Wednesday, May 30, 2001 8:04 PM
> To: IBM NetView Discussion
> Subject: Re: [NV-L] Confirmation of Netview pinging
> 
> 
> A couple of things.
> 
> Remember your normal response is 40ms, not 1 sec. Yes, it 
> will take a while
> to make the rounds if everything is down. But I hope your 
> normal state is
> that everything is up.
> 
> The number of outstanding pings is configurable. I think the current
> default
> is 16 (it was 10, years ago). It has been tested at up to 64. 
> That means it
> can send off pings to 64 nodes at once, and as they respond, 
> send out more.
> That number is the number of nodes it can be waiting on at one time
> (waiting
> an average of 40ms, you say). Set it in 
> /usr/OV/lrf/netmon.lrf, adding the
> '-q' parameter. Use -q 32 to set the ping queue, and -Q 32 to 
> set the snmp
> request queue. Experiment to see if you have the CPU and 
> interface speed to
> back it up. I have never seen it overrun the adapter, but I 
> have seen it
> use
> up all of the cpu.
> 
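
A rough sketch of the queue arithmetic described above, for anyone who wants to plug in their own numbers. It assumes an idealized model that is not from the netmon documentation: the ping queue stays full the whole time, so one pass takes about devices * wait / queue seconds. CPU load, name resolution and SNMP polling are ignored, and the helper name cycle_seconds is made up for illustration.

```shell
#!/bin/sh
# Idealized estimate only: assumes the ping queue is always full and
# ignores CPU load, name resolution and SNMP polling.
# usage: cycle_seconds DEVICES WAIT_SECONDS QUEUE_SIZE
cycle_seconds() {
  awk -v n="$1" -v w="$2" -v q="$3" 'BEGIN { printf "%.2f\n", n * w / q }'
}

cycle_seconds 1700 0.04 16   # every node answers in 40 ms, queue of 16
cycle_seconds 1700 0.04 64   # same replies, queue of 64
cycle_seconds 1700 1 10      # every node down, 1 s timeout, no retries, queue of 10
```

Under this model the wait per node, not the node count, dominates: the all-down case is far slower than the all-up case at any queue size.
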
> 1700 interfaces is not a lot. You should be able to handle that in 5
> minutes
> easily on just about any box, using the default timeout/retry 
> of 2 and 3.
> Some caveats: If you have a lot of unpingable interfaces in 
> your map, clear
> them up. They clog up the ping queue (or increase the ping queue).
> Acknowledged counts, too, since they are still pinged. Make 
> sure your name
> resolution method is really fast. That slows everything down 
> more than you
> would expect. If you are having problems with false alarms, 
> make note of
> them
> and tune them individually to accommodate normal variations in 
> the network,
> rather than increase the timeout across the board. Make sure 
> your box is
> centrally located in the network, with the most reliable connection
> available,
> and make sure that connection is running at full-duplex if 
> the connection
> supports it.
> 
> Here's a little script to help you monitor how well netmon is keeping
> up with the status polling. See how fast it catches up when 
> it gets behind.
> 
> #!/bin/ksh
> #
> # pingstatus.sh
> #
> # A script to check whether netmon can keep up with the polling
> # frequency scheduled. Can be called from the Reports menu.
> # Output: a message to stdout
> # Note: not reliable if netmon tracing is going on!
> #
> #set -x
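> # Remove any old trace; -f keeps rm quiet if the file is not there.
> rm -f /usr/OV/log/netmon.trace
> # Ask netmon to write its status-poll list to the trace file.
> netmon -a 12
> sleep 3
> if [ -f /usr/OV/log/netmon.trace ]; then
>   # Count the entries that have a '-' before the ':' (pattern quoted
>   # so the shell cannot expand it as a filename glob).
>   echo "Netmon is " `grep -c '[-].*[:]' /usr/OV/log/netmon.trace` \
>       "behind in status pinging";
> else
>   echo "Netmon is too busy to report now. Try later."
> fi
> exit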
> 
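
To watch how fast netmon catches up, the check above can be repeated on a timer. This is only a sketch built on the script above: the function names count_behind and watch_backlog are made up for illustration, the trace path and the "netmon -a 12" call are taken from that script, and the wait after the dump is a parameter because 3 seconds may not be enough on a busy box.

```shell
#!/bin/sh
# count_behind TRACEFILE
# Counts trace lines with a '-' before a ':', the same pattern
# pingstatus.sh uses to decide that an entry is behind.
count_behind() {
  grep -c '[-].*[:]' "$1"
}

# watch_backlog WAIT_SECONDS INTERVAL_SECONDS SAMPLES
# Assumption: netmon writes /usr/OV/log/netmon.trace within WAIT_SECONDS
# of "netmon -a 12", as in pingstatus.sh.
watch_backlog() {
  trace=/usr/OV/log/netmon.trace
  i=0
  while [ "$i" -lt "$3" ]; do
    rm -f "$trace"
    netmon -a 12
    sleep "$1"
    if [ -f "$trace" ]; then
      echo "$(date '+%H:%M:%S') behind: $(count_behind "$trace")"
    else
      echo "$(date '+%H:%M:%S') netmon too busy to report"
    fi
    i=$((i + 1))
    sleep "$2"
  done
}
```

For example, "watch_backlog 20 60 10" would take ten samples about a minute apart, waiting 20 seconds for each trace to appear. Like the original script, it is not reliable while other netmon tracing is going on.
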
> 
> Cordially,
> 
> Leslie A. Clark
> IBM Global Services - Systems Mgmt & Networking
> Detroit
> 
> 
> "Treptow, Craig" <Treptow.Craig@principal.com>@tkg.com on 05/30/2001
> 04:39:22 PM
> 
> Please respond to IBM NetView Discussion <nv-l@tkg.com>
> 
> Sent by:  owner-nv-l@tkg.com
> 
> 
> To:   "NetView List (E-mail)" <nv-l@tkg.com>
> cc:
> Subject:  [NV-L] Confirmation of Netview pinging
> 
> 
> 
> Hi.  We are running Netview 6.0.2 on AIX 4.3.  We are wanting 
> to move to a
> more proactive approach to problem notifications.  Our hope is to ping
> servers/hubs/switches/routers and generate events when they aren't
> reachable.  This would make use of the Netview features to reduce the
> "noisy" pages, etc.  In preparation for this, I was running 
> some numbers
> and would like some input to see if I am flawed somewhere:
> 
> Average response time for pings = 40ms (includes LAN and WAN)
> Total devices to ping 1700. (and growing at about 30 per month)
> # outstanding pings = 10 (Is this true?  Does it affect my 
> numbers?  If so,
> how?)
> Retries = 0
> Timeout = 1 sec
> One Netview machine.
> 
> Netview could only ping 2 devices per second for a total of 
> 120 per minute.
> 1700 / 120 = 14 minutes to complete one ping cycle.
> 
> So this would mean that using this method, we would only find 
> out about a
> down device after 14 minutes at best?  I don't think anybody 
> would accept
> this long of a window.
> 
> Assuming the above is true, it appears that it is time for 
> us to look into
> a different Netview architecture that could achieve our goals?
> 
> I'm just looking for some insight into how Netview pings and 
> if my numbers
> are even reasonable, etc.  Thanks for any help you can provide.
> 
> Craig
> 
> P.S. I have searched the archives, but there appear to be many open
> questions on this topic.  Also, no form of netmon -a ?, or 
> any other flag
> produced output in the netmon.trace file.
> ______________________________________________________________
> ___________
> NV-L List information and Archives: http://www.tkg.com/nv-l
> 


Archive operated by Skills 1st Ltd

See also: The NetView Web