RE: Confirmation of Netview pinging

To: nv-l@lists.tivoli.com
Subject: RE: Confirmation of Netview pinging
From: "Treptow, Craig" <Treptow.Craig@principal.com>
Date: Mon, 4 Jun 2001 10:58:10 -0500
Thanks for the extra info Leslie.  Regarding DNS performance, I run a secondary 
DNS on the Netview machine for the reverse address space only.  Reverse lookups 
consistently respond in 3-4 ms, while forward lookups generally take 10-20 ms.

I let this run all weekend and tried the script again this morning.  I didn't 
get any output from it until I increased the sleep to 60 seconds.  When I did 
this I got:

Netmon is  4335 behind in status pinging

I have configuration polling set to 1 day, with 11 OIDs to include and 6 OIDs 
to exclude in the seed file.

Thanks.

Craig

> -----Original Message-----
> From: Leslie Clark [mailto:lclark@US.IBM.COM]
> Sent: Saturday, June 02, 2001 3:56 PM
> To: IBM NetView Discussion
> Subject: RE: [NV-L] Confirmation of Netview pinging
> 
> 
> The 20 seconds is how long it takes netmon to get around to responding
> to your request that he dump the report.  The script counts the number
> of records, such as the one below, that were scheduled for times that
> have already passed. Note that near the top of the netmon.trace they
> have negatives, and hopefully, at the bottom, there will be some with
> positive numbers. The positive numbers tell you that this node is
> scheduled to be pinged that many seconds in the future. Or some such
> time unit. That looks like you have some nodes that have not been
> pinged in hours, so maybe the units are not seconds.
> 
> -21138: 162.131.115.1 (tower1-feth-5-0.net.principal.com) ...
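
The sign convention described above can be checked against a dump with
something like the following. This is only a sketch: it assumes every
schedule record begins with a signed integer offset followed by a colon,
as in the sample record, and it makes no assumption about the time unit.
The name summarize_trace is made up for illustration.

```shell
#!/bin/sh
# summarize_trace FILE
# Hypothetical helper: counts records whose leading offset is negative
# (already overdue) or non-negative (still pending), and reports the most
# negative offset seen. Lines that do not start with "<integer>:" are ignored.
summarize_trace() {
    awk -F: '
        /^-?[0-9]+:/ {
            offset = $1 + 0
            if (offset < 0) {
                overdue++
                if (offset < worst) worst = offset
            } else {
                pending++
            }
        }
        END {
            printf "overdue=%d pending=%d most_overdue=%d\n", overdue, pending, worst
        }
    ' "$1"
}

# Intended use (not run here):
#   summarize_trace /usr/OV/log/netmon.trace
```

The overdue count should match what the grep-based count reports when the
trace contains only schedule records; most_overdue shows how far behind the
worst entry is, in whatever unit netmon uses.
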
> 
> When you first start netmon up, it is always behind. It has a lot to do
> at startup. Try letting it run for half an hour or so, and see if it
> catches up. What it says at startup is not something I worry about. It
> is the steady-state behavior that you want to tune for. I suspect that
> you do have some performance issues to deal with, though, because it
> sort of looks from this as if it is never catching up. Which is why you
> are writing, I guess. What does it say after it has been running for an
> hour? How often are you sending it off to do the configuration poll
> (which defaults to once a day)? How good is your name resolution? Oh -
> and how many names are in your seedfile? All of them? I have seen that
> delay netmon startup by half an hour unnecessarily.
> 
> Cordially,
> 
> Leslie A. Clark
> IBM Global Services - Systems Mgmt & Networking
> Detroit
> 
> "Treptow, Craig" <Treptow.Craig@principal.com>@tkg.com on 06/01/2001
> 08:58:30 AM
> 
> Please respond to IBM NetView Discussion <nv-l@tkg.com>
> 
> Sent by:  owner-nv-l@tkg.com
> 
> 
> To:   "'IBM NetView Discussion'" <nv-l@tkg.com>
> cc:
> Subject:  RE: [NV-L] Confirmation of Netview pinging
> 
> 
> 
> Thanks Leslie!
> 
> Just to clarify, 1700 was not the number of interfaces, it was the
> number of hubs/routers/switches/servers.  I used the script you
> provided and had to increase the sleep time to 20 seconds.  Anything
> less just resulted in the "Netmon is too busy" message.  At 20 seconds,
> I typically get:
> 
> Netmon is  3427 behind in status pinging
> 
> I don't really understand what I'm looking at, though.
> 
> The box itself is currently located on one of the backbone switches.  I
> haven't taken any other action, because I wanted to understand what I
> was looking at.  In netmon.trace I see things such as:
> 
> 1043: 162.131.203.57 () list = 0x202aa858
> or
> -21138: 162.131.115.1 (tower1-feth-5-0.net.principal.com) list = 0x202aa7b8
> 
> Can you explain these any more?
> 
> Thanks!
> 
> Craig
> 
> > -----Original Message-----
> > From: Leslie Clark [mailto:lclark@us.ibm.com]
> > Sent: Wednesday, May 30, 2001 8:04 PM
> > To: IBM NetView Discussion
> > Subject: Re: [NV-L] Confirmation of Netview pinging
> >
> >
> > A couple of things.
> >
> > Remember your normal response is 40ms, not 1 sec. Yes, it will take a
> > while to make the rounds if everything is down. But I hope your normal
> > state is that everything is up.
> >
> > The number of outstanding pings is configurable. I think the current
> > default is 16 (it was 10, years ago). It has been tested at up to 64.
> > That means it can send off pings to 64 nodes at once, and as they
> > respond, send out more. That number is the number of nodes it can be
> > waiting on at one time (waiting an average of 40ms, you say). Set it
> > in /usr/OV/lrf/netmon.lrf, adding the '-q' parameter. Use -q 32 to set
> > the ping queue, and -Q 32 to set the snmp request queue. Experiment to
> > see if you have the CPU and interface speed to back it up. I have
> > never seen it overrun the adapter, but I have seen it use up all of
> > the cpu.
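
As a back-of-the-envelope illustration of why the queue size matters: if
every node answers and the queue is kept full, one pass takes roughly
devices x response time / queue size. The sketch below is an idealized
estimate only, not a description of netmon's actual scheduler. It ignores
timeouts, retries, name resolution and CPU limits, and the function name
cycle_seconds is hypothetical.

```shell
#!/bin/sh
# cycle_seconds DEVICES RESPONSE_MS QUEUE
# Idealized estimate: seconds for one full ping pass when QUEUE pings are
# always outstanding and each one completes in RESPONSE_MS milliseconds.
cycle_seconds() {
    awk -v n="$1" -v ms="$2" -v q="$3" \
        'BEGIN { printf "%.3f\n", n * ms / 1000 / q }'
}

cycle_seconds 1700 40 10   # 1700 devices, 40 ms, queue of 10
cycle_seconds 1700 40 16   # same, queue of 16
cycle_seconds 1700 40 32   # same, queue of 32 (as with -q 32)
```

Under these assumptions the pass takes a few seconds rather than minutes,
which is why unreachable nodes (each holding a queue slot for the full
timeout and retries) and slow name resolution tend to dominate in practice.
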
> >
> > 1700 interfaces is not a lot. You should be able to handle that in 5
> > minutes easily on just about any box, using the default timeout/retry
> > of 2 and 3. Some caveats: If you have a lot of unpingable interfaces
> > in your map, clear them up. They clog up the ping queue (or increase
> > the ping queue). Acknowledged counts, too, since they are still
> > pinged. Make sure your name resolution method is really fast. Slow
> > name resolution slows everything down more than you would expect. If
> > you are having problems with false alarms, make note of them and tune
> > them individually to accommodate normal variations in the network,
> > rather than increase the timeout across the board. Make sure your box
> > is centrally located in the network, with the most reliable connection
> > available, and make sure that connection is running at full-duplex if
> > the connection supports it.
> >
> > Here's a little script to help you monitor how well netmon is keeping
> > up with the status polling. See how fast it catches up when it gets
> > behind.
> >
> > #!/bin/ksh
> > #
> > # pingstatus.sh
> > #
> > # A script to check whether netmon can keep up with the polling
> > # frequency scheduled. Can be called from the Reports menu.
> > # Output: a message to stdout
> > # Note: not reliable if netmon tracing is going on!
> > #
> > #set -x
> > # Remove any old trace so that a fresh file shows netmon has answered.
> > rm -f /usr/OV/log/netmon.trace
> > # Ask netmon to dump its report into netmon.trace.
> > netmon -a 12
> > sleep 3
> > if [ -f /usr/OV/log/netmon.trace ]; then
> >   # Records with a negative number are already overdue.
> >   echo "Netmon is " `grep '[-].*[:]' /usr/OV/log/netmon.trace | wc -l` \
> >       "behind in status pinging";
> > else
> >   echo "Netmon is too busy to report now. Try later."
> > fi
> > exit
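
Because the fixed sleep has to be guessed, one alternative is to poll for
the trace file until a deadline. The following is a sketch of that idea
rather than a tested replacement for the script above: wait_for_file and
the 60-second limit are illustrative, and it relies only on the behavior
already described (netmon -a 12 eventually writes
/usr/OV/log/netmon.trace).

```shell
#!/bin/sh
# wait_for_file PATH MAX_SECONDS
# Hypothetical helper: returns 0 as soon as PATH exists, or 1 if it has
# not appeared after roughly MAX_SECONDS seconds of one-second checks.
wait_for_file() {
    waited=0
    while [ ! -f "$1" ]; do
        [ "$waited" -ge "$2" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Intended use (not run here), mirroring the original script:
#   rm -f /usr/OV/log/netmon.trace
#   netmon -a 12
#   if wait_for_file /usr/OV/log/netmon.trace 60; then
#       ...count the overdue records...
#   else
#       echo "Netmon is too busy to report now. Try later."
#   fi
```

Note that the file existing does not guarantee netmon has finished writing
it, so a short extra pause before counting may still be sensible.
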
> >
> >
> > Cordially,
> >
> > Leslie A. Clark
> > IBM Global Services - Systems Mgmt & Networking
> > Detroit
> >
> >
> > "Treptow, Craig" <Treptow.Craig@principal.com>@tkg.com on 05/30/2001
> > 04:39:22 PM
> >
> > Please respond to IBM NetView Discussion <nv-l@tkg.com>
> >
> > Sent by:  owner-nv-l@tkg.com
> >
> >
> > To:   "NetView List (E-mail)" <nv-l@tkg.com>
> > cc:
> > Subject:  [NV-L] Confirmation of Netview pinging
> >
> >
> >
> > Hi.  We are running Netview 6.0.2 on AIX 4.3.  We want to move to a
> > more proactive approach to problem notifications.  Our hope is to ping
> > servers/hubs/switches/routers and generate events when they aren't
> > reachable.  This would make use of the Netview features to reduce the
> > "noisy" pages, etc.  In preparation for this, I was running some
> > numbers and would like some input to see if I am flawed somewhere:
> >
> > Average response time for pings = 40ms (includes LAN and WAN)
> > Total devices to ping = 1700 (and growing at about 30 per month)
> > # outstanding pings = 10 (Is this true?  Does it affect my numbers?
> >   If so, how?)
> > Retries = 0
> > Timeout = 1 sec
> > One Netview machine.
> >
> > Netview could only ping 2 devices per second for a total of 120 per
> > minute.  1700 / 120 = 14 minutes to complete one ping cycle.
> >
> > So this would mean that using this method, we would only find out
> > about a down device after 14 minutes at best?  I don't think anybody
> > would accept this long of a window.
> >
> > Assuming the above is true, it appears that it is time for us to look
> > into a different Netview architecture that could achieve our goals?
> >
> > I'm just looking for some insight into how Netview pings and if my
> > numbers are even reasonable, etc.  Thanks for any help you can provide.
> >
> > Craig
> >
> > P.S. I have searched the archives, but there appear to be many open
> > questions on this topic.  Also, no form of netmon -a ?, or any other
> > flag produced output in the netmon.trace file.
> > _________________________________________________________________________
> > NV-L List information and Archives: http://www.tkg.com/nv-l


Archive operated by Skills 1st Ltd
