Thanks for the extra info, Leslie. Regarding DNS performance, I run a secondary
DNS on the Netview machine for the reverse address space only. Reverse lookups
consistently respond in 3-4 ms, while forward lookups generally take 10-20 ms.
I let this run all weekend and tried the script again this morning. I didn't
get any output from it until I increased the sleep to 60 seconds. When I did
this I got:
Netmon is 4335 behind in status pinging
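Rather than guessing a fixed sleep, the check could poll for the trace file until a deadline. This is only a minimal sketch: wait_for_file is a hypothetical helper, not part of NetView, and it assumes that a non-empty /usr/OV/log/netmon.trace means netmon has answered. That does not prove netmon has finished writing the dump, so the count may still come out low on a busy system.

```shell
#!/bin/ksh
# wait_for_file PATH MAX_SECONDS   (hypothetical helper, not a NetView command)
# Poll once a second until PATH exists and is non-empty, or give up.
# Returns 0 if the file appeared in time, 1 otherwise.
wait_for_file() {
    file=$1
    limit=$2
    waited=0
    while [ "$waited" -lt "$limit" ]; do
        if [ -s "$file" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # One last look after the final sleep.
    [ -s "$file" ]
}

# Intended use in place of the fixed sleep (the 120 second limit is arbitrary):
#   rm -f /usr/OV/log/netmon.trace
#   netmon -a 12
#   if wait_for_file /usr/OV/log/netmon.trace 120; then ...count records...; fi
```

A deadline like this would have reported after however long netmon actually needed, instead of failing at 20 seconds and succeeding at 60.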
I have configuration polling set to 1 day, with 11 OIDs to include and 6 OIDs
to exclude in the seed file.
Thanks.
Craig
> -----Original Message-----
> From: Leslie Clark [mailto:lclark@US.IBM.COM]
> Sent: Saturday, June 02, 2001 3:56 PM
> To: IBM NetView Discussion
> Subject: RE: [NV-L] Confirmation of Netview pinging
>
>
> The 20 seconds is how long it takes netmon to get around to responding
> to your request that he dump the report. The script counts the number of
> records, such as the one below, that were scheduled for times that have
> already passed. Note that near the top of the netmon.trace they have
> negatives, and hopefully, at the bottom, there will be some with positive
> numbers. The positive numbers tell you that this node is scheduled to be
> pinged that many seconds in the future. Or some such time unit. That looks
> like you have some nodes that have not been pinged in hours, so maybe
> the units are not seconds.
>
> -21138: 162.131.115.1 (tower1-feth-5-0.net.principal.com) ...
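To see that split at a glance, the trace can be summarized by the sign of the leading number. A rough sketch, assuming every schedule record starts with an optional minus sign, digits and a colon, as in the samples quoted in this thread. Any other lines in netmon.trace are ignored, and the number is left uninterpreted since the unit is not certain to be seconds. trace_summary is a made-up name, not a NetView tool.

```shell
# trace_summary [FILE...]   (hypothetical helper; reads stdin if no file given)
# Counts records whose leading number is negative (already overdue) or
# non-negative (still scheduled), and reports the most negative value seen.
trace_summary() {
    awk '
        /^[ \t]*-?[0-9]+:/ {
            n = $1              # e.g. "-21138:"
            sub(/:.*/, "", n)   # strip the colon and anything after it
            n += 0              # force numeric
            if (n < 0) { overdue++; if (n < worst) worst = n }
            else       { pending++ }
        }
        END {
            printf "overdue=%d pending=%d worst=%d\n", overdue, pending, worst
        }
    ' "$@"
}
```

Run as `trace_summary /usr/OV/log/netmon.trace`; a steadily shrinking overdue count between runs would suggest netmon is catching up.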
>
> When you first start netmon up, it is always behind. It has a lot to do at
> startup. Try letting it run for half an hour or so, and see if it catches
> up. What it says at startup is not something I worry about. It is the
> steady-state behavior that you want to tune for. I suspect that you do have
> some performance issues to deal with, though, because it sort of looks from
> this as if it is never catching up. Which is why you are writing, I guess.
> What does it say after it has been running for an hour? How often are you
> sending it off to do the configuration poll (which defaults to once a day)?
> How good is your name resolution? Oh - and how many names are in your
> seedfile? All of them? I have seen that delay netmon startup by half an
> hour unnecessarily.
>
> Cordially,
>
> Leslie A. Clark
> IBM Global Services - Systems Mgmt & Networking
> Detroit
>
> "Treptow, Craig" <Treptow.Craig@principal.com>@tkg.com on 06/01/2001
> 08:58:30 AM
>
> Please respond to IBM NetView Discussion <nv-l@tkg.com>
>
> Sent by: owner-nv-l@tkg.com
>
>
> To: "'IBM NetView Discussion'" <nv-l@tkg.com>
> cc:
> Subject: RE: [NV-L] Confirmation of Netview pinging
>
>
>
> Thanks Leslie!
>
> Just to clarify, 1700 was not the number of interfaces, it was the number
> of hubs/routers/switches/servers. I used the script you provided and had
> to increase the sleep time to 20 seconds. Anything less just resulted in
> the "Netmon is too busy" message. At 20 seconds, I typically get:
>
> Netmon is 3427 behind in status pinging
>
> I don't really understand what I'm looking at, though.
>
> The box itself is currently located on one of the backbone switches. I
> haven't taken any other action, because I wanted to understand what I was
> looking at. In netmon.trace I see things such as:
>
> 1043: 162.131.203.57 () list = 0x202aa858
> or
> -21138: 162.131.115.1 (tower1-feth-5-0.net.principal.com) list = 0x202aa7b8
>
> Can you explain these any more?
>
> Thanks!
>
> Craig
>
> > -----Original Message-----
> > From: Leslie Clark [mailto:lclark@us.ibm.com]
> > Sent: Wednesday, May 30, 2001 8:04 PM
> > To: IBM NetView Discussion
> > Subject: Re: [NV-L] Confirmation of Netview pinging
> >
> >
> > A couple of things.
> >
> > Remember your normal response is 40ms, not 1 sec. Yes, it will take a
> > while to make the rounds if everything is down. But I hope your normal
> > state is that everything is up.
> >
> > The number of outstanding pings is configurable. I think the current
> > default is 16 (it was 10, years ago). It has been tested at up to 64.
> > That means it can send off pings to 64 nodes at once, and as they respond,
> > send out more. That number is the number of nodes it can be waiting on at
> > one time (waiting an average of 40ms, you say). Set it in
> > /usr/OV/lrf/netmon.lrf, adding the '-q' parameter. Use -q 32 to set the
> > ping queue, and -Q 32 to set the snmp request queue. Experiment to see if
> > you have the CPU and interface speed to back it up. I have never seen it
> > overrun the adapter, but I have seen it use up all of the cpu.
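As a back-of-the-envelope check on the queue size: if every node answers and netmon always keeps the queue full, one pass takes roughly nodes * response time / queue size. This is an idealized model of my own, not a documented netmon formula, and it ignores CPU load, name resolution, SNMP polls and timeouts. cycle_seconds is a hypothetical helper.

```shell
# cycle_seconds NODES RESPONSE_MS QUEUE   (hypothetical helper)
# Idealized time in seconds for one status pass when all nodes respond
# and QUEUE pings are always outstanding.
cycle_seconds() {
    awk -v n="$1" -v ms="$2" -v q="$3" 'BEGIN {
        if (q <= 0) exit 1               # a queue of zero makes no sense
        printf "%.2f\n", n * ms / 1000 / q
    }'
}
```

With the figures from this thread, `cycle_seconds 1700 40 16` prints 4.25 and `cycle_seconds 1700 40 10` prints 6.80. Under these assumptions a pass with everything up fits easily inside a 5-minute polling interval, while a pass with many nodes down would be dominated by timeouts instead.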
> >
> > 1700 interfaces is not a lot. You should be able to handle that in 5
> > minutes easily on just about any box, using the default timeout/retry
> > of 2 and 3. Some caveats: If you have a lot of unpingable interfaces in
> > your map, clear them up. They clog up the ping queue (or increase the
> > ping queue). Acknowledged ones count, too, since they are still pinged.
> > Make sure your name resolution method is really fast. That slows
> > everything down more than you would expect. If you are having problems
> > with false alarms, make note of them and tune them individually to
> > accommodate normal variations in the network, rather than increase the
> > timeout across the board. Make sure your box is centrally located in the
> > network, with the most reliable connection available, and make sure that
> > connection is running at full-duplex if the connection supports it.
> >
> > Here's a little script to help you monitor how well netmon is keeping
> > up with the status polling. See how fast it catches up when it gets
> > behind.
> >
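> > #!/bin/ksh
> > #
> > # pingstatus.sh
> > #
> > # A script to check whether netmon can keep up with the polling
> > # frequency scheduled. Can be called from the Reports menu.
> > # Output: a message to stdout
> > # Note: not reliable if netmon tracing is going on!
> > #
> > #set -x
> > # Remove any old trace so a stale file is not counted (-f: no error if absent).
> > rm -f /usr/OV/log/netmon.trace
> > # Ask netmon to dump its report to netmon.trace.
> > netmon -a 12
> > # Give netmon time to respond; increase this on a busy system.
> > sleep 3
> > if [ -f /usr/OV/log/netmon.trace ]; then
> >   # Count records with a negative number, i.e. polls already overdue.
> >   # The pattern is quoted so the shell cannot expand it as a filename glob.
> >   echo "Netmon is" $(grep -c '[-].*[:]' /usr/OV/log/netmon.trace) \
> >        "behind in status pinging"
> > else
> >   echo "Netmon is too busy to report now. Try later."
> > fi
> > exit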
> >
> >
> > Cordially,
> >
> > Leslie A. Clark
> > IBM Global Services - Systems Mgmt & Networking
> > Detroit
> >
> >
> > "Treptow, Craig" <Treptow.Craig@principal.com>@tkg.com on 05/30/2001
> > 04:39:22 PM
> >
> > Please respond to IBM NetView Discussion <nv-l@tkg.com>
> >
> > Sent by: owner-nv-l@tkg.com
> >
> >
> > To: "NetView List (E-mail)" <nv-l@tkg.com>
> > cc:
> > Subject: [NV-L] Confirmation of Netview pinging
> >
> >
> >
> > Hi. We are running Netview 6.0.2 on AIX 4.3. We want to move to a more
> > proactive approach to problem notifications. Our hope is to ping
> > servers/hubs/switches/routers and generate events when they aren't
> > reachable. This would make use of the Netview features to reduce the
> > "noisy" pages, etc. In preparation for this, I was running some numbers
> > and would like some input to see if my reasoning is flawed somewhere:
> >
> > Average response time for pings = 40ms (includes LAN and WAN)
> > Total devices to ping 1700. (and growing at about 30 per month)
> > # outstanding pings = 10 (Is this true? Does it affect my numbers? If so, how?)
> > Retries = 0
> > Timeout = 1 sec
> > One Netview machine.
> >
> > Netview could only ping 2 devices per second for a total of 120 per
> > minute. 1700 / 120 = 14 minutes to complete one ping cycle.
> >
> > So this would mean that using this method, we would only find out about
> > a down device after 14 minutes at best? I don't think anybody would
> > accept this long of a window.
> >
> > Assuming the above is true, it appears that it is time for us to look
> > into a different Netview architecture that could achieve our goals?
> >
> > I'm just looking for some insight into how Netview pings and if my
> > numbers are even reasonable, etc. Thanks for any help you can provide.
> >
> > Craig
> >
> > P.S. I have searched the archives, but there appear to be many open
> > questions on this topic. Also, no form of netmon -a ?, or any other flag,
> > produced output in the netmon.trace file.
> > _________________________________________________________________________
> > NV-L List information and Archives: http://www.tkg.com/nv-l