
RE: netmon -a12

To: nv-l@lists.tivoli.com
Subject: RE: netmon -a12
From: "Leslie Clark" <lclark@us.ibm.com>
Date: Mon, 20 Nov 2000 22:41:09 -0500
The meaning of these entries is not documented. I'm going by observed
behavior. When I see no negatives, all future times, I know netmon is
caught up. When I see negative times, I know that it was scheduled to
poll that particular interface some time ago. Take a look at the node with
the really old time. Maybe it's not up? Maybe it has missed its polling
time for many polls? That's a guess. I've observed a reduction in the
number of negative entries when I reduce the number of managed, down
interfaces, and/or increase the number of pingers (-q), and/or increase
the polling cycle. I may be completely wrong about this, but that's what
it looks like and that's how I use it.

I would not say that -59 is 'in good order'. That is a minute behind.
Aside from right after a netmon startup, your goal is all positive times.
Otherwise, you probably have your polling cycle set to something shorter
than your system can handle. Or your timeouts are too long. Same for the
SNMP polling (netmon -a 16).

Again, I'm guessing. You would have to go to Support for a real explanation
of the contents of those records, and they may have to go even further to
get the answer, since it is not documented.

Anybody else?

Cordially,

Leslie A. Clark
IBM Global Services - Systems Mgmt & Networking
Detroit


Stephen Elliott <selliott@epicrealm.com>@tkg.com on 11/20/2000 05:11:00 PM

Please respond to IBM NetView Discussion <nv-l@tkg.com>

Sent by:  owner-nv-l@tkg.com


To:   "'IBM NetView Discussion'" <nv-l@tkg.com>
cc:
Subject:  RE: [NV-L] netmon -a12



Leslie,

Thanks for the reply, I understand that part of it. The part I don't
understand is why in one minute the queue is essentially caught up and in
the next, the queue is several thousand seconds behind, then caught up in
the next. For example, here are three consecutive entries:

  -40: 10.40.0.11 (VPN1X1.HKG1C) list = 0x565358
  -8745: 100.129.76.46 (SRV3X4.LON1B) list = 0x565358
  -59: 10.5.3.38 (SRV2X11.CHI4C) list = 0x565358

I'm assuming that the system is pushing garbage onto the stack for the
larger time entry, but the system is obviously processing the queue in good
form as seen in the 3rd entry. If one were going to cron a script to track
the queue for lengthy delays, this 'anomaly' would cause a considerable
number of false alarms. It could be easily resolved by looking for three or
more consecutive entries greater than X seconds before alarming, or just
take your particular approach to tracking queue length. The whole point
here is, are we looking at a problem or not? Is this an indicator of some
kind?
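
The "three or more consecutive entries" rule described above could be sketched in Python roughly as follows. This is only an illustration: the function name, the 300-second threshold, and the idea of feeding it one leading offset per minute are assumptions, not anything netmon provides.

```python
def sustained_lag(offsets, threshold, run=3):
    """Return True if `run` or more consecutive samples are behind
    by more than `threshold` seconds.

    offsets: one leading time offset per sample (negative = behind).
    """
    streak = 0
    for off in offsets:
        if -off > threshold:
            streak += 1
            if streak >= run:
                return True
        else:
            # A caught-up sample breaks the run, so a lone spike is ignored.
            streak = 0
    return False


# The three consecutive samples quoted above: one spike, no sustained lag.
print(sustained_lag([-40, -8745, -59], threshold=300))    # False
# Hypothetical values where all three samples exceed the threshold.
print(sustained_lag([-400, -8745, -590], threshold=300))  # True
```

With this rule the single -8745 sample does not raise an alarm, because the samples on either side of it are under the threshold.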

Regards,

Steve Elliott
Sr. Network Mgmt. Engineer
epicRealm, Inc.
214-570-4560


-----Original Message-----
From: Leslie Clark [mailto:lclark@US.IBM.COM]
Sent: Sunday, November 19, 2000 10:41 AM
To: IBM NetView Discussion
Subject: Re: [NV-L] netmon -a12


The 'behind' is for that interface only. After it does that interface, it
is rescheduled with a future time. What I do is count the number of
records with negative numbers with a grep. That's the number of interfaces
it is behind by.
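
For anyone who prefers a script to grep, the same count could be sketched in Python. The line shape is assumed from the samples posted in this thread, since the record format is not documented; the function name and the sample list are made up for illustration.

```python
import re

# Assumed line shape, taken from the samples in this thread:
#   "  -8745: 100.129.76.46 (SRV3X4.LON1B) list = 0x565358"
ENTRY = re.compile(r"^\s*(-?\d+):\s")


def count_behind(lines):
    """Count entries whose leading time offset is negative."""
    behind = 0
    for line in lines:
        m = ENTRY.match(line)
        if m and int(m.group(1)) < 0:
            behind += 1
    return behind


sample = [
    "  0: 88.88.99.99 (SWI2X1.AMS1B) list = 0x565358",
    "  -5: 10.30.60.15 (TRM2X15.MAD1C) list = 0x565358",
    "  1: 10.0.0.237 (TRM2X17.SJC1B) list = 0x565358",
    "  -10804: 180.174.76.48 (SRV3X6.FRA1B) list = 0x565358",
]
print(count_behind(sample))  # 2
```

Lines that do not start with a number and a colon are skipped rather than counted.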

Cordially,

Leslie A. Clark
IBM Global Services - Systems Mgmt & Networking
Detroit

Stephen Elliott <selliott@epicrealm.com>@tkg.com on 11/17/2000 03:17:19 PM

Please respond to IBM NetView Discussion <nv-l@tkg.com>

Sent by:  owner-nv-l@tkg.com


To:   "'nv-l@tkg.com'" <nv-l@tkg.com>
cc:
Subject:  [NV-L] netmon -a12



Happy Friday, Y'all,

Here's a weekend puzzler. I am monitoring the netmon polling queue on my NV
6.0.1, Solaris 2.6 system to see how often and for how long the queue might
get backed up over the course of a day. There are 3181 interfaces in the
netmon -a12 output. The polling rates are a mixture of 1 min, 1 hour and 5
min (default) intervals. I have a simple script that deletes the
netmon.trace file, runs a new netmon -a12 and then appends the first line
of that output to a file. The script runs every minute. Here's a sample of
that output.

  0: 88.88.99.99 (SWI2X1.AMS1B) list = 0x565358
  -10804: 180.174.76.48 (SRV3X6.FRA1B) list = 0x565358
  -5: 10.30.60.15 (TRM2X15.MAD1C) list = 0x565358
  1: 10.0.0.237 (TRM2X17.SJC1B) list = 0x565358
  -2: 165.130.105.8 (SWI2X16.TYO2C) list = 0x565358
  -10: 10.0.10.11 (VPN1X1.SAN1C) list = 0x565358
  1: 10.30.90.15 (TRM2X15.STO1C) list = 0x565358
  -14: 244.76.88.73 (SWI2X16.HKG1C) list = 0x565358
  -13: 188.174.76.1 (RTR1X20.FRA1B) list = 0x565358
  0: 168.5.137.250 (SWI5X1.ATL1A) list = 0x565358
  -30: 126.52.166.8 (SWI2X16.MIA1C) list = 0x565358
  0: 10.0.10.15 (TRM2X15.SAN1C) list = 0x565358
  -4: 200.174.77.139 (VPN1X1.GVA1C) list = 0x565358
  -2: 200.52.99.253 (RTR2X14.LAX1C) list = 0x565358
  -4: 200.224.34.21 (SVI1X3.LON2C) list = 0x565358
  -8: 200.224.206.1 (RTR1X18.LON3T) list = 0x565358
  -28: 10.40.10.15 (TRM2X15.SEL1C) list = 0x565358
  -39: 10.30.1.21 (SVI1X3.LON2C) list = 0x565358
  -48: 120.41.19.133 (SVI1X2.ANR1C) list = 0x565358
  -61: 120.0.16.62 (VPN1X20.SJC4T) list = 0x565358
  -60: 120.41.19.35 (SRV2X5.AMS1B) list = 0x565358
  -50: 10.5.3.15 (TRM2X15.CHI4C) list = 0x565358
  -30: 10.0.5.15 (TRM2X15.LAX1C) list = 0x565358
  -29: 10.42.0.31 (SRV2X4.SYD1C) list = 0x565358
  -24: 10.40.10.11 (VPN1X1.SEL1C) list = 0x565358
  -45: 150.186.221.174 (SRV2X14.GRU1C) list = 0x565358
  -50: 10.0.4.11 (VPN1X12.SJC5C) list = 0x565358
  -40: 10.40.0.11 (VPN1X1.HKG1C) list = 0x565358
  -8745: 100.129.76.46 (SRV3X4.LON1B) list = 0x565358
  -59: 10.5.3.38 (SRV2X11.CHI4C) list = 0x565358
  -71: 10.30.50.33 (SRV2X6.GVA1C) list = 0x565358
  -70: 200.174.77.135 (SWI2X2.GVA1C) list = 0x565358
  -84: 111.186.221.155 (SRV1X9.GRU1C) list = 0x565358
  -4890: 211.76.12.97 (SRV2X6.SYD1C) list = 0x565358
  -12: 164.0.16.62 (VPN1X20.SJC4T) list = 0x565358
  -4: 10.1.0.15 (TRM2X15.SEA1C) list = 0x565358
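
Each sample line above appears to carry a time offset in seconds, an address, and a name in parentheses. A minimal Python sketch for splitting such a line into fields follows; the field names and the regular expression are assumptions based only on these samples, since the record format is not documented.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    offset: int   # seconds; negative means the poll is overdue
    address: str
    name: str


# Assumed shape: "<offset>: <address> (<name>) list = <pointer>"
LINE = re.compile(r"^\s*(-?\d+):\s+(\S+)\s+\(([^)]*)\)")


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed fields, or None if the line does not match."""
    m = LINE.match(line)
    if m is None:
        return None
    return Entry(int(m.group(1)), m.group(2), m.group(3))


e = parse_entry("  -10804: 180.174.76.48 (SRV3X6.FRA1B) list = 0x565358")
print(e.offset, e.address, e.name)  # -10804 180.174.76.48 SRV3X6.FRA1B
```

Keeping the address and name alongside the offset makes it easy to see whether the very large negative values always belong to the same few interfaces.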

Note the entries that indicate the queue is behind by several thousand
seconds. Then the next minute the queue is essentially caught up. Does
anyone have an idea what this means or, if it's a known 'anomaly', why the
system does this?

Regards,

Steve Elliott
Sr. Network Mgmt. Engineer
epicRealm, Inc.
214-570-4560

_________________________________________________________________________
NV-L List information and Archives: http://www.tkg.com/nv-l





Archive operated by Skills 1st Ltd
