James
Again, another very clear and concise explanation.
Since I am not running anything other than a standard NetView install right
out of the box, I am going to assume that my box is being flooded with traps
from some rogue device on my network. Would it be a correct guess that even
though I have some traps set not to log or display, they still need to be
processed and can therefore cause the queue to overrun?
Thanks
Alan E. Hennis
Caterpillar Inc.
Systems+Process Division
309.494.3308
hennis_alan_e@cat.com
James Shanks <jshanks@us.ibm.com>
Sent by: owner-nv-l-digest@lists.us.ibm.com
To: nv-l@lists.us.ibm.com
Date: 10/27/2003 02:43 PM
Please respond to: nv-l
Subject: Re: [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.
The first thing to notice is that a "netmon-related application" may or may
not be netmon himself; it might be another trap receiver using the same
API as netmon. But the trap means the same thing no matter which
application it is about.
That trap is issued by trapd whenever he forces a connected application to
disconnect. Every connected application gets an internal queue for trapd
to put events on when they need to be sent. When an application cannot
process the traps sent to him fast enough to keep up with the rate at which
they are being processed by trapd, the queue will grow. When the queue
reaches the maximum size, trapd forcibly disconnects that application to
save himself from running out of storage. The queue is emptied when the
application is disconnected. The default size of this queue in current
code is 2000. You can increase this in serversetup (application queue
buffer size). How big you should make it is a tuning and performance
issue. All connected applications (snmpcollect, ipmap, and so on) get the
same size, whatever it is. So you are using more system memory by raising
the limit.
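To make the mechanics concrete, here is a rough sketch in Python of that
queue-and-disconnect behavior. This is not NetView source, just an
illustration of the description above; the class, method names, and the
2000 default are made up to mirror what was said, nothing more.

from collections import deque

MAX_QUEUE = 2000   # default "application queue buffer size", tunable in serversetup

class ConnectedApplication:
    """One registered trap receiver (netmon, snmpcollect, ipmap, ...)."""

    def __init__(self, name, pid):
        self.name = name
        self.pid = pid
        self.queue = deque()     # per-application queue that trapd fills
        self.connected = True

    def enqueue(self, trap):
        """Called by trapd for every trap it forwards to this application."""
        if not self.connected:
            return
        if len(self.queue) >= MAX_QUEUE:
            # Queue full: trapd protects itself by dropping the connection,
            # and the queued events are thrown away with it.
            self.connected = False
            self.queue.clear()
            print("T netmon-related Application (%s, pid %d) reached maximum "
                  "number of outstanding events, disconnecting from trapd."
                  % (self.name, self.pid))
            return
        self.queue.append(trap)

    def drain_one(self):
        """The application reads one event whenever it gets around to it."""
        return self.queue.popleft() if self.queue else None

If enqueue() happens faster than drain_one(), the queue grows until the
limit is hit and the receiver is cut loose, which is exactly the symptom in
the trap Alan quoted.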
If this disconnection is a common occurrence, then you should increase the
queue size. How big can you go? Well, I have seen people run with sizes
ten times as high (20000), but this has its own disadvantages. A larger
queue size will allow the application to stay connected and possibly
recover from whatever is slowing him down. But the application will then
still have to process all of those traps, sooner or later. He could be
behind for a very long time.
You can see how all this is working by running the trapd trace. You can
toggle that on and off from the command line using "trapd -T" and you'll
see the application queues being written to and deleted from. That will
give you some idea of what normal processing is for you. The PID of the
external process is given in the trace so you can see who the players are
and who is getting behind before the disconnect occurs.
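If you want to summarize a long trace, something like the script below
would do. The file location and the line format it expects are assumptions
(adjust the pattern to whatever your trapd trace actually prints, and the
script name count_queue_writes.py is made up); the only point is to count
queue writes per PID so you can see which receiver is falling behind.

import re
import sys
from collections import Counter

# Assumption: each queue-write line in the trace mentions the receiver's
# PID as "pid <number>".  Adjust the pattern to match your real trace.
PID_PATTERN = re.compile(r"pid\s+(\d+)", re.IGNORECASE)

def count_by_pid(path):
    counts = Counter()
    with open(path) as trace:
        for line in trace:
            match = PID_PATTERN.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    # e.g.  python count_queue_writes.py <your trapd trace file>
    for pid, hits in count_by_pid(sys.argv[1]).most_common():
        print("pid %s: %d queue events" % (pid, hits))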
James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group
"Alan E. Hennis"
<Hennis_Alan_E@cat.co To: nv-l@lists.us.ibm.com
m> cc:
Sent by: Subject: [nv-l] T netmon-related
owner-nv-l-digest@lis Application reached maximum number of outstanding
ts.us.ibm.com events, disconnecting from trapd.
10/27/2003 11:01 AM
Please respond to
nv-l
NV 7.1.3 FP1 RedHat 7.2
Has anyone ever seen this trap?
Mon Oct 27 09:57:50 2003 <none> T netmon-related Application
reached maximum number of outstanding events, disconnecting from trapd.
Here is the description from trapd.conf
This event is generated by IBM Tivoli NetView when
it detects a fatal error
The data passed with the event are:
1) ID of application sending the event
2) Name or IP address
3) Formatted description of the event
Thanks
Alan E. Hennis
Caterpillar Inc.
Systems+Process Division
309.494.3308
hennis_alan_e@cat.com