Re: [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.

To: nv-l@lists.us.ibm.com
Subject: Re: [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.
From: James Shanks <jshanks@us.ibm.com>
Date: Mon, 27 Oct 2003 15:43:30 -0500
Delivery-date: Mon, 27 Oct 2003 20:51:39 +0000
Envelope-to: nv-l-archive@lists.skills-1st.co.uk
Reply-to: nv-l@lists.us.ibm.com
Sender: owner-nv-l-digest@lists.us.ibm.com

The first thing to notice is that a netmon-related application may or may not be netmon himself. It might be another trap receiver using the same API as netmon. But the message means the same thing no matter which application it is about.

That trap is issued by trapd whenever he forces a connected application to disconnect. Every connected application gets an internal queue, on which trapd puts events that need to be sent to it. When an application cannot process the traps sent to him as fast as trapd is processing them, the queue will grow. When the queue reaches the maximum size, trapd forcibly disconnects that application to save himself from running out of storage. The queue is emptied when the application is disconnected.

The default size of this queue in current code is 2000. You can increase this in serversetup (application queue buffer size). How big you should make it is a tuning and performance issue. All connected applications (snmpcollect, ipmap, and so on) get the same size, whatever it is, so raising the limit uses more system memory for every one of them.
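
The queueing behavior described above can be illustrated with a small simulation. This is only a sketch of the idea, not NetView code: the class and function names (AppConnection, simulate), the per-tick rates, and the choice to disconnect as soon as the queue length reaches the limit are all assumptions made for illustration. Only the default limit of 2000 comes from the text.

```python
from collections import deque

MAX_QUEUE = 2000  # default queue size mentioned in the post


class AppConnection:
    """Hypothetical model of one connected application and its event queue."""

    def __init__(self, name, max_queue=MAX_QUEUE):
        self.name = name
        self.max_queue = max_queue
        self.queue = deque()
        self.connected = True

    def enqueue(self, event):
        """Queue an event; return False if the application is disconnected."""
        if not self.connected:
            return False
        self.queue.append(event)
        if len(self.queue) >= self.max_queue:
            # Assumed behavior: the queue is emptied on a forced disconnect.
            self.queue.clear()
            self.connected = False
            return False
        return True

    def consume(self, count):
        """The application processes up to `count` queued events."""
        taken = 0
        while self.queue and taken < count:
            self.queue.popleft()
            taken += 1
        return taken


def simulate(arrivals_per_tick, consumed_per_tick, ticks, max_queue=MAX_QUEUE):
    """Return the tick at which the application is disconnected, or None."""
    app = AppConnection("slow_app", max_queue)
    for tick in range(1, ticks + 1):
        for i in range(arrivals_per_tick):
            if not app.enqueue((tick, i)):
                return tick
        app.consume(consumed_per_tick)
    return None


if __name__ == "__main__":
    # An application that handles only half of the incoming events per tick.
    print("default limit:", simulate(100, 50, 1000))
    print("limit 20000:  ", simulate(100, 50, 1000, max_queue=20000))
    # An application that keeps up is never disconnected.
    print("keeps up:     ", simulate(50, 50, 1000))
```

In this model a larger limit does not prevent the disconnect when the application is permanently slower than the event stream; it only delays it. A larger limit helps when the slowdown is temporary.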

If this disconnection is a common occurrence, then you should increase the queue size. How big can you go? Well, I have seen people run with sizes ten times as high (20000), but this has its own disadvantages. A larger queue size will allow the application to stay connected and possibly recover from whatever is slowing him down. But the application will then still have to process all of those traps, sooner or later. He could be behind for a very long time.
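
To get a rough feel for "behind for a very long time", here is a back-of-the-envelope calculation. It is a sketch only: the function name drain_seconds and the rates used (60 events per second processed, 50 per second still arriving) are made-up numbers for illustration, not measurements from NetView.

```python
def drain_seconds(backlog, process_rate, arrival_rate):
    """Seconds needed to clear `backlog` queued events.

    Rates are in events per second. Returns None when the application
    is not faster than the incoming stream and so never catches up.
    """
    if process_rate <= arrival_rate:
        return None
    return backlog / (process_rate - arrival_rate)


if __name__ == "__main__":
    # Hypothetical rates: the application recovers to 60 events/s
    # while traps keep arriving at 50 events/s.
    for backlog in (2000, 20000):
        print(backlog, "queued events ->", drain_seconds(backlog, 60, 50), "seconds")
```

With these assumed rates a full queue of 2000 takes 200 seconds to clear, while a full queue of 20000 takes 2000 seconds, so the application is working on stale events for ten times as long.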

You can see how all this is working by running the trapd trace. You can toggle it on and off from the command line using "trapd -T", and you will see the application queues being written to and deleted from. That will give you some idea of what normal processing is for you. The PID of the external process is given in the trace, so you can see who the players are and who is getting behind before the disconnect occurs.

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group



"Alan E. Hennis" <Hennis_Alan_E@cat.com>
Sent by: owner-nv-l-digest@lists.us.ibm.com

10/27/2003 11:01 AM
Please respond to nv-l

       
        To:        nv-l@lists.us.ibm.com
        cc:        
        Subject:        [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.




NV 7.1.3 FP1 RedHat 7.2

Has anyone ever seen this trap?

Mon Oct 27 09:57:50 2003 <none>          T netmon-related Application
reached maximum number of outstanding events, disconnecting from trapd.


Here is the description from trapd.conf

This event is generated by IBM Tivoli NetView when
it detects a fatal error

The data passed with the event are:
   1) ID of application sending the event
   2) Name or IP address
   3) Formatted description of the event


Thanks
Alan E. Hennis
Caterpillar Inc.
Systems+Process Division
309.494.3308
hennis_alan_e@cat.com



