To: | nv-l@lists.tivoli.com |
---|---|
Subject: | Re: Trap queue buildup |
From: | James_Shanks@tivoli.com |
Date: | Mon, 31 Jul 2000 07:50:29 -0400 |
You need to call Support and get some help immediately.

Those messages indicate that traps are arriving so fast that while trapd can get them and queue them, other applications which are processing traps, such as ovtopmd, netmon, snmpCollect, and so on, cannot process them fast enough to keep up. You have already boosted the trapd queue size to 32000 from its default of 2000, so this means that your external changes have resulted in many more traps per second being sent to your box than it can handle. You are experiencing trap storms.

Trapd disconnects applications which exceed their queue size so that they do not cause him to crash for lack of memory. So while he is not losing traps, your other applications are not getting them.

I would try turning on the trapd trace by making trapd run with the "hex dump of all packets" option (that adds a -x flag to him) and then starting the trace (issue trapd -T from the command line) so that you can analyze what these traps are and where they are coming from. But as the results will be in hex, you will probably need help deciphering them. The trace will also show you the process id of the applications being disconnected.

James Shanks
Team Leader, Level 3 Support
Tivoli NetView for UNIX and NT

"Rama, R. (Reggie)" <ReggieR@nedcor.co.za> on 07/31/2000 04:10:41 AM
Please respond to IBM NetView Discussion <nv-l@tkg.com>
To: "'nv-l@tkg.com'" <nv-l@tkg.com>
cc: "Bhikha, P. (Prakash)" <PrakashB@nedcor.com> (bcc: James Shanks/Tivoli Systems)
Subject: [NV-L] Trap queue buildup

Hello All Netviewers

We are currently running AIX 4.2.1 and Netview 5.1.2 on an F50 (4 CPUs and 1 GB RAM) and we are experiencing the following problem. Over the past few days we have noticed that we receive the following message in the trapd.log file: "netmon-related Application reached maximum number of outstanding events, disconnecting from trapd". The trapd buffer size is set to 32000, i.e. trapd -b32000. We are now receiving about 4 of these messages per hour, every day.
When we monitor udp port 162 using the netstat -an command, we find that the receive queue builds up to approximately 32000, sits at this value for a few minutes, and only then is cleared, after which it starts building up again.

I have looked at all the various Netview configurations and they all seem OK. I have searched the Netview Archives and could not find a suitable reply to the questions I have. My questions are:

1. Are there application(s) that are not reading the traps from the queue fast enough, and is that the cause of the problem?
2. When we get the above message, does it mean that all the traps that were on the queue are discarded (lost)?
3. How does one determine which application(s) are not reading the traps from the queue and are the cause of the problem?
4. How does one determine / verify that traps are not being lost, i.e. how does one verify that the data within trapd.log is correct?
5. Also, we have made no changes to the system at all recently. Are there any external changes, i.e. many more traps from devices, that can cause this to occur?

Thanks in advance for the assistance.

Regards
Reggie Rama
ESM - Technology & Operations Division
Nedcor Bank Limited (South Africa)
Tel : +27 - 011 - 8813989
Fax : +27 - 011 - 8814113
e-mail : reggier@nedcor.co.za
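
For anyone wanting to watch the udp port 162 receive queue described above without reading netstat output by eye, here is a minimal sketch in Python. It is not part of NetView. The function name udp_recv_queue and the sample text are made up for illustration, and the sketch assumes the usual netstat -an column order (Proto, Recv-Q, Send-Q, Local Address), with the port following the last "." (as on AIX) or ":" (as on Linux) in the local address.

```python
def udp_recv_queue(netstat_output, port=162):
    """Return the largest Recv-Q among UDP sockets bound to `port`, or None.

    Assumes the column order Proto, Recv-Q, Send-Q, Local Address.
    """
    best = None
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a UDP socket.
        if len(fields) < 4 or not fields[0].lower().startswith("udp"):
            continue
        recv_q, local = fields[1], fields[3]
        if not recv_q.isdigit():
            continue
        # The port is whatever follows the last "." or ":" in the address.
        sep_pos = max(local.rfind("."), local.rfind(":"))
        if local[sep_pos + 1:] != str(port):
            continue
        value = int(recv_q)
        best = value if best is None else max(best, value)
    return best


# Hypothetical sample text, laid out like AIX netstat -an output.
SAMPLE = """\
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4       0      0  *.23                   *.*                    LISTEN
udp4   31872      0  *.162                  *.*
udp4       0      0  *.161                  *.*
"""

print(udp_recv_queue(SAMPLE))
```

On a live system the text could come from subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout, sampled every few seconds, so that the build-up and clearing of the queue can be lined up against the disconnect messages in trapd.log.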
Archive operated by Skills 1st Ltd
See also: The NetView Web