Hi, we have the same problem and use the same check with a touched file. We found that our problem was caused by nvcold and Query Smartset nodes in rules. We changed the rules from Query Smartset to Query Database Field, and now we have only a few problems. Also, your 30-second wait before checking the rule's result is too short; you should use at least 1 minute. This is our experience. Uwe
James Shanks <jshanks@us.ibm.com> 07.05.2003 15:22 AST
To: nv-l@lists.tivoli.com cc: bcc: NVL/Synthesis Subject: Re: [nv-l] Trap Processing Performance Issue
A few observations . . . Does your netstat show anything queued for trapd or nvcorrd? If those queues aren't zero, then you have a backup there.
Are you running a trapd.trace? Turn on the "hex dump of all packets" option for trapd and turn on the trapd trace by toggling it with "trapd -T". This will show you when trapd pulled the trap in off the socket and what he did with it. You can try matching when he sends it to all appls with the input time stamp in nvcorrd to make sure you are dealing with the same instance of the trap.
Also you haven't explained what happens in the ruleset which is supposed to alter the test file. Who does that? Actionsvr? Have you looked in his log? You can see in there when he got the action from nvcorrd. The very same timestamped transfer will appear in the nvcorrd log. Actionsvr will launch the action as a script, which calls your executable after exporting the variables. Depending on how it ends you might be able to see that in nvaction.alog/blog as well. Could your scripts be stepping on one another trying to write to the same file?
James Shanks Level 3 Support for Tivoli NetView for UNIX and NT Tivoli Software / IBM Software Group
"Barr, Scott" <Scott_Barr@csgsystems.com> 05/07/2003 02:44 PM
To: <nv-l@lists.tivoli.com> cc: Subject: [nv-l] Trap Processing Performance Issue
Greetings,

I have a problem I can't for the life of me figure out, and I am hoping the answer here will be speedier and less painful than the answer I'll get from support.

NetView v7.1.3
Solaris 2.8

Basically, we have a piece of automation whose function is to determine whether trap processing is working. It works like this:

1. crontab executes a script every 10 minutes that issues an SNMP trap for NetView to consume.
2. A ruleset catches the trap and writes out an empty file (i.e. touches the time stamp).
3. The crontab script goes to sleep for 30 seconds.
4. When the script wakes back up, it examines the time on the file to see whether it is more than 30 seconds old (i.e. automation hasn't touched the file when it should have already).
5. If so, the script cycles nvcorrd.

When the problem happens, the diagnostics indicate the trap was sent in and NetView did not touch the file for >30 seconds. This implies that the trap was hung up in processing for a good long time.

Some things I have observed/checked:

1. I looked at CPU performance. Although ovwdb spikes pretty high, everything seems to be responding okay, i.e. user interface etc. Could be a CPU issue, but I don't think so.
2. Trap volume: I snooped on port 162 during this time, and traps were coming in at roughly 1 every few seconds. Nothing scary there.
3. No cores.
4. There seems to be a delay in the event showing up in the nvevents window. I consider this to be an important point.
5. The problem disappeared for weeks. Now it happened 5 times in a two-hour period this morning.
6. We use a large number of rulesets (34). In general, they do not overlap, but there are a few cases where traps must be handled twice.
7. We use a large number of smartsets (25).
8. netstat -a does not show any ports associated with nvcold having queued data.

I am open to suggestions. This is actually occurring on two different servers in two different parts of the county. The configuration is not new, but the trap-checking automation has only been running for a few months, and this problem may have been occurring before we were able to measure the time it takes to process a trap. 30 seconds seems like a real long time. I can't find any error logs or anything else, even with debug on nvcorrd, that shows anything obviously wrong. Anybody got any suggestions?
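
For illustration, the check in steps 1-5 could be sketched roughly as below. This is not the script from this thread, only a minimal Python sketch under stated assumptions: `send_trap` is a hypothetical callable standing in for whatever command issues the test trap, the marker file path is supplied by the caller, and the function names are made up. Instead of one fixed 30-second sleep it polls the file's modification time up to a configurable timeout and reports how long the touch took, so slow processing can be told apart from no processing at all.

```python
import os
import time
from typing import Callable, Optional


def wait_for_touch(path: str, sent_at: float, timeout: float = 60.0,
                   poll_interval: float = 1.0,
                   clock: Callable[[], float] = time.time,
                   sleep: Callable[[float], None] = time.sleep) -> Optional[float]:
    """Poll `path` until its mtime is at or after `sent_at`.

    Returns the observed latency in seconds, or None if the file was not
    touched before `timeout` seconds had passed since `sent_at`.
    """
    deadline = sent_at + timeout
    # Some filesystems store mtime with one-second granularity, so compare
    # against the whole second in which the trap was sent.
    threshold = int(sent_at)
    while True:
        try:
            mtime = os.path.getmtime(path)
        except OSError:
            mtime = None  # marker file does not exist (yet)
        if mtime is not None and mtime >= threshold:
            return max(0.0, mtime - sent_at)
        if clock() >= deadline:
            return None
        sleep(poll_interval)


def check_trap_path(send_trap: Callable[[], None], path: str,
                    timeout: float = 60.0,
                    clock: Callable[[], float] = time.time,
                    sleep: Callable[[float], None] = time.sleep) -> Optional[float]:
    """Send one test trap (hypothetical `send_trap`) and time the touch."""
    sent_at = clock()
    send_trap()
    return wait_for_touch(path, sent_at, timeout=timeout, clock=clock, sleep=sleep)
```

A caller would treat a `None` result as "not processed within the timeout" and decide what to do then (log it, or restart a daemon); logging the returned latency on every run would also show whether processing time creeps up before a failure.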
Scott Barr CSG Systems Inc. Network Systems Engineer Phone: 402-431-7939 Fax: 402-431-7413 Mail: scott_barr@csgsystems.com
--------------------------------------------------------------------- To unsubscribe, e-mail: nv-l-unsubscribe@lists.tivoli.com For additional commands, e-mail: nv-l-help@lists.tivoli.com
*NOTE* This is not an Official Tivoli Support forum. If you need immediate assistance from Tivoli please call the IBM Tivoli Software Group help line at 1-800-TIVOLI8(848-6548)