nv-l

Re: [nv-l] SOLVED - Traps Limitation??

To: nv-l@lists.us.ibm.com
Subject: Re: [nv-l] SOLVED - Traps Limitation??
From: Leslie Clark <lclark@us.ibm.com>
Date: Sat, 15 Apr 2006 09:00:08 -0400
Delivery-date: Sat, 15 Apr 2006 13:58:20 +0100
Envelope-to: nv-l-archive@lists.skills-1st.co.uk
In-reply-to: <OF0D1F5F3B.5631D546-ON4525714F.003E731D-45257151.001F2480@s-iii.com>
Reply-to: nv-l@lists.us.ibm.com
Sender: owner-nv-l@lists.us.ibm.com

Well, there is trap volume and then there is trap volume. The product is designed to withstand quite an onslaught, but you are expected to deal with such an onslaught as a network problem, because it IS a network problem. Ten traps per second, for instance, will get handled. After a few minutes of this, things start to fall behind, and then if the traps stop, everything catches up. If it goes on and on, or escalates, things cannot catch up.
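
As a rough illustration of that catch-up behaviour, here is a back-of-the-envelope sketch in Python. The function names and the rates are made up for illustration; they are not measured trapd limits.

```python
def backlog_after(arrival_per_sec, service_per_sec, burst_seconds):
    """Events still queued when a burst of the given length ends."""
    surplus = max(0.0, float(arrival_per_sec - service_per_sec))
    return surplus * burst_seconds


def catch_up_seconds(backlog, service_per_sec, idle_arrival_per_sec=0.0):
    """Time needed to drain the backlog once the burst stops.

    Returns None when the quiet-time arrival rate still meets or exceeds
    the service rate, i.e. the queue never drains.
    """
    spare = service_per_sec - idle_arrival_per_sec
    if spare <= 0:
        return None
    return backlog / spare


# Hypothetical numbers: 50 traps/s arriving for 5 minutes, 30/s processed.
queued = backlog_after(50, 30, 300)
print(queued)                        # 6000.0
print(catch_up_seconds(queued, 30))  # 200.0
```

With these assumed rates the queue drains a few minutes after the storm ends; if the storm never ends, the second function has nothing to offer, which is the "cannot catch up" case.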

If you look at trapd.log or the events display, it is hard to miss the fact that an event storm is going on. If you check the queue, you will probably see it backing up. Here's a health check:
    netstat -a | grep '\.162'
There will also be messages in trapd.log about applications connecting and disconnecting from trapd. So that is something else to look for.
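
To watch that queue over time rather than eyeball it, something like the following Python sketch could work. It assumes a Unix-style netstat layout in which the second column is Recv-Q and the local address ends in .162 or :162. The layout varies by platform (Windows netstat does not report queue sizes at all), and the helper name and sample text are my own, not captured from a real system.

```python
import re

# Local address ending in port 162, e.g. "*.162" or "0.0.0.0:162".
PORT_162 = re.compile(r"[.:]162$")


def recv_queue_for_trap_port(netstat_output):
    """Return the Recv-Q values of UDP sockets bound to port 162.

    Assumed columns: Proto Recv-Q Send-Q Local-Address ...
    """
    queues = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, short lines, and anything that is not UDP.
        if len(fields) < 4 or not fields[0].lower().startswith("udp"):
            continue
        if PORT_162.search(fields[3]) and fields[1].isdigit():
            queues.append(int(fields[1]))
    return queues


# Sample text in the assumed layout (illustrative only).
sample = """\
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4       0      0  *.162                  *.*                    LISTEN
udp4   41216      0  *.162                  *.*
udp4       0      0  *.161                  *.*
"""
print(recv_queue_for_trap_port(sample))  # [41216]
```

On a live system the input could come from subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout. A value that keeps growing between samples suggests the receiver is falling behind.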

Then, netmon is also trying to send status events to trapd, so it will naturally have trouble operating normally. When I see netmon not responding, my first suspicion is always bad name resolution, but in environments where traps are turned on (especially authentication failure traps), I've learned recently to check for a trap flood first.


Cordially,

Leslie A. Clark
IT Services Specialist, Network Mgmt
Information Technology Services Americas
IBM Global Services
(248) 552-4968 Voicemail, Fax, Pager



From: usman.taokeer@s-iii.com
Sent by: owner-nv-l@lists.us.ibm.com
Date: 04/15/2006 01:44 AM
Reply-to: nv-l
To: nv-l@lists.us.ibm.com
Cc: nv-l@lists.us.ibm.com, owner-nv-l@lists.us.ibm.com
Subject: Re: [nv-l] SOLVED - Traps Limitation??

Hi,


I figured it out... NetView was flooded with traps, and that was preventing NetView (netmon) from working properly. Now that we have limited the traps from the devices, everything is back to normal. There seems to be a limitation in NetView on the number of traps it can process; IBM should document it if there is one!


Regards,

Usman Taokeer

Si3.


From: James Shanks <jshanks@us.ibm.com>
Sent by: owner-nv-l@lists.us.ibm.com
Date: 15-03-06 11:35 PM
Reply-to: nv-l@lists.us.ibm.com
To: nv-l@lists.us.ibm.com
Cc:
Subject: Re: [nv-l] Traps Limitation??

Usman,
I doubt that you can ever solve this problem by merely guessing at probable
causes.
Stop guessing, turn on the full netmon trace (netmon -M -1), and call
Support.
They will be able to tell you what netmon is doing, or else pass the trace
to someone else who can.

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group


                                                                         
From: usman.taokeer@s-iii.com
Sent by: owner-nv-l@lists.us.ibm.com
Date: 03/15/2006 09:47 AM
Reply-to: nv-l@lists.us.ibm.com
To: nv-l@lists.us.ibm.com
Cc: nv-l@lists.us.ibm.com, owner-nv-l@lists.us.ibm.com
Subject: Re: [nv-l] Traps Limitation??

Gareth,

Ok! here is the scenario:

NetView 7.1.4 FP04
Windows 2003 SP1
Hosted on a Dell Dual XEON Processor with 2GB RAM!

There are around 1200 nodes discovered in the network, and we have around
400 routers which are sending different traps to the NetView server. The
problem is that whenever we try to run "Demand Poll", "Quick Test", etc. on
any device, it just keeps saying "Waiting for netmon to respond"!!! Any clues
what's causing this? There is plenty of memory available and the CPU
utilization is only between 4% and 10%!


Regards,

Usman Taokeer
Si3.

                                                                         
From: Gareth Holl <gholl@us.ibm.com>
Sent by: owner-nv-l@lists.us.ibm.com
Date: 15-03-06 08:21 AM
Reply-to: nv-l@lists.us.ibm.com
To: nv-l@lists.us.ibm.com
Cc:
Subject: Re: [nv-l] Traps Limitation??

There is always going to be a limit of some sort, whether with the
hardware, OS, or trapd's ability itself. This is most likely dependent on
the resources (CPU speed, number of CPUs, and available memory per process)
available on the system hosting NetView.

trapd may end up consuming most, if not all cycles of a single CPU during
heavy trap reception. So a multi-CPU box would be essential so that other
processes could continue to run. High CPU utilization caused by trap floods
and even the subsequent processing of the traps by other daemons such as
nvcorrd could well affect netmon's ability to keep up if it cannot get the
CPU cycles it needs.

trapd will try caching/queuing all events received (with the goal of
eventually processing every single one of them), hence the need for a large
amount of memory and an adequate queue size. The cached events will be
processed when there is a break in trap reception... this could be some
time after the trap was originally generated. So while trapd is still
receiving traps, it is possible for NetView's internal events (including
those from netmon) to be caught up in this process and thus stay
queued/unprocessed for some time. This probably isn't a direct effect on
netmon but more of a perceived effect, as status events are not
processed in a timely fashion and thus nodes don't change color on the map
in a timely fashion.
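
The queuing effect described above can be pictured with a toy model in Python. This is only a sketch of a single first-in, first-out queue drained at a fixed rate; it is not how trapd is actually implemented, and the function name and all the numbers are hypothetical.

```python
from collections import deque


def status_event_wait(trap_rate, service_rate, flood_seconds, status_at):
    """Seconds a status event waits in a shared FIFO queue.

    Each simulated second: `trap_rate` traps arrive while the flood lasts,
    the status event arrives at second `status_at`, and then up to
    `service_rate` events are taken from the head of the queue.
    """
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    queue = deque()
    second = 0
    while True:
        if second < flood_seconds:
            queue.extend(["trap"] * trap_rate)
        if second == status_at:
            queue.append("status")
        for _ in range(min(service_rate, len(queue))):
            if queue.popleft() == "status":
                return second - status_at
        second += 1


# No flood: the status event is handled in the second it arrives.
print(status_event_wait(50, 30, 0, 30))   # 0
# 50 traps/s against 30 events/s processed, status event sent 30 s in.
print(status_event_wait(50, 30, 60, 30))  # 21
```

In the second case nothing is wrong with the status event itself; it simply sits behind the traps that arrived before it, which matches the "perceived" delay on the map.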

That's all I have.

Gareth



                                                                         
From: usman.taokeer@s-iii.com
Sent by: owner-nv-l@lists.us.ibm.com
Date: 03/14/2006 09:57 PM
Reply-to: nv-l
To: nv-l@lists.us.ibm.com
Cc:
Subject: [nv-l] Traps Limitation??

Hi List,

Just wanted to know: are there any limitations in NetView on the number of
traps (coming from different nodes) it can handle? If there are, would they
affect the behaviour of netmon?

Regards,

Usman
Si3.



Archive operated by Skills 1st Ltd
