Re: Trapd questions

To:	nv-l@lists.tivoli.com
Subject:	Re: Trapd questions
From:	Art DeBuigny <debuigny@DALLAS.NET>
Date:	Tue, 18 May 1999 16:00:19 -0500
Reply-to:	Discussion of IBM NetView and POLYCENTER Manager on NetView <NV-L@UCSBVM.UCSB.EDU>
Sender:	Discussion of IBM NetView and POLYCENTER Manager on NetView <NV-L@UCSBVM.UCSB.EDU>

Our trapd queue size is set at 10000. We have over 1000 Cisco routers in Florida alone. They were sending traps as fast as their DLSW peer could lose or establish a connection with them, all in order. I will look into the possibility of disabling DLSW traps on the branch routers and enabling them only at the peer. Since this network is growing (it is currently at roughly 25 percent of its end-game size), that would seem to be our only hope.

Thanks

Art DeBuigny

debuigny@dallas.net

Bank of America Network Operations

-----Original Message-----
From: James Shanks <James_Shanks@TIVOLI.COM>
To: NV-L@UCSBVM.UCSB.EDU <NV-L@UCSBVM.UCSB.EDU>
Date: Tuesday, May 18, 1999 11:21 AM
Subject: Re: Trapd questions

Hmmm. What exactly is your application queue size? 10,000?   25,000?

If trapd goes down, all those others will go too. Is that what happened?
Did trapd core or what?   Who died first?

Basically the application queue size is a mechanism for people to use when
they have configured their agents to send more traps more frequently than
the daemons can usually handle.   So adjusting this is how they can be kept
up, at the cost of a lot more storage and slower performance. The boys and
girls on the Tivoli performance team were able to handle 100 traps/sec for
a few hours, but they had to boost the appl queue size to 35,000 and it
took NetView many more hours to recover and process all those traps. But
they didn't lose any daemons.
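
The trade-off described above can be put into rough numbers. The following
is a minimal sketch, not a model of trapd itself: it assumes a constant
arrival rate and a constant processing rate, and the function name, its
parameters, and the processing rates in the usage lines are hypothetical.
Only the 100 traps/sec figure and the queue sizes of 10,000 and 35,000 come
from this thread.

```python
def backlog_after_burst(arrival_rate, service_rate, burst_seconds, queue_limit):
    """Estimate queue behaviour during a constant-rate trap burst.

    arrival_rate  -- traps arriving per second during the burst
    service_rate  -- traps processed per second (assumed constant)
    burst_seconds -- how long the burst lasts
    queue_limit   -- configured application queue size

    Returns (peak_backlog, overflowed, drain_seconds), where drain_seconds
    is the time needed to work off whatever the queue could actually hold.
    """
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")

    net_growth = arrival_rate - service_rate
    if net_growth <= 0:
        # Processing keeps up with arrivals, so no backlog builds.
        return 0, False, 0.0

    peak_backlog = net_growth * burst_seconds
    overflowed = peak_backlog > queue_limit
    held = min(peak_backlog, queue_limit)
    drain_seconds = held / service_rate
    return peak_backlog, overflowed, drain_seconds


if __name__ == "__main__":
    # Hypothetical processing rates; the thread does not state one.
    print(backlog_after_burst(100, 95, 3600, 35000))
    print(backlog_after_burst(100, 50, 3600, 10000))
```

Under these assumptions a larger queue only postpones the overflow. It does
not raise the processing rate, which is why recovery still takes a long time
after the burst ends.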

So I have to ask. What exactly is the point of getting so many traps? Can
these Cisco agents not be configured to send one or two instead of dozens
per minute? Or is that what they did, but you have 40,000 Cisco devices
sending them at one time? Why be so verbose? You cannot be helping your
outage by flooding what is left of the network with traps.

Personally, in my view (of course I'm the management vendor) the only traps
that should be sent to NetView are ones you intend to do something about.
And one is enough. Couldn't you get one trap from the FEP or a few from
key routers and stifle the rest? Lots of folks implement a tiered
solution, where routers in one tier send one kind of trap and others do
not.
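
The tiered idea can be sketched as a simple policy table. This is only an
illustration under stated assumptions: the tier names, trap kinds, and the
Trap and should_forward names are hypothetical, and in practice the policy
would be applied in the router configuration rather than in a script on the
management station.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trap:
    source: str  # hostname of the sending device (hypothetical)
    tier: str    # "peer" or "branch" (hypothetical tier names)
    kind: str    # trap type label (hypothetical labels)


# Which trap kinds each tier is allowed to send to the management station.
# Branch routers stay quiet about DLSW peer state; the peer reports it once.
FORWARD_POLICY = {
    "peer": {"dlsw-peer-state", "link-down"},
    "branch": {"link-down"},
}


def should_forward(trap: Trap) -> bool:
    """Return True if the trap's tier is allowed to send this kind of trap."""
    return trap.kind in FORWARD_POLICY.get(trap.tier, set())


if __name__ == "__main__":
    traps = [
        Trap("peer-a", "peer", "dlsw-peer-state"),
        Trap("branch-17", "branch", "dlsw-peer-state"),
        Trap("branch-17", "branch", "link-down"),
    ]
    for t in traps:
        print(t.source, t.kind, should_forward(t))
```

With a table like this, a DLSW reset seen by a thousand branch routers
produces traps only from the peer tier.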

After all, it's just one UNIX box receiving all that stuff.

Just my two cents.

James Shanks
Tivoli (NetView for UNIX) L3 Support

Art DeBuigny <debuigny@DALLAS.NET> on 05/18/99 11:09:59 AM

Please respond to Discussion of IBM NetView and POLYCENTER Manager on
      NetView <NV-L@UCSBVM.UCSB.EDU>

To:   NV-L@UCSBVM.UCSB.EDU
cc:    (bcc: James Shanks/Tivoli Systems)
Subject: Trapd questions

On occasion, we have been getting traps from Cisco routers when the state
of the DLSW connection resets, in this case due to a reset at the FEP.

Recently, due to a major outage, we started getting these traps from every
single router on the network. It crashed netmon, ovtopmd, trapd, and even
ovactiond.

I've tried setting the event customization to 'Do not log or display' but
that didn't seem to help. The situation only stabilizes once all the
routers' DLSW connections have been restored, and traps are no longer
flooding into the netview machine.

Since this can always happen again in the event of an outage, can anyone
think of a way to 'protect' NetView's daemons from such a flood without
actually stopping the trap at the source? I've tried adjusting the
connected applications queue size, but that apparently wasn't enough.

Thanks

Art DeBuigny
debuigny@dallas.net
Bank of America Network Operations
