
Re: [NV-L][TEC 3.9] NetView/TEC integration rules

To: Tivoli NetView Discussions <nv-l@lists.ca.ibm.com>
Subject: Re: [NV-L][TEC 3.9] NetView/TEC integration rules
From: Leslie Clark <lclark@us.ibm.com>
Date: Mon, 2 Jul 2007 17:57:03 -0400
Delivery-date: Mon, 02 Jul 2007 23:00:47 +0100
Envelope-to: nv-l-archive@lists.skills-1st.co.uk
In-reply-to: <35E9F422F25F594F8153605611ABF7B51F618F@MDCTXUEXCL01N1.corptxu.txu.com>
Reply-to: Tivoli NetView Discussions <nv-l@lists.ca.ibm.com>
Sender: nv-l-bounces@lists.ca.ibm.com

My favorite is:
Use the TEC_ITS.rs rule on the NetView side for forwarding events. Use it as-is for starters, but I always exclude the Interface Unreachable event. That one seems not to be handled well.
Enable RFI and take the trouble to get a well-connected map.
On the TEC side, use the default netview.rls rule. In a heavily loaded system, you could excise the section on ITSA events.

The correlation on these is pretty good. A lot go over, but most are cleared out quickly. Anything still there, open and critical, after, say, 5 minutes or one polling cycle is probably actionable. There is a latency setting in netview.rls that will likely need adjusting for your polling cycle. Once you see this working consistently, you know what else you have to do. You should see down/up correlation for nodes and interfaces. You should see interface/node correlation. And, if RFI is enabled but some events leak through for nodes on the unreachable subnet, those should be correlated out as well. When an unreachable subnet comes back online, any nodes that do not come back will show up as down node events.
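To make the down/up idea concrete, here is a rough Python sketch of the logic only. It is not the netview.rls code and uses no TEC API; the function names, the event tuples, and the 300-second latency are all made up for illustration.

```python
def correlate(events):
    """Pair each 'up' with an earlier 'down' for the same source.

    events: iterable of (timestamp, source, state), state is 'down' or 'up'.
    Returns {source: timestamp_of_first_open_down} for downs never cleared.
    """
    open_downs = {}
    for ts, source, state in sorted(events):
        if state == "down":
            # Keep the time of the first down; later repeats do not reset it.
            open_downs.setdefault(source, ts)
        elif state == "up":
            # An up clears whatever down is open for that source.
            open_downs.pop(source, None)
    return open_downs


def actionable(open_downs, now, latency=300):
    """Sources whose down has been open for at least `latency` seconds."""
    return sorted(s for s, ts in open_downs.items() if now - ts >= latency)


events = [
    (0, "routerA", "down"),
    (10, "routerB", "down"),
    (60, "routerB", "up"),      # clears routerB
    (200, "routerC", "down"),   # too recent to act on at now=400
]
still_open = correlate(events)
print(actionable(still_open, now=400))
```

The latency value plays the role of the setting mentioned above: set it to roughly one polling cycle so a node gets a chance to come back before anyone is paged.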

You may need to enable the forwarding of severity. I'm not sure if 7.1.5 went back to sending it by default or not. Then the rule works at lowering the severity, and raising it back up if a service impact event comes in. I never did any work with that, so we just removed the bits that lowered the severity and used what was sent in. Basically down is serious.

One customer I know of let things come over as minor and, if they were still open after x minutes, upped them to critical. Operators only saw things that were critical, which would be all of the root cause events.
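That customer's approach boils down to a timer check. A minimal Python sketch of the idea follows; it is not a TEC rule, and the Event fields, severity strings, and five-minute threshold are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold ("x minutes"); tune it to your polling cycle.
ESCALATE_AFTER_SECONDS = 5 * 60


@dataclass
class Event:
    hostname: str
    status: str         # "OPEN" or "CLOSED"
    severity: str       # "MINOR" or "CRITICAL"
    received_at: float  # epoch seconds


def escalate_stale(events, now, threshold=ESCALATE_AFTER_SECONDS):
    """Raise still-open MINOR events older than `threshold` to CRITICAL.

    Returns the list of events that were escalated on this pass.
    """
    escalated = []
    for ev in events:
        age = now - ev.received_at
        if ev.status == "OPEN" and ev.severity == "MINOR" and age >= threshold:
            ev.severity = "CRITICAL"
            escalated.append(ev)
    return escalated
```

Run something like this on a timer; anything that correlation already closed never reaches the operators, and whatever survives the wait shows up as critical.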

There is no dup-detect in the netview.rls, so for flapping stuff you will have to add something on the TEC side. The customer has to determine whether each outage is a new problem or not. It can get pretty philosophical.
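One way to frame the "is this a new problem?" question is a time window per source. The Python sketch below illustrates that policy only; the FlapTracker class, its key format, and the 15-minute default are hypothetical and would have to be turned into a real rule on the TEC side.

```python
class FlapTracker:
    """Treat a repeat event inside `window_seconds` as the same problem."""

    def __init__(self, window_seconds=900.0):
        self.window = window_seconds
        self._last = {}  # key -> (last_seen_timestamp, repeat_count)

    def observe(self, key, timestamp):
        """Record an event and return (is_new_problem, repeat_count)."""
        prev = self._last.get(key)
        if prev is not None and timestamp - prev[0] <= self.window:
            # Same outage still flapping: bump the counter, no new problem.
            count = prev[1] + 1
            self._last[key] = (timestamp, count)
            return False, count
        # First sighting, or the source was quiet for longer than the window.
        self._last[key] = (timestamp, 1)
        return True, 1


tracker = FlapTracker(window_seconds=600)
print(tracker.observe(("router1", "down"), 0))     # new problem
print(tracker.observe(("router1", "down"), 300))   # repeat inside the window
print(tracker.observe(("router1", "down"), 1000))  # quiet too long: new again
```

The window length is exactly the philosophical part: it encodes how long a node has to stay quiet before the next outage counts as a separate problem.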

Watch out for that Ack function in the netview.rls. When you send it an up, it sends back an event to NetView that causes NetView to check the node again. In cases where they have blocked ICMP_MASK_REQUEST or whatever it is, where every other ping fails, you can get a perpetual motion machine. In that case, you can tell NetView to ignore failed mask requests in /usr/OV/conf/netmon.conf
# Set to TRUE to turn off reporting ICMP error codes.
or tell TEC not to send the Acks by removing that function from the rule, or both.

Anybody else? The more real-life examples the better...


Leslie A. Clark
IT Services Specialist, Network Mgmt
Information Technology Services Americas
IBM Global Services
(248) 552-4968 Voicemail, Fax, Pager

Sent by: nv-l-bounces@lists.ca.ibm.com

07/02/2007 04:44 PM
Please respond to
Tivoli NetView Discussions <nv-l@lists.ca.ibm.com>

[NV-L][TEC 3.9] NetView/TEC integration rules

Good day, Happy NetView-ers!
I’m in the process of performing a new install of NV715 for my new employer, and want to take a poll and ask your learned opinions.
So, what is the rest of the world doing with the rule that comes with TEC for all those events that come from NetView?  At my previous employer, we filtered at NetView, and only sent the events we would action on (read – we didn’t use the default rule at all).  That said, I don’t have the particular luxury of doing the like here (although it may come to that).
 --  Long story short, just how is it that you are wading through the plethora of events that arrive in TEC to provide an ‘actionable’ alert/event that you can open a ticket on?
It’s been a while since I’ve looked at what gets left open once the netview.rls finishes its thing, so don’t hang me out-right for asking this.  I’m thinking that I need to go with a timer to see what’s still open in, say, five minutes.  Does this sound reasonable?
I’d appreciate any suggestions you might have to pass my way.
Blane Robertson
Capgemini / Dallas
Enterprise Systems Management/ Capgemini Energy
Office: +1 214 879 1666/ www.us.capgemini.com
Whether you think you can, or you think you can’t, you’re right! – Henry Ford

NV-L mailing list
http://lists.ca.ibm.com/mailman/listinfo/nv-l (Browser access limited to internal IBM'ers only)

Archive operated by Skills 1st Ltd
