Leslie has said most of what I would comment. My approach to the
mis-handling of the Unreachable events is to change netview.rls in TEC
so that if a router has a status of unreachable, then the related
interface unreachable events from devices are made EFFECT events of the
unreachable router, rather than causal events. This then works nicely.
You probably know the working of netview.rls but there is a paper I
wrote some time back now on the NetView Tivoli User Group website (
http://www.nv-l.org/twiki/bin/view/Netview/NetViewTECIntegration ) which
may help.
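
As a rough illustration of the idea only (Python rather than the TEC rule language, and not the actual netview.rls change), the sketch below links interface-unreachable events to an unreachable router as effects. The class names, field names and host names are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical stand-in for a TEC event; not the real TEC event model."""
    event_id: int
    event_class: str               # illustrative names, not real TEC classes
    behind_router: str             # router the event relates to (assumed field)
    cause_id: Optional[int] = None  # id of the causal event, if linked
    effects: List[int] = field(default_factory=list)


def link_effects(events):
    """Make interface-unreachable events effects of an unreachable router."""
    # Index routers that currently have an unreachable event open.
    unreachable = {e.behind_router: e for e in events
                   if e.event_class == "ROUTER_UNREACHABLE"}
    for e in events:
        if e.event_class != "INTERFACE_UNREACHABLE":
            continue
        cause = unreachable.get(e.behind_router)
        if cause is not None:
            # The router event is the cause; the interface event is an effect.
            e.cause_id = cause.event_id
            cause.effects.append(e.event_id)
    return events


events = link_effects([
    Event(1, "ROUTER_UNREACHABLE", "rtr-a"),
    Event(2, "INTERFACE_UNREACHABLE", "rtr-a"),
    Event(3, "INTERFACE_UNREACHABLE", "rtr-b"),
])
```

Only the event behind the unreachable router is linked; the one behind rtr-b stays a candidate root cause.
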
Cheers,
Jane
Leslie Clark wrote:
>
> My favorite is:
> Use the TEC_ITS.rs rule on the Netview side for forwarding events. Use
> it as-is for starters, but I always exclude the Interface Unreachable
> event. That one seems not to be handled well.
> Enable RFI and take the trouble to get a well-connected map.
> On the TEC side, use the default netview.rls rule. In a heavily loaded
> system, you could excise the section on ITSA events.
>
> The correlation on these is pretty good. A lot go over, but most are
> cleared out quickly. Anything still there, open and critical, say,
> after 5 minutes or one polling cycle is probably actionable. There is
> a latency setting in netview.rls that will likely need adjusting for
> your polling cycle. Once you see this working consistently, you know
> what else you have to do. You should see down/up correlation for nodes
> and interfaces. You should see interface/node correlation. And, if
> RFI is enabled but some events leak through for nodes on the
> unreachable subnet, those should be correlated out as well. When an
> unreachable subnet comes back online, any nodes that do not come back
> will show up as down node events.
>
> You may need to enable the forwarding of severity. I'm not sure if
> 7.1.5 went back to sending it by default or not. Then the rule works
> at lowering the severity, and raising it back up if a service impact
> event comes in. I never did any work with that, so we just removed the
> bits that lowered the severity and used what was sent in. Basically
> down is serious.
>
> One customer I know of let things come over as minor, and if they were
> still open after x minutes, up them to critical. Operators only saw
> things that were critical, which would be all of the root cause events.
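
The "still open after x minutes" approach described above can be sketched roughly as follows. This is Python for illustration only, not TEC rule code; the event fields, severity strings and the 300-second default are assumptions chosen to match the five-minute figure discussed in this thread.

```python
from dataclasses import dataclass


@dataclass
class OpenEvent:
    """Hypothetical event record; field names are illustrative."""
    name: str
    received_at: float        # arrival time in seconds
    status: str = "OPEN"
    severity: str = "MINOR"


def escalate_stale(events, now, hold_seconds=300):
    """Raise to CRITICAL any MINOR event still OPEN after hold_seconds."""
    escalated = []
    for e in events:
        still_open = e.status == "OPEN" and e.severity == "MINOR"
        if still_open and now - e.received_at >= hold_seconds:
            e.severity = "CRITICAL"
            escalated.append(e)
    return escalated


events = [
    OpenEvent("node-down-a", received_at=0.0),
    OpenEvent("node-down-b", received_at=0.0, status="CLOSED"),
    OpenEvent("if-down-c", received_at=200.0),
]
raised = escalate_stale(events, now=300.0)
```

Events that correlation has already closed, or that have not yet aged past the hold time, are left alone, so operators filtering on critical see only what survived.
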
>
> There is no dup-detect in the netview.rls, so for flapping stuff you
> will have to add something on the TEC side. The customer has to
> determine whether each outage is a new problem or not. It can get
> pretty philosophical.
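
One possible shape for such duplicate detection, sketched in Python rather than as a TEC rule: repeats of the same class and host inside a time window are folded into the existing record, and anything later counts as a new problem. The window length is exactly the policy decision mentioned above; the 600 seconds and all names here are assumptions.

```python
def fold_duplicates(arrivals, window_seconds=600):
    """Fold repeated events per (class, host) within a time window.

    arrivals: time-ordered iterable of (timestamp, event_class, hostname).
    Returns one record per distinct problem, with a repeat count.
    """
    latest_by_key = {}
    problems = []
    for ts, event_class, host in arrivals:
        key = (event_class, host)
        current = latest_by_key.get(key)
        if current is not None and ts - current["last_seen"] <= window_seconds:
            # Same problem flapping: count it instead of opening a new one.
            current["repeat_count"] += 1
            current["last_seen"] = ts
        else:
            # Outside the window (or first sighting): treat as a new problem.
            current = {"key": key, "first_seen": ts,
                       "last_seen": ts, "repeat_count": 0}
            latest_by_key[key] = current
            problems.append(current)
    return problems


folded = fold_duplicates([
    (0, "NODE_DOWN", "host1"),
    (120, "NODE_DOWN", "host1"),
    (130, "NODE_DOWN", "host2"),
    (2000, "NODE_DOWN", "host1"),
])
```
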
>
> Watch out for that Ack function in the netview.rls. When you send it
> an up, it sends back an event to Netview that causes Netview to check
> the node again. In cases where they have blocked ICMP_MASK_REQUEST or
> whatever it is, where every other ping fails, you can get a perpetual
> motion machine. In that case, you can tell netview to ignore failed
> mask requests in /usr/OV/conf/netmon.conf
> # Set to TRUE to turn off reporting ICMP error codes.
> #NV_NETMON_ICMP_ERROR_OFF=FALSE
> or tell TEC not to send the Acks by removing that function from the
> rule, or both.
>
> Anybody else? The more real-life examples the better...
>
> Cordially,
>
> Leslie A. Clark
> IT Services Specialist, Network Mgmt
> Information Technology Services Americas
> IBM Global Services
> (248) 552-4968 Voicemail, Fax, Pager
>
>
> *<Blane.Robertson@capgeminienergy.com>*
> Sent by: nv-l-bounces@lists.ca.ibm.com
> 07/02/2007 04:44 PM
> Please respond to: Tivoli NetView Discussions <nv-l@lists.ca.ibm.com>
> To: <nv-l@lists.ca.ibm.com>
> Subject: [NV-L][TEC 3.9] NetView/TEC integration rules
>
> Good day, Happy NetView-ers!
>
> I’m in the process of performing a new install of NV715 for my new
> employer, and want to take a poll and ask your learned opinions.
>
> So, what is the rest of the world doing with the rule that comes with
> TEC for all those events that come from NetView? At my previous
> employer, we filtered at NetView, and only sent the events we would
> action on (read – we didn’t use the default rule at all). That said, I
> don’t have the particular luxury of doing the like here (although it
> may come to that).
>
> -- Long story short, just how is it that you are wading through the
> plethora of events that arrive in TEC to provide an ‘actionable’
> alert/event that you can open a ticket on?
>
> It’s been a while since I’ve looked at what gets left open once the
> netview.rls finishes its thing, so don’t hang me out-right for asking
> this. I’m thinking that I need to go with a timer to see what’s still
> open in, say, five minutes. Does this sound reasonable?
>
> I’d appreciate any suggestions you might have to pass my way.
>
> Peace,
> Blane Robertson
> Capgemini / Dallas
> Enterprise Systems Management/ Capgemini Energy
> Office: +1 214 879 1666 / http://www.us.capgemini.com/
>
> Whether you think you can, or you think you can’t, you’re right! –
> Henry Ford
>
>
>
> _______________________________________________
> NV-L mailing list
> NV-L@lists.ca.ibm.com
> Unsubscribe:NV-L-leave@lists.ca.ibm.com
> http://lists.ca.ibm.com/mailman/listinfo/nv-l (Browser access limited
> to internal IBM'ers only)
>
--
Tivoli Certified Consultant & Instructor
Skills 1st Limited, 2 Cedar Chase, Taplow, Bucks, SL6 0EU, UK.
Registered in England & Wales, Company No. 3458854.
Tel: +44 (0)1628 782565
Copyright (c) 2007 Jane Curry <jane.curry@skills-1st.co.uk>. All rights
reserved.