RE: [nv-l] Ruleset + up event.

To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] Ruleset + up event.
From: James Shanks <jshanks@us.ibm.com>
Date: Wed, 22 Sep 2004 08:32:10 -0600
Delivery-date: Wed, 22 Sep 2004 15:48:17 +0100
Envelope-to: nv-l-archive@lists.skills-1st.co.uk
In-reply-to: <91D03459CD3BE04DB5C9894069B252340288C574@omaexch03.csg.csgsystems.com>
Reply-to: nv-l@lists.us.ibm.com
Sender: owner-nv-l@lists.us.ibm.com

Mighty complicated set of Reset-on-match and Pass-on-Match ruleset nodes back to back.  Not sure I follow it all.

But the key seems to be those scripts at the end so far as I can see.  Whatever happens, we pass them a Node Down event originally and then later a matching Node Up.  In between we'll wait 3 minutes before sending the original Node Down, to see whether this is a false alarm, and then we'll wait up to 144 hours (6 days) for the matching Node Up.    All of that is just to launch the appropriate scripts.  How do they work?

The intervening razzle-dazzle of alternating Reset and Pass nodes, the ones with the 5-second wait intervals, seems to me to be some kind of timing mechanism to guarantee that the original Node Down can be used as the match criterion for the much later Node Up. It delays the final handling of the Node Down event so that there is time to save the Node Up in the Pass-on-Match, so that there will be something to match when the Node Down is released. It's ingenious all right, and more than a little unusual; I don't recall seeing anything like it. Bet the original developer would be surprised too.

But to my way of thinking, this way has dependencies I don't care for.   The biggest dependency I see here is that if you have to stop and restart the daemons for any reason, then the held events are lost.    I'd prefer the setting and querying of database fields myself, since you wouldn't be time or daemon dependent.  You could preserve continuity over a much longer time period and you don't have to worry about keeping the daemons up, should you have to recycle them for some other reason.  

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group



"Barr, Scott" <Scott_Barr@csgsystems.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/22/2004 09:47 AM
Please respond to: nv-l
To: <nv-l@lists.us.ibm.com>
cc:
Subject: RE: [nv-l] Ruleset + up event.

Here is a ruleset that does just what you need.
 
It’s got some extra stuff in it because it processes node downs for a bunch of different groups, so you’ll want to strip out the extra query smartsets and actions. Don’t try to figure out how it works. I still don’t understand it, but a NetView guru helped me work it out and it works like 100 bucks.
 
1. Accepts node up and node down traps
2. Holds the node down for 3 minutes
3. If the node up happens, passes the node up trap
4. If the node up trap does not happen within 3 minutes, executes the notification script
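
The four steps above can be sketched outside NetView as a small hold-down table. This is only an illustration of the logic in Python, not ruleset syntax; the class name `HoldDown`, its methods, and the explicit `now` timestamps are all assumptions made for the example.

```python
# Illustrative sketch only: this is not NetView ruleset syntax.
HOLD_SECONDS = 3 * 60  # the 3-minute hold from step 2


class HoldDown:
    """Hold Node Down events and release them only if no Node Up arrives in time."""

    def __init__(self, hold_seconds=HOLD_SECONDS):
        self.hold_seconds = hold_seconds
        self.pending = {}  # node name -> time the Node Down arrived

    def on_event(self, node, kind, now):
        """Handle a 'down' or 'up' trap and return the action taken."""
        if kind == "down":
            # Step 2: hold the Down; keep the earliest arrival time.
            self.pending.setdefault(node, now)
            return "held"
        if kind == "up":
            # Step 3: an Up cancels any held Down and is passed on.
            self.pending.pop(node, None)
            return "pass_up"
        raise ValueError(f"unknown event kind: {kind}")

    def expire(self, now):
        """Step 4: return nodes whose Down was held for the full window."""
        due = sorted(
            node for node, arrived in self.pending.items()
            if now - arrived >= self.hold_seconds
        )
        for node in due:
            del self.pending[node]
        return due


hd = HoldDown()
hd.on_event("routerA", "down", now=0)
hd.on_event("routerB", "down", now=10)
hd.on_event("routerA", "up", now=60)  # false alarm: routerA's Down is dropped
to_notify = hd.expire(now=200)        # only routerB has been down 3+ minutes
```

Here `to_notify` would be the list of nodes for which the notification script should run. Note that the pending table lives in memory, so anything held is lost if the process restarts.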
 
 
 



From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of James Shanks
Sent: Wednesday, September 22, 2004 8:20 AM
To: nv-l@lists.us.ibm.com
Subject: Re: [nv-l] Ruleset + up event.

 

What you see is what you get, Tom.


If you want to pass a Router Up event, then you have to do it explicitly. The logic you are saying you want here is much more complicated than just a simple reset-on-match. What you have just said is that you want the Router Down held for just five minutes and then passed to TEC if there is no Router Up. And then you want something to "remember" that you passed this Router Down, and to pass a matching Router Up whenever it arrives, possibly much later. Well, a simple ruleset cannot do that. So you have to design something more sophisticated.


When you design a custom ruleset for TEC, you and the TEC guy have to work together. He can code rules on his end, just as you can. I don't see why you cannot send all Router Up events to TEC as harmless and let a TEC rule over there match them to any open Router Downs, and if there are none, then close them. Or let the operator close them. If he sees them, then clearly there was no match, so they no longer matter, right?
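
To make the TEC-side idea concrete, here is a minimal sketch of the matching logic. It is written in Python purely for illustration; a real TEC rule would be coded in TEC's own rule language. The function name, the return labels, and the assumption that a matched Up closes the open Down as well are all mine.

```python
def apply_router_up(open_downs, router):
    """Process a Router Up against the set of routers with an open Router Down.

    Assumption: a matched Up closes the open Down (and itself); an
    unmatched Up is simply closed as harmless.
    """
    if router in open_downs:
        open_downs.remove(router)
        return "closed_down_and_up"
    return "closed_unmatched_up"


open_downs = {"routerA"}                            # Downs currently open at TEC
matched = apply_router_up(open_downs, "routerA")    # Up for a router with an open Down
unmatched = apply_router_up(open_downs, "routerB")  # Up with nothing to match
```

Either way the Up never lingers on the console, which is the point of sending it as harmless.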


If you have to do this in NetView, then I think you'd have to do something like this. You have to keep a record somewhere of the Router Down events you sent to TEC, and query that list when a Router Up comes in. One way would be to create a file, add the router name to it when you send the event to TEC (use an action node for that), and then query it in an inline action script when the Router Up comes in. If there is a match, delete the name from the list and send the Router Up. An alternative would be to Set and Query Database fields on the router objects in the database. You can create your own field or use CorrState1 - 4. You set the field to indicate that you sent the trap, then you query it when the Router Up comes in and take action that way. Then clear the field. You get the idea, I'm sure.
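
A minimal sketch of the file-based variant follows, in Python for illustration only. The function names, the file name, and the one-name-per-line format are assumptions; in practice the two functions would be whatever script the action node and the inline action call.

```python
import tempfile
from pathlib import Path


def _load(path):
    """Read the recorded router names, one per line (empty set if no file yet)."""
    if not path.exists():
        return set()
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}


def _save(path, names):
    """Rewrite the file with the current set of names."""
    path.write_text("".join(name + "\n" for name in sorted(names)))


def record_down_sent(router, path):
    """Call when a Router Down is forwarded to TEC."""
    names = _load(path)
    names.add(router)
    _save(path, names)


def should_send_up(router, path):
    """Call when a Router Up arrives; True means forward it and clear the record."""
    names = _load(path)
    if router not in names:
        return False
    names.discard(router)
    _save(path, names)
    return True


with tempfile.TemporaryDirectory() as tmp:
    sent = Path(tmp) / "router_down_sent.txt"  # hypothetical file name
    record_down_sent("routerA", sent)
    first = should_send_up("routerA", sent)    # Down was sent, so forward the Up
    second = should_send_up("routerA", sent)   # record already cleared
    unknown = should_send_up("routerB", sent)  # no Down was ever sent
```

Because the record lives on disk, it does not depend on any timer or on the daemons staying up. The sketch does no file locking, so it assumes events are handled one at a time.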



James Shanks
Level 3 Support  for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group

Tom Hallberg <gimli@hhcrew.tk>
Sent by: owner-nv-l@lists.us.ibm.com
09/22/2004 03:41 AM
Please respond to: nv-l
To: nv-l@lists.us.ibm.com
cc:
Subject: [nv-l] Ruleset + up event.

Hi

I have a ruleset design problem. At the moment I have first a "Trap
Settings" node (for Router Down events), then an "Inline Action" to check
that it's one of the routers I want to have a status check on. After that I
have a "Reset on Match", because I also take in Router Up events so I can
reset on match within 5 minutes. The problem is that if a router has been
down for more than 5 minutes, the Down event is passed to TEC. Now say the
router comes up again, so we get a Router Up event. That Up event will not
be passed to our TEC. So are there any templates that can handle sending
only one Up event when a Down event has been passed to TEC? Or do I have to
make a new Inline Action to take care of the Up event that comes after
5 minutes?

The TEC guy doesn't want to have all Up events, because the net is quite big.

Thank you

//Tom

Attachment: trap_unix_unreach.rs
Description: Binary data


Archive operated by Skills 1st Ltd