James, I concur with everything you said.
Like I said, I don’t totally
understand it but the ruleset author who reads this forum can give you the
details. I totally agree with your concerns about the trap being “lost”
if automation is recycled. However, I always felt that a) NetView should be
stable enough that this doesn’t happen (true usually) and b) NetView should
already have a feature that eliminates the need for this.
Our approach is based on the fact that our
different groups all have different requirements (re: insanity) so we had to accommodate
all sorts of interesting down-wind considerations. The script that executes uses
the smartset name passed as a parameter as a key to what notification processes
to use. The last thing the script does is use postemsg to forward the event to
TEC.
Not necessarily pretty, but it does work
well.
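For anyone who wants a picture of that last script, here is a rough Python sketch of the idea, not our actual script. The smartset names, the handler functions, the event class and the source are all made up for illustration, and the postemsg flags shown (-S, -r, -m, then class and source) should be checked against your own TEC documentation before use.

```python
# Sketch only: pick notification handlers by smartset name, then build
# (but do not run) a postemsg command line to forward the event to TEC.
# All smartset names, handlers, class and source values are hypothetical.

def page_oncall(node):
    return f"page:{node}"

def email_team(node):
    return f"email:{node}"

# Smartset name -> list of notification handlers for that group.
NOTIFIERS = {
    "CoreRouters": [page_oncall, email_team],
    "BranchRouters": [email_team],
}
DEFAULT_HANDLERS = [email_team]

def notify(smartset, node):
    """Run every handler registered for the smartset and return the results."""
    handlers = NOTIFIERS.get(smartset, DEFAULT_HANDLERS)
    return [handler(node) for handler in handlers]

def postemsg_args(server, node, event_class="Example_Node_Down", source="NV6K"):
    """Build the argument list for the final forward-to-TEC step (assumed flags)."""
    return ["postemsg", "-S", server, "-r", "CRITICAL",
            "-m", f"Node down: {node}", event_class, source]
```

The script would call notify() with the smartset name it was given and then hand the postemsg argument list to something like subprocess.run() as its last step.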
The author of the ruleset is the worst
fantasy football manager I have ever seen.
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of James Shanks
Sent: Wednesday, September 22, 2004 9:32 AM
To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] Ruleset + up event.
Mighty complicated set of Reset-on-match and
Pass-on-Match ruleset nodes back to back. Not sure I follow it all.
But the key seems to be those scripts at the end so far as I can see. Whatever
happens, we pass them a Node Down event originally and then later a matching
Node Up. In between we'll wait 3 minutes before sending the original Node
Down, to see whether this is a false alarm, and then we'll wait up to 144 hours
(6 days) for the matching Node Up. All of that is just to launch
the appropriate scripts. How do they work?
The intervening razzle-dazzle of alternating Reset and Pass Nodes, the ones
with the 5-second wait intervals, seems to me to be some kind of timing
mechanism to guarantee that it will be possible to use the original Node Down
as the match criteria, for the much later Node Up. It delays the final
handling of the Node Down event so that there is time to save the Node Up
in the Pass-On-Match so that there will be something to match when the Node
Down is released. It's ingenious all right, and more than a little
unusual, as I don't recall seeing anything like it. Bet the original
developer would be surprised too.
But to my way of thinking, this way has dependencies I don't care for. The
biggest dependency I see here is that if you have to stop and restart the
daemons for any reason, then the held events are lost. I'd prefer
the setting and querying of database fields myself, since you wouldn't be time
or daemon dependent. You could preserve continuity over a much longer
time period and you don't have to worry about keeping the daemons up, should
you have to recycle them for some other reason.
James Shanks
Level 3 Support for Tivoli
NetView for UNIX and Windows
Tivoli Software / IBM Software Group
"Barr, Scott" <Scott_Barr@csgsystems.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/22/2004 09:47 AM
To: <nv-l@lists.us.ibm.com>
Subject: RE: [nv-l] Ruleset + up event.
Here is a ruleset that does just what you need.
It’s got some extra stuff in it because it processes a
bunch of different groups’ node downs, so you’ll want to strip out the
extra query smartsets and actions. Don’t try to figure out how it works;
I still don’t understand it, but a NetView guru helped me work it out
and it works like 100 bucks.
1. Accepts node up and node down traps
2. Holds the node down for 3 minutes
3. If the node up happens, pass the node up trap
4. If the node up trap does not happen in 3 minutes, execute the notification script
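The four steps above can be sketched in plain Python as a hold timer over timestamped events. This is only one reading of the steps, not the ruleset itself: it assumes every node up is passed through, that a node up cancels a held node down for the same node, and that the event tuples and the function name are invented for the example.

```python
# Sketch of the hold-and-cancel logic, under the assumptions stated above.
HOLD_SECONDS = 180  # step 2: hold the node down for 3 minutes

def _release_expired(pending, notified, now):
    """Move every node down held for HOLD_SECONDS or longer to the notify list."""
    for node, held_since in list(pending.items()):
        if now - held_since >= HOLD_SECONDS:
            notified.append(node)
            del pending[node]

def process(events, now):
    """events: time-sorted (timestamp_seconds, kind, node), kind is 'down' or 'up'.

    Returns (passed_ups, notified): the node ups that were passed on and the
    nodes for which the notification script would have run by time `now`.
    """
    pending = {}      # node -> time its node down started being held
    passed_ups = []
    notified = []
    for ts, kind, node in events:
        _release_expired(pending, notified, ts)
        if kind == "down":
            pending.setdefault(node, ts)      # step 2: hold it
        elif kind == "up":
            pending.pop(node, None)           # cancel the held down, if any
            passed_ups.append(node)           # step 3: pass the node up
    _release_expired(pending, notified, now)  # step 4: hold expired, notify
    return passed_ups, notified
```

Unlike the real ruleset, nothing here runs inside the NetView daemons, so it only shows the decision logic and not how the events are held.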
From:
owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of James Shanks
Sent: Wednesday, September 22, 2004 8:20 AM
To: nv-l@lists.us.ibm.com
Subject: Re: [nv-l] Ruleset + up event.
What you see is what you get, Tom.
If you want to pass a Router Up event, then you have to do it explicitly.
The logic you are saying you want here is much more complicated than just a simple
reset-on-match. What you have just said is that you want the Router Down
held for just five minutes and then passed to TEC if no Router Up. And
then you want something to "remember" that you passed this Router
Down, and pass a matching Router Up much later. Well, a
simple ruleset cannot do that. So you have to design something else more
sophisticated.
When you design a custom ruleset for TEC, you and the TEC guy have to work
together. He can code rules on his end, just as you can. I don't
see why you cannot send all Router Up events to TEC as harmless and let a TEC
rule over there match them to any open Router Downs, and if there are none, then
close them. Or let the operator close them. If he sees them, then
clearly there was no match so they no longer matter, right?
If you have to do this in NetView, then I think you'd have to do something like
this. You have to keep a record somewhere of Router Down events you sent
to TEC, and query that list when a Router Up comes in. One way, then, would
be to create a file, add the router name to it when you send the event to TEC (use an
action node for that), and then query it in an inline action script when the
Router Up comes in, and if there is a match, then delete the name from the list
and send the Router Up. An alternative would be to Set and Query Database
fields on the router objects in the database. You can create your own
field or use CorrState1 - 4. You set the field to indicate that you sent
the trap, then you could query it when the Router Up came in and take action
that way. Then clear the field. You get the idea, I'm sure.
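As a rough illustration of the file-based variant, here is a minimal Python sketch. The function names and the one-name-per-line file format are assumptions made for the example, and there is no file locking, so it presumes only one script touches the file at a time. The database-field alternative is not shown.

```python
# Sketch only: remember which Router Down events went to TEC in a flat file,
# and forward a Router Up only when its router is on that list.
from pathlib import Path

def _load(path):
    """Read the set of router names from the file (empty if it does not exist)."""
    p = Path(path)
    if not p.exists():
        return set()
    return {line.strip() for line in p.read_text().splitlines() if line.strip()}

def _save(path, names):
    """Write the router names back, one per line."""
    Path(path).write_text("".join(f"{name}\n" for name in sorted(names)))

def record_down_sent(path, router):
    """Call from the action that sends the Router Down to TEC."""
    names = _load(path)
    names.add(router)
    _save(path, names)

def should_forward_up(path, router):
    """Call when a Router Up arrives; True means send it to TEC.

    A match is removed from the list so only one Router Up is forwarded
    per Router Down that was sent.
    """
    names = _load(path)
    if router not in names:
        return False
    names.remove(router)
    _save(path, names)
    return True
```

Because the list lives on disk, it survives a recycle of the daemons, which is the point of doing it this way.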
James Shanks
Level 3 Support for Tivoli
NetView for UNIX and Windows
Tivoli Software / IBM Software Group
Tom Hallberg <gimli@hhcrew.tk>
Sent by: owner-nv-l@lists.us.ibm.com
09/22/2004 03:41 AM
To: nv-l@lists.us.ibm.com
Subject: [nv-l] Ruleset + up event.
Hi
I have a ruleset design problem. At the moment I first have a "Trap
Settings" node (for Router Down events), then an "Inline Action" to
check that it's one of the routers I want to have status checks on. After
that I have a "Reset on Match", because I also take in Router Up events so
I can reset on match within 5 minutes. The problem is that if a router has
been down for more than 5 minutes, the Down event is passed to TEC. Now say
the router comes up again, so we get a Router Up event. That Up event will
not be passed to our TEC. So are there any templates that can handle
sending only one Up event when a Down event has already been passed to TEC?
Or do I have to write a new Inline Action to take care of the Up event that
comes after 5 minutes?
The TEC guy doesn't want to receive all Up events, because the network is quite big.
Thank you
//Tom