I need a ruleset that detects when a node has gone down AND come back
up 3 times in 30 minutes. I'm trying to catch the condition where a
router reboots itself repeatedly. I'm close, but I'm missing some logic
that I'm not sure how to express within a ruleset.
The trick is catching the pattern: node down -> node up -> node down
-> node up -> node down -> node up
At first I thought I was being clever by just looking for 3 node down
events in 30 minutes. This didn't work. For example, we have a
router with several serial interfaces on it for our remote sites. One
thunderstorm on the Eastern Plains of Colorado and those nodes typically
"disappear" for a while (lightning and those remote 56K lines don't get
along so well <smile>). The problem is that my ruleset checks for 3
interface down traps in 30 minutes, keyed on the "origin" attribute. Well,
I learned that if an interface with 2 IP addresses configured on it goes
down, I'll get 3 interface down traps each time it goes down: one for each
of the 2 networks that are down on the interface, and one from the router
indicating that it has a down interface. These 3 traps all carry
the same "origin" attribute, so a single outage satisfies the ruleset.
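The false positive is easy to reproduce outside the ruleset. Below is a
minimal Python sketch of the counting logic, assuming the traps for one
origin have been reduced to arrival times in seconds. The function name
and the timestamps are made up for illustration and are not part of any
ruleset product.

```python
from collections import deque

WINDOW_SECONDS = 30 * 60   # the 30 minute window from the ruleset
THRESHOLD = 3              # number of "down" traps that triggers a page


def naive_trigger(down_times):
    """Return True if THRESHOLD 'down' traps fall inside one window.

    down_times: arrival times (seconds) of 'down' traps for ONE origin.
    Order of ups and downs is ignored, which is exactly the weakness.
    """
    recent = deque()
    for t in sorted(down_times):
        recent.append(t)
        # Drop traps that are older than the window relative to this one.
        while t - recent[0] > WINDOW_SECONDS:
            recent.popleft()
        if len(recent) >= THRESHOLD:
            return True
    return False


# Hypothetical data: one interface with two IP addresses goes down ONCE.
# Three traps arrive almost together, all with the same origin.
single_outage = [0, 1, 2]
print(naive_trigger(single_outage))
```

The call reports a match even though the node only went down once,
which is the behavior described above.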
I can't just check for 3 ups and 3 downs either, because that has the
same problem. What I really need is to ensure that I get those traps
in the order down, up, down, up, down, up, and only then page
out the problem. I need to create a ruleset where the order
of the traps over a period of time matters, and I don't understand
how to do this.
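To make the requirement concrete, here is the ordering logic sketched
as a small state machine in Python. It is not ruleset syntax, only an
illustration of the behavior being asked for. The class name, the
"down"/"up" strings, and the choice to collapse repeated traps of the
same kind into one transition are all assumptions of this sketch.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 30 * 60   # 30 minute window
CYCLES_NEEDED = 3          # down -> up must happen this many times


class FlapDetector:
    """Hypothetical per-origin detector for ordered down/up cycles."""

    def __init__(self, window=WINDOW_SECONDS, cycles=CYCLES_NEEDED):
        self.window = window
        self.cycles = cycles
        self.is_down = defaultdict(bool)     # origin -> currently down?
        self.down_since = {}                 # origin -> start of current outage
        self.completed = defaultdict(deque)  # origin -> start times of finished cycles

    def feed(self, origin, kind, timestamp):
        """Process one trap; return True when the page should go out."""
        if kind == "down":
            # Repeated "down" traps for one outage count as one transition.
            if not self.is_down[origin]:
                self.is_down[origin] = True
                self.down_since[origin] = timestamp
            return False
        if kind != "up" or not self.is_down[origin]:
            # Unknown kind, or an "up" with no preceding "down": ignore.
            return False
        # A "down" followed by an "up" completes one cycle.
        self.is_down[origin] = False
        cycles = self.completed[origin]
        cycles.append(self.down_since.pop(origin))
        # Forget cycles that started more than one window ago.
        while cycles and timestamp - cycles[0] > self.window:
            cycles.popleft()
        if len(cycles) >= self.cycles:
            cycles.clear()   # start counting afresh after paging
            return True
        return False


detector = FlapDetector()
# Hypothetical trap stream: a burst of 3 "down" traps, then real flapping.
traps = [("r1", "down", 0), ("r1", "down", 1), ("r1", "down", 2),
         ("r1", "up", 60), ("r1", "down", 300), ("r1", "up", 360),
         ("r1", "down", 600), ("r1", "up", 660)]
fired_at = [t for (o, k, t) in traps if detector.feed(o, k, t)]
print(fired_at)
```

With this approach the burst of three "down" traps from a single outage
counts as one transition, so only a genuine down, up, down, up, down, up
sequence inside the window fires. One limitation of the sketch: if
several different interfaces behind the same origin flap independently,
they are treated as a single node.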
Thank you --Greg Redder
Network Analyst
Colorado State University
==============================================================================
Greg Redder Academic Computing & Networking Services
Colorado State University, ACNS Phone:(970)491-7222 FAX: (970)491-1958
601 S. Howes, Room 625 E-mail: redder@yuma.colostate.edu
Fort Collins, CO 80523 PGP Fprint:68CEE78C86AC452881B27249785FEE91
==============================================================================