
Re: up/down ruleset

To: nv-l@lists.tivoli.com
Subject: Re: up/down ruleset
From: Tim Clark <Tim.Clark@TAVVE.COM>
Date: Mon, 17 Aug 1998 11:29:45 -0400
Reply-to: Discussion of IBM NetView and POLYCENTER Manager on NetView et alia <NV-L@UCSBVM.UCSB.EDU>
Sender: Discussion of IBM NetView and POLYCENTER Manager on NetView et alia <NV-L@UCSBVM.UCSB.EDU>
Check www.tavve.com/nmsweb

The "Check for reboots" reports are there for your review.


-----Original Message-----
From: Netview Operator <netview@NV1.HSNET.UFL.EDU>
To: NV-L@UCSBVM.UCSB.EDU <NV-L@UCSBVM.UCSB.EDU>
Date: Monday, August 17, 1998 11:24 AM
Subject: Re: up/down ruleset


>Hey Greg-
>
>Not sure whether you are after real time notification so you can do something
>about it or you just want to know about reboots (and finding out after the
>rebootee is back up is okay).  If the latter, a data collection on sysUpTime
>like:
>
>mode: Don't Store, Check Thresholds
>polling interval: 3m
>Trap number: 58720263
>Threshold: 180000
>source: <I used a wild card to match any node on our network and we only
>"manage" nodes of "interest" so we're not querying every IP address>
>rearm: 179999
>
>rearm event...
>Event Log Message: $3
>Popup notification (doesn't work): $2 Rebooted or Power-Failure
>Command for Automatic Action:(echo Sysuptime under 3 minutes at; date ;echo for
>$2 was it rebooted?) | /usr/bin/mail -s 'Sysuptime under 3 min $2' netmgrs
>
>so the email alias netmgrs gets email whenever the system uptime on a managed
>device falls below 3 minutes causing rearm of the data collection threshold.
>Sysuptime under 3 minutes is pretty much a guarantee the device restarted in the
>last three minutes and is more reliable than coldstart traps which may not make
>it to Netview anyway.
>
>Hope this helps.
>
>Randy Martin
>Shands Healthcare
>martirw@is1.hsnet.ufl.edu
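
The threshold/rearm behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not NetView's own logic: the class name RearmDetector is hypothetical, the numbers are copied as-is from the settings in the message (no unit conversion is attempted), and the exact comparisons (strictly above the threshold to arm, at or below the rearm value to fire) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RearmDetector:
    """Hypothetical model of a threshold with a rearm value (hysteresis)."""
    threshold: int = 180000   # value from the message, used as-is
    rearm: int = 179999       # value from the message, used as-is
    armed: bool = False       # True once a sample has exceeded the threshold

    def sample(self, value: int) -> Optional[str]:
        """Feed one polled sysUpTime value; return the event it triggers, if any."""
        if not self.armed and value > self.threshold:
            # Assumed rule: crossing above the threshold arms the collection.
            self.armed = True
            return "threshold"
        if self.armed and value <= self.rearm:
            # Assumed rule: falling back to the rearm value fires the rearm
            # event, which is the one that sends the mail in the setup above.
            self.armed = False
            return "rearm"
        return None


detector = RearmDetector()
polled = [500000, 800000, 1200, 60000, 250000]   # made-up sample values
events = [detector.sample(v) for v in polled]
```

In this sketch a device only produces a rearm event after it has been seen above the threshold at least once, so a node first polled shortly after a restart would not trigger anything until its next restart.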
>
>You wrote:
>
>> I need a ruleset that detects when a node has gone up AND down 3 times
>> in 30 minutes.  I'm looking for catching the condition whereby a
>> router reboots itself.  I'm close, but I'm missing some logic which
>> I'm not sure how to apply within a ruleset.
>>
>> The trick is catching the pattern:  node down -> node up -> node down
>> -> node up -> node down -> node up
>>
>> I thought I was clever at first, by just looking for receiving 3 node down
>> events in 30 minutes.  This didn't work because, for example, we have a
>> router with several serial interfaces on it for our remote sites.  One
>> thunderstorm on the Eastern Plains of Colorado and those nodes typically
>> "disappear" for a while (lightning and those remote 56K lines don't get
>> along so well <smile>).  The problem is that my ruleset checks for 3
>> interfaces down signals in 30 minutes from the "origin" attribute.  Well,
>> I learned that if an interface with 2 IP addresses configured on it goes
>> down, I'll get 3 interface down traps each time it goes down: 2 for each
>> network that's down on the interface and one from the router indicating
>> that it has a down interface.  The problem is that these 3 traps all carry
>> the same "origin" attribute and will satisfy the ruleset.
>>
>> You can't just check for 3 ups and 3 downs because you have the same
>> problem.  What I really need is to ensure that I get those traps
>> in the order of down,up,down,up,down,up and only then will I page
>> out the problem.  I need to create a ruleset where the order
>> of the traps over a period of time matters and I don't understand
>> how to do this???
>>
>> Thank you --Greg Redder
>>             Network Analyst
>>             Colorado State University
>>
>>
>> ===============================================================================
>> Greg Redder                         Academic Computing & Networking Services
>> Colorado State University, ACNS     Phone:(970)491-7222  FAX: (970)491-1958
>> 601 S. Howes, Room 625              E-mail: redder@yuma.colostate.edu
>> Fort Collins, CO 80523              PGP
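
The ordered pattern asked about in the original question (down, up, down, up, down, up within 30 minutes) can be sketched outside a ruleset as a small per-node state tracker. This is a minimal Python illustration under stated assumptions, not a NetView ruleset: the class FlapDetector, the "down"/"up" state strings, and timestamps in seconds are all hypothetical, and events are assumed to arrive in time order, keyed by whatever value the "origin" attribute carries.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 30 * 60   # 30-minute window from the question
CYCLES = 3                 # three down/up cycles


class FlapDetector:
    """Hypothetical tracker for an alternating down/up pattern per node."""

    def __init__(self, cycles=CYCLES, window=WINDOW_SECONDS):
        self.needed = 2 * cycles            # down,up repeated `cycles` times
        self.window = window
        self.history = defaultdict(deque)   # node -> deque of (timestamp, state)

    def event(self, node, state, timestamp):
        """Feed one 'down' or 'up' event; return True when the pattern completes."""
        hist = self.history[node]
        # Collapse repeats of the same state, e.g. several down traps that
        # share one origin during a single outage.
        if hist and hist[-1][1] == state:
            return False
        # The pattern has to begin with a down event.
        if not hist and state != "down":
            return False
        hist.append((timestamp, state))
        # Drop transitions that fell out of the window, and make sure the
        # remaining sequence still starts on a "down".
        while hist and (timestamp - hist[0][0] > self.window
                        or hist[0][1] != "down"):
            hist.popleft()
        if len(hist) >= self.needed and state == "up":
            hist.clear()   # start fresh after reporting
            return True
        return False


detector = FlapDetector()
sample_events = [            # made-up events for one node
    ("r1", "down", 0), ("r1", "down", 1), ("r1", "down", 2),
    ("r1", "up", 60), ("r1", "down", 300), ("r1", "up", 360),
    ("r1", "down", 600), ("r1", "up", 660),
]
results = [detector.event(*e) for e in sample_events]
```

Collapsing consecutive events of the same state is what keeps three down traps from one outage from counting as three outages. It does not separate different interfaces that report under the same origin, so interleaved outages on several serial lines could still look like one flapping node; keying the history on an interface identifier instead of the node would be one way to address that, if such a field is available.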
