Fw: Determining length of time a node is down

To: nv-l@lists.tivoli.com
Subject: Fw: Determining length of time a node is down
From: Karin Binder <karin.binder@NWA.COM>
Date: Fri, 9 Jul 1999 15:53:13 -0500
Reply-to: karin.binder@nwa.com
Sender: Discussion of IBM NetView and POLYCENTER Manager on NetView <NV-L@UCSBVM.UCSB.EDU>
Matt,

Thanks for your response.  I thought about doing something similar, but
thought if the NetView processes or databases already had the info I'd
rather take advantage of that (and have fewer scripts to maintain).  Still,
it's a viable alternative.  Thanks for offering to share - being able to
see what you've done for time calculations would be most helpful.  If you
don't mind, please send me a copy.

An additional question to add to the previous list:

I was looking through the nvcorrd logs, and it appears that there was some
correlation going on in nvcorrd prior to my use of correlation through the
ruleset.  For instance, after receiving an interface down trap for a device
with multiple interfaces, there are some messages in the log that seem to
indicate a correlation check, and then a node up trap is issued (other
interfaces were up at the time).  Is this documented anywhere?  I'd like to
know what correlation is already being performed, if there's a way to take
advantage of it, and further details on how to interpret the information
contained in the log.

Again, any info appreciated!

Thanks,
Karin


----------
> From: Matt Ashfield <mda@unb.ca>
> To: karin.binder@nwa.com
> Subject: Re: Determining length of time a node is down
> Date: Friday, July 09, 1999 2:38 PM
>
> I have a ruleset defined that lets me know if a node is down for more than
> ten minutes. It does that by sending me the timestamp and the name of the
> node. In that script, I also add the node to a down-database. From there, I
> run a script every 10 minutes or so which goes through the nodes in the
> down-database and sends an email telling me which ones are still down.
> In addition, I have another ruleset that, when it receives a NodeUp trap,
> checks the down-database for that node (to see if it had been down for more
> than 10 minutes). If it's there, it removes the node from the database and
> sends an email saying it's now up and how long it was down for.
>
> Hope this helps, if ya need a copy of the rulesets...let me know...
>
> Matt
> mda@unb.ca
>
>
> -----Original Message-----
> From: Karin Binder <karin.binder@nwa.com>
> To: NV-L@ucsbvm.ucsb.edu <NV-L@ucsbvm.ucsb.edu>
> Date: Friday, July 09, 1999 4:35 PM
> Subject: Determining length of time a node is down
>
>
> >Hello all,
> >
> >I have a need to report the length of time a node is down.
> >
> >I have been trying to use ruleset processing to work with the IBM_NDWN_EV
> >and IBM_NUP_EV events.  I receive the events, and can match the events for
> >a given device.  So far, so good.  The difficulty I am having is in trying
> >to access the attribute information from both traps (once they have been
> >matched) so I can pass them to a script for further processing.  I was able
> >to do it by setting the correlation value from the node_down and then the
> >node_up, but it seemed kludgy. And since there's only one set of rolling
> >correlation fields, I'd rather not waste it on this.
> >
> >I've tried the documentation (manuals, online, man pages), but it didn't go
> >into much depth.  In fact, that and some testing led to further questions:
> >
> >
> >1) Since the "pass on match" node can access event attributes from two
> >matched traps, is there any way I can access it too during ruleset
> >processing?  I'd like to obtain the $NVT and $NVATTR_4 from each of the
> >matched traps.
> >
> >2) Can someone please clarify which agent's sysUpTime is reported in the
> >sysUpTime Event Attribute Value? (Ref. Admin Guide p. 5-35)  From my
> >testing and viewing the debug output in nvcorrd logs, it does not appear to
> >be the uptime of the device being reported on in the trap (the device at
> >$NVA).
> >
> >3) When doing a demand poll of a node that is down, the output contains
> >"down since ....." and a date/time stamp.  Where is this information
> >stored, and is it accessible?
> >
> >
> >I'm curious if others are measuring this and how.  Open to any suggestions,
> >documentation references, etc.
> >
> >Thanks in advance,
> >Karin
> >
>
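
The down-database approach described in the quoted reply can be sketched
roughly as follows. This is only an illustration in Python, not the actual
rulesets or scripts from the thread: the class name DownDatabase, the helper
format_duration, and the node name "router1" are all hypothetical, and the
"database" is just an in-memory dictionary. It also assumes the script is
handed the node name and a timestamp for each node down / node up event. Unlike
the ruleset in the thread, which adds a node only after it has been down ten
minutes, this sketch records every down event and applies the threshold when
reporting.

```python
from datetime import datetime, timedelta

# Hypothetical threshold, mirroring the "more than ten minutes" rule in the thread.
THRESHOLD = timedelta(minutes=10)


class DownDatabase:
    """In-memory stand-in for a 'down-database' (node name -> time it went down)."""

    def __init__(self):
        self._down_since = {}

    def node_down(self, node, when):
        # Keep the earliest timestamp if repeated down events arrive for a node.
        self._down_since.setdefault(node, when)

    def still_down(self, now, threshold=THRESHOLD):
        # Nodes down longer than the threshold, for the periodic report.
        return sorted(
            (node, now - since)
            for node, since in self._down_since.items()
            if now - since > threshold
        )

    def node_up(self, node, when):
        # Remove the node and return its outage length, or None if it was not recorded.
        since = self._down_since.pop(node, None)
        if since is None:
            return None
        return when - since


def format_duration(delta):
    # Render a timedelta as hours, minutes and seconds.
    total = int(delta.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {seconds:02d}s"


db = DownDatabase()
db.node_down("router1", datetime(1999, 7, 9, 14, 0, 0))
outage = db.node_up("router1", datetime(1999, 7, 9, 15, 23, 45))
print(format_duration(outage))  # prints 1h 23m 45s
```

A real version would need the database to persist between script runs (a flat
file or small database), since each event would start a separate script
invocation.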


Archive operated by Skills 1st Ltd
