I was wondering the same thing, JT--our NV and TEC coexist too. We flipped on
nvserverd logging 2 days ago but haven't had any failures yet. It's just a
matter of time. Is there a pattern to when your event flow stops? Mike and
James, would the library problem mentioned here cause the intermittent
behavior we are seeing? I figured it would show as not getting events at
all.
Nice to know we're not alone!
Thanks--Drew
One other small thing.
The TEC server and NetView server are co-located (on the same servers). Could
that be our problem?
JT
JT:
I think that is a problem with the way your NetView is started. Try this:
run ovstop, then ovstop nvsecd, and then run /etc/netnmrc. Your problem is
with libraries that are not available, and calling /etc/netnmrc should pick
them up.
Regards,
Michael Pearson
Tivoli NetView for UNIX and NT Support
Building 660, Office CC105B; HWY. 54 & 600 PARK OFFICES DR
Research Triangle Park, N.C. 27709
(919) 254-2270
pearsom@us.ibm.com
Need help with Tivoli Software Products? Ask Tivoli!
http://www.tivoli.com/asktivoli

"Edwards, JT - ESM" <JEdwards3@wm.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/15/2004 02:35 PM
To: "'nv-l@lists.us.ibm.com'" <nv-l@lists.us.ibm.com>
cc:
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs
James and Jane. Found it:
************************************ NetView *******************************@#%
Timestamp       : Wed Sep 15 2004 13:34:20.493872
Process ID      : 46230
Subsystem       : OVEXTERNAL
User ID ( UID ) : 0
Log Class       : ERROR
Device ID       : -1
Path ID         : -1
Connection ID   : -1
Log Instance    : 0
Software        : /usr/OV/bin/nvserverd
Hostname        : ausu066a.wm.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Call to tec_create_handle failed, tec_errno = 827
Now what do I do?

-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of James Shanks
Sent: Wednesday, September 15, 2004 12:07 PM
To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs
To figure out what is wrong you have to answer the question, "How far do we
get?" Is nvserverd running? Check with ovstatus nvserverd. Are events going
to your cache file? The default location is /etc/Tivoli/tec/cache. If it's
growing with new events, then the adapter has lost contact with the server.
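
The cache check above can be sketched in Python as follows. This is only an illustration: the path is the default location named in the message, while the function name, the sampling interval, and the choice to treat a missing file as size 0 are assumptions. A growing file only suggests that the adapter is buffering events.

```python
import os
import time

# Default adapter cache location mentioned in the thread.
DEFAULT_CACHE = "/etc/Tivoli/tec/cache"


def cache_is_growing(path=DEFAULT_CACHE, interval=60.0, sleep=time.sleep):
    """Return True if the file at `path` is larger after `interval` seconds.

    Hypothetical helper: a missing or unreadable file counts as size 0.
    `sleep` is injectable so the check can be exercised without waiting.
    """
    def size():
        try:
            return os.path.getsize(path)
        except OSError:
            return 0

    before = size()
    sleep(interval)
    after = size()
    return after > before
```

Calling `cache_is_growing()` with no arguments samples the default cache file one minute apart; a longer interval may be more telling on a quiet system.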
You aren't getting an nvserverd.log file? Never seen that before if you are
running the executable which came with IY60528.
But you could look for TEC adapter errors in nettl. You have to format it
first. To do that you would use "netfmt -f nettl.LOG00 >
formatted.nettl.LOG00" and then do the same for LOG01, and go looking for
nvserverd entries. Some of them will be cryptic, but the one you would want
would say something about a tec_create_handle failure. Prior to 7.1.4, that
was the only place you could find adapter errors.
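
Once the log has been formatted, the search for nvserverd entries can be sketched in Python. This sketch assumes that each formatted record starts with a banner line of asterisks and carries a "Timestamp :" field, as in the entry JT posted in this thread; the function names are illustrative and the real layout may vary.

```python
import re

# Assumption: each formatted record begins with a banner line of asterisks,
# as in the sample entry quoted in this thread.
RECORD_START = re.compile(r"^\*{5,}")
TEC_ERRNO = re.compile(r"tec_create_handle failed, tec_errno\s*=\s*(\d+)")
TIMESTAMP = re.compile(r"Timestamp\s*:\s*(.+)")


def split_records(text):
    """Split formatted nettl output into a list of record strings."""
    records, current = [], []
    for line in text.splitlines():
        # A new banner closes the record collected so far.
        if RECORD_START.match(line) and current:
            records.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def find_tec_handle_failures(text):
    """Return (timestamp, tec_errno) for each nvserverd record that
    reports a tec_create_handle failure."""
    hits = []
    for record in split_records(text):
        if "nvserverd" not in record:
            continue
        err = TEC_ERRNO.search(record)
        if err:
            ts = TIMESTAMP.search(record)
            hits.append((ts.group(1).strip() if ts else None,
                         int(err.group(1))))
    return hits
```

To use it, read formatted.nettl.LOG00 into a string and pass it to `find_tec_handle_failures`; repeat for the LOG01 file.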
Another thing you should do is try running the nvcorrd trace and see whether
he has a forwardall.rs ruleset registered for nvserverd. Issue "nvcdebug -n"
and then "nvcdebug -d all" and go look at the nvcorrd logs. You should see
the current list of rulesets being run (nvcdebug -n) and then incoming events
being processed for forwardall.rs. When he processes them he writes a message
to the log which says he is forwarding the notification to appl <pid>. Check
the <pid>. It should be the process id (pid) for nvserverd.
Finally, you might try using the non-TME
adapter just as a test and see whether that works. But remember,
they use different executables. So for that you'd have to go
back through serversetup and reconfigure the adapter so that the right
daemon gets registered in ovsuf, and then you'd have to stop it and
modify the tecint.conf file to enable the tracing again, because the
reconfigure will wipe it out.
HTH

James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group

"Edwards, JT - ESM" <JEdwards3@wm.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/15/2004 12:04 PM
To: "'nv-l@lists.us.ibm.com'" <nv-l@lists.us.ibm.com>
cc:
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs
Well, we here at Waste Management are still having issues getting events to
flow to TEC. We are at 7.1.4 FP01 with the IY60528 patch installed.
I have no tracing and no signs that the nvtecia process (or subprocess) is
even working. Our ruleset (forwardall.rs) is set to pass. We have stopped and
restarted the nvserverd process several times.
The tecint.conf file reads as follows:

ServerLocation=@EventServer
TecRuleName=forwardall.rs
ServerPort=0
DefaultEventClass=TEC_ITS_BASE
Type=LCF
BufferEvents=YES
UseStateCorrelation=YES
StateCorrelationConfigURL=file:///usr/OV/conf/nvsbcrule.xml
## The following four lines are for debugging the state correlation engine
LogLevel=ALL
TraceLevel=ALL
LogFileName=/usr/OV/log/adptlog.out
TraceFileName=/usr/OV/log/adpttrc.out
## The following three lines alter nvserverd default behavior
NvserverdTraceTecEvents=YES
NvserverdPrimeTecEvents=NO
NvserverdSendSeverityTecEvents=YES
LCFINSTANCE=1

The two logfiles are not being created.
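
As a quick sanity check on a file like this, the following Python sketch parses the KEY=VALUE settings and lists the configured log and trace files that do not exist on disk. It assumes that lines starting with "#" are comments, as the "##" lines above suggest; the helper names are illustrative and are not part of NetView or TEC.

```python
import os


def parse_adapter_conf(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            settings[key.strip()] = value.strip()
    return settings


def tracing_enabled(settings):
    """True if the nvserverd event trace keyword quoted above is set to YES."""
    return settings.get("NvserverdTraceTecEvents", "").upper() == "YES"


def missing_log_files(settings, exists=os.path.exists):
    """Return the configured log/trace paths that are not present on disk."""
    paths = [settings.get(k) for k in ("LogFileName", "TraceFileName")]
    return [p for p in paths if p and not exists(p)]
```

With the file above, `missing_log_files(parse_adapter_conf(open("tecint.conf").read()))` would list both /usr/OV/log paths if neither file has been created. The location of tecint.conf is not given in the thread, so the file name here is a placeholder.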
ummmm HELP?!
JT

-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of James Shanks
Sent: Tuesday, September 14, 2004 10:11 AM
To: nv-l@lists.us.ibm.com
Subject: Re: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs
I'm not aware of anyone else
reporting a similar problem. Historically, however, the
adapter has always been load sensitive.
But let's clarify the issue a bit, shall we?
Are you saying that the adapter slows down or that it
hangs? Does the heartbeat event get there eventually? How
slow is it? Do things ever recover without your taking
everything down or not? How long does that take? How big is
this trap surge you are talking about?
There is no simple way to diagnose this issue
because there is the ZCE engine in the middle, as well as the fact
that nvserverd has no idea what's going on after he does
tec_put_event. As far as NetView is concerned, once that occurs,
the event has been sent. Whether it gets to the server or not is
the responsibility of the code in the TEC EEIF library. You can use
the conf file entry NvserverdTraceTecEvents=YES, or the corresponding
environment variable, to get an nvserverd.log, to see whether
nvserverd has given the event to the adapter in a timely fashion.
Then you would have to check the adapter's cache file, by
default /etc/Tivoli/tec/cache, and see whether it is caching events.
It will do that if communications with the server hiccup.
But it should recover from that automatically. When
communication is lost, it tries again on every subsequent event.
If the cache isn't growing, and nvserverd has logged the event,
then the problem is internal to the TEC code. To go deeper,
you'd have to get the TEC folks involved.
They might want you to get the Java adapter traces mentioned in the conf
file, or they might want a trace of the internals of the adapter library.
For that you'd have to obtain a special diagnosis file from them, called
".ed_diag_conf", and hook it in with a special entry in the conf file. But
then they'd have to read the traces. And all that would require that you
open a call to Support.
James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group

"Van Order, Drew (US - Hermitage)" <dvanorder@deloitte.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/14/2004 10:22 AM
To: <nv-l@lists.us.ibm.com>
cc:
Subject: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs
Hi all, After patching 7.1.4 FP01 with the
latest efix to fix nvcorrd/nvtecia hanging or stalling, we find it's
still happening. It mainly starts when we get a surge of Cisco syslog
traps from devices. The only piece not keeping up is the NV to TEC
integration; demandpolls are fine and events are moving in the Event
Browser. TEC_ITS only passes traps on, we do no other processing in
the ruleset. TEC events from sources outside NV are not impacted. We
send an hourly Interface Down trap via cron to serve as a heartbeat.
When it misses the second one in a row (as seen at TEC), we cycle NV
and it's OK again. MLM is not an option for our environment. Is anyone else
struggling with this? Thanks--Drew
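
The heartbeat rule Drew describes (cycle NetView once two hourly heartbeats in a row fail to reach TEC) can be sketched in Python. The one-hour period and the two-miss threshold come from the message; the five-minute grace window and the function names are assumptions added for illustration.

```python
from datetime import datetime, timedelta


def missed_heartbeats(last_seen, now,
                      period=timedelta(hours=1),
                      grace=timedelta(minutes=5)):
    """Count expected heartbeats that have not arrived since `last_seen`.

    `grace` (an assumption) allows for delivery delay before a beat
    is counted as missed.
    """
    overdue = now - last_seen - grace
    if overdue < period:
        return 0
    return overdue // period  # timedelta // timedelta yields an int


def should_cycle(last_seen, now, threshold=2, **kwargs):
    """True once `threshold` consecutive heartbeats have been missed."""
    return missed_heartbeats(last_seen, now, **kwargs) >= threshold
```

For a heartbeat last seen at 12:00, `should_cycle` stays False at 13:30 (one beat missed) and becomes True from 14:05 onward (two beats missed).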
*Disclaimer:*
This message (including any attachments) contains
confidential information intended for a specific individual and
purpose, and is protected by law. If you are not the intended
recipient, you should delete this message. Any disclosure, copying, or
distribution of this message, or the taking of any action based on it,
is strictly prohibited.