Ok, I am going to switch to the non-TME adapter. Second, we have no events caching in
/etc/Tivoli/tec.
I am going to try the things both James and Jane suggested (you guys rock and have
been my NetView saviors for years, hehe!).
I will report back my findings.
Many thanks!
JT
To figure out what is wrong you have to answer the
question, "How far do we get?"
Is nvserverd running? Check with "ovstatus nvserverd". Are events going
to your cache file? The default location is /etc/Tivoli/tec/cache.
If it's growing with new events, then the adapter has lost
contact with the server.
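One way to tell whether the cache is growing is to sample its size twice. This is only a minimal sketch in Python: the function name and the 60-second default interval are my own choices for illustration, and it assumes the cache is a single regular file at the path you pass in.

```python
import os
import time


def cache_growth(path, interval_seconds=60.0, sleep=time.sleep):
    """Return how many bytes `path` grew by over one sampling interval.

    A missing file is treated as size 0, so a cache that first appears
    during the interval still counts as growth.
    """
    def current_size():
        try:
            return os.path.getsize(path)
        except OSError:
            return 0

    before = current_size()
    sleep(interval_seconds)  # injectable so the check can run without waiting
    return current_size() - before
```

Calling cache_growth("/etc/Tivoli/tec/cache") while traps are arriving and getting a positive number back would suggest events are being cached rather than delivered.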
You aren't getting an nvserverd.log file? Never seen that before if you are
running the executable which came with IY60528. But you could look for TEC adapter errors in nettl.
You have to format it first. To
do that you would have to use "netfmt -f nettl.LOG00 >
formatted.nettl.LOG00" and then do the same for LOG01, and go looking for
nvserverd entries. Some of them will be cryptic, but the one you would
want would say something about a tec_create_handle failure. Prior to
7.1.4, that was the only place you could find adapter errors.
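Once the log is formatted, picking out the relevant lines can be scripted. The following is a rough Python sketch, not a NetView tool: the function names are made up for illustration, and it assumes nothing about the nettl record layout beyond the two strings mentioned above appearing somewhere in the text.

```python
def find_adapter_entries(lines, needles=("nvserverd", "tec_create_handle")):
    """Return (line_number, text) for every line mentioning any needle."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(needle in line for needle in needles):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan one formatted log file, e.g. formatted.nettl.LOG00."""
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(path, "r", errors="replace") as handle:
        return find_adapter_entries(handle)
```

Run scan_file on formatted.nettl.LOG00 and again on the LOG01 output, then read the lines around each hit, since a simple substring match will not show the surrounding context.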
Another thing you should do is try
running the nvcorrd trace and see whether he has a forwardall.rs ruleset
registered for nvserverd. Issue "nvcdebug -n" and then "nvcdebug -d all" and
go look at the nvcorrd logs. You should see the current list of rulesets
being run (nvcdebug -n) and then incoming events being processed for
forwardall.rs. When he processes them he writes a message to the
log which says he is forwarding the notification to appl <pid>.
Check the <pid>. It should be the process id (pid) for
nvserverd.
Finally, you might try using the non-TME adapter just as a test and see
whether that works. But
remember, they use different executables. So for that you'd have to go
back through serversetup and reconfigure the adapter so that the right daemon
gets registered in ovsuf, and then you'd have to stop it and modify the
tecint.conf file to enable the tracing again, because the reconfigure will
wipe it out.
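Since the reconfigure wipes out the tracing entries, it may help to check the file afterwards. Below is a small Python sketch under some assumptions: the key names are the ones in the tecint.conf posted further down in this thread, the file is treated as plain key=value lines with "#" comments, and whether every one of these keys is needed in your setup is a guess on my part. The function names are hypothetical.

```python
# Tracing-related keys as they appear in the tecint.conf quoted in this thread.
TRACE_KEYS = (
    "NvserverdTraceTecEvents",
    "LogLevel",
    "TraceLevel",
    "LogFileName",
    "TraceFileName",
)


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so URL values stay intact.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_trace_keys(text, required=TRACE_KEYS):
    """Return the tracing keys that are absent from the conf text."""
    settings = parse_conf(text)
    return [key for key in required if key not in settings]
```

Read your tecint.conf into a string and pass it to missing_trace_keys; an empty list means all of the listed entries survived the reconfigure.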
HTH
James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group
"Edwards, JT - ESM" <JEdwards3@wm.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/15/2004 12:04 PM
To: "'nv-l@lists.us.ibm.com'" <nv-l@lists.us.ibm.com>
cc:
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs
Well, we here at Waste Management are still having issues
getting events to flow to TEC. We are at 7.1.4 FP01 with the IY60528 patch
installed. I have no tracing and no signs that the nvtecia process (or
subprocess) is even working. Our ruleset (forwardall.rs) is set to pass. We have
stopped and restarted the nvserverd process several times. The tecint.conf
file reads as follows:
ServerLocation=@EventServer
TecRuleName=forwardall.rs
ServerPort=0
DefaultEventClass=TEC_ITS_BASE
Type=LCF
BufferEvents=YES
UseStateCorrelation=YES
StateCorrelationConfigURL=file:///usr/OV/conf/nvsbcrule.xml
## The following four lines are for debugging the state correlation engine
LogLevel=ALL
TraceLevel=ALL
LogFileName=/usr/OV/log/adptlog.out
TraceFileName=/usr/OV/log/adpttrc.out
## The following three lines alter nvserverd default behavior
NvserverdTraceTecEvents=YES
NvserverdPrimeTecEvents=NO
NvserverdSendSeverityTecEvents=YES
LCFINSTANCE=1
The two logfiles are not being created. ummmm HELP?!
JT
-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of James Shanks
Sent: Tuesday, September 14, 2004 10:11 AM
To: nv-l@lists.us.ibm.com
Subject: Re: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs
I'm not aware of anyone else reporting a similar problem.
Historically, however, the adapter has always been load sensitive. But
let's clarify the issue a bit, shall we? Are you saying that the adapter
slows down or that it hangs? Does the heartbeat event get there
eventually? How slow is it? Do things ever recover without your
taking everything down or not? How long does that take? How big is
this trap surge you are talking about?
There is no simple way to diagnose this issue
because there is the ZCE engine in the middle, as well as the fact that
nvserverd has no idea what's going on after he does tec_put_event. As
far as NetView is concerned, once that occurs, the event has been sent.
Whether it gets to the server or not is the responsibility of the code
in the TEC EEIF library. You can use the conf file entry
NvserverdTraceTecEvents=YES, or the corresponding environment variable, to get
an nvserverd.log, to see whether nvserverd has given the event to the adapter
in a timely fashion. Then you would have to check the adapter's
cache file, by default /etc/Tivoli/tec/cache, and see whether it is caching
events. It will do that if communications with the server hiccup.
But it should recover from that automatically. When communication
is lost, it tries again on every subsequent event. If the cache isn't
growing, and nvserverd has logged the event, then the problem is internal to
the TEC code. To go deeper, you'd have to get the TEC folks involved.
They might want you to get the java adapter traces mentioned in the
conf file, or they might want a trace of the internals of the adapter library.
For that you'd have to obtain a special diagnosis file from them,
called ".ed_diag_conf", and hook it in with a special entry in the conf
file. But then they'd have to read the traces. And all
that would require that you open a call to Support.
James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group
"Van Order, Drew (US - Hermitage)" <dvanorder@deloitte.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/14/2004 10:22 AM
To: <nv-l@lists.us.ibm.com>
cc:
Subject: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs
Hi all,
After patching 7.1.4 FP01 with the latest efix to fix
nvcorrd/nvtecia hanging or stalling, we find it's still happening. It mainly
starts when we get a surge of Cisco syslog traps from devices. The only piece
not keeping up is the NV to TEC integration; demandpolls are fine and events
are moving in the Event Browser. TEC_ITS only passes traps on, we do no other
processing in the ruleset. TEC events from sources outside NV are not
impacted. We send an hourly Interface Down trap via cron to serve as a
heartbeat. When it misses the second one in a row (as seen at TEC), we cycle
NV and it's OK again. MLM is not an option for our environment. Is anyone else
struggling with this?
Thanks--Drew