RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs

To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC _ITS.rs
From: James Shanks <jshanks@us.ibm.com>
Date: Thu, 16 Sep 2004 14:19:44 -0400
Delivery-date: Thu, 16 Sep 2004 19:42:47 +0100
Envelope-to: nv-l-archive@lists.skills-1st.co.uk
In-reply-to: <73633B2EE17A9E4ABB7495F1E3D8CD5F6E16DD@uscnt0414.us.deloitte.com>
Reply-to: nv-l@lists.us.ibm.com
Sender: owner-nv-l@lists.us.ibm.com

Drew,

You are asking the NetView guys about TEC libraries, and the short answer is that we don't know what the source of your problem is, or we'd tell you, and we'd fix it. To get to the bottom of TEC library issues we have to get TEC people involved, their Level 3 and development, because they haven't documented any cases where this doesn't work. At least they have not told us about them. The reason we know about errno 827 issues, for example, is that they have been found before, both internally and externally, and we got logs to see what the problem was so we could fix it. Ditto for the memory leak in the TEC EEIF library that was originally shipped with NetView 7.1.4 / TEC 3.9. Somebody had to see the problem, document it, and present that documentation to the folks who work on that code, in order for a fix to be made.

But so far I don't know about any problems or restrictions associated with running both an adapter and a TEC server on the same physical box.  That doesn't mean there aren't any.  It just means that none have been documented so far.

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group



"Van Order, Drew (US - Hermitage)" <dvanorder@deloitte.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/16/2004 11:46 AM
Please respond to nv-l

To: <nv-l@lists.us.ibm.com>
cc:
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs

I was wondering the same thing JT--our NV and TEC coexist too. We turned on nvserverd logging 2 days ago but haven't had any failures yet. It's just a matter of time. Is there a pattern to when your event flow stops? Mike and James, would the library problem mentioned here cause the intermittent behavior we are seeing? I figured it would show up as not getting events at all.
 
Nice to know we're not alone!
 
Thanks--Drew
-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of Edwards, JT - ESM
Sent: Wednesday, September 15, 2004 4:00 PM
To: 'nv-l@lists.us.ibm.com'
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs

One other small thing.
 
The TEC server and Netview server are co-located (on the same servers). Could that be our problem?
 
JT
-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of Mike Pearson
Sent: Wednesday, September 15, 2004 1:49 PM
To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs


JT:

I think that is a problem with the way your NetView is started. Try this: run ovstop, then ovstop nvsecd, and then run /etc/netnmrc. Your problem is that some libraries are not available, and starting through /etc/netnmrc should pick them up.
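
The restart sequence described above could be scripted roughly as follows. This is only a sketch: the three commands are the ones named in the message, while the function name, the injectable `runner` parameter, and the choice to stop at the first non-zero exit code are assumptions added here so that the sequence can be dry-run on a machine without NetView.

```python
import subprocess

# Commands taken from the message above, in the order given.
RESTART_SEQUENCE = [
    ["ovstop"],
    ["ovstop", "nvsecd"],
    ["/etc/netnmrc"],
]


def run_restart(runner=subprocess.run, sequence=RESTART_SEQUENCE):
    """Run each command in order and stop at the first non-zero exit.

    `runner` (hypothetical hook) must accept an argument list and return
    an object with a `returncode` attribute, as subprocess.run does.
    Returns the list of commands that completed successfully.
    """
    done = []
    for cmd in sequence:
        result = runner(cmd)
        if result.returncode != 0:
            # Assumption: a failing step means the later steps should not run.
            break
        done.append(cmd)
    return done
```

Passing a fake `runner` that only records its arguments gives a dry run; calling `run_restart()` with the default would execute the real commands and needs root on the NetView server.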


Regards,
Michael Pearson

Tivoli NetView for UNIX and NT Support
Building 660, Office  CC105B;
HWY. 54 & 600 PARK OFFICES DR
Research Triangle Park, N.C. 27709
(919) 254-2270
pearsom@us.ibm.com


"Edwards, JT - ESM" <JEdwards3@wm.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/15/2004 02:35 PM
Please respond to nv-l

To: "'nv-l@lists.us.ibm.com'" <nv-l@lists.us.ibm.com>
cc:
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs

James and Jane, found it:

 

************************************ NetView *******************************@#%
  Timestamp            :  Wed Sep 15 2004 13:34:20.493872
  Process ID           :  46230               Subsystem        : OVEXTERNAL
  User ID ( UID )      :  0                   Log Class        : ERROR
  Device ID            :  -1                  Path ID          : -1
  Connection ID        :  -1                  Log Instance     : 0
  Software             :  /usr/OV/bin/nvserverd
  Hostname             :  ausu066a.wm.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Call to tec_create_handle failed, tec_errno = 827


 

Now  what do I do?

 

-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of James Shanks
Sent: Wednesday, September 15, 2004 12:07 PM
To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs



To figure out what is wrong you have to answer the  question, "How far do we get?"


Is nvserverd running? Check with "ovstatus nvserverd". Are events going to your cache file? The default location is /etc/Tivoli/tec/cache. If it's growing with new events, then the adapter has lost contact with the server.
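
One way to check whether the cache is growing is to compare its size at two points in time. The sketch below assumes the cache is a single file at the default location given above; the function name, the 60-second interval, and the injectable `sleep` hook are illustrative choices, not part of NetView or TEC.

```python
import os
import time


def cache_is_growing(path="/etc/Tivoli/tec/cache", interval=60.0, sleep=time.sleep):
    """Return True if the file at `path` is larger after `interval` seconds.

    A missing or unreadable file is treated as size 0.
    """
    def size():
        try:
            return os.path.getsize(path)
        except OSError:
            return 0

    before = size()
    sleep(interval)
    return size() > before
```

A True result would match the case described above, where the adapter is buffering events because it cannot reach the server. A False result only says the file did not grow during that one interval.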


You  aren't getting an nvserverd.log file?   Never seen that before if you are  running the executable which came with IY60528.

But you could look for TEC adapter errors in nettl.   You have to format it first.

To do that you would have to use "netfmt -f nettl.LOG00 > formatted.nettl.LOG00", then do the same for LOG01, and go looking for nvserverd entries. Some of them will be cryptic, but the one you would want would say something about a tec_create_handle failure. Prior to 7.1.4, that was the only place you could find adapter errors.
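
Once the nettl log has been formatted, the search for tec_create_handle failures can be automated. The following is a minimal sketch that assumes the error line looks like the one quoted elsewhere in this thread ("Call to tec_create_handle failed, tec_errno = 827"); other NetView levels may word it differently, and the function name is hypothetical.

```python
import re

# Pattern modelled on the error text quoted in this thread; whitespace is
# matched loosely because formatted log output may contain extra spaces.
FAILURE = re.compile(
    r"Call\s+to\s+tec_create_handle\s+failed,\s*tec_errno\s*=\s*(\d+)"
)


def find_tec_handle_failures(text):
    """Return the tec_errno value of every tec_create_handle failure in `text`."""
    return [int(m.group(1)) for m in FAILURE.finditer(text)]
```

Typical use would be to read formatted.nettl.LOG00 into a string and pass it to `find_tec_handle_failures`; an empty list means no matching entry was found in that file.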


Another thing you should do is try running the nvcorrd trace and see whether he has a forwardall.rs ruleset registered for nvserverd. Issue "nvcdebug -n" and then "nvcdebug -d all" and go look at the nvcorrd logs. You should see the current list of rulesets being run (nvcdebug -n) and then incoming events being processed for forwardall.rs. When he processes them he writes a message to the log which says he is forwarding the notification to appl <pid>. Check the <pid>. It should be the process id (pid) for nvserverd.


Finally, you might try  using the non-TME adapter just as a test and see whether that works.  But  remember, they use different executables.  So for that you'd have to go  back through serversetup and reconfigure the adapter so that the right daemon  gets registered in ovsuf, and then you'd have to stop it and modify the  tecint.conf file to enable the tracing again, because the reconfigure will  wipe it out.


HTH


James Shanks

Level 3 Support  for Tivoli  NetView for UNIX and Windows

Tivoli Software / IBM Software Group
 

"Edwards, JT - ESM" <JEdwards3@wm.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/15/2004 12:04 PM
Please respond to nv-l

To: "'nv-l@lists.us.ibm.com'" <nv-l@lists.us.ibm.com>
cc:
Subject: RE: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs

Well, we here at Waste Management are still having issues getting events to flow to TEC.

 

We are at 7.1.4 FP 01 with IY60528 patch  installed.

 

I have no tracing and no signs that the nvtecia process (or subprocess) is even working. Our ruleset (forwardall.rs) is set to pass. We have stopped and restarted the nvserverd process several times.

 

The tecint.conf  file reads as follows

 

ServerLocation=@EventServer
TecRuleName=forwardall.rs
ServerPort=0
DefaultEventClass=TEC_ITS_BASE
Type=LCF
BufferEvents=YES
UseStateCorrelation=YES
StateCorrelationConfigURL=file:///usr/OV/conf/nvsbcrule.xml
## The following four lines are for debugging the state correlation engine
LogLevel=ALL
TraceLevel=ALL
LogFileName=/usr/OV/log/adptlog.out
TraceFileName=/usr/OV/log/adpttrc.out
## The following three lines alter nvserverd default behavior
NvserverdTraceTecEvents=YES
NvserverdPrimeTecEvents=NO
NvserverdSendSeverityTecEvents=YES
LCFINSTANCE=1
 
The two logfiles are not being  created.
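
A quick way to confirm which of the configured log files are missing is to parse the conf file and test each path. This is a sketch only: it assumes tecint.conf is plain KEY=VALUE text with "#" comment lines, as the listing above suggests, and both function names are hypothetical.

```python
import os


def parse_conf(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values such as URLs stay intact.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_log_files(settings, keys=("LogFileName", "TraceFileName")):
    """Return the configured log paths that do not exist on disk."""
    paths = [settings[k] for k in keys if k in settings]
    return [p for p in paths if not os.path.exists(p)]
```

Run against the conf file above, `missing_log_files` would list /usr/OV/log/adptlog.out and /usr/OV/log/adpttrc.out if neither has been created. It does not explain why they are missing.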

 

ummmm HELP?!

 

JT

-----Original Message-----
From: owner-nv-l@lists.us.ibm.com [mailto:owner-nv-l@lists.us.ibm.com] On Behalf Of James Shanks
Sent: Tuesday, September 14, 2004 10:11 AM
To: nv-l@lists.us.ibm.com
Subject: Re: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs



I'm not aware of anyone else reporting a similar problem.     Historically, however, the adapter has always been load  sensitive.

But  let's clarify the issue a bit, shall we?  Are you saying that the adapter  slows down  or that it hangs?  Does the heartbeat event get there  eventually?  How slow is it?  Do things ever recover without your  taking everything down or not?  How long does that take?  How big is  this trap surge you are talking about?


There is no simple way to diagnose this issue  because there is the ZCE engine in the middle, as well as the fact that  nvserverd has no idea what's going on after he does tec_put_event.  As  far as NetView is concerned, once that occurs, the event has been sent.   Whether it gets to the server or not is the responsibility of the code  in the TEC EEIF library. You can use the conf file entry  NvserverdTraceTecEvents=YES, or the corresponding environment variable, to get  an nvserverd.log, to see whether nvserverd has given the event to the adapter  in a timely fashion.  Then you would have to check the  adapter's  cache file, by default /etc/Tivoli/tec/cache, and see whether it is caching  events.  It will do that if communications with the server hiccup.     But it should recover from that automatically.  When communication  is lost, it tries again on every subsequent event.   If the cache isn't  growing, and nvserverd has logged the event, then the problem is internal to  the TEC code.  To go deeper,  you'd have to get the TEC folks  involved.  
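
The reasoning in the paragraph above boils down to two observations: did nvserverd log the event, and is the cache growing? A small sketch of that decision logic follows. The function name and the wording of the returned strings are illustrative, and the first branch (nothing logged by nvserverd) is one reading of the text rather than something it states outright.

```python
def locate_stall(nvserverd_logged_event, cache_growing):
    """Map the two observations to the likely location of the stall."""
    if not nvserverd_logged_event:
        # Assumption: if nvserverd never logged the event, the delay is
        # upstream of the adapter, on the NetView side.
        return "NetView side: nvserverd has not handed the event to the adapter"
    if cache_growing:
        # The adapter buffers events when it cannot reach the server.
        return "communication: adapter is caching because the TEC server is unreachable"
    # Event was handed over and is not being cached.
    return "TEC code: involve TEC support"
```

For example, `locate_stall(True, False)` corresponds to the case where nvserverd.log shows the event and the cache file is not growing, which is the point at which the TEC folks need to get involved.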


They might want you to get the java adapter traces mentioned in the conf file, or they might want a trace of the internals of the adapter library. For that you'd have to obtain a special diagnosis file from them, called ".ed_diag_conf", and hook it in with a special entry in the conf file. But then they'd have to read the traces. And all that would require that you open a call to Support.


James Shanks

Level 3  Support  for Tivoli NetView for UNIX and Windows

Tivoli Software / IBM  Software Group

"Van Order, Drew (US - Hermitage)" <dvanorder@deloitte.com>
Sent by: owner-nv-l@lists.us.ibm.com
09/14/2004 10:22 AM
Please respond to nv-l

To: <nv-l@lists.us.ibm.com>
cc:
Subject: [nv-l] nvtecia still hanging or falling behind processing TEC_ITS.rs

Hi  all,

 

After patching 7.1.4 FP01 with the latest efix to fix  nvcorrd/nvtecia hanging or stalling, we find it's still happening. It mainly  starts when we get a surge of Cisco syslog traps from devices. The only piece  not keeping up is the NV to TEC integration; demandpolls are fine and events  are moving in the Event Browser. TEC_ITS only passes traps on, we do no other  processing in the ruleset. TEC events from sources outside NV are not  impacted. We send an hourly Interface Down trap via cron to serve as a  heartbeat. When it misses the second one in a row (as seen at TEC), we cycle  NV and it's OK again. MLM is not an option for our environment. Is anyone else  struggling with this?
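
The heartbeat check described above (hourly trap, act when the second one in a row is missed) can be expressed as a small calculation on the time the last heartbeat was seen at TEC. This is a sketch: the function names and the 5-minute grace period are assumptions, and how the "last seen" time would be obtained from TEC is left out.

```python
from datetime import timedelta


def missed_heartbeats(last_seen, now, period=timedelta(hours=1),
                      grace=timedelta(minutes=5)):
    """Count heartbeats that should have arrived since `last_seen` but did not.

    `grace` (an assumed allowance) keeps a slightly late heartbeat from
    being counted as missed.
    """
    overdue = now - last_seen - grace
    # timedelta // timedelta yields an int; clamp negatives to zero.
    return max(0, overdue // period)


def should_cycle(last_seen, now, threshold=2, **kwargs):
    """True once `threshold` consecutive heartbeats have been missed."""
    return missed_heartbeats(last_seen, now, **kwargs) >= threshold
```

With a last heartbeat at 10:00, `should_cycle` stays False at 11:30 (one miss) and becomes True shortly after 12:05, once the 12:00 heartbeat is also overdue.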

 

Thanks--Drew

 

*Disclaimer:*
This message (including any attachments) contains  confidential information intended for a specific individual and purpose, and  is protected by law. If you are not the intended recipient, you should delete  this message. Any disclosure, copying, or distribution of this message, or the  taking of any action based on it, is strictly prohibited.


Archive operated by Skills 1st Ltd
