When trapd crashes on a signal 11 ("exit(11)" in ovstatus), it leaves behind a
file called "trapd.socket" in /usr/OV/sockets. If you haven't tried it yet,
delete that file, and trapd (and all the daemons that depend on it) should
start. They might all go back down again if trapd is still being flooded, but
at least they will come up.
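
For anyone who wants to script that cleanup, here is a minimal sketch in Python. The path is the one mentioned above; the function name remove_stale_socket is just something I made up for illustration, and the check that refuses to delete a directory is my own precaution, not anything NetView requires.

```python
import os
import stat

# Path taken from the note above; adjust it if your install differs.
TRAPD_SOCKET = "/usr/OV/sockets/trapd.socket"


def remove_stale_socket(path=TRAPD_SOCKET):
    """Delete a leftover socket file.

    Returns True if a file was removed, False if there was nothing to remove.
    (Hypothetical helper, not part of NetView.)
    """
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return False  # no leftover file, so nothing to do

    # Precaution: never remove a directory by mistake.
    if stat.S_ISDIR(mode):
        raise IsADirectoryError(path)

    os.unlink(path)
    return True
```

I would only run something like this with the daemons already stopped (for example after an ovstop), and then start them again afterwards.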
Steve
--
Steve Houle
Enterprise Management
Salomon Smith Barney
ph: (212) 723-3369
mailto:stephen.a.houle@ssmb.com
-----Original Message-----
From: Rob Rinear [mailto:robr@dirigo.com]
Sent: Monday, September 21, 1998 11:38 AM
To: NV-L@UCSBVM.UCSB.EDU
Subject: Re: ovtopmd not starting
Yes...I stopped them all, and even killed OVsPMD in an attempt to start
fresh. I've also pulled the network cable and watched the Events
display and trapd.log to ensure trap processing was idle, but these
daemons will not restart.
I completely agree that the real fix is to stop the flood of traps. I'm
trying to identify these traps and modify them to not log or display to
help keep Netview alive, until the network folks straighten out the
devices.
Until then, I'm still concerned that Netview's not bouncing back as it
should. Any other suggestions would be appreciated.
James_Shanks@TIVOLI.COM wrote:
>
> Did you take the other daemons down with an ovstop or not?
>
> If ovtopmd disconnects and goes down because he cannot connect to trapd, he
> won't be able to re-connect if trapd is too busy to talk to him. So it may
> be that trapd is still processing the hundreds of (apparently worthless)
> traps that are sitting on his input queue. The only way to flush that
> queue is to take down trapd. Then ovtopmd can connect to him and netmon
> can connect to both of them.
>
> The only real fix in your case is to stop those network agents from
> flooding the box.
>
> Personal opinion follows:
>
> .soapbox on
> It totally mystifies me why the defaults on some routers send identical
> traps to the trap receiver every so-many seconds. They should send one
> trap and not another until or unless the trap condition changes; or at
> least they should send them several minutes apart. But I see trapd logs
> from customers all the time where some box is sending the same trap every
> two or three seconds. Multiply that by a couple dozen of these boxes and
> pretty soon the management station on which NetView resides is using most
> of its cpu to pull in traps, format them, and then throw them away. But
> there is little NetView or any other trap receiver can do about that.
> Until you receive and decode the trap, you cannot tell what it is for. And
> once you have done that, there are always other processes which must
> inspect those traps to decide if they have work to do. The only way out of the
> hole is to stop it at the source and not configure remote agents to send
> traps too frequently.
> .soapbox off
>
> James Shanks
> Tivoli (NetView for UNIX) L3 Support
>
> Rob Rinear <robr@DIRIGO.COM> on 09/18/98 04:06:28 PM
>
> Please respond to Discussion of IBM NetView and POLYCENTER Manager on
> NetView et alia <NV-L@UCSBVM.UCSB.EDU>
>
> To: NV-L@UCSBVM.UCSB.EDU
> cc: (bcc: James Shanks)
> Subject: ovtopmd not starting
>
> I'm running AIX 4.2 with NV5.0 and have serious problems with the daemons.
> I have some devices that will at times flood Netview with traps - far too
> many for it to handle, and some of the daemons will eventually stop -
> trapd, netmon, ovtopmd. I understand this, per documentation in the Tivoli
> knowledge base, and have even attempted to increase the event queue, to no
> avail.
>
> My real problem is that, once this flurry is over, I cannot get ovtopmd to
> restart shy of a reboot. I get console messages:
> "Fatal Topology Error: Unable to connect to ovtopmd
> Reason: Cannot connect to server: sys 2: A file or directory in the path
> name does not exist."
> and
> "Fatal Topology Error: Unable to connect to trapd
> Reason: Topology OK -- no error"
>
> Anyone out there seen such a problem or have any suggestions?
>
> Rob Rinear
> Dirigo Incorporated
> Systems and Network Management Solutions
> (513) 421-6500
> robr@dirigo.com
> http://www.dirigo.com