
Re: [nv-l] ovstop hanging during nv6000_smit clear_topology_db_all

To: nv-l@lists.tivoli.com
Subject: Re: [nv-l] ovstop hanging during nv6000_smit clear_topology_db_all
From: jeff.ctr.vandenbussche@faa.gov
Date: Tue, 3 Jun 2003 08:53:29 -0400

James,

I had already done the ps while the ovstop was hung and the only daemons
left were nvsecd and ovspmd.  Clearing the topology was the first time I
noticed it.
I then restored the box from tape, and tried ovstop manually (w/o doing the
clear).  The first time I tried, it worked.  I then did an ovstart and then
ovstop, and this time it hung.  I will try this again to see which daemons
are running (I believe I checked and only nvsecd and ovspmd were running).
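
One way to capture that list automatically the next time it hangs is a small watchdog around ovstop. This is only a sketch, assuming a POSIX-style shell; run_with_watch and the time limit passed to it are made-up names and values, not part of NetView.

```shell
# Run a command in the background. If it is still running after
# LIMIT seconds, print the surviving /usr/OV processes and return 1.
# Returns 0 if the command finished in time.
# run_with_watch is a hypothetical helper, not a NetView command.
run_with_watch() {
    limit=$1
    shift
    "$@" &
    pid=$!
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "still running after ${limit}s: $*"
            # The [/] keeps grep from matching its own command line.
            ps -ef | grep '[/]usr/OV'
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Usage would be something like "run_with_watch 120 ovstop". It does not cancel the hung ovstop; it only reports what is still up at that point.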

Thanks,

Jeff




James Shanks <jshanks@us.ibm.com>
06/02/03 05:51 PM


        To:     nv-l@lists.tivoli.com
        cc:
        Subject:        Re: [nv-l] ovstop hanging during nv6000_smit
clear_topology_db_all




   Let's start over.   Basically I don't understand the whole deal, and I
think I've misled you.

First, clearing topology is something most people do once in a blue moon,
usually only while they are setting up the box, getting the seed file and
other options right, so why would you need to worry about this very often?
 You should be well past that stage with NetView for AIX Version 4.1.   If
the issue is really ovstop, then why muddy the water with clearing
topology?  Or is that the only time you see it?  I'm confused.

I took a quick look at the nv6000_smit script under the clear option.  Did you
trace it?  It doesn't do "ovstop nvsecd", which would be required to stop
both nvsecd and ovspmd.  It just does "ovstop", which means both of those
daemons are supposed to be up during the process.  You do know that, don't
you?  "ovstop <daemon>" takes down that daemon, as well as any who depend
on him (as defined in the ovsuf file).   Just plain "ovstop" without an
option will leave nvsecd and ovspmd both up.  "ovstop nvsecd" takes them
both down.   So  the script expects all daemons except those two to go
away.   By you killing ovspmd, the script is no longer waiting for the
ovstop to complete, but that doesn't mean it was ovspmd that was at fault.
 He can't end the ovstop until all the other daemons (except nvsecd of
course) are down.  So you need to try some debug to see who is still
active.  After you killed ovspmd, what others are still there?  "ps -ef |
grep /usr/OV" should show you.  Only nvsecd is expected.  The others
should be gone.
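
The check above can be wrapped so that only the unexpected survivors are printed. A minimal sketch, assuming the daemons show up in ps -ef with a /usr/OV path as described; ov_leftovers is a hypothetical helper name, not a NetView command.

```shell
# Read "ps -ef" output on stdin and print only the /usr/OV processes
# that a plain "ovstop" is not expected to leave behind.
# ov_leftovers is a hypothetical helper name.
ov_leftovers() {
    # [/] stops grep from matching its own entry in the ps listing.
    grep '[/]usr/OV' |
        grep -v nvsecd |
        grep -v ovspmd
}
```

Usage: "ps -ef | ov_leftovers". Empty output means nothing besides nvsecd and ovspmd is still running out of /usr/OV.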

My procedure when ovstop hangs is to cancel it, see who is still active
with ps -ef, and then try to ovstop them.  There really is no magic here.
If they won't go down with ovstop individually, then you have to use kill
-9 on the PID.  Then you might want to look in their logs or format the
nettl log to see if there are telltale messages about a problem.   As a
debug mechanism, rather than trying to ovstop everyone at once, you could
divide and conquer, eliminating suspects as you go.  For instance, you
could try moving the ovstop lower in the chain.  "ovstop trapd"   will
take down most of the well-behaved daemons.  Then "ovstop ovwdb".    Then
"ovstop pmd".    That should leave only nvsecd, ovspmd, and the
non-well-behaved ones.  These latter should come down rapidly when you do
"ovstop nvsecd".  Do they?    Basically, you have to try to pin down the
one that is holding up the works.
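
The divide-and-conquer order above could be scripted roughly as follows. A sketch only: staged_ovstop is a hypothetical name, the daemon order is the one given above, and it assumes ovstop is on the PATH.

```shell
# Stop the daemons in stages and show what is left after each stage.
# Stage order (trapd, ovwdb, pmd, nvsecd) follows the advice above.
# staged_ovstop is a hypothetical helper name.
staged_ovstop() {
    for daemon in trapd ovwdb pmd nvsecd; do
        # If a stage hangs, this is the last marker line printed.
        echo "== ovstop $daemon"
        ovstop "$daemon" || echo "ovstop $daemon returned non-zero"
        # List whatever is still running out of /usr/OV.
        ps -ef | grep '[/]usr/OV' || echo "(no /usr/OV processes left)"
    done
}
```

If one stage never returns, the last "== ovstop" line printed names the stage to look at, and the listing from the previous stage shows which daemons were still up going into it.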


James Shanks
Level 3 Support  for Tivoli NetView for UNIX and NT
Tivoli Software / IBM Software Group




jeff.ctr.vandenbussche@faa.gov
06/02/2003 04:24 PM


        To:     nv-l@lists.tivoli.com
        cc:
        Subject:        Re: [nv-l] ovstop hanging during nv6000_smit
clear_topology_db_all




James,

ovstop is hanging at times from the command line.  nvsecd is still running,
so it looks like that is the daemon it is hanging on.  If I reboot the box,
I can then ovstart/ovstop once or twice ok, but then ovstop hangs.

Any ideas/suggestions?

Thanks,

Jeff




James Shanks <jshanks@us.ibm.com>
06/02/03 01:00 PM


        To:     nv-l@lists.tivoli.com
        cc:
        Subject:        Re: [nv-l] ovstop hanging during nv6000_smit
clear_topology_db_all







Does it also hang from the command line, outside of your script?
I don't know why ovspmd should hang, unless he is still waiting for some
other daemon to stop (nvsecd has to come down first), or the message he's
waiting for got lost when you called this script inside your own script.
In any case, NetView 4.1 is too old even for me to find a good maintenance
history on.
But  nv6000_smit clear_topology_db_all is itself a script, and you can
trace it yourself if you haven't already.
  You don't even have to edit the script if you export
NV_TRACE=nv6000_smit  first.
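
For example, the traced run might be wrapped like this. A sketch under assumptions: trace_clear and the /tmp log path are made up for illustration, and since it is not stated here where the trace output lands, both stdout and stderr are captured.

```shell
# Run the clear with tracing enabled and save everything it prints.
# NV_TRACE=nv6000_smit is the setting named above; trace_clear and the
# default log path are hypothetical.
trace_clear() {
    trace_log=${1:-/tmp/nv6000_smit.trace}
    NV_TRACE=nv6000_smit
    export NV_TRACE          # stays exported in this shell afterwards
    nv6000_smit clear_topology_db_all > "$trace_log" 2>&1
    rc=$?
    echo "exit status $rc, output saved in $trace_log"
    return $rc
}
```

If the ovstop inside the script hangs, the tail of the saved file should show the last command the script reached.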

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and NT
Tivoli Software / IBM Software Group




jeff.ctr.vandenbussche@faa.gov
06/02/2003 12:31 PM


        To:     nv-l@lists.tivoli.com
        cc:
        Subject:        [nv-l] ovstop hanging during nv6000_smit
clear_topology_db_all



Has anyone ever had a problem running nv6000_smit clear_topology_db_all?
I am trying to run this from within a script, and it hangs during the first
ovstop.  It looks like it is a problem with ovspmd.  If I kill -9 ovspmd,
then the clear continues; otherwise the ovstop just sits there.

NV 4.1
AIX 4.1.5

I know these are dinosaurs, but it's what I have in the field at the moment.

Any suggestions?

Thanks,

Jeff



---------------------------------------------------------------------
To unsubscribe, e-mail: nv-l-unsubscribe@lists.tivoli.com
For additional commands, e-mail: nv-l-help@lists.tivoli.com

*NOTE*
This is not an Official Tivoli Support forum. If you need immediate
assistance from Tivoli please call the IBM Tivoli Software Group
help line at 1-800-TIVOLI8(848-6548)








Archive operated by Skills 1st Ltd

See also: The NetView Web