
Re: [nv-l] ovstop hanging during nv6000_smit clear_topology_db_all

To: nv-l@lists.tivoli.com
Subject: Re: [nv-l] ovstop hanging during nv6000_smit clear_topology_db_all
From: James Shanks <jshanks@us.ibm.com>
Date: Mon, 2 Jun 2003 17:51:51 -0400

Let's start over. Basically, I don't understand the whole situation, and I
think I've misled you.

First, clearing topology is something most people do once in a blue moon,
usually only while they are setting up the box and getting the seed file
and other options right, so why would you need to worry about this very
often? You should be well past that stage with NetView for AIX Version 4.1.
If the issue is really ovstop, then why muddy the water with clearing
topology? Or is that the only time you see it? I'm confused.

I took a quick look at the nv6000_smit script under the clear option. Did
you trace it? It doesn't do "ovstop nvsecd", which would be required to
stop both nvsecd and ovspmd. It just does "ovstop", which means both of
those daemons are supposed to stay up during the process. You do know
that, don't you? "ovstop <daemon>" takes down that daemon, as well as any
that depend on it (as defined in the ovsuf file). Just plain "ovstop"
without an option will leave nvsecd and ovspmd both up. "ovstop nvsecd"
takes them both down. So the script expects all daemons except those two
to go away. Once you kill ovspmd, the script is no longer waiting for the
ovstop to complete, but that doesn't mean it was ovspmd that was at fault.
He can't end the ovstop until all the other daemons (except nvsecd, of
course) are down. So you need to do some debugging to see who is still
active. After you killed ovspmd, which others were still there?
"ps -ef | grep /usr/OV" should show you. Only nvsecd is expected. The
others should be gone.
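
To make that check repeatable, here is a minimal sketch of a filter for
"ps -ef" output. The function name leftover_ov_daemons is made up for
illustration. It only assumes, as described above, that NetView processes
show /usr/OV in their command line and that nvsecd is the one daemon
allowed to remain.

```shell
# Print every process line that mentions /usr/OV, except nvsecd and the
# grep/awk commands used for the search itself.
# Reads "ps -ef" style text on stdin, so it also works on saved output.
# (leftover_ov_daemons is an illustrative name, not a NetView command.)
leftover_ov_daemons() {
    awk '/\/usr\/OV/ && !/grep/ && !/awk/ && !/nvsecd/'
}

# Typical use after a hung ovstop has been cancelled:
#   ps -ef | leftover_ov_daemons
```

If that prints nothing, only nvsecd (at most) is left. Anything it does
print is a candidate for the daemon that is holding up the ovstop.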

My procedure when ovstop hangs is to cancel it, see who is still active
with ps -ef, and then try to ovstop them. There really is no magic here.
If they won't go down with ovstop individually, then you have to use
kill -9 on the PID. Then you might want to look in their logs or format
the nettl log to see if there are telltale messages about a problem. As a
debug mechanism, rather than trying to ovstop everyone at once, you could
divide and conquer, eliminating suspects as you go. For instance, you
could try moving the ovstop lower in the chain. "ovstop trapd" will take
down most of the well-behaved daemons. Then "ovstop ovwdb". Then
"ovstop pmd". That should leave only nvsecd, ovspmd, and the
non-well-behaved ones. These latter should come down rapidly when you do
"ovstop nvsecd". Do they? Basically, you have to try to pin down the one
that is holding up the works.
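
The divide-and-conquer sequence could be wrapped up roughly like this. It
is only a sketch: staged_ovstop and the OVSTOP variable are names invented
here (OVSTOP is not a NetView setting; it exists so the loop can be
dry-run), and the daemon order is simply the one given above.

```shell
# Command used to stop a daemon. Defaults to the real ovstop; set it to
# "echo ovstop" for a dry run. (OVSTOP is a hypothetical helper variable.)
OVSTOP=${OVSTOP:-ovstop}

# Stop the daemons in stages rather than all at once. The last
# "stopping ..." line printed before a hang names the stage to investigate.
staged_ovstop() {
    for daemon in trapd ovwdb pmd nvsecd; do
        echo "stopping $daemon and its dependents"
        $OVSTOP "$daemon"
    done
}
```

For a dry run, set OVSTOP="echo ovstop" and call staged_ovstop; it then
prints the four ovstop commands instead of running them. On the real box,
running "ps -ef | grep /usr/OV" between stages shows who is left.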
 

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and NT
Tivoli Software / IBM Software Group




jeff.ctr.vandenbussche@faa.gov
06/02/2003 04:24 PM

 
        To:     nv-l@lists.tivoli.com
        cc: 
        Subject:        Re: [nv-l] ovstop hanging during nv6000_smit 
clear_topology_db_all




James,

ovstop is hanging at times from the command line. nvsecd is still running,
so it looks like that is the daemon it is hanging on. If I reboot the box,
I can then ovstart/ovstop once or twice OK, but then ovstop hangs.

Any ideas/suggestions?

Thanks,

Jeff



  
James Shanks <jshanks@us.ibm.com>
06/02/03 01:00 PM

        To:     nv-l@lists.tivoli.com
        cc:
        Subject:        Re: [nv-l] ovstop hanging during nv6000_smit
clear_topology_db_all




Does it also hang from the command line, outside of your script?
I don't know why ovspmd should hang, unless he is still waiting for some
other daemon to stop (nvsecd has to come down first), or the message he's
waiting for got lost when you called this script inside your own script.
In any case, NetView 4.1 is too old even for me to find a good maintenance
history on.
But nv6000_smit clear_topology_db_all is itself a script, and you can
trace it yourself if you haven't already. You don't even have to edit the
script if you export NV_TRACE=nv6000_smit first.
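
As a rough sketch, that could look like the following. The log file name
is just an example, and capturing the trace by redirecting stdout and
stderr is an assumption about where the trace output goes.

```shell
# Turn on tracing for the nv6000_smit script, as described above.
NV_TRACE=nv6000_smit
export NV_TRACE

# Then run the clear and keep the output for later reading.
# /tmp/clear_topology.trace is a hypothetical path. Uncomment on the
# NetView box:
# nv6000_smit clear_topology_db_all > /tmp/clear_topology.trace 2>&1
```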

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and NT
Tivoli Software / IBM Software Group




jeff.ctr.vandenbussche@faa.gov
06/02/2003 12:31 PM


        To:     nv-l@lists.tivoli.com
        cc:
        Subject:        [nv-l] ovstop hanging during nv6000_smit
clear_topology_db_all



Has anyone ever had a problem running nv6000_smit clear_topology_db_all?
I am trying to run this from within a script, and it hangs during the
first ovstop. It looks like it is a problem with ovspmd. If I kill -9
ovspmd, then the clear continues; otherwise the ovstop just sits there.

NV 4.1
AIX 4.1.5

I know these are dinosaurs, but it's what I have in the field at the moment.

Any suggestions?

Thanks,

Jeff



---------------------------------------------------------------------
To unsubscribe, e-mail: nv-l-unsubscribe@lists.tivoli.com
For additional commands, e-mail: nv-l-help@lists.tivoli.com

*NOTE*
This is not an Official Tivoli Support forum. If you need immediate
assistance from Tivoli please call the IBM Tivoli Software Group
help line at 1-800-TIVOLI8(848-6548)





Archive operated by Skills 1st Ltd
