We have a multi-machine AIX 4.2.1 HACMP cluster.
We set things up this way:
Machine 1 TMR 3.6
Machine 2 NetView 5.1
Machine 3 TEC
Machine 4 Another application
M1 is paired with M2, and M3 is paired with M4; if either goes down, its
companion machine picks up the work.
All 4 machines have access to a shared disk cabinet.
Each machine has two Fast Ethernet cards: one is the main card (it carries
the machine's main IP address) and the other is for HACMP (configured with
the companion's IP address). If the companion machine goes down, the backup
card comes up with the companion's IP address; the surviving machine thus
assumes the other machine's IP address and runs its processes.
We installed one TMR just for NetView (this was the suggestion of IBM tech
support), and another to do the rest of the job (TEC goes in this TMR).
We put each TMR in a different filesystem, and each filesystem in a
different volume group (NetView goes in another filesystem, but in the same
volume group as its TMR).
We set every oserv to run on a different TCP port. If one machine goes
down, the companion machine first activates the backup interface, runs
varyonvg, and then mounts the corresponding filesystem. Since every TMR is
in its own filesystem, we had no problem mounting both filesystems on one
machine. Next we set up the environment variables and run the oserv, and
finally we run NetView (it is necessary to run reset_ci). Since the machine
has both IP addresses running, we have no problem with the traps.
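The takeover sequence above can be sketched roughly as the script below. The
volume group name (datavg), filesystem path (/tivoli/tmr2), oserv port
(9492), adapter (en1), and script paths are all hypothetical placeholders,
not our actual configuration; adjust for your site. DRYRUN=1 (the default
here) prints each command instead of executing it, so the sequence can be
reviewed safely.

```shell
#!/bin/sh
# Sketch of the takeover steps: activate the backup interface, varyonvg,
# mount the failed node's TMR filesystem, start its oserv, then (for the
# NetView TMR only) reset_ci and start NetView. All names are examples.
DRYRUN=${DRYRUN:-1}

run() {
    if [ "$DRYRUN" = "1" ]; then
        echo "+ $*"        # dry run: show the command only
    else
        "$@"               # real run: execute it
    fi
}

takeover() {
    # 1. Bring up the standby adapter with the failed node's IP address
    #    (HACMP normally handles this step itself).
    run ifconfig en1 inet 192.168.1.2 netmask 255.255.255.0 up
    # 2. Activate the failed node's volume group on the shared cabinet.
    run varyonvg datavg
    # 3. Mount that TMR's dedicated filesystem.
    run mount /tivoli/tmr2
    # 4. Set the Tivoli environment and start the second oserv on its
    #    own TCP port so it does not collide with the local one.
    run . /tivoli/tmr2/setup_env.sh
    run oserv -p 9492
    # 5. NetView TMR only: rebuild client info, then start NetView.
    run reset_ci
    run ovstart
}

takeover
```

Because both oservs live in separate filesystems and listen on separate
ports, the same machine can run its own TMR and the failed one side by side.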
Please see the TME 10 V5R1 for UNIX release notes, page 46, for more detail.
To be able to unmount /usr/OV, do this:
If AIX is on a local hard disk on each machine, you need to install NetView
on both machines, because NetView needs the part of the software that is
integrated with AIX. After that you must delete one copy (do not uninstall
it; just delete the filesystem).
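A minimal sketch of that cleanup on the backup node, written as echoes (a
dry run) so the destructive commands can be reviewed before use; it assumes
/usr/OV was created as its own filesystem, and must be run as root with the
echoes removed to take effect.

```shell
#!/bin/sh
# Remove the local /usr/OV copy so the shared-disk copy can be mounted
# in its place on failover. Drop the "echo" prefixes to really run it.
cleanup_usr_ov() {
    # Stop the NetView daemons first so nothing holds /usr/OV open.
    echo "ovstop"
    # Unmount the locally installed copy.
    echo "umount /usr/OV"
    # Delete the local filesystem itself -- do NOT run the NetView
    # uninstall, since the AIX-integrated pieces must stay installed.
    echo "rmfs /usr/OV"
}
cleanup_usr_ov
```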
If you are on the backup machine, run the reset_ci script before running
NetView.
Always run the setup_env script before running NetView (it is not necessary
to run ...
Chris Cowan wrote:
> A multi machine AIX 4.3.1 HACMP cluster.
> Machine 1 - TMR 3.6 Server
> Machine 2 - TEC 3.6 Server
> Machine 3 - Netview 5.1
> Tivoli Filesystems (/usr/local/Tivoli, /var/spool/Tivoli).
> Endpoints presently in the default /opt/Tivoli
> All 3 machines have access to a shared disk cabinet (Right now it's an
> IBM SSA, later it may be EMC).
> We are probably going to go with unique filesystems like:
> And then switch by making and breaking symlinks to /usr/local/Tivoli and
> The failover scenario is to move either the TMR (with priority) or
> the TEC server over to the Netview machine.
> The sequence would be:
> - Detect the machine going down
> - Stop Netview (ovstop)
> - Stop Netview's oserv
> - Umount Netview's MN Filesystems
> - Assume the failed-over machine's IP address
> - Mount failed machine's MN Filesystems
> - Start oserv
> The billion dollar question is:
> Can Netview be brought up again with a "new" Managed Node running
> underneath of it???????
> Obviously, we would have to do the "reset_ci" procedure since the
> hostname/ip address of the Netview Manager will have changed. How easy
> would it be to change the underlying MN configuration and keep
> the Administrative and NV Event Adapter stuff running.
> The following things come to my mind:
> - SNMP traps from agents would have to be retargeted (or all agents
> should be configured for multiple targets from the outset.)
> - All three MNs would have to have a Netview Server object installed on
> them.
> - The local snmpd and snmpd.conf may have to be messed with.
> - These three systems are also running Endpoints (for Event Adapter
> support). Presently, they are configured for /opt/Tivoli. I can
> reinstall them to run in /usr/local/Tivoli/lcf. (Probably a good idea
> for several reasons).
> Also, are there an implications of reversing this procedure?
> Please don't tell me it's bad idea, not my call. I may have to
> implement this regardless.