Hi, James!
Here's the output of my ps -ef:
root@netview [/usr/OV/log] # ps -ef | grep netfmt
root 28067 1 0 15:45 ? 00:00:00 netfmt -CF
root 23113 1 0 15:48 ? 00:00:00 netfmt -CF
root 23748 1 0 15:48 ? 00:00:00 netfmt -CF
root 24536 1 0 15:48 ? 00:00:00 netfmt -CF
root 2472 2471 0 15:49 ? 00:00:00 netfmt -CF
root 8020 9132 0 15:53 pts/0 00:00:00 grep netfmt
root@netview [/usr/OV/log] # ps -ef | grep 2471
root 2471 1 0 15:49 ? 00:00:00 /usr/OV/bin/ntl_reader 0 1 1 1 1
root 2472 2471 0 15:49 ? 00:00:00 netfmt -CF
root 8018 9132 0 15:53 pts/0 00:00:00 grep 2471
And these have all appeared since I had to restart my machine 50 minutes ago.
I performed a nettl -stop, but the netfmt processes whose parent is PID 1
were still running, so I killed them and then restarted nettl.
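
In case it helps anyone reproduce this, here is a rough sketch of how the orphaned netfmt processes could be picked out of the ps -ef output. It assumes the Linux column order shown above (UID PID PPID C STIME TTY TIME CMD); the function name orphaned_netfmt_pids is made up for illustration and is not part of NetView.

```shell
# Hypothetical helper: reads `ps -ef` output on stdin and prints the PID of
# every netfmt process whose parent is PID 1 (init).
# Assumes the Linux `ps -ef` column order: UID PID PPID C STIME TTY TIME CMD.
orphaned_netfmt_pids() {
    # $3 is the parent PID and $8 is the command name (with or without a path).
    awk '$3 == 1 && $8 ~ /(^|\/)netfmt$/ { print $2 }'
}

# Example use on a live system (review the list before killing anything):
#   ps -ef | orphaned_netfmt_pids
#   kill $(ps -ef | orphaned_netfmt_pids)
```

Matching on the parent PID column leaves alone the one netfmt that belongs to the running ntl_reader (PID 2472 above) as well as the grep itself.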
Here are some of the nettl log messages:
************************************ NetView *******************************@#%
Timestamp : Mon May 10 2004 10:06:07.308834
Process ID : 9774 Subsystem : SECURITY
User ID ( UID ) : 0 Log Class : ERROR
Device ID : -1 Path ID : -1
Connection ID : -1 Log Instance : 0
Software : /usr/OV/bin/ovw
Hostname : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OVwUserSecurity() error 4 on waitpid
************************************ NetView *******************************@#%
Timestamp : Mon May 10 2004 15:08:45.118009
Process ID : 1609 Subsystem : OVW
User ID ( UID ) : 0 Log Class : ERROR
Device ID : -1 Path ID : -1
Connection ID : -1 Log Instance : 0
Software : /usr/OV/bin/ipmap
Hostname : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IPMap error in symbolMgr::flushSymbols - OVwCreateSymbols - (OVwError = 80): Object not found.
************************************ NetView *******************************@#%
Timestamp : Mon May 10 2004 15:08:45.118101
Process ID : 1609 Subsystem : OVW
User ID ( UID ) : 0 Log Class : ERROR
Device ID : -1 Path ID : -1
Connection ID : -1 Log Instance : 0
Software : /usr/OV/bin/ipmap
Hostname : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed to create symbol: 172.23.6.25. OVwError =80: Object not found.
************************************ NetView *******************************@#%
Timestamp : Mon May 10 2004 15:08:45.118763
Process ID : 1609 Subsystem : OVW
User ID ( UID ) : 0 Log Class : ERROR
Device ID : -1 Path ID : -1
Connection ID : -1 Log Instance : 0
Software : /usr/OV/bin/ipmap
Hostname : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IPMap error in symbolMgr::flushSymbols - OVwCreateSymbols - (OVwError = 80): Object not found.
************************************ NetView *******************************@#%
Timestamp : Mon May 10 2004 15:08:45.118822
Process ID : 1609 Subsystem : OVW
User ID ( UID ) : 0 Log Class : ERROR
Device ID : -1 Path ID : -1
Connection ID : -1 Log Instance : 0
Software : /usr/OV/bin/ipmap
Hostname : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed to create symbol: 10.10.10.10. OVwError =80: Object not found.
************************************ NetView *******************************@#%
Timestamp : Mon May 10 2004 15:17:38.349803
Process ID : 1394 Subsystem : OVS
User ID ( UID ) : 0 Log Class : ERROR
Device ID : -1 Path ID : -1
Connection ID : -1 Log Instance : 0
Software : /usr/OV/bin/ovspmd
Hostname : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Object manager kronos.carilion.com is not registered. See ovaddobj(1m).
kronos.carilion.com is 10.10.10.10, which is a Win2K cluster address and
is excluded as !kronos.carilion.com in netmon.seed.
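
To see at a glance which programs are logging the most entries, the formatted log could be tallied by its "Software :" field. This is only a sketch: the function name count_errors_by_software and the file name formatted.LOG00 are assumptions, and it relies on the field layout shown in the excerpts above.

```shell
# Hypothetical helper: reads netfmt-formatted log text on stdin and prints
# "count program" for each value of the "Software :" field, busiest first.
count_errors_by_software() {
    awk '$1 == "Software" && $2 == ":" { n[$3]++ }
         END { for (p in n) print n[p], p }' | sort -rn
}

# Example (formatted.LOG00 is assumed to be a file produced with netfmt -f):
#   count_errors_by_software < formatted.LOG00
```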
If you see something obvious, could you please drop me a reply? If not, I
will submit a PMR.
Thanks.
Mahesh
On Mon, 2004-05-10 at 15:41, James Shanks wrote:
> Well, I don't have a clue what is wrong, but on Linux, it is the nettl
> process itself which spawns the netfmt -CF. But only one of those is
> spawned on my system and it stays active only so long as nettl is
> active. When I do a "/usr/OV/bin/nettl -stop" both nettl and the
> netfmt go away.
>
> You should be able to chase ownership of the process via ps -ef. Who
> is | are the parents of these rogue netfmts? Your current nettl or
> some other long gone? What happens when or if you do nettl -stop?
> Once the main nettl goes away, you should be able to kill those netfmt
> processes with impunity, though that will not tell you why they are
> being created. But you can stop and restart nettl any time you wish.
> Normally it is just started once and keeps running until stopped. If
> you stop nettl and kill all the remaining netfmts, if any, and then
> restart nettl with nettl -start, try looking with "ps -ef |grep
> netfmt". How many do you see? Should be just one. Try looking again
> every few minutes.
>
> Offhand I see nothing in your status that looks out of line. Where
> would you look for a source of the problem? Well, I'm not sure, since
> I've never seen anything like this before, but here's what I'd do:
> (1) /usr/OV/bin/nettl -stop
> (2) ps -ef | grep netfmt, then kill any you find
> (3) cd /usr/OV/log
> (4) ls nettl* and see how many you have, just nettl.LOG00 or also
> nettl.LOG01
> (5) for each nettl.LOG0n you have, issue
> /usr/OV/bin/netfmt -f nettl.LOG0n > formatted.LOG0n
> This creates ASCII files you can read.
> (6) Look in the formatted logs for interesting error messages
> (7) Call Support with what you find.
>
> James Shanks
> Level 3 Support for Tivoli NetView for UNIX and Windows
> Tivoli Software / IBM Software Group
>
>
> From: Mahesh Tailor <mahesh.tailor@network.carilion.com>
> Sent by: owner-nv-l@lists.us.ibm.com
> Date: 05/10/2004 03:01 PM
> Please respond to: nv-l
> To: NetView User List <nv-l@lists.us.ibm.com>
> Subject: [nv-l] netfmt
>
> Hi!
>
> Running NetView 7.1.3 fp 2 on RedHat Linux AS 2.1.
>
> I am having a problem with hundreds of netfmt -CF processes running and
> eventually disabling the system because of too many open files [system
> default open files has been set to 32K files]. How can I figure out
> what is causing all these processes to start? Here's my nettl status
> output:
>
> Logging Information:
> Log Filename: /usr/OV/log/nettl.LOG0x
> User's ID: 0 Buffer Size: 8192
> Messages Dropped: 0 Messages Queued: 0
>
> Subsystem Name:   Log Class:
> NON_IP            ERROR DISASTER
> DISTMAN           WARNING ERROR DISASTER
> SECURITY          WARNING ERROR DISASTER
> COLLECTION        WARNING ERROR DISASTER
> SNMP              ERROR DISASTER
> CMOT              ERROR DISASTER
> OVE               ERROR DISASTER
> OVC               ERROR DISASTER
> OVW               ERROR DISASTER
> OVD               ERROR DISASTER
> OVS               INFORMATIVE ERROR DISASTER
> OVCAPI            ERROR DISASTER
> OVEXTERNAL        ERROR DISASTER
> OVWAPI            ERROR DISASTER
> TEST_ID_1         DISASTER
> TEST_ID_2         DISASTER
> FORMATTER         DISASTER
>
>
> Tracing Information:
>
> Trace Filename:
> No Subsystems Active
>
>
> In addition to NetView the server also has the following running:
>
> - MySQL DB
> - Apache w/PHP and Perl.
> - Some ksh scripts that perform /usr/OV/bin/nvUtil on various smartsets
>   once every 30 minutes.
>
> That is essentially it.
>
> Also, what does the netfmt -C option do? It is not in the man page.
>
> Thanks.
>
> Mahesh
--
Mahesh Tailor
WAN/TSM/NetView Administrator
Carilion Health System
Information Services
37 Reserve Avenue
Roanoke, VA 24016
Phone: 540.224.3929
Fax: 540.224.3954