To: | nv-l@lists.us.ibm.com |
---|---|
Subject: | Re: [nv-l] netfmt |
From: | James Shanks <jshanks@us.ibm.com> |
Date: | Mon, 10 May 2004 17:00:51 -0400 |
Delivery-date: | Mon, 10 May 2004 22:09:01 +0100 |
Envelope-to: | nv-l-archive@lists.skills-1st.co.uk |
In-reply-to: | <1084219994.15957.51.camel@chibuku.ns.carilion.com> |
Reply-to: | nv-l@lists.us.ibm.com |
Sender: | owner-nv-l@lists.us.ibm.com |
Mahesh,

All that's in your nettl log are messages from other processes: ipmap, ovw, and ovspmd. There is nothing from nettl itself, and nothing to indicate that the nettl process had a problem. See where it says "Software:"? That's how you can tell which process wrote the message. So the nettl log itself doesn't look promising, but you should let someone else from Support look at it for you.

The ps output may tell us more. This kind of output is normal; it is what you should see:

    root 2471 1 0 15:49 ? 00:00:00 /usr/OV/bin/ntl_reader 0 1 1 1 1
    root 2472 2471 0 15:49 ? 00:00:00 netfmt -CF
    root 8018 9132 0 15:53 pts/0 00:00:00 grep 2471

Notice how the parent process of the netfmt -CF (2471) is the ntl_reader process? In the earlier cases, the parent process is 1, which means that the nettl process, the ntl_reader, which spawned them, has itself gone away, and the netfmt then inherits the init process (1) as its parent, since it has no parent left in the system. These are all orphans:

    root 28067 1 0 15:45 ? 00:00:00 netfmt -CF
    root 23113 1 0 15:48 ? 00:00:00 netfmt -CF
    root 23748 1 0 15:48 ? 00:00:00 netfmt -CF
    root 24536 1 0 15:48 ? 00:00:00 netfmt -CF

This situation might indicate that the ntl_reader process is coring on your box. Can you find any core files in the root (/) directory? Or in /usr/OV? I don't believe that ntl_reader is set up to use /usr/OV/PD/cores.

In any case, open a problem with Support, and let them help you gather some data. I have no idea what else to tell you.

James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group
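The parent-PID check described above can be scripted. What follows is only a minimal sketch and not part of NetView: the function name orphaned_netfmt is made up for illustration, and it assumes the Linux ps -ef column layout shown in this thread (UID PID PPID C STIME TTY TIME CMD), with the start time printed as a single field.

```shell
# Hypothetical helper, not a NetView tool.
# Reads `ps -ef` output on stdin and prints the PID of every netfmt
# process whose parent is init (PPID 1), i.e. every orphaned netfmt.
# Assumed columns: $2 = PID, $3 = PPID, $8 = command name.
orphaned_netfmt() {
    awk '$3 == 1 && ($8 == "netfmt" || $8 ~ /\/netfmt$/) { print $2 }'
}
```

It would typically be fed live data with `ps -ef | orphaned_netfmt`. Matching on the command field rather than grepping the whole line keeps the `grep netfmt` process itself, and ntl_reader, out of the result.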
Hi, James!

Here's the output of my ps -ef:

    root@netview [/usr/OV/log] # ps -ef | grep netfmt
    root 28067 1 0 15:45 ? 00:00:00 netfmt -CF
    root 23113 1 0 15:48 ? 00:00:00 netfmt -CF
    root 23748 1 0 15:48 ? 00:00:00 netfmt -CF
    root 24536 1 0 15:48 ? 00:00:00 netfmt -CF
    root 2472 2471 0 15:49 ? 00:00:00 netfmt -CF
    root 8020 9132 0 15:53 pts/0 00:00:00 grep netfmt

    root@netview [/usr/OV/log] # ps -ef | grep 2471
    root 2471 1 0 15:49 ? 00:00:00 /usr/OV/bin/ntl_reader 0 1 1 1 1
    root 2472 2471 0 15:49 ? 00:00:00 netfmt -CF
    root 8018 9132 0 15:53 pts/0 00:00:00 grep 2471

And these have all appeared since I had to restart my machine 50 minutes ago. I performed a nettl -stop and still had the netfmt processes belonging to PID 1 running, so I killed them and restarted nettl.

Here are some of the nettl log messages . . .

************************************ NetView *******************************@#%
Timestamp            : Mon May 10 2004 10:06:07.308834
Process ID           : 9774                 Subsystem        : SECURITY
User ID ( UID )      : 0                    Log Class        : ERROR
Device ID            : -1                   Path ID          : -1
Connection ID        : -1                   Log Instance     : 0
Software             : /usr/OV/bin/ovw
Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OVwUserSecurity() error 4 on waitpid

************************************ NetView *******************************@#%
Timestamp            : Mon May 10 2004 15:08:45.118009
Process ID           : 1609                 Subsystem        : OVW
User ID ( UID )      : 0                    Log Class        : ERROR
Device ID            : -1                   Path ID          : -1
Connection ID        : -1                   Log Instance     : 0
Software             : /usr/OV/bin/ipmap
Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IPMap error in symbolMgr::flushSymbols - OVwCreateSymbols - (OVwError = 80): Object not found.
************************************ NetView *******************************@#%
Timestamp            : Mon May 10 2004 15:08:45.118101
Process ID           : 1609                 Subsystem        : OVW
User ID ( UID )      : 0                    Log Class        : ERROR
Device ID            : -1                   Path ID          : -1
Connection ID        : -1                   Log Instance     : 0
Software             : /usr/OV/bin/ipmap
Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed to create symbol: 172.23.6.25. OVwError =80: Object not found.

************************************ NetView *******************************@#%
Timestamp            : Mon May 10 2004 15:08:45.118763
Process ID           : 1609                 Subsystem        : OVW
User ID ( UID )      : 0                    Log Class        : ERROR
Device ID            : -1                   Path ID          : -1
Connection ID        : -1                   Log Instance     : 0
Software             : /usr/OV/bin/ipmap
Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IPMap error in symbolMgr::flushSymbols - OVwCreateSymbols - (OVwError = 80): Object not found.

************************************ NetView *******************************@#%
Timestamp            : Mon May 10 2004 15:08:45.118822
Process ID           : 1609                 Subsystem        : OVW
User ID ( UID )      : 0                    Log Class        : ERROR
Device ID            : -1                   Path ID          : -1
Connection ID        : -1                   Log Instance     : 0
Software             : /usr/OV/bin/ipmap
Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed to create symbol: 10.10.10.10. OVwError =80: Object not found.

************************************ NetView *******************************@#%
Timestamp            : Mon May 10 2004 15:17:38.349803
Process ID           : 1394                 Subsystem        : OVS
User ID ( UID )      : 0                    Log Class        : ERROR
Device ID            : -1                   Path ID          : -1
Connection ID        : -1                   Log Instance     : 0
Software             : /usr/OV/bin/ovspmd
Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Object manager kronos.carilion.com is not registered. See ovaddobj(1m).
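Since the "Software" field names the process that wrote each entry, a quick tally per writer shows at a glance whether nettl or ntl_reader ever logged anything. This is a rough sketch only: count_by_software is a made-up name, and it assumes formatted log text in which each entry carries a "Software : <path>" field, as in the entries above. It relies on grep -o, which GNU and BSD grep provide.

```shell
# Hypothetical helper, not a NetView tool.
# Reads netfmt-formatted log text on stdin and prints "<count> <path>"
# for each distinct Software value, most frequent first.
count_by_software() {
    grep -o 'Software[[:space:]]*:[[:space:]]*[^[:space:]]*' |
        sed 's/.*:[[:space:]]*//' |       # keep only the program path
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'            # normalize uniq's padding
}
```

It could be run as `count_by_software < formatted.LOG00`, where formatted.LOG00 is the kind of file that `netfmt -f` output is redirected into.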
Kronos.carilion.com is 10.10.10.10, which is a Win2K cluster address and is excluded as !kronos.carilion.com in netmon.seed.

If you see something obvious, can you please drop me a reply? If not, I will submit a PMR.

Thanks.

Mahesh

On Mon, 2004-05-10 at 15:41, James Shanks wrote:
> Well, I don't have a clue what is wrong, but on Linux, it is the nettl
> process itself which spawns the netfmt -CF. But only one of those is
> spawned on my system and it stays active only so long as nettl is
> active. When I do a "/usr/OV/bin/nettl -stop" both nettl and the
> netfmt go away.
>
> You should be able to chase ownership of the process via ps -ef. Who
> is/are the parents of these rogue netfmts? Your current nettl or
> some other long gone? What happens when or if you do nettl -stop?
> Once the main nettl goes away, you should be able to kill those netfmt
> processes with impunity, though that will not tell you why they are
> being created. But you can stop and restart nettl any time you wish.
> Normally it is just started once and keeps running until stopped. If
> you stop nettl and kill all the remaining netfmts, if any, and then
> restart nettl with nettl -start, try looking with "ps -ef | grep
> netfmt". How many do you see? Should be just one. Try looking again
> every few minutes.
>
> Offhand I see nothing in your status that looks out of line. Where
> would you look for a source of the problem? Well, I'm not sure, since
> I've never seen anything like this before, but here's what I'd do:
> (1) /usr/OV/bin/nettl -stop
> (2) ps -ef | grep netfmt, and kill any you find
> (3) cd /usr/OV/log
> (4) ls nettl* and see how many you have, just nettl.LOG00 or also
>     nettl.LOG01
> (5) for each nettl.LOG0n you have, issue
>     /usr/OV/bin/netfmt -f nettl.LOG0n > formatted.LOG0n
>     This creates ascii files you can read.
> (6) Look in the formatted logs for interesting error messages
> (7) Call Support with what you find.
>
> James Shanks
> Level 3 Support for Tivoli NetView for UNIX and Windows
> Tivoli Software / IBM Software Group
>
>
> Mahesh Tailor <mahesh.tailor@network.carilion.com>
> Sent by: owner-nv-l@lists.us.ibm.com
> 05/10/2004 03:01 PM
> Please respond to nv-l
>
> To: NetView User List <nv-l@lists.us.ibm.com>
> cc:
> Subject: [nv-l] netfmt
>
>
> Hi!
>
> Running NetView 7.1.3 fp 2 on RedHat Linux AS 2.1.
>
> I am having a problem with hundreds of netfmt -CF processes running and
> eventually disabling the system because of too many open files [system
> default open files has been set to 32K files]. How can I figure out
> what is causing all these processes to start? Here's my nettl status
> output:
>
> Logging Information:
> Log Filename: /usr/OV/log/nettl.LOG0x
> User's ID: 0            Buffer Size: 8192
> Messages Dropped: 0     Messages Queued: 0
>
> Subsystem Name:    Log Class:
> NON_IP             ERROR DISASTER
> DISTMAN            WARNING ERROR DISASTER
> SECURITY           WARNING ERROR DISASTER
> COLLECTION         WARNING ERROR DISASTER
> SNMP               ERROR DISASTER
> CMOT               ERROR DISASTER
> OVE                ERROR DISASTER
> OVC                ERROR DISASTER
> OVW                ERROR DISASTER
> OVD                ERROR DISASTER
> OVS                INFORMATIVE ERROR DISASTER
> OVCAPI             ERROR DISASTER
> OVEXTERNAL         ERROR DISASTER
> OVWAPI             ERROR DISASTER
> TEST_ID_1          DISASTER
> TEST_ID_2          DISASTER
> FORMATTER          DISASTER
>
>
> Tracing Information:
>
> Trace Filename:
> No Subsystems Active
>
>
> In addition to NetView the server also has the following running:
>
> - MySQL DB
> - Apache w/PHP and Perl.
> - Some ksh scripts that perform /usr/OV/bin/nvUtil on various smartsets
>   once every 30-minutes.
>
> That is essentially it.
>
> Also, what does the netfmt -C option do? It is not in the man page.
>
> Thanks.
>
> Mahesh

--
Mahesh Tailor
WAN/TSM/NetView Administrator
Carilion Health System Information Services
37 Reserve Avenue
Roanoke, VA 24016
Phone: 540.224.3929
Fax: 540.224.3954
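Steps (1) through (5) of the procedure James outlines in the quoted message can be written down as a dry run that only prints the commands, so they can be reviewed before anything is stopped or killed. This is a sketch under assumptions: nettl_triage_plan is a made-up name, the log directory defaults to /usr/OV/log as in the thread, and the glob assumes log files named nettl.LOG00 through nettl.LOG09.

```shell
# Hypothetical helper, not a NetView tool. It prints commands only and
# never stops nettl or kills anything itself.
# $1 (optional): directory holding the nettl.LOG0n files.
nettl_triage_plan() {
    log_dir=${1:-/usr/OV/log}
    echo "/usr/OV/bin/nettl -stop"
    echo "ps -ef | grep netfmt    # kill any netfmt that remain"
    echo "cd $log_dir"
    for log in "$log_dir"/nettl.LOG0[0-9]; do
        [ -e "$log" ] || continue          # glob matched nothing
        name=${log##*/}                    # e.g. nettl.LOG00
        echo "/usr/OV/bin/netfmt -f $name > formatted.${name#nettl.}"
    done
}
```

Running `nettl_triage_plan` with no argument lists one netfmt -f line per log file found. Steps (6) and (7), reading the formatted logs and calling Support, stay manual.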