Found the problem. The ARP table on this guy is HUGE. I checked last night
and there were 34000 entries. This router ties together 2 routing domains,
but only has complete knowledge of one of them. The problem is that a static
route is used to give him minimal knowledge of the other routing domain. That static route
is tied to an interface instead of a next hop address. Because of that, he
has to maintain an ARP cache entry for the next hop MAC address to get to
all IP devices in the other routing domain. The problem is aggravated by the
fact that the other routing domain has 600+ subnets. So basically, this
router gets one entry in his ARP cache for every IP device in those 600
subnets.
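To illustrate the difference, here is a rough sketch of the two forms of static route. The prefix, next-hop address, and interface name are made up for the example and are not the actual config on this router.

```
! Interface-only form: every destination in the prefix is treated as
! directly connected, so the router ARPs for each host it forwards to.
ip route 172.16.0.0 255.255.0.0 TokenRing0
!
! Next-hop form: the router only needs one ARP entry, for 10.1.1.2
! (hypothetical next-hop address).
ip route 172.16.0.0 255.255.0.0 10.1.1.2
```

With the second form, the ARP cache should hold one entry for the next hop rather than one per device in the remote domain.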
Did a demandpoll on him and sat and waited 20 min while netmon tried to get
the ARP table. Got tired of waiting, so I closed it. For the short term,
I've disabled config polling and new node discovery on NetView. The whole
time, CPU was pegged at 100% w/ IP SNMP getting 80%.
The good news is that Cisco prioritizes CPU. SNMP gets low priority, so
everything else in the high and medium range should get CPU before SNMP.
This wasn't really causing a problem, other than a slightly noticeable
latency for data going thru him when CPU is pegged.
The problem I had in tracking this down is that the router won't tell me who
is generating the SNMP traffic destined for him, short of turning on a debug
(debug snmp packet or debug ip packet w/ a list applied). With that router
already at 100% CPU, I'd hate to turn a debug on and risk crashing the
router, since he was still able to do his normal traffic processing.
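For reference, "debug ip packet w/ a list applied" would look roughly like the sketch below. The ACL number and the router address 10.1.1.1 are hypothetical, and this was not run on the router in question for the reason given above.

```
! Hypothetical ACL: match only SNMP requests (UDP 161) sent to the router itself.
access-list 101 permit udp any host 10.1.1.1 eq snmp
!
! Exec mode: show only packets matching ACL 101, which exposes the source
! address of whoever is polling. Turn it off again with "undebug all".
debug ip packet 101
```

Restricting the debug to an ACL keeps the output small, but it still adds load on a router that is already at 100% CPU.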
Also found that my Unix guys had redeployed my old NetView v3 server as an
AIX DNS test server, but left NetView v3 running.
This guy was the first problem. Had to kill his NetView daemons and edit the
ovsuf file so they wouldn't start. I've also got some guys testing CA
Unicenter in my lab, so I've disabled them from being able to query him via
SNMP.
Anyways, the static route that is used says that an entire Class B is
reachable via such and such Token Ring interface.
Like I said before, that Class B has 600+ subnets. For anyone out there that
is doing anything similar, EXPECT problems when netmon tries to pull the ARP
table off that router.
Anyone know what variables are being used to pull the ARP table? Are they
SNMPv1 or SNMPv2? I'm trying to figure out if there is any config I can do
on the router to disable pulling the ARP table via SNMP. If SNMPv2, I might
be able to exclude it via the snmp-server view command, as Marc suggested.
__________________________
Thanks,
Sean Davidson
Sr. Network Systems Engineer
Publix Super Markets, Inc.
P.O. Box 32015
Lakeland, Fl. 33802-2015
Email - sean.davidson@publix.com
Voice - (863) 686-8754 x6889
Fax - (863)616-5895
-----Original Message-----
From: Marc Russo [mailto:mrusso@al.iisl.com]
Sent: Thursday, October 14, 1999 9:15 AM
To: 'sean.davidson@MAIL.PUBLIX.COM'
Subject: Re: NETMON causing high SNMP Util on router
I've seen the same problem at several of my customers' sites. How large are
your routing tables? I've seen the SNMP process run the cpu really high when
netmon is trying to get the routing table. The following lines in your
cisco configs will prevent netmon from getting the routing table (ip.21) via
snmp. You can also deny the arp tables. I haven't really seen any negative
effects from blocking the routing table, but there might be some.
snmp-server view demandpoll mib-2 included
snmp-server view demandpoll ip.21 excluded
snmp-server view demandpoll private included
snmp-server community whatever RW
snmp-server community public view demandpoll RO
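For the ARP tables, an untested sketch along the same lines. In MIB-II the ARP data lives in ipNetToMediaTable (ip.22) and the older atTable (under at). This assumes IOS accepts ip.22 and at as subtree names the same way it accepts ip.21; if not, the numeric OIDs 1.3.6.1.2.1.4.22 and 1.3.6.1.2.1.3 should be equivalent.

```
! Same view as above, with the two MIB-II ARP subtrees excluded as well.
snmp-server view demandpoll mib-2 included
snmp-server view demandpoll ip.21 excluded
! ipNetToMediaTable (assumed name form, see note above)
snmp-server view demandpoll ip.22 excluded
! atTable, the older address translation group
snmp-server view demandpoll at excluded
snmp-server view demandpoll private included
snmp-server community public view demandpoll RO
```

Anything on the management station that relies on the router's ARP cache would presumably stop seeing that data, so it is worth checking for side effects before leaving this in place.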
You can also use a sniffer to help determine exactly what is happening.
That is how we initially solved the problem.
Marc Russo (mrusso@al.iisl.com)
International Integrated Solutions, LTD