That did the trick! NetView was receiving traps from a router that wasn't
listed in DNS. Not only was the address unresolvable, but no DNS server had
authority over that zone, so the lookup hung. We fixed the problem by adding
an empty zone file (the lookup no longer hangs) and have longer-term plans
to add hostname resolution for the IP addresses of our network equipment.
Thanks to Jim Shanks (and everyone else who offered suggestions), who saved
us many hours of work.
--Bryan
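
For anyone hitting the same symptom, here is a minimal sketch of the check suggested in the quoted message below: time a reverse lookup for each trap source and flag the ones that hang or fail. It is only an illustration, not part of NetView. The function names and the 5-second timeout are assumptions, and the addresses in the usage comment are placeholders to be replaced with sources taken from trapd.log.

```python
import socket
import threading
import time


def timed_reverse_lookup(ip, timeout=5.0, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds, status) for one address.

    status is "ok", "unresolved" (the resolver answered with an error),
    or "timeout" (no answer within `timeout` seconds, i.e. a hung lookup).
    """
    result = {}

    def worker():
        try:
            result["name"] = resolver(ip)[0]
        except OSError as exc:  # socket.herror and socket.gaierror are OSErrors
            result["error"] = exc

    start = time.monotonic()
    # Daemon thread: a lookup that never returns cannot block interpreter exit.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    elapsed = time.monotonic() - start

    if thread.is_alive():
        return None, elapsed, "timeout"
    if "error" in result:
        return None, elapsed, "unresolved"
    return result["name"], elapsed, "ok"


def report(ips, timeout=5.0):
    """Print one line per address: address, status, elapsed time, name."""
    for ip in ips:
        name, elapsed, status = timed_reverse_lookup(ip, timeout)
        print(f"{ip:15s} {status:10s} {elapsed:6.2f}s {name or '-'}")


# Usage (placeholder addresses, replace with real trap sources):
#   report(["192.0.2.1", "192.0.2.2"])
```

A fast "unresolved" is harmless for this purpose; the addresses to chase are the ones reported as "timeout" or taking several seconds, since the daemons wait on the same lookups.
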
> ----------
> From: James Shanks[SMTP:James_Shanks@TIVOLI.COM]
> Reply To: Discussion of IBM NetView and POLYCENTER Manager on NetView
> Sent: Tuesday, May 11, 1999 9:28 AM
> To: NV-L@UCSBVM.UCSB.EDU
> Subject: Re: events showing up late in control desk (nvcorrd,
> nvserverd)
>
> In the absence of anything else, processing delays between trapd and the
> events window are often the result of DNS problems. That's where I'd
> start looking first. All it would take is for someone to change DNS to
> go to yet another server when a name cannot be resolved, and for that
> server to have poor performance, and you could get hung out to dry
> awaiting name resolution. Names are resolved by trapd, nvcorrd,
> nvserverd, and by nvevents itself.
>
> Do your nslookups work well? Pick a few traps from trapd.log and see what
> happens. Any delay at the command line will be felt more seriously by the
> daemons.
>
> James Shanks
> Tivoli (NetView for UNIX) L3 Support
>
>
>
> "Brook, Bryan S" <bryan.s.brook@LMCO.COM> on 05/10/99 01:32:38 PM
>
> Please respond to Discussion of IBM NetView and POLYCENTER Manager on
> NetView <NV-L@UCSBVM.UCSB.EDU>
>
> To: NV-L@UCSBVM.UCSB.EDU
> cc: (bcc: James Shanks/Tivoli Systems)
> Subject: events showing up late in control desk (nvcorrd, nvserverd)
>
>
>
>
>
> someone PLEASE throw me a bone...
>
> Our customer is reporting large delays (up to 4 hours) of events arriving
> in their control desks. I can force a trap to be sent from a network
> device and it hits /usr/OV/log/trapd.log with low delay (<1 sec). The
> event shows up delayed (10 sec < delay < 4 hours) in the control desk.
> The only rule they have implemented is forwardall.rls. There are approx
> 4 ovws running, each w/ multiple dynamic workspaces.
>
> netstat -a |grep nvcorrd shows the following:
>
> Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)
> tcp        0      0  loopback.nvcorrd   loopback.1339
> tcp      156      0  loopback.1339      loopback.nvcorrd
> .
> .
> .
> .
>
> There are several of these socket pairs, each with a receive queue listed
> for the numbered socket. lsof shows these socket numbers belong to
> nvserverd processes. The bytes in these queues are VERY slow to clear
> out. Bursts of traps cause large queues (~31k bytes) and queuing delays.
>
> Has anyone seen this behavior? What could be wrong? Any hints, ideas, or
> suggestions would be greatly appreciated.
>
> --Bryan Brook
>