RE: [nv-l] NV tuning for Data collection

To: nv-l@lists.us.ibm.com
Subject: RE: [nv-l] NV tuning for Data collection
From: Joe Fernandez <jfernand@kardinia.com>
Date: Thu, 13 Jan 2005 11:30:08 +1100
Delivery-date: Thu, 13 Jan 2005 00:38:47 +0000
Envelope-to: nv-l-archive@lists.skills-1st.co.uk
In-reply-to: <C353F42ACF29E240B9050B86F1852A4F0C2675@nlspm204.emea.corp.eds.com>
Reply-to: nv-l@lists.us.ibm.com
Sender: owner-nv-l@lists.us.ibm.com
At 11:05 AM 12-01-05 +0000, you wrote:
Joe,

Thanks for your reply.

I did turn on the trace and looked at the trace file. The only useful
message was

"hostname doesn't reply to xx(number) object PDU, but responds to sysUpTime.
Be sure timeouts are not set too small (SNMP interval 20.00s retry:3)."

I have checked the manual, it only says the definition (and the defaults)
but no consequences (or examples).

Regards,
David

David,

As Paul and Jason have already said, a 20-second response time is too long, and you should be checking the nodes.

From the trace file message you can deduce what the effect is going to be on your collections.

snmpCollect's strategy is that if a node does not respond to a complete retry cycle, polling of that node is deferred for a relatively long time. From what you said previously, your defer time is 60 minutes. So if a node does not respond to 3 consecutive gets, it will not be polled again for 60 minutes.

However, the trace file message above tells you that the node is responding to at least one MIB var (sysUpTime) but not to others. Because the node still answers something, it never qualifies for deferral, and that defeats the "defer" strategy.
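To make the consequence concrete, here is a rough sketch of that reasoning in Python. It is not snmpCollect's code: the rule in is_deferred() is only my reading of the behaviour described above, the function names are made up for illustration, and "retry:3" is taken to mean 3 gets of 20 seconds each.

```python
TIMEOUT_S = 20          # SNMP interval from the trace message
ATTEMPTS = 3            # "retry:3" from the trace message, read here as 3 gets
POLL_INTERVAL_MIN = 15  # polling cycle mentioned in this thread
DEFER_MIN = 60          # defer time mentioned in this thread


def is_deferred(answers_collected_vars, answers_sysuptime):
    """Assumed rule: a node is deferred only when it answers nothing at all."""
    return not answers_collected_vars and not answers_sysuptime


def blocked_seconds_per_hour(answers_collected_vars, answers_sysuptime):
    """Rough poll-slot time lost per hour to one node's timeouts."""
    if answers_collected_vars:
        return 0  # healthy node, no timeouts
    cycle_cost = TIMEOUT_S * ATTEMPTS  # one full retry cycle = 60 s
    if is_deferred(answers_collected_vars, answers_sysuptime):
        # Dead node: one failed retry cycle per defer period.
        return cycle_cost * 60 // DEFER_MIN
    # Partially responding node: fails again on every poll.
    return cycle_cost * 60 // POLL_INTERVAL_MIN


print(blocked_seconds_per_hour(False, False))  # dead node: 60
print(blocked_seconds_per_hour(False, True))   # answers sysUpTime only: 240
```

Under these assumptions a node that answers only sysUpTime costs about four times as much slot time per hour as a node that is completely dead.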

You suggested in one of your replies that you believe it is a firewall problem. I don't think so. If it were, the node would not be able to respond with any MIB var at all. I doubt your firewall filters on individual MIB vars.

If you issue an snmpget from the command line or MIB browser, can you always get a (quick) response from these nodes with the specific MIB vars and instances you are collecting? If not, you need to check the node, as Paul suggested.

You have snmpCollect set to do 50 concurrent polls with up to 50 MIB vars in each and a polling cycle of 15 minutes. You are polling 600 nodes and 200 of them have "lots" of interfaces. snmpCollect does not do SNMPv2 GetBulk requests, so this will mean more than one PDU per node for those 200.

The best case of one PDU per node is 600 snmpget cycles that have to be performed by 50 concurrent "threads" in a 15 minute poll period. But it is more likely to be 800 (= 400 + 200 x 2), 1000 (= 400 + 200 x 3) .... 2400 (= 400 + 200 x 10) ..... depending on what "lots" is.
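The request-count arithmetic can be reproduced with a few lines of Python. This is only a sketch: the function names are mine, and pdus_for_node() assumes that MIB instances are simply packed 50 to a PDU, which may not be exactly how snmpCollect splits them.

```python
import math

MAX_VARS_PER_PDU = 50  # "up to 50 MIB vars in each", from this thread


def pdus_for_node(instances):
    """Gets needed for one node, assuming instances pack 50 to a PDU."""
    return max(1, math.ceil(instances / MAX_VARS_PER_PDU))


def pdus_per_cycle(pdus_per_big_node, small_nodes=400, big_nodes=200):
    """Total gets per poll cycle: one per small node, several per big node."""
    return small_nodes + big_nodes * pdus_per_big_node


for k in (1, 2, 3, 10):
    print(k, pdus_per_cycle(k))  # 600, 800, 1000, 2400

print(pdus_for_node(120))  # hypothetical node with 120 instances: 3
```

Plugging in the real instance counts for the 200 large nodes would show which end of that range you are actually at.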

Each non-response is going to block one of these for 1 minute (3 tries at 20 seconds each). How many of the 600 nodes are not responding?
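As a back-of-the-envelope check on why that number matters, the sketch below treats the 50 concurrent polls as a budget of "slot-seconds" per poll period and works out how much of it timeouts would consume. The model and the function name are my own simplification; it ignores the time taken by successful gets, and the counts passed in are placeholders, not measurements from your system.

```python
CONCURRENT_POLLS = 50   # concurrent polls, from this thread
POLL_PERIOD_S = 15 * 60 # 15 minute poll period
BLOCK_S = 60            # one non-response blocks a slot for 1 minute


def slot_budget_used(timed_out_pdus):
    """Fraction of one poll period's slot time consumed by timeouts."""
    budget = CONCURRENT_POLLS * POLL_PERIOD_S  # 45,000 slot-seconds
    return timed_out_pdus * BLOCK_S / budget


print(slot_budget_used(75))   # placeholder count: 0.1 of the budget
print(slot_budget_used(750))  # 1.0, timeouts alone fill the whole period
```

In this simplified model, 750 timed-out PDUs per period would leave no slot time for anything else, so even a modest number of non-responding nodes with several PDUs each eats into the cycle quickly.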


Joe Fernandez
Kardinia Software
jfernand@kardinia.com
www.kardinia.com
