Update Nov 2012: Thanks for the comment, Paul. I have added an update with another look at flow control.
When NAS is used with virtualization, performance and throughput are potential concerns. Enabling jumbo frames has proven to be the most effective remedy. Flow control, on the other hand, is something the parties involved do not always agree on.
NetApp has a one-paragraph “best practice” on the subject. The recommendation is to set flow control to “receive on” on the switch port. In other words, allow the switch to receive pause frames from the NAS.
As it turns out, NetApp can send a lot of pause frames. This is easy to see on the port channel or physical interfaces connected to the NAS:
Nexus5k# show interface e2/7
Ethernet2/7 is up
…
30 seconds input rate 116486568 bits/sec, 2079 packets/sec
30 seconds output rate 33970464 bits/sec, 792 packets/sec
Load-Interval #2: 5 minute (300 seconds)
input rate 57.99 Mbps, 1.58 Kpps; output rate 32.18 Mbps, 693 pps
RX
20604669084 unicast packets 176831374 multicast packets 0 broadcast packets
20781500458 input packets 63305126825983 bytes
8255225288 jumbo packets 0 storm suppression packets
0 runts 0 giants 0 CRC 0 no buffer
0 input error 0 short frame 0 overrun 0 underrun 0 ignored
0 watchdog 0 bad etype drop 0 bad proto drop 0 if down drop
0 input with dribble 0 input discard
176409426 Rx pause
TX
12091810302 unicast packets 52908470 multicast packets 2650659 broadcast packets
12147369431 output packets 48540927928599 bytes
7298956759 jumbo packets
0 output errors 0 collision 0 deferred 0 late collision
0 lost carrier 0 no carrier 0 babble
0 Tx pause
0 interface resets
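If you need to watch these counters on more than a handful of ports, pulling them out of the text is straightforward. The following is only a minimal sketch: it assumes the output has already been captured as a string, and that the counter lines look like the "Rx pause" / "Tx pause" lines shown above. The function name parse_pause_counters is made up for illustration.

```python
import re

# Matches counter lines such as "176409426 Rx pause" or "0 Tx pause".
# Assumption: the line format is the one shown in the output above.
PAUSE_RE = re.compile(r"^\s*(\d+)\s+(Rx|Tx)\s+pause\s*$",
                      re.IGNORECASE | re.MULTILINE)


def parse_pause_counters(show_interface_output: str) -> dict:
    """Return {'rx': int, 'tx': int} from captured 'show interface' text."""
    counters = {"rx": 0, "tx": 0}
    for count, direction in PAUSE_RE.findall(show_interface_output):
        counters[direction.lower()] = int(count)
    return counters


# Excerpt of the output above, used as sample input.
sample = """\
  0 input with dribble 0 input discard
  176409426 Rx pause
TX
  0 Tx pause
  0 interface resets
"""

print(parse_pause_counters(sample))  # {'rx': 176409426, 'tx': 0}
```

Comparing two snapshots taken some time apart tells you more than the absolute number, since the counters accumulate from the last time they were cleared.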
So what does it mean? Flow control is only meaningful if the receiving party acts on it. The expectation here is therefore that the switch “slows down” its transmission, since it is hearing the NAS say “slow down, I can’t keep up”.
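To put a number on “slow down”: an 802.3x pause frame carries a 16-bit timer expressed in pause quanta, and one quantum is the time needed to transmit 512 bits at the link speed. The sketch below turns that into seconds. The 10 Gb/s link speed is an assumption for illustration only, since the output above does not show the interface speed, and the function name is hypothetical.

```python
PAUSE_QUANTUM_BITS = 512   # one pause quantum = 512 bit times (IEEE 802.3x)
MAX_PAUSE_QUANTA = 0xFFFF  # the pause timer is a 16-bit field


def pause_duration_seconds(quanta: int, link_bps: float) -> float:
    """Time a sender is asked to stay quiet for a single pause frame."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause quanta must fit in 16 bits")
    return quanta * PAUSE_QUANTUM_BITS / link_bps


# Assumed link speed: 10 Gb/s (not taken from the output above).
longest = pause_duration_seconds(MAX_PAUSE_QUANTA, 10e9)
print(f"longest single pause at 10 Gb/s: {longest * 1000:.2f} ms")
```

At the assumed 10 Gb/s, the longest single pause is roughly 3.4 ms, so a sender that keeps being asked to stop has to be sent pause frames repeatedly.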
According to Cisco, once pause is enabled and received on a switch egress port, the switch applies back pressure to the ingress port, and packets are eventually buffered on the ingress port. If pause is also enabled on the ingress port, the switch can in turn send pause to the upstream switch. In the ideal scenario, the pause eventually reaches the source, which is the ESXi host, and slows the transmission down at its origin.
Now let’s revisit the flow control scenario. The NAS sends pause and the switch receives it, but there is no use applying back pressure all the way to the host, since the host won’t receive it. The best a switch can do is buffer the traffic, or apply back pressure upstream and have the upstream switch buffer it somewhere.
How well that works really depends on the switch and linecard models; each has different capabilities and buffer sizes. In many cases, it is highly questionable whether flow control propagates far enough back to have any positive effect. In any case, you want to check:
- Switch interface to NAS, to see the amount of Pause received
- All interfaces where NAS traffic flows, to see if there are drops
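The two checks above can be sketched as a small script. This is only an illustration under stated assumptions: the counters are assumed to have been collected already (for example by parsing "show interface" output), the InterfaceCounters fields and the review function are hypothetical names, and the Ethernet1/1 values are made up to show what a drop would look like. Only the Ethernet2/7 numbers come from the output earlier in this post.

```python
from dataclasses import dataclass


@dataclass
class InterfaceCounters:
    # Hypothetical container; fields mirror counters seen in 'show interface'.
    name: str
    rx_pause: int = 0
    input_discard: int = 0
    output_errors: int = 0


def review(interfaces):
    """Return human-readable findings for pause frames and drops."""
    findings = []
    for c in interfaces:
        if c.rx_pause > 0:
            findings.append(f"{c.name}: {c.rx_pause} pause frames received")
        drops = c.input_discard + c.output_errors
        if drops > 0:
            findings.append(f"{c.name}: {drops} discards/errors on NAS path")
    return findings


path = [
    # Values from the output shown earlier in this post.
    InterfaceCounters("Ethernet2/7", rx_pause=176409426),
    # Made-up values, for illustration only.
    InterfaceCounters("Ethernet1/1", input_discard=1500),
]

for line in review(path):
    print(line)
```

Pause received on the NAS-facing port together with discards further upstream would suggest the buffering is not keeping up, which is the case worth investigating.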