CTHDCN-2303-Troubleshooting ARP Storms On The Nexus7k
Troubleshooting ARP storms on the Nexus7k
Ruvin Conganige
@Ruvin_Anthony
#CLUS © 2019 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Who are we
We are Technical Consulting Engineers from the Customer
Experience (CX) Support Services organization, also known
as TAC
Problem Symptoms
Instability of the control plane at frequent intervals
2019 Feb 17 12:57:05 Nexus7700 %OSPFV3-5-ADJCHANGE: ospfv3-20 [8777] Nbr 172.16.118.244 on port-channel4 went INIT
2019 Feb 17 12:57:05 Nexus7700 %OSPF-5-ADJCHANGE: ospf-10 [8778] Nbr 172.16.118.168 on port-channel3 went INIT
2019 Feb 17 12:57:05 Nexus7700 %OSPFV3-5-ADJCHANGE: ospfv3-20 [8777] Nbr 172.16.118.244 on port-channel4 went EXSTART
2019 Feb 17 12:57:05 Nexus7700 %BFD-5-SESSION_STATE_DOWN: BFD session 1090519060 to neighbor gone down
2019 Feb 17 12:57:05 Nexus7700 %BFD-5-SESSION_REMOVED: BFD session to neighbor 10.80.1.62 on vlan 6 has been removed
2019 Feb 17 12:57:05 Nexus7700 %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 1, VPC peer keep-alive receive has failed
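The log lines above show several unrelated protocols failing within the same second, which points to a shared control plane problem rather than a fault in any single protocol. A minimal sketch of how such a correlation could be checked offline is given here; the function name and the grouping key are illustrative assumptions, not part of any Cisco tool.

```python
import re
from collections import Counter

# NX-OS syslog tags have the form %FACILITY-SEVERITY-MNEMONIC:
TAG_RE = re.compile(r"%([A-Z0-9_]+)-(\d)-([A-Z0-9_]+):")


def count_by_second(lines):
    """Count syslog events per (timestamp, facility) pair."""
    counts = Counter()
    for line in lines:
        match = TAG_RE.search(line)
        if match is None:
            continue  # not a syslog line
        # The first four whitespace-separated tokens form the timestamp.
        timestamp = " ".join(line.split()[:4])
        counts[(timestamp, match.group(1))] += 1
    return counts


sample = [
    "2019 Feb 17 12:57:05 Nexus7700 %OSPFV3-5-ADJCHANGE: ospfv3-20 [8777] Nbr 172.16.118.244 on port-channel4 went INIT",
    "2019 Feb 17 12:57:05 Nexus7700 %OSPF-5-ADJCHANGE: ospf-10 [8778] Nbr 172.16.118.168 on port-channel3 went INIT",
    "2019 Feb 17 12:57:05 Nexus7700 %OSPFV3-5-ADJCHANGE: ospfv3-20 [8777] Nbr 172.16.118.244 on port-channel4 went EXSTART",
    "2019 Feb 17 12:57:05 Nexus7700 %BFD-5-SESSION_STATE_DOWN: BFD session 1090519060 to neighbor gone down",
    "2019 Feb 17 12:57:05 Nexus7700 %BFD-5-SESSION_REMOVED: BFD session to neighbor 10.80.1.62 on vlan 6 has been removed",
    "2019 Feb 17 12:57:05 Nexus7700 %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 1, VPC peer keep-alive receive has failed",
]

events = count_by_second(sample)
for (timestamp, facility), count in sorted(events.items()):
    print(timestamp, facility, count)
```

Four different facilities reporting events in the same second is the pattern to look for.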
Plan of Action
• Validate configuration
• Check for any interface drops/errors
show queuing interface | inc Ethernet|Drop
show interface counters errors
Data collection
Nexus7700 # show hardware internal cpu-mac inband counters
Client uuid: 268, 4 filters, pid 7881
Filter 1: EthType 0x0806,
Rx: 136090542, Drop: 2463
Ctrl SAP: 278
Total Data tags : 2 Data tag 1: 131072 Data tag 2: 131073
Total Rx: 136097062, Drop: 2463, Tx: 124834078, Drop: 0

show system internal adjmgr client index
Protocol Name   Alias    UUID   Index
bfd             bfd      706    7
netstack        Static   545    6
IPv4            Static   268    4
arp             arp      268    3
IP              IP       545    2
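The counters above can be turned into two simple ratios: the share of received packets that matched the ARP filter (EthType 0x0806), and the fraction of those that were dropped. The sketch below uses only the numbers shown in the output; the helper name `ratio` is an illustrative assumption.

```python
def ratio(part, total):
    """Return part/total as a float, or 0.0 when total is zero."""
    return part / total if total else 0.0


# Values copied from the counter output above.
arp_rx = 136090542      # Rx on Filter 1 (EthType 0x0806, ARP)
arp_drop = 2463         # Drop on the same filter
total_rx = 136097062    # Total Rx for client uuid 268

arp_share = ratio(arp_rx, total_rx)
arp_drop_ratio = ratio(arp_drop, arp_rx)

print(f"ARP share of total Rx: {arp_share:.4%}")
print(f"ARP drop ratio:        {arp_drop_ratio:.4%}")
```

With these figures, ARP accounts for more than 99.99% of the packets received by this client, which is consistent with an ARP storm.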
Data collection
Nexus7700 # show process cpu sort
2019 Feb 17 12:57:05.048950 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.048954 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.049075 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.049079 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.049200 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.049204 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.049324 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.049328 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.049450 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
2019 Feb 17 12:57:05.049454 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 ARP Who has 10.240.3.35? Tell 10.240.3.3
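A capture like the one above can be summarized offline to find the top talker and estimate the request rate. The following is a minimal sketch that assumes the text format shown in the capture lines; the regular expression and function names are illustrative and would need adjusting for other capture formats.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
# 2019 Feb 17 12:57:05.048950 00:1a:.. -> 03:bf:.. ARP Who has X? Tell Y
LINE_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<src>[0-9a-f:]{17}) -> (?P<dst>[0-9a-f:]{17}) "
    r"ARP Who has (?P<target>\S+)\? Tell (?P<sender>\S+)"
)


def parse(line):
    """Return (timestamp, src_mac, sender_ip, target_ip) or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    ts = datetime.strptime(match.group("ts"), "%Y %b %d %H:%M:%S.%f")
    return ts, match.group("src"), match.group("sender"), match.group("target")


def summarize(lines):
    """Return (Counter of talkers, approximate requests per second)."""
    records = [r for r in map(parse, lines) if r is not None]
    talkers = Counter((src, sender, target) for _, src, sender, target in records)
    if len(records) < 2:
        return talkers, 0.0
    span = (records[-1][0] - records[0][0]).total_seconds()
    rate = (len(records) - 1) / span if span > 0 else 0.0
    return talkers, rate


# Timestamps taken from the ten capture lines above.
stamps = ["048950", "048954", "049075", "049079", "049200",
          "049204", "049324", "049328", "049450", "049454"]
capture = [
    f"2019 Feb 17 12:57:05.{s} 00:1a:64:db:9f:a2 -> 03:bf:0a:f0:03:23 "
    "ARP Who has 10.240.3.35? Tell 10.240.3.3"
    for s in stamps
]

talkers, rate = summarize(capture)
print(talkers.most_common(1))
print(f"approx. {rate:.0f} ARP requests per second")
```

All ten requests come from the same source MAC and sender IP. Over this very short window the rate works out to roughly 18,000 requests per second, although a ten-packet sample only gives a rough estimate.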
Conclusion From Data collected
Special case:
• A special case arises when the ARP storm occurs on a VLAN for which no VLAN interface (SVI) is present on the switch.
• In this case, the ARP packets are not punted to the CPU (process). However, they are still subject to the control plane policy in hardware at the line card level.
• These ARPs can still crowd out valid ARP packets destined to the CPU on other VLANs, which can disrupt control plane traffic that relies on ARP.
• This can still be identified by using the commands below
Mitigation of the Issue
Background Information
Mitigation using Glean Throttle
• With “hardware ip glean throttle”, a single packet of a given flow is sent to the CPU
• A single packet is enough to generate an ARP request
• Software adds a /32 drop adjacency in hardware, preventing excess packets from reaching the CPU
• The drop adjacency is installed for a short, configurable period of time
• After the timer expires, one packet is again sent to the CPU and the process repeats
• The number of entries installed in this fashion is limited to 1000 by default and is configurable
• The limit of 1000 bounds the impact on Routing Information Base (RIB) table size
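The behavior described in the bullets above can be modeled in a few lines. This is a simplified sketch, not the NX-OS implementation: the class name, the timeout value, and the choice to keep punting when the table is full are all assumptions made for illustration.

```python
class GleanThrottle:
    """Toy model of per-destination glean throttling."""

    def __init__(self, timeout=300.0, max_entries=1000):
        self.timeout = timeout          # seconds; illustrative value
        self.max_entries = max_entries  # limit from the slide
        self.drop_until = {}            # destination IP -> expiry time

    def handle(self, dest_ip, now):
        """Return 'punt' if the packet goes to the CPU, else 'drop'."""
        expiry = self.drop_until.get(dest_ip)
        if expiry is not None and now < expiry:
            return "drop"  # /32 drop adjacency is still active

        # Remove expired entries so their slots can be reused.
        self.drop_until = {
            ip: t for ip, t in self.drop_until.items() if t > now
        }

        # Install a drop adjacency only if there is room.
        # Assumption: when the table is full the packet is still punted.
        if len(self.drop_until) < self.max_entries:
            self.drop_until[dest_ip] = now + self.timeout
        return "punt"


throttle = GleanThrottle(timeout=10.0, max_entries=2)
results = [
    throttle.handle("172.28.191.200", now=0.0),   # first packet
    throttle.handle("172.28.191.200", now=1.0),   # within the timer
    throttle.handle("172.28.191.200", now=10.0),  # timer expired
]
print(results)
```

Only the first packet, and the first packet after each timer expiry, reaches the CPU. Everything in between is dropped in hardware.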
Glean Throttle - Example
The server with IP 172.28.191.200 is down. However, the line card is still receiving traffic for this server.
Mitigation using Glean Throttle
Nexus7700 # show system internal forwarding vrf VRF_ABC ipv4 route 172.28.191.200 detail
slot 1
Mitigation using Glean Throttle
Module: 1
---------------------------------------------------------
L3 mtu 500 0 0 0
L3 ttl 500 0 0 0
Mitigation using Glean Throttle
Nexus7700 # show system internal forwarding vrf VRF_ABC ipv4 route 172.28.191.200 detail
slot 1
Mitigation using Glean Throttle
Module: 1
----------------------------------------------------------
L3 mtu 500 0 0 0
L3 ttl 500 0 0 0
L3 glean 100 0 0 0
Hardware rate limiter does not see drops
Helpful commands and information
• sh interface | in Ethernet|discard
• sh system resources
https://www.cisco.com/c/en/us/support/docs/switches/nexus-7000-series-switches/116136-trouble-ethanalyzer-nexus7000-00.html
https://www.cisco.com/c/en/us/support/docs/switches/nexus-7000-series-switches/200677-Nexus-7000-Understanding-hardware-ip-g.html
Thank you