HN829: EVPN/VXLAN Vs. TradCore
Heavy Networking hosts Ethan Banks and Drew Conry-Murray interview network instructor Tony Burke about how to choose between traditional core (TradCore) networking and EVPN/VXLAN fabric designs. The discussion covers key decision factors including scale, redundancy, operational overhead, and automation requirements. Tony argues both approaches remain valid in 2026 depending on organizational context, pushing back against the industry trend of recommending EVPN/VXLAN for every scenario.
Summary
The episode opens by framing the central question: when should a network engineer choose TradCore (traditional core-aggregation-access designs using SVIs, VLANs, spanning tree, and MLAG) versus EVPN/VXLAN fabric designs for data center or campus networks? Tony Burke, a seasoned network instructor, pitched the topic after seeing the question come up repeatedly among students and in network design discussions.
Tony begins by defining TradCore as the legacy networking stack built on Layer 2 constructs such as 802.1Q trunks and VLANs, with MLAG/vPC providing redundancy at the aggregation layer. It is a model familiar to anyone trained on Cisco's CCNA/CCNP curriculum. He contrasts this with EVPN/VXLAN, which uses an underlay routing protocol, an MP-BGP overlay, VXLAN encapsulation, and constructs like VNIs, route targets, and route distinguishers to create a scalable, distributed fabric.
A common misconception Tony addresses is the VLAN scale argument. VXLAN's 24-bit VNI theoretically supports over 16 million segments, but in practice each VXLAN segment still maps to a local VLAN on the switch, so most deployments remain constrained to roughly 4,000 VLANs (the 12-bit 802.1Q ID space) unless significant complexity is added, as Cisco ACI does internally. Multi-vendor EVPN/VXLAN is technically possible, but Tony strongly discourages it because of differing vendor interpretations, inconsistent show commands, and the operational burden of cross-vendor troubleshooting. This holds even between Cisco and Arista, which have similar CLI styles but very different EVPN implementations.
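The VLAN scale point above comes down to field widths, and a short Python sketch can make the arithmetic concrete. The "base plus VLAN ID" mapping scheme and the function name are hypothetical, chosen only for illustration; they are not from the episode or from any vendor's implementation.

```python
# Illustrative arithmetic; the mapping scheme below is a hypothetical convention.
VNI_BITS = 24      # width of the VXLAN Network Identifier field
VLAN_ID_BITS = 12  # width of the 802.1Q VLAN ID field

vni_space = 2 ** VNI_BITS             # possible VNIs
usable_vlans = 2 ** VLAN_ID_BITS - 2  # VLAN IDs 0 and 4095 are reserved


def map_vlans_to_vnis(vlan_ids, vni_base=10_000):
    """Map each local VLAN to one VNI using an assumed 'base + VLAN ID' scheme."""
    mapping = {}
    for vlan in vlan_ids:
        if not 1 <= vlan <= usable_vlans:
            raise ValueError(f"VLAN {vlan} is outside the usable range 1-{usable_vlans}")
        mapping[vlan] = vni_base + vlan
    return mapping


leaf_mapping = map_vlans_to_vnis([10, 20, 30])
print(f"VNI space: {vni_space:,}  usable local VLANs: {usable_vlans:,}")
print(leaf_mapping)
```

Because every VNI in this sketch needs a local VLAN on the switch, the number of segments a single switch can serve is bounded by the VLAN range, not by the much larger VNI space.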
The three primary decision factors Tony identifies are: (1) Scale: EVPN/VXLAN scales far better, supporting hundreds of leaf switches via spine layers and super-spine (five-stage Clos) topologies, while TradCore is limited by the number of chassis line cards and by the constraint of only two aggregation devices in an MLAG pair; (2) Redundancy: EVPN/VXLAN allows three or four spines, preserving roughly two-thirds to three-quarters (66-75%) of capacity after a single spine failure, whereas TradCore's two-device aggregation layer loses all redundancy if one device fails; and (3) Operational overhead: EVPN/VXLAN is significantly more complex to configure and operate, involving underlays, overlays, VLAN-to-VNI mappings, multiple BGP EVPN route types, and distributed forwarding tables.
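The redundancy figures above follow from simple arithmetic. The sketch below assumes equal-capacity devices with traffic shared evenly across them, which is a simplification; the function name is illustrative.

```python
def remaining_capacity(device_count, failed=1):
    """Fraction of capacity left after `failed` devices go down.

    Assumes every device contributes equal capacity and load is shared evenly.
    """
    if device_count < 1 or not 0 <= failed <= device_count:
        raise ValueError("failed must be between 0 and device_count")
    return (device_count - failed) / device_count


# Two devices models an MLAG aggregation pair; three and four model spine layers.
for n in (2, 3, 4):
    print(f"{n} devices, one failure: {remaining_capacity(n):.0%} of capacity remains")
```

With two devices a single failure leaves half the capacity and no remaining redundancy, while three or four spines leave about two-thirds or three-quarters, plus at least one more device that can fail.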
Tony is emphatic that EVPN/VXLAN should never be deployed manually in production, only in labs for learning purposes. He argues automation is essentially mandatory for EVPN/VXLAN environments, whether through vendor tools like Arista CloudVision/AVD, Cisco Nexus Dashboard/Catalyst Center, Juniper Apstra (now Apstra Data Center Director), or Nokia's Event-Driven Automation, or through homegrown solutions using Ansible, Jinja, Python, and YAML. TradCore, by contrast, can be managed effectively without automation, which matters for smaller teams with broad responsibilities.
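To illustrate why interdependent values are better generated than typed by hand, here is a minimal Python sketch of a homegrown approach. It derives per-VLAN overlay values for each leaf from one list of VLANs. The naming conventions (VNI as base plus VLAN ID, route distinguisher as router ID plus VLAN ID, route target as a fabric-wide number plus VNI), the `Leaf` class, and `FABRIC_RT_ASN` are all assumptions made for this example, not anything prescribed in the episode or by a vendor. The output is a vendor-neutral data structure of the kind that a Jinja template or Ansible play could consume.

```python
from dataclasses import dataclass

FABRIC_RT_ASN = 65000  # hypothetical fabric-wide value used in route targets


@dataclass(frozen=True)
class Leaf:
    name: str
    router_id: str  # loopback address, reused in the route distinguisher


def build_overlay_vars(leaf, vlan_ids, vni_base=10_000):
    """Derive per-VLAN overlay values for one leaf from a single VLAN list."""
    services = []
    for vlan in sorted(set(vlan_ids)):  # normalize duplicates and ordering
        if not 1 <= vlan <= 4094:
            raise ValueError(f"VLAN {vlan} is outside the usable range 1-4094")
        vni = vni_base + vlan
        services.append({
            "vlan": vlan,
            "vni": vni,
            "rd": f"{leaf.router_id}:{vlan}",  # unique per leaf
            "rt": f"{FABRIC_RT_ASN}:{vni}",    # identical on every leaf in the segment
        })
    return {"hostname": leaf.name, "services": services}


vlans = [20, 10, 20]
leaf1 = build_overlay_vars(Leaf("leaf1", "10.0.0.1"), vlans)
leaf2 = build_overlay_vars(Leaf("leaf2", "10.0.0.2"), vlans)
for service in leaf1["services"]:
    print(service)
```

Deriving every value from the same inputs means the route targets and VNIs cannot drift apart between leaves, which is the kind of mismatch that manual configuration tends to introduce.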
On the topic of workload mobility — the requirement that any subnet be reachable from any rack, enabling vMotion and flexible workload placement — both TradCore and EVPN/VXLAN satisfy the requirement equally within a single data center. EVPN/VXLAN has a slight edge for multi-data-center interconnect. This requirement dates to around 2006-2008 when AMD and Intel processor virtualization extensions made VM workloads practical with near-zero performance overhead.
Tony and the hosts also compare MLAG versus ESI (Ethernet Segment Identifier) multihoming for attaching hosts to multiple switches. MLAG is a mature but proprietary technology that pairs exactly two devices and requires same-vendor hardware and matched software versions. ESI multihoming, available only in EVPN/VXLAN environments, is a distributed LAG identified by a 10-byte Ethernet Segment Identifier. It supports more than two devices, allows mixed hardware models, and does not require a dedicated peer link between switches, though Tony notes these advantages are rarely decisive in practice given typical rack-and-stack patterns.
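As a rough illustration of the 10-byte identifier mentioned above, the Python sketch below builds an ESI from a one-byte type plus a nine-byte value, following the layout in RFC 7432, where type 0 denotes an arbitrary operator-configured value. The helper names and the example value are hypothetical.

```python
def format_esi(esi_type, value):
    """Return a 10-byte ESI as colon-separated hex: 1 type byte + 9 value bytes."""
    if not 0 <= esi_type <= 255:
        raise ValueError("ESI type must fit in one byte")
    if len(value) != 9:
        raise ValueError("ESI value must be exactly 9 bytes")
    raw = bytes([esi_type]) + bytes(value)
    return ":".join(f"{byte:02x}" for byte in raw)


def same_segment(esi_by_switch):
    """True when every switch attached to the host LAG uses the same ESI."""
    return len(set(esi_by_switch.values())) == 1


# Type 0: operator-configured value (example value chosen arbitrarily).
esi = format_esi(0, bytes.fromhex("000000000000000001"))
print(esi)

# Unlike a two-device MLAG pair, three or more switches can share the segment.
print(same_segment({"leaf1": esi, "leaf2": esi, "leaf3": esi}))
```

The switches are tied together only by advertising the same identifier, which is why no dedicated peer link is needed in this model.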
The conversation concludes with a caution against industry fashion: VARs and vendors are often pushing EVPN/VXLAN as a default recommendation, but Tony and the hosts argue that TradCore remains a fully legitimate, often preferable choice for smaller organizations, teams without automation skills, or networks that will never reach the scale where EVPN/VXLAN's advantages materialize. Tony summarizes: small shops with limited teams should lean TradCore; very large fabrics should use EVPN/VXLAN; and many mid-size environments could go either way depending on specific organizational factors.
Key Insights
- Tony Burke argues that EVPN/VXLAN's theoretical 16+ million segment scale is largely negated in practice because each VXLAN segment still requires a local VLAN mapping, keeping most deployments within the same roughly 4,000-VLAN constraint as TradCore.
- Tony contends that EVPN/VXLAN should never be configured manually in production environments — only in labs — because the number of interdependent configuration elements (route targets, route distinguishers, VNI mappings, BGP EVPN route types) makes manual operation too error-prone.
- Tony argues that TradCore's two-device MLAG aggregation layer is a meaningful redundancy disadvantage compared to EVPN/VXLAN's typical three-to-four spine topology, where losing one spine still preserves 66-75% of forwarding capacity.
- Tony strongly discourages multi-vendor EVPN/VXLAN deployments, citing differing vendor interpretations of the standard, inconsistent CLI commands, and the operational burden of cross-vendor troubleshooting — arguing the administrative overhead outweighs interoperability improvements seen in recent years.
- Tony claims that for networks with roughly two switches, EVPN/VXLAN provides essentially no meaningful benefit over TradCore — the configuration complexity is high and neither redundancy nor scale advantages apply at that size.
- Tony observes that organizations without established automation skills or tooling are better served by TradCore, because TradCore can be operated effectively by hand in a way that EVPN/VXLAN fundamentally cannot be at production scale.
- Tony distinguishes ESI (Ethernet Segment Identifier) multihoming from MLAG, noting ESI supports more than two devices and does not require a dedicated peer link, but argues these advantages are rarely decisive in practice because most deployments naturally home servers to two top-of-rack switches in the same rack anyway.
- Tony warns that EVPN/VXLAN has become an industry fashion pushed by VARs and vendors, and argues that network engineers should demand specific justification for the design choice rather than accepting it as a default recommendation — with TradCore remaining a fully valid option for many real-world scenarios.