Introduction
Network issues on VMware ESXi hosts can bring down entire clusters of virtual machines, disrupt vMotion migrations, and cripple storage connectivity. Troubleshooting ESXi networking requires understanding both the virtual switching layer inside the hypervisor and the physical network infrastructure it connects to.
This guide provides a structured troubleshooting methodology for the most common ESXi networking problems you will encounter, complete with the CLI commands and diagnostic steps needed to isolate and resolve each issue. Whether you are a VMware admin, a network engineer, or a sysadmin who manages both, this checklist will help you get to root cause faster.
Troubleshooting Methodology
Before diving into specific issues, follow this general approach for any ESXi networking problem:
- Define the scope: Is the issue affecting one VM, all VMs on a host, all VMs on a port group, or the entire host?
- Check the physical layer first: NIC link status, speed/duplex, cable faults
- Check the virtual layer: vSwitch/VDS configuration, port group settings, VLAN assignments
- Verify end-to-end: vmkping, packet captures, firewall rules
- Check recent changes: Was anything modified on the ESXi host, vCenter, or physical switches?
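The scoping steps above lend themselves to a baseline snapshot you can diff after a change window. A minimal sketch (the helper name, command list, and log path are illustrative, not part of ESXi):

```shell
# Illustrative helper: run a list of read-only checks and label each
# section of output so before/after snapshots are easy to diff.
snapshot() {
  for cmd in "$@"; do
    printf '== %s ==\n' "$cmd"
    eval "$cmd"
  done
}
# On an ESXi host (ESXi Shell or SSH):
#   snapshot "esxcli network nic list" \
#            "esxcli network ip interface ipv4 get" \
#            "esxcli network ip route ipv4 list" > /tmp/net-baseline.log
```

Comparing two such logs with diff often answers the "check recent changes" question faster than memory does.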
Problem 1: ESXi Host Has Lost Network Connectivity
Symptoms
- Host shows as “Disconnected” or “Not Responding” in vCenter
- Cannot SSH to the ESXi host
- VMs on the host have lost all network connectivity
Diagnostic Steps
# Access the host via DCUI (Direct Console User Interface) or iLO/iDRAC/CIMC
# 1. Check physical NIC status
esxcli network nic list
# Look for: Link Status = Up, Speed/Duplex correct (e.g., 10000 Full)
# 2. Check if the management VMkernel has an IP
esxcli network ip interface ipv4 get
# Verify vmk0 has the correct IP address and subnet mask
# 3. Check the management vSwitch
esxcli network vswitch standard list
# Verify vSwitch0 has uplinks (vmnics) assigned and they show as "Up"
# 4. Test connectivity from the management VMkernel
vmkping -I vmk0 10.1.1.1
# Ping the default gateway
# 5. Check the routing table
esxcli network ip route ipv4 list
# Verify the default route exists and points to the correct gateway
# 6. Check for NIC driver issues
esxcli software vib list | grep -i net
# Verify network driver VIBs are installed and current
# 7. Check system logs for NIC errors
cat /var/log/vmkernel.log | grep -i "nic\|network\|link\|uplink" | tail -50
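When several hosts or many NICs are involved, a small filter saves scanning the full table by eye. A hypothetical helper, assuming the typical column layout of esxcli network nic list (Name=1, PCI Device=2, Driver=3, Admin Status=4, Link Status=5); verify the columns on your build before relying on it:

```shell
# Hypothetical helper: print the names of physical NICs whose
# Link Status column (assumed field 5) reads "Down".
down_nics() {
  awk 'NR > 2 && $5 == "Down" { print $1 }'
}
# On an ESXi host:
#   esxcli network nic list | down_nics
```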
Common Causes and Fixes
- Physical cable disconnected or faulty: Check esxcli network nic list. If Link Status is Down, the issue is physical: check cables, the switch port, and the SFP module
- NIC driver crash: Check /var/log/vmkernel.log for driver errors. Try reloading the driver: esxcli system module set -m ixgbe -e false && esxcli system module set -m ixgbe -e true (replace ixgbe with your driver name)
- Management VLAN changed on physical switch: If the native VLAN or allowed VLANs were changed on the physical switch port, the host loses management connectivity. Fix the physical switch configuration or adjust the ESXi VLAN tag
- IP address conflict: Another device has the same IP. Check ARP tables on the default gateway
Problem 2: VMs Cannot Communicate Across VLANs
Symptoms
- VMs on the same port group can ping each other
- VMs cannot reach the default gateway or VMs on other VLANs
- Traffic seems to stay local to the ESXi host
Diagnostic Steps
# 1. Verify the VM port group VLAN ID
esxcli network vswitch standard portgroup list
# Check that the VLAN ID matches what the physical switch expects
# For VDS:
esxcli network vswitch dvs vmware list
# 2. Verify the physical switch trunk configuration
# On the physical switch (Cisco example):
show interface GigabitEthernet1/0/1 trunk
# Confirm the VLAN is in the "allowed" list and in the "active" state
# 3. Check if the uplink is actually trunking
# On ESXi, capture traffic on the uplink to see VLAN tags:
pktcap-uw --uplink vmnic0 -o /tmp/uplink_capture.pcap --count 100
# 4. Check the VM's virtual NIC connection
# In vSphere Client: VM > Edit Settings > Network Adapter
# Verify: Connected = Yes, Port Group = correct VLAN
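To read the capture from step 3 without copying it off the host, tcpdump-uw (bundled with ESXi) can print the 802.1Q header with -e. A hypothetical helper that summarizes which VLAN IDs actually appear on the uplink, keyed on the "vlan NNN," token in tcpdump's -e output (the exact output format can differ between tcpdump builds, so adjust the pattern if needed):

```shell
# Hypothetical helper: extract the distinct 802.1Q VLAN IDs from
# tcpdump -e style output lines ("... vlan 100, p 0, ...").
vlan_ids() {
  sed -n 's/.*vlan \([0-9][0-9]*\),.*/\1/p' | sort -nu
}
# On an ESXi host:
#   tcpdump-uw -enr /tmp/uplink_capture.pcap | vlan_ids
```

If the VLAN you expect never shows up in the list, the physical switch is stripping or not sending it, which points at the trunk configuration rather than the ESXi side.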
Common Causes and Fixes
- VLAN ID mismatch: The port group VLAN on ESXi does not match the VLAN allowed on the physical switch trunk. Fix: Align the VLAN IDs on both sides
- Physical switch port in access mode: The switch port is not configured as a trunk, so all VLAN-tagged traffic from ESXi is dropped. Fix: Set the switch port to trunk mode
- VLAN not created on physical switch: The VLAN exists on ESXi but was never created in the physical switch’s VLAN database. Fix: Create the VLAN on the switch
- Spanning Tree blocking: The switch port may be in STP blocking state. Fix: Add spanning-tree portfast trunk to the switch port configuration
Problem 3: Intermittent Packet Loss or Slow Performance
Symptoms
- VMs experience intermittent connectivity issues
- Application performance is degraded
- Pings show variable latency or packet loss
Diagnostic Steps
# 1. Check physical NIC error counters
esxcli network nic stats get -n vmnic0
# Look for: Rx errors, Tx errors, Rx dropped, Tx dropped, CRC errors
# 2. Check NIC speed and duplex
esxcli network nic get -n vmnic0
# Verify: Speed is correct (1000/10000/25000), Duplex is Full
# A duplex mismatch causes massive packet loss
# 3. Check for NIC ring buffer overflows
vsish -e cat /net/pNics/vmnic0/stats
# Look for rxdroppedbyRingFull — indicates the NIC receive buffer is overflowing
# 4. Check VM network adapter type
# In vSphere Client: VM > Edit Settings > Network Adapter
# Verify adapter type is VMXNET3 (not E1000 or E1000E)
# VMXNET3 provides much better performance than emulated adapters
# 5. Check for traffic shaping limits
esxcli network vswitch standard portgroup policy list -p "Your-Port-Group"
# Look for traffic shaping being enabled with low bandwidth limits
# 6. Check physical switch interface errors
# On the physical switch:
show interface GigabitEthernet1/0/1
# Look for: CRC errors, input errors, output drops, late collisions
# 7. Monitor in real-time with esxtop
esxtop
# Press 'n' for network view
# Look for: %DRPTX, %DRPRX (dropped packets), MbTX/MbRX (throughput)
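For intermittent problems, a single reading of the error counters is less useful than the rate of change. A hypothetical helper that diffs two snapshots of esxcli network nic stats get and prints only the counters that increased (the counter names come straight from the snapshot files, so any "name: value" format works):

```shell
# Hypothetical helper: compare two counter snapshots and print only
# the counters whose value went up between snapshot 1 and snapshot 2.
stat_deltas() {
  awk -F': *' 'NR == FNR { prev[$1] = $2; next }
               ($1 in prev) && $2 + 0 > prev[$1] + 0 {
                   printf "%s: +%d\n", $1, $2 - prev[$1]
               }' "$1" "$2"
}
# On an ESXi host:
#   esxcli network nic stats get -n vmnic0 > /tmp/stats.1
#   sleep 60
#   esxcli network nic stats get -n vmnic0 > /tmp/stats.2
#   stat_deltas /tmp/stats.1 /tmp/stats.2
```

Error or drop counters that climb during the observation window are the ones worth chasing; static non-zero counters may be leftovers from an old fault.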
Common Causes and Fixes
- Duplex mismatch: One side is full-duplex, the other is half. This causes late collisions and massive packet loss. Fix: Set both sides to auto-negotiate or force the same speed/duplex
- NIC ring buffer too small: Increase the ring buffer size: esxcli network nic ring current set -n vmnic0 -r 4096 -t 4096
- E1000 virtual NIC: The emulated E1000 adapter is much slower than VMXNET3. Change the VM’s network adapter type to VMXNET3 (requires VMware Tools installed in the guest)
- Oversubscribed uplinks: Too many VMs competing for limited physical bandwidth. Use NIOC to prioritize critical traffic, or add more physical uplinks
- Faulty cable or SFP: Physical layer errors (CRC, FCS) that increase under load. Replace the cable or SFP
Problem 4: vMotion Fails
Symptoms
- vMotion migration fails with a timeout or network error
- vMotion is extremely slow (minutes for a small VM)
- vCenter reports “migration exceeded maximum switchover time”
Diagnostic Steps
# 1. Verify vMotion VMkernel exists on both hosts
esxcli network ip interface list
# Look for a vmk interface with the vMotion service tag enabled
# 2. Verify vMotion connectivity between hosts
vmkping -I vmk1 [other-host-vmotion-ip]
# Replace vmk1 with your vMotion VMkernel adapter
# 3. Test with jumbo frames if enabled
vmkping -I vmk1 -d -s 8972 [other-host-vmotion-ip]
# If this fails, there is an MTU mismatch in the path
# 4. Confirm the vMotion service tag is set, then verify the vMotion port group VLAN matches on both hosts
esxcli network ip interface tag get -i vmk1
# 5. Check vMotion bandwidth
# During a migration, run esxtop on the source host
esxtop
# Press 'n' for network view
# Look at MbTX on the vMotion vmk — should be near link speed
# 6. Check logs for vMotion errors
grep -i "vmotion\|migrate" /var/log/vpxa.log | tail -30
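The 8972 value in step 3 is not arbitrary: vmkping's -s flag sets the ICMP payload size, and the 20-byte IP header plus 8-byte ICMP header must still fit inside the MTU. A small sketch of that arithmetic, with a sweep loop to bisect where the path breaks (the loop itself is illustrative; run it on the ESXi host):

```shell
# Payload size for a given target MTU: MTU minus 28 bytes of headers
# (20-byte IP header + 8-byte ICMP header). 9000 -> 8972, 1500 -> 1472.
ping_size() { echo $(( $1 - 28 )); }
# Illustrative sweep on an ESXi host to find where the path breaks:
#   for mtu in 1500 4000 9000; do
#     echo "MTU $mtu (payload $(ping_size $mtu)):"
#     vmkping -I vmk1 -d -s "$(ping_size $mtu)" [other-host-vmotion-ip]
#   done
```

If 1472 succeeds but 8972 fails with -d (don't fragment) set, some hop between the two vMotion VMkernels is not configured for jumbo frames.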
Common Causes and Fixes
- vMotion VMkernel not enabled: The VMkernel port exists but vMotion service is not enabled on it. Fix: Edit the VMkernel adapter and enable the vMotion checkbox
- Different VLANs/subnets: vMotion VMkernels on source and destination hosts must be able to reach each other; the simplest configuration keeps them in the same Layer 2 broadcast domain, though routed vMotion is supported via the dedicated vMotion TCP/IP stack. Fix: Put both vMotion VMkernels on the same VLAN/subnet, or configure the vMotion TCP/IP stack with a gateway on both hosts
- MTU mismatch: If jumbo frames are enabled on one host but not the other (or not on the physical switch between them), large packets are dropped. Fix: Ensure consistent MTU across all vMotion VMkernels and physical switches
- Bandwidth limitation: vMotion over a 1 Gbps link is much slower than 10/25 Gbps. Large VMs with high memory churn rate may time out on slow links. Fix: Use dedicated 10+ Gbps links for vMotion
- Firewall blocking: A firewall between the hosts is blocking vMotion traffic (TCP 8000 and high-numbered ports). Fix: Allow vMotion traffic between hosts
Problem 5: Storage Network Issues (NFS/iSCSI)
Symptoms
- Datastores show as inaccessible
- VMs experience high disk latency
- APD (All Paths Down) or PDL (Permanent Device Loss) events in logs
Diagnostic Steps
# 1. Verify storage VMkernel connectivity
vmkping -I vmk2 [storage-array-ip]
# 2. Check NFS datastore mount status
esxcli storage nfs list
esxcli storage nfs41 list
# 3. Check iSCSI sessions
esxcli iscsi session list
# 4. Check storage adapter status
esxcli storage core adapter list
# 5. Look for storage-related errors in logs
grep -i "nfs\|iscsi\|storage\|APD\|PDL\|SCSI" /var/log/vmkernel.log | tail -50
# 6. Check storage latency with esxtop
esxtop
# Press 'u' for disk/storage view
# Look for: DAVG (device average latency) — should be under 20ms
# KAVG (kernel average latency) — should be under 2ms
# GAVG = DAVG + KAVG (total guest-visible latency)
# 7. Test with jumbo frames if storage network uses them
vmkping -I vmk2 -d -s 8972 [storage-array-ip]
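Across many NFS datastores, a quick filter on the Accessible column is handy. A hypothetical helper, assuming the typical field layout of esxcli storage nfs list (Volume Name=1, Host=2, Share=3, Accessible=4) and single-word datastore names; check the columns on your output first:

```shell
# Hypothetical helper: print NFS datastores whose Accessible column
# (assumed field 4) reads "false".
inaccessible_nfs() {
  awk 'NR > 2 && $4 == "false" { print $1 }'
}
# On an ESXi host:
#   esxcli storage nfs list | inaccessible_nfs
```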
Common Causes and Fixes
- Storage VMkernel on wrong VLAN: The storage VMkernel was accidentally put on a different VLAN than the storage array. Fix: Verify and correct the VLAN assignment
- MTU mismatch on storage network: If the storage network uses jumbo frames, every hop (ESXi vmk, physical switch, storage array port) must have matching MTU. A single mismatch causes connection drops or poor performance
- NIC failover misconfigured: If the active NIC for the storage port group fails over to a NIC on a different physical switch that does not have the storage VLAN, connectivity is lost. Fix: Ensure the failover NICs have the same VLAN access
- Overloaded storage network: Too much non-storage traffic sharing the same physical NICs. Fix: Use NIOC to reserve bandwidth for storage, or dedicate physical NICs to storage traffic
Problem 6: VM Network Adapter Issues
Quick Checks
# Check VM's network adapter from ESXi CLI
vim-cmd vmsvc/getallvms
# Get the VM ID, then:
vim-cmd vmsvc/device.getdevices [vmid] | grep -A5 "Network"
# Verify the VM is connected to the correct port group
esxcli network vm list
esxcli network vm port list -w [world-id]
# Check if the VM port group exists
esxcli network vswitch standard portgroup list
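Chaining the two esxcli network vm commands above means copying a world ID by hand. A hypothetical helper to look it up by VM name, assuming the typical column layout of esxcli network vm list (World ID=1, Name=2) and a single-word VM name:

```shell
# Hypothetical helper: print the network world ID for a given VM name
# (assumed fields: World ID=1, Name=2; breaks on names with spaces).
vm_world_id() {
  awk -v vm="$1" 'NR > 2 && $2 == vm { print $1 }'
}
# On an ESXi host:
#   wid="$(esxcli network vm list | vm_world_id web01)"
#   esxcli network vm port list -w "$wid"
```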
Common VM-Level Fixes
- VM NIC disconnected: The “Connected” checkbox in VM settings is unchecked. Edit VM settings and check the “Connected” box
- Wrong port group: The VM is connected to the wrong port group (wrong VLAN). Change the port group assignment in VM settings
- E1000 adapter instead of VMXNET3: Older VMs may have the legacy E1000 adapter. Change to VMXNET3 for better performance (requires VMware Tools and a brief network interruption)
- Guest OS firewall: The guest OS (Windows Firewall, iptables) is blocking traffic. Check guest firewall rules
- Stale MAC address in physical switch CAM table: After a vMotion, the physical switch may still have the old host’s port in its MAC address table. Fix: Wait for MAC aging (default 300 seconds) or clear the MAC table entry on the switch:
clear mac address-table dynamic address [mac-addr]
Essential ESXi Networking Commands Cheat Sheet
| Command | Purpose |
|---|---|
| esxcli network nic list | List all physical NICs and link status |
| esxcli network nic stats get -n vmnic0 | Show NIC error counters and statistics |
| esxcli network vswitch standard list | Show standard vSwitch configuration |
| esxcli network vswitch dvs vmware list | Show Distributed Switch configuration |
| esxcli network ip interface list | List all VMkernel interfaces |
| esxcli network ip interface ipv4 get | Show VMkernel IP addresses |
| vmkping -I vmk0 [ip] | Ping from a specific VMkernel interface |
| vmkping -I vmk0 -d -s 8972 [ip] | Test jumbo frame path (no fragmentation) |
| esxcli network ip route ipv4 list | Show routing table |
| esxcli network ip neighbor list | Show ARP table |
| pktcap-uw --uplink vmnic0 -o /tmp/cap.pcap | Capture packets on a physical NIC |
| esxcli network vm list | List all VMs and their network world IDs |
| esxcli network firewall ruleset list | Show ESXi host firewall rules |
| esxtop (press ‘n’) | Real-time network performance monitoring |
Related Resources on UnifiedGuru
- vSphere 8 Networking Guide for Network Engineers
- VLAN ID Service Console in ESX
- All VMware Configuration Templates
- ESXi CLI Troubleshooting Templates
Conclusion
ESXi network troubleshooting comes down to methodically working through the layers — physical NIC status, virtual switch configuration, VLAN alignment with the physical network, and end-to-end connectivity verification. The esxcli commands and pktcap-uw packet capture tool give you the same level of visibility into the virtual network that you have on physical switches. Combined with checking the physical switch side, you can isolate most networking issues to a specific layer and fix them quickly.
Running into a tricky ESXi networking issue not covered here? Post the details in our ESXi/vSphere forum and the community can help you troubleshoot.