What is DPDK

DPDK is the Data Plane Development Kit, a set of libraries that accelerate packet processing workloads on a wide variety of CPU architectures; it is designed to run on x86, POWER and ARM processors. Its poll-mode drivers move packet processing out of the operating system kernel into processes running in user space. This offloading achieves higher computing efficiency and higher packet throughput than is possible with the interrupt-driven processing provided by the kernel.

Why DPDK for voipmonitor

Packet capture through the Linux kernel is driven by IRQ interrupts - every packet (or, if the driver supports it, every batch of packets) has to be handled by an interrupt, which limits throughput to around 3Gbit on 10Gbit cards (depending on the CPU). DPDK reads packets directly in user space without interrupts (so-called poll-mode reading), which allows much faster packet capture. It needs some tweaks to the operating system (CPU affinity / NOHZ kernel) because the reader thread is sensitive to any scheduler delays, which can occur on an overloaded or misconfigured system. At a 6Gbit rate with 3 000 000 packets per second, even slight delays can cause packet drops.
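The CPU affinity / NOHZ tweaks mentioned above are usually done through kernel boot parameters. A minimal sketch, assuming the reader and worker threads will be pinned to cores 2 and 3 (as in the example configuration further below) and a kernel built with NO_HZ_FULL support - isolcpus keeps the cores away from the general scheduler, nohz_full stops the periodic timer tick on them and rcu_nocbs offloads RCU callbacks from them:

/etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3"

Combine these with any parameters already present (such as the IOMMU options shown in the installation section below), then run update-grub and reboot.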


Installation

DPDK version >= 21.08.0 is required - download the latest version from:

https://core.dpdk.org/download/


How it works

On supported NIC cards (https://core.dpdk.org/supported/) the ethernet port needs to be unbound from the kernel and bound to DPDK; the commands for this are shown below. A few notes:

  • no special driver is needed - Debian 10/11 already supports this out of the box
  • bind/unbind means that when you unbind a NIC port from the kernel you cannot use it within the operating system - the port disappears (you will not see eth1, for example)
  • you can unbind the port from DPDK and bind it back to the kernel so that eth1 can be used again
  • DPDK references a NIC port by its PCI address, which you can get for example from the "dpdk-devbind.py -s" command

List of available network devices:

dpdk-devbind.py -s

Network devices using kernel driver
===================================
0000:0b:00.0 'NetXtreme II BCM5709 Gigabit Ethernet 1639' if=enp11s0f0 drv=bnx2 unused= *Active*
0000:0b:00.1 'NetXtreme II BCM5709 Gigabit Ethernet 1639' if=enp11s0f1 drv=bnx2 unused=
0000:1f:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=ens3f0 drv=ixgbe unused=
0000:1f:00.1 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=ens3f1 drv=ixgbe unused=

Bind both 10Gbit ports to the vfio-pci driver (this driver is available by default on Debian 10 and later):

modprobe vfio-pci
dpdk-devbind.py -b vfio-pci 0000:1f:00.0 0000:1f:00.1
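To verify the binding, list the devices again - both ports should now appear under the "Network devices using DPDK-compatible driver" section instead of the kernel driver section:

dpdk-devbind.py -s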

Bind the second port (0000:1f:00.1) back to the kernel ixgbe driver:

dpdk-devbind.py -b ixgbe 0000:1f:00.1

On some systems vfio-pci does not work for the 10Gbit card - instead igb_uio (for Intel cards) needs to be loaded, together with special kernel parameters:

/etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="iommu=pt intel_iommu=on"
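After editing the file, regenerate the GRUB configuration and reboot so that the IOMMU parameters take effect:

update-grub
reboot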

Loading igb_uio for X540-AT2 4 port 10Gbit card (if vfio does not work)

modprobe igb_uio
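With the module loaded, bind the ports to igb_uio in the same way as with vfio-pci (a sketch assuming the same PCI addresses as in the listing above; note that igb_uio is an out-of-tree module and may first have to be built from the dpdk-kmods repository):

dpdk-devbind.py -b igb_uio 0000:1f:00.0 0000:1f:00.1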

More information about drivers:

https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html


DPDK is now ready to be used by voipmonitor.

Sniffer configuration

Mandatory parameters

  • dpdk_read_thread_core sets which CPU core the reader thread (polling the NIC for packets) will run on.
  • dpdk_worker_thread_core sets which CPU core the worker thread will run on - it should be the hyperthread sibling of the core set for dpdk_read_thread_core
  • dpdk_pci_device - which interface will be used for sniffing packets
  • it is important to lock the reader and worker threads to particular CPU cores so that the sniffer will not use those cores for other threads
  • in case of multiple NUMA nodes (two or more physical CPUs) always choose CPU cores for the reader and worker thread which are on the same NUMA node as the NIC PCI card (see the commands after this list)
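A minimal sketch of how to check the topology before choosing the cores, assuming the NIC from the example configuration below (0000:04:00.0); these are standard Linux sysfs paths, not voipmonitor-specific commands:

cat /sys/bus/pci/devices/0000:04:00.0/numa_node
cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list
lscpu | grep NUMA

The first command prints the NUMA node the NIC is attached to (-1 on single-node systems), the second prints the hyperthread siblings of core 2 (for example "2,3"), and the third shows which cores belong to which NUMA node.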


voipmonitor.conf: 

interface = dpdk:0 
dpdk = yes
dpdk_read_thread_core = 2
dpdk_worker_thread_core = 3
dpdk_pci_device = 0000:04:00.0

Optional parameters

cpu_cores = 1,2,3-5,4 ; sets the CPU affinity for voipmonitor. It is automatically set to all CPU cores except dpdk_read_thread_core and dpdk_worker_thread_core; using this option overrides the automatic choice
dpdk_nb_rx = 4096 ; size of the RX ring buffer on the NIC port. The maximum for the Intel X540 is 4096, but it can be larger for other cards
dpdk_nb_tx = 1024 ; size of the TX ring buffer - voipmonitor does not send packets, but DPDK requires it (default is 1024)
dpdk_nb_mbufs = 8192 ; number of packets (multiplied by 1024) buffered between the reader and the worker. Each packet takes around 2kB, so the default allocates about 16GB of RAM (see the calculation after this list)
dpdk_ring_size = 4096 ; number of packets (multiplied by 1024) in the ring buffer which holds references to mbuf structures between the worker thread and voipmonitor's packet buffer
dpdk_pkt_burst = 32 ; do not change this unless you know exactly what you are doing
dpdk_mempool_cache_size = 512 ; size of the cache for the DPDK mempool (do not change this unless you know exactly what you are doing)
dpdk_memory_channels = 4 ; number of memory bank channels - if not specified, DPDK uses a default value (we are not sure whether it tries to guess it or what the default is)
dpdk_force_max_simd_bitwidth = 512 ; not set by default - if your CPU supports AVX-512 and DPDK was compiled with AVX-512 support, you can try setting this to 512
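A rough sanity check of the dpdk_nb_mbufs memory estimate mentioned above (the ~2kB per packet is an approximation, not an exact value):

8192 * 1024 mbufs * ~2kB per mbuf ≈ 16 777 216 kB ≈ 16 GB

Lowering dpdk_nb_mbufs (for example to 2048) proportionally lowers the memory footprint, at the cost of a smaller buffer between the reader and the worker thread.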