»Demystifying Network Cards«
2017-12-27, 11:45–12:15, Saal Borg
Network cards (NICs) are often seen as black boxes: you put data into a socket on one side and packets come out at the other end, or the other way around. Let's have a deeper look at how a network card actually works at the lower levels by writing a simple userspace driver from scratch for a 10 Gbit/s NIC.
The first part of the talk looks at the evolution from 10 Mbit/s to 100 Gbit/s networks, both from a hardware perspective and in terms of how the software had to change, from pre-NAPI Linux all the way to Linux XDP, to keep up. Hundreds of thousands of lines of code are involved when handling a packet in a typical operating system. Reading and understanding so much code is quite tedious, so the obvious next question is: how hard can it be to implement a driver for a modern 10 Gbit/s NIC from scratch while ignoring all of the existing software layers? It turns out that it's not very hard: I've written ixy, a userspace driver for 10 Gbit/s NICs from the Intel 82599 family (X520, X540, X550), from scratch in about 1000 lines of C code. The second part of the talk focuses on userspace drivers and the Intel 82599 architecture, which is easy to understand, has a great datasheet, and keeps the core functionality in the driver as opposed to magic black-box firmware. You will not only learn about my driver implementation, the talk also discusses how it relates to similar network frameworks like DPDK or Snabb and seemingly similar frameworks like netmap, XDP, pf_ring, or pfq.
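To give a flavor of what "driver from scratch" means at this level: the driver ultimately configures the NIC by reading and writing memory-mapped device registers whose offsets come from the datasheet. A minimal sketch of such register accessors might look like this (helper names are illustrative, not the actual ixy functions; the example offset is hypothetical):

```c
#include <stdint.h>

// Write a 32-bit value to a device register at the given byte offset
// within the mapped BAR (base address register region). The 'volatile'
// cast forces the compiler to emit the actual memory access instead of
// caching or reordering it, which matters for device registers.
static void set_reg32(uint8_t* bar, int reg, uint32_t value) {
    *((volatile uint32_t*) (bar + reg)) = value;
}

// Read a 32-bit device register at the given byte offset.
static uint32_t get_reg32(const uint8_t* bar, int reg) {
    return *((volatile uint32_t*) (bar + reg));
}
```

Almost everything a driver does, from resetting the NIC to enabling queues, boils down to sequences of such reads and writes at offsets documented in the datasheet.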
Packet processing in software is currently undergoing a huge paradigm shift. Connection speeds of 10 Gbit/s and above created new problems, and operating systems couldn't keep up. Hence, there has been a rise of frameworks and libraries working around the kernel, sometimes referred to as kernel bypass or zero copy (the latter is a misnomer). Examples are DPDK, Snabb, netmap, XDP, pf_ring, and pfq. These new frameworks break with all traditional APIs and present new paradigms. For example, they usually give an application exclusive access to a network interface and exchange raw packets with the app. There are no sockets; they don't even offer a protocol stack. Hence, they are mostly used for low-level packet processing apps: routers, (virtual) switches, firewalls, and annoying middleboxes "optimizing" your connection. These frameworks have already changed how network research is done in academia by shifting the focus from hardware to software. It's now feasible to write quick prototypes of packet processing and forwarding apps that were restricted to dedicated hardware in the past, enabling everyone to build and test high-speed networking equipment on a low budget.
These concepts are slowly creeping into operating systems: FreeBSD ships with netmap today, XDP is coming to Linux, Open vSwitch can be compiled with a DPDK backend, pfSense is adopting DPDK as well, ... We need to look at the architecture of all of these frameworks to better understand what is coming for us. Most of these frameworks build on the original drivers, which have been growing in complexity: a typical driver for a 10 or 40 Gbit/s NIC is on the order of 50,000 lines of code nowadays. This is why it's important to have a simple driver like ixy: for hacking and educational purposes. Core functionality of the driver, like handling DMA buffers, is never far away when writing an ixy app: you typically only need to look beneath one layer to see the guts of the driver. For example, when you send out a packet you call a transmit function that directly modifies a ring buffer of DMA descriptors.
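The transmit path mentioned above can be sketched in a few lines. This is a heavily simplified model, assuming an imaginary descriptor layout: the real 82599 uses 16-byte "advanced" descriptors with more fields and flags (see the datasheet), and the struct and function names here are illustrative, not the actual ixy code.

```c
#include <stdint.h>

// Simplified transmit descriptor: where the packet lives in physical
// memory (the NIC reads it via DMA), how long it is, and some flags.
struct tx_descriptor {
    uint64_t buffer_addr; // physical address of the packet buffer
    uint16_t length;      // packet length in bytes
    uint16_t flags;       // e.g. "end of packet", "report status"
};

#define RING_SIZE 512 // power of two, so wrap-around is a cheap mask

struct tx_ring {
    struct tx_descriptor descriptors[RING_SIZE];
    uint16_t tail; // index of the next free slot
};

// Queue one packet: fill the next descriptor and advance the tail.
// On real hardware you would then write the new tail index into the
// NIC's tail register to tell it that new descriptors are ready.
static void tx_enqueue(struct tx_ring* ring, uint64_t phys_addr,
                       uint16_t len) {
    struct tx_descriptor* desc = &ring->descriptors[ring->tail];
    desc->buffer_addr = phys_addr;
    desc->length = len;
    desc->flags = 1; // hypothetical "end of packet" flag
    ring->tail = (ring->tail + 1) & (RING_SIZE - 1); // wrap around
}
```

The point is that there is no queueing discipline, no skb, no socket layer in between: sending a packet is a struct write plus a register write.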
ixy is a full userspace driver: you get your raw packets delivered directly into your application, and the operating system doesn't even know the NIC exists. Userspace drivers are also very hackable: you get direct access to the full hardware from your application in userspace, making it really easy to test out new features, with no pesky kernel code needed.
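How does a userspace process get at the hardware in the first place? On Linux, sysfs exposes each PCI device's BARs as `resource` files that can be mmap'ed, which is the mechanism a driver like this can use. A sketch, where the helper name is made up and the device address in the comment is just an example:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Map a PCI BAR into our address space via sysfs. For a real NIC the
// path would look like /sys/bus/pci/devices/0000:03:00.0/resource0
// (device address is an example). After this, device registers are
// just memory, no kernel driver involved.
static uint8_t* map_pci_resource(const char* path, size_t* out_size) {
    int fd = open(path, O_RDWR);
    if (fd == -1) {
        perror("open");
        return NULL;
    }
    struct stat st;
    if (fstat(fd, &st) == -1) {
        perror("fstat");
        close(fd);
        return NULL;
    }
    uint8_t* mem = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    close(fd); // the mapping stays valid after closing the fd
    if (mem == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }
    if (out_size) {
        *out_size = (size_t) st.st_size;
    }
    return mem;
}
```

Everything after this mapping, resetting the NIC, setting up rings, polling for packets, is ordinary C code running in your process, which is exactly what makes such a driver so easy to poke at.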
Check out the code of ixy on GitHub!