Florentin Putz June 2024

The Nexmon UDP Tunnel

You might have heard of nexmon before: it is a firmware patching framework for Broadcom/Cypress WiFi chips, developed at our research group SEEMOO. Nexmon can be used to enable monitor mode on smartphones and Raspberry Pis with Broadcom chips, to get lower-level access to channel state information (CSI), or even to turn those devices into software-defined radios (SDRs). It was also used by the folks at Google Project Zero for their impressive WiFi SoC exploit, which allowed them to take over the smartphone’s operating system just by sending a specially crafted WiFi frame over the air. Fascinating, and also a bit scary.

Back to nexmon. It includes a firmware patch that allows you to transmit arbitrary data from the WiFi chip to the host OS via UDP. One function of the patch looks a bit strange:

static inline uint16_t calc_checksum(uint16_t total_len) {
    return ~(23078 + total_len);
}

What does this function do? And where does the magic number 23078 come from?

That’s what this post is about.

UDP Tunnel

This function originated in my Bachelor’s thesis, where I developed a way to extract WiFi probe requests from the WiFi chip and send them to the host OS, in order to demonstrate a severe privacy vulnerability that affected most modern smartphones at the time. I needed some way to send this data to an Android app running on the host, without the app requiring root privileges. My approach was to use IP packets containing UDP datagrams with the probe request as the data payload, thereby constructing a UDP tunnel from the WiFi chip to the host. UDP is very simple and easy to implement in both C (which compiles to ARM machine code running as part of the WiFi firmware) and Java (running on the Android host).

In principle, both IPv4 and IPv6 packets can be used for this purpose. The main difference lies in the checksum construction:

  1. IPv4 + UDP: IPv4 packets contain a mandatory checksum in their header. This checksum only covers the IPv4 header itself, not the data payload. UDP datagrams inside IPv4 packets contain an optional checksum.

  2. IPv6 + UDP: IPv6 packets do not contain a checksum in their header. UDP datagrams embedded in IPv6 packets, however, contain a mandatory checksum, which covers a pseudo IPv6 header, the UDP header and the UDP data payload.

I chose the former method to decrease the amount of computation on the WiFi chip. Even though I still needed to compute the IPv4 header checksum, this computation can be reduced to a minimum by utilizing the fact that most of the IPv4 header is constant in my system. Parts of the checksum can thus be pre-calculated. With the latter method, this would only be possible to a minor extent, as I would need to calculate the checksum over each probe request payload separately.
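To make the first option more concrete, here is a minimal Python sketch of a UDP datagram whose optional checksum is simply left at zero, which in UDP over IPv4 means that no checksum was computed. The function names and the port number 5500 are hypothetical and only serve as illustration; the firmware patch itself is written in C.

```python
import struct

IPV4_HEADER_LEN = 20  # bytes, IPv4 header without options
UDP_HEADER_LEN = 8    # bytes


def build_udp_datagram(payload: bytes, src_port: int, dst_port: int) -> bytes:
    """Return a UDP header plus payload with the checksum field set to zero."""
    udp_len = UDP_HEADER_LEN + len(payload)
    # Fields: source port, destination port, length, checksum (0 = not computed).
    header = struct.pack("!HHHH", src_port, dst_port, udp_len, 0)
    return header + payload


def ipv4_total_length(payload: bytes) -> int:
    """Value of the total length field of the enclosing IPv4 packet."""
    return IPV4_HEADER_LEN + UDP_HEADER_LEN + len(payload)


# Hypothetical payload and port numbers, chosen only for this example.
datagram = build_udp_datagram(b"probe request bytes", 5500, 5500)
print(len(datagram), ipv4_total_length(b"probe request bytes"))
```

Nothing in this sketch has to touch the payload bytes to compute a checksum, which is the point of choosing IPv4: only the total length changes from packet to packet.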

The resulting packet structure can be seen in the figure below. This packet is then sent over SDIO to the Linux driver, which will inject the Ethernet packet into the network stack. The embedded UDP datagram can then be received using a datagram socket in userland.

Resulting packet structure. The probe request is wrapped in an IPv4/UDP packet, which is wrapped in an Ethernet packet. In the firmware, this packet then also needs a BDC header for correct transmission over the SDIO interface. Figure from my Bachelor's thesis.
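On the host side, receiving the tunneled data needs nothing more than an ordinary datagram socket. The original receiver was an Android app written in Java; the following is only a rough Python sketch of the same idea. The function names, the timeout, and the buffer size are assumptions made for this example, and the port has to match whatever destination port the firmware writes into the UDP header.

```python
import socket


def open_tunnel_socket(port: int, timeout: float = 1.0) -> socket.socket:
    """Open a UDP socket that listens on all interfaces on the given port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind to all interfaces, because the tunnel packets are addressed
    # to the broadcast address 255.255.255.255.
    sock.bind(("", port))
    sock.settimeout(timeout)
    return sock


def receive_one(sock: socket.socket, max_size: int = 4096):
    """Return the payload of the next datagram, or None on timeout."""
    try:
        payload, _sender = sock.recvfrom(max_size)
        return payload
    except socket.timeout:
        return None
```

A receiver would call open_tunnel_socket once and then call receive_one in a loop, treating each returned payload as one probe request. No root privileges are needed as long as the port number is above 1023.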

IPv4 Checksum Pre-Calculation

The checksum of the IPv4 packet can be pre-calculated to improve the performance of the UDP tunnel. The only variable field in the IPv4 header is the total length, because this depends on the probe request embedded in the UDP payload.

The IPv4 standard defines the checksum algorithm as follows:

“The checksum field is the 16 bit one’s complement of the one’s complement sum of all 16 bit words in the header. For purposes of computing the checksum, the value of the checksum field is zero.”
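Read literally, this definition translates into a short routine: add up all 16-bit words, fold any carry out of the upper bits back into the lower 16 bits (the "end-around carry" of one's complement addition), and invert the result. The following Python sketch does this for an arbitrary header; the function names and the sample header are chosen for illustration and are not part of the firmware patch.

```python
def ones_complement_sum(data: bytes) -> int:
    """One's complement sum of all big-endian 16-bit words in data."""
    if len(data) % 2:
        raise ValueError("length must be a multiple of 2 bytes")
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # End-around carry: fold the overflow back in until 16 bits remain.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ipv4_header_checksum(header: bytes) -> int:
    """Checksum of a header whose checksum field is currently zero."""
    return ~ones_complement_sum(header) & 0xFFFF


# Sample 20-byte header (illustrative values) with a zeroed checksum field.
sample = bytes.fromhex("4500007300004000" "40110000" "c0a80001" "c0a800c7")
print(hex(ipv4_header_checksum(sample)))  # 0xb861
```

A handy property for testing: once the checksum is written into the header, the one's complement sum over the whole header is 0xFFFF.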

My IPv4 header looks like this:

unsigned char ipv4_header_array[] = {
  0x45, 0x00,             /* Version, Header length, DSCP, ECN */
  0x00, 0x00,             /* Total length (determined later) */
  0x00, 0x01, 0x00, 0x00, /* Identification, Flags, Fragments */
  0x01, 0x11,             /* TTL, Protocol (UDP) */
  0x00, 0x00,             /* Header checksum (determined later) */
  0x0A, 0x0A, 0x0A, 0x0A, /* Source IP = 10.10.10.10 */
  0xFF, 0xFF, 0xFF, 0xFF  /* Destination IP = 255.255.255.255 */
};

The following Python code demonstrates a naive checksum calculation that strictly follows the algorithm:

def calc_chksum(total_len):
    """Calculates the IPv4 header checksum of the packet used in the
    'extractor', given the total IPv4 packet length.
    """
    accu = 0x4500 + 0x1 + 0x111 + 0xA0A + 0xA0A + 0xFFFF + 0xFFFF
    accu += total_len
    # Fold the carry back into the lower 16 bits (end-around carry).
    # Repeat, because the fold itself can produce a new carry.
    while accu >> 16:
        accu = (accu & 0xFFFF) + (accu >> 16)
    return ~accu & 0xFFFF

To reduce the amount of computation, the sum can be calculated in advance without the total length. The constant header words add up to 0x25A24; folding the carry of 2 back into the lower 16 bits gives 0x5A26 = 23078, which is the magic number. As long as adding the total length to this constant does not produce another carry, the total length can simply be added afterwards. This is the case for all total lengths less than or equal to 65535 - 23078 = 42457 bytes, which corresponds to a maximum probe request size of 42429 bytes after subtracting the 20-byte IPv4 header and the 8-byte UDP header. According to the IEEE 802.11 standard, management frames, and therefore probe requests, have a maximum size of 2352 bytes. Thus, our simplification is accurate for all possible probe requests.
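The numbers in this paragraph can be recomputed directly from the header bytes. The following Python sketch copies the header array from above (with total length and checksum still zero) and derives the pre-calculated constant and both size limits from it; the names are made up for this example.

```python
# The IPv4 header from above, with total length and checksum left at zero.
IPV4_HEADER = bytes([
    0x45, 0x00,              # Version, header length, DSCP, ECN
    0x00, 0x00,              # Total length (filled in later)
    0x00, 0x01, 0x00, 0x00,  # Identification, flags, fragment offset
    0x01, 0x11,              # TTL, protocol (UDP)
    0x00, 0x00,              # Header checksum (filled in later)
    0x0A, 0x0A, 0x0A, 0x0A,  # Source IP = 10.10.10.10
    0xFF, 0xFF, 0xFF, 0xFF,  # Destination IP = 255.255.255.255
])


def folded_sum(header: bytes) -> int:
    """One's complement sum of the header's 16-bit words."""
    total = sum((header[i] << 8) | header[i + 1]
                for i in range(0, len(header), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


CONSTANT = folded_sum(IPV4_HEADER)       # pre-calculated part of the sum
MAX_TOTAL_LEN = 0xFFFF - CONSTANT        # largest length without a new carry
MAX_PAYLOAD = MAX_TOTAL_LEN - 20 - 8     # minus IPv4 and UDP header sizes

print(CONSTANT, MAX_TOTAL_LEN, MAX_PAYLOAD)  # 23078 42457 42429
```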

The resulting code looks like this:

def calc_chksum_fast(total_len):
    return ~(23078 + total_len) & 0xFFFF # accurate for total_len <= 42457

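This corresponds to the C function at the top of this post. There, the uint16_t return type truncates the result to 16 bits, which has the same effect as the & 0xFFFF mask in Python.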
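The validity bound can also be checked by brute force, since the total length field only has 16 bits. The sketch below compares the shortcut against a reference implementation that folds carries until none remain; both function names are chosen for this example.

```python
# Constant 16-bit words of the header (total length and checksum excluded).
HEADER_WORDS = (0x4500, 0x0001, 0x0000, 0x0111,
                0x0A0A, 0x0A0A, 0xFFFF, 0xFFFF)


def checksum_reference(total_len: int) -> int:
    """Full one's complement checksum with repeated end-around carry."""
    total = sum(HEADER_WORDS) + total_len
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def checksum_fast(total_len: int) -> int:
    """Shortcut using the pre-calculated constant."""
    return ~(23078 + total_len) & 0xFFFF


# Try every value the 16-bit total length field can hold.
mismatches = [n for n in range(0x10000)
              if checksum_reference(n) != checksum_fast(n)]
print(min(mismatches))  # 42458
```

The first disagreement appears at a total length of 42458 bytes, far beyond anything a probe request can reach.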