PMTU weirdness

Sat 11 March 2023

On a good day, my day job involves building networking tools in Python. Too many Python networking tools look like shell scripts, spawning subprocesses for basic tools like 'ping' or 'ip', which often results in a fragile mess due to poor or inconsistent error handling.

I was quite excited to find icmplib. It provides a much simpler, less fragile way to do things like reachability tests, RTT measurements, path discovery and path MTU discovery in Python code. Hopefully it finds its way into Fedora soon!

Armed with icmplib, I went on a journey of discovery to develop my understanding of Path MTU. I specifically wanted to understand the differences between IPv4 and IPv6, and the effect of VETH, VLAN and Bridge virtual devices.

icmplib needed some improvement before it could do PMTU discovery: it had to be able to set the DF flag in the IPv4 header and the IPV6_DONTFRAG socket option. In the IPv6 case, the socket option does not change the IP header; it instructs the kernel not to fragment IPv6 packets automatically at the source. I raised a PR for this.
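To make the two options concrete, here is a minimal sketch of setting them on a plain Python socket. This is not the code from the PR; it assumes Linux, the helper name set_dont_fragment is made up for illustration, and the numeric fallbacks are the Linux header values, used only if the Python build does not export the names.

```python
import socket

# Linux values from <linux/in.h> and <linux/in6.h>, used as fallbacks
# (assumption: Linux only) when the socket module lacks these names.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)
IPV6_DONTFRAG = getattr(socket, "IPV6_DONTFRAG", 62)


def set_dont_fragment(sock: socket.socket) -> None:
    """Ask the kernel not to fragment packets sent on this socket.

    Hypothetical helper for illustration only.
    """
    if sock.family == socket.AF_INET:
        # IPv4: IP_PMTUDISC_DO sets the DF flag on outgoing packets.
        sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    elif sock.family == socket.AF_INET6:
        # IPv6: no header bit changes; the kernel simply refuses to
        # fragment at the source.
        sock.setsockopt(socket.IPPROTO_IPV6, IPV6_DONTFRAG, 1)
    else:
        raise ValueError(f"unsupported address family: {sock.family!r}")
```

With either option set, sending a packet that is too large for the outgoing interface should fail locally with an error instead of being fragmented silently.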

The results were not entirely obvious to me:

[Figure: PMTU test results]

It seems that devices are willing to accept frames that are larger than their configured MTU, but won't send frames larger than their configured MTU. This is consistent with Postel's Robustness Principle.

The side effect is that PMTU discovery may report a slightly larger usable PMTU than the MTU that was intentionally configured on remote devices. Could this lead to unintended, undiscovered fragmentation?
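For readers who want to picture the discovery itself, here is a rough sketch of a PMTU search as a binary search over packet sizes. It is not necessarily how icmplib does it. The probe callable is a stand-in for "send an unfragmentable echo request of this size and see if a reply comes back", and the header sizes assume no IPv4 options and no IPv6 extension headers.

```python
from typing import Callable

IPV4_HEADER = 20  # bytes, assuming no IP options
IPV6_HEADER = 40  # bytes, fixed header only
ICMP_HEADER = 8   # bytes, echo header for both ICMP and ICMPv6


def payload_for_mtu(mtu: int, ipv6: bool = False) -> int:
    """Echo payload size that makes the whole IP packet exactly `mtu` bytes."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - ICMP_HEADER


def discover_pmtu(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which probe(size) succeeds.

    Assumes that if a size gets through, every smaller size does too.
    """
    if not probe(low):
        raise ValueError("even the smallest size did not get through")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if probe(mid):
            low = mid       # mid got through, try larger
        else:
            high = mid - 1  # mid was too big, try smaller
    return low


def fake_path(size: int) -> bool:
    """Hypothetical path that passes packets of up to 1400 bytes."""
    return size <= 1400
```

Calling discover_pmtu(fake_path, 68, 1500) walks down to 1400. On a real path the answer is whatever the probes report, which is exactly why a device that accepts more than its configured MTU can nudge the result upwards.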

And if none of this meant anything to you, remember this: bridges don't fragment; routers do.