I got one thing wrong, and that saves bundles of work. Here is a comment from my experimental code:
/*
* Confusingly, ip(7) states
*
* IP_MTU (since Linux 2.2)
* Retrieve the current known path MTU of the current socket.
* Returns an integer. IP_MTU is valid only for getsockopt(2) and
* can be employed only when the socket has been connected.
*
* Similarly, ipv6(7) states
*
* IPV6_MTU
* getsockopt(): Retrieve the current known path MTU of the current
* socket. Valid only when the socket has been connected. Returns
* an integer.
*
* setsockopt(): Set the MTU to be used for the socket. The MTU
* is limited by the device MTU or the path MTU when path MTU
* discovery is enabled. Argument is a pointer to integer.
*
* This suggests that IP_MTU is a socket property. However, it makes
* more sense as a shared global property, which indeed seems to apply:
*
* The ipv6(7) entry for IPV6_MTU_DISCOVER references IP_MTU_DISCOVER;
* the ip(7) entry for IP_MTU_DISCOVER states
*
* IP_MTU_DISCOVER (since Linux 2.2)
* When PMTU discovery is enabled, the kernel automatically keeps track
* of the path MTU per destination host. When it is connected to a
* specific peer with connect(2), the currently known path MTU can be
* retrieved conveniently using the IP_MTU socket option (e.g., after
* an EMSGSIZE error occurred). The path MTU may change over time.
* For connectionless sockets with many destinations, the new MTU for a
* given destination can also be accessed using the error queue (see
* IP_RECVERR). A new error will be queued for every incoming MTU update.
*
* While MTU discovery is in progress, initial packets from datagram
* sockets may be dropped. Applications using UDP should be aware
* of this and not take it into account for their packet retransmit strategy.
*
* Retransmission is common in UDP applications. Ideally, IP_RECVERR or
* IPV6_RECVERR is used to resend immediately, without waiting for timers to
* expire, and without limiting the number of Path MTU lessons learnt to the
* number of timer rounds.
*
* For IPv6, where fragmentation is required to accommodate the Path MTU, and
* for unconnected applications, the lessons from Path MTU discovery have a
* major impact on behaviour; we should always let the socket fragment
* frames when so desired, so:
*
* IP_MTU_DISCOVER (since Linux 2.2)
* IP_PMTUDISC_WANT will fragment a datagram if needed according to the
* path MTU, [IPv4-only: or will set the don't-fragment flag otherwise].
*
* Path MTU discovery value   Meaning
* IP_PMTUDISC_WANT           Use per-route settings.
* IP_PMTUDISC_DONT           Never do Path MTU Discovery.
* IP_PMTUDISC_DO             Always do Path MTU Discovery.
* IP_PMTUDISC_PROBE          Set DF but ignore Path MTU.
*
*/
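For one socket, the comment above roughly boils down to the following minimal sketch. It assumes Linux with glibc headers; the helper names `set_pmtudisc_want` and `get_path_mtu` are made up for illustration and are not Kamailio functions.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: let the kernel fragment according to the path MTU
 * (the *_PMTUDISC_WANT mode quoted from ip(7) above).
 * Returns 0 on success, -1 with errno set on failure. */
static int set_pmtudisc_want(int fd, int family)
{
    int mode;

    if (family == AF_INET6) {
        mode = IPV6_PMTUDISC_WANT;
        return setsockopt(fd, IPPROTO_IPV6, IPV6_MTU_DISCOVER,
                          &mode, sizeof mode);
    }
    mode = IP_PMTUDISC_WANT;
    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof mode);
}

/* Hypothetical helper: currently known path MTU of a connected socket.
 * Returns the MTU, or -1 with errno set (the man pages say the option is
 * only valid once the socket has been connected). */
static int get_path_mtu(int fd, int family)
{
    int mtu = 0;
    socklen_t len = sizeof mtu;
    int rc;

    if (family == AF_INET6)
        rc = getsockopt(fd, IPPROTO_IPV6, IPV6_MTU, &mtu, &len);
    else
        rc = getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len);

    return rc == 0 ? mtu : -1;
}
```

For an unconnected UDP socket only the first helper is useful; `get_path_mtu` needs a prior connect(2).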
I'm documenting it here, so that the knowledge is not lost to the project. This is difficult stuff.
It would seem that Path MTU discovery is not maintained per socket (which would benefit locality and proper cleanup of the knowledge) but as a global kernel property for the route (which benefits reuse of the knowledge, in other words a useful form of caching).
sysctl() could make such a setting; Kamailio stability demands this for IPv6, AFAIK.
IPV6_RECVERR enables immediate resending, with improved Path MTU knowledge. This involves an extra polling mechanism, and it also links into the tm logic; both are beyond my reach.
For sl replies there will probably be a 2nd round if Path MTU problems arise, because the reply was sent-then-forgotten, and needs to wait for another round.