Path MTU issues over UDP/IPv6


Rick van Rein

12 May 2022

11:31 a.m.

Hello,

IPv6 routers never fragment packets. Rather, they drop a packet that is too large for a (local) MTU and send back ICMPv6 "Packet Too Big". This seems to cause loss of larger SIP messages when an ISP tunnels their IPv6 at the expense of the MTU.

The pmtu_discovery flag sets Don't Fragment in IPv4 traffic; in IPv6 this is an implied property. Does Kamailio learn a lower MTU from any "Packet Too Big" for IPv6 even if pmtu_discovery is not set? Future resends can then be fragmented appropriately.

The udp_mtu setting diverts to another protocol, but that would be a setting as low as the worst peer, impacting all. It would be a weird struggle with a telco serving many. PMTU would be better to rely on, but how does it work in Kamailio?

Details on https://www.rfc-editor.org/rfc/rfc3542#section-11.3 https://stackoverflow.com/questions/38817837/how-does-mtu-retransmission-wor...

Thanks, -Rick


Henning Westerholt

17 May

5:39 p.m.

(sr-dev on CC)

Hello,

I am not aware of any special handling of MTU discovery for IPv6 UDP traffic in the Kamailio core. But of course, we have a lot of code.

You find the implementation of the MTU handling in the src/core/udp_server.c file. It's just setting the appropriate socket option right now.

Cheers,

Henning

--
Henning Westerholt - https://skalatan.de/blog/
Kamailio services - https://gilawa.com

Richard Fuchs

18 May

5:38 a.m.

On 12/05/2022 07.31, [EXT] Rick van Rein wrote:

> IPv6 routers never fragment packets. Rather, they drop a packet that is too large for a (local) MTU and send back ICMPv6 "Packet Too Big". This seems to cause loss of larger SIP messages when an ISP tunnels their IPv6 at the expense of the MTU.
>
> The pmtu_discovery flag sets Don't Fragment in IPv4 traffic; in IPv6 this is an implied property. Does Kamailio learn a lower MTU from any "Packet Too Big" for IPv6 even if pmtu_discovery is not set? Future resends can then be fragmented appropriately.
>
> The udp_mtu setting diverts to another protocol, but that would be a setting as low as the worst peer, impacting all. It would be a weird struggle with a telco serving many. PMTU would be better to rely on, but how does it work in Kamailio?

I haven't looked at the Kamailio code either, but in general this is handled by the network stack directly (e.g. the Linux kernel), transparent to the application (Kamailio).

1. The application wants to send a packet and uses the appropriate API (e.g. the kernel's send() system call).
2. The kernel takes care to actually send the packet out to its destination.
3. The packet then hits an MTU barrier along its path. The packet is discarded by the remote router and the router sends back an ICMPv6 packet to the originator.
4. The kernel receives this ICMPv6 packet and from this learns that the path MTU to that destination is lower. The application generally is not notified about this. An automatic retransmission also doesn't happen.
5. The application wants to send another packet to the same destination (e.g. in Kamailio's case probably a retransmission of the first one, as that packet was never acknowledged).
6. The application does exactly the same thing as in step 1.
7. The kernel now knows about the smaller PMTU to that packet's destination and will therefore fragment the packet appropriately before sending the fragments out.

Cheers

Rick van Rein

19 May

10:20 a.m.

Hello Henning and Richard,

Henning Westerholt helped me focus in the code:

> You find the implementation of the MTU handling in the src/core/udp_server.c file. It's just setting the appropriate socket option right now.

I think I found a few bugs, centering around https://github.com/kamailio/kamailio/blob/master/src/core/udp_server.c#L331-...

The file clearly shows how the option is processed,

(pmtu_discovery) ? IP_PMTUDISC_DO : IP_PMTUDISC_DONT

This is IPv4-only, and it looks like a bug that no check on the family is done before this is set. Note that Linux defines

/usr/include/linux/in6.h: #define IPV6_MTU_DISCOVER 23
/usr/include/linux/in.h:  #define IP_MTU_DISCOVER 10
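As a minimal sketch of the family check described above (not Kamailio's actual code), the option level, name and value could be selected from the socket's address family before calling setsockopt(). The helper names pmtu_sockopt_for_family() and set_pmtu_discovery() are hypothetical; the constants are the ones Linux exposes through <netinet/in.h>.

```c
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: choose the PMTU-discovery socket option that
 * matches the address family. Returns 0 on success, -1 if the family
 * is neither AF_INET nor AF_INET6. */
int pmtu_sockopt_for_family(int family, int enable,
                            int *level, int *optname, int *value)
{
    switch (family) {
    case AF_INET:
        *level = IPPROTO_IP;
        *optname = IP_MTU_DISCOVER;
        *value = enable ? IP_PMTUDISC_DO : IP_PMTUDISC_DONT;
        return 0;
    case AF_INET6:
        *level = IPPROTO_IPV6;
        *optname = IPV6_MTU_DISCOVER;
        *value = enable ? IPV6_PMTUDISC_DO : IPV6_PMTUDISC_DONT;
        return 0;
    default:
        return -1;
    }
}

/* Hypothetical wrapper: apply the option to an already created socket. */
int set_pmtu_discovery(int fd, int family, int enable)
{
    int level, optname, value;

    if (pmtu_sockopt_for_family(family, enable, &level, &optname, &value) < 0)
        return -1;
    return setsockopt(fd, level, optname, &value, sizeof(value));
}
```

Keeping the selection separate from the setsockopt() call makes the family handling easy to check without opening a socket.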

In general, Path MTU discovery only applies to connected sockets, which is not what happens in udp_server.c -- the IPv4 version sets the DF flag, which made me wonder if that actually gets handled at all. The IP_RECVERR flag described in ip(7) is used and is intended for such connectionless MTU handling. For IPv6, there is an IPV6_RECVERR,

/usr/include/linux/in6.h: #define IPV6_RECVERR 25
/usr/include/linux/in.h:  #define IP_RECVERR 11

The IPV6 variant is absent, which would be another bug. (FYI, I use an IPv6-only setup, probably why this turns up.)

This being the mechanism to handle MTU discovery for unconnected sockets, I read ip(7) and it mentions a flag MSG_ERRQUEUE to be used with recvmsg(). I could not find this flag in Kamailio, so I suspect that this treatment was not completed after adding the IP_RECVERR flag.

An approach that would always be safe AFAIK is to change a socket with this kind of error to a connected socket, and set the lower MTU on that. And then, continue sending. Connecting over UDP is kind-of free, and avoids relying on another protocol in the peer. The expense would be grabbing an extra socket, which is why it may be better to await Path MTU failure.
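A small sketch of the connected-socket idea, assuming Linux: connect a UDP socket to the peer and ask the kernel which path MTU it currently knows for that destination through the IPV6_MTU socket option. According to ipv6(7), getsockopt() with IPV6_MTU is valid only on a connected socket, and the setsockopt() form sets the MTU to use on that socket. The function name connected_udp6_path_mtu() is hypothetical.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: connect a UDP socket to addr:port and return the
 * path MTU the kernel currently knows for it, or -1 on any failure. */
int connected_udp6_path_mtu(const char *addr, unsigned short port)
{
    struct sockaddr_in6 peer;
    int mtu = -1;
    socklen_t len = sizeof(mtu);
    int fd;

    memset(&peer, 0, sizeof(peer));
    peer.sin6_family = AF_INET6;
    peer.sin6_port = htons(port);
    if (inet_pton(AF_INET6, addr, &peer.sin6_addr) != 1)
        return -1;

    fd = socket(AF_INET6, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* connect() on UDP sends nothing; it only fixes the destination. */
    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0
            || getsockopt(fd, IPPROTO_IPV6, IPV6_MTU, &mtu, &len) < 0)
        mtu = -1;

    close(fd);
    return mtu;
}
```

A call such as connected_udp6_path_mtu("2001:db8::1", 5060) would report the cached value for that (documentation-range) address; before any "Packet Too Big" has arrived this is typically just the MTU of the outgoing route.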

Richard Fuchs explained in detail what happens:

> 5. The application wants to send another packet to the same destination (e.g. in Kamailio's case probably a retransmission of the first one, as that packet was never acknowledged).
> 6. The application does exactly the same thing as in step 1.
> 7. The kernel now knows about the smaller PMTU to that packet's destination and will therefore fragment the packet appropriately before sending the fragments out.

These last steps, however, only apply to a _connected_ UDP socket. I searched for that in the given file, but did not find it.

I suppose there are also problems in Linux' double-action of MTU as implied MRU -- it means that you cannot be conservative in what you send and liberal in what you accept -- that would have been a useful OS-level strategy. In lieu of that, I suppose it is an application problem :'-(

This in general feels like it is outside my reach. I can understand it, but cannot fix it. Have I hereby submitted a bug, or is an issue on GitHub the proper path?

Thanks,

Rick van Rein

Henning Westerholt

20 May

8:01 a.m.

Hello Rick,

thanks for looking into it.

You already opened an issue about that, which is a good idea to keep track of it.

Cheers,

Henning

--
Henning Westerholt - https://skalatan.de/blog/
Kamailio services - https://gilawa.com


sr-users@lists.kamailio.org
