Description

I have an edge fleet of Kamailio hosts that hold the connections to SIP clients. Behind this edge fleet, authoritative proxies (APs) manage call routing and branching.

I have noticed an issue with ACK processing in this scenario:

  1. Branch a call to multiple devices that are connected to the same edge host.
  2. The AP forwards the INVITEs to a single Kamailio edge host, which then forwards them to the devices.
  3. The device on branch index 0 replies with a 4xx response.
  4. The AP automatically ACKs this response back to the edge host. The edge host receives the ACK for branch index 0.
  5. Another device replies with 200 OK. This reaches the caller device without problems.
  6. The caller sends the ACK for the 200 OK. At the AP, Kamailio builds the Via header for this ACK with a hardcoded branch index of 0.
  7. The edge host receives the ACK with branch index 0, sees that it already received an ACK with branch index 0 (the ACK from step 4), and does not forward the 200 OK ACK to the device that is trying to accept the call.

Summary:

Kamailio builds Via headers whose branch value is suffixed with the branch index. Note the .0 at the end of the branch parameter in the Via header below; this Via was generated for a message with branch index 0.

SIP/2.0/TLS 172.xxx.xxx.xxx:5061;received=34.xxx.xxx.xxx;branch=z9hG4bK3e52.8b4427a3530f17e60bf152a73caf0b38.0;i=9c52a

When doing ACK processing, Kamailio forwards ACKs with a hardcoded branch index of 0, which is appended as the suffix of the Via branch. If the branch with index 0 has already replied, it may already have been ACKed, and the next ACK is then not forwarded by the edge host.

https://github.com/kamailio/kamailio/blob/master/src/core/forward.c#L527
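The collision can be illustrated with a small model. This is only a sketch under stated assumptions: `build_branch` and `ack_branches_collide` are hypothetical helpers, not Kamailio functions, the hash index is printed as plain hex for readability, and both ACKs are assumed to end up with the same hash index and MD5 part, as observed in the scenario above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified model of the Via branch layout:
 * magic cookie, hash index, MD5 part, then the branch index as suffix.
 * Returns the number of characters written, or -1 if the buffer is too small. */
static int build_branch(char *out, size_t out_len, unsigned hash_index,
                        const char *md5, unsigned branch_index)
{
    assert(out != NULL && md5 != NULL);
    int n = snprintf(out, out_len, "z9hG4bK%x.%s.%x",
                     hash_index, md5, branch_index);
    return (n > 0 && (size_t)n < out_len) ? n : -1;
}

/* Returns 1 when the ACK for a failed branch and the forwarded ACK for
 * the 200 OK would carry the same Via branch value, 0 otherwise. */
static int ack_branches_collide(unsigned hash_index, const char *md5,
                                unsigned failed_branch_index)
{
    char ack_4xx[128];
    char ack_200[128];

    /* The ACK for the 4xx reuses the branch of the INVITE it answers. */
    if (build_branch(ack_4xx, sizeof ack_4xx, hash_index, md5,
                     failed_branch_index) < 0)
        return 0;
    /* The forwarded ACK for the 200 OK always gets branch index 0. */
    if (build_branch(ack_200, sizeof ack_200, hash_index, md5, 0) < 0)
        return 0;
    return strcmp(ack_4xx, ack_200) == 0;
}
```

In this model, a 4xx on branch index 0 yields the same branch string as the later 200 OK ACK, while a 4xx on any other branch index does not.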

Possible Solutions

The simplest solution I can see is to change how the Via header is constructed for INVITE requests. To avoid collisions with the 200 OK ACK, change the logic so that the value written to the branch index part of the Via branch is 1 greater than the actual branch index.

As far as I can tell, this requires changes in t_msgbuilder.c and t_lookup.c. The first change is to pass b + 1 instead of b when constructing the Via header value.

int t_calc_branch(struct cell *t, 
	int b, char *branch, int *branch_len)
{
	return branch_builder( t->hash_index,
			0, t->md5,
			b+1, branch, branch_len );
}

The second change is to update the reply matching logic, which parses the branch index, so that it decrements the parsed value.

	/* sanity check */
	if (unlikely(reverse_hex2int(hashi, hashl, &hash_index)<0
		||hash_index>=TABLE_ENTRIES
		|| reverse_hex2int(branchi, branchl, &branch_id)<0
		|| branch_id>=sr_dst_max_branches   /* proposed: change >= to > (strictly greater) */
		|| loopl!=MD5_LEN)
	) {
		DBG("DEBUG: t_reply_matching: poor reply labels %d label %d "
			"branch %d\n", hash_index, entry_label, branch_id );
		goto nomatch2;
	}

/*  add the decrement here */
branch_id--;
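A minimal sketch of the proposed offset scheme, to check that the two changes stay consistent with each other. The names `encode_branch_index`, `decode_branch_index` and `MAX_BRANCHES` are hypothetical; `MAX_BRANCHES` stands in for `sr_dst_max_branches`. One point this makes visible: with the offset in place, a received suffix of 0 no longer maps to a valid branch, so the sanity check should probably reject it before the decrement.

```c
#include <assert.h>

/* Hypothetical stand-in for Kamailio's sr_dst_max_branches. */
#define MAX_BRANCHES 12u

/* Value written into the Via branch suffix: the real index shifted up
 * by one, so an INVITE branch never produces the suffix 0. */
static unsigned encode_branch_index(unsigned real_index)
{
    assert(real_index < MAX_BRANCHES);
    return real_index + 1;
}

/* Recovers the real branch index from a parsed suffix.
 * Valid suffixes are 1..MAX_BRANCHES; anything else returns -1. */
static int decode_branch_index(unsigned wire_value)
{
    if (wire_value == 0 || wire_value > MAX_BRANCHES)
        return -1; /* 0 is reserved for the forwarded ACK */
    return (int)(wire_value - 1);
}
```

Under these assumptions, real indices 0 through `MAX_BRANCHES - 1` round-trip unchanged, and the suffix 0 used by the forwarded ACK can no longer be mistaken for an INVITE branch.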

