Module: sip-router
Branch: andrei/tcp_tls_changes
Commit: d89437a3d7bc25a9c098a04c6ee69fc3848ff0b5
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=d89437a…
Author: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Committer: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Date: Wed Jul 7 11:59:30 2010 +0200
tcp: fix dispatching closed connections to tcp readers
Under very heavy load it is possible that send2child() might try
to send an already closed connection/fd to a tcp reader.
This can happen only if the tcp connection is watched for read
(POLLIN) by tcp_main (and not by a tcp reader), the connection
becomes available for reading (either new data received, EOF or
RST) and tcp_main chooses a specific tcp reader to send the
connection to while in the same time the same tcp reader tries to
send on the same connection, fails for some reason (no more space
for buffering, EOF, RST a.s.o) and sends a close command back to
tcp_main. Because send2child() executes first any pending commands
from the choosen tcp_reader, this might lead to closing the fd
before attempting to send it to the tcp_reader.
Under normal circumstances the impact is only an extra syscall and
some log messages, however it is possible (but highly unlikely)
that after sending the close command the tcp_reader opens a new
connection for sending and sends its fd back to tcp_main. This new
fd might get the same number as the freshly closed fd and
send2child might send the wrong (fd, tcp connection) pair.
---
tcp_main.c | 20 ++++++++++++++------
1 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/tcp_main.c b/tcp_main.c
index 844b3a8..aa4fc52 100644
--- a/tcp_main.c
+++ b/tcp_main.c
@@ -2891,6 +2891,7 @@ close_again:
su2a(&tcpconn->rcv.src_su, sizeof(tcpconn->rcv.src_su)),
fd, tcpconn->flags, strerror(errno), errno);
}
+ tcpconn->s=-1;
}
@@ -3836,10 +3837,20 @@ inline static int send2child(struct tcp_connection* tcpconn)
* send a release command, but the master fills its socket buffer
* with new connection commands => deadlock) */
/* answer tcp_send requests first */
- while(handle_ser_child(&pt[tcp_children[idx].proc_no], -1)>0);
+ while(unlikely((tcpconn->state != S_CONN_BAD) &&
+ (handle_ser_child(&pt[tcp_children[idx].proc_no], -1)>0)));
/* process tcp readers requests */
- while(handle_tcp_child(&tcp_children[idx], -1)>0);
-
+ while(unlikely((tcpconn->state != S_CONN_BAD &&
+ (handle_tcp_child(&tcp_children[idx], -1)>0))));
+
+ /* the above possible pending requests might have included a
+ command to close this tcpconn (e.g. CONN_ERROR, CONN_EOF).
+ In this case the fd is already closed here (and possible
+ even replaced by another one with the same number) so it
+ must not be sent to a reader anymore */
+ if (unlikely(tcpconn->state == S_CONN_BAD ||
+ (tcpconn->flags & F_CONN_FD_CLOSED)))
+ return -1;
#ifdef SEND_FD_QUEUE
/* if queue full, try to queue the io */
if (unlikely(send_fd(tcp_children[idx].unix_sock, &tcpconn,
@@ -3961,8 +3972,6 @@ static inline int handle_new_connect(struct socket_info* si)
DBG("handle_new_connect: new connection from %s: %p %d flags: %04x\n",
su2a(&su, sizeof(su)), tcpconn, tcpconn->s, tcpconn->flags);
if(unlikely(send2child(tcpconn)<0)){
- LOG(L_ERR,"ERROR: handle_new_connect: no children "
- "available\n");
tcpconn->flags&=~F_CONN_READER;
tcpconn_put(tcpconn);
tcpconn_try_unhash(tcpconn);
@@ -4142,7 +4151,6 @@ send_to_child:
tcpconn->flags&=~(F_CONN_MAIN_TIMER|F_CONN_READ_W|F_CONN_WANTS_RD);
tcpconn_ref(tcpconn); /* refcnt ++ */
if (unlikely(send2child(tcpconn)<0)){
- LOG(L_ERR,"ERROR: handle_tcpconn_ev: no children available\n");
tcpconn->flags&=~F_CONN_READER;
#ifdef TCP_ASYNC
if (tcpconn->flags & F_CONN_WRITE_W){
Revision: 6015
http://openser.svn.sourceforge.net/openser/?rev=6015&view=rev
Author: mariuszbihlei
Date: 2010-07-07 09:31:13 +0000 (Wed, 07 Jul 2010)
Log Message:
-----------
Fixed a memory leak on pkg memory. dtrie_destroy and dtrie_clear could leak 10 * sizeof(void*) for each node in the trie.
Mainly affected carrierroute and userblacklist modules
Modified Paths:
--------------
branches/1.5/trie/dtrie.c
This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
Module: sip-router
Branch: andrei/tcp_tls_changes
Commit: a08385f531dd0bf1044c8495e6e47c9aabfebd8a
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=a08385f…
Author: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Committer: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Date: Mon Jul 5 15:23:02 2010 +0200
tls: support for partial encoding and reseting send_flags
Support for the extended version of the tls_encode() callback.
On partial tls write and no more space in the internal static
buffers for more records, return what records were encoded so far
and ask to be called again after they are sent.
When exiting after a partial write or after postponing the encode
until more data has been read (WANT_READ), make sure that any
request to close the connection after the first successful tcp
level write will be postponed (reset SND_F_CON_CLOSE).
---
modules/tls/tls_server.c | 56 ++++++++++++++++++++++++++++++++++++++-------
modules/tls/tls_server.h | 4 ++-
2 files changed, 50 insertions(+), 10 deletions(-)
diff --git a/modules/tls/tls_server.c b/modules/tls/tls_server.c
index 600653c..f6d822c 100644
--- a/modules/tls/tls_server.c
+++ b/modules/tls/tls_server.c
@@ -607,16 +607,31 @@ typedef int (*tcp_low_level_send_t)(int fd, struct tcp_connection *c,
* It is a callback that will be called by the tcp code, before a send
* on TLS would be attempted. It should replace the input buffer with a
* new static buffer containing the TLS processed data.
+ * If the input buffer could not be fully encoded (e.g. run out of space
+ * in the internal static buffer), it should set rest_buf and rest_len to
+ * the remaining part, so that it could be called again once the output has
+ * been used (sent). The send_flags used are also passed and they can be
+ * changed (e.g. to disallow a close() after a partial encode).
* WARNING: it must always be called with c->write_lock held!
* @param c - tcp connection
* @param pbuf - pointer to buffer (value/result, on success it will be
* replaced with a static buffer).
* @param plen - pointer to buffer size (value/result, on success it will be
* replaced with the size of the replacement buffer.
+ * @param rest_buf - (result) should be filled with a pointer to the
+ * remaining unencoded part of the original buffer if any,
+ * 0 otherwise.
+ * @param rest_len - (result) should be filled with the length of the
+ * remaining unencoded part of the original buffer (0 if
+ * the original buffer was fully encoded).
+ * @param send_flags - pointer to the send_flags that will be used for sending
+ * the message.
* @return *plen on success (>=0), < 0 on error.
*/
int tls_encode_f(struct tcp_connection *c,
- const char* *pbuf, unsigned int* plen)
+ const char** pbuf, unsigned int* plen,
+ const char** rest_buf, unsigned int* rest_len,
+ snd_flags_t* send_flags)
{
int n, offs;
SSL* ssl;
@@ -631,8 +646,11 @@ int tls_encode_f(struct tcp_connection *c,
buf = *pbuf;
len = *plen;
- TLS_WR_TRACE("(%p, %p, %d) start (%s:%d* -> %s)\n",
- c, buf, len, ip_addr2a(&c->rcv.dst_ip), c->rcv.dst_port,
+ *rest_buf = 0;
+ *rest_len = 0;
+ TLS_WR_TRACE("(%p, %p, %d, ... 0x%0x) start (%s:%d* -> %s)\n",
+ c, buf, len, send_flags->f,
+ ip_addr2a(&c->rcv.dst_ip), c->rcv.dst_port,
su2a(&c->rcv.src_su, sizeof(c->rcv.src_su)));
n = 0;
offs = 0;
@@ -656,6 +674,10 @@ int tls_encode_f(struct tcp_connection *c,
c, tls_c->ct_wq?tls_c->ct_wq->queued:0);
goto error_wq_full;
}
+ /* buffer queued for a future send attempt, after first reading
+ some data (key exchange) => don't allow immediate closing of
+ the connection */
+ send_flags->f &= ~SND_F_CON_CLOSE;
goto end;
}
if (unlikely(tls_set_mbufs(c, &rd, &wr) < 0)) {
@@ -718,12 +740,28 @@ redo_wr:
goto error_wq_full;
}
tls_c->flags |= F_TLS_CON_WR_WANTS_RD;
+ /* buffer queued for a future send attempt, after first
+ reading some data (key exchange) => don't allow immediate
+ closing of the connection */
+ send_flags->f &= ~SND_F_CON_CLOSE;
break; /* or goto end */
case SSL_ERROR_WANT_WRITE:
- /* error, no record fits in the buffer */
- BUG("write buffer too small (%d/%d bytes)\n",
- wr.used, wr.size);
- goto bug;
+ if (unlikely(offs == 0)) {
+ /* error, no record fits in the buffer or
+ no partial write enabled and buffer to small to fit
+ all the records */
+ BUG("write buffer too small (%d/%d bytes)\n",
+ wr.used, wr.size);
+ goto bug;
+ } else {
+ /* offs != 0 => something was "written" */
+ *rest_buf = buf + offs;
+ *rest_len = len - offs;
+ /* this function should be called again => disallow
+ immediate closing of the connection */
+ send_flags->f &= ~SND_F_CON_CLOSE;
+ }
+ break; /* or goto end */
#if OPENSSL_VERSION_NUMBER >= 0x00907000L /*0.9.7*/
case SSL_ERROR_WANT_CONNECT:
/* only if the underlying BIO is not yet connected
@@ -772,8 +810,8 @@ redo_wr:
end:
*pbuf = (const char*)wr.buf;
*plen = wr.used;
- TLS_WR_TRACE("(%p) end (offs %d) => %d \n",
- c, offs, *plen);
+ TLS_WR_TRACE("(%p) end (offs %d, rest_buf=%p rest_len=%d 0x%0x) => %d \n",
+ c, offs, *rest_buf, *rest_len, send_flags->f, *plen);
return *plen;
error:
/*error_send:*/
diff --git a/modules/tls/tls_server.h b/modules/tls/tls_server.h
index ba28067..62047d5 100644
--- a/modules/tls/tls_server.h
+++ b/modules/tls/tls_server.h
@@ -84,7 +84,9 @@ void tls_h_tcpconn_clean(struct tcp_connection *c);
void tls_h_close(struct tcp_connection *c, int fd);
int tls_encode_f(struct tcp_connection *c,
- const char ** pbuf, unsigned int* plen);
+ const char ** pbuf, unsigned int* plen,
+ const char** rest_buf, unsigned int* rest_len,
+ snd_flags_t* send_flags) ;
int tls_read_f(struct tcp_connection *c, int* flags);
Module: sip-router
Branch: andrei/tcp_tls_changes
Commit: e316312628e61cbded9b991d21385b4e8180b538
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=e316312…
Author: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Committer: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Date: Mon Jul 5 14:44:14 2010 +0200
tcp: support for tls partial encoding
The tls_encode() callback supports now partial encoding. It can
encode only part of the input buffer (e.g. tls partial write
enabled and not enough space in the internal buffers for all the
tls records) and ask to be called back later with the remaining
un-encoded part.
The send flags are also passed to it, so that it can modify them
if needed (e.g. reset SND_F_CON_CLOSE on partial tls write or when
reading some data is needed before continuing the tls write).
---
tcp_main.c | 230 +++++++++++++++++++++++++++++++++++++++++++++++------------
tls_hooks.h | 19 ++++--
2 files changed, 199 insertions(+), 50 deletions(-)
Diff: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commitdiff;h=e31…
Hello list,
under special circumstances I have a problem with the DIALOG module of SIP-Router/Kamailio version 3.0.2. The dialog module is used in combination with presence, presence_xml, pua, pua_usrloc and pua_dialoginfo modules + db_mysql modules. The db_mode is set to "write_through", because of redundancy requirements.
>From my point of view this problem has something to do with timing in general. E.g. a so called "data call" has a duration of about 50 ms only; INVITE....BYE). That problem occurs on this short calls only. Because of the problems in the dialog state machine, NOTIFY messages are incorrect. Even when the call is already finished, the NOTIFY message includes the state "confirmed" and causes a wrong status indication....
The detailed error message can be seen here:
"Jun 9 16:00:39 debian /usr/sbin/kamailio[14992]: CRITICAL: dialog [dlg_hash.c:591]: bogus event 6 in state 2 for dlg [dialog-ID] with clid [Call-ID] and tags '1299370188-28358304-1276092068837' ''"
I found in old mails of the developer list another error that looked nearly the same, but it differed in the event and state of the dialog state machine (the thread can be found under the link http://www.mail-archive.com/devel@lists.kamailio.org/msg03234.html). That bug should - according the information that I've found in the list - be solved. Therefore I will ask you: is the error as displayed above another well known error / bug, which (maybe) should already be solved? Is it a new bug?
Thanks for any information.
regards,
Klaus Feichtinger
--
GMX DSL: Internet-, Telefon- und Handy-Flat ab 19,99 EUR/mtl.
Bis zu 150 EUR Startguthaben inklusive! http://portal.gmx.net/de/go/dsl