Module: sip-router
Branch: sr_3.0
Commit: 8fafff8a11a3d643de47f8d1f389d0c5e6316beb
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=8fafff8…
Author: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Committer: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Date: Sun Jun 20 19:40:55 2010 +0200
tcp: force eof after read if write side hangup
Even if POLLRDHUP is not supported, but we detected a write side close
(POLLHUP) or an error (POLLERR) or such an event was previously detected by
tcp_main (F_CONN_EOF_SEEN), force connection closing after reading all the data
in the socket buffer. In this case we can close() after the first short read
and we save an extra system call (a read() that returns 0).
(cherry picked from commit 28e313250503d6f8d06ebab15c8421c40e7f0fe4)
---
tcp_read.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tcp_read.c b/tcp_read.c
index d201765..9015a8b 100644
--- a/tcp_read.c
+++ b/tcp_read.c
@@ -970,13 +970,13 @@ again:
con, con->id, atomic_get(&con->refcnt));
goto read_error;
}
+ read_flags=((
#ifdef POLLRDHUP
- read_flags=(((events & POLLRDHUP) |
+ (events & POLLRDHUP) |
+#endif /* POLLRDHUP */
+ (events & (POLLHUP|POLLERR)) |
(con->flags & (F_CONN_EOF_SEEN|F_CONN_FORCE_EOF)))
&& !(events & POLLPRI))? RD_CONN_FORCE_EOF: 0;
-#else /* POLLRDHUP */
- read_flags=0;
-#endif /* POLLRDHUP */
resp=tcp_read_req(con, &ret, &read_flags);
if (unlikely(resp<0)){
read_error:
Module: sip-router
Branch: sr_3.0
Commit: a72fd8da7c0b40c37a322f7d69b7d76197ceb9c7
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=a72fd8d…
Author: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Committer: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Date: Wed Jul 7 11:59:30 2010 +0200
tcp: fix dispatching closed connections to tcp readers
Under very heavy load it is possible that send2child() might try
to send an already closed connection/fd to a tcp reader.
This can happen only if the tcp connection is watched for read
(POLLIN) by tcp_main (and not by a tcp reader), the connection
becomes available for reading (either new data received, EOF or
RST) and tcp_main chooses a specific tcp reader to send the
connection to while in the same time the same tcp reader tries to
send on the same connection, fails for some reason (no more space
for buffering, EOF, RST a.s.o) and sends a close command back to
tcp_main. Because send2child() executes first any pending commands
from the choosen tcp_reader, this might lead to closing the fd
before attempting to send it to the tcp_reader.
Under normal circumstances the impact is only an extra syscall and
some log messages, however it is possible (but highly unlikely)
that after sending the close command the tcp_reader opens a new
connection for sending and sends its fd back to tcp_main. This new
fd might get the same number as the freshly closed fd and
send2child might send the wrong (fd, tcp connection) pair.
(cherry picked from commit d89437a3d7bc25a9c098a04c6ee69fc3848ff0b5)
---
tcp_main.c | 20 ++++++++++++++------
1 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/tcp_main.c b/tcp_main.c
index 6a6c4c0..54334ab 100644
--- a/tcp_main.c
+++ b/tcp_main.c
@@ -2468,6 +2468,7 @@ close_again:
LOG(L_ERR, "ERROR: tcpconn_put_destroy; close() failed: %s (%d)\n",
strerror(errno), errno);
}
+ tcpconn->s=-1;
}
@@ -3378,10 +3379,20 @@ inline static int send2child(struct tcp_connection* tcpconn)
* send a release command, but the master fills its socket buffer
* with new connection commands => deadlock) */
/* answer tcp_send requests first */
- while(handle_ser_child(&pt[tcp_children[idx].proc_no], -1)>0);
+ while(unlikely((tcpconn->state != S_CONN_BAD) &&
+ (handle_ser_child(&pt[tcp_children[idx].proc_no], -1)>0)));
/* process tcp readers requests */
- while(handle_tcp_child(&tcp_children[idx], -1)>0);
-
+ while(unlikely((tcpconn->state != S_CONN_BAD &&
+ (handle_tcp_child(&tcp_children[idx], -1)>0))));
+
+ /* the above possible pending requests might have included a
+ command to close this tcpconn (e.g. CONN_ERROR, CONN_EOF).
+ In this case the fd is already closed here (and possible
+ even replaced by another one with the same number) so it
+ must not be sent to a reader anymore */
+ if (unlikely(tcpconn->state == S_CONN_BAD ||
+ (tcpconn->flags & F_CONN_FD_CLOSED)))
+ return -1;
#ifdef SEND_FD_QUEUE
/* if queue full, try to queue the io */
if (unlikely(send_fd(tcp_children[idx].unix_sock, &tcpconn,
@@ -3501,8 +3512,6 @@ static inline int handle_new_connect(struct socket_info* si)
DBG("handle_new_connect: new connection from %s: %p %d flags: %04x\n",
su2a(&su, sizeof(su)), tcpconn, tcpconn->s, tcpconn->flags);
if(unlikely(send2child(tcpconn)<0)){
- LOG(L_ERR,"ERROR: handle_new_connect: no children "
- "available\n");
tcpconn->flags&=~F_CONN_READER;
tcpconn_put(tcpconn);
tcpconn_try_unhash(tcpconn);
@@ -3676,7 +3685,6 @@ send_to_child:
tcpconn->flags&=~(F_CONN_MAIN_TIMER|F_CONN_READ_W|F_CONN_WANTS_RD);
tcpconn_ref(tcpconn); /* refcnt ++ */
if (unlikely(send2child(tcpconn)<0)){
- LOG(L_ERR,"ERROR: handle_tcpconn_ev: no children available\n");
tcpconn->flags&=~F_CONN_READER;
#ifdef TCP_ASYNC
if (tcpconn->flags & F_CONN_WRITE_W){
Module: sip-router
Branch: sr_3.0
Commit: 30fabfb9c192e01fdf8eb0377513aa9e31f24e1b
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=30fabfb…
Author: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Committer: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Date: Wed Jun 16 21:03:06 2010 +0200
tcp: fix fd passing bug
If connections are opened and closed very quickly when data is
sent on them, it is possible that a connection gets closed
(close() inside tcp_main) while a process waits for its fd, just
before tcp_main attempts to send the fd. In this case the fd
sending will fail (one cannot send a closed fd) and the process
that asked for it will remain waiting forever.
The fix always checks before attempting to send the fd if the fd is
still open and the connection is not in a "bad" state. If not,
a new error response is sent (no fd and connection == NULL).
(backported from 9da6fae72b9883ab8dbbb4e681c4d4e96d6549e4)
---
tcp_main.c | 51 +++++++++++++++++++++++++++++++++++++++++++++------
1 files changed, 45 insertions(+), 6 deletions(-)
diff --git a/tcp_main.c b/tcp_main.c
index 4840ddc..6a6c4c0 100644
--- a/tcp_main.c
+++ b/tcp_main.c
@@ -2050,14 +2050,25 @@ int tcp_send(struct dest_info* dst, union sockaddr_union* from,
do_close_fd=0;
goto release_c;
}
- if (unlikely(c!=tmp)){
- LOG(L_CRIT, "BUG: tcp_send: get_fd: got different connection:"
+ /* handle fd closed or bad connection/error
+ (it's possible that this happened in the time between
+ we found the intial connection and the time when we get
+ the fd)
+ */
+ if (unlikely(c!=tmp || fd==-1 || c->state==S_CONN_BAD)){
+ if (unlikely(c!=tmp && tmp!=0))
+ BUG("tcp_send: get_fd: got different connection:"
" %p (id= %d, refcnt=%d state=%d) != "
" %p (n=%d)\n",
c, c->id, atomic_get(&c->refcnt), c->state,
tmp, n
- );
+ );
n=-1; /* fail */
+ /* don't cache fd & close it */
+ do_close_fd = (fd==-1)?0:1;
+#ifdef TCP_FD_CACHE
+ use_fd_cache = 0;
+#endif /* TCP_FD_CACHE */
goto end;
}
DBG("tcp_send: after receive_fd: c= %p n=%d fd=%d\n",c, n, fd);
@@ -2721,6 +2732,12 @@ inline static void send_fd_queue_run(struct tcp_send_fd_q* q)
struct send_fd_info* t;
for (p=t=&q->data[0]; p<q->crt; p++){
+ if (unlikely(p->tcp_conn->state == S_CONN_BAD ||
+ p->tcp_conn->flags & F_CONN_FD_CLOSED ||
+ p->tcp_conn->s ==-1)) {
+ /* bad and/or already closed connection => remove */
+ goto rm_con;
+ }
if (unlikely(send_fd(p->unix_sock, &(p->tcp_conn),
sizeof(struct tcp_connection*), p->tcp_conn->s)<=0)){
if ( ((errno==EAGAIN)||(errno==EWOULDBLOCK)) &&
@@ -2736,7 +2753,11 @@ inline static void send_fd_queue_run(struct tcp_send_fd_q* q)
p->unix_sock, (long)(p-&q->data[0]), p->retries,
p->tcp_conn, p->tcp_conn->s, errno,
strerror(errno));
+rm_con:
#ifdef TCP_ASYNC
+ /* if a connection is on the send_fd queue it means it's
+ not watched for read anymore => could be watched only for
+ write */
if (p->tcp_conn->flags & F_CONN_WRITE_W){
io_watch_del(&io_h, p->tcp_conn->s, -1, IO_FD_CLOSING);
p->tcp_conn->flags &=~F_CONN_WRITE_W;
@@ -3006,6 +3027,7 @@ error:
inline static int handle_ser_child(struct process_table* p, int fd_i)
{
struct tcp_connection* tcpconn;
+ struct tcp_connection* tmp;
long response[2];
int cmd;
int bytes;
@@ -3101,9 +3123,26 @@ inline static int handle_ser_child(struct process_table* p, int fd_i)
/* send the requested FD */
/* WARNING: take care of setting refcnt properly to
* avoid race conditions */
- if (unlikely(send_fd(p->unix_sock, &tcpconn, sizeof(tcpconn),
- tcpconn->s)<=0)){
- LOG(L_ERR, "ERROR: handle_ser_child: send_fd failed\n");
+ if (unlikely(tcpconn->state == S_CONN_BAD ||
+ (tcpconn->flags & F_CONN_FD_CLOSED) ||
+ tcpconn->s ==-1)) {
+ /* connection is already marked as bad and/or has no
+ fd => don't try to send the fd (trying to send a
+ closed fd _will_ fail) */
+ tmp = 0;
+ if (unlikely(send_all(p->unix_sock, &tmp, sizeof(tmp)) <= 0))
+ BUG("handle_ser_child: CONN_GET_FD: send_all failed\n");
+ /* no need to attempt to destroy the connection, it should
+ be already in the process of being destroyed */
+ } else if (unlikely(send_fd(p->unix_sock, &tcpconn,
+ sizeof(tcpconn), tcpconn->s)<=0)){
+ LOG(L_ERR, "handle_ser_child: CONN_GET_FD:"
+ " send_fd failed\n");
+ /* try sending error (better then not sending anything) */
+ tmp = 0;
+ if (unlikely(send_all(p->unix_sock, &tmp, sizeof(tmp)) <= 0))
+ BUG("handle_ser_child: CONN_GET_FD:"
+ " send_fd send_all fallback failed\n");
}
break;
case CONN_NEW:
Module: sip-router
Branch: sr_3.0
Commit: 71d2f90e262a139f6beed666d2b9ac1f127274b4
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=71d2f90…
Author: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Committer: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Date: Sat Jun 19 00:16:38 2010 +0200
io_wait: kqueue: use a bigger array
Use a bigger array for kevent(). Instead of the fd number, use
2* max_fd_no (2 because read and write events are not merged) and
with extra space for possible changelist errors.
(minor optimization)
(cherry picked from commit a9cdfc2938ca73d6ba40f5896c6a8930c2e73f85)
---
io_wait.c | 15 ++++++++++++---
io_wait.h | 9 ++++++---
2 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/io_wait.c b/io_wait.c
index 5adb8d2..1b29813 100644
--- a/io_wait.c
+++ b/io_wait.c
@@ -542,13 +542,21 @@ int init_io_wait(io_wait_h* h, int max_fd, enum poll_types poll_method)
#endif
#ifdef HAVE_KQUEUE
case POLL_KQUEUE:
- h->kq_array=local_malloc(sizeof(*(h->kq_array))*h->max_fd_no);
+ h->kq_changes_size=KQ_CHANGES_ARRAY_SIZE;
+ /* kevent returns different events for read & write
+ => to get all the possible events in one call we
+ need twice the number of added fds + space
+ for possible changelist errors.
+ OTOH if memory is to be saved at all costs, one can
+ decrease the array size.
+ */
+ h->kq_array_size=2 * h->max_fd_no + h->kq_changes_size;
+ h->kq_array=local_malloc(sizeof(*(h->kq_array))*h->kq_array_size);
if (h->kq_array==0){
LOG(L_CRIT, "ERROR: init_io_wait: could not alloc"
" kqueue event array\n");
goto error;
}
- h->kq_changes_size=KQ_CHANGES_ARRAY_SIZE;
h->kq_changes=local_malloc(sizeof(*(h->kq_changes))*
h->kq_changes_size);
if (h->kq_changes==0){
@@ -557,7 +565,8 @@ int init_io_wait(io_wait_h* h, int max_fd, enum poll_types poll_method)
goto error;
}
h->kq_nchanges=0;
- memset((void*)h->kq_array, 0, sizeof(*(h->kq_array))*h->max_fd_no);
+ memset((void*)h->kq_array, 0,
+ sizeof(*(h->kq_array))*h->kq_array_size);
memset((void*)h->kq_changes, 0,
sizeof(*(h->kq_changes))* h->kq_changes_size);
if (init_kqueue(h)<0){
diff --git a/io_wait.h b/io_wait.h
index c28f53d..e04efd8 100644
--- a/io_wait.h
+++ b/io_wait.h
@@ -148,7 +148,9 @@ struct io_wait_handler{
int flags;
struct fd_map* fd_hash;
int fd_no; /* current index used in fd_array and the passed size for
- ep_array & kq_array*/
+ ep_array (for kq_array at least
+ max(twice the size, kq_changes_size) should be
+ be passed). */
int max_fd_no; /* maximum fd no, is also the size of fd_array,
fd_hash and ep_array*/
/* common stuff for POLL, SIGIO_RT and SELECT
@@ -170,6 +172,7 @@ struct io_wait_handler{
struct kevent* kq_array; /* used for the eventlist*/
struct kevent* kq_changes; /* used for the changelist */
size_t kq_nchanges;
+ size_t kq_array_size; /* array size */
size_t kq_changes_size; /* size of the changes array */
#endif
#ifdef HAVE_DEVPOLL
@@ -1115,7 +1118,7 @@ inline static int io_wait_loop_kqueue(io_wait_h* h, int t, int repeat)
do {
again:
n=kevent(h->kq_fd, h->kq_changes, apply_changes, h->kq_array,
- h->fd_no, &tspec);
+ h->kq_array_size, &tspec);
if (unlikely(n==-1)){
if (unlikely(errno==EINTR)) goto again; /* signal, ignore it */
else {
@@ -1127,7 +1130,7 @@ again:
/* some of the FDs in kq_changes are bad (already closed)
and there is not enough space in kq_array to return all
of them back */
- apply_changes = h->fd_no;
+ apply_changes = h->kq_array_size;
goto again;
}
}
Module: sip-router
Branch: sr_3.0
Commit: 0a38c9ab227f158c6d5d77aab509c761c9790a60
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=0a38c9a…
Author: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Committer: Andrei Pelinescu-Onciul <andrei(a)iptel.org>
Date: Sat Jun 19 00:35:47 2010 +0200
io_wait: fix kqueue io_wait_add & POLLIN
A "goto error" was placed outside the error handling "if",
resulting in any io_watch_add(), that tried to enable write
watching on a new FD, returning failure (fortunately this kind
of io_watch_add() usage doesn't happen very often, usually write
watch is enabled via io_watch_chg() on FDs already
io_watch_add()'ed for reading).
Only POLL_KQUEUE was affected by this bug, meaning the default on
all *bsd and darwin.
(cherry picked from commit e5be1a067158c8ba49d33082eb403937546e7c69)
---
io_wait.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/io_wait.h b/io_wait.h
index 01df1e6..b110b6b 100644
--- a/io_wait.h
+++ b/io_wait.h
@@ -499,7 +499,7 @@ again2:
case POLL_KQUEUE:
if (likely( events & POLLIN)){
if (unlikely(kq_ev_change(h, fd, EVFILT_READ, EV_ADD, e)==-1))
- goto error;
+ goto error;
}
if (unlikely( events & POLLOUT)){
if (unlikely(kq_ev_change(h, fd, EVFILT_WRITE, EV_ADD, e)==-1))
@@ -507,8 +507,8 @@ again2:
if (likely(events & POLLIN)){
kq_ev_change(h, fd, EVFILT_READ, EV_DELETE, 0);
}
+ goto error;
}
- goto error;
}
break;
#endif