Hi Peter,
Wouldn't it had been easier if you waited until 200OK to do the insert?
Kamailio is does not store transactions persistently, so there is no
point to store the subscribe dialog information until it is not
established. The reason is that if kamailio is restarted in the mean
time before receiving the 200OK, it won't match the 200OK to the
transaction and won't complete the dialog anyways.
This is actually a solution I adopted in the b2bua modules which had the
same race problems.
Regards,
Anca
On 03/13/2012 12:01 AM, Peter Dunkley wrote:
> Module: sip-router
> Branch: master
> Commit: 287ee15ffa985cb6d07f192f1d1cbfadb31c0fd8
> URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=287ee15ffa985cb6d07f192f1d1cbfadb31c0fd8
>
> Author: Peter Dunkley<peter.dunkley@crocodile-rcs.com>
> Committer: Peter Dunkley<peter.dunkley@crocodile-rcs.com>
> Date: Mon Mar 12 21:52:39 2012 +0000
>
> modules_k/pua: Fixed race hazard on pua table
>
> - During testing a race hazard where the 2XX to a back-end SUBSCRIBE can be
> received and processed (and the DB UPDATE to convert a temporary dialog to a
> full dialog) before the DB INSERT to create a temporary dialog is run. There
> is an incredibly small window for this, but it was happening consistently on
> one system.
> - The easiest way to fix this is to use the replace() DB API to convert the
> dialog and live with the initial INSERT failing (this does not actually
> return an error from the SRDB1 interface so the rest of the code continues
> OK). Unfortunately, the replace() API is not available for some databases
> (for example, PostgreSQL).
> - I have updated the code to use replace() when it is available and to do an
> update() then check affected_rows() (and if 0 then do an insert()) when
> replace() is not available.
> - The update() and then insert() process makes the window for the race much
> smaller, but doesn't get rid of it completely. However, with PostgreSQL a
> DB rule can be used to fix it completely:
> - PostgreSQL DB rule:
> CREATE RULE "pua_insert_race1" AS ON INSERT TO "pua"
> WHERE EXISTS(
> SELECT 1 FROM pua WHERE call_id=NEW.call_id AND from_tag=NEW.from_tag
> AND pres_id=NEW.pres_id AND to_tag=''
> ) DO INSTEAD (
> UPDATE pua
> SET expires=NEW.expires,
> desired_expires=NEW.desired_expires,
> flag=NEW.flag,
> etag=NEW.etag,
> tuple_id=NEW.tuple_id,
> to_tag=NEW.to_tag,
> cseq=NEW.cseq,
> record_route=NEW.record_route,
> contact=NEW.contact,
> remote_contact=NEW.remote_contact,
> version=NEW.version,
> extra_headers=NEW.extra_headers
> WHERE call_id=NEW.call_id AND from_tag=NEW.from_tag
> AND pres_id=NEW.pres_id
> );
> - You can also add another PostgreSQL rule to make the failing INSERT
> (described above) do so quietly. This does not affect the function of the
> code, but it will make the logs quieter (which is nice):
> CREATE RULE "pua_insert_race2" AS ON INSERT TO "pua"
> WHERE EXISTS(
> SELECT 1 FROM pua WHERE call_id=NEW.call_id AND from_tag=NEW.from_tag
> AND to_tag<>''
> ) DO INSTEAD NOTHING;
>
> ---
>
> modules_k/pua/pua_db.c | 187 +++++++++++++++++++++++++---------------
> modules_k/pua/send_subscribe.c | 54 ------------
> 2 files changed, 118 insertions(+), 123 deletions(-)
>
> diff --git a/modules_k/pua/pua_db.c b/modules_k/pua/pua_db.c
> index b83e3ed..b652cee 100644
> --- a/modules_k/pua/pua_db.c
> +++ b/modules_k/pua/pua_db.c
> @@ -883,9 +883,9 @@ int get_record_id_puadb(ua_pres_t *pres, str **rec_id )
> /******************************************************************************/
> int convert_temporary_dialog_puadb(ua_pres_t *pres)
> {
> - db_key_t query_cols[4], data_cols[10];
> - db_val_t query_vals[4], data_vals[10];
> - int n_query_cols = 0, n_data_cols = 0;
> + db_key_t query_cols[18];
> + db_val_t query_vals[18];
> + int n_query_cols = 0;
>
> if (pres==NULL)
> {
> @@ -920,82 +920,131 @@ int convert_temporary_dialog_puadb(ua_pres_t *pres)
> n_query_cols++;
>
> /* The columns I need to fill in to convert a temporary dialog to a dialog */
> - data_cols[n_data_cols] =&str_expires_col;
> - data_vals[n_data_cols].type = DB1_INT;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.int_val = pres->expires;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_desired_expires_col;
> - data_vals[n_data_cols].type = DB1_INT;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.int_val = pres->desired_expires;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_flag_col;
> - data_vals[n_data_cols].type = DB1_INT;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.int_val = pres->flag;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_to_tag_col;
> - data_vals[n_data_cols].type = DB1_STR;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.str_val = pres->to_tag;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_cseq_col;
> - data_vals[n_data_cols].type = DB1_INT;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.int_val = pres->cseq;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_record_route_col;
> - data_vals[n_data_cols].type = DB1_STR;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.str_val = pres->record_route;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_contact_col;
> - data_vals[n_data_cols].type = DB1_STR;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.str_val = pres->contact;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_remote_contact_col;
> - data_vals[n_data_cols].type = DB1_STR;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.str_val = pres->remote_contact;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_version_col;
> - data_vals[n_data_cols].type = DB1_INT;
> - data_vals[n_data_cols].nul = 0;
> - data_vals[n_data_cols].val.int_val = pres->version;
> - n_data_cols++;
> -
> - data_cols[n_data_cols] =&str_extra_headers_col;
> - data_vals[n_data_cols].type = DB1_STR;
> - data_vals[n_data_cols].nul = 0;
> + query_cols[n_query_cols] =&str_expires_col;
> + query_vals[n_query_cols].type = DB1_INT;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.int_val = pres->expires;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_desired_expires_col;
> + query_vals[n_query_cols].type = DB1_INT;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.int_val = pres->desired_expires;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_flag_col;
> + query_vals[n_query_cols].type = DB1_INT;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.int_val = pres->flag;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_to_tag_col;
> + query_vals[n_query_cols].type = DB1_STR;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.str_val = pres->to_tag;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_cseq_col;
> + query_vals[n_query_cols].type = DB1_INT;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.int_val = pres->cseq;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_record_route_col;
> + query_vals[n_query_cols].type = DB1_STR;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.str_val = pres->record_route;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_contact_col;
> + query_vals[n_query_cols].type = DB1_STR;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.str_val = pres->contact;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_remote_contact_col;
> + query_vals[n_query_cols].type = DB1_STR;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.str_val = pres->remote_contact;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_version_col;
> + query_vals[n_query_cols].type = DB1_INT;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.int_val = pres->version;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_extra_headers_col;
> + query_vals[n_query_cols].type = DB1_STR;
> + query_vals[n_query_cols].nul = 0;
> if (pres->extra_headers)
> {
> - data_vals[n_data_cols].val.str_val.s = pres->extra_headers->s;
> - data_vals[n_data_cols].val.str_val.len = pres->extra_headers->len;
> + query_vals[n_query_cols].val.str_val.s = pres->extra_headers->s;
> + query_vals[n_query_cols].val.str_val.len = pres->extra_headers->len;
> }
> else
> {
> - data_vals[n_data_cols].val.str_val.s = "";
> - data_vals[n_data_cols].val.str_val.len = 0;
> + query_vals[n_query_cols].val.str_val.s = "";
> + query_vals[n_query_cols].val.str_val.len = 0;
> }
> - n_data_cols++;
> + n_query_cols++;
>
> - if (pua_dbf.update(pua_db, query_cols, 0, query_vals,
> - data_cols, data_vals, n_query_cols, n_data_cols)< 0)
> + query_cols[n_query_cols] =&str_event_col;
> + query_vals[n_query_cols].type = DB1_INT;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.int_val = pres->event;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_watcher_uri_col;
> + query_vals[n_query_cols].type = DB1_STR;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.str_val.s = pres->watcher_uri->s;
> + query_vals[n_query_cols].val.str_val.len = pres->watcher_uri->len;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_etag_col;
> + query_vals[n_query_cols].type = DB1_STR;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.str_val.s = 0;
> + query_vals[n_query_cols].val.str_val.len = 0;
> + n_query_cols++;
> +
> + query_cols[n_query_cols] =&str_tuple_id_col;
> + query_vals[n_query_cols].type = DB1_STR;
> + query_vals[n_query_cols].nul = 0;
> + query_vals[n_query_cols].val.str_val.s = 0;
> + query_vals[n_query_cols].val.str_val.len = 0;
> + n_query_cols++;
> +
> + if (pua_dbf.replace != NULL)
> {
> - LM_ERR("Failed update db\n");
> - return -1;
> + if (pua_dbf.replace(pua_db, query_cols, query_vals, n_query_cols)< 0)
> + {
> + LM_ERR("Failed replace db\n");
> + return -1;
> + }
> + }
> + else
> + {
> + if (pua_dbf.update(pua_db, query_cols, 0, query_vals,
> + query_cols + 4, query_vals + 4, 4, n_query_cols - 4)< 0)
> + {
> + LM_ERR("Failed update db\n");
> + return -1;
> + }
> +
> + LM_DBG("affected_rows: %d\n", pua_dbf.affected_rows(pua_db));
> + if (pua_dbf.affected_rows(pua_db) == 0)
> + {
> + if (pua_dbf.insert(pua_db, query_cols, query_vals, n_query_cols)< 0)
> + {
> + LM_ERR("Failed insert db\n");
> + return -1;
> + }
> + }
> }
>
> +
> shm_free(pres->remote_contact.s);
> shm_free(pres);
>
> diff --git a/modules_k/pua/send_subscribe.c b/modules_k/pua/send_subscribe.c
> index 15a1a02..b135ac3 100644
> --- a/modules_k/pua/send_subscribe.c
> +++ b/modules_k/pua/send_subscribe.c
> @@ -1050,18 +1050,9 @@ insert:
>
> if(subs->flag& UPDATE_TYPE)
> {
> - /*
> - LM_ERR("request for a subscription"
> - " with update type and no record found\n");
> - ret= -1;
> - goto done;
> - commented this because of the strange type parameter in usrloc callback functions
> - */
> -
> LM_DBG("request for a subscription with update type"
> " and no record found\n");
> subs->flag= INSERT_TYPE;
> -
> }
> hentity= subscribe_cbparam(subs, REQ_OTHER);
> if(hentity== NULL)
> @@ -1168,50 +1159,6 @@ insert:
> }
> else
> {
> - /*
> - if(presentity->desired_expires== 0)
> - {
> -
> - if(subs->expires< 0)
> - {
> - LM_DBG("Found previous request for unlimited subscribe-"
> - " do not send subscribe\n");
> -
> - if (subs->event& PWINFO_EVENT)
> - {
> - presentity->watcher_count++;
> - }
> - lock_release(&HashT->p_records[hash_code].lock);
> - goto done;
> -
> - }
> -
> -
> - if(subs->event& PWINFO_EVENT)
> - {
> - if(subs->expires== 0)
> - {
> - presentity->watcher_count--;
> - if( presentity->watcher_count> 0)
> - {
> - lock_release(&HashT->p_records[hash_code].lock);
> - goto done;
> - }
> - }
> - else
> - {
> - presentity->watcher_count++;
> - if(presentity->watcher_count> 1)
> - {
> - lock_release(&HashT->p_records[hash_code].lock);
> - goto done;
> - }
> - }
> - }
> -
> - }
> - */
> -
> if (subs->internal_update_flag == INTERNAL_UPDATE_TRUE)
> {
> LM_INFO("attempting to re-SUBSCRIBE on internal (rls_update_subs()) update - skipping\n");
> @@ -1265,7 +1212,6 @@ insert:
> if (dbmode!=PUA_DB_ONLY)
> lock_release(&HashT->p_records[hash_code].lock);
>
> - // hentity->flag= flag;
> LM_DBG("event parameter: %d\n", hentity->event);
>
> set_uac_req(&uac_r,&met, str_hdr, 0, td, TMCB_LOCAL_COMPLETED,
>
>
> _______________________________________________
> sr-dev mailing list
> sr-dev@lists.sip-router.org
> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>