Sponsor: VoiceMeUp - Corporate & Wholesale VoIP Services

VoIP Mailing List Archives
Mailing list archives for the VoIP community
 SearchSearch 

[asterisk-users] CEL entries over ODBC several hours late (Matthew Jordan)


 
Post new topic   Reply to topic    VoIP Mailing List Archives Forum Index -> Asterisk Users
View previous topic :: View next topic  
Author Message
viljoens at verishare....
Guest





PostPosted: Thu Dec 10, 2015 9:57 am    Post subject: [asterisk-users] CEL entries over ODBC several hours late (M Reply with quote

Hi Matthew

Thank you very much for the reply.

I must have something seriously wrong somewhere else then - I retested now
and the "apparent" effect is as I describe but your info definitely
contradicts that. But you're obviously correct.

One more question - I've noted that if I run a combination of queries in the
CEL backing DB (MariaDB) and the CEL table is locked, this severely affects
the Asterisk instance - thousands of occurrences of

chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog
'6a9f5d3543b619655e07c81437373a32@172.17.12.3:5060' with owner
SIP/3034-000207c8 in place (Method: BYE). Rescheduling destruction for 10000
ms

appear in the CLI and users complain that if they hang up then they cannot
make another call on the same SIP handset for several minutes.

This is obviously because the dialplan gets delayed in the H extension, and
cannot write to the CEL table, waiting for the MariaDB instance to clear the
locks so it can write again. The above apparently comes from a watchdog
process that watches how "fast" the H extension is and if it takes "too
long" it forces the channels closed.

Is this assumption correct?

Addtionally, it seems that the writing of CEL in Asterisk is NOT async? E.g.
it appears the thread that was running the conversation ALSO does the CEL
writing / pushing to the CEL core as you describe in a synchronous manner.

For 1.8, is this correct that CELs are synchronous, and do newer Asterisk
versions do it async?

E. g. my point being if there are major DB issues, it is quite a bit
kryptonite to have that "spill back" into Asterisk and start blocking off
users from calling out - wouldn't it be much better to simply have failed
CEL writes just die in a distal thread instead of the main call thread for
the channels running on that handset.

Or am I completely misunderstanding things?

Anyway, thanks for the reply. Smile

Kind regards,

Quote:
Hi guys

I'm running 1.8.32.3 with CEL logging over ODBC to MariaDB 5.5.41 on the
same Centos 7 machine.

I've noticed that the CDR entries made are all in-time, e. g. the call
will
Quote:
take place and the CDR entry is immediately written into the CDR table in
the MariaDB database.

However, CEL events for that CDR (which I need to process for a realtime
display feature in my dialer software) are always several hours after the
fact. E. g. I will make a call at 09:30, see the call immediate pop up in
the MariaDB CDR table, but only at about 15:15 that afternoon will I see
that call's CEL events come into the CEL table, from Asterisk.... I have
examined the `show processlist` in MariaDB exhaustively to establish this
fact.

The system doesn't appear loaded, load average is about 1.1 - it's a
quad-coare HT Intel Xeon E3-1225 with 8GB of DRAM running on an SSD for
main
storage.

The system processes about 30 000 calls every 8 hour day, and services 90
SIP phones.

I can stop and restart the MariaDB instance for several minutes, when I
restart it it immediately picks up on the "slow" CELs from where it was
interrupted - more evidence that Asterisk is very slowly streaming the
CELs
Quote:
through. I thought MariaDB was the bottleneck, but apparently not?

If I make test inserts from a script into the CEL table, all of them
complete so quickly a time indication does not even register for the query
in MariaDB. Simple test queries on the CEL table are also instant, not
even
Quote:
counting in the internal MariaDB query duration timer.

Can anybody explain why this is that the CELs asterisk emits over ODBC are
so delayed? Are CELs intended NOT to be realtime?

So, logically, Asterisk appears to be caching CELs to the tune of hundreds
of thousands of them at any given time - meaning if it is stopped (either
killed, or core stop gracefully'ed, or just "core stop now") potentially
hundreds of thousands of CELs will just evaporate irretrivably.

What can I do to mitigate this extremely slow populating of CELs over
ODBC?
Asterisk does not buffer CEL entries. If anything, it pushes the entries
out to ODBC much more aggressively than what you would get with CDRs.

An event is generated in Asterisk that corresponds to the CEL entry. That
entry is pushed over a message bus (the 'event' message bus in 1.8 - 11;
'stasis' in 12+) and is picked up by the CEL core. The events are
immediately sent to the registered backends, who also immediately write it
out to the backend they support. In the case of ODBC, this immediately does
an INSERT into the appropriate table.

In Asterisk 1.8, you can look for a verbose level 11 message that will show
when this occurs:

ast_verb(11, "[%s]\n", ast_str_buffer(sql));

In later versions, this was turned into a debug level 3 message (as
anything over a verbose 5/debug 5 was cleaned up).

If you see that message, then that will tell you when Asterisk *believes*
it has written the CEL entry. If that doesn't show up in the database, then
it is either in the ODBC driver or the Maria database.

If you don't see that message, then something is preventing those events
from getting delivered inside of Asterisk, which would only occur if you
had some other serious call related issues occurring.

Matt



--
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
http://lists.digium.com/mailman/listinfo/asterisk-users
Back to top
vinicius at aittelecom...
Guest





PostPosted: Thu Dec 10, 2015 10:42 am    Post subject: [asterisk-users] CEL entries over ODBC several hours late (M Reply with quote

Sorry for the probably obvious question, but it's better to cover all bases.

The DBMS is running on the same box as Asterisk is? If that's the case then maybe the DBMS is using too much CPU and starving Asterisk?


2015-12-10 12:57 GMT-02:00 Stefan Viljoen <viljoens@verishare.co.za (viljoens@verishare.co.za)>:
Quote:
Hi Matthew

Thank you very much for the reply.

I must have something seriously wrong somewhere else then - I retested now
and the "apparent" effect is as  I describe but your info definitely
contradicts that. But you're obviously correct.

One more question - I've noted that if I run a combination of queries in the
CEL backing DB (MariaDB) and the CEL table is locked, this severely affects
the Asterisk instance - thousands of occurrences of

chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog
'6a9f5d3543b619655e07c81437373a32@172.17.12.3:5060' with owner
SIP/3034-000207c8 in place (Method: BYE). Rescheduling destruction for 10000
ms

appear in the CLI and users complain that if they hang up then they cannot
make another call on the same SIP handset for several minutes.

This is obviously because the dialplan gets delayed in the H extension, and
cannot write to the CEL table, waiting for the MariaDB instance to clear the
locks so it can write again. The above apparently comes from a watchdog
process that watches how "fast" the H extension is and if it takes "too
long" it forces the channels closed.

Is this assumption correct?

Addtionally, it seems that the writing of CEL in Asterisk is NOT async? E.g.
it appears the thread that was running the conversation ALSO does the CEL
writing / pushing to the CEL core as you describe in a synchronous manner.

For 1.8, is this correct that CELs are synchronous, and do newer Asterisk
versions do it async?

E. g. my point being if there are major DB issues, it is quite a bit
kryptonite to have that "spill back" into Asterisk and start blocking off
users from calling out - wouldn't it be much better to simply have failed
CEL writes just die in a distal thread instead of the main call thread for
the channels running on that handset.

Or am I completely misunderstanding things?

Anyway, thanks for the reply. Smile

Kind regards,

Quote:
Hi guys

I'm running 1.8.32.3 with CEL logging over ODBC to MariaDB 5.5.41 on the
same Centos 7 machine.

I've noticed that the CDR entries made are all in-time, e. g. the call
will
Quote:
take place and the CDR entry is immediately written into the CDR table in
the MariaDB database.

However, CEL events for that CDR (which I need to process for a realtime
display feature in my dialer software) are always several hours after the
fact. E. g. I will make a call at 09:30, see the call immediate pop up in
the MariaDB CDR table, but only at about 15:15 that afternoon will I see
that call's CEL events come into the CEL table, from Asterisk.... I have
examined the `show processlist` in MariaDB exhaustively to establish this
fact.

The system doesn't appear loaded, load average is about 1.1 - it's a
quad-coare HT Intel Xeon E3-1225 with 8GB of DRAM running on an SSD for
main
storage.

The system processes about 30 000 calls every 8 hour day, and services 90
SIP phones.

I can stop and restart the MariaDB instance for several minutes, when I
restart it it immediately picks up on the "slow" CELs from where it was
interrupted - more evidence that Asterisk is very slowly streaming the
CELs
Quote:
through. I thought MariaDB was the bottleneck, but apparently not?

If I make test inserts from a script into the CEL table, all of them
complete so quickly a time indication does not even register for the query
in MariaDB. Simple test queries on the CEL table are also instant, not
even
Quote:
counting in the internal MariaDB query duration timer.

Can anybody explain why this is that the CELs asterisk emits over ODBC are
so delayed? Are CELs intended NOT to be realtime?

So, logically, Asterisk appears to be caching CELs to the tune of hundreds
of thousands of them at any given time - meaning if it is stopped (either
killed, or core stop gracefully'ed, or just "core stop now")  potentially
hundreds of thousands of CELs will just evaporate irretrivably.

What can I do to mitigate this extremely slow populating of CELs over
ODBC?
Asterisk does not buffer CEL entries. If anything, it pushes the entries
out to ODBC much more aggressively than what you would get with CDRs.

An event is generated in Asterisk that corresponds to the CEL entry. That
entry is pushed over a message bus (the 'event' message bus in 1.8 - 11;
'stasis' in 12+) and is picked up by the CEL core. The events are
immediately sent to the registered backends, who also immediately write it
out to the backend they support. In the case of ODBC, this immediately does
an INSERT into the appropriate table.

In Asterisk 1.8, you can look for a verbose level 11 message that will show
when this occurs:

        ast_verb(11, "[%s]\n", ast_str_buffer(sql));

In later versions, this was turned into a debug level 3 message (as
anything over a verbose 5/debug 5 was cleaned up).

If you see that message, then that will tell you when Asterisk *believes*
it has written the CEL entry. If that doesn't show up in the database, then
it is either in the ODBC driver or the Maria database.

If you don't see that message, then something is preventing those events
from getting delivered inside of Asterisk, which would only occur if you
had some other serious call related issues occurring.

Matt



--
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
               http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users
Back to top
mjordan at digium.com
Guest





PostPosted: Fri Dec 11, 2015 10:27 am    Post subject: [asterisk-users] CEL entries over ODBC several hours late (M Reply with quote

On Thu, Dec 10, 2015 at 8:57 AM, Stefan Viljoen <viljoens@verishare.co.za (viljoens@verishare.co.za)> wrote:
Quote:
Hi Matthew

Thank you very much for the reply.

I must have something seriously wrong somewhere else then - I retested now
and the "apparent" effect is as  I describe but your info definitely
contradicts that. But you're obviously correct.

One more question - I've noted that if I run a combination of queries in the
CEL backing DB (MariaDB) and the CEL table is locked, this severely affects
the Asterisk instance - thousands of occurrences of

chan_sip.c:4057 __sip_autodestruct: Autodestruct on dialog
'6a9f5d3543b619655e07c81437373a32@172.17.12.3:5060' with owner
SIP/3034-000207c8 in place (Method: BYE). Rescheduling destruction for 10000
ms

appear in the CLI and users complain that if they hang up then they cannot
make another call on the same SIP handset for several minutes.

This is obviously because the dialplan gets delayed in the H extension, and
cannot write to the CEL table, waiting for the MariaDB instance to clear the
locks so it can write again. The above apparently comes from a watchdog
process that watches how "fast" the H extension is and if it takes "too
long" it forces the channels closed.

Is this assumption correct?

Addtionally, it seems that the writing of CEL in Asterisk is NOT async? E.g.
it appears the thread that was running the conversation ALSO does the CEL
writing / pushing to the CEL core as you describe in a synchronous manner.

For 1.8, is this correct that CELs are synchronous, and do newer Asterisk
versions do it async?


That's actually a bit surprising. Assuming you are using CELGenUserEvent in the 'h' extension, I would not expect that to block the channel when writing out to the database. The act of creating the CEL event will lock the channel briefly, but the actual CEL event is queued up onto a message bus and dispatched to another thread. I'd have to see the output of a gdb backtrace or 'core show locks' to know why that is impacting the channels.


It should be asynchronous everywhere - in 1.8+. While the implementation of the message bus changed in Asterisk 12, that doesn't change the nature of how it is dispatched. As I said, a gdb backtrace or 'core show locks' would show who the culprit it.




Quote:
E. g. my point being if there are major DB issues, it is quite a bit
kryptonite to have that "spill back" into Asterisk and start blocking off
users from calling out - wouldn't it be much better to simply have failed
CEL writes just die in a distal thread instead of the main call thread for
the channels running on that handset.

Or am I completely misunderstanding things?

Anyway, thanks for the reply. Smile

Kind regards,

Quote:
Hi guys

I'm running 1.8.32.3 with CEL logging over ODBC to MariaDB 5.5.41 on the
same Centos 7 machine.

I've noticed that the CDR entries made are all in-time, e. g. the call
will
Quote:
take place and the CDR entry is immediately written into the CDR table in
the MariaDB database.

However, CEL events for that CDR (which I need to process for a realtime
display feature in my dialer software) are always several hours after the
fact. E. g. I will make a call at 09:30, see the call immediate pop up in
the MariaDB CDR table, but only at about 15:15 that afternoon will I see
that call's CEL events come into the CEL table, from Asterisk.... I have
examined the `show processlist` in MariaDB exhaustively to establish this
fact.

The system doesn't appear loaded, load average is about 1.1 - it's a
quad-coare HT Intel Xeon E3-1225 with 8GB of DRAM running on an SSD for
main
storage.

The system processes about 30 000 calls every 8 hour day, and services 90
SIP phones.

I can stop and restart the MariaDB instance for several minutes, when I
restart it it immediately picks up on the "slow" CELs from where it was
interrupted - more evidence that Asterisk is very slowly streaming the
CELs
Quote:
through. I thought MariaDB was the bottleneck, but apparently not?

If I make test inserts from a script into the CEL table, all of them
complete so quickly a time indication does not even register for the query
in MariaDB. Simple test queries on the CEL table are also instant, not
even
Quote:
counting in the internal MariaDB query duration timer.

Can anybody explain why this is that the CELs asterisk emits over ODBC are
so delayed? Are CELs intended NOT to be realtime?

So, logically, Asterisk appears to be caching CELs to the tune of hundreds
of thousands of them at any given time - meaning if it is stopped (either
killed, or core stop gracefully'ed, or just "core stop now")  potentially
hundreds of thousands of CELs will just evaporate irretrivably.

What can I do to mitigate this extremely slow populating of CELs over
ODBC?
Asterisk does not buffer CEL entries. If anything, it pushes the entries
out to ODBC much more aggressively than what you would get with CDRs.

An event is generated in Asterisk that corresponds to the CEL entry. That
entry is pushed over a message bus (the 'event' message bus in 1.8 - 11;
'stasis' in 12+) and is picked up by the CEL core. The events are
immediately sent to the registered backends, who also immediately write it
out to the backend they support. In the case of ODBC, this immediately does
an INSERT into the appropriate table.

In Asterisk 1.8, you can look for a verbose level 11 message that will show
when this occurs:

        ast_verb(11, "[%s]\n", ast_str_buffer(sql));

In later versions, this was turned into a debug level 3 message (as
anything over a verbose 5/debug 5 was cleaned up).

If you see that message, then that will tell you when Asterisk *believes*
it has written the CEL entry. If that doesn't show up in the database, then
it is either in the ODBC driver or the Maria database.

If you don't see that message, then something is preventing those events
from getting delivered inside of Asterisk, which would only occur if you
had some other serious call related issues occurring.

Matt



--
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
               http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users



--
Matthew Jordan

Digium, Inc. | Director of Technology

445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: http://digium.com & http://asterisk.org
Back to top
Display posts from previous:   
Post new topic   Reply to topic    VoIP Mailing List Archives Forum Index -> Asterisk Users All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group

VoiceMeUp - Corporate & Wholesale VoIP Services