[asterisk-users] zaptel 1.4.10 regression with TE220B on Proliant DL380 G5 ?

ex.vitorino at gmail.com
Posted: Mon Apr 14, 2008 5:00 pm

Hi list,

After a lot of testing + troubleshooting, I guess I'm observing
what I am now calling a regression with zaptel 1.4.10 (is it?)
As such I call for peer feedback, before either asking Digium
install support or filing a bug.

Thanks in advance!
System: HP Proliant DL380 G5 with 2x PCI-X + 1x PCIe riser card
OS: Centos 5
Kernel: 2.6.18-53.1.14.el5 (also tested under 2.6.18-53.el5)
HW: Digium TE220B, the one with HW echo cancellation
(configured as 2x E1 via jumpers)

Context: Pre-site installation of system, no E1 connectivity
(loopbacks tested)


/etc/zaptel.conf:
span=1,1,0,ccs,hdb3,crc4
bchan=25-39,41-55
dchan=40
span=2,2,0,ccs,hdb3,crc4
bchan=56-70,72-86
dchan=71
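
As a sanity check on a configuration like the one above, each E1 span would normally end up with 30 bearer channels plus one D channel. The sketch below counts the channels in a zaptel-style range list; the function name count_channels is made up for illustration and is not part of zaptel.

```shell
# Count the channels in a comma-separated list of numbers and ranges,
# e.g. "25-39,41-55" (hypothetical helper, not a zaptel tool).
count_channels() {
    total=0
    old_ifs=$IFS
    IFS=,
    for range in $1; do
        start=${range%-*}   # text before the dash (or the whole number)
        end=${range#*-}     # text after the dash (or the whole number)
        total=$((total + end - start + 1))
    done
    IFS=$old_ifs
    echo "$total"
}
```

Running `count_channels 25-39,41-55` on the bchan list of the first span prints 30, and `count_channels 40` prints 1 for its dchan.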


Under zaptel 1.4.10, when ztcfg runs this gets logged in the kernel
buffer:

About to enter spanconfig!
Done with spanconfig!
About to enter spanconfig!
Done with spanconfig!
About to enter startup!
TE2XXP: Span 1 configured for CCS/HDB3/CRC4
timing source auto card 0!
wct2xxp: Setting yellow alarm on span 1
timing source auto card 0!
SPAN 1: Primary Sync Source
VPM400: Not Present
VPM450: echo cancellation for 64 channels
BUG: soft lockup detected on CPU#0!
[<c044d448>] softlockup_tick+0x96/0xa4
[<c042ddc8>] update_process_times+0x39/0x5c
[<c04196f7>] smp_apic_timer_interrupt+0x5b/0x6c
[<c04059bf>] apic_timer_interrupt+0x1f/0x24
[<f89bc1e7>] init_vpm450m+0x32d/0x34a [wct4xxp]
[<f89a3b11>] t4_vpm450_init+0x18ce/0x198c [wct4xxp]
[<f89a7ee4>] t4_startup+0x4315/0x43c7 [wct4xxp]
[<c042621c>] release_console_sem+0x17e/0x1b8
[<c0407406>] do_IRQ+0xa5/0xae
[<f8994311>] t4_dacs+0x211/0x24b [wct4xxp]
[<f8a01f6a>] zt_ioctl+0x273/0x144f [zaptel]
[<c0457600>] mempool_alloc+0x28/0xc9
[<c04ddd33>] cfq_resort_rr_list+0x23/0x8b
[<c04deb6c>] cfq_add_crq_rb+0xba/0xc3
[<c04dec72>] cfq_insert_request+0x42/0x498
[<c04d5175>] elv_insert+0x10a/0x1ad
[<c04d908b>] __make_request+0x31d/0x366
[<c04de8b1>] cfq_dispatch_requests+0x26a/0x46b
[<c04dde27>] __cfq_slice_expired+0x8c/0xa5
[<c04de8b1>] cfq_dispatch_requests+0x26a/0x46b
[<c04d505d>] elv_next_request+0x15c/0x16a
[<f88bc101>] start_io+0x77/0xdc [cciss]
[<f88bf63e>] do_cciss_request+0x32c/0x337 [cciss]
[<f88ccff0>] __split_bio+0x408/0x418 [dm_mod]
[<f88cd6a6>] dm_request+0xce/0xd4 [dm_mod]
[<c04d6a81>] generic_make_request+0x248/0x258
[<c04d8734>] submit_bio+0xbf/0xc5
[<c04548e2>] find_get_page+0x18/0x38
[<c04719ad>] __find_get_block_slow+0xfb/0x105
[<c0471cea>] __find_get_block+0x15c/0x166
[<c0471cea>] __find_get_block+0x15c/0x166
[<c0471d24>] __getblk+0x30/0x270
[<f885a485>] journal_cancel_revoke+0x8a/0x96 [jbd]
[<f885a472>] journal_cancel_revoke+0x77/0x96 [jbd]
[<f885626f>] __journal_file_buffer+0x10e/0x1e3 [jbd]
[<c041f871>] __wake_up+0x2a/0x3d
[<f8856679>] journal_stop+0x1b0/0x1ba [jbd]
[<c042a209>] current_fs_time+0x4a/0x55
[<c048626d>] touch_atime+0x60/0x8f
[<c04552ee>] do_generic_mapping_read+0x421/0x468
[<c045478b>] file_read_actor+0x0/0xd1
[<c04548e2>] find_get_page+0x18/0x38
[<c0457319>] filemap_nopage+0x192/0x315
[<c046048f>] __handle_mm_fault+0x85e/0x87b
[<c047f46b>] do_ioctl+0x47/0x5d
[<c047f6cb>] vfs_ioctl+0x24a/0x25c
[<c047f725>] sys_ioctl+0x48/0x5f
[<c0404eff>] syscall_call+0x7/0xb
=======================
VPM450: hardware DTMF disabled.
VPM450: Present and operational servicing 2 span(s)
Completed startup!
About to enter startup!
TE2XXP: Span 2 configured for CCS/HDB3/CRC4
wct2xxp: Setting yellow alarm on span 2
timing source auto card 0!
SPAN 2: Secondary Sync Source
Completed startup!


Soft lockup ?! Hmmm... I'm ignorant on this, but it smells fishy !

For completeness' sake, the driver had previously loaded OK:

Zapata Telephony Interface Registered on major 196
Zaptel Version: 1.4.10
Zaptel Echo Canceller: MG2
ACPI: PCI Interrupt 0000:18:08.0[A] -> GSI 19 (level, low) -> IRQ 98
Found TE2XXP at base address fdff0000, remapped to f8854000
TE2XXP version c01a016a, burst ON
Octasic optimized!
FALC version: 00000005, Board ID: 00
Reg 0: 0x375a2400
Reg 1: 0x375a2000
Reg 2: 0xffffffff
Reg 3: 0x00000000
Reg 4: 0x00003101
Reg 5: 0x00000000
Reg 6: 0xc01a016a
Reg 7: 0x00001300
Reg 8: 0x00000000
Reg 9: 0x00ff2031
Reg 10: 0x0000004a
TE2XXP: Launching card: 0
TE2XXP: Setting up global serial parameters
Found a Wildcard: Wildcard TE220 (4th Gen)


After trying lots of things (disabling ILO, disabling USB, a different kernel,
a different TE220B, etc.), I found that this soft lockup does not show
up under zaptel 1.4.9.2...

In all honesty, I haven't got the faintest idea what kind of impact this
could have.

Side-testing zaptel 1.4.10 on a simpler system, an HP Proliant ML110 (nearly
a PC), the error does not show up either.


I checked the zaptel 1.4.10 ChangeLog and there are some changes which
I'd suspect:

2008-04-01 16:39 +0000 [r4122] sruffell <sruffell at localhost>:

* kernel/wct4xxp/base.c: Work around for host bridges that generate
fast back to back transactions which the current version of the
quad span cards do not advertise support for.

2008-03-14 16:39 +0000 [r3983-3990] Matthew Fredrickson <creslin at digium.com>

* firmware/Makefile, kernel/wctdm24xxp/base.c,
kernel/wctdm24xxp/GpakApi.c, kernel/wctdm24xxp/GpakApi.h: Update
wctdm24xxp's VPMADT032 firmware to version 1.16

* kernel/wct4xxp/base.c: When doing the ISR rewrite, forgot to
include the vpmdtmfcheck when doing DTMF polling causing it to
check for DTMF events even when it was told not to

(+others)


I need to have this system running in about a week and a half.
What do you guys say ?
--
exvito

bwentdg at pipeline.com
Posted: Tue Apr 15, 2008 3:46 am

Please keep us updated on your "progress".
I am considering putting several of these boxes in
and I would love to hear how this comes out.
Wish I had something to suggest.

Ex Vito wrote:
Quote:
[original post snipped]

_______________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
http://lists.digium.com/mailman/listinfo/asterisk-users


sruffell at digium.com
Posted: Tue Apr 15, 2008 1:07 pm

Ex Vito,

[comments inline]

Ex Vito wrote:
Quote:
Hi list,

After a lot of testing + troubleshooting, I guess I'm observing
what I am now calling a regression with zaptel 1.4.10 (is it?)
As such I call for peer feedback, before either asking Digium
install support or filing a bug.


Under zaptel 1.4.10, when ztcfg runs this gets logged in the kernel
buffer:

[kernel log and soft lockup trace snipped]

Your stack trace appears to possibly be stack corruption.

Could you try either this branch:
http://svn.digium.com/view/zaptel/team/mattf/zaptel-1.4-stackcleanup/

Or with a kernel that does not have 4K stacks enabled? You can check if your installed kernel does with the following command.

$ cat /boot/config-`uname -r` | grep 4K
# CONFIG_4KSTACKS is not set

Cheers,
Shaun
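
The one-liner above can be wrapped so that it reads the config file directly and gives an explicit answer. This is only a minimal sketch: the function name check_4kstacks is hypothetical, and it assumes the kernel config is installed as /boot/config-<release>, as in the command quoted above.

```shell
# Report whether a kernel config file enables 4K stacks.
# Usage: check_4kstacks [config-file]
# With no argument, the config of the running kernel is assumed to be
# at /boot/config-$(uname -r).
check_4kstacks() {
    config="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$config" ]; then
        echo "unknown: cannot read $config"
        return 2
    fi
    if grep -q '^CONFIG_4KSTACKS=y' "$config"; then
        echo "4K stacks enabled"
    else
        # Covers both "# CONFIG_4KSTACKS is not set" and a missing option.
        echo "4K stacks not enabled"
    fi
}
```

Matching on `^CONFIG_4KSTACKS=y` rather than on the bare string `4K` avoids counting the commented-out "is not set" line as a hit.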

ex.vitorino at gmail.com
Posted: Tue Apr 15, 2008 2:01 pm

Quote:

Your stack trace appears to possibly be stack corruption.

Could you try either this branch:
http://svn.digium.com/view/zaptel/team/mattf/zaptel-1.4-stackcleanup/

Or with a kernel that does not have 4K stacks enabled? You can check if your installed kernel does with the following command.

$ cat /boot/config-`uname -r` | grep 4K
# CONFIG_4KSTACKS is not set


...thanks for your feedback Shaun.

I am currently dealing with other troubleshooting issues regarding
a TC400B (which will probably lead me to get in touch with
Digium install support).

So I have no time today to test your suggestions; maybe
tomorrow / Thursday.

They are noted, however. :)

Cheers,
--
exvito

ex.vitorino at gmail.com
Posted: Tue Apr 15, 2008 2:02 pm

Quote:
Quote:

Or with a kernel that does not have 4K stacks enabled? You can check if your installed kernel does with the following command.

$ cat /boot/config-`uname -r` | grep 4K
# CONFIG_4KSTACKS is not set


Oops, forgot to give feedback: yes, this kernel seems
to have CONFIG_4KSTACKS enabled.
--
exvito

bwentdg at pipeline.com
Posted: Tue Apr 15, 2008 2:44 pm

Shaun - Could you clarify your post a bit ?

1 - Is the "4 K " stacks a Known Problem ?
a) If so is it known to be problem on any specific Linux distro ?
b) Should ALL installation Check for this PRIOR to doing an
Asterisk Install ?

2) The "branch" you mention below - are "fixes" from it in Any current *
release ?
Shaun Ruffell wrote:
Quote:
Your stack trace appears to possibly be stack corruption.

Could you try either this branch:
http://svn.digium.com/view/zaptel/team/mattf/zaptel-1.4-stackcleanup/

Or with a kernel that does not have 4K stacks enabled? You can check if your installed kernel does with the following command.

$ cat /boot/config-`uname -r` | grep 4K
# CONFIG_4KSTACKS is not set

Cheers,
Shaun


ex.vitorino at gmail.com
Posted: Tue Apr 15, 2008 7:09 pm

On Tue, Apr 15, 2008 at 8:37 PM, Al Baker <bwentdg at pipeline.com> wrote:
Quote:
exvito - I know it is a pain in the cahoonkus - but would you consider
sharing the OTHER Digium board issues you are having , the recommended
steps you were given by Digium to troubleshoot them, and the results ?
I think this "real-world" experience would be invaluable to the list.
THX in Advance for sharing !


...sure, here it goes, without all the infinite detail we went
through in the process.

Short version: same DL380 G5 system, Centos 5, kernel
2.6.18-53.1.14.el5, zaptel 1.4.10, zaptel 1.4.9.2, almost all
possible combinations in PCI slots, USB / 2nd NIC / ILO
enabling / disabling.

TC400B module loading fails (wctc4xxp)

(actually it loaded fine once or twice and asterisk recognized
its presence, but failed in subsequent reboots without any
reconfiguration!)

If asterisk 1.4.19 is started under these conditions, we get a
kernel panic -- did not get a dump / log of it but we have a
console picture that we can share (~460KiB). But at some
point we get:

...
[<address>] apic_timer_interrupt+0x1f/0x24
[<address>] zt_tc_open+0x59/0xc3 [zttranscode]
[<address>] zt_open+0x86/0x22a [zaptel]
[<address>] chrdev_open+0x11e/0x123
...

We also tried the same card under all the other variations
in a different system, a Proliant ML110 G4, and obtained
the same behaviour: once or twice it loaded, but most of the
time it failed with the same error.

dmesg snippet is:

...
Zaptel Version: 1.4.10
Zaptel Echo Canceller: MG2
Zaptel Transcoder support loaded
Registered codec translator 'DTE Encoder' with 92 transcoders
(srcs=0000000c, dsts=00000101)
Registered codec translator 'DTE Decoder' with 92 transcoders
(srcs=00000101, dsts=0000000c)
Zaptel DTE (G.729a / G.723.1) Transcoder support LOADED (firm ver = 6.12)
wctc4xxp: probe of 0000:0a:01.0 failed with error -5
...

This happens both when the card is the only one installed in the
system and when a TE220B and / or TE122 is also present.

We contacted Digium support, who suggested we RMA
this card; they believe the card is faulty. We tend to
agree, as the behavior does not seem to make much
sense (although this is our first experience with such
a card).

There it is, in the hope that it helps someone in the future.
We will post back results when the new card arrives.

Cheers,
--
exvito

ex.vitorino at gmail.com
Posted: Wed Apr 16, 2008 8:26 am

On Tue, Apr 15, 2008 at 7:07 PM, Shaun Ruffell <sruffell at digium.com> wrote:
Quote:

Your stack trace appears to possibly be stack corruption.

Could you try either this branch:
http://svn.digium.com/view/zaptel/team/mattf/zaptel-1.4-stackcleanup/


Just tried it... Behaviour looks equivalent. Drivers load ok, ztcfg
leads to "BUG: soft lockup detected on CPU#1"... dmesg snippet is:

Zapata Telephony Interface Registered on major 196
Zaptel Version: SVN-mattf-zaptel-1.4-stackcleanup-r4163M
Zaptel Echo Canceller: MG2
PCI: Enabling device 0000:12:01.0 (0150 -> 0153)
ACPI: PCI Interrupt 0000:12:01.0[A] -> GSI 25 (level, low) -> IRQ 138
wcte12xp: Setting up global serial parameters for T1
wcte12xp: Found a Wildcard TE122
Found TE2XXP at base address fdff0000, remapped to f89c4000
TE2XXP version c01a016a, burst ON
Octasic optimized!
FALC version: 00000005, Board ID: 00
Reg 0: 0x37407400
Reg 1: 0x37407000
Reg 2: 0xffffffff
Reg 3: 0x00000000
Reg 4: 0x00000001
Reg 5: 0x00000000
Reg 6: 0xc01a016a
Reg 7: 0x00001300
Reg 8: 0x000200ff
Reg 9: 0x00f50000
Reg 10: 0x0000004a
TE2XXP: Launching card: 0
TE2XXP: Setting up global serial parameters
Found a Wildcard: Wildcard TE220 (4th Gen)
About to enter spanconfig!
Done with spanconfig!
About to enter spanconfig!
Done with spanconfig!
Registered tone zone 25 (Portugal)
wcte12xp: Span configured for ESF/B8ZS
About to enter startup!
TE2XXP: Span 1 configured for CCS/HDB3/CRC4
timing source auto card 0!
wct2xxp: Setting yellow alarm on span 1
timing source auto card 0!
SPAN 2: Primary Sync Source
VPM400: Not Present
wcte12xp: Setting yellow alarm
VPM450: echo cancellation for 64 channels
BUG: soft lockup detected on CPU#1!
[<c044d448>] softlockup_tick+0x96/0xa4
[<c042ddc8>] update_process_times+0x39/0x5c
[<c04196f7>] smp_apic_timer_interrupt+0x5b/0x6c
[<c04059bf>] apic_timer_interrupt+0x1f/0x24
[<f8f6b1e7>] init_vpm450m+0x32d/0x34a [wct4xxp]
[<f8f52b11>] t4_vpm450_init+0x18ce/0x198c [wct4xxp]
[<f8f56ee4>] t4_startup+0x4315/0x43c7 [wct4xxp]
[<c042624e>] release_console_sem+0x1b0/0x1b8
[<c042680e>] printk+0x18/0x8e
[<f8966fe4>] t1_configure_t1+0xc10/0xc18 [wcte12xp]
[<f89945ef>] zt_rbs_sethook+0x102/0x13b [zaptel]
[<f899bf39>] zt_ioctl+0x273/0x14be [zaptel]
[<c0477775>] chrdev_open+0x11e/0x132
[<c0477657>] chrdev_open+0x0/0x132
[<c046e9e6>] __dentry_open+0xea/0x1ab
[<c0604451>] schedule+0x90d/0x9ba
[<c047f46b>] do_ioctl+0x47/0x5d
[<c047f6cb>] vfs_ioctl+0x24a/0x25c
[<c0470daa>] __fput+0x13f/0x167
[<c047f725>] sys_ioctl+0x48/0x5f
[<c0404eff>] syscall_call+0x7/0xb
=======================
wcte12xp0: Missed interrupt. Increasing latency to 4 ms in order to compensate.
VPM450: hardware DTMF disabled.
VPM450: Present and operational servicing 2 span(s)
Completed startup!
About to enter startup!
TE2XXP: Span 2 configured for CCS/HDB3/CRC4
wct2xxp: Setting yellow alarm on span 2
SPAN 3: Secondary Sync Source
timing source auto card 0!
Completed startup!
wcte12xp: Clearing yellow alarm

Quote:

Or with a kernel that does not have 4K stacks enabled? You can check if your installed kernel does with the following command.

$ cat /boot/config-`uname -r` | grep 4K
# CONFIG_4KSTACKS is not set


...as mentioned previously, current kernel has CONFIG_4KSTACKS
set. I'll now go ahead and rebuild a kernel with 4K stacks disabled.

I'll post back later.
--
exvito

creslin at digium.com
Posted: Wed Apr 16, 2008 9:26 am

Ex Vito wrote:
Quote:
[original post snipped]

The softlockup indicator should be benign. It gets triggered while loading
the firmware for the part, since the firmware image is so large and
takes a long time to load. However, I might have a fix for you.

Can you try my stack reduction branch at:

https://origsvn.digium.com/svn/zaptel/team/mattf/zaptel-1.4-stackcleanup

If that does not work, please contact me directly and I will work with
you to get a resolution.

--
Matthew Fredrickson
Software/Firmware Engineer
Digium, Inc.
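
To see which driver code was running when the soft lockup watchdog fired, one option is to pick the first stack frame that carries a module name. In the trace posted in this thread that frame is init_vpm450m in wct4xxp, which fits the firmware-load explanation above. A rough sketch follows; the function name first_module_frame is hypothetical, and it assumes log lines formatted as in this thread (no timestamp prefix, module name in brackets at the end of the line).

```shell
# Print "symbol+offset/size [module]" for the first frame after a
# "BUG: soft lockup detected" line that belongs to a loadable module.
# Reads kernel log text on standard input.
first_module_frame() {
    awk '
        /BUG: soft lockup detected/ { in_trace = 1; next }
        in_trace && /\[[A-Za-z0-9_]+\][ \t]*$/ {
            # $1 is the address, $2 the symbol, $3 the "[module]" tag
            print $2, $3
            exit
        }
    '
}
```

Typical use would be `dmesg | first_module_frame`. Frames from the core kernel (such as softlockup_tick) have no module tag and are skipped.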

sruffell at digium.com
Posted: Wed Apr 16, 2008 10:05 am

Hi Al,

Al Baker wrote:
Quote:
Shaun - Could you clarify your post a bit ?

1 - Is the "4 K " stacks a Known Problem ?
a) If so is it known to be problem on any specific Linux distro ?
b) Should ALL installation Check for this PRIOR to doing an
Asterisk Install ?

I wouldn't really say a known *problem*, since it really depends on what other code is running in the system at the time. I just mentioned that because I've seen 8K stacks help in certain situations. 8K stacks are still the default configuration option in the vanilla kernel. Some distributions (CentOS / Fedora) have switched to 4K by default because they help with memory consumption in highly threaded environments like web servers.

For the most part, kernel panics and oops are best handled on a case by case basis with Digium's tech support department since each case is unique.

Quote:

2) The "branch" you mention below - are "fixes" from it in Any current *
release ?


Not that I'm aware of...

Cheers,
Shaun

ex.vitorino at gmail.com
Posted: Wed Apr 16, 2008 10:11 am

On Wed, Apr 16, 2008 at 3:26 PM, Matthew Fredrickson <creslin at digium.com> wrote:
Quote:

The softlockup indicator should be benign. It gets called when loaded
the firmware for the part since the firmware image is so large and it
takes a long time to load. However, I might have a fix for you.

Can you try my stack reduction branch at:

https://origsvn.digium.com/svn/zaptel/team/mattf/zaptel-1.4-stackcleanup

If that does not work, please contact me directly and I will work with
you to get a resolution.


Matt,

Thanks for your feedback. We've already tested the following
branch as per Shaun's suggestion, without getting a different
behaviour (see today's earlier email to the list):

http://svn.digium.com/view/zaptel/team/mattf/zaptel-1.4-stackcleanup/

Question:

- The URL you suggest is very similar; are we talking about
a different "stackcleanup" branch?

We are now in the middle of rebuilding a kernel without 4K stacks
so as to try it with 1.4.10, the branch Shaun suggested, 1.4.9.2,
and the branch you mention, if it is in fact different from
Shaun's.

We await your confirmation and will post the results for the
non-4K-stack kernel later today.

Cheers,
--
exvito

creslin at digium.com
Posted: Wed Apr 16, 2008 10:26 am    Post subject: [asterisk-users] zaptel 1.4.10 regression with TE220B on Pro

Shaun Ruffell wrote:
Quote:
Hi Al,

Al Baker wrote:
Quote:
Shaun - Could you clarify your post a bit ?

1 - Are "4K" stacks a known problem?
a) If so, is it known to be a problem on any specific Linux distro?
b) Should ALL installations check for this PRIOR to doing an
Asterisk install?

I wouldn't really say a known *problem*, since it really depends on what other code is running in the system at the time. I just mentioned that because I've seen 8K stacks help in certain situations. 8K stacks are still the default configuration option in the vanilla kernel. Some distributions (CentOS / Fedora) have switched to 4K by default because it helps with memory consumption in highly threaded environments like web servers.

For the most part, kernel panics and oopses are best handled on a case-by-case basis with Digium's tech support department, since each case is unique.


In this case, it looks like his kernel is compiled with the softlockup
detector code and it is falsely triggering. Disabling that should
remove the warning message at the very least.
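
For anyone who wants to check the two kernel options discussed in this thread before rebuilding, here is a minimal Python sketch. It assumes the option names are CONFIG_4KSTACKS and CONFIG_DETECT_SOFTLOCKUP (the names used by 2.6.18-era i386 kernels, as far as I know) and that the distribution installs the build config as /boot/config-<release>, which is the usual CentOS layout. The function names and the sample text are made up for illustration.

```python
import platform
import re


def kernel_option_states(config_text, options):
    """Map each requested option to its value ('y', 'm', ...), 'n', or 'absent'."""
    states = {name: "absent" for name in options}
    for raw in config_text.splitlines():
        line = raw.strip()
        # Enabled options look like: CONFIG_FOO=y
        enabled = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if enabled and enabled.group(1) in states:
            states[enabled.group(1)] = enabled.group(2)
            continue
        # Disabled options look like: # CONFIG_FOO is not set
        disabled = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if disabled and disabled.group(1) in states:
            states[disabled.group(1)] = "n"
    return states


def read_running_kernel_config():
    """Read the config of the running kernel (assumed path, see note above)."""
    path = "/boot/config-" + platform.release()
    with open(path) as handle:
        return handle.read()


# Hypothetical excerpt, only to show the two line formats being parsed.
SAMPLE_CONFIG = """\
CONFIG_4KSTACKS=y
CONFIG_DETECT_SOFTLOCKUP=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
"""

WANTED = ["CONFIG_4KSTACKS", "CONFIG_DETECT_SOFTLOCKUP"]
sample_states = kernel_option_states(SAMPLE_CONFIG, WANTED)
```

On a real system you would call kernel_option_states(read_running_kernel_config(), WANTED) instead of using the sample text. A value of "y" for CONFIG_4KSTACKS means the kernel was built with 4K stacks, and "y" for CONFIG_DETECT_SOFTLOCKUP means the softlockup detector is compiled in.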

Quote:
Quote:
2) The "branch" you mention below - are "fixes" from it in any current *
release?

They will be in the next Zaptel release.

--
Matthew Fredrickson
Software/Firmware Engineer
Digium, Inc.

tzafrir.cohen at xorco...
Posted: Wed Apr 16, 2008 10:40 am    Post subject: [asterisk-users] zaptel 1.4.10 regression with TE220B on Pro

On Wed, Apr 16, 2008 at 04:11:52PM +0100, Ex Vito wrote:
Quote:
On Wed, Apr 16, 2008 at 3:26 PM, Matthew Fredrickson <creslin at digium.com> wrote:

[snip]

Quote:
Quote:
Can you try my stack reduction branch at:

https://origsvn.digium.com/svn/zaptel/team/mattf/zaptel-1.4-stackcleanup

If that does not work, please contact me directly and I will work with
you to get a resolution.


Matt,

Thanks for your feedback. We've already tested the following
branch as per Shaun's suggestion, without getting a different
behaviour (see today's earlier email to the list):

http://svn.digium.com/view/zaptel/team/mattf/zaptel-1.4-stackcleanup/

Question:

- The URL you suggest is very similar; are we talking about
a different "stackcleanup" branch?

Try:

http://svn.digium.com/svn/zaptel/team/mattf/zaptel-1.4-stackcleanup/

Try the second one (svn.digium.com), actually. All point to the same
place. But origsvn does not allow anonymous access, and /view is the
viewcvs/viewsvn web interface.
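
To make the "same place" point concrete, here is a small Python sketch that strips the host and the access prefix from the three URLs posted in this thread and compares what is left. It assumes the first path component (/svn or /view) only selects the access method and that the remainder is the repository path; branch_path is a made-up helper name.

```python
from urllib.parse import urlparse


def branch_path(url):
    """Return the repository path with the /svn or /view prefix removed."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if parts and parts[0] in ("svn", "view"):
        parts = parts[1:]
    return "/".join(parts)


# The three URLs exactly as they were posted in this thread.
URLS = [
    "https://origsvn.digium.com/svn/zaptel/team/mattf/zaptel-1.4-stackcleanup",
    "http://svn.digium.com/view/zaptel/team/mattf/zaptel-1.4-stackcleanup/",
    "http://svn.digium.com/svn/zaptel/team/mattf/zaptel-1.4-stackcleanup/",
]

unique_paths = {branch_path(u) for u in URLS}
same_branch = len(unique_paths) == 1
```

This only compares the strings; it does not contact the servers, so it says nothing about which URL accepts anonymous checkouts.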

--
Tzafrir Cohen
icq#16849755 jabber:tzafrir.cohen at xorcom.com
+972-50-7952406 mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com iax:guest at local.xorcom.com/tzafrir

ex.vitorino at gmail.com
Posted: Wed Apr 16, 2008 10:41 am    Post subject: [asterisk-users] zaptel 1.4.10 regression with TE220B on Pro

On Wed, Apr 16, 2008 at 4:20 PM, Matthew Fredrickson <creslin at digium.com> wrote:
Quote:


One thing I would also like to see is your kernel .config file. Another
thing that would certainly remove that warning is to disable the kernel
softlockup detector, which is giving a false lockup warning in this case.
I believe it's under the "Kernel hacking" configuration menu if you are
using menuconfig.


Up to now we've been running the stock CentOS kernel: 2.6.18-53.1.14.el5
The .config is publicly available, but we can forward it to you should you
prefer.

The kernel we're now building is based on the stock CentOS kernel with only
the 4K stacks option disabled. (It is taking quite a while, but it has also
been quite a few years since we last built custom kernels... since
the 2.0.3x days?)

Please confirm if the SVN branch you suggested is the same or
different from the one Shaun suggested yesterday which we already
tested.

Thanks,
--
exvito

ex.vitorino at gmail.com
Posted: Wed Apr 16, 2008 10:55 am    Post subject: [asterisk-users] zaptel 1.4.10 regression with TE220B on Pro

Quote:
Quote:

http://svn.digium.com/view/zaptel/team/mattf/zaptel-1.4-stackcleanup/

Question:

- The URL you suggest is very similar; are we talking about
a different "stackcleanup" branch?

Try:

http://svn.digium.com/svn/zaptel/team/mattf/zaptel-1.4-stackcleanup/

Try the second one (svn.digium.com), actually. All point to the same
place. But origsvn does not allow anonymous access, and /view is the
viewcvs/viewsvn web interface.


So Matt's suggestion is the same as Shaun's... Which we already tested
with no different results, correct ?
--
exvito

All times are GMT - 5 Hours