* [kernel panic @ reboot] 2.6.0-test10-mm1
From: Vince @ 2003-11-26 16:51 UTC
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 964 bytes --]

Hi,

   I get a kernel panic every time I reboot my system on all
recent 2.6.0-testX kernels (the CPU is an Athlon XP 1800+, and the kernel is
compiled with preempt and ACPI; the config and dmesg are attached).

   This time I got tired of seeing it and finally installed kmsgdump
to collect some data, which is available in messages.txt (*).

In this particular case, X was not loaded: I just logged in on the console
and rebooted. No nvidia or other binary driver was loaded. Any hints
on tracking down this bug would be appreciated (I can compile my kernel with
additional debugging options if required).

Regards,

Vincent

(*) BTW: something like kmsgdump should really be included in mainline
kernels one day. A serial console is often not an option for people like
me who run a single machine; writing down an oops by hand is no fun, and
often, even if the crash does not occur while running an X server, the
beginning of the oops is no longer visible on the console.
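
Since the dump in messages.txt consists mostly of the same few frames
repeated over and over, it may be easier to read once the repeats are
folded. A minimal sketch in Python, not part of kmsgdump: the helper names
parse_frames and collapse_cycles are made up for illustration, and the only
assumption about the input is that frame lines look like
"[<c010b769>] die+0x89/0x100".

```python
import re

# Matches call-trace frames such as "[<c010b769>] die+0x89/0x100".
# Requiring "+0x.../0x..." skips the "EIP: 0060:[<...>] Not tainted" lines.
FRAME = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\S+\+0x[0-9a-f]+/0x[0-9a-f]+)")


def parse_frames(lines):
    """Return (address, symbol+offset/size) pairs found in the given lines."""
    frames = []
    for line in lines:
        match = FRAME.search(line)
        if match:
            frames.append((match.group(1), match.group(2)))
    return frames


def collapse_cycles(frames, max_period=16):
    """Fold immediately repeating blocks of frames.

    Returns a list of (block, repeat_count) tuples. At each position the
    period covering the most frames wins; ties keep the shorter period.
    """
    folded = []
    i = 0
    n = len(frames)
    while i < n:
        best_period, best_repeats = 1, 1
        for period in range(1, max_period + 1):
            if i + period > n:
                break
            block = frames[i:i + period]
            repeats = 1
            while frames[i + repeats * period:i + (repeats + 1) * period] == block:
                repeats += 1
            if repeats > 1 and repeats * period > best_period * best_repeats:
                best_period, best_repeats = period, repeats
        folded.append((frames[i:i + best_period], best_repeats))
        i += best_period * best_repeats
    return folded


def print_folded(folded):
    """Print each block once, followed by its repeat count if above one."""
    for block, repeats in folded:
        for address, symbol in block:
            print(f" [<{address}>] {symbol}")
        if repeats > 1:
            print(f"   (previous {len(block)} frames repeated {repeats} times)")
```

To try it on a saved dump, something like
print_folded(collapse_cycles(parse_frames(open("messages.txt")))) should do,
with messages.txt being the attachment below.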

[-- Attachment #2: messages.txt --]
[-- Type: text/plain, Size: 16384 bytes --]

tainted VLI
<4>EFLAGS: 00010086
<4>EIP is at show_registers+0xf3/0x1e0
<4>eax: ffedffe8   ebx: cc91a4a0   ecx: ffee02ea   edx: cc91a000
<4>esi: cc91a46c   edi: 00000068   ebp: 00000001   esp: cc91a370
<4>ds: 007b   es: 007b   ss: 0068
<1>Unable to handle kernel paging request at virtual address ffee0070
<4> printing eip:
<4>c010b543
<1>*pde = 00002067
<1>*pte = 00000000
<4>Oops: 0000 [#49]
<4>PREEMPT 
<4>CPU:    0
<4>EIP:    0060:[<c010b543>]    Not tainted VLI
<4>EFLAGS: 00010086
<4>EIP is at show_registers+0xf3/0x1e0
<4>eax: ffedffe8   ebx: cc91a370   ecx: ffee02ea   edx: cc91a000
<4>esi: cc91a33c   edi: 00000068   ebp: 00000001   esp: cc91a240
<4>ds: 007b   es: 007b   ss: 0068
<4>Process B\x15‰D$\x14\x0f¾B\x14‰D$\x10‹B\f‰\f$ÇD$\b\b (pid: 608471316, threadinfo=cc91a000 task=c01392ec)
<4>Stack: c02b9580 0000007b 0000007b cc91a000 ffedffe8 00010086 cc91a33c cc91a000 
<4>       c011cc80 ffedffe8 c010b769 cc91a33c c02bf912 00000000 00000030 00000000 
<4>       00000000 c011ce56 c02bf912 cc91a33c 00000000 ffffffff c02b2640 000002e0 
<4>Call Trace:
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c0122bb7>] __call_console_drivers+0x57/0x60
<4> [<c0122cb5>] call_console_drivers+0x65/0x120
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c010b543>] show_registers+0xf3/0x1e0
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c010b769>] die+0x89/0x100
<4> [<c011ce56>] do_page_fault+0x1d6/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4>
<4>Code: 0f b7 46 1c c7 04 24 80 95 2b c0 89 44 24 04 e8 b4 78 01 00 ba 00 e0 ff ff 21 e2 8b 02 89 54 24 0c 8d 88 02 03 00 00 89 44 24 10 <8b> 80 88 00 00 00 89 4c 24 04 c7 04 24 a0 95 2b c0 89 44 24 08 
<4> <1>Unable to handle kernel paging request at virtual address 4f6269d4
<4> printing eip:
<4>c011ccc4
<1>*pde = 00000000
<4>Oops: 0000 [#50]
<4>PREEMPT 
<4>CPU:    0
<4>EIP:    0060:[<c011ccc4>]    Not tainted VLI
<4>EFLAGS: 00010893
<4>EIP is at do_page_fault+0x44/0x504
<4>eax: cc900000   ebx: 00000000   ecx: 0000007b   edx: 00000000
<4>esi: 00000000   edi: c011cc80   ebp: 4f62696c   esp: cc900074
<4>ds: 007b   es: 007b   ss: 0068
<4>Process  (pid: 1680683566, threadinfo=cc8fe000 task=c4831e74)
<4>Stack: 00000000 00000000 00000000 00000000 00000000 4f6269d4 00000000 00000000 
<4>       00030001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
<4>       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
<4>Call Trace:
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c011ccc4>] do_page_fault+0x44/0x504
<4> [<c011cc80>] do_page_fault+0x0/0x504
<4> [<c02a92af>] error_code+0x2f/0x38
<4>
<4>Code: 42 30 00 02 02 00 74 01 fb b8 00 e0 ff ff 21 e0 81 7c 24 14 ff ff ff bf 8b 28 c7 44 24 20 01 00 03 00 0f 87 68 04 00 00 8b 50 14 <8b> 5d 68 8b 00 81 e2 ff ff ff fb 8b 40 14 f7 d0 c1 e8 1f 39 c2 
<4> <0>Kernel panic: Fatal exception in interrupt
<0>In interrupt handler - not syncing
<4> <0>Dumping messages in 0 seconds : last chance for Alt-SysRq...
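
The register dumps and Code: bytes above can be cross-checked with a little
arithmetic. A minimal sketch in Python, with several assumptions: the
decoding of the marked bytes ("<8b> 80 88 00 00 00" as a 32-bit load from
eax+0x88, "<8b> 5d 68" as a load from ebp+0x68) is my own reading of the x86
encoding, the 8 KiB stack size is inferred from the 0xffffe000 mask visible
in both Code: lines, and the helper names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def stack_base(esp, stack_size=8192):
    """Round a 32-bit stack pointer down to the start of its stack area.

    stack_size=8192 is an assumption taken from the 0xffffe000 mask
    that appears in the Code: bytes of both oopses.
    """
    return esp & ~(stack_size - 1) & MASK32


def effective_address(base, displacement):
    """32-bit base + displacement, wrapping like the CPU would."""
    return (base + displacement) & MASK32


# Oops #49: esp cc91a240, edx cc91a000, eax ffedffe8, ecx ffee02ea,
# fault at ffee0070.
assert stack_base(0xCC91A240) == 0xCC91A000               # matches edx
assert effective_address(0xFFEDFFE8, 0x302) == 0xFFEE02EA  # matches ecx
assert effective_address(0xFFEDFFE8, 0x88) == 0xFFEE0070   # fault address

# Oops #50: esp cc900074, eax cc900000, ebp 4f62696c, fault at 4f6269d4.
assert stack_base(0xCC900074) == 0xCC900000               # matches eax
assert effective_address(0x4F62696C, 0x68) == 0x4F6269D4   # fault address

print("register values and fault addresses are mutually consistent")
```

If that reading is right, each faulting address is just the value read from
the bottom of the current stack area plus a small offset, which would point
at that value being bogus rather than at the instruction itself. It does not
say what overwrote it.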

[-- Attachment #3: dmesg --]
[-- Type: text/plain, Size: 15325 bytes --]

rovided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000000fff0000 (usable)
 BIOS-e820: 000000000fff0000 - 000000000fff3000 (ACPI NVS)
 BIOS-e820: 000000000fff3000 - 0000000010000000 (ACPI data)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
255MB LOWMEM available.
found SMP MP-table at 000f4d30
hm, page 000f4000 reserved twice.
hm, page 000f5000 reserved twice.
hm, page 000f0000 reserved twice.
hm, page 000f1000 reserved twice.
On node 0 totalpages: 65520
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 61424 pages, LIFO batch:14
  HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v000 GBT                                       ) @ 0x000f66a0
ACPI: RSDT (v001 GBT    AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x0fff3000
ACPI: FADT (v001 GBT    AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x0fff3040
ACPI: MADT (v001 GBT    AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x0fff6cc0
ACPI: DSDT (v001 GBT    AWRDACPI 0x00001000 MSFT 0x0100000c) @ 0x00000000
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:6 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] polarity[0x0] trigger[0x0] lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
IOAPIC[0]: Assigned apic_id 2
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, IRQ 0-23
ACPI: INT_SRC_OVR (bus[0] irq[0x0] global_irq[0x2] polarity[0x0] trigger[0x0])
ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x3] trigger[0x3])
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Building zonelist for node : 0
Kernel command line: root=/dev/hda1
current: c02eea60
current->thread_info: c0356000
Initializing CPU#0
PID hash table entries: 1024 (order 10: 8192 bytes)
Detected 1540.938 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Memory: 256044k/262080k available (1701k kernel code, 5308k reserved, 689k data, 132k init, 0k highmem)
zapping low mappings.
Calibrating delay loop... 3031.04 BogoMIPS
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU:     After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000
CPU:     After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU:     After all inits, caps: 0383fbff c1c3fbff 00000000 00000020
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Athlon(tm) XP 1800+ stepping 02
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=0x31 pin1=2 pin2=-1
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 1540.0195 MHz.
..... host bus clock speed is 267.0860 MHz.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfa000, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20031002
IOAPIC[0]: Set PCI routing entry (2-9 -> 0x71 -> IRQ 9 Mode:1 Active:1)
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 *7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 1 *3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 1 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 1 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [ALKA] (IRQs 20)
ACPI: PCI Interrupt Link [ALKB] (IRQs 21)
ACPI: PCI Interrupt Link [ALKC] (IRQs 22)
ACPI: PCI Interrupt Link [ALKD] (IRQs 23)
Linux Plug and Play Support v0.97 (c) Adam Belay
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00faa60
PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0xaa90, dseg 0xf0000
PnPBIOS: 14 nodes reported by PnP BIOS; 14 recorded by driver
IOAPIC[0]: Set PCI routing entry (2-17 -> 0xa9 -> IRQ 17 Mode:1 Active:1)
00:00:09[A] -> 2-17 -> IRQ 17
IOAPIC[0]: Set PCI routing entry (2-18 -> 0xb1 -> IRQ 18 Mode:1 Active:1)
00:00:09[B] -> 2-18 -> IRQ 18
IOAPIC[0]: Set PCI routing entry (2-19 -> 0xb9 -> IRQ 19 Mode:1 Active:1)
00:00:09[C] -> 2-19 -> IRQ 19
IOAPIC[0]: Set PCI routing entry (2-16 -> 0xc1 -> IRQ 16 Mode:1 Active:1)
00:00:09[D] -> 2-16 -> IRQ 16
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
_CRS returns NULL! Using IRQ 21 for device (PCI Interrupt Link [ALKB]).
ACPI: PCI Interrupt Link [ALKB] enabled at IRQ 21
IOAPIC[0]: Set PCI routing entry (2-21 -> 0xc9 -> IRQ 21 Mode:1 Active:1)
00:00:10[A] -> 2-21 -> IRQ 21
Pin 2-21 already programmed
Pin 2-21 already programmed
Pin 2-21 already programmed
_CRS returns NULL! Using IRQ 20 for device (PCI Interrupt Link [ALKA]).
ACPI: PCI Interrupt Link [ALKA] enabled at IRQ 20
IOAPIC[0]: Set PCI routing entry (2-20 -> 0xd1 -> IRQ 20 Mode:1 Active:1)
00:00:11[A] -> 2-20 -> IRQ 20
Pin 2-21 already programmed
_CRS returns NULL! Using IRQ 22 for device (PCI Interrupt Link [ALKC]).
ACPI: PCI Interrupt Link [ALKC] enabled at IRQ 22
IOAPIC[0]: Set PCI routing entry (2-22 -> 0xd9 -> IRQ 22 Mode:1 Active:1)
00:00:11[C] -> 2-22 -> IRQ 22
_CRS returns NULL! Using IRQ 23 for device (PCI Interrupt Link [ALKD]).
ACPI: PCI Interrupt Link [ALKD] enabled at IRQ 23
IOAPIC[0]: Set PCI routing entry (2-23 -> 0xe1 -> IRQ 23 Mode:1 Active:1)
00:00:11[D] -> 2-23 -> IRQ 23
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-23 already programmed
Pin 2-23 already programmed
Pin 2-23 already programmed
Pin 2-23 already programmed
number of MP IRQ sources: 15.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.......    : Delivery Type: 0
.......    : LTS          : 0
.... register #01: 00178003
.......     : max redirection entries: 0017
.......     : PRQ implemented: 1
.......     : IO APIC version: 0003
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:   
 00 000 00  1    0    0   0   0    0    0    00
 01 001 01  0    0    0   0   0    1    1    39
 02 001 01  0    0    0   0   0    1    1    31
 03 001 01  0    0    0   0   0    1    1    41
 04 001 01  0    0    0   0   0    1    1    49
 05 001 01  0    0    0   0   0    1    1    51
 06 001 01  0    0    0   0   0    1    1    59
 07 001 01  0    0    0   0   0    1    1    61
 08 001 01  0    0    0   0   0    1    1    69
 09 001 01  0    1    0   1   0    1    1    71
 0a 001 01  0    0    0   0   0    1    1    79
 0b 001 01  0    0    0   0   0    1    1    81
 0c 001 01  0    0    0   0   0    1    1    89
 0d 001 01  0    0    0   0   0    1    1    91
 0e 001 01  0    0    0   0   0    1    1    99
 0f 001 01  0    0    0   0   0    1    1    A1
 10 001 01  1    1    0   1   0    1    1    C1
 11 001 01  1    1    0   1   0    1    1    A9
 12 001 01  1    1    0   1   0    1    1    B1
 13 001 01  1    1    0   1   0    1    1    B9
 14 001 01  1    1    0   1   0    1    1    D1
 15 001 01  1    1    0   1   0    1    1    C9
 16 001 01  1    1    0   1   0    1    1    D9
 17 001 01  1    1    0   1   0    1    1    E1
IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9-> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 0:16
IRQ17 -> 0:17
IRQ18 -> 0:18
IRQ19 -> 0:19
IRQ20 -> 0:20
IRQ21 -> 0:21
IRQ22 -> 0:22
IRQ23 -> 0:23
.................................... done.
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off'
Machine check exception polling timer started.
ikconfig 0.7 with /proc/config*
devfs: v1.22 (20021013) Richard Gooch (rgooch@atnf.csiro.au)
devfs: boot_options: 0x1
Initializing Cryptographic API
PCI: Via IRQ fixup for 0000:00:10.0, from 7 to 5
PCI: Via IRQ fixup for 0000:00:10.1, from 3 to 5
PCI: Via IRQ fixup for 0000:00:10.2, from 10 to 5
ACPI: Power Button (FF) [PWRF]
ACPI: Processor [CPU0] (supports C1, 2 throttling states)
pty: 256 Unix98 ptys configured
Real Time Clock Driver v1.12
8139too Fast Ethernet driver 0.9.26
eth0: RealTek RTL8139 at 0xd0807000, 00:20:ed:68:3a:db, IRQ 18
eth0:  Identified 8139 chip type 'RTL-8100B/8139D'
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PDC20276: IDE controller at PCI slot 0000:00:0f.0
PDC20276: chipset revision 1
PDC20276: 100% native mode on irq 19
    ide2: BM-DMA at 0xd000-0xd007, BIOS settings: hde:pio, hdf:pio
    ide3: BM-DMA at 0xd008-0xd00f, BIOS settings: hdg:pio, hdh:pio
hde: LITE-ON LTR-32123S, ATAPI CD/DVD-ROM drive
Using anticipatory io scheduler
ide2 at 0xc000-0xc007,0xc402 on irq 19
VP_IDE: IDE controller at PCI slot 0000:00:11.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt8235 (rev 00) IDE UDMA133 controller on pci0000:00:11.1
    ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:DMA, hdd:DMA
hda: IC35L080AVVA07-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: IBM-DTLA-307045, ATA DISK drive
hdd: IBM-DHEA-36480, ATA DISK drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=65535/16/63, UDMA(100)
 /dev/ide/host0/bus0/target0/lun0: p1 p2 p3 p4
hdc: max request size: 128KiB
hdc: 90069840 sectors (46115 MB) w/1916KiB Cache, CHS=65535/16/63, UDMA(100)
 /dev/ide/host0/bus1/target0/lun0: p1 p2
hdd: max request size: 128KiB
hdd: 12692736 sectors (6498 MB) w/476KiB Cache, CHS=12592/16/63, UDMA(33)
 /dev/ide/host0/bus1/target1/lun0: p1
hde: ATAPI 40X CD-ROM CD-R/RW drive, 1984kB Cache
Uniform CD-ROM driver Revision: 3.12
mice: PS/2 mouse device common for all mice
input: ImPS/2 Logitech Wheel Mouse on isa0060/serio1
serio: i8042 AUX port at 0x60,0x64 irq 12
input: AT Translated Set 2 keyboard on isa0060/serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 32768)
NET: Registered protocol family 1
NET: Registered protocol family 15
NET: Registered protocol family 8
NET: Registered protocol family 20
ACPI: (supports S0 S1 S4 S5)
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device hda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
reiserfs: checking transaction log (hda1) for (hda1)
reiserfs: replayed 63 transactions in 2 seconds
Using r5 hash to sort names
VFS: Mounted root (reiserfs filesystem) readonly.
Mounted devfs on /dev
Freeing unused kernel memory: 132k freed
Adding 265064k swap on /dev/hda2.  Priority:-1 extents:1
NTFS driver 2.1.5 [Flags: R/W MODULE].
NTFS volume version 3.0.
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device hda4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
reiserfs: checking transaction log (hda4) for (hda4)
Using r5 hash to sort names
cdrom: open failed.
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device hdd1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
reiserfs: checking transaction log (hdd1) for (hdd1)
Using r5 hash to sort names
drivers/usb/core/usb.c: registered new driver usbfs
drivers/usb/core/usb.c: registered new driver hub
ehci_hcd 0000:00:10.3: EHCI Host Controller
ehci_hcd 0000:00:10.3: irq 21, pci mem d0ac4000
ehci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:10.3: USB 2.0 enabled, EHCI 1.00, driver 2003-Jun-13
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
drivers/usb/host/uhci-hcd.c: USB Universal Host Controller Interface driver v2.1
uhci_hcd 0000:00:10.0: UHCI Host Controller
uhci_hcd 0000:00:10.0: irq 21, io base 0000d400
uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
uhci_hcd 0000:00:10.1: UHCI Host Controller
uhci_hcd 0000:00:10.1: irq 21, io base 0000d800
uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
uhci_hcd 0000:00:10.2: UHCI Host Controller
uhci_hcd 0000:00:10.2: irq 21, io base 0000dc00
uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 4
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
atmsvc: no signaling demon
hub 2-0:1.0: new USB device on port 1, assigned address 2
hub 2-0:1.0: new USB device on port 2, assigned address 3
eth0: link down
blk: queue c1382c00, I/O limit 4095Mb (mask 0xffffffff)
blk: queue c1382400, I/O limit 4095Mb (mask 0xffffffff)
drivers/usb/core/usb.c: registered new driver speedtch
via82xx: Assuming DXS channels with 48k fixed sample rate.
         Please try dxs_support=1 option and report if it works on your machine.
PCI: Setting latency timer of device 0000:00:11.5 to 64
NET: Registered protocol family 17
usb 2-2: bulk timeout on ep5in
usbfs: USBDEVFS_BULK failed dev 3 ep 0x85 len 512 ret -110
ip_tables: (C) 2000-2002 Netfilter core team
ip_conntrack version 2.1 (2047 buckets, 16376 max) - 300 bytes per conntrack
usbfs: process 1094 (modem_run) did not claim interface 0 before use
HTB init, kernel part version 3.13

[-- Attachment #4: config-2.6.0-test10-mm1 --]
[-- Type: text/plain, Size: 26056 bytes --]

#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_BROKEN_ON_SMP=y

#
# General setup
#
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=14
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MELAN is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
# CONFIG_X86_4G is not set
# CONFIG_X86_SWITCH_PAGETABLES is not set
# CONFIG_X86_4G_VM_LAYOUT is not set
# CONFIG_X86_UACCESS_INDIRECT is not set
# CONFIG_X86_HIGH_ENTRY is not set
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_HAVE_DEC_LOCK=y

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
CONFIG_SOFTWARE_SUSPEND=y
# CONFIG_PM_DISK is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
# CONFIG_ACPI_AC is not set
# CONFIG_ACPI_BATTERY is not set
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_TOSHIBA is not set
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
# CONFIG_ACPI_RELAXED_AML is not set
# CONFIG_X86_PM_TIMER is not set

#
# APM (Advanced Power Management) BIOS Support
#
# CONFIG_APM is not set

#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set

#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_USE_VECTOR is not set
# CONFIG_PCI_LEGACY_PROC is not set
CONFIG_PCI_NAMES=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
CONFIG_HOTPLUG=y

#
# PCMCIA/CardBus support
#
# CONFIG_PCMCIA is not set

#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_MISC=m

#
# Device Drivers
#

#
# Generic Driver Options
#
# CONFIG_FW_LOADER is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_PC_CML1=m
# CONFIG_PARPORT_SERIAL is not set
CONFIG_PARPORT_PC_FIFO=y
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_OTHER is not set
CONFIG_PARPORT_1284=y

#
# Plug and Play support
#
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set

#
# Protocols
#
# CONFIG_ISAPNP is not set
CONFIG_PNPBIOS=y
# CONFIG_PNPBIOS_PROC_FS is not set

#
# Block devices
#
CONFIG_BLK_DEV_FD=m
CONFIG_PARIDE=m
CONFIG_PARIDE_PARPORT=m

#
# Parallel IDE high-level drivers
#
# CONFIG_PARIDE_PD is not set
# CONFIG_PARIDE_PCD is not set
# CONFIG_PARIDE_PF is not set
# CONFIG_PARIDE_PT is not set
# CONFIG_PARIDE_PG is not set

#
# Parallel IDE protocol modules
#
# CONFIG_PARIDE_ATEN is not set
# CONFIG_PARIDE_BPCK is not set
# CONFIG_PARIDE_BPCK6 is not set
# CONFIG_PARIDE_COMM is not set
# CONFIG_PARIDE_DSTR is not set
# CONFIG_PARIDE_FIT2 is not set
# CONFIG_PARIDE_FIT3 is not set
# CONFIG_PARIDE_EPAT is not set
# CONFIG_PARIDE_EPIA is not set
# CONFIG_PARIDE_FRIQ is not set
# CONFIG_PARIDE_FRPW is not set
# CONFIG_PARIDE_KBIC is not set
# CONFIG_PARIDE_KTTI is not set
# CONFIG_PARIDE_ON20 is not set
# CONFIG_PARIDE_ON26 is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_RAM=m
CONFIG_BLK_DEV_RAM_SIZE=4096
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_LBD is not set

#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
# CONFIG_IDEDISK_MULTI_MODE is not set
# CONFIG_IDEDISK_STROKE is not set
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
CONFIG_IDE_TASK_IOCTL=y
CONFIG_IDE_TASKFILE_IO=y

#
# IDE chipset support/bugfixes
#
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_IDEDMA_PCI_WIP is not set
CONFIG_BLK_DEV_ADMA=y
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
CONFIG_BLK_DEV_PDC202XX_NEW=y
CONFIG_PDC202XX_FORCE=y
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_DMA_NONPCI is not set
# CONFIG_BLK_DEV_HD is not set

#
# SCSI device support
#
# CONFIG_SCSI is not set

#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set

#
# Fusion MPT device support
#

#
# IEEE 1394 (FireWire) support (EXPERIMENTAL)
#
# CONFIG_IEEE1394 is not set

#
# I2O device support
#
# CONFIG_I2O is not set

#
# Networking support
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=m
CONFIG_PACKET_MMAP=y
# CONFIG_NETLINK_DEV is not set
CONFIG_UNIX=y
CONFIG_NET_KEY=y
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_ARPD is not set
CONFIG_INET_ECN=y
CONFIG_SYN_COOKIES=y
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set

#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
# CONFIG_IPV6 is not set
# CONFIG_DECNET is not set
# CONFIG_BRIDGE is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set

#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IRC=m
# CONFIG_IP_NF_TFTP is not set
# CONFIG_IP_NF_AMANDA is not set
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
# CONFIG_IP_NF_TARGET_SAME is not set
# CONFIG_IP_NF_NAT_LOCAL is not set
# CONFIG_IP_NF_NAT_SNMP_BASIC is not set
CONFIG_IP_NF_NAT_IRC=m
CONFIG_IP_NF_NAT_FTP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
# CONFIG_IP_NF_ARPTABLES is not set
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
CONFIG_XFRM=y
# CONFIG_XFRM_USER is not set

#
# SCTP Configuration (EXPERIMENTAL)
#
CONFIG_IPV6_SCTP__=y
# CONFIG_IP_SCTP is not set
CONFIG_ATM=y
CONFIG_ATM_CLIP=m
CONFIG_ATM_CLIP_NO_ICMP=y
CONFIG_ATM_LANE=m
CONFIG_ATM_MPOA=m
CONFIG_ATM_BR2684=m
# CONFIG_ATM_BR2684_IPFILTER is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set

#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_CSZ=m
# CONFIG_NET_SCH_ATM is not set
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_POLICE=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
CONFIG_NETDEVICES=y

#
# ARCnet devices
#
# CONFIG_ARCNET is not set
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_NET_SB1000 is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set

#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
CONFIG_8139TOO=y
# CONFIG_8139TOO_PIO is not set
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_8139_OLD_RX_RESET is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set

#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SIS190 is not set
# CONFIG_SK98LIN is not set
# CONFIG_TIGON3 is not set

#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPPOE=m
CONFIG_PPPOATM=m
CONFIG_SLIP=m
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLIP_SMART=y
# CONFIG_SLIP_MODE_SLIP6 is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Token Ring devices
#
# CONFIG_TR is not set
# CONFIG_RCPCI is not set
# CONFIG_SHAPER is not set
# CONFIG_NET_POLL_CONTROLLER is not set

#
# Wan interfaces
#
# CONFIG_WAN is not set

#
# ATM drivers
#
# CONFIG_ATM_TCP is not set
# CONFIG_ATM_LANAI is not set
# CONFIG_ATM_ENI is not set
# CONFIG_ATM_FIRESTREAM is not set
# CONFIG_ATM_ZATM is not set
# CONFIG_ATM_NICSTAR is not set
# CONFIG_ATM_IDT77252 is not set
# CONFIG_ATM_AMBASSADOR is not set
# CONFIG_ATM_HORIZON is not set
# CONFIG_ATM_IA is not set
# CONFIG_ATM_FORE200E_MAYBE is not set
# CONFIG_ATM_HE is not set

#
# Amateur Radio support
#
# CONFIG_HAMRADIO is not set

#
# IrDA (infrared) support
#
# CONFIG_IRDA is not set

#
# Bluetooth support
#
# CONFIG_BT is not set

#
# ISDN subsystem
#
# CONFIG_ISDN_BOOL is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=y
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set

#
# Input I/O drivers
#
# CONFIG_GAMEPORT is not set
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_PS2_SYNAPTICS is not set
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=m
# CONFIG_SERIAL_8250_ACPI is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=m
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
CONFIG_PRINTER=m
# CONFIG_LP_CONSOLE is not set
# CONFIG_PPDEV is not set
# CONFIG_TIPAR is not set

#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m

#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
# CONFIG_I2C_ALGOPCF is not set

#
# I2C Hardware Bus support
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
# CONFIG_I2C_I801 is not set
# CONFIG_I2C_I810 is not set
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_PHILIPSPAR is not set
# CONFIG_I2C_PIIX4 is not set
# CONFIG_I2C_PROSAVAGE is not set
# CONFIG_I2C_SAVAGE4 is not set
# CONFIG_SCx200_ACB is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
# CONFIG_I2C_VOODOO3 is not set

#
# I2C Hardware Sensors Chip support
#
CONFIG_I2C_SENSOR=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_VIA686A=m
# CONFIG_SENSORS_W83781D is not set

#
# Linux InfraRed Controller
#
CONFIG_LIRC_ATIUSB=m
# CONFIG_LIRC_SUPPORT is not set

#
# Mice
#
# CONFIG_BUSMOUSE is not set
# CONFIG_QIC02_TAPE is not set

#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set

#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_NVRAM is not set
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set

#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=m
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
# CONFIG_AGP_INTEL is not set
# CONFIG_AGP_NVIDIA is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_SWORKS is not set
CONFIG_AGP_VIA=m
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
# CONFIG_HANGCHECK_TIMER is not set

#
# Multimedia devices
#
CONFIG_VIDEO_DEV=m

#
# Video For Linux
#

#
# Video Adapters
#
CONFIG_VIDEO_BT848=m
# CONFIG_VIDEO_BWQCAM is not set
# CONFIG_VIDEO_CQCAM is not set
# CONFIG_VIDEO_W9966 is not set
# CONFIG_VIDEO_CPIA is not set
# CONFIG_VIDEO_SAA5249 is not set
# CONFIG_TUNER_3036 is not set
# CONFIG_VIDEO_STRADIS is not set
# CONFIG_VIDEO_ZORAN is not set
# CONFIG_VIDEO_SAA7134 is not set
# CONFIG_VIDEO_MXB is not set
# CONFIG_VIDEO_DPC is not set
# CONFIG_VIDEO_HEXIUM_ORION is not set
# CONFIG_VIDEO_HEXIUM_GEMINI is not set

#
# Radio Adapters
#
# CONFIG_RADIO_GEMTEK_PCI is not set
# CONFIG_RADIO_MAXIRADIO is not set
# CONFIG_RADIO_MAESTRO is not set

#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
CONFIG_VIDEO_TUNER=m
CONFIG_VIDEO_BUF=m
CONFIG_VIDEO_BTCX=m
CONFIG_VIDEO_IR=m

#
# Graphics support
#
# CONFIG_FB is not set
# CONFIG_VIDEO_SELECT is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_MDA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y

#
# Sound
#
CONFIG_SOUND=m

#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_SEQUENCER=m
# CONFIG_SND_SEQ_DUMMY is not set
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set

#
# Generic devices
#
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set

#
# PCI devices
#
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_SONICVIBES is not set
CONFIG_SND_VIA82XX=m
# CONFIG_SND_VX222 is not set

#
# ALSA USB devices
#
# CONFIG_SND_USB_AUDIO is not set

#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set

#
# USB support
#
CONFIG_USB=m
# CONFIG_USB_DEBUG is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
CONFIG_USB_BANDWIDTH=y
# CONFIG_USB_DYNAMIC_MINORS is not set

#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
# CONFIG_USB_OHCI_HCD is not set
CONFIG_USB_UHCI_HCD=m

#
# USB Device Class drivers
#
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_BLUETOOTH_TTY is not set
# CONFIG_USB_MIDI is not set
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set

#
# SCSI support is needed for USB Storage
#
# CONFIG_USB_STORAGE is not set

#
# USB Human Interface Devices (HID)
#
# CONFIG_USB_HID is not set

#
# USB HID Boot Protocol drivers
#
# CONFIG_USB_KBD is not set
# CONFIG_USB_MOUSE is not set
# CONFIG_USB_AIPTEK is not set
# CONFIG_USB_WACOM is not set
# CONFIG_USB_KBTAB is not set
# CONFIG_USB_POWERMATE is not set
# CONFIG_USB_XPAD is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_SCANNER is not set

#
# USB Multimedia devices
#
# CONFIG_USB_DABUSB is not set
# CONFIG_USB_VICAM is not set
# CONFIG_USB_DSBR is not set
# CONFIG_USB_IBMCAM is not set
# CONFIG_USB_KONICAWC is not set
# CONFIG_USB_OV511 is not set
# CONFIG_USB_PWC is not set
# CONFIG_USB_SE401 is not set
# CONFIG_USB_STV680 is not set

#
# USB Network adaptors
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set

#
# USB port drivers
#
# CONFIG_USB_USS720 is not set

#
# USB Serial Converter support
#
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_TIGL is not set
# CONFIG_USB_AUERSWALD is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_BRLVGER is not set
# CONFIG_USB_LCD is not set
CONFIG_USB_SPEEDTOUCH=m
# CONFIG_USB_TEST is not set
# CONFIG_USB_GADGET is not set

#
# File systems
#
CONFIG_EXT2_FS=m
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=m
CONFIG_EXT3_FS_XATTR=y
# CONFIG_EXT3_FS_POSIX_ACL is not set
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_JBD=m
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=m
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=m
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
# CONFIG_MSDOS_FS is not set
CONFIG_VFAT_FS=m
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
CONFIG_NTFS_RW=y

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_DEVFS_FS=y
CONFIG_DEVFS_MOUNT=y
# CONFIG_DEVFS_DEBUG is not set
CONFIG_DEVPTS_FS=y
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
CONFIG_HFS_FS=m
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set

#
# Network File Systems
#
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
CONFIG_NFS_DIRECTIO=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V4 is not set
# CONFIG_NFSD_TCP is not set
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=m
CONFIG_SUNRPC=m
# CONFIG_SUNRPC_GSS is not set
CONFIG_SMB_FS=m
# CONFIG_SMB_NLS_DEFAULT is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_INTERMEZZO_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y

#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
CONFIG_NLS_CODEPAGE_850=m
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
CONFIG_NLS_ISO8859_1=m
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
CONFIG_NLS_ISO8859_15=m
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set

#
# Profiling support
#
# CONFIG_PROFILING is not set

#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_IOVIRT is not set
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_SPINLINE is not set
# CONFIG_DEBUG_INFO is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_KGDB is not set
# CONFIG_FRAME_POINTER is not set
CONFIG_X86_EXTRA_IRQS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_KMSGDUMP=y
CONFIG_KMSGDUMP_FAT=y
CONFIG_KMSGDUMP_AUTO=y
# CONFIG_KMSGDUMP_SAFE is not set

#
# Security options
#
# CONFIG_SECURITY is not set

#
# Cryptographic options
#
CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=m
CONFIG_CRYPTO_SHA1=m
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_AES=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_TEST=m

#
# Library routines
#
CONFIG_CRC32=y
CONFIG_ZLIB_INFLATE=m
CONFIG_ZLIB_DEFLATE=m
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y

[-- Attachment #5: lsmod --]
[-- Type: text/plain, Size: 2065 bytes --]

Module                  Size  Used by
sch_ingress             3012  1 
cls_u32                 6532  7 
sch_sfq                 4480  3 
sch_htb                22208  1 
ip_conntrack_ftp       71092  0 
ipt_MASQUERADE          2816  2 
iptable_mangle          2112  0 
iptable_nat            19500  2 ipt_MASQUERADE
ipt_REJECT              5312  8 
ipt_limit               1856  29 
ipt_state               1472  4 
ip_conntrack           27248  4 ip_conntrack_ftp,ipt_MASQUERADE,iptable_nat,ipt_state
ipt_LOG                 4928  15 
ipt_ULOG                5672  12 
iptable_filter          2176  1 
ip_tables              15616  9 ipt_MASQUERADE,iptable_mangle,iptable_nat,ipt_REJECT,ipt_limit,ipt_state,ipt_LOG,ipt_ULOG,iptable_filter
binfmt_misc             8072  1 
af_packet              17032  2 
snd_seq_oss            32000  0 
snd_seq_midi_event      6272  1 snd_seq_oss
snd_seq                51600  4 snd_seq_oss,snd_seq_midi_event
snd_pcm_oss            48164  0 
snd_mixer_oss          16768  1 snd_pcm_oss
snd_via82xx            21792  0 
snd_pcm                85668  2 snd_pcm_oss,snd_via82xx
snd_timer              21572  2 snd_seq,snd_pcm
snd_ac97_codec         51716  1 snd_via82xx
snd_page_alloc          8964  2 snd_via82xx,snd_pcm
snd_mpu401_uart         6016  1 snd_via82xx
snd_rawmidi            20384  1 snd_mpu401_uart
snd_seq_device          6600  3 snd_seq_oss,snd_seq,snd_rawmidi
snd                    43492  12 snd_seq_oss,snd_seq_midi_event,snd_seq,snd_pcm_oss,snd_mixer_oss,snd_via82xx,snd_pcm,snd_timer,snd_ac97_codec,snd_mpu401_uart,snd_rawmidi,snd_seq_device
soundcore               7168  1 snd
speedtch               12848  1 
clip                   13668  1 
uhci_hcd               29584  0 
ehci_hcd               21764  0 
usbcore                97244  5 speedtch,uhci_hcd,ehci_hcd
isofs                  31544  0 
zlib_inflate           21184  1 isofs
nls_cp437               5376  1 
vfat                   12672  1 
fat                    40512  1 vfat
nls_iso8859_1           3776  2 
ntfs                   96364  1 


* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 16:51 [kernel panic @ reboot] 2.6.0-test10-mm1 Vince
@ 2003-11-26 17:16 ` Zwane Mwaikambo
  2003-11-26 17:34   ` Vince
  0 siblings, 1 reply; 113+ messages in thread
From: Zwane Mwaikambo @ 2003-11-26 17:16 UTC (permalink / raw)
  To: Vince; +Cc: linux-kernel

On Wed, 26 Nov 2003, Vince wrote:

>    I get a kernel panic each time I'm rebooting my system on all
> recent 2.6.0testx kernels (cpu is an Athlon 1800XP, kernel compiled with
> preempt and ACPI ; config and dmesg is attached).
>
>    This time, I got tired of seeing this and finally installed kmsgdump
> in order to collect some data, available in messages.txt (*)
>
> For my particular case, X was not loaded: I just logged in in console
> mode and did a reboot. No nvidia or other binary driver loaded. Any hint
> on tracking down this bug is appreciated (I can compile my kernel with
> additional debugging options if required).

I can't see the first oops; it looks like it's been spewing them out for a
while, too:

<4>Oops: 0000 [#49]

At the point you're at there really isn't much state left to work from.
Any chance you can get at the logs (if it hit disk) and get the first
oops?
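
One way to check a dump for this is sketched below. It assumes the log is
plain text in the same format as the attached messages.txt, with lines such
as "<4>Oops: 0000 [#49]"; the function names are hypothetical and not part
of kmsgdump or any kernel tool.

```python
import re

# Matches lines like "<4>Oops: 0000 [#49]": a 4-digit hex error code
# followed by the running oops counter in "[#N]".
OOPS_RE = re.compile(r"Oops: ([0-9a-fA-F]{4}) \[#(\d+)\]")


def oops_counters(log_text):
    """Return every oops counter found in the log, in order of appearance."""
    return [int(m.group(2)) for m in OOPS_RE.finditer(log_text)]


def earliest_oops(log_text):
    """Return the smallest oops counter in the log, or None if there is none."""
    counters = oops_counters(log_text)
    return min(counters) if counters else None


# Excerpt in the format of the attached messages.txt.
sample = """<4>Oops: 0000 [#49]
<4>PREEMPT
<4>CPU:    0
"""

if __name__ == "__main__":
    first = earliest_oops(sample)
    if first is None:
        print("no oops found in this log")
    elif first > 1:
        print(f"earliest oops captured is #{first}; oops #1 is not in this log")
    else:
        print("oops #1 is present in this log")
```

If the smallest counter is greater than 1, the first oops had already been
overwritten in the log buffer by the time the dump was taken.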



* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 17:16 ` Zwane Mwaikambo
@ 2003-11-26 17:34   ` Vince
  2003-11-26 17:35     ` Randy.Dunlap
  2003-11-26 17:40     ` Zwane Mwaikambo
  0 siblings, 2 replies; 113+ messages in thread
From: Vince @ 2003-11-26 17:34 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: linux-kernel

Zwane Mwaikambo wrote:
> On Wed, 26 Nov 2003, Vince wrote:
> 
> <4>Oops: 0000 [#49]
> 
> At the point you're at there really isn't much state left to work from.
> Any chance you can get at the logs (if it hit disk) and get the first
> oops?
> 

Nothing ever hits the disk (In interrupt handler - not syncing ...),
which is why I had to install kmsgdump in the first place.
(Side note: a few days ago, I intended to install the lkcd kernel
patches, but gave up because of the time required to correctly
patch/compile/install/set up the kernel and userspace utilities
(no .deb of lkcd-utils available...)).
I suppose I could enlarge the kernel message log size, but the kmsgdump
documentation states:
---------------------------------
If you have changed your messages buffer size (which is 16 kB by 
default), you should modify the size in "include/asm/kmsgdump.h", 
parameter LOG_BUF_LEN. Some people required 32 kB. But you shouldn't 
exceed 60 kB since the dump is done in real mode (16 bits).
For kernel versions 2.5.6x and later, the LOG_BUF_LEN parameter is part
of the kernel .config file (LOG_BUF_SHIFT) so you don't need to modify
it at all.
---------------------------------

...so if you think 60 kB would be enough to catch the first oops -- or if 
the doc is outdated -- I can try this...


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 17:34   ` Vince
@ 2003-11-26 17:35     ` Randy.Dunlap
  2003-11-26 17:40     ` Zwane Mwaikambo
  1 sibling, 0 replies; 113+ messages in thread
From: Randy.Dunlap @ 2003-11-26 17:35 UTC (permalink / raw)
  To: Vince; +Cc: zwane, linux-kernel

On Wed, 26 Nov 2003 18:34:34 +0100 Vince <fuzzy77@free.fr> wrote:

| Zwane Mwaikambo wrote:
| > On Wed, 26 Nov 2003, Vince wrote:
| > 
| > <4>Oops: 0000 [#49]
| > 
| > At the point you're at there really isn't much state left to work from.
| > Any chance you can get at the logs (if it hit disk) and get the first
| > oops?
| > 
| 
| Nothing ever hits the disk (In interrupt handler - not syncing ...), 
| that's the reason why I had to install kmsgdump in the first place.
| (Sidenote: a few days ago, I had the intent to install the lkcd kernel 
| patches, but gave up because of the time required to 
| patch/compile/install/setup correctly the kernel and userspace utilities 
| (not .deb of lkcd-utils available...)).
| I suppose I could enlarge the kernel message log size, but the kmsgdup 
| documentation states:
| ---------------------------------
| If you have changed your messages buffer size (which is 16 kB by 
| default), you should modify the size in "include/asm/kmsgdump.h", 
| parameter LOG_BUF_LEN. Some people required 32 kB. But you shouldn't 
| exceed 60 kB since the dump is done in real mode (16 bits).
| For kernel versions 2.5.6x and later, the LOG_BUF_LEN parameter is part
| of the kernel .config file (LOG_BUF_SHIFT) so you don't need to modify
| it at all.
| ---------------------------------
| 
| ...so I you think 60kB would be enough to catch the first oops -- or if 
| the doc is outdated -- I can try this...

wow... ooops.  a kmsgdump user.  :)

No, the doc is not outdated, and since the log buf size must be a
power of 2, 32 KB is the largest that is currently supported.
Sorry about that.
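
A rough illustration of the arithmetic, as a sketch only: it assumes the
buffer size is derived as 1 << LOG_BUF_SHIFT and takes the 60 kB real-mode
ceiling from the doc quoted above. The helper names are made up for this
example and are not part of the kernel or of kmsgdump.

```c
#include <stddef.h>

/* Hypothetical helper: buffer size in bytes for a given shift,
 * assuming size = 1 << shift. */
static size_t log_buf_len(unsigned int shift)
{
	return (size_t)1 << shift;
}

/* Hypothetical helper: largest shift whose buffer still fits within
 * `limit` bytes. `limit` is expected to be at least 1. */
static unsigned int max_shift_under(size_t limit)
{
	unsigned int shift = 0;

	/* Keep doubling while the next power of two still fits. */
	while (shift + 1 < sizeof(size_t) * 8 &&
	       log_buf_len(shift + 1) <= limit)
		shift++;
	return shift;
}
```

With a limit of 60 * 1024 bytes, the next power of two after 32 KB is
64 KB, which is over the ceiling, so the shift stops at 15.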

--
~Randy
MOTD:  Always include version info.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 17:34   ` Vince
  2003-11-26 17:35     ` Randy.Dunlap
@ 2003-11-26 17:40     ` Zwane Mwaikambo
  2003-11-26 17:54       ` Vince
  1 sibling, 1 reply; 113+ messages in thread
From: Zwane Mwaikambo @ 2003-11-26 17:40 UTC (permalink / raw)
  To: Vince; +Cc: linux-kernel

On Wed, 26 Nov 2003, Vince wrote:

> parameter LOG_BUF_LEN. Some people required 32 kB. But you shouldn't
> exceed 60 kB since the dump is done in real mode (16 bits).
> For kernel versions 2.5.6x and later, the LOG_BUF_LEN parameter is part
> of the kernel .config file (LOG_BUF_SHIFT) so you don't need to modify
> it at all.
> ---------------------------------
>
> ...so I you think 60kB would be enough to catch the first oops -- or if
> the doc is outdated -- I can try this...

*groan* do you have a PDA?


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 17:40     ` Zwane Mwaikambo
@ 2003-11-26 17:54       ` Vince
  2003-11-26 18:18         ` Zwane Mwaikambo
  0 siblings, 1 reply; 113+ messages in thread
From: Vince @ 2003-11-26 17:54 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: linux-kernel

Zwane Mwaikambo wrote:
> On Wed, 26 Nov 2003, Vince wrote:
>>parameter LOG_BUF_LEN. Some people required 32 kB. But you shouldn't
>>exceed 60 kB since the dump is done in real mode (16 bits).
>>For kernel versions 2.5.6x and later, the LOG_BUF_LEN parameter is part
>>of the kernel .config file (LOG_BUF_SHIFT) so you don't need to modify
>>it at all.
>>---------------------------------
>>
>>...so I you think 60kB would be enough to catch the first oops -- or if
>>the doc is outdated -- I can try this...
> 
> 
> *groan* do you have a PDA?
> 

Nope. I could probably borrow a laptop from a friend, but I'm not excited 
at the idea of having to set up some serial console thing (I do not even 
have a serial cable). Dump to floppy/swap/disk would be much easier in 
my case... if it could be made to work, of course ;-)


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 17:54       ` Vince
@ 2003-11-26 18:18         ` Zwane Mwaikambo
  2003-11-26 23:37           ` Mike Fedyk
  2003-11-27  0:59           ` [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run) Vince
  0 siblings, 2 replies; 113+ messages in thread
From: Zwane Mwaikambo @ 2003-11-26 18:18 UTC (permalink / raw)
  To: Vince; +Cc: Linux Kernel, Randy.Dunlap

On Wed, 26 Nov 2003, Vince wrote:

> > *groan* do you have a PDA?
> >
>
> Nope. I could probably borrow a laptop to a friend but am not excited at
> the idea of having to setup some serial console thing (I do not even
> have a serial cable). Dump to floppy/swap/disk would be much easier in
> my case... if it could me made to work, of course ;-)

Those oopses looked rather spurious; I'm not sure what help those other
methods would be here. Try applying the following patch and be sure to
have access to the console. You may have to transcribe it by hand...

Index: linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c
===================================================================
RCS file: /build/cvsroot/linux-2.6.0-test10-mm1/arch/i386/kernel/traps.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 traps.c
--- linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c	26 Nov 2003 05:28:50 -0000	1.1.1.1
+++ linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c	26 Nov 2003 18:17:37 -0000
@@ -329,6 +329,10 @@ void die(const char * str, struct pt_reg
 	if (in_interrupt())
 		panic("Fatal exception in interrupt");

+	local_irq_disable();
+	while (1)
+		__asm__ __volatile__("hlt");
+
 	if (panic_on_oops) {
 		printk(KERN_EMERG "Fatal exception: panic in 5 seconds\n");
 		set_current_state(TASK_UNINTERRUPTIBLE);

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 18:18         ` Zwane Mwaikambo
@ 2003-11-26 23:37           ` Mike Fedyk
  2003-11-26 23:41             ` Vince
  2003-11-27  0:59           ` [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run) Vince
  1 sibling, 1 reply; 113+ messages in thread
From: Mike Fedyk @ 2003-11-26 23:37 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Vince, Linux Kernel, Randy.Dunlap

On Wed, Nov 26, 2003 at 01:18:48PM -0500, Zwane Mwaikambo wrote:
> On Wed, 26 Nov 2003, Vince wrote:
> 
> > > *groan* do you have a PDA?
> > >
> >
> > Nope. I could probably borrow a laptop to a friend but am not excited at
> > the idea of having to setup some serial console thing (I do not even
> > have a serial cable). Dump to floppy/swap/disk would be much easier in
> > my case... if it could me made to work, of course ;-)
> 
> Those oopses looked rather spurious, i'm not sure what help those other
> methods would be here. Try applying the following patch and be sure to
> have access to the console. You may have to hand transcribe...

Interesting.  It would be nice to have a boot option that halts the system
after the first oops, instead of trying to continue.

Vince/Randy:
Did you use the 2.5.65 patch at http://w.ods.org/tools/kmsgdump/ or is there
some other place that has newer patches?

BTW, http://www.xenotime.net/linux/kmsgdump gives a 404 error.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 23:37           ` Mike Fedyk
@ 2003-11-26 23:41             ` Vince
  2003-12-03  0:03               ` Randy.Dunlap
  0 siblings, 1 reply; 113+ messages in thread
From: Vince @ 2003-11-26 23:41 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: Zwane Mwaikambo, Linux Kernel, Randy.Dunlap

Mike Fedyk wrote:
> Interesting.  It would be nice to have a boot option that halts the system
> after the first oops, instead of trying to continue.
> 
> Vince/Randy:
> Did you use the 2.5.65 patch at http://w.ods.org/tools/kmsgdump/ or is there
> some other place that has newer patches?
> 
> BTW, http://www.xenotime.net/linux/kmsgdump gives a 404 error.

My version comes from:
http://developer.osdl.org/rddunlap/kmsgdump/


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run)
  2003-11-26 18:18         ` Zwane Mwaikambo
  2003-11-26 23:37           ` Mike Fedyk
@ 2003-11-27  0:59           ` Vince
  2003-11-27  3:13             ` Zwane Mwaikambo
  2003-11-27  8:11             ` Duncan Sands
  1 sibling, 2 replies; 113+ messages in thread
From: Vince @ 2003-11-27  0:59 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Linux Kernel, Randy.Dunlap

It worked, but -- as expected -- I had to write the oops down by hand.
(user request to Randy: would it be possible to have an option in 
kmsgdump to write only the first oops to floppy ???)

I have it all on paper, but I'm too lazy to type up the full stack right 
now (available later on request: I have to go to bed now 8):

------------------------------------------------------------------
CPU: 0
EIP: 0060 : [<d0ae9822>]
EFLAGS: 00010246
EIP is at releaseintf+0x62/0x80 [usbcore]
eax:00000000 ebx:ceddc224 ecx:ce6d5dc0 edx:00000000
esi:ceddc200 edi:00000000 ebp:cd647f0c esp:cd647ef8
ds: 007b es:007b ss:0068

Process: modem_run (pid: 1121, threadinfo=cd646000, task=ce644040)
Stack: c016ffe3 ce0bfb24 ce6d5dc0 ...
[...]

Call trace
[<c016ffe3>] iput+0x63/0x80
[<d0ae9c27>] usbdev_release+0xb7/0xc0 [usbcore]
[<c0157a5c>] __fput+0x10c/0x120
[<c0156047>] filp_close+0x57/0x80
[<c0123d17>] put_files_struct+0x67/0xd0
[<c012491e>] do_exit+0x3a/0xb0
[<c0124c4a>] do_group_exit+0x3a/0xb0
[<c02a302e>] sysenter_past_esp+0x43/0x65
-------------------------------------------------------------------

The modem_run process is the one uploading the firmware for my 
speedtouch DSL modem. I'm using the kernel-space speedtouch driver, with 
modem_run from http://speedtouch.sourceforge.net/
Manually shutting down the network and killing modem_run before 
rebooting makes the oops disappear.

  However, I believe the fact that modem_run can cause a kernel panic is 
still a bug that should be fixed. I'm willing to test any patch for this 
issue, which has annoyed me for a long time (in the meantime, I'll 
work around it in my shutdown scripts). :-)



Zwane Mwaikambo wrote:
> On Wed, 26 Nov 2003, Vince wrote:
> 
> 
>>>*groan* do you have a PDA?
>>>
>>
>>Nope. I could probably borrow a laptop to a friend but am not excited at
>>the idea of having to setup some serial console thing (I do not even
>>have a serial cable). Dump to floppy/swap/disk would be much easier in
>>my case... if it could me made to work, of course ;-)
> 
> 
> Those oopses looked rather spurious, i'm not sure what help those other
> methods would be here. Try applying the following patch and be sure to
> have access to the console. You may have to hand transcribe...
> 
> Index: linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c
> ===================================================================
> RCS file: /build/cvsroot/linux-2.6.0-test10-mm1/arch/i386/kernel/traps.c,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 traps.c
> --- linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c	26 Nov 2003 05:28:50 -0000	1.1.1.1
> +++ linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c	26 Nov 2003 18:17:37 -0000
> @@ -329,6 +329,10 @@ void die(const char * str, struct pt_reg
>  	if (in_interrupt())
>  		panic("Fatal exception in interrupt");
> 
> +	local_irq_disable();
> +	while (1)
> +		__asm__ __volatile__("hlt");
> +
>  	if (panic_on_oops) {
>  		printk(KERN_EMERG "Fatal exception: panic in 5 seconds\n");
>  		set_current_state(TASK_UNINTERRUPTIBLE);



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run)
  2003-11-27  0:59           ` [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run) Vince
@ 2003-11-27  3:13             ` Zwane Mwaikambo
  2003-11-27  8:14               ` Vince
  2003-11-27  8:11             ` Duncan Sands
  1 sibling, 1 reply; 113+ messages in thread
From: Zwane Mwaikambo @ 2003-11-27  3:13 UTC (permalink / raw)
  To: Vince; +Cc: Linux Kernel, Randy.Dunlap

On Thu, 27 Nov 2003, Vince wrote:

> It worked, but I had -- as expected -- to write the oops by hand.
> (user request to Randy: would it be possible to have an option in
> kmsgdump to only write the first oops on floppy ???)
>
> I it have all on paper, but I'm too lazy to write the full stack right
> now (available later on request: I have to go to bed now 8):

Yes please get it all done =) Especially the first line and the bottom
"Code:" line.

> CPU: 0
> EIP: 0060 : [<d0ae9822>]
> EFLAGS: 00010246
> EIP is at releaseintf+0x62/0x80 [usbcore]
> eax:00000000 ebx:ceddc224 ecx:cs6D5DC0 edx:00000000
> esi:ceddc200 edi:00000000 ebp:cd647f0c esp:cd647ef8
> ds: 007b es:007b ss:0068
>
> Process: modem_run (pid: 1121, threadinfo=cd646000, task=ce644040)
> Stack: c016ffe3 ce0bfb24 ce6d5dc0 ...
> [...]
>
> Call trace
> [<c016ffe3>] iput+0x63/0x80
> [<d0ae9c27>] usbdev_release+0xb7/0xc0 [usbcore]
> [<c0157a5c>] __fput+0x10c/0x120
> [<c0156047>] filp_close+0x57/0x80
> [<c0123d17>] put_files_struct+0x67/0xd0
> [<c012491e>] do_exit+0x3a/0xb0
> [<c0124c4a>] do_group_exit+0x3a/0xb0
> [<c02a302e>] sysenter_past_esp+0x43/0x65

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run)
  2003-11-27  0:59           ` [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run) Vince
  2003-11-27  3:13             ` Zwane Mwaikambo
@ 2003-11-27  8:11             ` Duncan Sands
  1 sibling, 0 replies; 113+ messages in thread
From: Duncan Sands @ 2003-11-27  8:11 UTC (permalink / raw)
  To: Vince, Zwane Mwaikambo; +Cc: Linux Kernel, Randy.Dunlap

This looks like a problem I have been seeing.  I have a fix for this
in the works.  Unfortunately I'm pretty busy right now, so I can't
say when I'll have it ready.  The problem is in drivers/usb/core/devio.c.
Actually there are lots of problems in devio.c :)  One problem is that
devio.c is not protected against actconfig changing under it (thanks
to a usb core change, it becomes NULL before becoming something
else, which causes devio.c to oops rather than quietly do the wrong
thing).  The use of dev->serialize (usb semaphore) in releaseintf is
also wrong because it can lead to deadlock.  Another problem comes
from devio.c (i.e. usbfs) thinking that disconnects are for the whole
device and not just an interface.  Furthermore, there are various
oopsen that come from devio.c not handling urb unlink failures.
That's all I remember off the top of my head :)  I sent a couple of
emails about this (especially the locking problems) to the usb
mailing list lately.  Hang on, I just remembered another one:
releaseintf needs to be called with a write lock taken on
ps->devsem rather than a read lock, otherwise it can boot other
parts of devio.c off an interface when they think they still have
it.
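
To illustrate just the first of these failure modes, here is a sketch
with made-up struct and function names. It is not the actual devio.c code
and not the fix I have in the works; it only shows a caller checking for a
NULL active configuration instead of dereferencing it blindly.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-ins for the real usb core structures. */
struct config_sketch {
	unsigned int num_interfaces;
};

struct device_sketch {
	struct config_sketch *actconfig;	/* may be NULL mid-change */
};

/* Returns 0 if ifnum is valid for the active configuration,
 * -ENODEV if there is currently no active configuration,
 * -EINVAL if ifnum is out of range. */
static int check_interface(const struct device_sketch *dev,
			   unsigned int ifnum)
{
	/* Read the pointer once so the check and the use agree. */
	const struct config_sketch *cfg = dev->actconfig;

	if (cfg == NULL)
		return -ENODEV;
	if (ifnum >= cfg->num_interfaces)
		return -EINVAL;
	return 0;
}
```

A check like this only avoids the NULL dereference; without proper locking
the configuration can still change right after the check, so it is no
substitute for fixing the locking problems described above.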

Ciao,

Duncan.


On Thursday 27 November 2003 01:59, Vince wrote:
> It worked, but I had -- as expected -- to write the oops by hand.
> (user request to Randy: would it be possible to have an option in
> kmsgdump to only write the first oops on floppy ???)
>
> I it have all on paper, but I'm too lazy to write the full stack right
> now (available later on request: I have to go to bed now 8):
>
> ------------------------------------------------------------------
> CPU: 0
> EIP: 0060 : [<d0ae9822>]
> EFLAGS: 00010246
> EIP is at releaseintf+0x62/0x80 [usbcore]
> eax:00000000 ebx:ceddc224 ecx:cs6D5DC0 edx:00000000
> esi:ceddc200 edi:00000000 ebp:cd647f0c esp:cd647ef8
> ds: 007b es:007b ss:0068
>
> Process: modem_run (pid: 1121, threadinfo=cd646000, task=ce644040)
> Stack: c016ffe3 ce0bfb24 ce6d5dc0 ...
> [...]
>
> Call trace
> [<c016ffe3>] iput+0x63/0x80
> [<d0ae9c27>] usbdev_release+0xb7/0xc0 [usbcore]
> [<c0157a5c>] __fput+0x10c/0x120
> [<c0156047>] filp_close+0x57/0x80
> [<c0123d17>] put_files_struct+0x67/0xd0
> [<c012491e>] do_exit+0x3a/0xb0
> [<c0124c4a>] do_group_exit+0x3a/0xb0
> [<c02a302e>] sysenter_past_esp+0x43/0x65
> -------------------------------------------------------------------
>
> The modem_run process is the one uploading the firmware for my
> speedtouch dsl modem. I'm using the kernel-space speedtouch driver, with
> modem_run from http://speedtouch.sourceforge.net/
> Manually shutting down the network and killing modem_run before
> rebooting makes the oops disapear.
>
>   However, I believe the fact that modem_run can cause a kernel panic is
> still a bug that should be fixed. I'm willing to test any patch to fix
> this issue that has ennoyed me since a long time (in the meantime, I'll
> work around this in my shutdown scripts). :-)
>
> Zwane Mwaikambo wrote:
> > On Wed, 26 Nov 2003, Vince wrote:
> >>>*groan* do you have a PDA?
> >>
> >>Nope. I could probably borrow a laptop to a friend but am not excited at
> >>the idea of having to setup some serial console thing (I do not even
> >>have a serial cable). Dump to floppy/swap/disk would be much easier in
> >>my case... if it could me made to work, of course ;-)
> >
> > Those oopses looked rather spurious, i'm not sure what help those other
> > methods would be here. Try applying the following patch and be sure to
> > have access to the console. You may have to hand transcribe...
> >
> > Index: linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c
> > ===================================================================
> > RCS file: /build/cvsroot/linux-2.6.0-test10-mm1/arch/i386/kernel/traps.c,v
> > retrieving revision 1.1.1.1
> > diff -u -p -B -r1.1.1.1 traps.c
> > --- linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c	26 Nov 2003 05:28:50 -0000	1.1.1.1
> > +++ linux-2.6.0-test10-mm1-bochs/arch/i386/kernel/traps.c	26 Nov 2003 18:17:37 -0000
> > @@ -329,6 +329,10 @@ void die(const char * str, struct pt_reg
> >  	if (in_interrupt())
> >  		panic("Fatal exception in interrupt");
> >
> > +	local_irq_disable();
> > +	while (1)
> > +		__asm__ __volatile__("hlt");
> > +
> >  	if (panic_on_oops) {
> >  		printk(KERN_EMERG "Fatal exception: panic in 5 seconds\n");
> >  		set_current_state(TASK_UNINTERRUPTIBLE);

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run)
  2003-11-27  3:13             ` Zwane Mwaikambo
@ 2003-11-27  8:14               ` Vince
  0 siblings, 0 replies; 113+ messages in thread
From: Vince @ 2003-11-27  8:14 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Linux Kernel, Randy.Dunlap

Zwane Mwaikambo wrote:

> 
> Yes please get it all done =) Especially the first line and the bottom
> "Code:" line.
>

Hmm, either the "Code:" line was not available, or I simply forgot to 
write it down  :-/
Here is everything I wrote down:


CPU: 0
EIP: 0060 : [<d0ae9822>]
EFLAGS: 00010246
EIP is at releaseintf+0x62/0x80 [usbcore]
eax:00000000 ebx:ceddc224 ecx:ce6d5dc0 edx:00000000
esi:ceddc200 edi:00000000 ebp:cd647f0c esp:cd647ef8
ds: 007b es:007b ss:0068

Process: modem_run (pid: 1121, threadinfo=cd646000, task=ce644040)
Stack: c016ffe3 ce0bfb24 ce6d5dc0 00000000 cffe4dc0 cd647f24 d0ae9c27
        00000000 ce773800 00000000 cd647f48 c0157a5c ce529240 ce773800
        ce511000 ce773800 00000000 cf699c80 cd647f64 c0156047 ce773800

Call trace
[<c016ffe3>] iput+0x63/0x80
[<d0ae9c27>] usbdev_release+0xb7/0xc0 [usbcore]
[<c0157a5c>] __fput+0x10c/0x120
[<c0156047>] filp_close+0x57/0x80
[<c0123d17>] put_files_struct+0x67/0xd0
[<c012491e>] do_exit+0x3a/0xb0
[<c0124c4a>] do_group_exit+0x3a/0xb0
[<c02a302e>] sysenter_past_esp+0x43/0x65


(If another trace is required, I'll do it... just ask!)



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-11-26 23:41             ` Vince
@ 2003-12-03  0:03               ` Randy.Dunlap
  2003-12-03  0:31                 ` Mike Fedyk
  0 siblings, 1 reply; 113+ messages in thread
From: Randy.Dunlap @ 2003-12-03  0:03 UTC (permalink / raw)
  To: Vince; +Cc: mfedyk, zwane, linux-kernel

On Thu, 27 Nov 2003 00:41:47 +0100 Vince <fuzzy77@free.fr> wrote:

| Mike Fedyk wrote:
| > Interesting.  It would be nice to have a boot option that halts the system
| > after the first oops, instead of trying to continue.

You mean like the "panic_on_oops" sysctl??  (implemented in i386 & ppc64)
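
As a sketch of turning it on from a program: this assumes the sysctl is
exposed as /proc/sys/kernel/panic_on_oops, and the helper name is invented
for this example.

```c
#include <stdio.h>

/* Hypothetical helper: write an integer value to a sysctl-style file.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
static int write_sysctl_int(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	/* fclose() flushes, so a failed write may only show up here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling write_sysctl_int("/proc/sys/kernel/panic_on_oops", 1) as root
should then turn the first oops into a panic instead of letting the
system continue.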

| > Vince/Randy:
| > Did you use the 2.5.65 patch at http://w.ods.org/tools/kmsgdump/ or is there
| > some other place that has newer patches?
| > 
| > BTW, http://www.xenotime.net/linux/kmsgdump gives a 404 error.
| 
| My version comes from:
| http://developer.osdl.org/rddunlap/kmsgdump/

Yes, kmsgdump is now here ^^^^^^^^^^^^^^^^^^.
Sorry about any confusion there.

From Vince:
| It worked, but I had -- as expected -- to write the oops by hand.
| (user request to Randy: would it be possible to have an option in 
| kmsgdump to only write the first oops on floppy ???)

Um, could you elaborate on why you would want that?
kmsgdump assumes that the entire floppy belongs to it, so there
should be plenty of room for multiple oopsen (although I don't
know what it does on disk-full....).

I plan to add support for > 32 KB log buf sizes, but that's all I have
planned for now.

--
~Randy
MOTD:  Always include version info.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-12-03  0:31                 ` Mike Fedyk
@ 2003-12-03  0:27                   ` Randy.Dunlap
  2003-12-03 13:28                     ` Vince
  0 siblings, 1 reply; 113+ messages in thread
From: Randy.Dunlap @ 2003-12-03  0:27 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: fuzzy77, zwane, linux-kernel

On Tue, 2 Dec 2003 16:31:06 -0800 Mike Fedyk <mfedyk@matchmail.com> wrote:

| On Tue, Dec 02, 2003 at 04:03:03PM -0800, Randy.Dunlap wrote:
| > On Thu, 27 Nov 2003 00:41:47 +0100 Vince <fuzzy77@free.fr> wrote:
| > 
| > | Mike Fedyk wrote:
| > | > Interesting.  It would be nice to have a boot option that halts the system
| > | > after the first oops, instead of trying to continue.
| > 
| > You mean like the "panic_on_oops" sysctl??  (implemented in i386 & ppc64)
| 
| ...
| 
| > From Vince:
| > | It worked, but I had -- as expected -- to write the oops by hand.
| > | (user request to Randy: would it be possible to have an option in 
| > | kmsgdump to only write the first oops on floppy ???)
| > 
| > Um, could you elaborate on why you would want that?
| > kmsgdump assumes that the entire floppy belongs to it, so there
| > should be plenty of room for multiple oopsen (although I don't
| > know what it does on disk-full....).
| > 
| > I plan to add support for > 32 KB log buf sizes, but that's all I have
| > planned for now.
| 
| Wouldn't he only get the first oops on the diskette if he had the sysctl
| mentioned above enabled?

Yes, I think that you are right.

--
~Randy
MOTD:  Always include version info.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-12-03  0:03               ` Randy.Dunlap
@ 2003-12-03  0:31                 ` Mike Fedyk
  2003-12-03  0:27                   ` Randy.Dunlap
  0 siblings, 1 reply; 113+ messages in thread
From: Mike Fedyk @ 2003-12-03  0:31 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: Vince, zwane, linux-kernel

On Tue, Dec 02, 2003 at 04:03:03PM -0800, Randy.Dunlap wrote:
> On Thu, 27 Nov 2003 00:41:47 +0100 Vince <fuzzy77@free.fr> wrote:
> 
> | Mike Fedyk wrote:
> | > Interesting.  It would be nice to have a boot option that halts the system
> | > after the first oops, instead of trying to continue.
> 
> You mean like the "panic_on_oops" sysctl??  (implemented in i386 & ppc64)

...

> From Vince:
> | It worked, but I had -- as expected -- to write the oops by hand.
> | (user request to Randy: would it be possible to have an option in 
> | kmsgdump to only write the first oops on floppy ???)
> 
> Um, could you elaborate on why you would want that?
> kmsgdump assumes that the entire floppy belongs to it, so there
> should be plenty of room for multiple oopsen (although I don't
> know what it does on disk-full....).
> 
> I plan to add support for > 32 KB log buf sizes, but that's all I have
> planned for now.

Wouldn't he only get the first oops on the diskette if he had the sysctl
mentioned above enabled?

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-12-03  0:27                   ` Randy.Dunlap
@ 2003-12-03 13:28                     ` Vince
  2003-12-03 19:12                       ` Zwane Mwaikambo
  0 siblings, 1 reply; 113+ messages in thread
From: Vince @ 2003-12-03 13:28 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: Mike Fedyk, zwane, linux-kernel

Randy.Dunlap wrote:
> On Tue, 2 Dec 2003 16:31:06 -0800 Mike Fedyk <mfedyk@matchmail.com> wrote:
> 
> | On Tue, Dec 02, 2003 at 04:03:03PM -0800, Randy.Dunlap wrote:
> | > On Thu, 27 Nov 2003 00:41:47 +0100 Vince <fuzzy77@free.fr> wrote:
> | > 
> | > | Mike Fedyk wrote:
> | > | > Interesting.  It would be nice to have a boot option that halts the system
> | > | > after the first oops, instead of trying to continue.
> | > 
> | > You mean like the "panic_on_oops" sysctl??  (implemented in i386 & ppc64)
> | 
> | ...
> | 
> | Wouldn't he only get the first oops on the diskette if he had the sysctl
> | mentioned above enabled?
> 
> Yes, I think that you are right.
> 

Well, I do indeed get a nice oops on screen with this sysctl... but the 
oops/panic does not appear in the floppy dump  :-/

--------------------------------------------------------
<0>Kernel panic: Fatal exception
<4> <0>Dumping messages in 100 seconds : last chance for Alt-SysRq...<6>SysRq : Emergency Sync
<6>SysRq : Emergency Sync
<6>SysRq : Emergency Remount R/O
<6>SysRq : Trying to dump through real mode
<4>
---------------------------------------------------------


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-12-03 13:28                     ` Vince
@ 2003-12-03 19:12                       ` Zwane Mwaikambo
  2003-12-04  1:01                         ` Vince
  0 siblings, 1 reply; 113+ messages in thread
From: Zwane Mwaikambo @ 2003-12-03 19:12 UTC (permalink / raw)
  To: Vince; +Cc: Randy.Dunlap, Mike Fedyk, Linux Kernel

On Wed, 3 Dec 2003, Vince wrote:

> Well, I get indeed a nice oops on screen with this sysctl... but the
> oops/panic does not appear on the floppy dump  :-/
>
> --------------------------------------------------------
> <0>Kernel panic: Fatal exception
> <4> <0>Dumping messages in 100 seconds : last chance for
> Alt-SysRq...<6>SysRq :
> Emergency Sync
> <6>SysRq : Emergency Sync
> <6>SysRq : Emergency Remount R/O
> <6>SysRq : Trying to dump through real mode
> <4>
> ---------------------------------------------------------

Do you see any floppy disk activity at all? I'll see if i can come up with
something.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-12-03 19:12                       ` Zwane Mwaikambo
@ 2003-12-04  1:01                         ` Vince
  2003-12-04  1:34                           ` Mike Fedyk
  0 siblings, 1 reply; 113+ messages in thread
From: Vince @ 2003-12-04  1:01 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Randy.Dunlap, Mike Fedyk, Linux Kernel

Zwane Mwaikambo wrote:
> On Wed, 3 Dec 2003, Vince wrote:
> 
> 
>>Well, I get indeed a nice oops on screen with this sysctl... but the
>>oops/panic does not appear on the floppy dump  :-/
>>
>>--------------------------------------------------------
>><0>Kernel panic: Fatal exception
>><4> <0>Dumping messages in 100 seconds : last chance for
>>Alt-SysRq...<6>SysRq :
>>Emergency Sync
>><6>SysRq : Emergency Sync
>><6>SysRq : Emergency Remount R/O
>><6>SysRq : Trying to dump through real mode
>><4>
>>---------------------------------------------------------
> 
> 
> Do you see any floppy disk activity at all? I'll see if i can come up with
> something.

Yes, there *is* floppy activity. The previous messages make it to the 
floppy (in that case, I experimented with 
Alt-Sysrq+S/Alt-Sysrq+U/Alt-Sysrq+D), but the oops doesn't...


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-12-04  1:01                         ` Vince
@ 2003-12-04  1:34                           ` Mike Fedyk
  2003-12-04  4:11                             ` Randy.Dunlap
  0 siblings, 1 reply; 113+ messages in thread
From: Mike Fedyk @ 2003-12-04  1:34 UTC (permalink / raw)
  To: Vince; +Cc: Zwane Mwaikambo, Randy.Dunlap, Linux Kernel

On Thu, Dec 04, 2003 at 02:01:47AM +0100, Vince wrote:
> Zwane Mwaikambo wrote:
> >On Wed, 3 Dec 2003, Vince wrote:
> >
> >
> >>Well, I get indeed a nice oops on screen with this sysctl... but the
> >>oops/panic does not appear on the floppy dump  :-/
> >>
> >>--------------------------------------------------------
> >><0>Kernel panic: Fatal exception
> >><4> <0>Dumping messages in 100 seconds : last chance for
> >>Alt-SysRq...<6>SysRq :
> >>Emergency Sync
> >><6>SysRq : Emergency Sync
> >><6>SysRq : Emergency Remount R/O
> >><6>SysRq : Trying to dump through real mode
> >><4>
> >>---------------------------------------------------------
> >
> >
> >Do you see any floppy disk activity at all? I'll see if i can come up with
> >something.
> 
> Yes, there *is* floppy activity. The previous messages make it to the 
> floppy (in that case, I experienced with 
> Alt-Sysrq+S/Alt-Sysrq+U/Alt-Sysrq+D), but the oops doesn't...

Do you mean s/Alt-Sysrq+D/Alt-Sysrq+B/  ?

On 2.4 there isn't an Alt-Sysrq+D, but maybe there is on 2.6...?

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-12-04  1:34                           ` Mike Fedyk
@ 2003-12-04  4:11                             ` Randy.Dunlap
  2003-12-04 10:59                               ` [OOPS, usbcore, releaseintf] 2.6.0-test10-mm1 Vince
  2003-12-05  0:08                               ` [kernel panic @ reboot] 2.6.0-test10-mm1 Zwane Mwaikambo
  0 siblings, 2 replies; 113+ messages in thread
From: Randy.Dunlap @ 2003-12-04  4:11 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: fuzzy77, zwane, linux-kernel

On Wed, 3 Dec 2003 17:34:08 -0800 Mike Fedyk <mfedyk@matchmail.com> wrote:

| On Thu, Dec 04, 2003 at 02:01:47AM +0100, Vince wrote:
| > Zwane Mwaikambo wrote:
| > >On Wed, 3 Dec 2003, Vince wrote:
| > >
| > >
| > >>Well, I get indeed a nice oops on screen with this sysctl... but the
| > >>oops/panic does not appear on the floppy dump  :-/
| > >>
| > >>--------------------------------------------------------
| > >><0>Kernel panic: Fatal exception
| > >><4> <0>Dumping messages in 100 seconds : last chance for
| > >>Alt-SysRq...<6>SysRq :
| > >>Emergency Sync
| > >><6>SysRq : Emergency Sync
| > >><6>SysRq : Emergency Remount R/O
| > >><6>SysRq : Trying to dump through real mode
| > >><4>
| > >>---------------------------------------------------------
| > >
| > >
| > >Do you see any floppy disk activity at all? I'll see if i can come up with
| > >something.
| > 
| > Yes, there *is* floppy activity. The previous messages make it to the
| > floppy (in that case, I experimented with
| > Alt-Sysrq+S/Alt-Sysrq+U/Alt-Sysrq+D), but the oops doesn't...

It seems possible that these commands (above) are flushing the kernel
log buffer to disk (/var/log/messages e.g.), so that they don't need
to be saved by kmsgdump.  Have you looked in the kernel message file
for them?

| do you mean s/Alt-Sysrq+D/Alt-Sysrq+B/  ?
| 
| On 2.4 there isn't an Alt-Sysrq+D, but maybe there is on 2.6...?

The kmsgdump patch adds Alt-SysRq-D to force entry into kmsgdump
instead of getting there via a panic.

--
~Randy


* [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-04  4:11                             ` Randy.Dunlap
@ 2003-12-04 10:59                               ` Vince
  2003-12-04 11:14                                 ` Duncan Sands
  2003-12-05  0:08                               ` [kernel panic @ reboot] 2.6.0-test10-mm1 Zwane Mwaikambo
  1 sibling, 1 reply; 113+ messages in thread
From: Vince @ 2003-12-04 10:59 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: Mike Fedyk, zwane, linux-kernel, baldrick

Randy.Dunlap wrote:
> It seems possible that these commands (above) are flushing the kernel
> log buffer to disk (/var/log/messages e.g.), so that they don't need
> to be saved by kmsgdump.  Have you looked in the kernel message file
> for them?

You are right, the oops indeed makes it to the disk in that case, thanks!
Here follows a nice oops. It is probably very easy for any speedtouch
owner to reproduce by running /etc/init.d/hotplug stop while modem_run
is still running...


ehci_hcd 0000:00:10.3: remove, state 1
usb usb1: USB disconnect, address 1
ehci_hcd 0000:00:10.3: USB bus 1 deregistered
diablo modem_run[1033]: Device disconnected, shutting down
uhci_hcd 0000:00:10.0: remove, state 1
usb usb2: USB disconnect, address 1
usb 2-1: USB disconnect, address 2
drivers/char/lirc/lirc_atiusb.c: USB Remote on #200 now disconnected
usb 2-2: USB disconnect, address 3
printing eip:
c8ae9822
Oops: 0000 [#1]
PREEMPT
CPU:    0
EIP:    0060:[<c8ae9822>]    Not tainted VLI
EFLAGS: 00010246
EIP is at releaseintf+0x62/0x80 [usbcore]
eax: 00000000   ebx: c6dcb024   ecx: c663c0c0   edx: 00000000
esi: c6dcb000   edi: 00000000   ebp: c595df0c   esp: c595def8
ds: 007b   es: 007b   ss: 0068
Process modem_run (pid: 1033, threadinfo=c595c000 task=c6ddc080)
Stack: c016ffe3 c6205ca4 c663c0c0 00000000 c7ff4dc0 c595df24 c8ae9c27 c663c0c0
00000000 c6719f00 00000000 c595df48 c0157a5c c66ba4c0 c6719f00 c66ba4c0
c66d1a40 c6719f00 00000000 c6e10740 c595df64 c0156047 c6719f00 c6e10740
Call Trace:
[<c016ffe3>] iput+0x63/0x80
[<c8ae9c27>] usbdev_release+0xb7/0xc0 [usbcore]
[<c0157a5c>] __fput+0x10c/0x120
[<c0156047>] filp_close+0x57/0x80
[<c0123d17>] put_files_struct+0x67/0xd0
[<c012491e>] do_exit+0x15e/0x3e0
[<c0124c4a>] do_group_exit+0x3a/0xb0
[<c02a302e>] sysenter_past_esp+0x43/0x65

Code: 08 0f b3 51 40 19 c0 85 c0 75 18 89 d9 ff 46 24 0f 8e 34 23 00 00 
89 f8 8b 5d f4 8b 75 f8 8b 7d fc c9 c3 8b 86 90 01 00 00 31 ff <8b> 44 
90 0c c7 04 24 40 70 af c8 89 44 24 04 e8 da 6b ff ff eb
<0>Fatal exception: panic in 5 seconds



* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-04 10:59                               ` [OOPS, usbcore, releaseintf] 2.6.0-test10-mm1 Vince
@ 2003-12-04 11:14                                 ` Duncan Sands
  2003-12-04 16:57                                   ` Randy.Dunlap
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-04 11:14 UTC (permalink / raw)
  To: Vince, Randy.Dunlap; +Cc: Mike Fedyk, zwane, linux-kernel

> EIP is at releaseintf+0x62/0x80 [usbcore]

I haven't found time to work on this, sorry -
I'm really busy with my real jobs right now.

> <0>Fatal exception: panic in 5 seconds

What is this, by the way?  I never saw it.

Duncan.


* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-04 11:14                                 ` Duncan Sands
@ 2003-12-04 16:57                                   ` Randy.Dunlap
  2003-12-05  7:38                                     ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Randy.Dunlap @ 2003-12-04 16:57 UTC (permalink / raw)
  To: Duncan Sands; +Cc: fuzzy77, mfedyk, zwane, linux-kernel

On Thu, 4 Dec 2003 12:14:33 +0100 Duncan Sands <baldrick@free.fr> wrote:

| > EIP is at releaseintf+0x62/0x80 [usbcore]
| 
| I haven't found time to work on this, sorry -
| I'm really busy with my real jobs right now.
| 
| > <0>Fatal exception: panic in 5 seconds
| 
| What is this, by the way?  I never saw it.

That comes from setting the sysctl "panic_on_oops" so that an oops
goes straight to a panic condition.
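To check from user space whether the box is currently in that mode, the
sysctl can be read back through procfs. This is only a minimal sketch:
the path is the one used elsewhere in this thread, while the helper names
read_int_sysctl() and oops_will_panic() are made up for illustration and
are not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a non-negative integer from a procfs-style file.
 * Returns the value on success, or -1 if the file cannot be opened
 * or does not start with a non-negative integer.
 * (Hypothetical helper, for illustration only.)
 */
int read_int_sysctl(const char *path)
{
	FILE *f;
	int value;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1 || value < 0)
		value = -1;
	fclose(f);
	return value;
}

/* Nonzero: an oops escalates to a panic. 0: it does not. -1: unknown. */
int oops_will_panic(void)
{
	return read_int_sysctl("/proc/sys/kernel/panic_on_oops");
}
```

A caller would treat -1 as "cannot tell" (for example on a kernel without
the sysctl) rather than as "off".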

--
~Randy
MOTD:  Always include version info.


* Re: [kernel panic @ reboot] 2.6.0-test10-mm1
  2003-12-04  4:11                             ` Randy.Dunlap
  2003-12-04 10:59                               ` [OOPS, usbcore, releaseintf] 2.6.0-test10-mm1 Vince
@ 2003-12-05  0:08                               ` Zwane Mwaikambo
  1 sibling, 0 replies; 113+ messages in thread
From: Zwane Mwaikambo @ 2003-12-05  0:08 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: Mike Fedyk, fuzzy77, linux-kernel

On Wed, 3 Dec 2003, Randy.Dunlap wrote:

> | > >Do you see any floppy disk activity at all? I'll see if i can come up with
> | > >something.
> | >
> | > Yes, there *is* floppy activity. The previous messages make it to the
> | > floppy (in that case, I experimented with
> | > Alt-Sysrq+S/Alt-Sysrq+U/Alt-Sysrq+D), but the oops doesn't...
>
> It seems possible that these commands (above) are flushing the kernel
> log buffer to disk (/var/log/messages e.g.), so that they don't need
> to be saved by kmsgdump.  Have you looked in the kernel message file
> for them?

In which case i'm slightly confused as to what's happening to the oops ;)
Dare i delve into kmsgdump?


* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-04 16:57                                   ` Randy.Dunlap
@ 2003-12-05  7:38                                     ` Duncan Sands
  2003-12-05 10:11                                       ` Vince
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-05  7:38 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: fuzzy77, mfedyk, zwane, linux-kernel

On Thursday 04 December 2003 17:57, Randy.Dunlap wrote:
> On Thu, 4 Dec 2003 12:14:33 +0100 Duncan Sands <baldrick@free.fr> wrote:
> | > EIP is at releaseintf+0x62/0x80 [usbcore]
> |
> | I haven't found time to work on this, sorry -
> | I'm really busy with my real jobs right now.
> |
> | > <0>Fatal exception: panic in 5 seconds
> |
> | What is this, by the way?  I never saw it.
>
> That comes from setting the sysctl "panic_on_oops" so that an oops
> goes straight to a panic condition.

That explains why this relatively harmless Oops was
freezing Vince's box.  I guess he should turn it off.

Thanks for the info,

Duncan.


* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-05  7:38                                     ` Duncan Sands
@ 2003-12-05 10:11                                       ` Vince
  2003-12-05 10:18                                         ` Duncan Sands
  2003-12-07  0:25                                         ` Duncan Sands
  0 siblings, 2 replies; 113+ messages in thread
From: Vince @ 2003-12-05 10:11 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel

Duncan Sands wrote:
> On Thursday 04 December 2003 17:57, Randy.Dunlap wrote:
> 
>>On Thu, 4 Dec 2003 12:14:33 +0100 Duncan Sands <baldrick@free.fr> wrote:
>>| > EIP is at releaseintf+0x62/0x80 [usbcore]
>>|
>>| I haven't found time to work on this, sorry -
>>| I'm really busy with my real jobs right now.
>>|
>>| > <0>Fatal exception: panic in 5 seconds
>>|
>>| What is this, by the way?  I never saw it.
>>
>>That comes from setting the sysctl "panic_on_oops" so that an oops
>>goes straight to a panic condition.
> 
> 
> That explains why this relatively harmless Oops was
> freezing Vince's box.  I guess he should turn it off.

Well, I don't find this oops harmless at all: my box usually freezes
in the middle of a huge number of other oopses that directly follow this
one, and then nothing makes it into the logs. I had to set this sysctl
once in order to capture the first oops, but that's not related to the
other freeze...



* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-05 10:11                                       ` Vince
@ 2003-12-05 10:18                                         ` Duncan Sands
  2003-12-05 10:34                                           ` Vince
  2003-12-07  0:25                                         ` Duncan Sands
  1 sibling, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-05 10:18 UTC (permalink / raw)
  To: Vince; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel

> > That explains why this relatively harmless Oops was
> > freezing Vince's box.  I guess he should turn it off.
>
> Well, I don't find this oops harmless at all : my box is usually
> freezing while in a huge number of other oopses that directly follow this one,
> and then nothing makes it into the logs. I had to set this sysctl once
> in order to get the first oops, but that's not related to the other
> freeze...

What is the second Oops?

Thanks,

Duncan.


* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-05 10:18                                         ` Duncan Sands
@ 2003-12-05 10:34                                           ` Vince
  0 siblings, 0 replies; 113+ messages in thread
From: Vince @ 2003-12-05 10:34 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel

Duncan Sands wrote:
>>>That explains why this relatively harmless Oops was
>>>freezing Vince's box.  I guess he should turn it off.
>>
>>Well, I don't find this oops harmless at all : my box is usually
>>freezing while in a huge number of other oopses that directly follow this one,
>>and then nothing makes it into the logs. I had to set this sysctl once
>>in order to get the first oops, but that's not related to the other
>>freeze...
> 
> 
> What is the second Oops?

No idea... without the sysctl, the screen keeps scrolling, printing new
oopses (49 of them in my last attempt), in which case nothing about them
ever reaches the disk log (and it looks like the kernel buffer is too
small when using kmsgdump).
  Could it be possible to have something like:
echo 2 > /proc/sys/kernel/panic_on_oops
...and have the system panic at the 2nd oops ?
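To make the proposed semantics concrete, here is a rough user-space
sketch of the counting logic. It is purely hypothetical: the sysctl
discussed in this thread is a simple on/off switch, and the names
struct oops_policy and oops_should_panic() are invented for this example.
A limit of 1 would match the existing behaviour, 2 would let the first
oops through and panic on the second, and 0 would never panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy: panic once `limit` oopses have been seen. */
struct oops_policy {
	unsigned int limit;	/* 0 = never panic, 1 = panic on first oops */
	unsigned int seen;	/* oopses counted so far */
};

/* Record one oops; return true if this one should escalate to a panic. */
bool oops_should_panic(struct oops_policy *p)
{
	assert(p != NULL);
	p->seen++;
	return p->limit != 0 && p->seen >= p->limit;
}
```

With the limit set to 2, the first oops would still reach the log and
the panic on the second one would stop the flood before it scrolls the
first off the screen.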



* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-05 10:11                                       ` Vince
  2003-12-05 10:18                                         ` Duncan Sands
@ 2003-12-07  0:25                                         ` Duncan Sands
  2003-12-07 21:09                                           ` Vince
  1 sibling, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-07  0:25 UTC (permalink / raw)
  To: Vince; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel

On Friday 05 December 2003 11:11, Vince wrote:
> Duncan Sands wrote:
> > On Thursday 04 December 2003 17:57, Randy.Dunlap wrote:
> >>On Thu, 4 Dec 2003 12:14:33 +0100 Duncan Sands <baldrick@free.fr> wrote:
> >>| > EIP is at releaseintf+0x62/0x80 [usbcore]
> >>|
> >>| I haven't found time to work on this, sorry -
> >>| I'm really busy with my real jobs right now.
> >>|
> >>| > <0>Fatal exception: panic in 5 seconds
> >>|
> >>| What is this, by the way?  I never saw it.
> >>
> >>That comes from setting the sysctl "panic_on_oops" so that an oops
> >>goes straight to a panic condition.
> >
> > That explains why this relatively harmless Oops was
> > freezing Vince's box.  I guess he should turn it off.
>
> Well, I don't find this oops harmless at all : my box is usually
> freezing while in a huge number of other oopses that directly follow this one,
> and then nothing makes it into the logs. I had to set this sysctl once
> in order to get the first oops, but that's not related to the other
> freeze...

Does this help?  It isn't finished - it represents the current state of my fix.
Warning: have barf bag ready.

Ciao,

Duncan.

diff -Nru a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
--- a/drivers/usb/core/devio.c	Sun Dec  7 01:20:31 2003
+++ b/drivers/usb/core/devio.c	Sun Dec  7 01:20:31 2003
@@ -87,17 +87,15 @@
 static ssize_t usbdev_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 {
 	struct dev_state *ps = (struct dev_state *)file->private_data;
+	struct usb_device *dev = ps->dev;
 	ssize_t ret = 0;
 	unsigned len;
 	loff_t pos;
 	int i;
 
 	pos = *ppos;
-	down_read(&ps->devsem);
-	if (!ps->dev) {
-		ret = -ENODEV;
-		goto err;
-	} else if (pos < 0) {
+	down(&dev->serialize);
+	if (pos < 0) {
 		ret = -EINVAL;
 		goto err;
 	}
@@ -106,7 +104,7 @@
 		len = sizeof(struct usb_device_descriptor) - pos;
 		if (len > nbytes)
 			len = nbytes;
-		if (copy_to_user(buf, ((char *)&ps->dev->descriptor) + pos, len)) {
+		if (copy_to_user(buf, ((char *)&dev->descriptor) + pos, len)) {
 			ret = -EFAULT;
 			goto err;
 		}
@@ -118,9 +116,9 @@
 	}
 
 	pos = sizeof(struct usb_device_descriptor);
-	for (i = 0; nbytes && i < ps->dev->descriptor.bNumConfigurations; i++) {
+	for (i = 0; nbytes && i < dev->descriptor.bNumConfigurations; i++) {
 		struct usb_config_descriptor *config =
-			(struct usb_config_descriptor *)ps->dev->rawdescriptors[i];
+			(struct usb_config_descriptor *)dev->rawdescriptors[i];
 		unsigned int length = le16_to_cpu(config->wTotalLength);
 
 		if (*ppos < pos + length) {
@@ -129,7 +127,7 @@
 				len = nbytes;
 
 			if (copy_to_user(buf,
-			    ps->dev->rawdescriptors[i] + (*ppos - pos), len)) {
+			    dev->rawdescriptors[i] + (*ppos - pos), len)) {
 				ret = -EFAULT;
 				goto err;
 			}
@@ -144,7 +142,7 @@
 	}
 
 err:
-	up_read(&ps->devsem);
+	up(&dev->serialize);
 	return ret;
 }
 
@@ -324,22 +322,18 @@
 static void driver_disconnect(struct usb_interface *intf)
 {
 	struct dev_state *ps = usb_get_intfdata (intf);
+	unsigned int ifnum = intf->altsetting->desc.bInterfaceNumber;
 
 	if (!ps)
 		return;
 
-	/* this waits till synchronous requests complete */
-	down_write (&ps->devsem);
-
 	/* prevent new I/O requests */
-	ps->dev = 0;
-	ps->ifclaimed = 0;
 	usb_set_intfdata (intf, NULL);
+	if (ifnum < 8*sizeof(ps->ifclaimed))
+		clear_bit(ifnum, &ps->ifclaimed);
 
 	/* force async requests to complete */
-	destroy_all_async (ps);
-
-	up_write (&ps->devsem);
+	destroy_async_on_interface (ps, ifnum);
 }
 
 struct usb_driver usbdevfs_driver = {
@@ -355,21 +349,18 @@
 	struct usb_interface *iface;
 	int err;
 
-	if (intf >= 8*sizeof(ps->ifclaimed) || !dev
-			|| intf >= dev->actconfig->desc.bNumInterfaces)
+	if (intf >= 8*sizeof(ps->ifclaimed) || intf >= dev->actconfig->desc.bNumInterfaces)
 		return -EINVAL;
-	/* already claimed */
-	if (test_bit(intf, &ps->ifclaimed))
-		return 0;
-	iface = dev->actconfig->interface[intf];
+	if (test_and_set_bit(intf, &ps->ifclaimed))
+		return 0; /* already claimed */
 	err = -EBUSY;
-	lock_kernel();
+	iface = dev->actconfig->interface[intf]; /* change to usb_ifnum_to_if here and elsewhere? */
 	if (!usb_interface_claimed(iface)) {
 		usb_driver_claim_interface(&usbdevfs_driver, iface, ps);
-		set_bit(intf, &ps->ifclaimed);
 		err = 0;
 	}
-	unlock_kernel();
+	if (err)
+		clear_bit(intf, &ps->ifclaimed);
 	return err;
 }
 
@@ -383,13 +374,13 @@
 		return -EINVAL;
 	err = -EINVAL;
 	dev = ps->dev;
-	down(&dev->serialize);
+
+	/* the configuration cannot change under us if we have a claimed interface */
 	if (test_and_clear_bit(intf, &ps->ifclaimed)) {
 		iface = dev->actconfig->interface[intf];
-		usb_driver_release_interface(&usbdevfs_driver, iface);
+		usb_driver_release_interface(&usbdevfs_driver, iface); /* may sleep - beware for BKL! */
 		err = 0;
 	}
-	up(&dev->serialize);
 	return err;
 }
 
@@ -492,7 +483,7 @@
 
 	lock_kernel();
 	ret = -ENOENT;
-	dev = inode->u.generic_ip;
+	dev = usb_get_dev (inode->u.generic_ip);
 	if (!dev) {
 		kfree(ps);
 		goto out;
@@ -504,7 +495,6 @@
 	INIT_LIST_HEAD(&ps->async_pending);
 	INIT_LIST_HEAD(&ps->async_completed);
 	init_waitqueue_head(&ps->wait);
-	init_rwsem(&ps->devsem);
 	ps->discsignr = 0;
 	ps->disctask = current;
 	ps->disccontext = NULL;
@@ -521,18 +511,21 @@
 static int usbdev_release(struct inode *inode, struct file *file)
 {
 	struct dev_state *ps = (struct dev_state *)file->private_data;
+	struct usb_device *dev = ps->dev;
 	unsigned int i;
 
-	lock_kernel();
+	/* the only race is with driver_disconnect */
+	down(&dev->serialize);
 	list_del_init(&ps->list);
 
-	if (ps->dev) {
+	if (dev->state != USB_STATE_NOTATTACHED)
 		for (i = 0; ps->ifclaimed && i < 8*sizeof(ps->ifclaimed); i++)
 			if (test_bit(i, &ps->ifclaimed))
-				releaseintf(ps, i);
-	}
-	unlock_kernel();
+				releaseintf(ps, i); /* may sleep - makes the BKL problematic */
 	destroy_all_async(ps);
+	up(&dev->serialize);
+	usb_put_dev (ps->dev);
+	ps->dev = NULL;
 	kfree(ps);
         return 0;
 }
@@ -712,22 +705,32 @@
 
 static int proc_resetdevice(struct dev_state *ps)
 {
+	struct usb_device *dev = ps->dev;
 	int i, ret;
 
-	ret = usb_reset_device(ps->dev);
+	up(&dev->serialize);
+	ret = usb_reset_device(dev);
+	down(&dev->serialize);
+	if (!ret && dev->state == USB_STATE_NOTATTACHED)
+		ret = -ENODEV;
 	if (ret < 0)
 		return ret;
 
-	for (i = 0; i < ps->dev->actconfig->desc.bNumInterfaces; i++) {
-		struct usb_interface *intf = ps->dev->actconfig->interface[i];
+	for (i = 0; i < dev->actconfig->desc.bNumInterfaces; i++) {
+		struct usb_interface *intf = dev->actconfig->interface[i];
 
 		/* Don't simulate interfaces we've claimed */
 		if (test_bit(i, &ps->ifclaimed))
 			continue;
 
 		err ("%s - this function is broken", __FUNCTION__);
-		if (intf->driver && ps->dev) {
+		if (intf->driver) {
+			up(&dev->serialize);
 			usb_probe_interface (&intf->dev);
+			down(&dev->serialize);
+		}
+		if (dev->state == USB_STATE_NOTATTACHED) {
+			return -ENODEV;
 		}
 	}
 
@@ -976,18 +979,19 @@
         DECLARE_WAITQUEUE(wait, current);
 	struct async *as = NULL;
 	void __user *addr;
+	struct usb_device *dev = ps->dev;
 	int ret;
 
 	add_wait_queue(&ps->wait, &wait);
-	while (ps->dev) {
+	while (dev->state != USB_STATE_NOTATTACHED) {
 		__set_current_state(TASK_INTERRUPTIBLE);
 		if ((as = async_getcompleted(ps)))
 			break;
 		if (signal_pending(current))
 			break;
-		up_read(&ps->devsem);
+		up(&dev->serialize);
 		schedule();
-		down_read(&ps->devsem);
+		down(&dev->serialize);
 	}
 	remove_wait_queue(&ps->wait, &wait);
 	set_current_state(TASK_RUNNING);
@@ -1089,57 +1093,53 @@
 		}
 	}
 
-       if (!ps->dev)
-               retval = -ENODEV;
-       else if (!(ifp = usb_ifnum_to_if (ps->dev, ctrl.ifno)))
+       if (!(ifp = usb_ifnum_to_if (ps->dev, ctrl.ifno)))
                retval = -EINVAL;
-       else switch (ctrl.ioctl_code) {
-
-       /* disconnect kernel driver from interface, leaving it unbound.  */
-       /* maybe unbound - you get no guarantee it stays unbound */
-       case USBDEVFS_DISCONNECT:
-		/* this function is misdesigned - retained for compatibility */
+	else {
 		lock_kernel();
-		driver = ifp->driver;
-		if (driver) {
-			dbg ("disconnect '%s' from dev %d interface %d",
-			     driver->name, ps->dev->devnum, ctrl.ifno);
-			usb_unbind_interface(&ifp->dev);
-		} else
-			retval = -ENODATA;
-		unlock_kernel();
-		break;
+		up(&ps->dev->serialize);
+		switch (ctrl.ioctl_code) {
 
-	/* let kernel drivers try to (re)bind to the interface */
-	case USBDEVFS_CONNECT:
-		lock_kernel();
-		retval = usb_probe_interface (&ifp->dev);
-		unlock_kernel();
-		break;
+		/* disconnect kernel driver from interface, leaving it unbound.  */
+		/* maybe unbound - you get no guarantee it stays unbound */
+		case USBDEVFS_DISCONNECT:
+			/* this function is misdesigned - retained for compatibility */
+			driver = ifp->driver;
+			if (driver) {
+				dbg ("disconnect '%s' from dev %d interface %d",
+					driver->name, ps->dev->devnum, ctrl.ifno);
+				usb_unbind_interface(&ifp->dev);
+			} else
+				retval = -ENODATA;
+			break;
 
-	/* talk directly to the interface's driver */
-	default:
-		/* BKL used here to protect against changing the binding
-		 * of this driver to this device, as well as unloading its
-		 * driver module.
-		 */
-		lock_kernel ();
-		driver = ifp->driver;
-		if (driver == 0 || driver->ioctl == 0) {
-			unlock_kernel();
-			retval = -ENOSYS;
-		} else {
-			if (!try_module_get (driver->owner)) {
-				unlock_kernel();
+		/* let kernel drivers try to (re)bind to the interface */
+		case USBDEVFS_CONNECT:
+			retval = usb_probe_interface (&ifp->dev);
+			break;
+
+		/* talk directly to the interface's driver */
+		default:
+			/* BKL used here to protect against changing the binding
+			* of this driver to this device, as well as unloading its
+			* driver module.
+			*/
+			driver = ifp->driver;
+			if (driver == 0 || driver->ioctl == 0) {
 				retval = -ENOSYS;
-				break;
+			} else {
+				if (!try_module_get (driver->owner)) {
+					retval = -ENOSYS;
+					break;
+				}
+				retval = driver->ioctl (ifp, ctrl.ioctl_code, buf);
+				if (retval == -ENOIOCTLCMD)
+					retval = -ENOTTY;
+				module_put (driver->owner);
 			}
-			unlock_kernel ();
-			retval = driver->ioctl (ifp, ctrl.ioctl_code, buf);
-			if (retval == -ENOIOCTLCMD)
-				retval = -ENOTTY;
-			module_put (driver->owner);
 		}
+		down(&ps->dev->serialize);
+		unlock_kernel();
 	}
 
 	/* cleanup and return */
@@ -1161,13 +1161,15 @@
 static int usbdev_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct dev_state *ps = (struct dev_state *)file->private_data;
+	struct usb_device *dev = ps->dev;
 	int ret = -ENOTTY;
 
 	if (!(file->f_mode & FMODE_WRITE))
 		return -EPERM;
-	down_read(&ps->devsem);
-	if (!ps->dev) {
-		up_read(&ps->devsem);
+
+	down(&dev->serialize);
+	if (dev->state == USB_STATE_NOTATTACHED) {
+		up(&dev->serialize);
 		return -ENODEV;
 	}
 	switch (cmd) {
@@ -1212,7 +1214,9 @@
 		break;
 
 	case USBDEVFS_SETCONFIGURATION:
+		up(&dev->serialize);
 		ret = proc_setconfig(ps, (void __user *)arg);
+		down(&dev->serialize);
 		break;
 
 	case USBDEVFS_SUBMITURB:
@@ -1233,6 +1237,10 @@
 		ret = proc_reapurbnonblock(ps, (void __user *)arg);
 		break;
 
+	case USBDEVFS_RELEASEINTERFACE:
+		ret = proc_releaseinterface(ps, (void __user *)arg);
+		break;
+
 	case USBDEVFS_DISCSIGNAL:
 		ret = proc_disconnectsignal(ps, (void __user *)arg);
 		break;
@@ -1241,15 +1249,12 @@
 		ret = proc_claiminterface(ps, (void __user *)arg);
 		break;
 
-	case USBDEVFS_RELEASEINTERFACE:
-		ret = proc_releaseinterface(ps, (void __user *)arg);
-		break;
-
 	case USBDEVFS_IOCTL:
 		ret = proc_ioctl(ps, (void __user *) arg);
-		break;
+	break;
 	}
-	up_read(&ps->devsem);
+	up(&dev->serialize);
+
 	if (ret >= 0)
 		inode->i_atime = CURRENT_TIME;
 	return ret;
@@ -1264,7 +1269,7 @@
 	poll_wait(file, &ps->wait, wait);
 	if (file->f_mode & FMODE_WRITE && !list_empty(&ps->async_completed))
 		mask |= POLLOUT | POLLWRNORM;
-	if (!ps->dev)
+	if (ps->dev->state == USB_STATE_NOTATTACHED)
 		mask |= POLLERR | POLLHUP;
 	return mask;
 }
diff -Nru a/drivers/usb/core/inode.c b/drivers/usb/core/inode.c
--- a/drivers/usb/core/inode.c	Sun Dec  7 01:20:31 2003
+++ b/drivers/usb/core/inode.c	Sun Dec  7 01:20:31 2003
@@ -717,9 +717,9 @@
 	while (!list_empty(&dev->filelist)) {
 		ds = list_entry(dev->filelist.next, struct dev_state, list);
 		list_del_init(&ds->list);
-		down_write(&ds->devsem);
-		ds->dev = NULL;
-		up_write(&ds->devsem);
+//		down_write(&ds->devsem);
+//		ds->dev = NULL;
+//		up_write(&ds->devsem);
 		if (ds->discsignr) {
 			sinfo.si_signo = SIGPIPE;
 			sinfo.si_errno = EPIPE;
diff -Nru a/include/linux/usbdevice_fs.h b/include/linux/usbdevice_fs.h
--- a/include/linux/usbdevice_fs.h	Sun Dec  7 01:20:31 2003
+++ b/include/linux/usbdevice_fs.h	Sun Dec  7 01:20:31 2003
@@ -154,7 +154,6 @@
 
 struct dev_state {
 	struct list_head list;      /* state list */
-	struct rw_semaphore devsem; /* protects modifications to dev (dev == NULL indicating disconnect) */ 
 	struct usb_device *dev;
 	struct file *file;
 	spinlock_t lock;            /* protects the async urb lists */


* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-07  0:25                                         ` Duncan Sands
@ 2003-12-07 21:09                                           ` Vince
  2003-12-07 21:24                                             ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Vince @ 2003-12-07 21:09 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel

Duncan Sands wrote:

> 
> Does this help?  It isn't finished - it represents the current state of my fix.
> Warning: have barf bag ready.
 > [patch cut]

Yes, your patch fixes the problem: no more oops and modem_run now exits 
cleanly. Thank you very much !

Vincent



* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-07 21:09                                           ` Vince
@ 2003-12-07 21:24                                             ` Duncan Sands
  2003-12-07 22:24                                               ` Vince
  2003-12-07 22:54                                               ` Vince
  0 siblings, 2 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-07 21:24 UTC (permalink / raw)
  To: Vince; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel

On Sunday 07 December 2003 22:09, Vince wrote:
> Duncan Sands wrote:
> > Does this help?  It isn't finished - it represents the current state of
> > my fix. Warning: have barf bag ready.
> >
>  > [patch cut]
>
> Yes, your patch fixes the problem: no more oops and modem_run now exits
> cleanly. Thank you very much !

Hi Vincent, that's great!  I think the fix is solid, but can you please beat on it
a bit just to be sure...

Thanks,

Duncan.


* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-07 21:24                                             ` Duncan Sands
@ 2003-12-07 22:24                                               ` Vince
  2003-12-07 22:54                                               ` Vince
  1 sibling, 0 replies; 113+ messages in thread
From: Vince @ 2003-12-07 22:24 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel

Duncan Sands wrote:
> On Sunday 07 December 2003 22:09, Vince wrote:
> 
>>Duncan Sands wrote:
>>
>>>Does this help?  It isn't finished - it represents the current state of
>>>my fix. Warning: have barf bag ready.
>>>
>>
>> > [patch cut]
>>
>>Yes, your patch fixes the problem: no more oops and modem_run now exits
>>cleanly. Thank you very much !
> 
> 
> Hi Vincent, that's great!  I think the fix is solid, but can you please beat on it
> a bit just to be sure...
> 
> Thanks,
> 
> Duncan.

Yes, I was doing just that, and perhaps I spoke too soon, as I got
this (non-fatal) oops in the log. The kernel is tainted because this time
I had loaded X and therefore the nvidia kernel driver; however, I've never
experienced this bug before. I'll try to reproduce it without the driver:

------------[ cut here ]------------
kernel BUG at include/linux/list.h:148!
invalid operand: 0000 [#1]
PREEMPT
CPU:    0
EIP:    0060:[<c016d3a3>]    Tainted: PF  VLI
EFLAGS: 00010206
EIP is at prune_dcache+0x1d3/0x1e0
eax: 00000000   ebx: c259dbc0   ecx: c259dbd4   edx: c7009e1c
esi: c259dc30   edi: c1190000   ebp: c1191e84   esp: c1191e6c
ds: 007b   es: 007b   ss: 0068
Process kswapd0 (pid: 7, threadinfo=c1190000 task=c1195340)
Stack: c70dfa40 c1191e70 0000007b 00000080 c1190000 0000000d c1191e90 c016d892
        00000080 c1191ec4 c014524f 00000080 000000d0 000069a8 00110162 00000000
        00000029 00000000 c7ffeb60 000000ad c02ec9f4 00000001 c1191f08 c014662e
Call Trace:
  [<c016d892>] shrink_dcache_memory+0x22/0x30
  [<c014524f>] shrink_slab+0x10f/0x160
  [<c014662e>] balance_pgdat+0x1ce/0x1f0
  [<c0146729>] kswapd+0xd9/0xf0
  [<c0120640>] autoremove_wake_function+0x0/0x50
  [<c02a2f66>] ret_from_fork+0x6/0x14
  [<c0120640>] autoremove_wake_function+0x0/0x50
  [<c0146650>] kswapd+0x0/0xf0
  [<c010a2a9>] kernel_thread_helper+0x5/0xc

Code: 4f 14 a8 08 75 11 8b 47 08 ff 4f 14 a8 08 74 b3 e8 83 1a fb ff eb 
ac e8 7c 1a fb ff eb e8 0f 0b 95 00 5d 84 2b c0 e9 33 ff ff ff <0f> 0b 
94 00 5d 84 2b c0 e9 1a ff ff ff 55 89 e5 57 56 53 83 ec
  <6>note: kswapd0[7] exited with preempt_count 2




* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-07 21:24                                             ` Duncan Sands
  2003-12-07 22:24                                               ` Vince
@ 2003-12-07 22:54                                               ` Vince
  2003-12-08 10:10                                                 ` Duncan Sands
  1 sibling, 1 reply; 113+ messages in thread
From: Vince @ 2003-12-07 22:54 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel

Duncan Sands wrote:
> Hi Vincent, that's great!  I think the fix is solid, but can you please beat on it
> a bit just to be sure...
> 
> Thanks,
> 
> Duncan.

I'm not sure how to reproduce the previous oops (or even whether it was
really related to your patch...), but here follows a real, untainted
oops I finally got:

[9889]: shutting down for system reboot
printing eip:
c8ae8999
Oops: 0000 [#1]
PREEMPT
CPU:    0
EIP:    0060:[<c8ae8999>]    Not tainted VLI
EFLAGS: 00010286
EIP is at hcd_pci_release+0x19/0x20 [usbcore]
eax: c8c69d80   ebx: c637f050   ecx: c8af6c20   edx: c637f000
esi: c031e65c   edi: c031e680   ebp: c0019ec4   esp: c0019ec0
ds: 007b   es: 007b   ss: 0068
Process modem_run (pid: 8460, threadinfo=c0018000 task=c1508080)
Stack: c637f000 c0019ed0 c8ae455d c637f000 c0019ee8 c0203738 c637f048 c0019f00
c8ae77d6 c031e450 c0019f00 c01bc88f c637f050 c6b09200 c031e428 c031e440
c0019f10 c8ae08b6 c637f050 00000000 c0019f2c c02019e1 c6b092cc c0019f2c
Call Trace:
[<c8ae455d>] usb_host_release+0x1d/0x20 [usbcore]
[<c0203738>] class_dev_release+0x58/0x60
[<c8ae77d6>] usb_destroy_configuration+0xb6/0xf0 [usbcore]
[<c01bc88f>] kobject_cleanup+0x6f/0x80
[<c8ae08b6>] usb_release_dev+0x46/0x60 [usbcore]
[<c02019e1>] device_release+0x21/0x80
[<c01bc88f>] kobject_cleanup+0x6f/0x80
[<c8ae9b38>] usbdev_release+0x88/0xc0 [usbcore]
[<c0157a5c>] __fput+0x10c/0x120
[<c0156047>] filp_close+0x57/0x80
[<c01560d1>] sys_close+0x61/0x90
[<c02a302e>] sysenter_past_esp+0x43/0x65

Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 
ec 04 8b 45 08 8b 50 30 85 d2 74 0c 8b 82 08 01 00 00 89 14 24 <ff> 50 
28 c9 c3 89 f6 55 89 e5 57 56 53 83 ec 34 8b 5d 0c e8 3f
<0>Fatal exception: panic in 5 seconds




* Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-07 22:54                                               ` Vince
@ 2003-12-08 10:10                                                 ` Duncan Sands
  2003-12-08 16:03                                                   ` [linux-usb-devel] " David Brownell
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-08 10:10 UTC (permalink / raw)
  To: Vince; +Cc: Randy.Dunlap, mfedyk, zwane, linux-kernel, USB development list

Hi Vince, I'm not sure, but it looks like a bug in the USB core.
I was kind of expecting this :)  My patch causes devio.c to hold
a reference to the usb_device maybe long after the device has
been disconnected.  This is supposed to be OK, but from your
Oops it looks like some part of the hcd was finalized too early
(before devio.c dropped its reference to the usb_device).  Maybe
one of the USB guys can comment?

All the best,

Duncan.

On Sunday 07 December 2003 23:54, Vince wrote:
> Duncan Sands wrote:
> > Hi Vincent, that's great!  I think the fix is solid, but can you please
> > beat on it a bit just to be sure...
> >
> > Thanks,
> >
> > Duncan.
>
> I'm not sure how to reproduce the previous oops (or even whether it was
> really related to your patch...), but here follows a real, untainted
> oops I finally got:
>
> [9889]: shutting down for system reboot
> printing eip:
> c8ae8999
> Oops: 0000 [#1]
> PREEMPT
> CPU:    0
> EIP:    0060:[<c8ae8999>]    Not tainted VLI
> EFLAGS: 00010286
> EIP is at hcd_pci_release+0x19/0x20 [usbcore]
> eax: c8c69d80   ebx: c637f050   ecx: c8af6c20   edx: c637f000
> esi: c031e65c   edi: c031e680   ebp: c0019ec4   esp: c0019ec0
> ds: 007b   es: 007b   ss: 0068
> Process modem_run (pid: 8460, threadinfo=c0018000 task=c1508080)
> Stack: c637f000 c0019ed0 c8ae455d c637f000 c0019ee8 c0203738 c637f048 c0019f00
> c8ae77d6 c031e450 c0019f00 c01bc88f c637f050 c6b09200 c031e428 c031e440
> c0019f10 c8ae08b6 c637f050 00000000 c0019f2c c02019e1 c6b092cc c0019f2c
> Call Trace:
> [<c8ae455d>] usb_host_release+0x1d/0x20 [usbcore]
> [<c0203738>] class_dev_release+0x58/0x60
> [<c8ae77d6>] usb_destroy_configuration+0xb6/0xf0 [usbcore]
> [<c01bc88f>] kobject_cleanup+0x6f/0x80
> [<c8ae08b6>] usb_release_dev+0x46/0x60 [usbcore]
> [<c02019e1>] device_release+0x21/0x80
> [<c01bc88f>] kobject_cleanup+0x6f/0x80
> [<c8ae9b38>] usbdev_release+0x88/0xc0 [usbcore]
> [<c0157a5c>] __fput+0x10c/0x120
> [<c0156047>] filp_close+0x57/0x80
> [<c01560d1>] sys_close+0x61/0x90
> [<c02a302e>] sysenter_past_esp+0x43/0x65
>
> Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83
> ec 04 8b 45 08 8b 50 30 85 d2 74 0c 8b 82 08 01 00 00 89 14 24 <ff> 50
> 28 c9 c3 89 f6 55 89 e5 57 56 53 83 ec 34 8b 5d 0c e8 3f
> <0>Fatal exception: panic in 5 seconds


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 10:10                                                 ` Duncan Sands
@ 2003-12-08 16:03                                                   ` David Brownell
  2003-12-08 16:15                                                     ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: David Brownell @ 2003-12-08 16:03 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel, USB development list

Duncan Sands wrote:
> Hi Vince, I'm not sure, but it looks like a bug in the USB core.
> I was kind of expecting this :)  My patch causes devio.c to hold
> a reference to the usb_device maybe long after the device has
> been disconnected.  This is supposed to be OK, but from your

... no, that's not supposed to be OK.  Returning from disconnect()
means that a device driver is no longer referencing the interface
the driver bound to, or ep0.

- Dave



> Oops it looks like some part of the hcd was finalized too early
> (before devio.c dropped its reference to the usb_device).  Maybe
> one of the USB guys can comment?
> 




* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 16:03                                                   ` [linux-usb-devel] " David Brownell
@ 2003-12-08 16:15                                                     ` Duncan Sands
  2003-12-08 16:31                                                       ` Alan Stern
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-08 16:15 UTC (permalink / raw)
  To: David Brownell
  Cc: Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel, USB development list

On Monday 08 December 2003 17:03, David Brownell wrote:
> Duncan Sands wrote:
> > Hi Vince, I'm not sure, but it looks like a bug in the USB core.
> > I was kind of expecting this :)  My patch causes devio.c to hold
> > a reference to the usb_device maybe long after the device has
> > been disconnected.  This is supposed to be OK, but from your
>
> ... no, that's not supposed to be OK.  Returning from disconnect()
> means that a device driver is no longer referencing the interface
> the driver bound to, or ep0.

Well, I thought Greg wanted it to be OK :)  Anyway, I don't use
the device after disconnect except to take the semaphore
(dev->serialize), check for disconnection (dev->state), and
of course to execute a usb_put_dev.  Surely this usage should
be OK?

Ciao,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 16:15                                                     ` Duncan Sands
@ 2003-12-08 16:31                                                       ` Alan Stern
  2003-12-08 17:20                                                         ` David Brownell
  2003-12-08 17:59                                                         ` Duncan Sands
  0 siblings, 2 replies; 113+ messages in thread
From: Alan Stern @ 2003-12-08 16:31 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Mon, 8 Dec 2003, Duncan Sands wrote:

> On Monday 08 December 2003 17:03, David Brownell wrote:
> > Duncan Sands wrote:
> > > Hi Vince, I'm not sure, but it looks like a bug in the USB core.
> > > I was kind of expecting this :)  My patch causes devio.c to hold
> > > a reference to the usb_device maybe long after the device has
> > > been disconnected.  This is supposed to be OK, but from your
> >
> > ... no, that's not supposed to be OK.  Returning from disconnect()
> > means that a device driver is no longer referencing the interface
> > the driver bound to, or ep0.
> 
> Well, I thought Greg wanted it to be OK :)  Anyway, I don't use
> the device after disconnect except to take the semaphore
> (dev->serialize), check for disconnection (dev->state), and
> of course to execute a usb_put_dev.  Surely this usage should
> be OK?

As long as your disconnect routine doesn't do usb_put_dev, so that it
maintains its reference, I don't see a problem.  But why do you want to
check dev->state later on?  Once your disconnect routine has returned, you
should be totally through with the device.  You should no longer care
whether it's attached or not.

And of course, remember that there are valid reasons for your disconnect 
routine to be called even when the device remains attached.  (rmmod is a 
good example.)

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 16:31                                                       ` Alan Stern
@ 2003-12-08 17:20                                                         ` David Brownell
  2003-12-08 17:59                                                         ` Duncan Sands
  1 sibling, 0 replies; 113+ messages in thread
From: David Brownell @ 2003-12-08 17:20 UTC (permalink / raw)
  To: Alan Stern, Duncan Sands
  Cc: Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel, USB development list


>>>>...  hold
>>>>a reference to the usb_device maybe long after the device has
>>>>been disconnected.  This is supposed to be OK, but from your
>>>
>>>... no, that's not supposed to be OK.  Returning from disconnect()
>>>means that a device driver is no longer referencing the interface
>>>the driver bound to, or ep0.
>>
>>Well, I thought Greg wanted it to be OK :)  Anyway, I don't use
>>the device after disconnect except to take the semaphore
>>(dev->serialize), check for disconnection (dev->state), and
>>of course to execute a usb_put_dev.  Surely this usage should
>>be OK?

Why do you even need that much though?  You're not allowed to
be USING the device any more; that's the sense in which I
was using "reference".   Refcounting is orthogonal, except
in the sense that to use without owning/borrowing a refcount
will likely cause oopsing someday.


> As long as your disconnect routine doesn't do usb_put_dev, so that it

There's an implicit usb_get_dev() associated with probe(),
and an implicit usb_put_dev() associated with disconnect().
If you're going to add an explicit put(), you need to also
add an explicit get().  Few drivers do; most rely on the
implicit refcounts.

But if you keep an extra reference to the device, you'd
need some way to get rid of it.
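
The pairing rule above can be modelled in a few lines of user-space C.
This is only a sketch: rc_dev, rc_get() and rc_put() are made-up
stand-ins for the usb_device and usb_get_dev()/usb_put_dev(), and the
initial count of one is an assumption representing whoever created the
device.  It shows that an explicit get at open time keeps the object
alive past the implicit put at disconnect, and that the matching
explicit put is what finally releases it.

```c
/* Hypothetical user-space model; not the kernel's struct usb_device. */
struct rc_dev {
	int refcount;	/* number of outstanding references */
	int freed;	/* set once the last reference is dropped */
};

/* Take a reference (the role usb_get_dev() plays in this thread). */
struct rc_dev *rc_get(struct rc_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

/* Drop a reference; the object is "released" when the count hits zero. */
void rc_put(struct rc_dev *d)
{
	if (d && --d->refcount == 0)
		d->freed = 1;
}

/* The implicit pair the core wraps around probe() and disconnect(). */
void rc_core_bind(struct rc_dev *d)
{
	rc_get(d);
}

void rc_core_unbind(struct rc_dev *d)
{
	rc_put(d);
}

/* An explicit extra reference owned by an open file. */
struct rc_dev *rc_file_open(struct rc_dev *d)
{
	return rc_get(d);
}

/* The explicit put that balances rc_file_open(). */
void rc_file_release(struct rc_dev *d)
{
	rc_put(d);
}
```

If rc_file_release() is never called the count never reaches zero,
which is the "some way to get rid of it" problem.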

Yes, "usbfs" is weird in lots of ways ... it's got references
associated with several distinct roles, including implicitly
associated with device creation, and so I'd suspect it doesn't
keep them all straight.

Plus, using the claim/release binding model (in its current
state) opens it up to a different family of bugs ... since
that doesn't hook up properly to the driver model yet, and
making it do so is non-trivial.


> maintains its reference, I don't see a problem.  But why do you want to
> check dev->state later on?  Once your disconnect routine has returned, you
> should be totally through with the device.  You should no longer care
> whether it's attached or not.
> 
> And of course, remember that there are valid reasons for your disconnect 
> routine to be called even when the device remains attached.  (rmmod is a 
> good example.)

And adding special case logic for rmmod paths isn't a good thing;
better just to implement disconnect() as I described.

- Dave





* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 16:31                                                       ` Alan Stern
  2003-12-08 17:20                                                         ` David Brownell
@ 2003-12-08 17:59                                                         ` Duncan Sands
  2003-12-08 18:35                                                           ` Alan Stern
  2003-12-12  2:21                                                           ` David Brownell
  1 sibling, 2 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-08 17:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Monday 08 December 2003 17:31, Alan Stern wrote:
> On Mon, 8 Dec 2003, Duncan Sands wrote:
> > On Monday 08 December 2003 17:03, David Brownell wrote:
> > > Duncan Sands wrote:
> > > > Hi Vince, I'm not sure, but it looks like a bug in the USB core.
> > > > I was kind of expecting this :)  My patch causes devio.c to hold
> > > > a reference to the usb_device maybe long after the device has
> > > > been disconnected.  This is supposed to be OK, but from your
> > >
> > > ... no, that's not supposed to be OK.  Returning from disconnect()
> > > means that a device driver is no longer referencing the interface
> > > the driver bound to, or ep0.
> >
> > Well, I thought Greg wanted it to be OK :)  Anyway, I don't use
> > the device after disconnect except to take the semaphore
> > (dev->serialize), check for disconnection (dev->state), and
> > of course to execute a usb_put_dev.  Surely this usage should
> > be OK?
>
> As long as your disconnect routine doesn't do usb_put_dev, so that it
> maintains its reference, I don't see a problem.  But why do you want to
> check dev->state later on?  Once your disconnect routine has returned, you
> should be totally through with the device.  You should no longer care
> whether it's attached or not.

Hi Alan, this is for usbfs, not a normal driver.  Recall that I want to replace
use of ps->devsem with ps->dev->serialize.  Currently ps->dev is set to NULL in
the devio.c usbfs disconnect method (if some interface is claimed) or in
inode.c on device disconnect, making it hard to lock with ps->dev->serialize :)
Thus disconnect should no longer be signalled by setting ps->dev to NULL.
For the same reason ps->dev should not be freed on disconnect.  It follows
that I should hold a reference to ps->dev until ps goes down.  And this is
what I do.  By the way, rather than introducing a new flag to indicate
disconnection, ps->dev->state will do.
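
As a rough user-space illustration of that scheme (not the patch
itself): the file keeps a counted pointer to the device for its whole
lifetime, and disconnection is reported through a state field that is
checked under the device lock rather than by clearing the pointer.  All
fs_* names are hypothetical, and the "lock" is just a flag standing in
for dev->serialize.

```c
#include <errno.h>

enum fs_state { FS_ATTACHED, FS_NOTATTACHED };

/* Hypothetical stand-in for the usb_device. */
struct fs_dev {
	int locked;		/* stand-in for dev->serialize */
	int refcount;
	enum fs_state state;
};

/* Hypothetical stand-in for struct dev_state. */
struct fs_file {
	struct fs_dev *dev;	/* valid from open until release */
};

void fs_down(struct fs_dev *d) { d->locked = 1; }
void fs_up(struct fs_dev *d)   { d->locked = 0; }

/* open: take a reference that lives as long as the file does */
void fs_open(struct fs_file *f, struct fs_dev *d)
{
	d->refcount++;
	f->dev = d;
}

/* disconnect: only flip the state; f->dev is left alone */
void fs_disconnect(struct fs_dev *d)
{
	fs_down(d);
	d->state = FS_NOTATTACHED;
	fs_up(d);
}

/* any entry point: the pointer is safe to use, the state says if it is live */
int fs_ioctl(struct fs_file *f)
{
	struct fs_dev *d = f->dev;
	int ret = 0;

	fs_down(d);
	if (d->state == FS_NOTATTACHED)
		ret = -ENODEV;
	fs_up(d);
	return ret;
}

/* release: drop the reference taken in fs_open() */
void fs_release(struct fs_file *f)
{
	f->dev->refcount--;
	f->dev = 0;
}
```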

> And of course, remember that there are valid reasons for your disconnect
> routine to be called even when the device remains attached.  (rmmod is a
> good example.)

Sure.

All the best,

Duncan.

PS: Here is the patch that fixed the original usbfs Oops, but introduced the new
one Vince reported:

diff -Nru a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
--- a/drivers/usb/core/devio.c	Sun Dec  7 01:20:31 2003
+++ b/drivers/usb/core/devio.c	Sun Dec  7 01:20:31 2003
@@ -87,17 +87,15 @@
 static ssize_t usbdev_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 {
 	struct dev_state *ps = (struct dev_state *)file->private_data;
+	struct usb_device *dev = ps->dev;
 	ssize_t ret = 0;
 	unsigned len;
 	loff_t pos;
 	int i;
 
 	pos = *ppos;
-	down_read(&ps->devsem);
-	if (!ps->dev) {
-		ret = -ENODEV;
-		goto err;
-	} else if (pos < 0) {
+	down(&dev->serialize);
+	if (pos < 0) {
 		ret = -EINVAL;
 		goto err;
 	}
@@ -106,7 +104,7 @@
 		len = sizeof(struct usb_device_descriptor) - pos;
 		if (len > nbytes)
 			len = nbytes;
-		if (copy_to_user(buf, ((char *)&ps->dev->descriptor) + pos, len)) {
+		if (copy_to_user(buf, ((char *)&dev->descriptor) + pos, len)) {
 			ret = -EFAULT;
 			goto err;
 		}
@@ -118,9 +116,9 @@
 	}
 
 	pos = sizeof(struct usb_device_descriptor);
-	for (i = 0; nbytes && i < ps->dev->descriptor.bNumConfigurations; i++) {
+	for (i = 0; nbytes && i < dev->descriptor.bNumConfigurations; i++) {
 		struct usb_config_descriptor *config =
-			(struct usb_config_descriptor *)ps->dev->rawdescriptors[i];
+			(struct usb_config_descriptor *)dev->rawdescriptors[i];
 		unsigned int length = le16_to_cpu(config->wTotalLength);
 
 		if (*ppos < pos + length) {
@@ -129,7 +127,7 @@
 				len = nbytes;
 
 			if (copy_to_user(buf,
-			    ps->dev->rawdescriptors[i] + (*ppos - pos), len)) {
+			    dev->rawdescriptors[i] + (*ppos - pos), len)) {
 				ret = -EFAULT;
 				goto err;
 			}
@@ -144,7 +142,7 @@
 	}
 
 err:
-	up_read(&ps->devsem);
+	up(&dev->serialize);
 	return ret;
 }
 
@@ -324,22 +322,18 @@
 static void driver_disconnect(struct usb_interface *intf)
 {
 	struct dev_state *ps = usb_get_intfdata (intf);
+	unsigned int ifnum = intf->altsetting->desc.bInterfaceNumber;
 
 	if (!ps)
 		return;
 
-	/* this waits till synchronous requests complete */
-	down_write (&ps->devsem);
-
 	/* prevent new I/O requests */
-	ps->dev = 0;
-	ps->ifclaimed = 0;
 	usb_set_intfdata (intf, NULL);
+	if (ifnum < 8*sizeof(ps->ifclaimed))
+		clear_bit(ifnum, &ps->ifclaimed);
 
 	/* force async requests to complete */
-	destroy_all_async (ps);
-
-	up_write (&ps->devsem);
+	destroy_async_on_interface (ps, ifnum);
 }
 
 struct usb_driver usbdevfs_driver = {
@@ -355,21 +349,18 @@
 	struct usb_interface *iface;
 	int err;
 
-	if (intf >= 8*sizeof(ps->ifclaimed) || !dev
-			|| intf >= dev->actconfig->desc.bNumInterfaces)
+	if (intf >= 8*sizeof(ps->ifclaimed) || intf >= dev->actconfig->desc.bNumInterfaces)
 		return -EINVAL;
-	/* already claimed */
-	if (test_bit(intf, &ps->ifclaimed))
-		return 0;
-	iface = dev->actconfig->interface[intf];
+	if (test_and_set_bit(intf, &ps->ifclaimed))
+		return 0; /* already claimed */
 	err = -EBUSY;
-	lock_kernel();
+	iface = dev->actconfig->interface[intf]; /* change to usb_ifnum_to_if here and elsewhere? */
 	if (!usb_interface_claimed(iface)) {
 		usb_driver_claim_interface(&usbdevfs_driver, iface, ps);
-		set_bit(intf, &ps->ifclaimed);
 		err = 0;
 	}
-	unlock_kernel();
+	if (err)
+		clear_bit(intf, &ps->ifclaimed);
 	return err;
 }
 
@@ -383,13 +374,13 @@
 		return -EINVAL;
 	err = -EINVAL;
 	dev = ps->dev;
-	down(&dev->serialize);
+
+	/* the configuration cannot change under us if we have a claimed interface */
 	if (test_and_clear_bit(intf, &ps->ifclaimed)) {
 		iface = dev->actconfig->interface[intf];
-		usb_driver_release_interface(&usbdevfs_driver, iface);
+		usb_driver_release_interface(&usbdevfs_driver, iface); /* may sleep - beware for BKL! */
 		err = 0;
 	}
-	up(&dev->serialize);
 	return err;
 }
 
@@ -492,7 +483,7 @@
 
 	lock_kernel();
 	ret = -ENOENT;
-	dev = inode->u.generic_ip;
+	dev = usb_get_dev (inode->u.generic_ip);
 	if (!dev) {
 		kfree(ps);
 		goto out;
@@ -504,7 +495,6 @@
 	INIT_LIST_HEAD(&ps->async_pending);
 	INIT_LIST_HEAD(&ps->async_completed);
 	init_waitqueue_head(&ps->wait);
-	init_rwsem(&ps->devsem);
 	ps->discsignr = 0;
 	ps->disctask = current;
 	ps->disccontext = NULL;
@@ -521,18 +511,21 @@
 static int usbdev_release(struct inode *inode, struct file *file)
 {
 	struct dev_state *ps = (struct dev_state *)file->private_data;
+	struct usb_device *dev = ps->dev;
 	unsigned int i;
 
-	lock_kernel();
+	/* the only race is with driver_disconnect */
+	down(&dev->serialize);
 	list_del_init(&ps->list);
 
-	if (ps->dev) {
+	if (dev->state != USB_STATE_NOTATTACHED)
 		for (i = 0; ps->ifclaimed && i < 8*sizeof(ps->ifclaimed); i++)
 			if (test_bit(i, &ps->ifclaimed))
-				releaseintf(ps, i);
-	}
-	unlock_kernel();
+				releaseintf(ps, i); /* may sleep - makes the BKL problematic */
 	destroy_all_async(ps);
+	up(&dev->serialize);
+	usb_put_dev (ps->dev);
+	ps->dev = NULL;
 	kfree(ps);
         return 0;
 }
@@ -712,22 +705,32 @@
 
 static int proc_resetdevice(struct dev_state *ps)
 {
+	struct usb_device *dev = ps->dev;
 	int i, ret;
 
-	ret = usb_reset_device(ps->dev);
+	up(&dev->serialize);
+	ret = usb_reset_device(dev);
+	down(&dev->serialize);
+	if (!ret && dev->state == USB_STATE_NOTATTACHED)
+		ret = -ENODEV;
 	if (ret < 0)
 		return ret;
 
-	for (i = 0; i < ps->dev->actconfig->desc.bNumInterfaces; i++) {
-		struct usb_interface *intf = ps->dev->actconfig->interface[i];
+	for (i = 0; i < dev->actconfig->desc.bNumInterfaces; i++) {
+		struct usb_interface *intf = dev->actconfig->interface[i];
 
 		/* Don't simulate interfaces we've claimed */
 		if (test_bit(i, &ps->ifclaimed))
 			continue;
 
 		err ("%s - this function is broken", __FUNCTION__);
-		if (intf->driver && ps->dev) {
+		if (intf->driver) {
+			up(&dev->serialize);
 			usb_probe_interface (&intf->dev);
+			down(&dev->serialize);
+		}
+		if (dev->state == USB_STATE_NOTATTACHED) {
+			return -ENODEV;
 		}
 	}
 
@@ -976,18 +979,19 @@
         DECLARE_WAITQUEUE(wait, current);
 	struct async *as = NULL;
 	void __user *addr;
+	struct usb_device *dev = ps->dev;
 	int ret;
 
 	add_wait_queue(&ps->wait, &wait);
-	while (ps->dev) {
+	while (dev->state != USB_STATE_NOTATTACHED) {
 		__set_current_state(TASK_INTERRUPTIBLE);
 		if ((as = async_getcompleted(ps)))
 			break;
 		if (signal_pending(current))
 			break;
-		up_read(&ps->devsem);
+		up(&dev->serialize);
 		schedule();
-		down_read(&ps->devsem);
+		down(&dev->serialize);
 	}
 	remove_wait_queue(&ps->wait, &wait);
 	set_current_state(TASK_RUNNING);
@@ -1089,57 +1093,53 @@
 		}
 	}
 
-       if (!ps->dev)
-               retval = -ENODEV;
-       else if (!(ifp = usb_ifnum_to_if (ps->dev, ctrl.ifno)))
+       if (!(ifp = usb_ifnum_to_if (ps->dev, ctrl.ifno)))
                retval = -EINVAL;
-       else switch (ctrl.ioctl_code) {
-
-       /* disconnect kernel driver from interface, leaving it unbound.  */
-       /* maybe unbound - you get no guarantee it stays unbound */
-       case USBDEVFS_DISCONNECT:
-		/* this function is misdesigned - retained for compatibility */
+	else {
 		lock_kernel();
-		driver = ifp->driver;
-		if (driver) {
-			dbg ("disconnect '%s' from dev %d interface %d",
-			     driver->name, ps->dev->devnum, ctrl.ifno);
-			usb_unbind_interface(&ifp->dev);
-		} else
-			retval = -ENODATA;
-		unlock_kernel();
-		break;
+		up(&ps->dev->serialize);
+		switch (ctrl.ioctl_code) {
 
-	/* let kernel drivers try to (re)bind to the interface */
-	case USBDEVFS_CONNECT:
-		lock_kernel();
-		retval = usb_probe_interface (&ifp->dev);
-		unlock_kernel();
-		break;
+		/* disconnect kernel driver from interface, leaving it unbound.  */
+		/* maybe unbound - you get no guarantee it stays unbound */
+		case USBDEVFS_DISCONNECT:
+			/* this function is misdesigned - retained for compatibility */
+			driver = ifp->driver;
+			if (driver) {
+				dbg ("disconnect '%s' from dev %d interface %d",
+					driver->name, ps->dev->devnum, ctrl.ifno);
+				usb_unbind_interface(&ifp->dev);
+			} else
+				retval = -ENODATA;
+			break;
 
-	/* talk directly to the interface's driver */
-	default:
-		/* BKL used here to protect against changing the binding
-		 * of this driver to this device, as well as unloading its
-		 * driver module.
-		 */
-		lock_kernel ();
-		driver = ifp->driver;
-		if (driver == 0 || driver->ioctl == 0) {
-			unlock_kernel();
-			retval = -ENOSYS;
-		} else {
-			if (!try_module_get (driver->owner)) {
-				unlock_kernel();
+		/* let kernel drivers try to (re)bind to the interface */
+		case USBDEVFS_CONNECT:
+			retval = usb_probe_interface (&ifp->dev);
+			break;
+
+		/* talk directly to the interface's driver */
+		default:
+			/* BKL used here to protect against changing the binding
+			* of this driver to this device, as well as unloading its
+			* driver module.
+			*/
+			driver = ifp->driver;
+			if (driver == 0 || driver->ioctl == 0) {
 				retval = -ENOSYS;
-				break;
+			} else {
+				if (!try_module_get (driver->owner)) {
+					retval = -ENOSYS;
+					break;
+				}
+				retval = driver->ioctl (ifp, ctrl.ioctl_code, buf);
+				if (retval == -ENOIOCTLCMD)
+					retval = -ENOTTY;
+				module_put (driver->owner);
 			}
-			unlock_kernel ();
-			retval = driver->ioctl (ifp, ctrl.ioctl_code, buf);
-			if (retval == -ENOIOCTLCMD)
-				retval = -ENOTTY;
-			module_put (driver->owner);
 		}
+		down(&ps->dev->serialize);
+		unlock_kernel();
 	}
 
 	/* cleanup and return */
@@ -1161,13 +1161,15 @@
 static int usbdev_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct dev_state *ps = (struct dev_state *)file->private_data;
+	struct usb_device *dev = ps->dev;
 	int ret = -ENOTTY;
 
 	if (!(file->f_mode & FMODE_WRITE))
 		return -EPERM;
-	down_read(&ps->devsem);
-	if (!ps->dev) {
-		up_read(&ps->devsem);
+
+	down(&dev->serialize);
+	if (dev->state == USB_STATE_NOTATTACHED) {
+		up(&dev->serialize);
 		return -ENODEV;
 	}
 	switch (cmd) {
@@ -1212,7 +1214,9 @@
 		break;
 
 	case USBDEVFS_SETCONFIGURATION:
+		up(&dev->serialize);
 		ret = proc_setconfig(ps, (void __user *)arg);
+		down(&dev->serialize);
 		break;
 
 	case USBDEVFS_SUBMITURB:
@@ -1233,6 +1237,10 @@
 		ret = proc_reapurbnonblock(ps, (void __user *)arg);
 		break;
 
+	case USBDEVFS_RELEASEINTERFACE:
+		ret = proc_releaseinterface(ps, (void __user *)arg);
+		break;
+
 	case USBDEVFS_DISCSIGNAL:
 		ret = proc_disconnectsignal(ps, (void __user *)arg);
 		break;
@@ -1241,15 +1249,12 @@
 		ret = proc_claiminterface(ps, (void __user *)arg);
 		break;
 
-	case USBDEVFS_RELEASEINTERFACE:
-		ret = proc_releaseinterface(ps, (void __user *)arg);
-		break;
-
 	case USBDEVFS_IOCTL:
 		ret = proc_ioctl(ps, (void __user *) arg);
-		break;
+	break;
 	}
-	up_read(&ps->devsem);
+	up(&dev->serialize);
+
 	if (ret >= 0)
 		inode->i_atime = CURRENT_TIME;
 	return ret;
@@ -1264,7 +1269,7 @@
 	poll_wait(file, &ps->wait, wait);
 	if (file->f_mode & FMODE_WRITE && !list_empty(&ps->async_completed))
 		mask |= POLLOUT | POLLWRNORM;
-	if (!ps->dev)
+	if (ps->dev->state == USB_STATE_NOTATTACHED)
 		mask |= POLLERR | POLLHUP;
 	return mask;
 }
diff -Nru a/drivers/usb/core/inode.c b/drivers/usb/core/inode.c
--- a/drivers/usb/core/inode.c	Sun Dec  7 01:20:31 2003
+++ b/drivers/usb/core/inode.c	Sun Dec  7 01:20:31 2003
@@ -717,9 +717,9 @@
 	while (!list_empty(&dev->filelist)) {
 		ds = list_entry(dev->filelist.next, struct dev_state, list);
 		list_del_init(&ds->list);
-		down_write(&ds->devsem);
-		ds->dev = NULL;
-		up_write(&ds->devsem);
+//		down_write(&ds->devsem);
+//		ds->dev = NULL;
+//		up_write(&ds->devsem);
 		if (ds->discsignr) {
 			sinfo.si_signo = SIGPIPE;
 			sinfo.si_errno = EPIPE;
diff -Nru a/include/linux/usbdevice_fs.h b/include/linux/usbdevice_fs.h
--- a/include/linux/usbdevice_fs.h	Sun Dec  7 01:20:31 2003
+++ b/include/linux/usbdevice_fs.h	Sun Dec  7 01:20:31 2003
@@ -154,7 +154,6 @@
 
 struct dev_state {
 	struct list_head list;      /* state list */
-	struct rw_semaphore devsem; /* protects modifications to dev (dev == NULL indicating disconnect) */ 
 	struct usb_device *dev;
 	struct file *file;
 	spinlock_t lock;            /* protects the async urb lists */


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 17:59                                                         ` Duncan Sands
@ 2003-12-08 18:35                                                           ` Alan Stern
  2003-12-08 19:53                                                             ` Duncan Sands
  2003-12-12  2:21                                                           ` David Brownell
  1 sibling, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-08 18:35 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Mon, 8 Dec 2003, Duncan Sands wrote:

> Hi Alan, this is for usbfs, not a normal driver.  Recall that I want to replace
> use of ps->devsem with ps->dev->serialize.

Maybe you shouldn't do that.  Other drivers maintain their own data 
structure separately from the struct usb_device and with its own lock.  
But usbfs may suffer from complications as a result of its unorthodox 
approach to device ownership.

>  Currently ps->dev is set to NULL in
> the devio.c usbfs disconnect method (if some interface is claimed) or in
> inode.c on device disconnect, making it hard to lock with ps->dev->serialize :)
> Thus disconnect should no longer be signalled by setting ps->dev to NULL.

If you would keep the ps->devsem lock, would there be any problem in 
setting ps->dev to NULL to indicate disconnection?

Are there any reasons for not keeping ps->devsem?  Since usbfs generally 
acts as a driver and drivers generally don't have to concern themselves 
with usbdev->serialize (the core handles it for them), shouldn't usbfs 
also be able to ignore ps->dev->serialize?

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 18:35                                                           ` Alan Stern
@ 2003-12-08 19:53                                                             ` Duncan Sands
  2003-12-08 21:32                                                               ` Alan Stern
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-08 19:53 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

Hi Alan,

> But usbfs may suffer from complications as a result of its unorthodox
> approach to device ownership.

Yes, you have put your finger on it.

> >  Currently ps->dev is set to NULL in
> > the devio.c usbfs disconnect method (if some interface is claimed) or in
> > inode.c on device disconnect, making it hard to lock with
> > ps->dev->serialize :) Thus disconnect should no longer be signalled by
> > setting ps->dev to NULL.
>
> If you would keep the ps->devsem lock, would there be any problem in
> setting ps->dev to NULL to indicate disconnection?

You can't keep the ps->devsem lock and use ps->dev->serialize, because it
leads to deadlock.  Actually, simply replacing ps->devsem with ps->dev->serialize
cannot lead to any new deadlocks, it makes deadlocks that could occasionally
happen always happen (such deadlocks exist right now in usbfs).  Some of the
current deadlocks can be eliminated without giving up ps->devsem, but not all.
So the question is: must ps->dev->serialize be used?

> Are there any reasons for not keeping ps->devsem?  Since usbfs generally
> acts as a driver and drivers generally don't have to concern themselves
> with usbdev->serialize (the core handles it for them), shouldn't usbfs
> also be able to ignore ps->dev->serialize?

No, because it needs to do operations on interfaces it hasn't claimed (such
as looking them up and claiming them).  This is why it needs to protect
itself, at least momentarily, against configurations shifting under it.  This
can be done by using the BKL more.  However it can be done more simply
using ps->dev->serialize (in fact it is simpler than what is there now).
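
A small user-space sketch of what "looking up and claiming under the
lock" could look like.  The cl_* names, the owner_mask field and the
flag used as a lock are all assumptions made for illustration; the real
code goes through usb_driver_claim_interface().  The point is only that
the bounds check against the current configuration and the claim itself
happen inside one critical section, so the configuration cannot shift
between them.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the usb_device. */
struct cl_dev {
	int held;			/* stand-in for dev->serialize */
	unsigned int num_interfaces;	/* changes when the configuration does */
	unsigned long owner_mask;	/* bit i set: interface i has a driver */
};

/* Hypothetical stand-in for struct dev_state. */
struct cl_file {
	struct cl_dev *dev;
	unsigned long ifclaimed;	/* bit i set: this file claimed interface i */
};

int cl_claim(struct cl_file *f, unsigned int ifnum)
{
	struct cl_dev *d = f->dev;
	unsigned long bit;
	int ret = 0;

	/* the bitmask can only describe this many interfaces */
	if (ifnum >= CHAR_BIT * sizeof(f->ifclaimed))
		return -EINVAL;
	bit = 1UL << ifnum;

	d->held = 1;			/* down(&dev->serialize) in the kernel */
	if (ifnum >= d->num_interfaces)
		ret = -EINVAL;		/* not in the current configuration */
	else if (f->ifclaimed & bit)
		ret = 0;		/* already claimed by this file */
	else if (d->owner_mask & bit)
		ret = -EBUSY;		/* bound to another driver */
	else {
		d->owner_mask |= bit;
		f->ifclaimed |= bit;
	}
	d->held = 0;			/* up(&dev->serialize) */
	return ret;
}
```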

By the way, if it is somehow fatal to do usb_put_dev after disconnect,
what is the point of reference counting at all?  You might as well
free up the usb_device structure immediately after disconnect, since
there is sure to be a reference before disconnect, and (apparently)
there had better not be a reference after disconnect...

All the best,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 19:53                                                             ` Duncan Sands
@ 2003-12-08 21:32                                                               ` Alan Stern
  2003-12-08 21:55                                                                 ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-08 21:32 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Mon, 8 Dec 2003, Duncan Sands wrote:

> > If you would keep the ps->devsem lock, would there be any problem in
> > setting ps->dev to NULL to indicate disconnection?
> 
> You can't keep the ps->devsem lock and use ps->dev->serialize, because it
> leads to deadlock.

How so?  Remember that I am almost totally unfamiliar with the details of 
the usbfs code.  Are you saying there are places where the driver holds 
one lock and needs to acquire the other and vice versa?

>  Actually, simply replacing ps->devsem with ps->dev->serialize
> cannot lead to any new deadlocks, it makes deadlocks that could occasionally
> happen always happen (such deadlocks exist right now in usbfs).  Some of the
> current deadlocks can be eliminated without giving up ps->devsem, but not all.
> So the question is: must ps->dev->serialize be used?

It must be held when you call usb_reset_configuration().  It must _not_ be
held when you call usb_set_configuration().  For usb_reset_device() right
now you must not hold it, although that may change in the future.  For
usb_unbind_interface() you must not hold it.  There's a note that
usb_driver_claim_interface() grabs the BKL for some reason having to do
with usbfs -- no doubt when usbfs is fixed that won't be needed and the 
caller will be required to hold dev->serialize instead.

If you call usb_ifnum_to_if() you ought to hold the serialize lock; 
otherwise the configuration might change out from under you.  But it's not 
necessary.  Likewise for usb_epnum_to_ep_desc if you're looking up an 
endpoint that isn't part of an interface you have bound.
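
For the calls that must be made without the lock, a caller that
normally holds it has to drop it, make the call, retake it, and then
re-check the device state, since a disconnect may have slipped in while
the lock was released.  The following user-space sketch models that
pattern; the lk_* names and the flag standing in for dev->serialize are
hypothetical, and the callees merely simulate a routine such as
usb_reset_device().

```c
#include <errno.h>

enum lk_state { LK_ATTACHED, LK_NOTATTACHED };

/* Hypothetical stand-in for the usb_device. */
struct lk_dev {
	int held;		/* 1 while the stand-in lock is held */
	enum lk_state state;
};

void lk_down(struct lk_dev *d) { d->held = 1; }
void lk_up(struct lk_dev *d)   { d->held = 0; }

/* A callee that must be entered with the lock NOT held. */
typedef int (*lk_unlocked_fn)(struct lk_dev *d);

/* The caller holds the lock on entry and holds it again on return. */
int lk_call_unlocked(struct lk_dev *d, lk_unlocked_fn fn)
{
	int ret;

	lk_up(d);
	ret = fn(d);
	lk_down(d);
	/* the device may have gone away while the lock was dropped */
	if (!ret && d->state == LK_NOTATTACHED)
		ret = -ENODEV;
	return ret;
}

/* Simulated callee: succeeds, but complains if entered with the lock held. */
int lk_reset_ok(struct lk_dev *d)
{
	return d->held ? -EDEADLK : 0;
}

/* Simulated callee during which the device gets unplugged. */
int lk_reset_unplug(struct lk_dev *d)
{
	d->state = LK_NOTATTACHED;
	return d->held ? -EDEADLK : 0;
}
```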

> > Are there any reasons for not keeping ps->devsem?  Since usbfs generally
> > acts as a driver and drivers generally don't have to concern themselves
> > with usbdev->serialize (the core handles it for them), shouldn't usbfs
> > also be able to ignore ps->dev->serialize?
> 
> No, because it needs to do operations on interfaces it hasn't claimed (such
> as looking them up and claiming them).  This is why it needs to protect
> itself, at least momentarily, against configurations shifting under it.  This
> can be done by using the BKL more.  However it can be done more simply
> using ps->dev->serialize (in fact it is simpler than what is there now).

That agrees with my assessment.  It ought to be possible to remove these 
references to the BKL in favor of ps->dev->serialize.


> By the way, if it is somehow fatal to do usb_put_dev after disconnect,
> what is the point of reference counting at all?  You might as well
> free up the usb_device structure immediately after disconnect, since
> there is sure to be a reference before disconnect, and (apparently)
> there had better not be a reference after disconnect...

There's some sort of misunderstanding here.  It's not fatal to do 
usb_put_dev() after disconnect, provided you called usb_get_dev() earlier.
I'm not sure what the cause was of the oops you were getting, but it 
wasn't that.

Alan Stern




* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 21:32                                                               ` Alan Stern
@ 2003-12-08 21:55                                                                 ` Duncan Sands
  2003-12-08 23:09                                                                   ` Alan Stern
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-08 21:55 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> > You can't keep the ps->devsem lock and use ps->dev->serialize, because it
> > leads to deadlock.
>
> How so?  Remember that I am almost totally unfamiliar with the details of
> the usbfs code.  Are you saying there are places where the driver holds
> one lock and needs to acquire the other and vice versa?

Yes.  ps->devsem is used to protect against disconnection: all top level
routines take it (as a read lock), and in driver_disconnect it is taken as a
write lock.  Top level routines call lower level routines which sometimes
need to take dev->serialize (and do already in several places).

Thus: ps->devsem taken, then dev->serialize.

However, dev->serialize is taken by the USB core before calling
driver_disconnect.

Thus: dev->serialize taken, then ps->devsem.
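
The two lock orders above can be sketched in userspace.  This is only a
minimal sketch under stated assumptions: plain pthread mutexes stand in for
ps->devsem (really a read/write semaphore) and dev->serialize, a single
thread plays both roles, and lock_orders_conflict() is a hypothetical name,
not usbfs code.  It uses trylock so that the conflict is reported instead of
hanging.

```c
#include <errno.h>
#include <pthread.h>

/* Assumed stand-ins for ps->devsem and dev->serialize. */
static pthread_mutex_t devsem = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t serialize = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns 1 if, once each path holds its first lock, neither path can
 * take its second lock (the ABBA pattern), 0 otherwise.
 */
int lock_orders_conflict(void)
{
	int usbfs_blocked, core_blocked;

	/* usbfs top-level routine: takes devsem first. */
	pthread_mutex_lock(&devsem);
	/* USB core before driver_disconnect: takes serialize first. */
	pthread_mutex_lock(&serialize);

	/* usbfs lower-level routine now wants serialize. */
	usbfs_blocked = (pthread_mutex_trylock(&serialize) == EBUSY);
	/* driver_disconnect now wants devsem. */
	core_blocked = (pthread_mutex_trylock(&devsem) == EBUSY);

	pthread_mutex_unlock(&serialize);
	pthread_mutex_unlock(&devsem);

	return usbfs_blocked && core_blocked;
}
```

With blocking lock calls in two real threads, the same interleaving would
leave each thread waiting on the other forever.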

> >  Actually, simply replacing ps->devsem with ps->dev->serialize
> > cannot lead to any new deadlocks, it makes deadlocks that could
> > occasionally happen always happen (such deadlocks exist right now in
> > usbfs).  Some of the current deadlocks can be eliminated without giving
> > up ps->devsem, but not all. So the question is: must ps->dev->serialize
> > be used?
>
> It must be held when you call usb_reset_configuration().  It must _not_ be
> held when you call usb_set_configuration().  For usb_reset_device() right
> now you must not hold it, although that may change in the future.  For
> usb_unbind_interface() you must not hold it.  There's a note that
> usb_driver_claim_interface() grabs the BKL for some reason having to do
> with usbfs -- no doubt when usbfs is fixed that won't be needed and the
> caller will be required to hold dev->serialize instead.

Right.  And why should (for example) dev->serialize not be held when it
calls usb_set_configuration? - because usb_set_configuration takes
dev->serialize.  This is one of the places I mentioned above where
deadlock can occur right now.

> If you call usb_ifnum_to_if() you ought to hold the serialize lock;
> otherwise the configuration might change out from under you.  But it's not
> necessary.  Likewise for usb_epnum_to_ep_desc if you're looking up an
> endpoint that isn't part of an interface you have bound.

Why isn't it necessary?  As far as I can see it is vital.

> > > Are there any reasons for not keeping ps->devsem?  Since usbfs generally
> > > acts as a driver and drivers generally don't have to concern themselves
> > > with usbdev->serialize (the core handles it for them), shouldn't usbfs
> > > also be able to ignore ps->dev->serialize?
> >
> > No, because it needs to do operations on interfaces it hasn't claimed
> > (such as looking them up and claiming them).  This is why it needs to
> > protect itself, at least momentarily, against configurations shifting
> > under it.  This can be done by using the BKL more.  However it can be
> > done more simply using ps->dev->serialize (in fact it is simpler than
> > what is there now).
>
> That agrees with my assessment.  It ought to be possible to remove these
> references to the BKL in favor of ps->dev->serialize.

Yes, and that is what my patch does.  And due to the above problem with
deadlock it replaces ps->devsem with ps->dev->serialize everywhere.

> > By the way, if it is somehow fatal to do usb_put_dev after disconnect,
> > what is the point of reference counting at all?  You might as well
> > free up the usb_device structure immediately after disconnect, since
> > there is sure to be a reference before disconnect, and (apparently)
> > there had better not be a reference after disconnect...
>
> There's some sort of misunderstanding here.  It's not fatal to do
> usb_put_dev() after disconnect, provided you called usb_get_dev() earlier.
> I'm not sure what the cause was of the oops you were getting, but it
> wasn't that.

It was AFAICS, though of course it shouldn't be.

Ciao,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 21:55                                                                 ` Duncan Sands
@ 2003-12-08 23:09                                                                   ` Alan Stern
  2003-12-09 10:23                                                                     ` Duncan Sands
                                                                                       ` (3 more replies)
  0 siblings, 4 replies; 113+ messages in thread
From: Alan Stern @ 2003-12-08 23:09 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Mon, 8 Dec 2003, Duncan Sands wrote:

> > > You can't keep the ps->devsem lock and use ps->dev->serialize, because it
> > > leads to deadlock.
> >
> > How so?  Remember that I am almost totally unfamiliar with the details of
> > the usbfs code.  Are you saying there are places where the driver holds
> > one lock and needs to acquire the other and vice versa?
> 
> Yes.  ps->devsem is used to protect against disconnection: all top level
> routines take it (as a read lock), and in driver_disconnect it is taken as a
> write lock.  Top level routines call lower level routines which sometimes
> need to take dev->serialize (and do already in several places).
> 
> Thus: ps->devsem taken, then dev->serialize.
> 
> However, dev->serialize is taken by the USB core before calling
> driver_disconnect.
> 
> Thus: dev->serialize taken, then ps->devsem.

This is a tricky situation, no doubt about it.

Your situation is a little different from the usual one because ps->devsem
locks the whole device, not just a single interface.  It should still be
able to work.  But maybe you're right; since ps->devsem locks the same
thing as ps->dev->serialize, maybe it's not needed.  By the way, when
usbfs takes ownership of a device, does it bind to the device's
interfaces?

> Right.  And why should (for example) dev->serialize not be held when it
> calls usb_set_configuration? - because usb_set_configuration takes
> dev->serialize.  This is one of the places I mentioned above where
> deadlock can occur right now.

You may simply have to release the lock because calling 
usb_set_configuration and then reacquire it afterwards.

That leads to the question of how to assure that the device doesn't go 
away before usb_set_configuration is called.  Perhaps 
usb_set_configuration and usb_unbind_interface should be changed to 
require the caller to hold the serialize lock.


> > If you call usb_ifnum_to_if() you ought to hold the serialize lock;
> > otherwise the configuration might change out from under you.  But it's not
> > necessary.  Likewise for usb_epnum_to_ep_desc if you're looking up an
> > endpoint that isn't part of an interface you have bound.
> 
> Why isn't it necessary?  As far as I can see it is vital.

I mean it won't cause an oops, although it might provide an invalid 
result.  It's not _required_ by the API (maybe it should be).

Actually, there's another sense in which it's not necessary.  Since
changing configurations first involves unbinding the existing drivers, if
you hold a driver-private lock that will block your disconnect routine
then you can safely call usb_ifnum_to_if even without holding
dev->serialize.

> > There's some sort of misunderstanding here.  It's not fatal to do
> > usb_put_dev() after disconnect, provided you called usb_get_dev() earlier.
> > I'm not sure what the cause was of the oops you were getting, but it
> > wasn't that.
> 
> It was AFAICS, though of course it shouldn't be.

I didn't note the reason for the oops.  Was it a segmentation violation?  
The usb_device memory isn't deallocated until the reference count goes to 
0.  Maybe something was doing an extra usb_put_dev.
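
The intended reference-counting behaviour can be sketched in userspace as
follows.  This is a minimal sketch, not the usbcore implementation: struct
fake_dev and the fake_* helpers are hypothetical stand-ins for struct
usb_device, usb_get_dev() and usb_put_dev(), the count is a plain int rather
than the embedded kobject's count, and nothing here is thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts how many devices have been freed, for demonstration only. */
int released;

struct fake_dev {
	int refs;	/* stand-in for the kobject reference count */
	int connected;
};

/* A new device starts with one reference, owned by the "core". */
struct fake_dev *fake_dev_alloc(void)
{
	struct fake_dev *dev = calloc(1, sizeof(*dev));

	if (dev) {
		dev->refs = 1;
		dev->connected = 1;
	}
	return dev;
}

struct fake_dev *fake_get_dev(struct fake_dev *dev)
{
	if (dev)
		dev->refs++;
	return dev;
}

/* Frees the device only when the last reference is dropped. */
void fake_put_dev(struct fake_dev *dev)
{
	if (!dev)
		return;
	assert(dev->refs > 0);	/* an extra put would trip this */
	if (--dev->refs == 0) {
		free(dev);
		released++;
	}
}

/* Disconnect marks the device gone and drops the core's reference. */
void fake_disconnect(struct fake_dev *dev)
{
	dev->connected = 0;
	fake_put_dev(dev);
}
```

A holder that called fake_get_dev() before fake_disconnect() can still call
fake_put_dev() afterwards; the memory goes away only on that final put.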

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 23:09                                                                   ` Alan Stern
@ 2003-12-09 10:23                                                                     ` Duncan Sands
  2003-12-09 15:55                                                                       ` Alan Stern
  2003-12-09 10:36                                                                     ` Duncan Sands
                                                                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-09 10:23 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> > Yes.  ps->devsem is used to protect against disconnection: all top level
> > routines take it (as a read lock), and in driver_disconnect it is taken
> > as a write lock.  Top level routines call lower level routines which
> > sometimes need to take dev->serialize (and do already in several places).
> >
> > Thus: ps->devsem taken, then dev->serialize.
> >
> > However, dev->serialize is taken by the USB core before calling
> > driver_disconnect.
> >
> > Thus: dev->serialize taken, then ps->devsem.
>
> This is a tricky situation, no doubt about it.
>
> Your situation is a little different from the usual one because ps->devsem
> locks the whole device, not just a single interface.  It should still be
> able to work.  But maybe you're right; since ps->devsem locks the same
> thing as ps->dev->serialize, maybe it's not needed.  By the way, when
> usbfs takes ownership of a device, does it bind to the device's
> interfaces?

Well usbfs never owns anything really.  What it does is allow you (from user
space) to claim and use an interface that nobody else has claimed yet.  Thus
it needs to be able to look at the interfaces of any USB device, find out which
ones are already claimed and maybe claim any ones that are not in use.

> > Right.  And why should (for example) dev->serialize not be held when it
> > calls usb_set_configuration? - because usb_set_configuration takes
> > dev->serialize.  This is one of the places I mentioned above where
> > deadlock can occur right now.
>
> You may simply have to release the lock because calling
> usb_set_configuration and then reacquire it afterwards.

Right, I did this in my patch along with the other changes, but in fact it could
be fixed separately.

> That leads to the question of how to assure that the device doesn't go
> away before usb_set_configuration is called.  Perhaps
> usb_set_configuration and usb_unbind_interface should be changed to
> require the caller to hold the serialize lock.

Well, you could just ensure you have a reference to the usb_device, and
change usb_set_configuration and friends so that they don't Oops if the
device has been disconnected.  This should be done anyway by the way -
surely all core routines should behave themselves (eg: by failing with
an error code) when called with a not-yet-freed struct usb_device?
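
As a minimal sketch of what "failing with an error code" could look like,
assuming a simplified device structure: struct sketch_device and
sketch_set_configuration() are hypothetical names, and the choice of -ENODEV
is an assumption rather than something taken from the core.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct usb_device. */
struct sketch_device {
	int connected;		/* cleared on disconnect */
	int configuration;	/* currently selected configuration */
};

/*
 * Returns 0 on success, or a negative errno value if the device pointer
 * is missing or the (still allocated) device has been disconnected.
 */
int sketch_set_configuration(struct sketch_device *dev, int configuration)
{
	if (dev == NULL)
		return -EINVAL;
	if (!dev->connected)
		return -ENODEV;	/* not yet freed, but gone from the bus */

	dev->configuration = configuration;
	return 0;
}
```

The caller is still expected to hold a reference, so that the structure
being tested is guaranteed not to have been freed.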

> > > If you call usb_ifnum_to_if() you ought to hold the serialize lock;
> > > otherwise the configuration might change out from under you.  But it's
> > > not necessary.  Likewise for usb_epnum_to_ep_desc if you're looking up
> > > an endpoint that isn't part of an interface you have bound.
> >
> > Why isn't it necessary?  As far as I can see it is vital.
>
> I mean it won't cause an oops, although it might provide an invalid
> result.  It's not _required_ by the API (maybe it should be).

It will cause an Oops - actconfig may be NULL.  This is the case after
disconnect for example, and also momentarily the case during configuration
changes.

> Actually, there's another sense in which it's not necessary.  Since
> changing configurations first involves unbinding the existing drivers, if
> you hold a driver-private lock that will block your disconnect routine
> then you can safely call usb_ifnum_to_if even without holding
> dev->serialize.

The disconnect routine is only called if you have claimed an interface.
If usbfs is looking for an interface to claim (and hasn't yet claimed
one), then disconnect will not be called.  There is code in inode.c that
informs usbfs when the device has been disconnected, but now that
disconnect is per-interface, that is not good enough.

> > > There's some sort of misunderstanding here.  It's not fatal to do
> > > usb_put_dev() after disconnect, provided you called usb_get_dev()
> > > earlier. I'm not sure what the cause was of the oops you were getting,
> > > but it wasn't that.
> >
> > It was AFAICS, though of course it shouldn't be.
>
> I didn't note the reason for the oops.  Was it a segmentation violation?
> The usb_device memory isn't deallocated until the reference count goes to
> 0.  Maybe something was doing an extra usb_put_dev.

More on this in another email.

Ciao,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 23:09                                                                   ` Alan Stern
  2003-12-09 10:23                                                                     ` Duncan Sands
@ 2003-12-09 10:36                                                                     ` Duncan Sands
  2003-12-09 16:08                                                                       ` Alan Stern
  2003-12-09 10:49                                                                     ` Duncan Sands
  2003-12-10 13:22                                                                     ` Duncan Sands
  3 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-09 10:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

There is another solution by the way, which involves changes in the core:
make dev->serialize into a read/write semaphore, and have disconnect
be called with a READ lock taken on dev->serialize.  Then in usbfs, I can
take a read lock on dev->serialize whenever I want to rummage around
in the device configuration.  This way things can be left as they are (i.e.
keep ps->devsem) and deadlock will not occur as long as usbfs only
takes read locks on dev->serialize.  Of course calls to usb_set_configuration
may deadlock as now unless you drop ps->devsem before calling, but I'm
sure this can be dealt with (this is because it needs a write lock on
dev->serialize).  I quite like this solution because it makes
things more robust: as long as device drivers only take a read lock on
dev->serialize, they should not deadlock for this reason.  (These deadlock
possibilities are a fundamental problem because drivers call into the
core (which makes itself atomic using dev->serialize and friends), but
the core also calls back into drivers via disconnect methods which it
wants to make atomic by holding the same lock).

Ciao,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 23:09                                                                   ` Alan Stern
  2003-12-09 10:23                                                                     ` Duncan Sands
  2003-12-09 10:36                                                                     ` Duncan Sands
@ 2003-12-09 10:49                                                                     ` Duncan Sands
  2003-12-09 15:47                                                                       ` Alan Stern
  2003-12-10  1:49                                                                       ` Greg KH
  2003-12-10 13:22                                                                     ` Duncan Sands
  3 siblings, 2 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-09 10:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

> I didn't note the reason for the oops.  Was it a segmentation violation?
> The usb_device memory isn't deallocated until the reference count goes to
> 0.  Maybe something was doing an extra usb_put_dev.

Maybe this is related to "oopses in kobjects in 2.6.0-test11 (was Re: kobject patch)"?
My call to usb_put_dev in usbdev_release is releasing the kobject,
which shows that the reference count was not already zero.  However
it dereferences a NULL pointer in here:

static void hcd_pci_release(struct usb_bus *bus)
{
        struct usb_hcd *hcd = bus->hcpriv;

        if (hcd)
                hcd->driver->hcd_free(hcd);
}

which suggests that the hcd was already released.  Maybe Greg can comment?

[9889]: shutting down for system reboot
printing eip:
c8ae8999
Oops: 0000 [#1]
PREEMPT
CPU:    0
EIP:    0060:[<c8ae8999>]    Not tainted VLI
EFLAGS: 00010286
EIP is at hcd_pci_release+0x19/0x20 [usbcore]
eax: c8c69d80   ebx: c637f050   ecx: c8af6c20   edx: c637f000
esi: c031e65c   edi: c031e680   ebp: c0019ec4   esp: c0019ec0
ds: 007b   es: 007b   ss: 0068
Process modem_run (pid: 8460, threadinfo=c0018000 task=c1508080)
Stack: c637f000 c0019ed0 c8ae455d c637f000 c0019ee8 c0203738 c637f048 c0019f00
       c8ae77d6 c031e450 c0019f00 c01bc88f c637f050 c6b09200 c031e428 c031e440
       c0019f10 c8ae08b6 c637f050 00000000 c0019f2c c02019e1 c6b092cc c0019f2c
Call Trace:
[<c8ae455d>] usb_host_release+0x1d/0x20 [usbcore]
[<c0203738>] class_dev_release+0x58/0x60
[<c8ae77d6>] usb_destroy_configuration+0xb6/0xf0 [usbcore]
[<c01bc88f>] kobject_cleanup+0x6f/0x80
[<c8ae08b6>] usb_release_dev+0x46/0x60 [usbcore]
[<c02019e1>] device_release+0x21/0x80
[<c01bc88f>] kobject_cleanup+0x6f/0x80
[<c8ae9b38>] usbdev_release+0x88/0xc0 [usbcore]
[<c0157a5c>] __fput+0x10c/0x120
[<c0156047>] filp_close+0x57/0x80
[<c01560d1>] sys_close+0x61/0x90
[<c02a302e>] sysenter_past_esp+0x43/0x65

Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 
ec 04 8b 45 08 8b 50 30 85 d2 74 0c 8b 82 08 01 00 00 89 14 24 <ff> 50 
28 c9 c3 89 f6 55 89 e5 57 56 53 83 ec 34 8b 5d 0c e8 3f



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 10:49                                                                     ` Duncan Sands
@ 2003-12-09 15:47                                                                       ` Alan Stern
  2003-12-09 21:12                                                                         ` Duncan Sands
  2003-12-10  1:49                                                                       ` Greg KH
  1 sibling, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-09 15:47 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

On Tue, 9 Dec 2003, Duncan Sands wrote:

> Maybe this is related to "oopses in kobjects in 2.6.0-test11 (was Re: kobject patch)"?
> My call to usb_put_dev in usbdev_release is releasing the kobject,
> which shows that the reference count was not already zero.  However
> it dereferences a NULL pointer in here:
> 
> static void hcd_pci_release(struct usb_bus *bus)
> {
>         struct usb_hcd *hcd = bus->hcpriv;
> 
>         if (hcd)
>                 hcd->driver->hcd_free(hcd);
> }
> 
> which suggests that the hcd was already released.  Maybe Greg can comment?
> 
> [9889]: shutting down for system reboot
> printing eip:
> c8ae8999
> Oops: 0000 [#1]
> PREEMPT
> CPU:    0
> EIP:    0060:[<c8ae8999>]    Not tainted VLI
> EFLAGS: 00010286
> EIP is at hcd_pci_release+0x19/0x20 [usbcore]
> eax: c8c69d80   ebx: c637f050   ecx: c8af6c20   edx: c637f000
> esi: c031e65c   edi: c031e680   ebp: c0019ec4   esp: c0019ec0
> ds: 007b   es: 007b   ss: 0068
> Process modem_run (pid: 8460, threadinfo=c0018000 task=c1508080)
> Stack: c637f000 c0019ed0 c8ae455d c637f000 c0019ee8 c0203738 c637f048 c0019f00
>        c8ae77d6 c031e450 c0019f00 c01bc88f c637f050 c6b09200 c031e428 c031e440
>        c0019f10 c8ae08b6 c637f050 00000000 c0019f2c c02019e1 c6b092cc c0019f2c
> Call Trace:
> [<c8ae455d>] usb_host_release+0x1d/0x20 [usbcore]
> [<c0203738>] class_dev_release+0x58/0x60
> [<c8ae77d6>] usb_destroy_configuration+0xb6/0xf0 [usbcore]
> [<c01bc88f>] kobject_cleanup+0x6f/0x80
> [<c8ae08b6>] usb_release_dev+0x46/0x60 [usbcore]
> [<c02019e1>] device_release+0x21/0x80
> [<c01bc88f>] kobject_cleanup+0x6f/0x80
> [<c8ae9b38>] usbdev_release+0x88/0xc0 [usbcore]
> [<c0157a5c>] __fput+0x10c/0x120
> [<c0156047>] filp_close+0x57/0x80
> [<c01560d1>] sys_close+0x61/0x90
> [<c02a302e>] sysenter_past_esp+0x43/0x65
> 
> Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 
> ec 04 8b 45 08 8b 50 30 85 d2 74 0c 8b 82 08 01 00 00 89 14 24 <ff> 50 
> 28 c9 c3 89 f6 55 89 e5 57 56 53 83 ec 34 8b 5d 0c e8 3f

I don't understand this stack dump.  The EIP address is _after the end_ of 
hcd_pci_release, as you can see from the fact that the following code is 
nothing but a long string of NOPs.  Also, I don't understand the cause of 
the oops.  What does the PREEMPT mean?  There's no indication that a null 
pointer was dereferenced.  None of the registers contains 0.

But if you think that's the problem, try adding a printk to 
hcd_pci_release to display the values of bus, hcd->driver, and 
hcd->driver->hcd_free.  Knowing which one is NULL ought to help your 
analysis.
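
The kind of check meant here can be sketched in userspace as follows.  This
is a minimal sketch, not a patch: the fake_* structures are hypothetical,
cut-down stand-ins for usb_bus, usb_hcd and the hc_driver, fprintf() stands
in for printk(), and the function pointer is reported only as set or NULL.

```c
#include <stdio.h>

struct fake_hcd;

struct fake_hc_driver {
	void (*hcd_free)(struct fake_hcd *hcd);
};

struct fake_hcd {
	struct fake_hc_driver *driver;
};

struct fake_bus {
	struct fake_hcd *hcpriv;
};

/* Sample callback for demonstration; counts how often it ran. */
int fake_free_calls;

void fake_free(struct fake_hcd *hcd)
{
	(void)hcd;
	fake_free_calls++;
}

/*
 * Prints every link in the bus -> hcd -> driver -> hcd_free chain and
 * calls hcd_free only if the whole chain is valid.
 * Returns 0 if hcd_free was called, -1 if a NULL link prevented it.
 */
int fake_hcd_release_checked(struct fake_bus *bus)
{
	struct fake_hcd *hcd = bus ? bus->hcpriv : NULL;
	struct fake_hc_driver *driver = hcd ? hcd->driver : NULL;

	fprintf(stderr, "release: bus=%p hcd=%p driver=%p hcd_free=%s\n",
		(void *)bus, (void *)hcd, (void *)driver,
		(driver && driver->hcd_free) ? "set" : "NULL");

	if (!driver || !driver->hcd_free)
		return -1;

	driver->hcd_free(hcd);
	return 0;
}
```

Whichever field prints as NULL (or as an obviously stale pointer) narrows
down where the release path went wrong.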

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 10:23                                                                     ` Duncan Sands
@ 2003-12-09 15:55                                                                       ` Alan Stern
  2003-12-09 20:36                                                                         ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-09 15:55 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Tue, 9 Dec 2003, Duncan Sands wrote:

> > You may simply have to release the lock because calling
> > usb_set_configuration and then reacquire it afterwards.
> 
> Right, I did this in my patch along with the other changes, but in fact it could
> be fixed separately.

Doesn't this approach work?  I don't see anything wrong with it.  (Read 
"before" rather than "because" above -- my fingers don't always do what my 
mind wants them to do.)


> Well, you could just ensure you have a reference to the usb_device, and
> change usb_set_configuration and friends so that they don't Oops if the
> device has been disconnected.  This should be done anyway by the way -
> surely all core routines should behave themselves (eg: by failing with
> an error code) when called with a not-yet-freed struct usb_device?

Yes, that's the correct way to handle it.


> > I mean it won't cause an oops, although it might provide an invalid
> > result.  It's not _required_ by the API (maybe it should be).
> 
> It will cause an Oops - actconfig may be NULL.  This is the case after
> disconnect for example, and also momentarily the case during configuration
> changes.

Sorry -- what I _really_ meant to say was that usb_ifnum_to_if needs to be 
rewritten to add a test for actconfig == NULL.  Once that's done properly, 
calling it without holding the lock won't oops even though it also might 
not give you the right answer.  Minor point; nobody would want to do that.
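
A minimal sketch of such a test, assuming heavily simplified structures: the
fake_* types and fake_ifnum_to_if() are hypothetical and only mirror the
shape of the real lookup, with the interface array and its length as
assumed fields.

```c
#include <stddef.h>

struct fake_interface {
	int number;	/* interface number from the descriptor */
};

struct fake_config {
	int num_interfaces;
	struct fake_interface *interfaces;
};

struct fake_device {
	struct fake_config *actconfig;	/* NULL when unconfigured or gone */
};

/*
 * Looks up an interface by number in the active configuration.
 * Returns NULL instead of dereferencing a missing configuration.
 */
struct fake_interface *fake_ifnum_to_if(struct fake_device *dev, int ifnum)
{
	struct fake_config *config;
	int i;

	if (!dev)
		return NULL;

	config = dev->actconfig;	/* read the pointer once */
	if (!config)
		return NULL;

	for (i = 0; i < config->num_interfaces; i++)
		if (config->interfaces[i].number == ifnum)
			return &config->interfaces[i];

	return NULL;
}
```

Without the lock the answer can still be stale by the time the caller uses
it; the test only removes the NULL dereference.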


> The disconnect routine is only called if you have claimed an interface.
> If usbfs is looking for an interface to claim (and hasn't yet claimed
> one), then disconnect will not be called.  There is code in inode.c that
> informs usbfs when the device has been disconnected, but now that
> disconnect is per-interface, that is not good enough.

What about the call to usbfs_remove_device that's in usb_disconnect?

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 10:36                                                                     ` Duncan Sands
@ 2003-12-09 16:08                                                                       ` Alan Stern
  2003-12-09 20:24                                                                         ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-09 16:08 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Tue, 9 Dec 2003, Duncan Sands wrote:

> There is another solution by the way, which involves changes in the core:
> make dev->serialize into a read/write semaphore, and have disconnect
> be called with a READ lock taken on dev->serialize.  Then in usbfs, I can
> take a read lock on dev->serialize whenever I want to rummage around
> in the device configuration.  This way things can be left as they are (i.e.
> keep ps->devsem) and deadlock will not occur as long as usbfs only
> takes read locks on dev->serialize.  Of course calls to usb_set_configuration
> may deadlock as now unless you drop ps->devsem before calling, but I'm
> sure this can be dealt with (this is because it needs a write lock on
> dev->serialize).  I quite like this solution because it makes
> things more robust: as long as device drivers only take a read lock on
> dev->serialize, they should not deadlock for this reason.  (These deadlock
> possibilities are a fundamental problem because drivers call into the
> core (which makes itself atomic using dev->serialize and friends), but
> the core also calls back into drivers via disconnect methods which it
> wants to make atomic by holding the same lock).

Here's how I see it.

dev->serialize is meant to protect significant state changes, things like
set_configuration and device disconnect.  ps->devsem is meant to protect
against usbfs trying to do two things to the device at the same time.  
But you also want to protect against usbfs using the device during a state
change.  (The normal protection mechanisms don't work because usbfs might
be using a device without having a driver bound to any of its interfaces.)  
Given that, there's no reason usbfs shouldn't just use serialize instead
of devsem.

The fact that the core calls driver_disconnect with serialize already held 
then just makes your life simpler: You don't need to acquire the lock 
yourself!

The only tricky part is that you have to release serialize (which now is
your only lock) before calling set_configuration.  But as you said, you
would have to release ps->devsem anyway, so nothing's lost there.
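
The release-and-reacquire step can be sketched in userspace as follows.
This is a minimal sketch under stated assumptions: a pthread mutex stands in
for dev->serialize, struct locked_device and both functions are hypothetical
names, and the caller is assumed to hold a reference so the device cannot be
freed during the unlocked window.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for struct usb_device. */
struct locked_device {
	pthread_mutex_t serialize;
	int connected;
	int configuration;
};

/* Stand-in for a core routine that takes dev->serialize itself. */
int core_set_configuration(struct locked_device *dev, int configuration)
{
	int ret = 0;

	pthread_mutex_lock(&dev->serialize);
	if (!dev->connected)
		ret = -ENODEV;
	else
		dev->configuration = configuration;
	pthread_mutex_unlock(&dev->serialize);

	return ret;
}

/*
 * usbfs-style caller.  Must be entered with dev->serialize held and
 * returns with it held again.
 */
int proc_setconfig(struct locked_device *dev, int configuration)
{
	int ret;

	if (!dev->connected)
		return -ENODEV;

	/* Drop our only lock; the core routine takes it on its own. */
	pthread_mutex_unlock(&dev->serialize);
	ret = core_set_configuration(dev, configuration);
	pthread_mutex_lock(&dev->serialize);

	/* The device may have gone away while the lock was dropped. */
	if (ret == 0 && !dev->connected)
		ret = -ENODEV;

	return ret;
}
```

Calling core_set_configuration() with the lock still held would block on
this kind of mutex, which is the deadlock being avoided.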

Anything wrong with this approach?

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 16:08                                                                       ` Alan Stern
@ 2003-12-09 20:24                                                                         ` Duncan Sands
  0 siblings, 0 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-09 20:24 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> Here's how I see it.
>
> dev->serialize is meant to protect significant state changes, things like
> set_configuration and device disconnect.  ps->devsem is meant to protect
> against usbfs trying to do two things to the device at the same time.

Actually no: it is a read lock and all routines take it with down_read except
for the disconnect routine.  So it is only there to guard against disconnect.

> But you also want to protect against usbfs using the device during a state
> change.  (The normal protection mechanisms don't work because usbfs might
> be using a device without having a driver bound to any of its interfaces.)
> Given that, there's no reason usbfs shouldn't just use serialize instead
> of devsem.
>
> The fact that the core calls driver_disconnect with serialize already held
> then just makes your life simpler: You don't need to acquire the lock
> yourself!
>
> The only tricky part is that you have to release serialize (which now is
> your only lock) before calling set_configuration.  But as you said, you
> would have to release ps->devsem anyway, so nothing's lost there.
>
> Anything wrong with this approach?

Nothing - that is exactly what my patch does.  And it works... except
for the Oops in usb_put_dev.

All the best,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 15:55                                                                       ` Alan Stern
@ 2003-12-09 20:36                                                                         ` Duncan Sands
  0 siblings, 0 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-09 20:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Tuesday 09 December 2003 16:55, Alan Stern wrote:
> On Tue, 9 Dec 2003, Duncan Sands wrote:
> > > You may simply have to release the lock because calling
> > > usb_set_configuration and then reacquire it afterwards.
> >
> > Right, I did this in my patch along with the other changes, but in fact
> > it could be fixed separately.
>
> Doesn't this approach work?  I don't see anything wrong with it.  (Read
> "before" rather than "because" above -- my fingers don't always do what my
> mind wants them to do.)

You mean, drop ps->devsem, take dev->serialize, check for disconnect,
proceed if not disconnected, do some stuff (traverse the configuration for
example), drop dev->serialize, take ps->devsem, check for disconnect,
proceed if not disconnected?  Well yes, but doing this all over the place
would only make the whole driver more complicated and more fragile.

> > Well, you could just ensure you have a reference to the usb_device, and
> > change usb_set_configuration and friends so that they don't Oops if the
> > device has been disconnected.  This should be done anyway by the way -
> > surely all core routines should behave themselves (eg: by failing with
> > an error code) when called with a not-yet-freed struct usb_device?
>
> Yes, that's the correct way to handle it.
>
> > > I mean it won't cause an oops, although it might provide an invalid
> > > result.  It's not _required_ by the API (maybe it should be).
> >
> > It will cause an Oops - actconfig may be NULL.  This is the case after
> > disconnect for example, and also momentarily the case doing configuration
> > changes.
>
> Sorry -- what I _really_ meant to say was that usb_ifnum_to_if needs to be
> rewritten to add a test for actconfig == NULL.  Once that's done properly,
> calling it without holding the lock won't oops even though it also might
> not give you the right answer.  Minor point; nobody would want to do that.
>
> > The disconnect routine is only called if you have claimed an interface.
> > If usbfs is looking for an interface to claim (and hasn't yet claimed
> > one), then disconnect will not be called.  There is code in inode.c that
> > informs usbfs when the device has been disconnected, but now that
> > disconnect is per-interface, that is not good enough.
>
> What about the call to usbfs_remove_device that's in usb_disconnect?

That's the code in inode.c that I mentioned.

Ciao,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 15:47                                                                       ` Alan Stern
@ 2003-12-09 21:12                                                                         ` Duncan Sands
  2003-12-09 21:58                                                                           ` Alan Stern
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-09 21:12 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

> > EIP is at hcd_pci_release+0x19/0x20 [usbcore]

> I don't understand this stack dump.  The EIP address is _after the end_ of
> hcd_pci_release, as you can see from the fact that the following code is
> nothing but a long string of NOPs.

Hi Alan, I'm not sure what you mean.  0x19/0x20 seems to be inside the code
to me :)  On my machine, this is what it corresponds to:

static void hcd_pci_release(struct usb_bus *bus)
{
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 04                sub    $0x4,%esp
        struct usb_hcd *hcd = bus->hcpriv;
   6:   8b 45 08                mov    0x8(%ebp),%eax
   9:   8b 50 30                mov    0x30(%eax),%edx

        if (hcd)
   c:   85 d2                   test   %edx,%edx
   e:   74 0c                   je     1c <hcd_pci_release+0x1c>
                hcd->driver->hcd_free(hcd);
  10:   8b 82 38 01 00 00       mov    0x138(%edx),%eax
  16:   89 14 24                mov    %edx,(%esp,1)
  19:   ff 50 28                call   *0x28(%eax)      <= HERE
}
  1c:   c9                      leave
  1d:   c3                      ret
  1e:   89 f6                   mov    %esi,%esi

So if Vince's disassembly is the same, the problem is that
hcd->driver or hcd->driver->hcd_free is stuffed.
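For reference, the location printed in an oops line such as "hcd_pci_release+0x19/0x20" reads as offset/size: 0x19 bytes into a function that is 0x20 bytes long. A small sketch of that check follows; the helper names are made up for illustration and are not a kernel or ksymoops API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse the "+0xOFF/0xSIZE" part of an oops location string.
 * Returns true and fills *offset and *size if both numbers were found.
 */
bool parse_oops_location(const char *s, unsigned *offset, unsigned *size)
{
    const char *plus = strchr(s, '+');

    if (plus == NULL)
        return false;
    /* %x accepts an optional 0x prefix, so "+0x19/0x20" parses directly. */
    return sscanf(plus, "+%x/%x", offset, size) == 2;
}

/* An offset lies inside the function when it is smaller than the size. */
bool offset_inside_function(unsigned offset, unsigned size)
{
    return offset < size;
}
```

With the values from the report, offset 0x19 is below size 0x20, so EIP is inside the function rather than past its end.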

> Also, I don't understand the cause of
> the oops.  What does the PREEMPT mean?  There's no indication that a null
> pointer was dereferenced.  None of the registers contains 0.

I guess PREEMPT means it's a kernel with preempt support.  There is
indeed no indication that a NULL pointer was dereferenced.  Maybe it
is use-after-free.

> But if you think that's the problem, try adding a printk to
> hcd_pci_release to display the values of bus, hcd->driver, and
> hcd->driver->hcd_free.  Knowing which one is NULL ought to help your
> analysis.

I will send Vince a patch.

Ciao,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 21:12                                                                         ` Duncan Sands
@ 2003-12-09 21:58                                                                           ` Alan Stern
  2003-12-09 22:07                                                                             ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-09 21:58 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

On Tue, 9 Dec 2003, Duncan Sands wrote:

> > > EIP is at hcd_pci_release+0x19/0x20 [usbcore]
> 
> > I don't understand this stack dump.  The EIP address is _after the end_ of
> > hcd_pci_release, as you can see from the fact that the following code is
> > nothing but a long string of NOPs.
> 
> Hi Alan, I'm not sure what you mean.  0x19/0x20 seems to be inside the code
> to me :)  On my machine, this is what it corresponds to:
> 
> static void hcd_pci_release(struct usb_bus *bus)
> {
>    0:   55                      push   %ebp
>    1:   89 e5                   mov    %esp,%ebp
>    3:   83 ec 04                sub    $0x4,%esp
>         struct usb_hcd *hcd = bus->hcpriv;
>    6:   8b 45 08                mov    0x8(%ebp),%eax
>    9:   8b 50 30                mov    0x30(%eax),%edx
> 
>         if (hcd)
>    c:   85 d2                   test   %edx,%edx
>    e:   74 0c                   je     1c <hcd_pci_release+0x1c>
>                 hcd->driver->hcd_free(hcd);
>   10:   8b 82 38 01 00 00       mov    0x138(%edx),%eax
>   16:   89 14 24                mov    %edx,(%esp,1)
>   19:   ff 50 28                call   *0x28(%eax)      <= HERE
> }
>   1c:   c9                      leave
>   1d:   c3                      ret
>   1e:   89 f6                   mov    %esi,%esi
> 
> So if Vince's disassembly is the same, the problem is that
> hcd->driver or hcd->driver->hcd_free is stuffed.

Clearly that compiler is different from mine.  On my machine the "ret"  
opcode is at offset 0x16, not 0x1d.  Also, I guess the display of the code
bytes in stack dumps got changed at some point; now it shows values both
before and after the EIP location (it used to show just the values after
EIP).  Okay, that clears that up.

The oops occurring where it did means that hcd->driver->hcd_free is not a
valid function pointer, even though hcd->driver appears to point to actual
data.  So it's not a data access through a null pointer; it's a call to an
unmapped (possibly null) location.

It's not at all clear how that could happen.  Those pointers are located
in static data in the HCD modules.  It doesn't seem likely that the
pointer was overwritten.  The only other possibility I can think of is
that the module was already unloaded.  But that's not possible since you
were holding a reference to a device on that bus.

Maybe the answer is that hcd->driver is messed up but for some reason 
still points to actual data.  I can't imagine why that would happen 
either.

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 21:58                                                                           ` Alan Stern
@ 2003-12-09 22:07                                                                             ` Duncan Sands
  2003-12-09 22:25                                                                               ` David Brownell
  2003-12-10  4:31                                                                               ` Vince
  0 siblings, 2 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-09 22:07 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

> It's not at all clear how that could happen.  Those pointers are located
> in static data in the HCD modules.  It doesn't seem likely that the
> pointer was overwritten.  The only other possibility I can think of is
> that the module was already unloaded.  But that's not possible since you
> were holding a reference to a device on that bus.

It occurred on system shutdown - so I guess the module was unloaded.
Maybe the bus reference counting is borked.  I've sent Vince a patch that
should produce some more info.

> Maybe the answer is that hcd->driver is messed up but for some reason
> still points to actual data.  I can't imagine why that would happen
> either.

Me neither.

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 22:07                                                                             ` Duncan Sands
@ 2003-12-09 22:25                                                                               ` David Brownell
  2003-12-09 22:33                                                                                 ` Duncan Sands
  2003-12-10  3:43                                                                                 ` Alan Stern
  2003-12-10  4:31                                                                               ` Vince
  1 sibling, 2 replies; 113+ messages in thread
From: David Brownell @ 2003-12-09 22:25 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Alan Stern, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

Duncan Sands wrote:
>>It's not at all clear how that could happen.  Those pointers are located
>>in static data in the HCD modules.  It doesn't seem likely that the
>>pointer was overwritten.  The only other possibility I can think of is
>>that the module was already unloaded.  But that's not possible since you
>>were holding a reference to a device on that bus.
> 
> 
> It occurred on system shutdown - so I guess the module was unloaded.
> Maybe the bus reference counting is borked.  

Various folk have reported similar problems on system shutdown
before, and the simple fix has been not to clean up so aggressively.

What puzzled me was that a normal "rmmod" wouldn't give the
same symptoms -- but the same codepaths could oops in certain
system shutdown scenarios.

- Dave





* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 22:25                                                                               ` David Brownell
@ 2003-12-09 22:33                                                                                 ` Duncan Sands
  2003-12-10  3:12                                                                                   ` David Brownell
  2003-12-10  3:43                                                                                 ` Alan Stern
  1 sibling, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-09 22:33 UTC (permalink / raw)
  To: David Brownell
  Cc: Alan Stern, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

> > It occurred on system shutdown - so I guess the module was unloaded.
> > Maybe the bus reference counting is borked.
>
> Various folk have reported similar problems on system shutdown
> before, and the simple fix has been not to clean up so aggressively.

?

> What puzzled me was that a normal "rmmod" wouldn't give the
> same symptoms -- but the same codepaths could oops in certain
> system shutdown scenarios.

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 10:49                                                                     ` Duncan Sands
  2003-12-09 15:47                                                                       ` Alan Stern
@ 2003-12-10  1:49                                                                       ` Greg KH
  1 sibling, 0 replies; 113+ messages in thread
From: Greg KH @ 2003-12-10  1:49 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Alan Stern, David Brownell, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

On Tue, Dec 09, 2003 at 11:49:27AM +0100, Duncan Sands wrote:
> > I didn't note the reason for the oops.  Was it a segmentation violation?
> > The usb_device memory isn't deallocated until the reference count goes to
> > 0.  Maybe something was doing an extra usb_put_dev.
> 
> Maybe this is related to "oopses in kobjects in 2.6.0-test11 (was Re: kobject patch)"?
> My call to usb_put_dev in usbdev_release is releasing the kobject,
> which shows that the reference count was not already zero.  However
> it dereferences a NULL pointer in here:
> 
> static void hcd_pci_release(struct usb_bus *bus)
> {
>         struct usb_hcd *hcd = bus->hcpriv;
> 
>         if (hcd)
>                 hcd->driver->hcd_free(hcd);
> }
> 
> which suggests that the hcd was already released.  Maybe Greg can comment?

Does not look like the kobject oops.  This looks like something else is
messing up the hcd pointer.

thanks,

greg k-h


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 22:33                                                                                 ` Duncan Sands
@ 2003-12-10  3:12                                                                                   ` David Brownell
  0 siblings, 0 replies; 113+ messages in thread
From: David Brownell @ 2003-12-10  3:12 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Alan Stern, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

Duncan Sands wrote:
>>>It occurred on system shutdown - so I guess the module was unloaded.
>>>Maybe the bus reference counting is borked.
>>
>>Various folk have reported similar problems on system shutdown
>>before, and the simple fix has been not to clean up so aggressively.
> 
> 
> ?

Like:  don't rmmod during system shutdown.  Some distros do that,
others don't.  Or maybe it was individual customization, I don't
recall (never having had the issue myself).


>>What puzzled me was that a normal "rmmod" wouldn't give the
>>same symptoms -- but the same codepaths could oops in certain
>>system shutdown scenarios.
> 
> 
> Duncan.
> 




* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 22:25                                                                               ` David Brownell
  2003-12-09 22:33                                                                                 ` Duncan Sands
@ 2003-12-10  3:43                                                                                 ` Alan Stern
  2003-12-10 13:12                                                                                   ` Duncan Sands
  2003-12-10 15:30                                                                                   ` Greg KH
  1 sibling, 2 replies; 113+ messages in thread
From: Alan Stern @ 2003-12-10  3:43 UTC (permalink / raw)
  To: David Brownell
  Cc: Duncan Sands, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

On Tue, 9 Dec 2003, David Brownell wrote:

> Various folk have reported similar problems on system shutdown
> before, and the simple fix has been not to clean up so aggressively.
> 
> What puzzled me was that a normal "rmmod" wouldn't give the
> same symptoms -- but the same codepaths could oops in certain
> system shutdown scenarios.

In an earlier message I wrote that the HC driver couldn't unload so long
as the device usbfs was using held a reference to its bus.  I just did
some checking, and guess what: It can!

I looked at both the UHCI and OHCI drivers.  In their module_exit routines
they call pci_unregister_driver().  Without knowing how the PCI subsystem
works, I would assume this behaves like any other "deregister" routine in
the driver model and returns without waiting for any reference count to go
to 0 -- that's what release callbacks are for.

However, the module_exit routines _don't_ wait for the release callbacks.  
They just go right on ahead and exit.  Result: when the reference count 
eventually does go to 0 (when usbfs drops its last reference), the 
hcd_free routine is no longer present and you get an oops.

The proper fix would be to have each HC driver keep track of how many 
instances are allocated.  The module_exit routine must wait for that 
number to drop to 0 before returning.
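A single-threaded model of the bookkeeping proposed above, only to make the failure and the fix concrete. All names here (hc_driver_model and friends) are hypothetical; the real drivers would need proper locking and a blocking wait rather than a "try" call.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of one HC driver module: live instance count, and whether its code is still loaded. */
struct hc_driver_model {
    int  instances;
    bool code_loaded;
};

/* A controller instance is allocated (e.g. at probe time). */
void hc_instance_alloc(struct hc_driver_model *m)
{
    m->instances++;
}

/*
 * The release callback runs when the last reference goes away.
 * Returns -1 if the module code is already gone (the oops case), 0 otherwise.
 */
int hc_instance_release(struct hc_driver_model *m)
{
    if (!m->code_loaded)
        return -1;
    assert(m->instances > 0);
    m->instances--;
    return 0;
}

/* Behaviour described in the message: exit unloads without looking at the count. */
void hc_module_exit_nowait(struct hc_driver_model *m)
{
    m->code_loaded = false;
}

/* Proposed behaviour: refuse to finish unloading while instances remain. */
bool hc_module_exit_try(struct hc_driver_model *m)
{
    if (m->instances > 0)
        return false;
    m->code_loaded = false;
    return true;
}
```

In this model the "nowait" exit followed by a late release reproduces the reported failure, while the counted exit cannot complete until the release has run.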

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-09 22:07                                                                             ` Duncan Sands
  2003-12-09 22:25                                                                               ` David Brownell
@ 2003-12-10  4:31                                                                               ` Vince
  1 sibling, 0 replies; 113+ messages in thread
From: Vince @ 2003-12-10  4:31 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Alan Stern, David Brownell, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list, Greg KH

[-- Attachment #1: Type: text/plain, Size: 867 bytes --]

Duncan Sands wrote:
>>It's not at all clear how that could happen.  Those pointers are located
>>in static data in the HCD modules.  It doesn't seem likely that the
>>pointer was overwritten.  The only other possibility I can think of is
>>that the module was already unloaded.  But that's not possible since you
>>were holding a reference to a device on that bus.
> 
> 
> It occurred on system shutdown - so I guess the module was unloaded.
> Maybe the bus reference counting is borked.  I've sent Vince a patch that
> should produce some more info.

Actually, it doesn't need a system shutdown: in my case a simple
"/etc/init.d/hotplug stop" is enough to trigger the oops.
However, it doesn't happen every time; I had to try several times before I
got another oops. The logs with the additional debugging information are
attached.

Regards,

Vincent

[-- Attachment #2: logs --]
[-- Type: text/plain, Size: 6017 bytes --]

kernel: ehci_hcd 0000:00:10.3: remove, state 1
kernel: usb usb1: USB disconnect, address 1
kernel: ehci_hcd 0000:00:10.3: USB bus 1 deregistered
kernel: DEBUG
kernel: Call Trace:
kernel: [<e0afca9f>] hcd_pci_release+0x1f/0x70 [usbcore]
kernel: [<c01bea5a>] unlink+0x7a/0x80
kernel: [<e0af857d>] usb_host_release+0x1d/0x20 [usbcore]
kernel: [<c0205eb8>] class_dev_release+0x58/0x60
kernel: [<c01bee7b>] kobject_cleanup+0x7b/0x80
kernel: [<e0afd0ba>] usb_hcd_pci_remove+0x12a/0x180 [usbcore]
kernel: [<c01c650b>] pci_device_remove+0x3b/0x40
kernel: [<c0205494>] device_release_driver+0x64/0x70
kernel: [<c02054c0>] driver_detach+0x20/0x30
kernel: [<c02056ed>] bus_remove_driver+0x3d/0x80
kernel: [<c0205b03>] driver_unregister+0x13/0x28
kernel: [<c01c66e6>] pci_unregister_driver+0x16/0x30
kernel: [<e0a19e6f>] cleanup+0xf/0x13 [ehci_hcd]
kernel: [<c0136e43>] sys_delete_module+0x133/0x150
kernel: [<c014c124>] sys_munmap+0x44/0x70
kernel: [<c02a6f8e>] sysenter_past_esp+0x43/0x65
kernel: 
kernel: DEBUG: hcd->driver: e0a19ee0
kernel: DEBUG: hcd->driver->hcd_free: e0a166d0
kernel: uhci_hcd 0000:00:10.0: remove, state 1
kernel: usb usb2: USB disconnect, address 1
kernel: usb 2-1: USB disconnect, address 2
kernel: drivers/char/lirc/lirc_atiusb.c: USB Remote on #200 now disconnected
kernel: usb 2-2: USB disconnect, address 3
kernel: uhci_hcd 0000:00:10.0: USB bus 2 deregistered
kernel: uhci_hcd 0000:00:10.1: remove, state 1
kernel: usb usb3: USB disconnect, address 1
kernel: uhci_hcd 0000:00:10.1: USB bus 3 deregistered
kernel: DEBUG
kernel: Call Trace:
kernel: [<e0afca9f>] hcd_pci_release+0x1f/0x70 [usbcore]
kernel: [<c01bea5a>] unlink+0x7a/0x80
kernel: [<e0af857d>] usb_host_release+0x1d/0x20 [usbcore]
kernel: [<c0205eb8>] class_dev_release+0x58/0x60
kernel: [<c01bee7b>] kobject_cleanup+0x7b/0x80
kernel: [<e0afd0ba>] usb_hcd_pci_remove+0x12a/0x180 [usbcore]
kernel: [<c01c650b>] pci_device_remove+0x3b/0x40
kernel: [<c0205494>] device_release_driver+0x64/0x70
kernel: [<c02054c0>] driver_detach+0x20/0x30
kernel: [<c02056ed>] bus_remove_driver+0x3d/0x80
kernel: [<c0205b03>] driver_unregister+0x13/0x28
kernel: [<c01c66e6>] pci_unregister_driver+0x16/0x30
kernel: [<e0c8dd4f>] uhci_hcd_cleanup+0xf/0x59 [uhci_hcd]
kernel: [<c0136e43>] sys_delete_module+0x133/0x150
kernel: [<c014c124>] sys_munmap+0x44/0x70
kernel: [<c02a6f8e>] sysenter_past_esp+0x43/0x65
kernel: 
kernel: DEBUG: hcd->driver: e0c8de40
kernel: DEBUG: hcd->driver->hcd_free: e0c8dce0
kernel: uhci_hcd 0000:00:10.2: remove, state 1
kernel: usb usb4: USB disconnect, address 1
kernel: uhci_hcd 0000:00:10.2: USB bus 4 deregistered
kernel: DEBUG
kernel: Call Trace:
kernel: [<e0afca9f>] hcd_pci_release+0x1f/0x70 [usbcore]
kernel: [<c01bea5a>] unlink+0x7a/0x80
kernel: [<e0af857d>] usb_host_release+0x1d/0x20 [usbcore]
kernel: [<c0205eb8>] class_dev_release+0x58/0x60
kernel: [<c01bee7b>] kobject_cleanup+0x7b/0x80
kernel: [<e0afd0ba>] usb_hcd_pci_remove+0x12a/0x180 [usbcore]
kernel: [<c01c650b>] pci_device_remove+0x3b/0x40
kernel: [<c0205494>] device_release_driver+0x64/0x70
kernel: [<c02054c0>] driver_detach+0x20/0x30
kernel: [<c02056ed>] bus_remove_driver+0x3d/0x80
kernel: [<c0205b03>] driver_unregister+0x13/0x28
kernel: [<c01c66e6>] pci_unregister_driver+0x16/0x30
kernel: [<e0c8dd4f>] uhci_hcd_cleanup+0xf/0x59 [uhci_hcd]
kernel: [<c0136e43>] sys_delete_module+0x133/0x150
kernel: [<c014c124>] sys_munmap+0x44/0x70
kernel: [<c02a6f8e>] sysenter_past_esp+0x43/0x65
kernel: 
kernel: DEBUG: hcd->driver: e0c8de40
kernel: DEBUG: hcd->driver->hcd_free: e0c8dce0
kernel: DEBUG
kernel: Call Trace:
kernel: [<e0afca9f>] hcd_pci_release+0x1f/0x70 [usbcore]
kernel: [<e0af857d>] usb_host_release+0x1d/0x20 [usbcore]
kernel: [<c0205eb8>] class_dev_release+0x58/0x60
kernel: [<e0afb854>] usb_destroy_configuration+0xb4/0xf0 [usbcore]
kernel: [<c01bee7b>] kobject_cleanup+0x7b/0x80
kernel: [<e0af48f6>] usb_release_dev+0x46/0x60 [usbcore]
kernel: [<c0204160>] device_release+0x20/0x80
kernel: [<c01bee7b>] kobject_cleanup+0x7b/0x80
kernel: [<e0afdd39>] usbdev_release+0x79/0xb0 [usbcore]
kernel: [<c01586c0>] __fput+0x100/0x120
kernel: [<c0156c99>] filp_close+0x59/0x90
kernel: [<c0156d31>] sys_close+0x61/0xa0
kernel: [<c02a6f8e>] sysenter_past_esp+0x43/0x65
kernel: 
kernel: DEBUG: hcd->driver: e0c8de40
kernel: Unable to handle kernel paging request at virtual address e0c8de68
kernel: printing eip:
kernel: e0afcabf
kernel: *pde = 1da4c067
kernel: *pte = 00000000
kernel: Oops: 0000 [#1]
kernel: PREEMPT 
kernel: CPU:    0
kernel: EIP:    0060:[<e0afcabf>]    Not tainted VLI
kernel: EFLAGS: 00010286
kernel: EIP is at hcd_pci_release+0x3f/0x70 [usbcore]
kernel: eax: e0c8de40   ebx: de529000   ecx: 00000001   edx: c02ef058
kernel: esi: c032265c   edi: c0322680   ebp: decb6200   esp: da1cdee4
kernel: ds: 007b   es: 007b   ss: 0068
kernel: Process modem_run (pid: 2424, threadinfo=da1cc000 task=de73ad00)
kernel: Stack: e0b02781 e0c8de40 de529050 e0af857d de529000 c0205eb8 de529048 e0afb854 
kernel: c0322450 00000282 c01bee7b de529050 decb6200 c0322428 c0322440 e0af48f6 
kernel: de529050 00000000 c0204160 decb62cc da1cc000 de6a4ec0 decf1398 decb62f4 
kernel: Call Trace:
kernel: [<e0af857d>] usb_host_release+0x1d/0x20 [usbcore]
kernel: [<c0205eb8>] class_dev_release+0x58/0x60
kernel: [<e0afb854>] usb_destroy_configuration+0xb4/0xf0 [usbcore]
kernel: [<c01bee7b>] kobject_cleanup+0x7b/0x80
kernel: [<e0af48f6>] usb_release_dev+0x46/0x60 [usbcore]
kernel: [<c0204160>] device_release+0x20/0x80
kernel: [<c01bee7b>] kobject_cleanup+0x7b/0x80
kernel: [<e0afdd39>] usbdev_release+0x79/0xb0 [usbcore]
kernel: [<c01586c0>] __fput+0x100/0x120
kernel: [<c0156c99>] filp_close+0x59/0x90
kernel: [<c0156d31>] sys_close+0x61/0xa0
kernel: [<c02a6f8e>] sysenter_past_esp+0x43/0x65
kernel: 
kernel: Code: e0 e8 46 63 62 df e8 91 e9 60 df 85 db 74 3b 8b 83 08 01 00 00 c7 04 24 81 27 b0 e0 89 44 24 04 e8 27 63 62 df 8b 83 08 01 00 00 <8b> 40 28 c7 04 24 c0 43 b0 e0 89 44 24 04 e8 0e 63 62 df 8b 83 
kernel: <0>Fatal exception: panic in 5 seconds
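One way to read the final oops in the log above, offered as an interpretation rather than a confirmed diagnosis: eax holds e0c8de40, the same value printed for hcd->driver, and the bytes at EIP (<8b> 40 28) decode as "mov 0x28(%eax),%eax". The faulting address is therefore eax + 0x28, which suggests the load of the hcd_free slot from the driver structure itself is what faulted. The helper below is a hypothetical name used only to show the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address touched by an instruction of the form "mov disp(%reg),%reg":
 * the register value plus the displacement, with 32-bit wraparound.
 */
uint32_t effective_address(uint32_t base, uint32_t disp)
{
    return base + disp;
}
```

effective_address(0xe0c8de40, 0x28) gives 0xe0c8de68, the address reported in the "Unable to handle kernel paging request" line.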


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10  3:43                                                                                 ` Alan Stern
@ 2003-12-10 13:12                                                                                   ` Duncan Sands
  2003-12-10 15:13                                                                                     ` Alan Stern
  2003-12-10 15:30                                                                                   ` Greg KH
  1 sibling, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-10 13:12 UTC (permalink / raw)
  To: Alan Stern, David Brownell
  Cc: Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

> In an earlier message I wrote that the HC driver couldn't unload so long
> as the device usbfs was using held a reference to its bus.  I just did
> some checking, and guess what: It can!

Uh oh.

> I looked at both the UHCI and OHCI drivers.  In their module_exit routines
> they call pci_unregister_driver().  Without knowing how the PCI subsystem
> works, I would assume this behaves like any other "deregister" routine in
> the driver model and returns without waiting for any reference count to go
> to 0 -- that's what release callbacks are for.
>
> However, the module_exit routines _don't_ wait for the release callbacks.
> They just go right on ahead and exit.  Result: when the reference count
> eventually does go to 0 (when usbfs drops its last reference), the
> hcd_free routine is no longer present and you get an oops.
>
> The proper fix would be to have each HC driver keep track of how many
> instances are allocated.  The module_exit routine must wait for that
> number to drop to 0 before returning.

Is this how it is usually done?

All the best,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 23:09                                                                   ` Alan Stern
                                                                                       ` (2 preceding siblings ...)
  2003-12-09 10:49                                                                     ` Duncan Sands
@ 2003-12-10 13:22                                                                     ` Duncan Sands
  2003-12-10 16:20                                                                       ` Oliver Neukum
  2003-12-10 17:21                                                                       ` David Brownell
  3 siblings, 2 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-10 13:22 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> That leads to the question of how to assure that the device doesn't go
> away before usb_set_configuration is called.  Perhaps
> usb_set_configuration and usb_unbind_interface should be changed to
> require the caller to hold the serialize lock.

How about

__usb_set_configuration - lockless version
usb_set_configuration - locked version

?

By the way, here is the list of routines that cause trouble for usbfs:

usb_probe_interface
usb_reset_device
usb_set_configuration
usb_unbind_interface

Both usb_set_configuration and usb_unbind_interface can be trivially
modified to have a __ form.  usb_probe_interface and usb_reset_device
require thought.
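A rough sketch of the locked/lockless split suggested above, in user-space C. The cfg_* names are hypothetical; the "_nolock" suffix stands in for the "__" prefix, which is reserved for the implementation outside the kernel. The lock is modelled as a flag so that calling the lockless form without it trips an assert.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct usb_device with its serialize lock. */
struct cfg_dev {
    bool serialize_held;
    bool disconnected;
    int  config;
};

/*
 * Lockless form (the "__" variant in the proposal):
 * the caller must already hold dev->serialize.
 */
int cfg_set_configuration_nolock(struct cfg_dev *dev, int config)
{
    assert(dev->serialize_held);
    if (dev->disconnected)
        return -1;              /* fail with an error instead of oopsing */
    dev->config = config;
    return 0;
}

/* Locked form: takes the lock, calls the lockless form, drops the lock. */
int cfg_set_configuration(struct cfg_dev *dev, int config)
{
    int ret;

    assert(!dev->serialize_held);
    dev->serialize_held = true;
    ret = cfg_set_configuration_nolock(dev, config);
    dev->serialize_held = false;
    return ret;
}
```

A caller such as usbfs that already holds the lock would use the lockless form; everyone else keeps calling the locked one.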

All the best,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 13:12                                                                                   ` Duncan Sands
@ 2003-12-10 15:13                                                                                     ` Alan Stern
  0 siblings, 0 replies; 113+ messages in thread
From: Alan Stern @ 2003-12-10 15:13 UTC (permalink / raw)
  To: Duncan Sands
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list, Greg KH

On Wed, 10 Dec 2003, Duncan Sands wrote:

> > The proper fix would be to have each HC driver keep track of how many
> > instances are allocated.  The module_exit routine must wait for that
> > number to drop to 0 before returning.
> 
> Is this how it is usually done?

I don't know -- but it's how I would do it.

Maybe Greg or Randy can tell us?

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10  3:43                                                                                 ` Alan Stern
  2003-12-10 13:12                                                                                   ` Duncan Sands
@ 2003-12-10 15:30                                                                                   ` Greg KH
  2003-12-10 16:02                                                                                     ` Duncan Sands
  2003-12-10 17:25                                                                                     ` Alan Stern
  1 sibling, 2 replies; 113+ messages in thread
From: Greg KH @ 2003-12-10 15:30 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Duncan Sands, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

On Tue, Dec 09, 2003 at 10:43:23PM -0500, Alan Stern wrote:
> On Tue, 9 Dec 2003, David Brownell wrote:
> 
> > Various folk have reported similar problems on system shutdown
> > before, and the simple fix has been not to clean up so aggressively.
> > 
> > What puzzled me was that a normal "rmmod" wouldn't give the
> > same symptoms -- but the same codepaths could oops in certain
> > system shutdown scenarios.
> 
> In an earlier message I wrote that the HC driver couldn't unload so long
> as the device usbfs was using held a reference to its bus.  I just did
> some checking, and guess what: It can!
> 
> I looked at both the UHCI and OHCI drivers.  In their module_exit routines
> they call pci_unregister_driver().  Without knowing how the PCI subsystem
> works, I would assume this behaves like any other "deregister" routine in
> the driver model and returns without waiting for any reference count to go
> to 0 -- that's what release callbacks are for.

No, the pci core calls the release() function in the pci driver that is
bound to that device.  It waits for that release() call to return before
continuing on.  You can sleep for however long you want in that
function, but once you return from there, the pci structures for that
device will be cleaned up.

> However, the module_exit routines _don't_ wait for the release callbacks.  

Not true.

> They just go right on ahead and exit.  Result: when the reference count 
> eventually does go to 0 (when usbfs drops its last reference), the 
> hcd_free routine is no longer present and you get an oops.

Hm, this could be easily tested by sleeping until usb_host_release() is
called when you unregister a device.  The i2c, pcmcia, and network
subsystems do this.  I think we now have a helper function in the driver
core to do this for us, so we don't have to declare our own completion
variable...
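A user-space analogue of "sleep until the release callback has run", sketched with a pthread mutex and condition variable. In the kernel this role is played by a completion; the release_waiter_* names below are hypothetical and this is not the driver-core helper mentioned above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* One-shot "has release() run yet?" flag that a thread can block on. */
struct release_waiter {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            released;
};

void release_waiter_init(struct release_waiter *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->released = false;
}

/* Called from the release callback once the last reference is gone. */
void release_waiter_signal(struct release_waiter *w)
{
    pthread_mutex_lock(&w->lock);
    w->released = true;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Called from the unregister path; returns only after the release has run. */
void release_waiter_wait(struct release_waiter *w)
{
    pthread_mutex_lock(&w->lock);
    while (!w->released)
        pthread_cond_wait(&w->cond, &w->lock);
    pthread_mutex_unlock(&w->lock);
}
```

The unregister path would call release_waiter_wait() after dropping its own reference, so the module code cannot go away while a release is still pending.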

> The proper fix would be to have each HC driver keep track of how many 
> instances are allocated.  The module_exit routine must wait for that 
> number to drop to 0 before returning.

That's what my proposal 1 paragraph up would do.  If I get the chance
this afternoon, I'll try to implement it if no one beats me to it...

thanks,

greg k-h


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 15:30                                                                                   ` Greg KH
@ 2003-12-10 16:02                                                                                     ` Duncan Sands
  2003-12-10 20:53                                                                                       ` Greg KH
  2003-12-10 17:25                                                                                     ` Alan Stern
  1 sibling, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-10 16:02 UTC (permalink / raw)
  To: Greg KH, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> No, the pci core calls the release() function in the pci driver that is
> bound to that device.  It waits for that release() call to return before
> continuing on.  You can sleep for however long you want in that
> function, but once you return from there, the pci structures for that
> device will be cleaned up.
>
> > However, the module_exit routines _don't_ wait for the release callbacks.
>
> Not true.
>
> > They just go right on ahead and exit.  Result: when the reference count
> > eventually does go to 0 (when usbfs drops its last reference), the
> > hcd_free routine is no longer present and you get an oops.
>
> Hm, this could be easily tested by sleeping until usb_host_release() is
> called when you unregister a device.  The i2c, pcmcia, and network
> subsystems do this.  I think we now have a helper function in the driver
> core to do this for us, so we don't have to declare our own completion
> variable...
>
> > The proper fix would be to have each HC driver keep track of how many
> > instances are allocated.  The module_exit routine must wait for that
> > number to drop to 0 before returning.
>
> That's what my proposal 1 paragraph up would do.  If I get the chance
> this afternoon, I'll try to implement it if no one beats me to it...

Hi Greg, so this means that rmmod will sleep in an unkillable state until
all references are dropped?  I don't know if you've been following this
thread or not, but the oops occurred when I modified usbfs to hold a
reference to the usb_device until no-one was using a given usbfs file.  I
guess this means that I should change my patch so that the reference to
the usb_device is dropped as soon as possible, right?

Thanks for looking into this,

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 13:22                                                                     ` Duncan Sands
@ 2003-12-10 16:20                                                                       ` Oliver Neukum
  2003-12-10 16:49                                                                         ` Duncan Sands
  2003-12-10 17:21                                                                       ` David Brownell
  1 sibling, 1 reply; 113+ messages in thread
From: Oliver Neukum @ 2003-12-10 16:20 UTC (permalink / raw)
  To: Duncan Sands, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Wednesday 10 December 2003 14:22, Duncan Sands wrote:
> > That leads to the question of how to assure that the device doesn't go
> > away before usb_set_configuration is called.  Perhaps
> > usb_set_configuration and usb_unbind_interface should be changed to
> > require the caller to hold the serialize lock.
> 
> How about
> 
> __usb_set_configuration - lockless version
> usb_set_configuration - locked version

Partially done.
That's what the _physical version of usb_reset_device() is about.

	Regards
		Oliver
 


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 16:20                                                                       ` Oliver Neukum
@ 2003-12-10 16:49                                                                         ` Duncan Sands
  2003-12-10 16:58                                                                           ` Oliver Neukum
  2003-12-10 17:34                                                                           ` David Brownell
  0 siblings, 2 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-10 16:49 UTC (permalink / raw)
  To: Oliver Neukum, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Wednesday 10 December 2003 17:20, Oliver Neukum wrote:
> On Wednesday 10 December 2003 14:22, Duncan Sands wrote:
> > > That leads to the question of how to assure that the device doesn't go
> > > away before usb_set_configuration is called.  Perhaps
> > > usb_set_configuration and usb_unbind_interface should be changed to
> > > require the caller to hold the serialize lock.
> >
> > How about
> >
> > __usb_set_configuration - lockless version
> > usb_set_configuration - locked version
>
> Partially done.
> That's what the _physical version of usb_reset_device() is about.

Unfortunately, usb_physical_reset_device calls usb_set_configuration
which takes dev->serialize.

Duncan.


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 16:49                                                                         ` Duncan Sands
@ 2003-12-10 16:58                                                                           ` Oliver Neukum
  2003-12-11  9:45                                                                             ` Duncan Sands
  2003-12-10 17:34                                                                           ` David Brownell
  1 sibling, 1 reply; 113+ messages in thread
From: Oliver Neukum @ 2003-12-10 16:58 UTC (permalink / raw)
  To: Duncan Sands, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Wednesday 10 December 2003 17:49, Duncan Sands wrote:
> On Wednesday 10 December 2003 17:20, Oliver Neukum wrote:
> > On Wednesday 10 December 2003 14:22, Duncan Sands wrote:
> > > > That leads to the question of how to assure that the device doesn't go
> > > > away before usb_set_configuration is called.  Perhaps
> > > > usb_set_configuration and usb_unbind_interface should be changed to
> > > > require the caller to hold the serialize lock.
> > >
> > > How about
> > >
> > > __usb_set_configuration - lockless version
> > > usb_set_configuration - locked version
> >
> > Partially done.
> > That's what the _physical version of usb_reset_device() is about.
> 
> Unfortunately, usb_physical_reset_device calls usb_set_configuration
> which takes dev->serialize.

That is bad, but the solution is obvious.
All such operations need a _physical version.
At first sight this may look less elegant than some lock dropping schemes,
but it is a solution that produces obviously correct code paths with respect
to locking.
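
A minimal userspace sketch of that locked/lockless split, assuming a
pthread mutex in place of dev->serialize.  The fake_* names are
hypothetical and are not usbcore functions; the only point is that a
path which already holds the lock calls the lockless variant rather
than the locked one.

```c
#include <pthread.h>

/* Hypothetical stand-in for struct usb_device; not kernel code. */
struct fake_dev {
	pthread_mutex_t serialize;	/* models dev->serialize */
	int config;			/* currently selected configuration */
};

/* Lockless variant: the caller must already hold dev->serialize. */
static int __fake_set_configuration(struct fake_dev *dev, int config)
{
	if (config < 0)
		return -1;
	dev->config = config;
	return 0;
}

/* Locked variant: takes dev->serialize around the lockless one. */
static int fake_set_configuration(struct fake_dev *dev, int config)
{
	int ret;

	pthread_mutex_lock(&dev->serialize);
	ret = __fake_set_configuration(dev, config);
	pthread_mutex_unlock(&dev->serialize);
	return ret;
}

/*
 * A reset path that holds the lock for its whole duration.  Calling
 * fake_set_configuration() from here would try to take the mutex twice.
 */
static int fake_reset_device(struct fake_dev *dev, int saved_config)
{
	int ret;

	pthread_mutex_lock(&dev->serialize);
	dev->config = 0;	/* models the device losing its state */
	ret = __fake_set_configuration(dev, saved_config);
	pthread_mutex_unlock(&dev->serialize);
	return ret;
}
```

Ordinary callers would use fake_set_configuration(); only code that
already owns the lock, such as fake_reset_device() here, touches the
double-underscore variant.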

	Regards
		Oliver



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 13:22                                                                     ` Duncan Sands
  2003-12-10 16:20                                                                       ` Oliver Neukum
@ 2003-12-10 17:21                                                                       ` David Brownell
  2003-12-11  9:42                                                                         ` Duncan Sands
  1 sibling, 1 reply; 113+ messages in thread
From: David Brownell @ 2003-12-10 17:21 UTC (permalink / raw)
  To: Duncan Sands; +Cc: linux-kernel, USB development list

[ CC list trimmed, most folk are on one of the cc'd lists ]

> By the way, here is the list of routines that cause trouble for usbfs:
> 
> usb_probe_interface

In proc_resetdevice() ... after usb_reset_device().
If usb_reset_device() worked sanely, it wouldn't be
necessary to try fixing up its result.  Plus, last I
looked, I don't think usbfs fixed it up correctly.

Actually that call is dangerous and probably should
fail if usbfs isn't controlling all the interfaces
on the device ... checking before it tries.

> usb_reset_device

We've known for some time this routine needs a rewrite.
It's never quite worked right, and it doesn't handle
DFU style devices (like the most common USB 802.11b
adapters) well at all.

> usb_set_configuration

That is, you're saying that _if_ usbfs is modified to
get rid of ps->devsem and use dev->serialize instead,
then you'd need some other way to guard proc_setconfig()
against disconnect?  That still seems like a chicken/egg
issue to me.

> usb_unbind_interface

See the patch I posted yesterday evening, with usbfs parts
of the updates to driver binding.  It's incorrect for usbfs
ever to be calling that ... device_release_driver() is the
thing to call, for drivers that weren't bound using the
usb_driver_claim_interface() call.  That way the sysfs
state also gets cleaned up ...

- Dave




* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 15:30                                                                                   ` Greg KH
  2003-12-10 16:02                                                                                     ` Duncan Sands
@ 2003-12-10 17:25                                                                                     ` Alan Stern
  2003-12-10 20:46                                                                                       ` Greg KH
  1 sibling, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-10 17:25 UTC (permalink / raw)
  To: Greg KH
  Cc: David Brownell, Duncan Sands, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

Sorry Greg, I'm having trouble understanding if you're agreeing with me or 
disagreeing :-) ...

On Wed, 10 Dec 2003, Greg KH wrote:

> On Tue, Dec 09, 2003 at 10:43:23PM -0500, Alan Stern wrote:
> > 
> > I looked at both the UHCI and OHCI drivers.  In their module_exit routines
> > they call pci_unregister_driver().  Without knowing how the PCI subsystem
> > works, I would assume this behaves like any other "deregister" routine in
> > the driver model and returns without waiting for any reference count to go
> > to 0 -- that's what release callbacks are for.
> 
> No, the pci core calls the release() function in the pci driver that is
> bound to that device.  It waits for that release() call to return before
> continuing on.  You can sleep for however long you want in that
> function, but once you return from there, the pci structures for that
> device will be cleaned up.

I looked through the call tree for pci_unregister_driver().  There doesn't 
appear to be any place where it calls a release function.  In fact, struct 
pci_driver doesn't even _contain_ a release function!  Maybe you're 
thinking of the "remove" function.  But pci_unregister_driver() doesn't 
wait for that; it will return immediately even if the reference count is 
still > 0.

> 
> > However, the module_exit routines _don't_ wait for the release callbacks.  
> 
> Not true.

Here's the entire source for the UHCI module_exit routine:


static void __exit uhci_hcd_cleanup(void) 
{
	pci_unregister_driver(&uhci_pci_driver);
	
	if (kmem_cache_destroy(uhci_up_cachep))
		printk(KERN_INFO "uhci: not all urb_priv's were freed\n");

#ifdef CONFIG_PROC_FS
	remove_proc_entry("driver/uhci", 0);
#endif

	if (errbuf)
		kfree(errbuf);
}


Where in there does it wait for a release callback?

> 
> > They just go right on ahead and exit.  Result: when the reference count 
> > eventually does go to 0 (when usbfs drops its last reference), the 
> > hcd_free routine is no longer present and you get an oops.
> 
> Hm, this could be easily tested by sleeping until usb_host_release() is
> called when you unregister a device.  The i2c, pcmcia, and network
> subsystems do this.  I think we now have a helper function in the driver
> core to do this for us, so we don't have to declare our own completion
> variable...

Or sleeping until the actual release function (struct hc_driver->hcd_free) 
is called.  But you have to make sure it was called for the host you are 
trying to deregister, not some other host.

Of course, if all you want to do is unload the module then it doesn't 
matter which host is which.  You just have to wait until they are all 
gone.
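
A rough userspace sketch of that "wait until they are all gone" idea,
assuming a pthread mutex and condition variable in place of kernel
primitives.  The fake_* names and the global counter are hypothetical;
this is not code from any HC driver.

```c
#include <pthread.h>

static pthread_mutex_t fake_host_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t fake_host_gone = PTHREAD_COND_INITIALIZER;
static int fake_host_count;	/* live host instances, guarded by the lock */

/* Called when a host controller instance is allocated. */
static void fake_host_alloc(void)
{
	pthread_mutex_lock(&fake_host_lock);
	fake_host_count++;
	pthread_mutex_unlock(&fake_host_lock);
}

/* Called from the release path, once the last reference to a host goes. */
static void fake_host_free(void)
{
	pthread_mutex_lock(&fake_host_lock);
	if (--fake_host_count == 0)
		pthread_cond_broadcast(&fake_host_gone);
	pthread_mutex_unlock(&fake_host_lock);
}

/* Models the tail of module_exit: block until every host is gone. */
static int fake_module_exit_wait(void)
{
	int remaining;

	pthread_mutex_lock(&fake_host_lock);
	while (fake_host_count > 0)
		pthread_cond_wait(&fake_host_gone, &fake_host_lock);
	remaining = fake_host_count;
	pthread_mutex_unlock(&fake_host_lock);
	return remaining;
}
```

The counter does not care which host is which, so it only answers the
"is it safe to unload" question, not "has this particular host been
freed".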

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 16:49                                                                         ` Duncan Sands
  2003-12-10 16:58                                                                           ` Oliver Neukum
@ 2003-12-10 17:34                                                                           ` David Brownell
  2003-12-10 17:54                                                                             ` Duncan Sands
  1 sibling, 1 reply; 113+ messages in thread
From: David Brownell @ 2003-12-10 17:34 UTC (permalink / raw)
  To: Duncan Sands; +Cc: linux-kernel, USB development list


> Unfortunately, usb_physical_reset_device calls usb_set_configuration
> which takes dev->serialize.

Not since late August it doesn't ...




* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 17:34                                                                           ` David Brownell
@ 2003-12-10 17:54                                                                             ` Duncan Sands
  2003-12-10 18:19                                                                               ` Alan Stern
  2003-12-10 19:43                                                                               ` David Brownell
  0 siblings, 2 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-10 17:54 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-kernel, USB development list

On Wednesday 10 December 2003 18:34, David Brownell wrote:
> > Unfortunately, usb_physical_reset_device calls usb_set_configuration
> > which takes dev->serialize.
>
> Not since late August it doesn't ...

In current 2.5 bitkeeper it does.

Duncan.

int usb_set_configuration(struct usb_device *dev, int configuration)
{
        int i, ret;
        struct usb_host_config *cp = NULL;

        /* dev->serialize guards all config changes */
        down(&dev->serialize);


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 17:54                                                                             ` Duncan Sands
@ 2003-12-10 18:19                                                                               ` Alan Stern
  2003-12-11  9:36                                                                                 ` Duncan Sands
  2003-12-10 19:43                                                                               ` David Brownell
  1 sibling, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-10 18:19 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Kernel development list, USB development list

On Wed, 10 Dec 2003, Duncan Sands wrote:

> On Wednesday 10 December 2003 18:34, David Brownell wrote:
> > > Unfortunately, usb_physical_reset_device calls usb_set_configuration
> > > which takes dev->serialize.
> >
> > Not since late August it doesn't ...
> 
> In current 2.5 bitkeeper it does.

I don't understand the problem.  What's wrong with dropping dev->serialize 
before calling usb_reset_device() or usb_set_configuration() and then 
reacquiring it afterward?

Alan Stern



* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 17:54                                                                             ` Duncan Sands
  2003-12-10 18:19                                                                               ` Alan Stern
@ 2003-12-10 19:43                                                                               ` David Brownell
  2003-12-11  9:21                                                                                 ` Duncan Sands
  1 sibling, 1 reply; 113+ messages in thread
From: David Brownell @ 2003-12-10 19:43 UTC (permalink / raw)
  To: Duncan Sands; +Cc: linux-kernel, USB development list

Duncan Sands wrote:
> On Wednesday 10 December 2003 18:34, David Brownell wrote:
> 
>>>Unfortunately, usb_physical_reset_device calls usb_set_configuration
>>>which takes dev->serialize.
>>
>>Not since late August it doesn't ...
> 
> 
> In current 2.5 bitkeeper it does.

usb_physical_reset_device() does not call usb_set_configuration()
except in the known-broken (for other reasons too!) "firmware changed"
path.  Known-broken, but not yet removed since nobody has reported
running into that or the other deadlock; the real fix is to force
re-enumeration of the device.

The main path uses usb_control_msg(), because usb_reset_device()
currently guarantees it preserves (restores) altsettings as well
as driver bindings.  It couldn't even use usb_reset_configuration(),
since that gives altsettings their initial values (zero).

- Dave



> Duncan.
> 
> int usb_set_configuration(struct usb_device *dev, int configuration)
> {
>         int i, ret;
>         struct usb_host_config *cp = NULL;
> 
>         /* dev->serialize guards all config changes */
>         down(&dev->serialize);
> 




* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 17:25                                                                                     ` Alan Stern
@ 2003-12-10 20:46                                                                                       ` Greg KH
  2003-12-10 21:08                                                                                         ` Greg KH
  2003-12-10 22:08                                                                                         ` Alan Stern
  0 siblings, 2 replies; 113+ messages in thread
From: Greg KH @ 2003-12-10 20:46 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Duncan Sands, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

On Wed, Dec 10, 2003 at 12:25:38PM -0500, Alan Stern wrote:
> Sorry Greg, I'm having trouble understanding if you're agreeing with me or 
> disagreeing :-) ...

Heh, I don't really know either, just trying to state what is really
happening in the pci code.

> On Wed, 10 Dec 2003, Greg KH wrote:
> 
> > On Tue, Dec 09, 2003 at 10:43:23PM -0500, Alan Stern wrote:
> > > 
> > > I looked at both the UHCI and OHCI drivers.  In their module_exit routines
> > > they call pci_unregister_driver().  Without knowing how the PCI subsystem
> > > works, I would assume this behaves like any other "deregister" routine in
> > > the driver model and returns without waiting for any reference count to go
> > > to 0 -- that's what release callbacks are for.
> > 
> > No, the pci core calls the release() function in the pci driver that is
> > bound to that device.  It waits for that release() call to return before
> > continuing on.  You can sleep for however long you want in that
> > function, but once you return from there, the pci structures for that
> > device will be cleaned up.
> 
> I looked through the call tree for pci_unregister_driver().  There doesn't 
> appear to be any place where it calls a release function.  In fact, struct 
> pci_driver doesn't even _contain_ a release function!  Maybe you're 
> thinking of the "remove" function.  But pci_unregister_driver() doesn't 
> wait for that; it will return immediately even if the reference count is 
> still > 0.

Sorry, the pci driver has a remove() function, that's what I meant.

But anyway, yes, it does get called.  It looks like it is time to help
explain the driver core...

When pci_unregister_driver() gets called, here is what happens:
	- pci_unregister_driver calls driver_unregister() with a pointer
	  to the pci_driver->driver field.
	- That pci_driver->driver field is set up when
	  pci_register_driver() is called, and contains pointers to the
	  pci_device_remove function.
	- driver_unregister() calls bus_remove_driver()
	- bus_remove_driver() locks some locks and calls
	  driver_detach().
	- driver_detach() walks the list of all devices that this driver
	  is attached to, and calls device_release_driver() on every
	  device.
	- device_release_driver() unlinks some sysfs files, does some
	  power management stuff, and then calls the remove() function
	  that is associated with that driver.  That remove function for
	  a pci driver is pci_device_remove()
	- pci_device_remove() then calls down to the pci_driver's remove
	  function, which in our case for a USB PCI Host controller
	  driver would be usb_hcd_pci_remove()
	- I think you can follow what usb_hcd_pci_remove() does.  After
	  it is finished, the call stack is unwound, and eventually
	  returns back to the caller of pci_unregister_driver().
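
The chain above can be modeled in userspace roughly as follows.  This
is only a sketch: the fake_* structures and functions are hypothetical
stand-ins for the driver core, and the sysfs and power management steps
are left out.  It shows that unregistering returns only after remove()
has run for every bound device, and that nothing in this path looks at
reference counts.

```c
#include <stddef.h>

struct fake_device;

/* Hypothetical stand-in for a driver with a remove() callback. */
struct fake_driver {
	void (*remove)(struct fake_device *dev);
	struct fake_device *devices;	/* list of devices bound to us */
};

struct fake_device {
	struct fake_driver *driver;
	struct fake_device *next;
	int removed;			/* set by the remove() callback */
};

/* Models device_release_driver(): call remove(), then unbind. */
static void fake_device_release_driver(struct fake_device *dev)
{
	if (dev->driver && dev->driver->remove)
		dev->driver->remove(dev);
	dev->driver = NULL;
}

/* Models driver_detach(): walk every device bound to this driver. */
static void fake_driver_detach(struct fake_driver *drv)
{
	while (drv->devices) {
		struct fake_device *dev = drv->devices;

		drv->devices = dev->next;
		dev->next = NULL;
		fake_device_release_driver(dev);
	}
}

/* Models driver_unregister(): returns once all remove() calls are done. */
static void fake_driver_unregister(struct fake_driver *drv)
{
	fake_driver_detach(drv);
}

/* Models the bus-specific remove, e.g. what usb_hcd_pci_remove() is here. */
static void fake_hcd_remove(struct fake_device *dev)
{
	dev->removed = 1;
}
```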

Now grasshopper, are you wise in the ways of the driver core or are you
wishing you never asked?  :)

> > > However, the module_exit routines _don't_ wait for the release callbacks.  
> > 
> > Not true.
> 
> Here's the entire source for the UHCI module_exit routine:
> 
> 
> static void __exit uhci_hcd_cleanup(void) 
> {
> 	pci_unregister_driver(&uhci_pci_driver);

This call does all the cleanups we need.

> 	
> 	if (kmem_cache_destroy(uhci_up_cachep))
> 		printk(KERN_INFO "uhci: not all urb_priv's were freed\n");
> 
> #ifdef CONFIG_PROC_FS
> 	remove_proc_entry("driver/uhci", 0);
> #endif
> 
> 	if (errbuf)
> 		kfree(errbuf);
> }
> 
> 
> Where in there does it wait for a release callback?

In pci_unregister_driver().

> > > They just go right on ahead and exit.  Result: when the reference count 
> > > eventually does go to 0 (when usbfs drops its last reference), the 
> > > hcd_free routine is no longer present and you get an oops.
> > 
> > Hm, this could be easily tested by sleeping until usb_host_release() is
> > called when you unregister a device.  The i2c, pcmcia, and network
> > subsystems do this.  I think we now have a helper function in the driver
> > core to do this for us, so we don't have to declare our own completion
> > variable...
> 
> Or sleeping until the actual release function (struct hc_driver->hcd_free) 
> is called.  But you have to make sure it was called for the host you are 
> trying to deregister, not some other host.

That is done by the following logic (yeah, it's a maze of twisty
paths...)

	- in usb_hcd_pci_remove() we call usb_deregister_bus() to
	  unregister the bus structure.
	- usb_deregister_bus() calls class_device_unregister() with a
	  pointer to the bus's class device structure.  That class
	  device structure was previously registered to the
	  usb_host_class.  The usb_host_class's release function is the
	  usb_host_release() call.  That release function will be called
	  when the last reference on the class device is released.
	- usb_host_release() calls the release() function of the usb_bus
	  structure, which points back to how to clean up the memory for
	  that specific usb bus driver.  Now for all host controllers
	  that use the hcd framework, that points to the
	  hcd_pci_release() function.  Which will then call the
	  hcd_free() function for that specific hcd driver.
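
A small single-threaded sketch of that release-on-last-reference
behaviour, under the assumption of a plain integer count (real code
would need an atomic one).  The fake_* names are hypothetical.  It
shows that the unregister call only drops one reference, so release()
runs later if somebody else still holds the device.

```c
/* Hypothetical stand-in for struct class_device; not kernel code. */
struct fake_class_dev {
	int refcount;
	void (*release)(struct fake_class_dev *cd);
};

static void fake_class_dev_get(struct fake_class_dev *cd)
{
	cd->refcount++;
}

/* Drop one reference; release() runs only when the count reaches 0. */
static void fake_class_dev_put(struct fake_class_dev *cd)
{
	if (--cd->refcount == 0 && cd->release)
		cd->release(cd);
}

/* Models class_device_unregister(): drops the registration reference. */
static void fake_class_device_unregister(struct fake_class_dev *cd)
{
	fake_class_dev_put(cd);
}

static int fake_hcd_freed;	/* set once the release callback has run */

/* Models usb_host_release() ending up in the hcd's free routine. */
static void fake_host_release(struct fake_class_dev *cd)
{
	(void)cd;
	fake_hcd_freed = 1;
}
```

With one extra reference held (as usbfs would hold one),
fake_class_device_unregister() returns with fake_hcd_freed still 0, and
the release callback only fires on the final fake_class_dev_put().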

Does that help?  Or does your head hurt even more now?

So, if we wait for the class_device_unregister() function to actually
free the memory (wait for the usb_host_release() function to complete)
then we know it is absolutely safe for the driver to be removed from the
system.

> Of course, if all you want to do is unload the module then it doesn't 
> matter which host is which.  You just have to wait until they are all 
> gone.

Exactly, and that will happen, if we wait on that
class_device_unregister() call.  An example of how to do that can be
seen in the i2c_del_adapter() function in drivers/i2c/i2c-core.c.
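
For illustration, the wait can be modeled in userspace like this.  It
is a sketch under stated assumptions, not the i2c code: fake_completion
is a hypothetical pthread-based imitation of a kernel completion, and
fake_bus stands in for struct usb_bus.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace imitation of a kernel completion. */
struct fake_completion {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	int done;
};

static void fake_init_completion(struct fake_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->wait, NULL);
	c->done = 0;
}

static void fake_complete(struct fake_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = 1;
	pthread_cond_broadcast(&c->wait);
	pthread_mutex_unlock(&c->lock);
}

static void fake_wait_for_completion(struct fake_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->wait, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical stand-in for struct usb_bus. */
struct fake_bus {
	atomic_int refcount;
	struct fake_completion released;
};

/* Dropping the last reference is what signals the completion. */
static void fake_bus_put(struct fake_bus *bus)
{
	if (atomic_fetch_sub(&bus->refcount, 1) == 1)
		fake_complete(&bus->released);
}

/* Drop our reference, then sleep until the last holder has let go. */
static void fake_deregister_bus(struct fake_bus *bus)
{
	fake_init_completion(&bus->released);
	fake_bus_put(bus);
	fake_wait_for_completion(&bus->released);
}
```

If another holder still has a reference when fake_deregister_bus() is
called, the caller sleeps until that holder calls fake_bus_put() from
some other thread.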

greg k-h


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 16:02                                                                                     ` Duncan Sands
@ 2003-12-10 20:53                                                                                       ` Greg KH
  2003-12-11  8:49                                                                                         ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Greg KH @ 2003-12-10 20:53 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Alan Stern, David Brownell, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

On Wed, Dec 10, 2003 at 05:02:16PM +0100, Duncan Sands wrote:
> > That's what my proposal 1 paragraph up would do.  If I get the chance
> > this afternoon, I'll try to implement it if no one beats me to it...
> 
> Hi Greg, so this means that rmmod will sleep in an unkillable state until
> all references are dropped?

Yes, that is what would happen.

Now you could get into a deadlock by trying something pathological like:
	rmmod usb-hcd < /sys/devices/pci0000:00/0000:00:1d.7/usb1/idVendor

but hey, if you do that, you deserve the deadlock :)

(and yes, I know you can do this for network devices, but they have
their own thread/timer/something to prevent this deadlock from
happening...)

> I don't know if you've been following this thread or not, but the oops
> occurred when I modified usbfs to hold a reference to the usb_device
> until no-one was using a given usbfs file.

That's a good thing to do.  It should work.

> I guess this means that I should change my patch so that the reference
> to the usb_device is dropped as soon as possible, right?

No, the bug should be fixed.  I've seen this bug happen if someone has a
usb-serial device open and then unloads the host controller driver.  In
fact, I think there's a bugzilla entry just for that...

Yeah, here it is:
	http://bugme.osdl.org/show_bug.cgi?id=1191

The very same oops you are seeing.

So no, it's not your fault.  We need to fix the real problem.

> Thanks for looking into this,

No problem, thanks for reminding me about this.

thanks,

greg k-h


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 20:46                                                                                       ` Greg KH
@ 2003-12-10 21:08                                                                                         ` Greg KH
  2003-12-11  2:10                                                                                           ` Vince
  2003-12-10 22:08                                                                                         ` Alan Stern
  1 sibling, 1 reply; 113+ messages in thread
From: Greg KH @ 2003-12-10 21:08 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Duncan Sands, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

On Wed, Dec 10, 2003 at 12:46:21PM -0800, Greg KH wrote:
> > Of course, if all you want to do is unload the module then it doesn't 
> > matter which host is which.  You just have to wait until they are all 
> > gone.
> 
> Exactly, and that will happen, if we wait on that
> class_device_unregister() call.  An example of how to do that can be
> seen in the i2c_del_adapter() function in drivers/i2c/i2c-core.c.

Ok, below is the patch.  I've only compile tested it, not run it yet.
Please let me know if it works for you or not.

thanks,

greg k-h


===== hcd.c 1.123 vs edited =====
--- 1.123/drivers/usb/core/hcd.c	Sun Dec  7 04:29:05 2003
+++ edited/hcd.c	Wed Dec 10 13:06:19 2003
@@ -588,6 +588,9 @@
 
 	if (bus->release)
 		bus->release(bus);
+	/* FIXME change this when the driver core gets the
+	 * class_device_unregister_wait() call */
+	complete(&bus->released);
 }
 
 static struct class usb_host_class = {
@@ -724,7 +727,11 @@
 
 	clear_bit (bus->busnum, busmap.busmap);
 
+	/* FIXME change this when the driver core gets the
+	 * class_device_unregister_wait() call */
+	init_completion(&bus->released);
 	class_device_unregister(&bus->class_dev);
+	wait_for_completion(&bus->released);
 }
 EXPORT_SYMBOL (usb_deregister_bus);
 
===== usb.h 1.164 vs edited =====
--- 1.164/include/linux/usb.h	Mon Oct  6 10:46:13 2003
+++ edited/usb.h	Wed Dec 10 13:07:27 2003
@@ -210,6 +210,8 @@
 
 	struct class_device class_dev;	/* class device for this bus */
 	void (*release)(struct usb_bus *bus);	/* function to destroy this bus's memory */
+	/* FIXME, remove this when the driver core gets class_device_unregister_wait */
+	struct completion released;
 };
 #define	to_usb_bus(d) container_of(d, struct usb_bus, class_dev)
 


* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 20:46                                                                                       ` Greg KH
  2003-12-10 21:08                                                                                         ` Greg KH
@ 2003-12-10 22:08                                                                                         ` Alan Stern
  2003-12-11  6:47                                                                                           ` Greg KH
  1 sibling, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-10 22:08 UTC (permalink / raw)
  To: Greg KH
  Cc: David Brownell, Duncan Sands, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

On Wed, 10 Dec 2003, Greg KH wrote:

> Sorry, the pci driver has a remove() function, that's what I meant.
> 
> But anyway, yes, it does get called.  It looks like it is time to help
> explain the driver core...
> 
> When pci_unregister_driver() gets called, here is what happens:
> 	- pci_unregister_driver calls driver_unregister() with a pointer
> 	  to the pci_driver->driver field.
> 	- That pci_driver->driver field is set up when
> 	  pci_register_driver() is called, and contains pointers to the
> 	  pci_device_remove function.
> 	- driver_unregister() calls bus_remove_driver()
> 	- bus_remove_driver() locks some locks and calls
> 	  driver_detach().
> 	- driver_detach() walks the list of all devices that this driver
> 	  is attached to, and calls device_release_driver() on every
> 	  device.

That's about as far as I had followed it.  I should have kept going... but 
it wouldn't have made any difference.

> 	- device_release_driver() unlinks some sysfs files, does some
> 	  power management stuff, and then calls the remove() function
> 	  that is associated with that driver.  That remove function for
> 	  a pci driver is pci_device_remove()
> 	- pci_device_remove() then calls down to the pci_driver's remove
> 	  function, which in our case for a USB PCI Host controller
> 	  driver would be usb_hcd_pci_remove()
> 	- I think you can follow what usb_hcd_pci_remove() does.  After
> 	  it is finished, the call stack is unwound, and eventually
> 	  returns back to the caller of pci_unregister_driver().
> 
> Now grasshopper, are you wise in the ways of the driver core or are you
> wishing you never asked?  :)

Both, I think.  I still don't see where pci_unregister_driver() ends up
waiting for the reference count to drop to 0.  In fact, I think maybe you
agree that it _doesn't_ wait.  Which was my earlier point.


> > Or sleeping until the actual release function (struct hc_driver->hcd_free) 
> > is called.  But you have to make sure it was called for the host you are 
> > trying to deregister, not some other host.
> 
> That is done by the following logic (yeah, it's a maze of twisty
> paths...)
> 
> 	- in usb_hcd_pci_remove() we call usb_deregister_bus() to
> 	  unregister the bus structure.
> 	- usb_deregister_bus() calls class_device_unregister() with a
> 	  pointer to the bus's class device structure.  That class
> 	  device structure was previously registered to the
> 	  usb_host_class.  The usb_host_class's release function is the
> 	  usb_host_release() call.  That release function will be called
> 	  when the last reference on the class device is released.
> 	- usb_host_release() calls the release() function of the usb_bus
> 	  structure, which points back to how to clean up the memory for
> 	  that specific usb bus driver.  Now for all host controllers
> 	  that use the hcd framework, that points to the
> 	  hcd_pci_release() function.  Which will then call the
> 	  hcd_free() function for that specific hcd driver.
> 
> Does that help?  Or does your head hurt even more now?

I had already figured that much out for myself.  So
pci_unregister_driver() will follow this all the way down to
class_device_unregister(), which will decrement a reference count and
return immediately without calling usb_host_release() if the count isn't
0, which it wasn't in this case.

As a result pci_unregister_driver() returns immediately and the module is 
unloaded.  Later on when usb_host_release() does get called -- BOOM!

> So, if we wait for the class_device_unregister() function to actually
> free the memory (wait for the usb_host_release() function to complete)
> then we know it is absolutely safe for the driver to be removed from the
> system.
> 
> > Of course, if all you want to do is unload the module then it doesn't 
> > matter which host is which.  You just have to wait until they are all 
> > gone.
> 
> Exactly, and that will happen, if we wait on that
> class_device_unregister() call.  An example of how to do that can be
> seen in the i2c_del_adapter() function in drivers/i2c/i2c-core.c.

In the absence of the class_device_unregister_wait() function, the patch
you have created appears to be necessary.
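
For reference, the idea behind the patch can be modelled in userspace.
This is only a sketch under stated assumptions: struct toy_bus,
toy_bus_put(), late_holder() and toy_deregister_bus() are hypothetical
names, and the completion here is a pthread-based stand-in, not the
kernel's implementation.  It shows deregistration dropping its own
reference and then sleeping until whoever drops the last reference has
run the release step.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical model of a bus whose class device is refcounted. */
struct toy_bus {
	pthread_mutex_t lock;
	int refcount;
	bool release_ran;
	struct completion released;
};

/* Drop one reference; only the last put runs the release step. */
static void toy_bus_put(struct toy_bus *bus)
{
	pthread_mutex_lock(&bus->lock);
	bool last = (--bus->refcount == 0);
	pthread_mutex_unlock(&bus->lock);

	if (last) {
		bus->release_ran = true;	/* stands in for bus->release(bus) */
		complete(&bus->released);
	}
}

/* A second holder (say, an open file) that drops its reference late. */
static void *late_holder(void *arg)
{
	toy_bus_put(arg);
	return NULL;
}

/* Same shape as the patched usb_deregister_bus(): unregister, then wait. */
static void toy_deregister_bus(struct toy_bus *bus)
{
	init_completion(&bus->released);
	toy_bus_put(bus);		/* may not be the last reference */
	wait_for_completion(&bus->released);
}

/* Returns 1 if release had run by the time deregistration returned. */
static int toy_demo(void)
{
	struct toy_bus bus = { .refcount = 2, .release_ran = false };
	pthread_t holder;
	int ok;

	pthread_mutex_init(&bus.lock, NULL);
	if (pthread_create(&holder, NULL, late_holder, &bus) != 0)
		return 0;
	toy_deregister_bus(&bus);
	ok = bus.release_ran;
	pthread_join(holder, NULL);
	return ok;
}
```

Without the wait, toy_deregister_bus() could return while the other
holder still had its reference, which is the window in which the module
text could go away before the release function runs.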

As Pat LaVarre would say, I think we're agreeing violently.

Alan Stern




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 21:08                                                                                         ` Greg KH
@ 2003-12-11  2:10                                                                                           ` Vince
  2003-12-11  6:46                                                                                             ` Greg KH
  0 siblings, 1 reply; 113+ messages in thread
From: Vince @ 2003-12-11  2:10 UTC (permalink / raw)
  To: Greg KH
  Cc: Alan Stern, David Brownell, Duncan Sands, Randy.Dunlap, mfedyk,
	zwane, linux-kernel, USB development list

I've tried this patch together with Duncan's fix for my first problem 
and I couldn't reproduce any oops on my system. I've only tried a few (a 
dozen) cycles of loading/unloading for now (but earlier that was more
than enough to trigger an oops). I'll report later if/when I get any bad 
behaviour but so far everything looks fine.

Many thanks,

Vincent

Greg KH wrote:
> Ok, below is the patch.  I've only compile tested it, not run it yet.
> Please let me know if it works for you or not.
> 
> thanks,
> 
> greg k-h
> 
> 
> ===== hcd.c 1.123 vs edited =====
> --- 1.123/drivers/usb/core/hcd.c	Sun Dec  7 04:29:05 2003
> +++ edited/hcd.c	Wed Dec 10 13:06:19 2003
> @@ -588,6 +588,9 @@
>  
>  	if (bus->release)
>  		bus->release(bus);
> +	/* FIXME change this when the driver core gets the
> +	 * class_device_unregister_wait() call */
> +	complete(&bus->released);
>  }
>  
>  static struct class usb_host_class = {
> @@ -724,7 +727,11 @@
>  
>  	clear_bit (bus->busnum, busmap.busmap);
>  
> +	/* FIXME change this when the driver core gets the
> +	 * class_device_unregister_wait() call */
> +	init_completion(&bus->released);
>  	class_device_unregister(&bus->class_dev);
> +	wait_for_completion(&bus->released);
>  }
>  EXPORT_SYMBOL (usb_deregister_bus);
>  
> ===== usb.h 1.164 vs edited =====
> --- 1.164/include/linux/usb.h	Mon Oct  6 10:46:13 2003
> +++ edited/usb.h	Wed Dec 10 13:07:27 2003
> @@ -210,6 +210,8 @@
>  
>  	struct class_device class_dev;	/* class device for this bus */
>  	void (*release)(struct usb_bus *bus);	/* function to destroy this bus's memory */
> +	/* FIXME, remove this when the driver core gets class_device_unregister_wait */
> +	struct completion released;
>  };
>  #define	to_usb_bus(d) container_of(d, struct usb_bus, class_dev)
>  


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11  2:10                                                                                           ` Vince
@ 2003-12-11  6:46                                                                                             ` Greg KH
  0 siblings, 0 replies; 113+ messages in thread
From: Greg KH @ 2003-12-11  6:46 UTC (permalink / raw)
  To: Vince
  Cc: Alan Stern, David Brownell, Duncan Sands, Randy.Dunlap, mfedyk,
	zwane, linux-kernel, USB development list

On Thu, Dec 11, 2003 at 03:10:33AM +0100, Vince wrote:
> I've tried this patch together with Duncan's fix for my first problem 
> and I couldn't reproduce any oops on my system. I've only tried a few (a 
> dozen) cycles of loading/unloading for now (but earlier that was more
> than enough to trigger an oops). I'll report later if/when I get any bad 
> behaviour but so far everything looks fine.

Hm, on my box, it seems to unload just fine, but oopses when loading the
ehci driver again (ohci and uhci loaded just fine.)  If I get the chance
tomorrow I'll try to figure it out...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 22:08                                                                                         ` Alan Stern
@ 2003-12-11  6:47                                                                                           ` Greg KH
  0 siblings, 0 replies; 113+ messages in thread
From: Greg KH @ 2003-12-11  6:47 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Duncan Sands, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

On Wed, Dec 10, 2003 at 05:08:12PM -0500, Alan Stern wrote:
> > Now grasshopper, are you wise in the ways of the driver core or are you
> > wishing you never asked?  :)
> 
> Both, I think.  I still don't see where pci_unregister_driver() ends up
> waiting for the reference count to drop to 0.  In fact, I think maybe you
> agree that it _doesn't_ wait.  Which was my earlier point.

Ok, yes.  I think we are agreeing here.

Anyway, the patch seems to work for him, but kills my box (not using any
usbfs devices.)  I'll see if we have some odd reference count issues
still floating around...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 20:53                                                                                       ` Greg KH
@ 2003-12-11  8:49                                                                                         ` Duncan Sands
  2003-12-11  9:23                                                                                           ` Greg KH
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-11  8:49 UTC (permalink / raw)
  To: Greg KH
  Cc: Alan Stern, David Brownell, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

> > I don't know if you've been following this thread or not, but the oops
> > occurred when I modified usbfs to hold a reference to the usb_device
> > until no-one was using a given usbfs file.
>
> That's a good thing to do.  It should work.
>
> > I guess this means that I should change my patch so that the reference
> > to the usb_device is dropped as soon as possible, right?
>
> No, the bug should be fixed.  I've seen this bug happen if someone has a
> usb-serial device open and then unloads the host controller driver.  In
> fact, I think there's a bugzilla entry just for that...

Hi Greg, what I meant was: should I make my patch friendlier to rmmod by
trying hard to drop the reference as soon as possible, though some code paths
may have to hold on to it for a long time (cost: code complication), or is it OK
to always hang onto the usb_device as long as one of the usbfs files is open
(cost: rmmod may take a long or infinite time; advantages: simple, robust)?
This lowly one humbly awaits enlightenment... :)

Thanks,

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 19:43                                                                               ` David Brownell
@ 2003-12-11  9:21                                                                                 ` Duncan Sands
  0 siblings, 0 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-11  9:21 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-kernel, USB development list

On Wednesday 10 December 2003 20:43, David Brownell wrote:
> Duncan Sands wrote:
> > On Wednesday 10 December 2003 18:34, David Brownell wrote:
> >>>Unfortunately, usb_physical_reset_device calls usb_set_configuration
> >>>which takes dev->serialize.
> >>
> >>Not since late August it doesn't ...
> >
> > In current 2.5 bitkeeper it does.
>
> usb_physical_reset_device() does not call usb_set_configuration()
> except in the known-broken (for other reasons too!) "firmware changed"
> path.  Known-broken, but not yet removed since nobody has reported
> running into that or the other deadlock; the real fix is force
> re-enumeration of the device.

Still, it could be changed into a call to usb_physical_set_configuration
while we're waiting for a real fix, right?

Ciao,

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11  8:49                                                                                         ` Duncan Sands
@ 2003-12-11  9:23                                                                                           ` Greg KH
  2003-12-11  9:29                                                                                             ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Greg KH @ 2003-12-11  9:23 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Alan Stern, David Brownell, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

On Thu, Dec 11, 2003 at 09:49:54AM +0100, Duncan Sands wrote:
> 
> Hi Greg, what I meant was: should I make my patch friendlier to rmmod by
> trying hard to drop the reference as soon as possible, though some code paths
> may have to hold on to it for a long time (cost: code complication), or is it OK
> to always hang onto the usb_device as long as one of the usbfs files is open
> (cost: rmmod may take a long or infinite time; advantages: simple, robust)?

If the file is open, keep the reference count.  If you were to try
anything else, it would just be too complex in the end.

It's ok to wait a long time on rmmod, that's a pretty unique situation.
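
A rough userspace sketch of that rule, for illustration only: the
struct and function names below (toy_device, toy_file, toy_open(),
toy_release() and so on) are hypothetical and are not the usbfs code.
The open path takes a reference, the close path drops it, and the
device memory goes away only on the last put, even if the core dropped
its own reference earlier.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted device; not the kernel's struct usb_device. */
struct toy_device {
	int refcount;
	int *freed_flag;	/* set to 1 when the device is released */
};

struct toy_file {
	struct toy_device *dev;
};

static struct toy_device *toy_device_create(int *freed_flag)
{
	struct toy_device *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;	/* reference owned by the "core" */
	dev->freed_flag = freed_flag;
	return dev;
}

static struct toy_device *toy_get(struct toy_device *dev)
{
	dev->refcount++;
	return dev;
}

static void toy_put(struct toy_device *dev)
{
	if (--dev->refcount == 0) {
		*dev->freed_flag = 1;
		free(dev);
	}
}

/* open(): hold a reference for as long as the file stays open. */
static void toy_open(struct toy_file *file, struct toy_device *dev)
{
	file->dev = toy_get(dev);
}

/* release(): the last close drops the file's reference. */
static void toy_release(struct toy_file *file)
{
	toy_put(file->dev);
	file->dev = NULL;
}

/* Returns 1 if the device outlived "disconnect" and died on close. */
static int toy_open_outlives_disconnect(void)
{
	int freed = 0;
	int alive_while_open;
	struct toy_file file;
	struct toy_device *dev = toy_device_create(&freed);

	if (!dev)
		return 0;
	toy_open(&file, dev);
	toy_put(dev);			/* "disconnect": core drops its ref */
	alive_while_open = (freed == 0);
	toy_release(&file);
	return alive_while_open && freed == 1;
}
```

The cost is the one already mentioned: anything that waits for the last
reference waits for as long as the file stays open.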

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11  9:23                                                                                           ` Greg KH
@ 2003-12-11  9:29                                                                                             ` Duncan Sands
  0 siblings, 0 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-11  9:29 UTC (permalink / raw)
  To: Greg KH
  Cc: Alan Stern, David Brownell, Vince, Randy.Dunlap, mfedyk, zwane,
	linux-kernel, USB development list

> If the file is open, keep the reference count.  If you were to try
> anything else, it would just be too complex in the end.
>
> It's ok to wait a long time on rmmod, that's a pretty unique situation.

Great - thanks.

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 18:19                                                                               ` Alan Stern
@ 2003-12-11  9:36                                                                                 ` Duncan Sands
  2003-12-11 15:19                                                                                   ` Alan Stern
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-11  9:36 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kernel development list, USB development list

On Wednesday 10 December 2003 19:19, Alan Stern wrote:
> On Wed, 10 Dec 2003, Duncan Sands wrote:
> > On Wednesday 10 December 2003 18:34, David Brownell wrote:
> > > > Unfortunately, usb_physical_reset_device calls usb_set_configuration
> > > > which takes dev->serialize.
> > >
> > > Not since late August it doesn't ...
> >
> > In current 2.5 bitkeeper it does.
>
> I don't understand the problem.  What's wrong with dropping dev->serialize
> before calling usb_reset_device() or usb_set_configuration() and then
> reacquiring it afterward?

The problem is that between dropping the lock and usb_set_configuration (or
whatever) picking it up again, the device may be disconnected, so usb_set_configuration
needs to handle the case of being called after disconnect (it doesn't seem to
check for that right now, but I only had a quick look).  Also, after usbfs picks up
the lock again it needs to check for disconnect.  None of this is a big deal, but
it could all be avoided by a simpler change: provide a usb_physical_set_configuration
(or whatever), which is usb_set_configuration without taking dev->serialize.
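
To make the suggestion concrete, here is a minimal userspace sketch of
the locked/lockless split.  It assumes a pthread mutex in place of
dev->serialize, and every name in it (toy_dev,
toy_physical_set_configuration(), toy_set_configuration(),
toy_ioctl_setconfig()) is made up for the example; it is not the
usbcore code.

```c
#include <assert.h>
#include <pthread.h>

struct toy_dev {
	pthread_mutex_t serialize;	/* plays the role of dev->serialize */
	int config;
};

/* Lockless variant: the caller must already hold dev->serialize. */
static int toy_physical_set_configuration(struct toy_dev *dev, int config)
{
	if (config < 0)
		return -1;
	dev->config = config;
	return 0;
}

/* Locked variant: a thin wrapper for callers that hold nothing. */
static int toy_set_configuration(struct toy_dev *dev, int config)
{
	int r;

	pthread_mutex_lock(&dev->serialize);
	r = toy_physical_set_configuration(dev, config);
	pthread_mutex_unlock(&dev->serialize);
	return r;
}

/*
 * A usbfs-like caller that takes the lock for a larger operation and
 * therefore must use the lockless variant to avoid self-deadlock.
 */
static int toy_ioctl_setconfig(struct toy_dev *dev, int config)
{
	int r;

	pthread_mutex_lock(&dev->serialize);
	/* ... other checks under the lock, e.g. for disconnect ... */
	r = toy_physical_set_configuration(dev, config);
	pthread_mutex_unlock(&dev->serialize);
	return r;
}
```

Since the lock is never dropped in toy_ioctl_setconfig(), there is no
window in which a disconnect could slip in between the checks and the
configuration change.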

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 17:21                                                                       ` David Brownell
@ 2003-12-11  9:42                                                                         ` Duncan Sands
  0 siblings, 0 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-11  9:42 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-kernel, USB development list

On Wednesday 10 December 2003 18:21, David Brownell wrote:
> [ CC list trimmed, most folk are on one of the cc'd lists ]
>
> > By the way, here is the list of routines that cause trouble for usbfs:

Hi Dave, by "they cause trouble" I meant: they may take, or lead to
taking, dev->serialize.  This means that before my patch they could
lead to occasional deadlocks, and with my patch will always cause
deadlocks.

> > usb_probe_interface
>
> In proc_resetdevice() ... after usb_reset_device().
> If usb_reset_device() worked sanely, it wouldn't be
> necessary to try fixing up its result.  Plus, last I
> looked, I don't think usbfs fixed it up correctly.
>
> Actually that call is dangerous and probably should
> fail if usbfs isn't controlling all the interfaces
> on the device ... checking before it tries.
>
> > usb_reset_device
>
> We've known for some time this routine needs a rewrite.
> It's never quite worked right, and it doesn't handle
> DFU style devices (like the most common USB 802.11b
> adapters) well at all.
>
> > usb_set_configuration
>
> That is, you're saying that _if_ usbfs is modified to
> get rid of ps->devsem and use dev->serialize instead,
> then you'd need some other way to guard proc_setconfig()
> against disconnect?  That still seems like a chicken/egg
> issue to me.

No.  usb_set_configuration takes dev->serialize, which is
already taken.  There is no other problem.

> > usb_unbind_interface
>
> See the patch I posted yesterday evening, with usbfs parts
> of the updates to driver binding.  It's incorrect for usbfs
> ever to be calling that ... device_release_driver() is the
> thing to call, for drivers that weren't bound using the
> usb_driver_claim_interface() call.  That way the sysfs
> state also gets cleaned up ...

Yeah, these things are a mess.  My patch only fixes the
locking problems, not the fact that they never worked
anyway.  Even so, it may be hard to get it into 2.6.0.

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-10 16:58                                                                           ` Oliver Neukum
@ 2003-12-11  9:45                                                                             ` Duncan Sands
  2003-12-11 10:19                                                                               ` Oliver Neukum
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-11  9:45 UTC (permalink / raw)
  To: Oliver Neukum, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> > > > __usb_set_configuration - lockless version
> > > > usb_set_configuration - locked version
> > >
> > > Partially done.
> > > That's what the _physical version of usb_reset_device() is about.
> >
> > Unfortunately, usb_physical_reset_device calls usb_set_configuration
> > which takes dev->serialize.
>
> That is bad, but the solution is obvious.
> All such operations need a _physical version.
> At first sight this may look less elegant than some lock dropping schemes,
> but it is a solution that produces obviously correct code paths with
> respect to locking.

Hi Oliver, I agree, except that there are several layers of locking: dev->serialize
but also the bus rwsem.  So does "physical" mean no subsys.rwsem or no
dev->serialize or both?

Ciao, Duncan.

int usb_reset_device(struct usb_device *udev)
{
        struct device *gdev = &udev->dev;
        int r;

        down_read(&gdev->bus->subsys.rwsem);
        r = usb_physical_reset_device(udev);
        up_read(&gdev->bus->subsys.rwsem);

        return r;
}

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11  9:45                                                                             ` Duncan Sands
@ 2003-12-11 10:19                                                                               ` Oliver Neukum
  2003-12-11 21:43                                                                                 ` Duncan Sands
  0 siblings, 1 reply; 113+ messages in thread
From: Oliver Neukum @ 2003-12-11 10:19 UTC (permalink / raw)
  To: Duncan Sands, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

Am Donnerstag, 11. Dezember 2003 10:45 schrieb Duncan Sands:
> > > > > __usb_set_configuration - lockless version
> > > > > usb_set_configuration - locked version
> > > >
> > > > Partially done.
> > > > That's what the _physical version of usb_reset_device() is about.
> > >
> > > Unfortunately, usb_physical_reset_device calls usb_set_configuration
> > > which takes dev->serialize.
> >
> > That is bad, but the solution is obvious.
> > All such operations need a _physical version.
> > At first sight this may look less elegant than some lock dropping schemes,
> > but it is a solution that produces obviously correct code paths with
> > respect to locking.
> 
> Hi Oliver, I agree, except that there are several layers of locking: dev->serialize
> but also the bus rwsem.  So does "physical" mean no subsys.rwsem or no
> dev->serialize or both?

"physical" means no locking at all. It's the caller's responsibility.

> int usb_reset_device(struct usb_device *udev)
> {
>         struct device *gdev = &udev->dev;
>         int r;
> 
>         down_read(&gdev->bus->subsys.rwsem);
>         r = usb_physical_reset_device(udev);
>         up_read(&gdev->bus->subsys.rwsem);
> 
>         return r;
> }

That's what the core cares about. No probe() while a reset is in
progress. Taking the semaphore ensures that.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11  9:36                                                                                 ` Duncan Sands
@ 2003-12-11 15:19                                                                                   ` Alan Stern
  2003-12-11 21:23                                                                                     ` Duncan Sands
  2003-12-11 21:29                                                                                     ` Duncan Sands
  0 siblings, 2 replies; 113+ messages in thread
From: Alan Stern @ 2003-12-11 15:19 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Kernel development list, USB development list

On Thu, 11 Dec 2003, Duncan Sands wrote:

> On Wednesday 10 December 2003 19:19, Alan Stern wrote:
> >
> > I don't understand the problem.  What's wrong with dropping dev->serialize
> > before calling usb_reset_device() or usb_set_configuration() and then
> > reacquiring it afterward?
> 
> The problem is that between dropping the lock and usb_set_configuration (or
> whatever) picking it up again, the device may be disconnected, so usb_set_configuration
> needs to handle the case of being called after disconnect (it doesn't seem to
> check for that right now, but I only had a quick look).

It should handle that okay (provided you retain a reference to the 
usb_device so that it doesn't get deallocated).  Although it wouldn't hurt 
to change one of the tests from

	if (dev->state != USB_STATE_ADDRESS)

to

	if (dev->state > USB_STATE_ADDRESS)
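
A small sketch of how the two tests differ, assuming the device states
form an ordered enum with "not attached" as the lowest value.  The enum
below is a toy copy written for this example; check the kernel's own
definition before relying on the ordering.

```c
#include <assert.h>

/* Assumed ordering, lowest first; a toy copy, not the kernel enum. */
enum toy_state {
	TOY_STATE_NOTATTACHED = 0,
	TOY_STATE_ATTACHED,
	TOY_STATE_POWERED,
	TOY_STATE_DEFAULT,
	TOY_STATE_ADDRESS,
	TOY_STATE_CONFIGURED,
};

/* Old test: true for every state except ADDRESS, including NOTATTACHED. */
static int old_test(enum toy_state s)
{
	return s != TOY_STATE_ADDRESS;
}

/* New test: true only for states beyond ADDRESS. */
static int new_test(enum toy_state s)
{
	return s > TOY_STATE_ADDRESS;
}
```

Under that assumption the two tests agree for ADDRESS and CONFIGURED
and disagree for a disconnected device, which is the case under
discussion.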

>  Also, after usbfs picks up
> the lock again it needs to check for disconnect.  None of this is a big deal, but
> it could all be avoided by a simpler change: provide a usb_physical_set_configuration
> (or whatever), which is usb_set_configuration without taking dev->serialize.

I agree that it would ease things to provide entry points for set_config 
and reset_device that require the caller to hold dev->serialize already.  
The issue you and Oliver noted about holding the bus semaphore will go 
away when I finally get around to rewriting usb_reset_device().

Alan Stern


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11 15:19                                                                                   ` Alan Stern
@ 2003-12-11 21:23                                                                                     ` Duncan Sands
  2003-12-12 15:46                                                                                       ` Alan Stern
  2003-12-11 21:29                                                                                     ` Duncan Sands
  1 sibling, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-11 21:23 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kernel development list, USB development list

> It should handle that okay (provided you retain a reference to the
> usb_device so that it doesn't get deallocated).  Although it wouldn't hurt
> to change one of the tests from
>
> 	if (dev->state != USB_STATE_ADDRESS)
>
> to
>
> 	if (dev->state > USB_STATE_ADDRESS)

By the way, my patch tests for disconnect in usbfs by doing:

if (dev->state == USB_STATE_NOTATTACHED)
	run_away();

Is this right?

Thanks,

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11 15:19                                                                                   ` Alan Stern
  2003-12-11 21:23                                                                                     ` Duncan Sands
@ 2003-12-11 21:29                                                                                     ` Duncan Sands
  2003-12-12 16:18                                                                                       ` Alan Stern
  1 sibling, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-11 21:29 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kernel development list, USB development list

> I agree that it would ease things to provide entry points for set_config
> and reset_device that require the caller to hold dev->serialize already.
> The issue you and Oliver noted about holding the bus semaphore will go
> away when I finally get around to rewriting usb_reset_device().

From what Dave says, usb_reset_device shouldn't take dev->serialize (but
accidentally does via usb_set_configuration).  That seems strange to me:
I thought the point of usbfs taking dev->serialize is to protect against the
device settings changing, but now we have usb_reset_device that doesn't
take dev->serialize at all - and surely it changes the device settings!

With much confusion,

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11 10:19                                                                               ` Oliver Neukum
@ 2003-12-11 21:43                                                                                 ` Duncan Sands
  2003-12-11 22:57                                                                                   ` Oliver Neukum
  0 siblings, 1 reply; 113+ messages in thread
From: Duncan Sands @ 2003-12-11 21:43 UTC (permalink / raw)
  To: Oliver Neukum, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> > Hi Oliver, I agree, except that there are several layers of locking:
> > dev->serialize but also the bus rwsem.  So does "physical" mean no
> > subsys.rwsem or no dev->serialize or both?
>
> "physical" means no locking at all. It's the caller's responsibility.
...

> That's what the core cares about. No probe() while a reset is in
> progress. Taking the semaphore ensures that.

Hi Oliver, I'm a bit confused about the locking rules.  I suppose

(1) If both subsys.rwsem and dev->serialize are taken, then
subsys.rwsem must be taken first.

(2) dev->serialize atomizes changes to the struct usb_device.

Why then is dev->serialize not taken in usb_reset_device
(except in a dud code path)?

Also, why isn't dev->serialize enough to protect against
probe() during usb_reset_device?  After all, won't
dev->serialize be held during the probe calls (I didn't
check this and I'm in need of coffee - I hope I'm on the
right planet...)

Ciao,

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11 21:43                                                                                 ` Duncan Sands
@ 2003-12-11 22:57                                                                                   ` Oliver Neukum
  2003-12-11 23:30                                                                                     ` Duncan Sands
  2003-12-12  0:02                                                                                     ` David Brownell
  0 siblings, 2 replies; 113+ messages in thread
From: Oliver Neukum @ 2003-12-11 22:57 UTC (permalink / raw)
  To: Duncan Sands, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list


> (1) If both subsys.rwsem and dev->serialize are taken, then
> subsys.rwsem must be taken first.

Yes.
 
> (2) dev->serialize atomizes changes to the struct usb_device.
> 
> Why then is dev->serialize not taken in usb_reset_device
> (except in a dud code path)?
> 
> Also, why isn't dev->serialize enough to protect against
> probe() during usb_reset_device?  After all, won't
> dev->serialize be held during the probe calls (I didn't
> check this and I'm in need of coffee - I hope I'm on the
> right planet...)

In the current code definitely not.
You must make sure that the configuration is still available.
Guarding against probe() during reset is not enough.
AFAIK David is currently rewriting this.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11 22:57                                                                                   ` Oliver Neukum
@ 2003-12-11 23:30                                                                                     ` Duncan Sands
  2003-12-12  0:02                                                                                     ` David Brownell
  1 sibling, 0 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-11 23:30 UTC (permalink / raw)
  To: Oliver Neukum, Alan Stern
  Cc: David Brownell, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> Why then is dev->serialize not taken in usb_reset_device
> (except in a dud code path)?

And this one?  Surely usb_reset_device changes configurations
etc...

Thanks,

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11 22:57                                                                                   ` Oliver Neukum
  2003-12-11 23:30                                                                                     ` Duncan Sands
@ 2003-12-12  0:02                                                                                     ` David Brownell
  1 sibling, 0 replies; 113+ messages in thread
From: David Brownell @ 2003-12-12  0:02 UTC (permalink / raw)
  To: Oliver Neukum, Duncan Sands; +Cc: linux-kernel, USB development list

Oliver Neukum wrote:
>>(1) If both subsys.rwsem and dev->serialize are taken, then
>>subsys.rwsem must be taken first.
> 
> 
> Yes.

Erm, no.  As I pointed out the other day, it's a locking
hierarchy.  It must always go the other way:

  - dev->serialize first.  That controls the existence
    of the devices representing each interface ... which
    are different depending on device configuration.
    (Example, one config might have three interfaces,
    another one might just have one.)

    Needs to be grabbed during normal enumeration paths
    and by paths that can re-enumerate -- reset_device().
    The critical step is setting the configuration.

  - subsys.rwsem is grabbed automatically by the driver
    model core when it probes to see which driver will
    bind to the per-interface devices, or unbinds them
    (disconnect callbacks).

    It needs to be manually grabbed during claim() and
    release() style driver binding ... usbfs was doing
    this wrong, and I recently posted a patch that should
    fix it (same thread as the other binding fixups).

If you get the lock sequence wrong you'll eventually
deadlock with something that's using the correct
lock sequence.  Like khubd -- wedging much of USB.


>>(2) dev->serialize atomizes changes to the struct usb_device.
>>
>>Why then is dev->serialize not taken in usb_reset_device
>>(except in a dud code path)?

It IS taken in usb_reset_device(), and that "dud" path
is broken because that needs to be done in some other
task context.  (Because the reset must fail, and yet
the device is still there -- it needs to re-enumerate,
which that thread can't do.)

That "physical" reset_device doesn't grab it, since
its caller is guaranteed to already have the lock.


>>Also, why isn't dev->serialize enough to protect against
>>probe() during usb_reset_device?  After all, won't
>>dev->serialize be held during the probe calls (I didn't
>>check this and I'm in need of coffee - I hope I'm on the
>>right planet...)

Why?  See above about lock hierarchy.  When devices are
configured -- during enumeration, or eventually during
the re-enumeration on that "dud" codepath -- devices
are created for each interface.


> In the current code definitely not.
> You must make sure that the configuration is still available.
> Guarding against probe() during reset is not enough.
> AFAIK David is currently rewriting this.

I have some unfinished code, but it's got to wait until
those two "fix driver binding" patches get merged:
the one for usbcore (everyone seemed OK with that one),
and the other for usbfs (no feedback yet).

There are entirely too many locks involved in all this,
and many of them will cause problems the minute any
very "interesting" things start getting done ... which
is why locking cleanups need to be done first.

- Dave



> 
> 	Regards
> 		Oliver
> 
> 



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-08 17:59                                                         ` Duncan Sands
  2003-12-08 18:35                                                           ` Alan Stern
@ 2003-12-12  2:21                                                           ` David Brownell
  2003-12-12  8:47                                                             ` Duncan Sands
  2003-12-12 15:35                                                             ` bill davidsen
  1 sibling, 2 replies; 113+ messages in thread
From: David Brownell @ 2003-12-12  2:21 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Alan Stern, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

> PS: Here is the patch that fixed the original usbfs Oops, but gained the new
> one Vince reported:

Good -- more locks vanishing from usbcore; it's about time!
This is a case where fewer locks are better.

My main patch feedback here would be that it should merge
most of the usbfs patch I sent (second URL below).  There's
a driver model locking requirement that you didn't know about,
it needs to bubble up (subsys.rwsem writelock must be held if
you're going to change driver bindings).  And there were a
few other rough spots, which I think you've mentioned (and
I don't think they were new issues).

The more I think about it, the more I like your idea of
changing device->serialize to be an rwsem.  Changing config,
or resetting the device, would get the writelock.  All other
uses should share, with readlocks -- that's the right model.
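
The sharing rule can be sketched with a toy reader/writer lock in plain
C.  This is only a single-threaded model of the semantics, not the
kernel's rw_semaphore; struct toy_rwsem and the toy_* helpers are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: many readers may share, a writer needs exclusive access. */
struct toy_rwsem {
	int readers;	/* number of active read holders */
	bool writer;	/* true while a writer holds the lock */
};

/* Ordinary users of the device would share the lock as readers. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* Changing config or resetting the device would need the write side. */
static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *sem)
{
	assert(sem->writer);
	sem->writer = false;
}
```

Two readers can hold the toy lock at once; a would-be writer is refused
until both have dropped it, and readers are refused while it writes.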

Likely not before 2.6.1 though ... ;)

- Dave

[1] http://marc.theaimsgroup.com/?l=linux-usb-devel&m=107100212612153&w=2
[2] http://marc.theaimsgroup.com/?l=linux-usb-devel&m=107102580404037&w=2



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12  2:21                                                           ` David Brownell
@ 2003-12-12  8:47                                                             ` Duncan Sands
  2003-12-12 15:35                                                             ` bill davidsen
  1 sibling, 0 replies; 113+ messages in thread
From: Duncan Sands @ 2003-12-12  8:47 UTC (permalink / raw)
  To: David Brownell
  Cc: Alan Stern, Vince, Randy.Dunlap, mfedyk, zwane, linux-kernel,
	USB development list

On Friday 12 December 2003 03:21, David Brownell wrote:
> > PS: Here is the patch that fixed the original usbfs Oops, but gained the
> > new one Vince reported:
>
> Good -- more locks vanishing from usbcore; it's about time!
> This is a case where fewer locks are better.
>
> My main patch feedback here would be that it should merge
> most of the usbfs patch I sent (second URL below).  There's
> a driver model locking requirement that you didn't know about,
> it needs to bubble up (subsys.rwsem writelock must be held if
> you're going to change driver bindings).  And there were a
> few other rough spots, which I think you've mentioned (and
> I don't think they were new issues).

Hi Dave, indeed your patch [2] and mine should go together.
I will clean mine, amalgamate with yours, and send to Greg
in logical pieces.

Duncan.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12  2:21                                                           ` David Brownell
  2003-12-12  8:47                                                             ` Duncan Sands
@ 2003-12-12 15:35                                                             ` bill davidsen
  1 sibling, 0 replies; 113+ messages in thread
From: bill davidsen @ 2003-12-12 15:35 UTC (permalink / raw)
  To: linux-kernel

In article <3FD92632.50200@pacbell.net>,
David Brownell  <david-b@pacbell.net> wrote:

| The more I think about it, the more I like your idea of
| changing device->serialize to be an rwsem.  Changing config,
| or resetting the device, would get the writelock.  All other
| uses should share, with readlocks -- that's the right model.
| 
| Likely not before 2.6.1 though ... ;)

It's not clear one way or the other.  Since there is an oops involved, it
seems a bugfix is in order.  It should be fixed; perhaps the change
described above could be accepted, since it's clearly a bugfix for a
serious problem.

Not our choice, but arguable...
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11 21:23                                                                                     ` Duncan Sands
@ 2003-12-12 15:46                                                                                       ` Alan Stern
  0 siblings, 0 replies; 113+ messages in thread
From: Alan Stern @ 2003-12-12 15:46 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Kernel development list, USB development list

On Thu, 11 Dec 2003, Duncan Sands wrote:

> By the way, my patch tests for disconnect in usbfs by doing:
> 
> if (dev->state == USB_STATE_NOTATTACHED)
> 	run_away();
> 
> Is this right?

Yes it is.
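
For illustration only, a userspace sketch of that check.  The toy_*
types stand in for the real struct usb_device and its state enum, and
returning -ENODEV is an assumption about what run_away() might amount
to, not something taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the real device states; only NOTATTACHED matters here. */
enum toy_usb_state {
	TOY_STATE_NOTATTACHED = 0,
	TOY_STATE_ADDRESS,
	TOY_STATE_CONFIGURED
};

struct toy_usb_device {
	enum toy_usb_state state;
};

/* Returns 0 if the device can still be used, -ENODEV once it is gone. */
static int toy_check_connected(const struct toy_usb_device *dev)
{
	assert(dev != NULL);
	if (dev->state == TOY_STATE_NOTATTACHED)
		return -ENODEV;	/* assumed error code for "disconnected" */
	return 0;
}
```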

Alan Stern


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-11 21:29                                                                                     ` Duncan Sands
@ 2003-12-12 16:18                                                                                       ` Alan Stern
  2003-12-12 18:37                                                                                         ` David Brownell
  2003-12-12 18:50                                                                                         ` Oliver Neukum
  0 siblings, 2 replies; 113+ messages in thread
From: Alan Stern @ 2003-12-12 16:18 UTC (permalink / raw)
  To: Duncan Sands; +Cc: Kernel development list, USB development list

On Thu, 11 Dec 2003, Duncan Sands wrote:

> From what Dave says, usb_reset_device shouldn't take dev->serialize (but
> accidentally does via usb_set_configuration).  That seems strange to me:
> I thought the point of usbfs taking dev->serialize is to protect against the
> device settings changing, but now we have usb_reset_device that doesn't
> take dev->serialize at all - and surely it changes the device settings!
> 
> With much confusion,

Maybe it will help if I explain how usb_reset_device will work in the 
future.

First of all, as David has said, it does and will grab dev->serialize.  
The alternate entry point (...physical...) will require the caller to 
hold it already.

The routine will:

	1. issue the port reset
	2. make sure the device is still attached
	3. assign it the same address as it had before
	4. read the device and configuration descriptors
	5. make sure they are equal to the old descriptor values
	6. install the old configuration (if the old state was CONFIGURED) 
	7. select the old altsettings for each interface

If anything goes wrong with step 1, the routine simply returns an error.

If something goes wrong in steps 2-6, the routine will set a flag in the
usb_device indicating STATE_CHANGE_PENDING.  Several other routines will
have to be modified to check this flag before doing anything else
(set_config, set_interface, and so on).  The khubd thread will be woken up
to handle the pending state change.

If the problem arose in steps 2-4, the device will be marked for a pending
port-disable and disconnect.  If it arose in step 5, the device will be
marked as changed -- a sort of logical disconnect followed by connect,
requiring enumeration and all the other usual stuff.  A problem in step 6 
will end up leaving the device back in the ADDRESS state.

If a problem arises in step 7, I'm not sure what to do.  The driver bound
to the malfunctioning interface should be forcibly unbound.  However that
might be the driver which called reset_device, so it can't be unbound
right away.  Regardless, the reset_device routine will return success.
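
The failure handling above can be summarized as a small table in C.  This
is a sketch of the plan as described, not existing usbcore code; the enum
names and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Steps of the planned reset sequence, numbered as in the list above. */
enum reset_step {
	STEP_PORT_RESET = 1,
	STEP_CHECK_ATTACHED,
	STEP_SET_ADDRESS,
	STEP_READ_DESCRIPTORS,
	STEP_COMPARE_DESCRIPTORS,
	STEP_SET_CONFIG,
	STEP_SET_ALTSETTINGS
};

/* Hypothetical names for the outcomes described in the text. */
enum reset_outcome {
	OUTCOME_RETURN_ERROR,		/* step 1: simply return an error */
	OUTCOME_PENDING_DISCONNECT,	/* steps 2-4: port-disable, disconnect */
	OUTCOME_DEVICE_CHANGED,		/* step 5: logical disconnect + connect */
	OUTCOME_ADDRESS_STATE,		/* step 6: left in the ADDRESS state */
	OUTCOME_UNDECIDED		/* step 7: policy not settled yet */
};

/* Steps 2-6 set STATE_CHANGE_PENDING and wake khubd. */
static bool sets_state_change_pending(enum reset_step failed)
{
	return failed >= STEP_CHECK_ATTACHED && failed <= STEP_SET_CONFIG;
}

static enum reset_outcome outcome_for(enum reset_step failed)
{
	switch (failed) {
	case STEP_PORT_RESET:
		return OUTCOME_RETURN_ERROR;
	case STEP_CHECK_ATTACHED:
	case STEP_SET_ADDRESS:
	case STEP_READ_DESCRIPTORS:
		return OUTCOME_PENDING_DISCONNECT;
	case STEP_COMPARE_DESCRIPTORS:
		return OUTCOME_DEVICE_CHANGED;
	case STEP_SET_CONFIG:
		return OUTCOME_ADDRESS_STATE;
	case STEP_SET_ALTSETTINGS:
		return OUTCOME_UNDECIDED;
	}
	assert(!"unknown reset step");
	return OUTCOME_UNDECIDED;
}
```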

With this approach, a large part of the work in the "device changed"  
pathway is pushed off to khubd.  Running in a different process context,
it will be able to acquire the serialize lock, the bus rwsem, and whatever
else is needed.

In fact, I'd like to do the actual work not in khubd itself but in a
different process.  Maybe a work queue, maybe just a temporary kernel
thread spawned each time it's needed by khubd.  That's true for regular
connect and disconnect processing as well.  The advantage of this is that
khubd itself will no longer be blocked by badly behaved drivers waiting
(or hung) in their probe/disconnect routines, so it will better be able to
concentrate on its main job of managing hubs.
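
A rough userspace sketch of the deferral idea: the caller only queues an
item and returns, and some other context runs it later.  This is not the
kernel workqueue API; struct toy_workqueue and the toy_* functions are
hypothetical, and there is no locking or threading in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_LEN 8

typedef void (*toy_work_fn)(void *data);

struct toy_work {
	toy_work_fn fn;
	void *data;
};

/* Fixed-size FIFO of pending work items. */
struct toy_workqueue {
	struct toy_work items[TOY_QUEUE_LEN];
	size_t head;	/* index of the oldest pending item */
	size_t count;	/* number of pending items */
};

/* Called by the "khubd" side: record the work and return immediately. */
static bool toy_queue_work(struct toy_workqueue *q, toy_work_fn fn, void *data)
{
	size_t tail;

	if (q->count == TOY_QUEUE_LEN)
		return false;	/* queue full, caller must cope */
	tail = (q->head + q->count) % TOY_QUEUE_LEN;
	q->items[tail].fn = fn;
	q->items[tail].data = data;
	q->count++;
	return true;
}

/* Called from the worker context: run everything queued so far. */
static size_t toy_run_pending(struct toy_workqueue *q)
{
	size_t ran = 0;

	while (q->count > 0) {
		struct toy_work w = q->items[q->head];

		q->head = (q->head + 1) % TOY_QUEUE_LEN;
		q->count--;
		w.fn(w.data);
		ran++;
	}
	return ran;
}

/* Example work item: count how many device events were handled. */
static void toy_mark_handled(void *data)
{
	*(int *)data += 1;
}
```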

I've been waiting for the code freeze to end before doing any of the real 
work.  Now that Greg is accepting and applying patches again, I'll get 
going on this.

Alan Stern


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 16:18                                                                                       ` Alan Stern
@ 2003-12-12 18:37                                                                                         ` David Brownell
  2003-12-12 19:17                                                                                           ` Alan Stern
  2003-12-12 18:50                                                                                         ` Oliver Neukum
  1 sibling, 1 reply; 113+ messages in thread
From: David Brownell @ 2003-12-12 18:37 UTC (permalink / raw)
  To: Alan Stern; +Cc: Duncan Sands, Kernel development list, USB development list

Alan Stern wrote:

> Maybe it will help if I explain how usb_reset_device will work in the 
> future.
> 
> First of all, as David has said, it does and will grab dev->serialize.  

Well, it "does" in my tree, but test11 doesn't (except in the
broken DFU path).  That's likely a source of some of Duncan's
confusion -- my bad, sorry.


> The alternate entry point (...physical...) will require the caller to 
> hold it already.
> 
> The routine will:
> 
> 	1. issue the port reset
> 	2. make sure the device is still attached
> 	3. assign it the same address as it had before
> 	4. read the device and configuration descriptors

I'd split step 4 into "4a" (device descriptors) and "4b"
(config descriptors) ... and then re-factor so 1..4a is
the same code as normal khubd enumeration.  That's what
I was looking at a while back.  If you like, I'll finish
that and forward.

That would also reduce the length of time the address0_sem
is held, eliminating a deadlock when a driver probe() from
khubd calls "physical" reset_device() after firmware update.

You'll notice that today's "physical reset" codepath doesn't
work the same way as the normal "just connected" reset.  Up
through step (4a) there's no point to that -- it's all just
potential bugginess, there's no good reason I can see to
have those codepaths do the same thing differently.


> 	5. make sure they are equal to the old descriptor values
> 	6. install the old configuration (if the old state was CONFIGURED) 
> 	7. select the old altsettings for each interface
> 
> ...
> 
> If a problem arises in step 7, I'm not sure what to do.  ...

I think that ALL errors in that reset path should be handled
the same way:  fail the reset, mark the device as gone, hand
the device to some task context ... and in that task context,
disconnect all the drivers, clean up sysfs, and re-enumerate
the device.  (Without dropping power to the port; we don't
want to need to re-download any firmware.)  Maybe there
should be exceptions if the old state wasn't CONFIGURED.

The notion of a device that's "partially reset" sounds like
bugs waiting to happen.
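
As a sketch of that uniform policy in C (hypothetical names throughout,
and -1 is just a placeholder rather than a chosen errno): whichever step
fails, the result is the same three actions.

```c
#include <assert.h>
#include <stdbool.h>

/* What the caller and the rest of usbcore would observe after a reset. */
struct reset_result {
	int status;		/* 0 on success, negative on failure */
	bool marked_gone;	/* device flagged as no longer usable */
	bool deferred;		/* handed to another task context for
				 * disconnect, sysfs cleanup, re-enumeration */
};

/* failed_step is 0 for success, or 1..7 for the step that failed. */
static struct reset_result finish_reset(int failed_step)
{
	struct reset_result r = { 0, false, false };

	assert(failed_step >= 0 && failed_step <= 7);
	if (failed_step != 0) {
		r.status = -1;		/* placeholder error code */
		r.marked_gone = true;
		r.deferred = true;
	}
	return r;
}
```

The possible exception for devices that were not CONFIGURED beforehand
is deliberately left out of the sketch.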

- Dave



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 16:18                                                                                       ` Alan Stern
  2003-12-12 18:37                                                                                         ` David Brownell
@ 2003-12-12 18:50                                                                                         ` Oliver Neukum
  1 sibling, 0 replies; 113+ messages in thread
From: Oliver Neukum @ 2003-12-12 18:50 UTC (permalink / raw)
  To: Alan Stern, Duncan Sands; +Cc: Kernel development list, USB development list

On Friday, 12 December 2003 17:18, Alan Stern wrote:
> On Thu, 11 Dec 2003, Duncan Sands wrote:
> 
> > From what Dave says, usb_reset_device shouldn't take dev->serialize (but
> > accidentally does via usb_set_configuration).  That seems strange to me:
> > I thought the point of usbfs taking dev->serialize is to protect against the
> > device settings changing, but now we have usb_reset_device that doesn't
> > take dev->serialize at all - and surely it changes the device settings!
> > 
> > With much confusion,
> 
> Maybe it will help if I explain how usb_reset_device will work in the 
> future.
> 
> First of all, as David has said, it does and will grab dev->serialize.  
> The alternate entry point (...physical...) will require the caller to 
> hold it already.
> 
> The routine will:
> 
> 	1. issue the port reset
> 	2. make sure the device is still attached
> 	3. assign it the same address as it had before
> 	4. read the device and configuration descriptors
> 	5. make sure they are equal to the old descriptor values
> 	6. install the old configuration (if the old state was CONFIGURED) 
> 	7. select the old altsettings for each interface
> 
> If anything goes wrong with step 1, the routine simply returns an error.
> 
> If something goes wrong in steps 2-6, the routine will set a flag in the
> usb_device indicating STATE_CHANGE_PENDING.  Several other routines will
> have to be modified to check this flag before doing anything else
> (set_config, set_interface, and so on).  The khubd thread will be woken up
> to handle the pending state change.

There is a danger. You must make sure that you can safely drop
addr0sem before you wake khubd. Up to this point you must
handle errors yourself, up to recursively disabling ports higher up in
the tree.

> If the problem arose in steps 2-4, the device will be marked for a pending
> port-disable and disconnect.  If it arose in step 5, the device will be
> marked as changed -- a sort of logical disconnect followed by connect,
> requiring enumeration and all the other usual stuff.  A problem in step 6 
> will end up leaving the device back in the ADDRESS state.
> 
> If a problem arises in step 7, I'm not sure what to do.  The driver bound
> to the malfunctioning interface should be forcibly unbound.  However that
> might be the driver which called reset_device, so it can't be unbound
> right away.  Regardless, the reset_device routine will return success.

It must be the driver bound to that interface. You cannot simply leave other
interfaces bound while somebody does a reset.
I suggest that you change the signature of usb_reset_device() to take
an interface and leave its altsettings to the calling driver.

But you must decide how to handle a reset in terms of the other interfaces.
Either you call disconnect() on them or you introduce a new mechanism.

> With this approach, a large part of the work in the "device changed"  
> pathway is pushed off to khubd.  Running in a different process context,
> it will be able to acquire the serialize lock, the bus rwsem, and whatever
> else is needed.
> 
> In fact, I'd like to do the actual work not in khubd itself but in a
> different process.  Maybe a work queue, maybe just a temporary kernel
> thread spawned each time it's needed by khubd.  That's true for regular
> connect and disconnect processing as well.  The advantage of this is that
> khubd itself will no longer be blocked by badly behaved drivers waiting
> (or hung) in their probe/disconnect routines, so it will better be able to
> concentrate on its main job of managing hubs.

That is not an advantage. Blocking khubd means that a bad USB driver
will kill the USB subsystem. If you launch threads you will do other bad
things.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 18:37                                                                                         ` David Brownell
@ 2003-12-12 19:17                                                                                           ` Alan Stern
  2003-12-12 19:45                                                                                             ` David Brownell
  0 siblings, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-12 19:17 UTC (permalink / raw)
  To: David Brownell
  Cc: Duncan Sands, Kernel development list, USB development list

On Fri, 12 Dec 2003, David Brownell wrote:

> Alan Stern wrote:
> 
> > The routine will:
> > 
> > 	1. issue the port reset
> > 	2. make sure the device is still attached
> > 	3. assign it the same address as it had before
> > 	4. read the device and configuration descriptors
> 
> I'd split step 4 into "4a" (device descriptors) and "4b"
> (config descriptors) ... and then re-factor so 1..4a is
> the same code as normal khubd enumeration.  That's what
> I was looking at a while back.  If you like, I'll finish
> that and forward.

Sure.  Although depending how you do it, step 3 might be different (reuse 
the old address vs. assign a new address).  Also the failure paths will be 
different.  But that could all be handled with proper refactoring.

I intended to share common code as much as possible.  Since you've already 
got part of that (almost) written, I'll be happy to use your work.

> That would also reduce the length of time the address0_sem
> is held,

It would?  How so?

>  eliminating a deadlock when a driver probe() from
> khubd calls "physical" reset_device() after firmware update.
> 
> You'll notice that today's "physical reset" codepath doesn't
> work the same way as the normal "just connected" reset.  Up
> through step (4a) there's no point to that -- it's all just
> potential bugginess, there's no good reason I can see to
> have those codepaths do the same thing differently.

Agreed.


> I think that ALL errors in that reset path should be handled
> the same way:  fail the reset, mark the device as gone, hand
> the device to some task context ... and in that task context,
> disconnect all the drivers, clean up sysfs, and re-enumerate
> the device.  (Without dropping power to the port; we don't
> want to need to re-download any firmware.)  Maybe there
> should be exceptions if the old state wasn't CONFIGURED.
> 
> The notion of a device that's "partially reset" sounds like
> bugs waiting to happen.

Your choice makes error handling easier.  And failure to restore an 
altsetting is a pathological case anyhow, so it's not worth worrying 
about.

Alan Stern


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 19:17                                                                                           ` Alan Stern
@ 2003-12-12 19:45                                                                                             ` David Brownell
  2003-12-12 20:48                                                                                               ` Alan Stern
  0 siblings, 1 reply; 113+ messages in thread
From: David Brownell @ 2003-12-12 19:45 UTC (permalink / raw)
  To: Alan Stern; +Cc: Duncan Sands, Kernel development list, USB development list

Alan Stern wrote:
> On Fri, 12 Dec 2003, David Brownell wrote:
>>I'd split step 4 into "4a" (device descriptors) and "4b"
>>(config descriptors) ... and then re-factor so 1..4a is
>>the same code as normal khubd enumeration.  ...
> 
> Sure.  Although depending how you do it, step 3 might be different (reuse 
> the old address vs. assign a new address).  Also the failure paths will be 
> different.  But that could all be handled with proper refactoring.

That logic has always been a bit strange -- picking out the
address _before_ it starts the reset/set_address/get_descriptor
code.  Here's where that strangeness actually helps ... :)


>>That would also reduce the length of time the address0_sem
>>is held,
> 
> 
> It would?  How so?

It would be dropped after the address is assigned (the bus
no longer has an "address zero") ... rather than waiting
until after the device was configured and all its interfaces
were probed.  I think that's the issue Oliver alluded to in
his followup to your comment.
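
To make the difference concrete, here is a small model in C of how long
address0_sem would be held under each scheme.  The phase names and both
functions are hypothetical, and the "late drop" case only reflects the
behaviour as described above, not a reading of the current code.

```c
#include <assert.h>
#include <stdbool.h>

/* Coarse phases of enumeration, in the order they happen. */
enum enum_phase {
	PHASE_PORT_RESET,
	PHASE_SET_ADDRESS,
	PHASE_READ_DESCRIPTORS,
	PHASE_CONFIGURE,
	PHASE_PROBE_INTERFACES,
	PHASE_DONE
};

/* Last phase during which address0_sem is still held. */
static enum enum_phase last_locked_phase(bool drop_early)
{
	/* Early drop: release once the device has its own address. */
	return drop_early ? PHASE_SET_ADDRESS : PHASE_PROBE_INTERFACES;
}

static bool holds_address0_sem(enum enum_phase phase, bool drop_early)
{
	assert(phase <= PHASE_DONE);
	return phase <= last_locked_phase(drop_early);
}
```

With the early drop, a probe() that resets its device no longer runs
while the semaphore is held.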

- Dave



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 19:45                                                                                             ` David Brownell
@ 2003-12-12 20:48                                                                                               ` Alan Stern
  2003-12-12 21:01                                                                                                 ` Oliver Neukum
  0 siblings, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-12 20:48 UTC (permalink / raw)
  To: David Brownell
  Cc: Duncan Sands, Kernel development list, USB development list

On Fri, 12 Dec 2003, David Brownell wrote:

> Alan Stern wrote:
> 
> >>That would also reduce the length of time the address0_sem
> >>is held,
> > 
> > 
> > It would?  How so?
> 
> It would be dropped after the address is assigned (the bus
> no longer has an "address zero") ... rather than waiting
> until after the device was configured and all its interfaces
> were probed.  I think that's the issue Oliver alluded to in
> his followup to your comment.

I thought it did that already.  Oh well...

Alan Stern


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 20:48                                                                                               ` Alan Stern
@ 2003-12-12 21:01                                                                                                 ` Oliver Neukum
  2003-12-12 21:27                                                                                                   ` Alan Stern
  0 siblings, 1 reply; 113+ messages in thread
From: Oliver Neukum @ 2003-12-12 21:01 UTC (permalink / raw)
  To: Alan Stern, David Brownell
  Cc: Duncan Sands, Kernel development list, USB development list

On Friday, 12 December 2003 21:48, Alan Stern wrote:
> On Fri, 12 Dec 2003, David Brownell wrote:
> 
> > Alan Stern wrote:
> > 
> > >>That would also reduce the length of time the address0_sem
> > >>is held,
> > > 
> > > 
> > > It would?  How so?
> > 
> > It would be dropped after the address is assigned (the bus
> > no longer has an "address zero") ... rather than waiting
> > until after the device was configured and all its interfaces
> > were probed.  I think that's the issue Oliver alluded to in
> > his followup to your comment.
> 
> I thought it did that already.  Oh well...

Not so simple. Khubd goes down a list. If the first item on its list
is not your failed reset, a deadlock will occur.

After you have submitted the URB that really does the reset, you
are committed. You must either set a valid address or disable the port.
You can rely on nobody else to do that.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 21:01                                                                                                 ` Oliver Neukum
@ 2003-12-12 21:27                                                                                                   ` Alan Stern
  2003-12-12 23:36                                                                                                     ` Oliver Neukum
  0 siblings, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-12 21:27 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: David Brownell, Duncan Sands, Kernel development list,
	USB development list

On Fri, 12 Dec 2003, Oliver Neukum wrote:

> Not so simple. Khubd goes down a list. If the first item on its list
> is not your failed reset, a deadlock will occur.
> 
> After you have submitted the URB that really does the reset, you
> are committed. You must either set a valid address or disable the port.
> You can rely on nobody else to do that.

I think we agree on that.  It was never my intention that fixing up a 
failure between the port reset and setting the device address should be 
put off for later handling by khubd.  That would be done immediately.

However the consequent changes to the device structure (i.e., everything
needed to reflect the fact that it is disconnected) could be done in
another thread.

Alan Stern


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 21:27                                                                                                   ` Alan Stern
@ 2003-12-12 23:36                                                                                                     ` Oliver Neukum
  2003-12-13  1:10                                                                                                       ` Alan Stern
  0 siblings, 1 reply; 113+ messages in thread
From: Oliver Neukum @ 2003-12-12 23:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Duncan Sands, Kernel development list,
	USB development list

On Friday, 12 December 2003 22:27, Alan Stern wrote:
> On Fri, 12 Dec 2003, Oliver Neukum wrote:
> 
> > Not so simple. Khubd goes down a list. If the first item on its list
> > is not your failed reset, a deadlock will occur.
> > 
> > After you have submitted the URB that really does the reset, you
> > are committed. You must either set a valid address or disable the port.
> > You can rely on nobody else to do that.
> 
> I think we agree on that.  It was never my intention that fixing up a 
> failure between the port reset and setting the device address should be 
> put off for later handling by khubd.  That would be done immediately.

OK.

> However the consequent changes to the device structure (i.e., everything
> needed to reflect the fact that it is disconnected) could be done in
> another thread.

Please clarify. You have to disconnect() before you do the physical reset.
IMHO you should do the code paths for late errors and the device morphed
case in another thread, but what's the benefit for success?

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-12 23:36                                                                                                     ` Oliver Neukum
@ 2003-12-13  1:10                                                                                                       ` Alan Stern
  2003-12-13 11:52                                                                                                         ` Oliver Neukum
  0 siblings, 1 reply; 113+ messages in thread
From: Alan Stern @ 2003-12-13  1:10 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: David Brownell, Duncan Sands, Kernel development list,
	USB development list

On Sat, 13 Dec 2003, Oliver Neukum wrote:

> > However the consequent changes to the device structure (i.e., everything
> > needed to reflect the fact that it is disconnected) could be done in
> > another thread.
> 
> Please clarify. You have to disconnect() before you do the physical reset.

No you don't.  In fact, that would defeat one of the purposes of
usb_reset_device, which is to re-initialize the device while leaving an 
existing driver bound to it (so far as I know that feature is only used by 
usb-storage).  It's a last-ditch form of error recovery.

The API has an admitted weak spot when more than one driver is bound to 
the device.  No one has settled on a definite policy for how to handle 
that situation.

> IMHO you should do the code paths for late errors and the device morphed
> case in another thread, but what's the benefit for success?

In the success case there are no errors, the device hasn't morphed, and 
there's no need to do anything in another thread.  The existing driver(s) 
can remain bound, usb_reset_device returns 0, and nothing more has to be 
done.

Alan Stern


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [linux-usb-devel] Re: [OOPS,  usbcore, releaseintf] 2.6.0-test10-mm1
  2003-12-13  1:10                                                                                                       ` Alan Stern
@ 2003-12-13 11:52                                                                                                         ` Oliver Neukum
  0 siblings, 0 replies; 113+ messages in thread
From: Oliver Neukum @ 2003-12-13 11:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, Duncan Sands, Kernel development list,
	USB development list

 
> The API has an admitted weak spot when more than one driver is bound to 
> the device.  No one has settled on a definite policy for how to handle 
> that situation.

Right. You have to disconnect all but the driver requesting the reset.

> > IMHO you should do the code paths for late errors and the device morphed
> > case in another thread, but what's the benefit for success?
> 
> In the success case there are no errors, the device hasn't morphed, and 
> there's no need to do anything in another thread.  The existing driver(s) 
> can remain bound, usb_reset_device returns 0, and nothing more has to be 
> done.

OK, we agree.
There is one pathological case. We could have a device with several
interfaces of the same kind. In this case we would reenter probe, if
the reset was requested from within probe().
Could we avoid any complication if we move the rebinding to another
thread always?

	Regards
		Oliver



^ permalink raw reply	[flat|nested] 113+ messages in thread

end of thread, other threads:[~2003-12-13 11:53 UTC | newest]

Thread overview: 113+ messages
2003-11-26 16:51 [kernel panic @ reboot] 2.6.0-test10-mm1 Vince
2003-11-26 17:16 ` Zwane Mwaikambo
2003-11-26 17:34   ` Vince
2003-11-26 17:35     ` Randy.Dunlap
2003-11-26 17:40     ` Zwane Mwaikambo
2003-11-26 17:54       ` Vince
2003-11-26 18:18         ` Zwane Mwaikambo
2003-11-26 23:37           ` Mike Fedyk
2003-11-26 23:41             ` Vince
2003-12-03  0:03               ` Randy.Dunlap
2003-12-03  0:31                 ` Mike Fedyk
2003-12-03  0:27                   ` Randy.Dunlap
2003-12-03 13:28                     ` Vince
2003-12-03 19:12                       ` Zwane Mwaikambo
2003-12-04  1:01                         ` Vince
2003-12-04  1:34                           ` Mike Fedyk
2003-12-04  4:11                             ` Randy.Dunlap
2003-12-04 10:59                               ` [OOPS, usbcore, releaseintf] 2.6.0-test10-mm1 Vince
2003-12-04 11:14                                 ` Duncan Sands
2003-12-04 16:57                                   ` Randy.Dunlap
2003-12-05  7:38                                     ` Duncan Sands
2003-12-05 10:11                                       ` Vince
2003-12-05 10:18                                         ` Duncan Sands
2003-12-05 10:34                                           ` Vince
2003-12-07  0:25                                         ` Duncan Sands
2003-12-07 21:09                                           ` Vince
2003-12-07 21:24                                             ` Duncan Sands
2003-12-07 22:24                                               ` Vince
2003-12-07 22:54                                               ` Vince
2003-12-08 10:10                                                 ` Duncan Sands
2003-12-08 16:03                                                   ` [linux-usb-devel] " David Brownell
2003-12-08 16:15                                                     ` Duncan Sands
2003-12-08 16:31                                                       ` Alan Stern
2003-12-08 17:20                                                         ` David Brownell
2003-12-08 17:59                                                         ` Duncan Sands
2003-12-08 18:35                                                           ` Alan Stern
2003-12-08 19:53                                                             ` Duncan Sands
2003-12-08 21:32                                                               ` Alan Stern
2003-12-08 21:55                                                                 ` Duncan Sands
2003-12-08 23:09                                                                   ` Alan Stern
2003-12-09 10:23                                                                     ` Duncan Sands
2003-12-09 15:55                                                                       ` Alan Stern
2003-12-09 20:36                                                                         ` Duncan Sands
2003-12-09 10:36                                                                     ` Duncan Sands
2003-12-09 16:08                                                                       ` Alan Stern
2003-12-09 20:24                                                                         ` Duncan Sands
2003-12-09 10:49                                                                     ` Duncan Sands
2003-12-09 15:47                                                                       ` Alan Stern
2003-12-09 21:12                                                                         ` Duncan Sands
2003-12-09 21:58                                                                           ` Alan Stern
2003-12-09 22:07                                                                             ` Duncan Sands
2003-12-09 22:25                                                                               ` David Brownell
2003-12-09 22:33                                                                                 ` Duncan Sands
2003-12-10  3:12                                                                                   ` David Brownell
2003-12-10  3:43                                                                                 ` Alan Stern
2003-12-10 13:12                                                                                   ` Duncan Sands
2003-12-10 15:13                                                                                     ` Alan Stern
2003-12-10 15:30                                                                                   ` Greg KH
2003-12-10 16:02                                                                                     ` Duncan Sands
2003-12-10 20:53                                                                                       ` Greg KH
2003-12-11  8:49                                                                                         ` Duncan Sands
2003-12-11  9:23                                                                                           ` Greg KH
2003-12-11  9:29                                                                                             ` Duncan Sands
2003-12-10 17:25                                                                                     ` Alan Stern
2003-12-10 20:46                                                                                       ` Greg KH
2003-12-10 21:08                                                                                         ` Greg KH
2003-12-11  2:10                                                                                           ` Vince
2003-12-11  6:46                                                                                             ` Greg KH
2003-12-10 22:08                                                                                         ` Alan Stern
2003-12-11  6:47                                                                                           ` Greg KH
2003-12-10  4:31                                                                               ` Vince
2003-12-10  1:49                                                                       ` Greg KH
2003-12-10 13:22                                                                     ` Duncan Sands
2003-12-10 16:20                                                                       ` Oliver Neukum
2003-12-10 16:49                                                                         ` Duncan Sands
2003-12-10 16:58                                                                           ` Oliver Neukum
2003-12-11  9:45                                                                             ` Duncan Sands
2003-12-11 10:19                                                                               ` Oliver Neukum
2003-12-11 21:43                                                                                 ` Duncan Sands
2003-12-11 22:57                                                                                   ` Oliver Neukum
2003-12-11 23:30                                                                                     ` Duncan Sands
2003-12-12  0:02                                                                                     ` David Brownell
2003-12-10 17:34                                                                           ` David Brownell
2003-12-10 17:54                                                                             ` Duncan Sands
2003-12-10 18:19                                                                               ` Alan Stern
2003-12-11  9:36                                                                                 ` Duncan Sands
2003-12-11 15:19                                                                                   ` Alan Stern
2003-12-11 21:23                                                                                     ` Duncan Sands
2003-12-12 15:46                                                                                       ` Alan Stern
2003-12-11 21:29                                                                                     ` Duncan Sands
2003-12-12 16:18                                                                                       ` Alan Stern
2003-12-12 18:37                                                                                         ` David Brownell
2003-12-12 19:17                                                                                           ` Alan Stern
2003-12-12 19:45                                                                                             ` David Brownell
2003-12-12 20:48                                                                                               ` Alan Stern
2003-12-12 21:01                                                                                                 ` Oliver Neukum
2003-12-12 21:27                                                                                                   ` Alan Stern
2003-12-12 23:36                                                                                                     ` Oliver Neukum
2003-12-13  1:10                                                                                                       ` Alan Stern
2003-12-13 11:52                                                                                                         ` Oliver Neukum
2003-12-12 18:50                                                                                         ` Oliver Neukum
2003-12-10 19:43                                                                               ` David Brownell
2003-12-11  9:21                                                                                 ` Duncan Sands
2003-12-10 17:21                                                                       ` David Brownell
2003-12-11  9:42                                                                         ` Duncan Sands
2003-12-12  2:21                                                           ` David Brownell
2003-12-12  8:47                                                             ` Duncan Sands
2003-12-12 15:35                                                             ` bill davidsen
2003-12-05  0:08                               ` [kernel panic @ reboot] 2.6.0-test10-mm1 Zwane Mwaikambo
2003-11-27  0:59           ` [kernel panic @ reboot in usbcore] 2.6.0-test10-mm1 (culprit: modem_run) Vince
2003-11-27  3:13             ` Zwane Mwaikambo
2003-11-27  8:14               ` Vince
2003-11-27  8:11             ` Duncan Sands
