Dan Williams wrote:
> I can reliably reproduce a null pointer dereference on 2.6.20 and
> 2.6.21-rc2. I will keep digging to find the kernel version where this
> last worked, but wanted to see if there were any immediate experiments I
> should try.
>
> The failure is caused by running tiobench on a MD raid6 array with 6 out
> of 8 disks available. The commands I issued to reproduce this are:
>
> mdadm -A /dev/md0 /dev/sd[bcdefg]
> mount /dev/md0 /mnt/raid
> tiobench --numruns 5 --size 2048 --dir /mnt/raid
>
> The filesystem is ext3. The controller is an LSI 1068. Here are the
> two BUG messages first 2.6.21-rc2 followed by 2.6.20. I will reply to
> this message with the config.
> Kernel 2.6.20 on an i686
>
> [ 177.299787] BUG: unable to handle kernel NULL pointer dereference at virtual address 0000005c
> [ 177.308526] printing eip:
> [ 177.311287] c01de510
> [ 177.313521] *pde = 34d40001
> [ 177.316353] Oops: 0000 [#1]
> [ 177.319202] SMP
> [ 177.321107] Modules linked in: raid456 xor nfsd exportfs lockd nfs_acl sunrpc autofs4 hidp l2cap bluetooth iptable_raw xt_policy xt_multiport ipt_ULOG ipt_TTL ipt_ttl ipt_TOS ipt_tos ipt_SAME ipt_REJECT ipt_REDIRECT ipt_recent ipt_owner ipt_NETMAP ipt_MASQUERADE ipt_LOG ipt_iprange ipt_ECN ipt_ecn ipt_CLUSTERIP ipt_ah ipt_addrtype xt_tcpmss xt_pkttype xt_physdev xt_NFQUEUE xt_MARK xt_mark xt_mac xt_limit xt_length xt_helper xt_dccp xt_conntrack xt_CONNMARK xt_connmark xt_CLASSIFY xt_tcpudp xt_state iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_mangle nfnetlink iptable_filter ip_tables x_tables video sbs i2c_ec dock button battery asus_acpi ac radeon drm ipv6 lp parport_pc parport e1000 uhci_hcd floppy mptsas mptscsih mptbase sg ehci_hcd scsi_transport_sas i2c_i801 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_mod ata_piix ata_generic libata sd_mod scsi_mod ext3 jbd
> [ 177.402252] CPU: 2
> [ 177.402253] EIP: 0060:[] Not tainted VLI
> [ 177.402253] EFLAGS: 00210016 (2.6.20 #5)
> [ 177.414194] EIP is at cfq_dispatch_insert+0xb/0x53
> [ 177.419056] eax: f7773ec0 ebx: 00000000 ecx: f7773cc0 edx: 00000000
> [ 177.425982] esi: f70abae0 edi: f7773cc0 ebp: 00000000 esp: f34dbcbc
> [ 177.432953] ds: 007b es: 007b ss: 0068
> [ 177.437127] Process tiotest (pid: 5405, ti=f34db000 task=f7efc030 task.ti=f34db000)
> [ 177.444763] Stack: 00000049 f77d3b9c f7773cc0 00000000 c01de6ce c014041e f8a26806 00000082
> [ 177.453456] f7efc030 fffe22d6 00000000 00000000 00000000 00000004 f7efc030 f7773cc0
> [ 177.462121] 00000000 00000000 00000000 f70abae0 f7cd5800 f70abae0 c01d4fcc 00000001
> [ 177.470798] Call Trace:
> [ 177.473503] [] cfq_dispatch_requests+0x12d/0x466
> [ 177.479223] [] __lock_acquire+0x9e9/0xa72
> [ 177.484285] [] scsi_request_fn+0x286/0x336 [scsi_mod]
> [ 177.490485] [] elv_next_request+0x1a2/0x1b2
> [ 177.495766] [] scsi_request_fn+0x286/0x336 [scsi_mod]
> [ 177.501912] [] _spin_lock_irq+0x38/0x43
> [ 177.506840] [] scsi_request_fn+0x59/0x336 [scsi_mod]
> [ 177.512981] [] blk_remove_plug+0x5a/0x66
> [ 177.517983] [] __generic_unplug_device+0x1d/0x1f
> [ 177.523705] [] generic_unplug_device+0x15/0x21
> [ 177.529272] [] unplug_slaves+0x54/0x88 [raid456]
> [ 177.535013] [] blk_backing_dev_unplug+0x73/0x7b
> [ 177.540657] [] _spin_unlock_irqrestore+0x3e/0x4d
> [ 177.546382] [] sync_page+0x0/0x3b
> [ 177.550774] [] trace_hardirqs_on+0x12e/0x158
> [ 177.556108] [] sync_page+0x0/0x3b
> [ 177.560471] [] block_sync_page+0x31/0x32
> [ 177.565449] [] sync_page+0x33/0x3b
> [ 177.569916] [] __wait_on_bit_lock+0x2a/0x52
> [ 177.575201] [] __lock_page+0x58/0x5e
> [ 177.579810] [] wake_bit_function+0x0/0x3c
> [ 177.584905] [] do_generic_mapping_read+0x1db/0x44f
> [ 177.590911] [] generic_file_aio_read+0x173/0x1a4
> [ 177.596617] [] file_read_actor+0x0/0xdb
> [ 177.601525] [] do_sync_read+0xc7/0x10a
> [ 177.606365] [] autoremove_wake_function+0x0/0x35
> [ 177.612130] [] do_sync_read+0x0/0x10a
> [ 177.616867] [] vfs_read+0xa6/0x152
> [ 177.621362] [] sys_read+0x41/0x67
> [ 177.625794] [] syscall_call+0x7/0xb
> [ 177.630403] =======================
> [ 177.634031] Code: da 11 3b c0 c7 04 24 51 9d 39 c0 e8 c9 a1 f4 ff e8 ca 6e f2 ff ff 4f 34 83 c4 18 5b 5e 5f 5d c3 55 57 56 89 c6 53 8b 40 0c 89 d3 <8b> 7a 5c 8b 68 04 89 d0 e8 b5 fe ff ff 8b 43 14 89 da 25 01 80
> [ 177.654378] EIP: [] cfq_dispatch_insert+0xb/0x53 SS:ESP 0068:f34dbcbc

cfq_dispatch_requests() has called cfq_dispatch_insert() with a NULL
second argument (struct request *rq).

There are two patches for raid5/6 out there that might fix this. I'll
attach them (the second just fixes a minor bug in the first one.)