LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* aic79xx trouble
@ 2004-05-13 19:25 Bernd Schubert
2004-05-13 19:36 ` Bernd Schubert
0 siblings, 1 reply; 11+ messages in thread
From: Bernd Schubert @ 2004-05-13 19:25 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: signed data --]
[-- Type: text/plain, Size: 11842 bytes --]
Hello,
we are just in the process of setting up a new server, which will serve the
data of an IDE/SCSI raid system (transtec 5008). Some partions of this raid
device are also mirrored via drbd to a failover system. During a full resync
of all (3) failover partitions *from* the failover server, the main-server
first logs many scsi errors and later the access to the raid-partitions
completely locks up.
Below is some relevant dmesg output, I already enabled the verbose option for
the aic79xx driver. Should I also enable debugging, if so, which mode?
Any help is highly appreciated.
Thanks in advance,
Bernd
SCSI subsystem driver Revision: 1.00
ahd_pci:2:6:1: Reading VPD from SEEPROM...ahd_pci:2:6:1: VPD parsing
successful
ahd_pci:2:6:1: Reading SEEPROM...done.
ahd_pci:2:6:1: STPWLEVEL is on
ahd_pci:2:6:1: Manual Primary Termination
ahd_pci:2:6:1: Manual Secondary Termination
ahd_pci:2:6:1: Primary High byte termination Enabled
ahd_pci:2:6:1: Primary Low byte termination Enabled
ahd_pci:2:6:1: Secondary High byte termination Disabled
ahd_pci:2:6:1: Secondary Low byte termination Disabled
ahd_pci:2:6:1: Downloading Sequencer Program... 656 instructions downloaded
ahd_pci:2:6:1: Features 0x1c101, Bugs 0x700002, Flags 0x43f0
ahd_pci:2:6:0: Reading VPD from SEEPROM...ahd_pci:2:6:0: VPD parsing
successful
ahd_pci:2:6:0: Reading SEEPROM...done.
ahd_pci:2:6:0: STPWLEVEL is on
ahd_pci:2:6:0: Manual Primary Termination
ahd_pci:2:6:0: Manual Secondary Termination
ahd_pci:2:6:0: Primary High byte termination Enabled
ahd_pci:2:6:0: Primary Low byte termination Enabled
ahd_pci:2:6:0: Secondary High byte termination Disabled
ahd_pci:2:6:0: Secondary Low byte termination Disabled
ahd_pci:2:6:0: Downloading Sequencer Program... 656 instructions downloaded
ahd_pci:2:6:0: Features 0x1c101, Bugs 0x700002, Flags 0x43f1
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.10
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.10
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
blk: queue f7961e18, I/O limit 4095Mb (mask 0xffffffff)
scsi0:A:0:0: DV failed to configure device. Please file a bug report against
this driver.
(scsi0:A:0:0): Sending PPR bus_width 1, period 9, offset 7f, ppr_options 2
(scsi0:A:0:0): Received PPR width 1, period 9, offset 1f,options 2
Filtered to width 1, period 9, offset 1f, options 2
(scsi0:A:0): 6.600MB/s transfers (16bit)
scsi0: target 0 using 16bit transfers
(scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, 16bit)
scsi0: target 0 synchronous with period = 0x9, offset = 0x1f(DT)
Vendor: transtec Model: Rev: 0001
Type: Direct-Access ANSI SCSI revision: 03
blk: queue f7961c18, I/O limit 4095Mb (mask 0xffffffff)
(scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, 16bit)
scsi0:A:0:0: Tagged Queuing enabled. Depth 32
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 4101521408 512-byte hdwr sectors (2099979 MB)
sda: sda1 sda2 sda3 < sda5 sda6 sda7 sda8 >
drbd: initialised. Version: 0.6.12 (api:64/proto:62)
drbd0: Connection established. size=24410736 KB / blksize=4096 B
drbd1: Connection established. size=19535008 KB / blksize=4096 B
drbd1: Synchronisation started blks=15
drbd2: Connection established. size=195760026 KB / blksize=4096 B
drbd2: Synchronisation started blks=15
scsi0:0:0:0: Attempting to abort cmd f78fce00: 0x28 0x0 0x5 0xd1 0x2a 0x89 0x0
0x0 0x78 0x0
scsi0: At time of recovery, card was not paused
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi0: Dumping Card State at program address 0x25 Mode 0x11
Card was paused
HS_MAILBOX[0x0] INTCTL[0xc0]:(SWTMINTEN|SWTMINTMASK)
SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|
FIFO1FREE)
SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0]
LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0]
SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x0] SEQINTCTL[0x0]
SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIED) SEQ_FLAGS2[0x0]
SSTAT0[0x0] SSTAT1[0x8]:(BUSFREE) SSTAT2[0x0] SSTAT3[0x0]
PERRDIAG[0x0] SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO)
LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0]
LQOSTAT1[0x0] LQOSTAT2[0x0]
SCB Count = 32 CMDS_PENDING = 2 LASTSCB 0xffff CURRSCB 0x8 NEXTSCB 0x0
qinstart = 53192 qinfifonext = 53192
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
5 FIFO_USE[0x0] SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB)
SCB_SCSIID[0x7]
11 FIFO_USE[0x0] SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB)
SCB_SCSIID[0x7]
Total 2
Kernel Free SCB list: 8 25 23 24 6 13 18 26 20 30 22 9 19 29 17 12 14 28 21 3
15 27 0 10 2 31 1 16 7 4
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:
scsi0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|
ENSAVEPTRS)
SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
scsi0: FIFO1 Free, LONGJMP == 0x81d7, SCB 0x8
SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|
ENSAVEPTRS)
SEQINTSRC[0x0] DFCNTRL[0x4]:(DIRECTION) DFSTATUS[0x89]:(FIFOEMP|HDONE|
PRELOAD_AVAIL)
SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
0x0 0x0
scsi0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x52
scsi0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
SIMODE0[0xc]:(ENOVERRUN|ENIOERR)
CCSCBCTL[0x0]
scsi0: REG0 == 0x10, SINDEX = 0x1e0, DINDEX = 0xe1
scsi0: SCBPTR == 0x8, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff97
CDB 2a 0 1 80 1 7a
STACK: 0x13 0x0 0x0 0x0 0x0 0x0 0x0 0x0
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
DevQ(0:0:0): 0 waiting
(scsi0:A:0:0): Device is disconnected, re-queuing SCB
Recovery code sleeping
(scsi0:A:0:0): Abort Tag Message Sent
(scsi0:A:0:0): SCB 11 - Abort Completed.
Recovery SCB completes
found == 0x1
Recovery code awake
scsi0:0:0:0: Attempting to abort cmd f78fd000: 0x28 0x0 0x3 0x7c 0x71 0x99 0x0
0x0 0x78 0x0
scsi0:0:0:0: Command not found
scsi0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
SAVED_SCSIID == 0x7, SAVED_LUN == 0x0, REG0 == 0xff00 ACCUM = 0x0
SEQ_FLAGS == 0xc0, SCBPTR == 0xb, BTT == 0xff00, SINDEX == 0x104
SELID == 0x0, SCB_SCSIID == 0x7, SCB_LUN == 0x0, SCB_CONTROL == 0x68
SCSIBUS[0] == 0xb, SCSISIGI == 0xe6
SXFRCTL0 == 0x88
SEQCTL0 == 0x0
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi0: Dumping Card State at program address 0x14a Mode 0x33
Card was paused
HS_MAILBOX[0x0] INTCTL[0xc0]:(SWTMINTEN|SWTMINTMASK)
SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|
FIFO1FREE)
SCSISIGI[0xe6]:(P_MESGIN|REQI|BSYI) SCSIPHASE[0x8]:(MSG_IN_PHASE)
SCSIBUS[0xb] LASTPHASE[0xe0]:(P_MESGIN) SCSISEQ0[0x0]
SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x0] SEQINTCTL[0x0]
SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIED) SEQ_FLAGS2[0x0]
SSTAT0[0x2]:(SPIORDY) SSTAT1[0x9]:(REQINIT|BUSFREE)
SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0] SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|
ENSCSIRST|ENSELTIMO)
LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0]
LQOSTAT1[0x0] LQOSTAT2[0x0]
SCB Count = 32 CMDS_PENDING = 0 LASTSCB 0xffff CURRSCB 0xb NEXTSCB 0x0
qinstart = 53195 qinfifonext = 53195
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
Total 0
Kernel Free SCB list: 11 5 8 25 23 24 6 13 18 26 20 30 22 9 19 29 17 12 14 28
21 3 15 27 0 10 2 31 1 16 7 4
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:
scsi0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|
ENSAVEPTRS)
SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
scsi0: FIFO1 Free, LONGJMP == 0x81d7, SCB 0xb
SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|
ENSAVEPTRS)
SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
0x0 0x0
scsi0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x52
scsi0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
SIMODE0[0xc]:(ENOVERRUN|ENIOERR)
CCSCBCTL[0x0]
scsi0: REG0 == 0xff00, SINDEX = 0x104, DINDEX = 0xa9
scsi0: SCBPTR == 0xb, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff97
CDB 28 0 5 80 a9 53
STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
DevQ(0:0:0): 0 waiting
scsi0:A:0:0: Target did not send an IDENTIFY message. LASTPHASE = 0x60.
(scsi0:A:0): 80.000MB/s transfers (80.000MHz DT)
scsi0: target 0 using 8bit transfers
(scsi0:A:0): 3.300MB/s transfers
scsi0: target 0 using asynchronous transfers
scsi0: target 1 using 8bit transfers
scsi0: target 1 using asynchronous transfers
scsi0: target 2 using 8bit transfers
scsi0: target 2 using asynchronous transfers
scsi0: target 3 using 8bit transfers
scsi0: target 3 using asynchronous transfers
scsi0: target 4 using 8bit transfers
scsi0: target 4 using asynchronous transfers
scsi0: target 5 using 8bit transfers
scsi0: target 5 using asynchronous transfers
scsi0: target 6 using 8bit transfers
scsi0: target 6 using asynchronous transfers
scsi0: target 8 using 8bit transfers
scsi0: target 8 using asynchronous transfers
scsi0: target 9 using 8bit transfers
scsi0: target 9 using asynchronous transfers
scsi0: target 10 using 8bit transfers
scsi0: target 10 using asynchronous transfers
scsi0: target 11 using 8bit transfers
scsi0: target 11 using asynchronous transfers
scsi0: target 12 using 8bit transfers
scsi0: target 12 using asynchronous transfers
scsi0: target 13 using 8bit transfers
scsi0: target 13 using asynchronous transfers
scsi0: target 14 using 8bit transfers
scsi0: target 14 using asynchronous transfers
scsi0: target 15 using 8bit transfers
scsi0: target 15 using asynchronous transfers
scsi0: Issued Channel A Bus Reset. 0 SCBs aborted
scsi0: target 0 using 8bit transfers
scsi0: target 0 using asynchronous transfers
scsi0: SCSI bus reset delivered. 0 SCBs aborted.
scsi0:A:0:0: DV failed to configure device. Please file a bug report against
this driver.
(scsi0:A:0:0): Sending PPR bus_width 1, period 9, offset 7f, ppr_options 2
(scsi0:A:0:0): Received PPR width 1, period 9, offset 1f,options 2
Filtered to width 1, period 9, offset 1f, options 2
(scsi0:A:0): 6.600MB/s transfers (16bit)
scsi0: target 0 using 16bit transfers
(scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, 16bit)
scsi0: target 0 synchronous with period = 0x9, offset = 0x1f(DT)
--
Bernd Schubert
Physikalisch Chemisches Institut / Theoretische Chemie
Universität Heidelberg
INF 229
69120 Heidelberg
e-mail: bernd.schubert@pci.uni-heidelberg.de
[-- Attachment #2: signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-13 19:25 aic79xx trouble Bernd Schubert
@ 2004-05-13 19:36 ` Bernd Schubert
2004-05-16 17:42 ` Etienne Vogt
0 siblings, 1 reply; 11+ messages in thread
From: Bernd Schubert @ 2004-05-13 19:36 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: signed data --]
[-- Type: text/plain, Size: 978 bytes --]
Oh, I forgot the system specifications:
- dual opteron on tyan S2882 board
- vanilla linux-2.4.26
Also, is acpi relevant? It is enabled in the kernel-configuration and the the
kernel prints quite a lot of error messages when it parses the dsdt.
On Thursday 13 May 2004 21:25, Bernd Schubert wrote:
> Hello,
>
> we are just in the process of setting up a new server, which will serve the
> data of an IDE/SCSI raid system (transtec 5008). Some partions of this raid
> device are also mirrored via drbd to a failover system. During a full
> resync of all (3) failover partitions *from* the failover server, the
> main-server first logs many scsi errors and later the access to the
> raid-partitions completely locks up.
>
> Below is some relevant dmesg output, I already enabled the verbose option
> for the aic79xx driver. Should I also enable debugging, if so, which mode?
>
> Any help is highly appreciated.
>
>
> Thanks in advance,
> Bernd
>
[-- Attachment #2: signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-13 19:36 ` Bernd Schubert
@ 2004-05-16 17:42 ` Etienne Vogt
2004-05-16 18:10 ` Justin T. Gibbs
0 siblings, 1 reply; 11+ messages in thread
From: Etienne Vogt @ 2004-05-16 17:42 UTC (permalink / raw)
To: linux-kernel
On Thu, 13 May 2004, Bernd Schubert wrote:
> Oh, I forgot the system specifications:
>
> - dual opteron on tyan S2882 board
> - vanilla linux-2.4.26
>
> > we are just in the process of setting up a new server, which will serve the
> > data of an IDE/SCSI raid system (transtec 5008). Some partions of this raid
> > device are also mirrored via drbd to a failover system. During a full
> > resync of all (3) failover partitions *from* the failover server, the
> > main-server first logs many scsi errors and later the access to the
> > raid-partitions completely locks up.
> >
> > Below is some relevant dmesg output, I already enabled the verbose option
> > for the aic79xx driver. Should I also enable debugging, if so, which mode?
The Adaptec Ultra320 cards (aic79xx) do not work reliably on Tyan Thunder
motherboards. Lots of SCSI errors and eventually complete system lockup.
I guess those motherboards have a crappy PCI bus with a lot of noise
that can't cope with the high transfer speed of these SCSI cards.
I suggest you try an Ultra160 card. We have 3 Tyan Thunder based systems
here (those are dual Athlon MP2800) that work fine with Adaptec Ultra160
cards (aic7xxx) but give lots of errors with the Ultra320 cards.
--
Etienne Vogt (Etienne.Vogt@obspm.fr)
Unix System Manager
Observatoire de Paris-Meudon, France
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-16 17:42 ` Etienne Vogt
@ 2004-05-16 18:10 ` Justin T. Gibbs
2004-05-18 15:48 ` Marcelo Tosatti
2004-05-20 20:30 ` Marcelo Tosatti
0 siblings, 2 replies; 11+ messages in thread
From: Justin T. Gibbs @ 2004-05-16 18:10 UTC (permalink / raw)
To: Etienne Vogt, linux-kernel
> The Adaptec Ultra320 cards (aic79xx) do not work reliably on Tyan Thunder
> motherboards.
The U320 chips likely work a lot better now if you use driver version 2.0.12.
The AMD chipsets seem to screw up split completions, and this version of
the driver avoids the issue for the most common case of triggering the
bug (transaction completion DMAs) by never crossing an ADB boundary with
a single DMA.
--
Justin
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-16 18:10 ` Justin T. Gibbs
@ 2004-05-18 15:48 ` Marcelo Tosatti
2004-05-20 15:22 ` Justin T. Gibbs
2004-05-20 20:30 ` Marcelo Tosatti
1 sibling, 1 reply; 11+ messages in thread
From: Marcelo Tosatti @ 2004-05-18 15:48 UTC (permalink / raw)
To: Justin T. Gibbs; +Cc: Etienne Vogt, linux-kernel, James.Bottomley
On Sun, May 16, 2004 at 12:10:12PM -0600, Justin T. Gibbs wrote:
> > The Adaptec Ultra320 cards (aic79xx) do not work reliably on Tyan Thunder
> > motherboards.
>
> The U320 chips likely work a lot better now if you use driver version 2.0.12.
> The AMD chipsets seem to screw up split completions, and this version of
> the driver avoids the issue for the most common case of triggering the
> bug (transaction completion DMAs) by never crossing an ADB boundary with
> a single DMA.
Hi Justin,
I've seen several reports of what seem to be aic7xxx driver bugs. And
some of them you have stated that are fixed by your new driver.
I feel we should merge it in v2.4 mainline.
Do you have any idea of how widely use your newer driver is?
For what reason the changes you made havent been merged in the past
in mainline?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-18 15:48 ` Marcelo Tosatti
@ 2004-05-20 15:22 ` Justin T. Gibbs
2004-05-20 16:15 ` Justin T. Gibbs
0 siblings, 1 reply; 11+ messages in thread
From: Justin T. Gibbs @ 2004-05-20 15:22 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Etienne Vogt, linux-kernel, James.Bottomley, Luben Tuikov
> On Sun, May 16, 2004 at 12:10:12PM -0600, Justin T. Gibbs wrote:
>> > The Adaptec Ultra320 cards (aic79xx) do not work reliably on Tyan Thunder
>> > motherboards.
>>
>> The U320 chips likely work a lot better now if you use driver version 2.0.12.
>> The AMD chipsets seem to screw up split completions, and this version of
>> the driver avoids the issue for the most common case of triggering the
>> bug (transaction completion DMAs) by never crossing an ADB boundary with
>> a single DMA.
>
> Hi Justin,
>
> I've seen several reports of what seem to be aic7xxx driver bugs. And
> some of them you have stated that are fixed by your new driver.
>
> I feel we should merge it in v2.4 mainline.
>
> Do you have any idea of how widely use your newer driver is?
Every user that sends a problem report to me is encouraged to use
the new drivers, and Adaptec only supports the newer drivers. I
can't, however, give you a definitive number.
> For what reason the changes you made havent been merged in the past
> in mainline?
The latest drivers (6.3.X for aic7xxx and 2.0.X for aic79xx) perform
their own watchdog error recovery. I made this change in order to overcome
deficiencies that exist in the SCSI mid-layer. While there have been
discussions around fixing these problems in 2.6.X (and some have been
corrected there), I do not believe that the 2.4.X SCSI layer will ever
be fixed to allow the removal of this code. So, as far as I can tell,
the complaints that have been raised about the latest drivers performing
private error recovery do not apply to 2.4.X and the latest drivers can be
merged there without controversy.
I have merged the latest drivers against linux-2.4 as of May 13th.
You can find bksend output containing all of the revisions to these
files since the last merge into 2.4.X, here:
http://people.freebsd.org/~gibbs/linux/SRC/aic79xx-linux-2.4-20040513.bksend.gz
Should you decide to merge in the new drivers, please retain the
revision history.
Thanks,
Justin
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-20 15:22 ` Justin T. Gibbs
@ 2004-05-20 16:15 ` Justin T. Gibbs
2004-05-21 21:52 ` Erik Andersen
0 siblings, 1 reply; 11+ messages in thread
From: Justin T. Gibbs @ 2004-05-20 16:15 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Etienne Vogt, linux-kernel, James.Bottomley, Luben Tuikov
> I have merged the latest drivers against linux-2.4 as of May 13th.
> You can find bksend output containing all of the revisions to these
> files since the last merge into 2.4.X, here:
>
> http://people.freebsd.org/~gibbs/linux/SRC/aic79xx-linux-2.4-20040513.bksend.gz
I just pulled again and merged with the latest code as of today:
http://people.freebsd.org/~gibbs/linux/SRC/aic79xx-linux-2.4-20040520.bksend.gz
--
Justin
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-16 18:10 ` Justin T. Gibbs
2004-05-18 15:48 ` Marcelo Tosatti
@ 2004-05-20 20:30 ` Marcelo Tosatti
2004-05-20 20:49 ` Justin T. Gibbs
1 sibling, 1 reply; 11+ messages in thread
From: Marcelo Tosatti @ 2004-05-20 20:30 UTC (permalink / raw)
To: Justin T. Gibbs; +Cc: Etienne Vogt, linux-kernel
On Sun, May 16, 2004 at 12:10:12PM -0600, Justin T. Gibbs wrote:
> > The Adaptec Ultra320 cards (aic79xx) do not work reliably on Tyan Thunder
> > motherboards.
>
> The U320 chips likely work a lot better now if you use driver version 2.0.12.
> The AMD chipsets seem to screw up split completions, and this version of
> the driver avoids the issue for the most common case of triggering the
> bug (transaction completion DMAs) by never crossing an ADB boundary with
> a single DMA.
Justin,
Just out of curiosity, would you care to submit small fixes in separate
patches instead of a huge patch over a full -pre series
(2.4.28-pre for example) ?
Are distros using your updates ?
Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-20 20:30 ` Marcelo Tosatti
@ 2004-05-20 20:49 ` Justin T. Gibbs
2004-05-21 11:24 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Justin T. Gibbs @ 2004-05-20 20:49 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Etienne Vogt, linux-kernel
> On Sun, May 16, 2004 at 12:10:12PM -0600, Justin T. Gibbs wrote:
>> > The Adaptec Ultra320 cards (aic79xx) do not work reliably on Tyan Thunder
>> > motherboards.
>>
>> The U320 chips likely work a lot better now if you use driver version 2.0.12.
>> The AMD chipsets seem to screw up split completions, and this version of
>> the driver avoids the issue for the most common case of triggering the
>> bug (transaction completion DMAs) by never crossing an ADB boundary with
>> a single DMA.
>
> Justin,
>
> Just out of curiosity, would you care to submit small fixes in separate
> patches instead of a huge patch over a full -pre series
> (2.4.28-pre for example) ?
If you look at the bksend output from my site, it is broken up into
lots of smaller changes. These changes are not tied to a particular
2.4.X revision - they were made and released in response to driver
bug reports and coded so the driver will operate in just about any
2.4.X kernel - customers can't wait for the next kernel to be
released or the community to enter a "pre" phase of development.
As for submitting "small fixes in separate patches", the fixes are
whatever size they come out to be after they are implemented. I
always choose the fix based on correctness and maintainability, not
on whether or not I can break it up into small patches for submission.
In other-words, the size of the changes have nothing to do with their
merit.
What it seems you are asking for is more frequent submissions.
While that is possible, I don't know that it is in the best interest
of the community. I submit a new update to kernel.org after changes
have had a time to settle and have been validated by Adaptec and
several of its customers. Unless the failure is going to be seen
in wide-spread use and the fix is so obvious that it does not need
rigorous validation, I prefer to wait and submit a known quantity.
The latest drivers on my website have had a sufficient level of
testing for me to now feel comfortable with pushing the changes to
kernel.org.
> Are distros using your updates ?
I believe that SuSE is merging these drivers into their next 2.6.X
based release.
--
Justin
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-20 20:49 ` Justin T. Gibbs
@ 2004-05-21 11:24 ` Jens Axboe
0 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2004-05-21 11:24 UTC (permalink / raw)
To: Justin T. Gibbs; +Cc: Marcelo Tosatti, Etienne Vogt, linux-kernel
On Thu, May 20 2004, Justin T. Gibbs wrote:
> > Are distros using your updates ?
>
> I believe that SuSE is merging these drivers into their next 2.6.X
> based release.
I don't think so. There will be an update disk for support for newly
added hardware.
--
Jens Axboe
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: aic79xx trouble
2004-05-20 16:15 ` Justin T. Gibbs
@ 2004-05-21 21:52 ` Erik Andersen
0 siblings, 0 replies; 11+ messages in thread
From: Erik Andersen @ 2004-05-21 21:52 UTC (permalink / raw)
To: Justin T. Gibbs
Cc: Marcelo Tosatti, Etienne Vogt, linux-kernel, James.Bottomley,
Luben Tuikov
On Thu May 20, 2004 at 10:15:25AM -0600, Justin T. Gibbs wrote:
> > I have merged the latest drivers against linux-2.4 as of May 13th.
> > You can find bksend output containing all of the revisions to these
> > files since the last merge into 2.4.X, here:
> >
> > http://people.freebsd.org/~gibbs/linux/SRC/aic79xx-linux-2.4-20040513.bksend.gz
>
> I just pulled again and merged with the latest code as of today:
>
> http://people.freebsd.org/~gibbs/linux/SRC/aic79xx-linux-2.4-20040520.bksend.gz
I do not think this code should (yet) be merged into 2.4.x. I grabbed a
copy of aic79xx-linux-2.4-20040520 and patched it into 2.4.27-pre3. It
appears to work as expected for my 29160 card, and thing work as
expected for that. However...
I then inserted my Adaptec SlimSCSI 1480B Cardbus card into my laptop
and started up pcmcia. This immediately resulted in an Oops and lsmod
shows aic7xxx is stuck initializing. This does not happen with stock
2.4.27-pre3, where the Adaptec 1480B card works as expected.
May 21 14:18:57 sage kernel: Linux Kernel Card Services 3.1.22
May 21 14:18:57 sage kernel: options: [pci] [cardbus] [pm]
May 21 14:18:57 sage kernel: Intel ISA PCIC probe: not found.
May 21 14:18:57 sage kernel: PCI: Found IRQ 10 for device 02:0f.0
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 00:1f.2
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 02:06.0
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 02:06.1
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 02:0f.1
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 02:0f.2
May 21 14:18:57 sage kernel: PCI: Found IRQ 10 for device 02:0f.1
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 00:1f.2
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 02:06.0
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 02:06.1
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 02:0f.0
May 21 14:18:57 sage kernel: PCI: Sharing IRQ 10 with 02:0f.2
May 21 14:18:57 sage kernel: Yenta ISA IRQ mask 0x0298, PCI irq 10
May 21 14:18:57 sage kernel: Socket status: 30000006
May 21 14:18:57 sage kernel: Yenta ISA IRQ mask 0x0298, PCI irq 10
May 21 14:18:57 sage kernel: Socket status: 30000006
May 21 14:18:57 sage kernel: cs: IO port probe 0x0c00-0x0cff: clean.
May 21 14:18:57 sage kernel: cs: IO port probe 0x0100-0x04ff: excluding 0x378-0x37f 0x4d0-0x4d7
May 21 14:18:57 sage kernel: cs: IO port probe 0x0a00-0x0aff: clean.
May 21 14:19:05 sage kernel: cs: cb_alloc(bus 3): vendor 0x9004, device 0x6075
May 21 14:19:05 sage kernel: PCI: Enabling device 03:00.0 (0000 -> 0003)
May 21 14:19:05 sage kernel: SCSI subsystem driver Revision: 1.00
May 21 14:19:05 sage kernel: PCI: Setting latency timer of device 03:00.0 to 64
May 21 14:19:05 sage kernel: aic7xxx: PCI Device 3:0:0 failed memory mapped test. Using PIO.
May 21 14:19:05 sage kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000087
May 21 14:19:05 sage kernel: printing eip:
May 21 14:19:05 sage kernel: e08a966a
May 21 14:19:05 sage kernel: *pde = 00000000
May 21 14:19:05 sage kernel: Oops: 0000
May 21 14:19:05 sage kernel: CPU: 0
May 21 14:19:05 sage kernel: EIP: 0010:[<e08a966a>] Not tainted
May 21 14:19:05 sage kernel: EFLAGS: 00010246
May 21 14:19:05 sage kernel: eax: 00000000 ebx: df29ae00 ecx: 80030004 edx: 00000cfc
May 21 14:19:05 sage kernel: esi: 02900006 edi: 00000005 ebp: dc2c5d8c esp: dc2c5d70
May 21 14:19:05 sage kernel: ds: 0018 es: 0018 ss: 0018
May 21 14:19:05 sage kernel: Process modprobe (pid: 646, stackpage=dc2c5000)
May 21 14:19:05 sage kernel: Stack: 00000006 e08aba83 c02ca6e0 00064000 df29ae00 02900006 02900004 dc2c5db8
May 21 14:19:05 sage kernel: e08abc7b df29ae00 00000000 00004000 00000000 02900007 df290300 e08b45b0
May 21 14:19:05 sage kernel: df29ae00 df29ae00 dc2c5e08 e08a8937 df29ae00 dc848000 00000000 dc848000
May 21 14:19:05 sage kernel: Call Trace: [<e08aba83>] [<e08abc7b>] [<e08b45b0>] [<e08a8937>] [pcibios_set_master+102/108]
May 21 14:19:05 sage kernel: [<e08ab9a0>] [<e08b45b0>] [<e08b45b0>] [get_new_inode+45/230] [iget4_locked+182/190] [<e08b4b60>]
May 21 14:19:05 sage kernel: [<e08b4bc0>] [pci_announce_device+54/88] [<e08b4b60>] [<e08b4bc0>] [pci_register_driver+66/90] [<e08b4bc0>]
May 21 14:19:05 sage kernel: [<e08b2da0>] [<e08ab9ee>] [<e08b4bc0>] [<e0894ff0>] [<e087f47f>] [<e08b2da0>]
May 21 14:19:05 sage kernel: [<e08947e9>] [<e08947f1>] [<e089a47a>] [<e08b2da0>] [<e08b2da0>] [<e08b2c28>]
May 21 14:19:05 sage kernel: [sys_init_module+1334/1494] [<e08b1768>] [<e0894060>] [system_call+51/56]
May 21 14:19:05 sage kernel:
May 21 14:19:05 sage kernel: Code: 0f b6 80 87 00 00 00 eb 0f 90 0f b7 53 04 81 c2 87 00 00 00
-Erik
--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2004-05-21 21:52 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-05-13 19:25 aic79xx trouble Bernd Schubert
2004-05-13 19:36 ` Bernd Schubert
2004-05-16 17:42 ` Etienne Vogt
2004-05-16 18:10 ` Justin T. Gibbs
2004-05-18 15:48 ` Marcelo Tosatti
2004-05-20 15:22 ` Justin T. Gibbs
2004-05-20 16:15 ` Justin T. Gibbs
2004-05-21 21:52 ` Erik Andersen
2004-05-20 20:30 ` Marcelo Tosatti
2004-05-20 20:49 ` Justin T. Gibbs
2004-05-21 11:24 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).