LKML Archive on lore.kernel.org
* 2.6.6 breaks kmail (nfs related?)
From: Andreas Amann @ 2004-05-13 12:11 UTC (permalink / raw)
To: linux-kernel
Hi,
I upgraded from vanilla 2.6.4 to vanilla 2.6.6, using the same compiler
(gcc-3.3.1) and .config file (shortened version in attachment) for both. Now
I cannot send messages with kmail and I get the following error messages:
...
kmail: Error: Could not add message to folder (No space left on device?)
kmail: WARNING: KMail encountered a fatal error and will terminate now.
The error was:
KMFolderMaildir::addMsg: abnormally terminating to prevent data loss.
...
Apparently kmail thinks that my /home device is full, but it is not (still
~37GB free). Other programs have no problem writing into my home.
Maybe kmail uses some kind of lock before writing to a folder? (I have fam
enabled.) Is it possible that this broke recently?
My home directory is mounted via udp-nfs from a server running vanilla 2.4.25
with a reiserfs on a hardware raid. The mount options on the client are
hservnlds:/home /net/hservnlds/home nfs rw,nosuid,nodev,v3,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=sservnlds 0 0
Any hints? This looks like a reproducible regression between 2.6.4 and 2.6.6
to me. I can do more tests on request.
Andreas
--
Andreas Amann
Institut für Theoretische Physik, TU Berlin
[-- Attachment #2: config-2.6.6 --]
[-- Type: text/plain, Size: 5356 bytes --]
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_HOTPLUG=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_X86_PC=y
CONFIG_M586=y
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_PPRO_FENCE=y
CONFIG_X86_F00F_BUG=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_ALIGNMENT_16=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_SMP=y
CONFIG_NR_CPUS=32
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y
CONFIG_MTRR=y
CONFIG_IRQBALANCE=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_PM=y
CONFIG_ACPI_BOOT=y
CONFIG_APM=m
CONFIG_APM_RTC_IS_GMT=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_LEGACY_PROC=y
CONFIG_PCI_NAMES=y
CONFIG_ISA=y
CONFIG_PCMCIA_PROBE=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_PC_CML1=m
CONFIG_PARPORT_1284=y
CONFIG_BLK_DEV_FD=m
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_RAM=m
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=m
CONFIG_BLK_DEV_IDESCSI=m
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_BLK_DEV_ADMA=y
CONFIG_BLK_DEV_AMD74XX=y
CONFIG_BLK_DEV_PIIX=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_AUTO=y
CONFIG_SCSI=m
CONFIG_SCSI_PROC_FS=y
CONFIG_BLK_DEV_SD=m
CONFIG_BLK_DEV_SR=m
CONFIG_CHR_DEV_SG=m
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=253
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_SCSI_QLA2XXX=m
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_NET_KEY=y
CONFIG_INET=y
CONFIG_INET_AH=y
CONFIG_INET_ESP=y
CONFIG_INET_IPCOMP=y
CONFIG_IPV6=m
CONFIG_NETFILTER=y
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_FILTER=m
CONFIG_XFRM=y
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
CONFIG_NET_VENDOR_3COM=y
CONFIG_VORTEX=y
CONFIG_NET_TULIP=y
CONFIG_TULIP=y
CONFIG_NET_PCI=y
CONFIG_E100=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_PRINTER=m
CONFIG_AGP=y
CONFIG_AGP_INTEL=y
CONFIG_AGP_VIA=y
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
CONFIG_I2C_AMD756=m
CONFIG_I2C_ISA=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_SENSOR=m
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83L785TS=m
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_SOUND=m
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_MPU401_UART=m
CONFIG_SND_OPL3_LIB=m
CONFIG_SND_DUMMY=m
CONFIG_SND_MPU401=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_CMIPCI=m
CONFIG_SND_ENS1371=m
CONFIG_USB=m
CONFIG_USB_DEVICEFS=y
CONFIG_USB_EHCI_HCD=m
CONFIG_USB_OHCI_HCD=m
CONFIG_USB_UHCI_HCD=m
CONFIG_USB_AUDIO=m
CONFIG_USB_PRINTER=m
CONFIG_USB_STORAGE=m
CONFIG_USB_HID=m
CONFIG_USB_HIDINPUT=y
CONFIG_USB_HIDDEV=y
CONFIG_USB_USBNET=m
CONFIG_USB_ALI_M5632=y
CONFIG_USB_ARMLINUX=y
CONFIG_EXT2_FS=y
CONFIG_EXT3_FS=y
CONFIG_JBD=y
CONFIG_REISERFS_FS=m
CONFIG_MINIX_FS=m
CONFIG_AUTOFS4_FS=y
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=m
CONFIG_UDF_FS=m
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_NTFS_FS=m
CONFIG_NTFS_RW=y
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_CRAMFS=m
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_SUNRPC=y
CONFIG_SMB_FS=m
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_15=m
CONFIG_EARLY_PRINTK=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_DES=y
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_AES=m
CONFIG_CRYPTO_DEFLATE=y
CONFIG_CRC32=y
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_X86_STD_RESOURCES=y
CONFIG_PC=y
* Re: 2.6.6 breaks kmail (nfs related?)
From: Linus Torvalds @ 2004-05-16 4:46 UTC (permalink / raw)
To: Andreas Amann, Trond Myklebust; +Cc: Kernel Mailing List
On Thu, 13 May 2004, Andreas Amann wrote:
>
> I upgraded from vanilla 2.6.4 to vanilla 2.6.6, using the same compiler
> (gcc-3.3.1) and .config file (shortened version in attachment) for both. Now
> I cannot send messages with kmail and I get the following error messages:
>
> ...
> kmail: Error: Could not add message to folder (No space left on device?)
> kmail: WARNING: KMail encountered a fatal error and will terminate now.
> The error was:
> KMFolderMaildir::addMsg: abnormally terminating to prevent data loss.
Can you strace it to see what the failing system call was? Especially if
you can compare the traces between 2.6.4 and 2.6.6 some way..
Trond, any idea?
Linus
* Re: 2.6.6 breaks kmail (nfs related?)
From: Trond Myklebust @ 2004-05-16 17:59 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andreas Amann, Kernel Mailing List
On Sun, 16/05/2004 at 00:46, Linus Torvalds wrote:
> Can you strace it to see what the failing system call was? Especially if
> you can compare the traces between 2.6.4 and 2.6.6 some way..
>
> Trond, any idea?
Not really: there isn't anything in the NFS filesystem code that can
generate an ENOSPC. I agree that the "strace" output will help.
Andreas, are both the server and the client running 2.6.6? If so, which
do you have to downgrade to 2.6.4 in order to get rid of the error?
Cheers,
Trond
* Re: 2.6.6 breaks kmail (nfs related?)
From: Trond Myklebust @ 2004-05-16 18:10 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andreas Amann, Kernel Mailing List
On Sun, 16/05/2004 at 13:59, Trond Myklebust wrote:
> Andreas, are both the server and the client running 2.6.6? If so, which
> do you have to downgrade to 2.6.4 in order to get rid of the error?
Oh... Another thing that would be useful: mount options please...
Cheers,
Trond
* Re: 2.6.6 breaks kmail (nfs related?)
From: Linus Torvalds @ 2004-05-16 18:19 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Andreas Amann, Kernel Mailing List
On Sun, 16 May 2004, Trond Myklebust wrote:
>
> Oh... Another thing that would be useful: mount options please...
They were in the original email on the kernel mailing list:
hservnlds:/home /net/hservnlds/home nfs rw,nosuid,nodev,v3,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=sservnlds 0
The only thing there is that "intr". Maybe something has broken so that
non-lethal signals also trigger errors? That could explain it (partial
reads or writes when a timer goes off, or something).
Linus
* Re: 2.6.6 breaks kmail (nfs related?)
From: Trond Myklebust @ 2004-05-16 18:47 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andreas Amann, Kernel Mailing List
On Sun, 16/05/2004 at 14:19, Linus Torvalds wrote:
> They were in the original email on the kernel mailing list:
Sorry. I was in Malaysia last week so that email probably drowned in the
1600 other mails I found in my backlog when I returned on Friday. I've
found it now in the archives...
> hservnlds:/home /net/hservnlds/home nfs rw,nosuid,nodev,v3,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=sservnlds 0
>
> The only thing there is that "intr". Maybe something has broken so that
> non-lethal signals also trigger errors? That could explain it (partial
> reads or writes when a timer goes off, or something).
I haven't touched rpc_clnt_sigmask() in many years, so that would have
to be some change to the generic signal handling code.
If kmail really is reporting an ENOSPC, though, then it's hard to see
how a signal could produce that particular error.
Cheers,
Trond
* Re: 2.6.6 breaks kmail (nfs related?)
From: Linus Torvalds @ 2004-05-16 18:50 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Andreas Amann, Kernel Mailing List
On Sun, 16 May 2004, Trond Myklebust wrote:
>
> If kmail really is reporting an ENOSPC, though, then it's hard to see
> how a signal could produce that particular error.
Agreed. But the kmail message is apparently "(No space left on device?)",
which may be just kmail itself reacting to a truncated write rather than
any actual ENOSPC error. A "strace" would help clarify exactly what goes
wrong..
Linus
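The distinction drawn above, between a truncated write and a genuine ENOSPC, can be illustrated with a small sketch. It is not kmail's code; `write_fully` and `explain` are hypothetical helper names, and the point is only that an application can retry after a short write and report the errno it actually received instead of guessing at a full disk.

```python
import errno
import os


def write_fully(fd: int, data: bytes) -> None:
    """Write all of data to fd, retrying after short writes.

    A short write is not an error by itself: os.write() may simply
    accept fewer bytes than requested. Real failures raise OSError.
    """
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # may be less than len(view)
        view = view[written:]


def explain(exc: OSError) -> str:
    """Return the symbolic name of the errno carried by exc."""
    if exc.errno is None:
        return "unknown error"
    return errno.errorcode.get(exc.errno, str(exc.errno))
```

A caller would wrap `write_fully` in `try/except OSError as exc` and log `explain(exc)`, so that a message such as "No space left on device?" is only printed when the errno really is ENOSPC.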
* Re: 2.6.6 breaks kmail (nfs related?)
From: Trond Myklebust @ 2004-05-16 19:10 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andreas Amann, Kernel Mailing List
On Sun, 16/05/2004 at 14:50, Linus Torvalds wrote:
> Agreed. But the kmail message is apparently "(No space left on device?)",
> which may be just kmail itself reacting to a truncated write rather than
> any actual ENOSPC error. A "strace" would help clarify exactly what goes
> wrong..
Right...
One possible suspect might be open(O_EXCL) since, AFAICS, Andreas is
using maildir-style mailboxes. Perhaps that SETATTR call in
nfs3_proc_create() is failing? We recently fixed it so that it always sets
MTIME/ATIME...
Andreas: when you do the "strace" could you first run
echo "16" >/proc/sys/sunrpc/nfs_debug
and then record the output from "dmesg" immediately after the kmail
crash?
Cheers,
Trond
* Re: 2.6.6 breaks kmail (nfs related?)
From: Norberto Bensa @ 2004-05-17 6:35 UTC (permalink / raw)
To: linux-kernel
Andreas Amann wrote:
> kmail: Error: Could not add message to folder (No space left on device?)
> kmail: WARNING: KMail encountered a fatal error and will terminate now.
> The error was:
> KMFolderMaildir::addMsg: abnormally terminating to prevent data loss.
> ...
Well, I'm getting this with kcalc after upgrading to 2.6.6-mm3:
$ kcalc
KCrash: Application 'kcalc' crashing...
strace shows lots of
...
close(1002) = -1 EBADF (Bad file descriptor)
close(1003) = -1 EBADF (Bad file descriptor)
close(1004) = -1 EBADF (Bad file descriptor)
close(1005) = -1 EBADF (Bad file descriptor)
...
Now it's late. More tests and info tomorrow (unless there's a new -mm kernel
which fixes this :-) )
Regards,
Norberto
* Re: 2.6.6 breaks kmail (nfs related?)
From: Andrew Morton @ 2004-05-17 7:14 UTC (permalink / raw)
To: Norberto Bensa; +Cc: linux-kernel
Norberto Bensa <norberto+linux-kernel@bensa.ath.cx> wrote:
>
> Well, I'm getting this with kcalc after upgrading to 2.6.6-mm3:
>
> $ kcalc
> KCrash: Application 'kcalc' crashing...
>
> strace shows lots of
> ...
> close(1002) = -1 EBADF (Bad file descriptor)
> close(1003) = -1 EBADF (Bad file descriptor)
> close(1004) = -1 EBADF (Bad file descriptor)
> close(1005) = -1 EBADF (Bad file descriptor)
> ...
Send the whole thing, please: `strace -f -o log kcalc', and send `log'. If
it's too big to post please mail it to me direct and I'll stick it on a
public server.
Thanks.
* Re: 2.6.6 breaks kmail (nfs related?)
From: Andreas Amann @ 2004-05-17 11:31 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Linus Torvalds, Kernel Mailing List
On Sunday 16 May 2004 21:10, Trond Myklebust wrote:
> On Sun, 16/05/2004 at 14:50, Linus Torvalds wrote:
> > Agreed. But the kmail message is apparently "(No space left on device?)",
> > which may be just kmail itself reacting to a truncated write rather than
> > any actual ENOSPC error. A "strace" would help clarify exactly what goes
> > wrong..
>
> Right...
>
> One possible suspect might be open(O_EXCL) since, AFAICS, Andreas is
> using maildir-style mailboxes. Perhaps that SETATTR call in
> nfs3_proc_create() is failing? We recently fixed it so that it always sets
> MTIME/ATIME...
>
> Andreas: when you do the "strace" could you first run
>
> echo "16" >/proc/sys/sunrpc/nfs_debug
>
> and then record the output from "dmesg" immediately after the kmail
> crash?
Ok, I produced the "strace"s and "dmesg"s for the kernels 2.6.4, 2.6.5 and
2.6.6 and made them available at
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/dmesg_kmail_2.6.4
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/dmesg_kmail_2.6.5
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/dmesg_kmail_2.6.6
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/kmail_trace_2.6.4
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/kmail_trace_2.6.5
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/kmail_trace_2.6.6
Some further information:
My problem occurs already with kernel 2.6.5, and it is indeed NFS related (It
does not appear on a local home partition).
I reproduced the crash with a server exporting an ext2 partition and one
which exports a reiserfs partition. So far I only tested servers running on
vanilla 2.4.25. Should I check others?
The mount options according to /proc/mounts are
viola:/tmp /net/viola/tmp nfs rw,nosuid,nodev,v3,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=viola 0 0
The traces were produced by the command lines
strace 2>/tmp/kmail_trace_2.6.x /usr/linux-local/kde/bin/kmail --nofork -s test --msg test_mail amann@physik.tu-berlin.de
dmesg > /tmp/dmesg_kmail_2.6.x
(I also tried "strace -f", but apparently exim does not like to be traced?)
From my (limited) point of view, the problem is the ESTALE returned by an
fstat64 call in the 2.6.5 trace:
>
access("/net/viola/tmp/amann/home_tmp/Mail/outbox/cur/1084784925.736.utEix:2,S", F_OK) = -1 ENOENT (No such file or directory)
rename("/net/viola/tmp/amann/home_tmp/Mail/outbox/tmp/1084784925.736.utEix:2,S", "/net/viola/tmp/amann/home_tmp/Mail/outbox/cur/1084784925.736.utEix:2,S") = 0
fstat64(8, 0xbfffe650) = -1 ESTALE (Stale NFS file handle)
_llseek(8, 0, [373], SEEK_END) = 0
write(8, "X\1\0\0", 4) = -1 ESTALE (Stale NFS file handle)
write(8, "\5\0\0\0,\0\0a\0z\0/\0w\0t\0z\0t\0+\0N\0S\0+\0001\0s"..., 344) = -1 ESTALE (Stale NFS file handle)
<
This succeeds in the 2.6.4 trace:
>
access("/net/viola/tmp/amann/home_tmp/Mail/outbox/cur/1084785768.460.mTWwO:2,S", F_OK) = -1 ENOENT (No such file or directory)
rename("/net/viola/tmp/amann/home_tmp/Mail/outbox/tmp/1084785768.460.mTWwO:2,S", "/net/viola/tmp/amann/home_tmp/Mail/outbox/cur/1084785768.460.mTWwO:2,S") = 0
fstat64(8, {st_mode=S_IFREG|0600, st_size=373, ...}) = 0
_llseek(8, 0, [0], SEEK_SET) = 0
read(8, "# KMail-Index V1506\n\0\10\0\0\0xV4\22\4\0\0"..., 373) = 373
write(8, "X\1\0\0", 4) = 4
write(8, "\5\0\0\0,\0\0j\0w\0N\0O\0S\0g\0p\0x\0j\0003\0o\0l\0S"..., 344) = 344
<
In both cases file descriptor 8 had been opened earlier by
>
open("/net/viola/tmp/amann/home_tmp/Mail/.outbox.index", O_RDWR|O_LARGEFILE) = 8
<
No idea what causes the difference.
I hope this is the information you expected. Please let me know what further
checks I can do.
Andreas
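Comparing traces between kernels, as requested earlier in the thread, mostly comes down to listing the system calls that fail in one trace and not in the other. The following is a minimal sketch of that step; `failed_calls` is a hypothetical helper, and it assumes the default strace line layout with one call per line (not the mail-wrapped form).

```python
import re

# Matches lines such as:
#   fstat64(8, 0xbfffe650) = -1 ESTALE (Stale NFS file handle)
FAILED_CALL = re.compile(r"^(?P<call>\w+)\(.*\)\s*=\s*-1\s+(?P<err>E[A-Z0-9]+)")


def failed_calls(trace_lines):
    """Return (syscall, errno_name) pairs for every failing call in a trace."""
    result = []
    for line in trace_lines:
        match = FAILED_CALL.match(line.strip())
        if match:
            result.append((match.group("call"), match.group("err")))
    return result


# Three lines taken from the 2.6.5 excerpt above.
sample = [
    r'fstat64(8, 0xbfffe650) = -1 ESTALE (Stale NFS file handle)',
    r'_llseek(8, 0, [373], SEEK_END) = 0',
    r'write(8, "X\1\0\0", 4) = -1 ESTALE (Stale NFS file handle)',
]
print(failed_calls(sample))
```

On the sample this reports the fstat64 and write calls with ESTALE and skips the successful _llseek. Running it over both trace files and taking the set difference would show which failures are new.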
* Re: 2.6.6 breaks kmail (nfs related?)
From: Trond Myklebust @ 2004-05-17 15:55 UTC (permalink / raw)
To: Andreas Amann; +Cc: Linus Torvalds, Kernel Mailing List
On Mon, 17/05/2004 at 07:31, Andreas Amann wrote:
> fstat64(8, 0xbfffe650) = -1 ESTALE (Stale NFS file handle)
> _llseek(8, 0, [373], SEEK_END) = 0
> write(8, "X\1\0\0", 4) = -1 ESTALE (Stale NFS file handle)
> write(8, "\5\0\0\0,\0\0a\0z\0/\0w\0t\0z\0t\0+\0N\0S\0+\0001\0s"..., 344) = -1
> ESTALE (Stale NFS file handle)
That's weird... Where could that be coming from? The client is *never*
supposed to generate that on its own. If an ESTALE turns up, it should
always be generated by the server.
Does that same ESTALE show up on a tcpdump/ethereal dump? If so, could
you please check that the filehandle that is contained in the reply to
LOOKUP(".outbox.index") is the same as that which is sent on the
offending GETATTR call?
Cheers,
Trond
* Re: 2.6.6 breaks kmail (nfs related?)
From: Frank van Maarseveen @ 2004-05-17 16:17 UTC (permalink / raw)
To: linux-kernel
On Mon, May 17, 2004 at 03:35:42AM -0300, Norberto Bensa wrote:
>
> Well, I'm getting this with kcalc after upgrading to 2.6.6-mm3:
>
> $ kcalc
> KCrash: Application 'kcalc' crashing...
>
> strace shows lots of
> ...
> close(1002) = -1 EBADF (Bad file descriptor)
> close(1003) = -1 EBADF (Bad file descriptor)
> close(1004) = -1 EBADF (Bad file descriptor)
> close(1005) = -1 EBADF (Bad file descriptor)
Looks like daemonizing code to me, getting rid of open fds.
--
Frank
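The pattern suggested here, a process closing every descriptor it might have inherited, explains the long run of EBADF results: most numbers in the range were never open. A minimal sketch of that idiom, assuming a hypothetical helper `close_all` (this is not kcalc's or KDE's code):

```python
import errno
import os


def close_all(fds):
    """Close each descriptor in fds, tolerating ones that are not open.

    Returns two lists: descriptors that were actually closed, and
    descriptors for which close() failed with EBADF.
    """
    closed, not_open = [], []
    for fd in fds:
        try:
            os.close(fd)
            closed.append(fd)
        except OSError as exc:
            if exc.errno != errno.EBADF:
                raise  # anything other than "not open" is a real error
            not_open.append(fd)
    return closed, not_open
```

A daemonizing routine would typically call something like `close_all(range(3, 1024))` right after forking; under strace every unused number in that range then shows up as `close(N) = -1 EBADF`, which is harmless.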
* Re: 2.6.6 breaks kmail (nfs related?)
From: Andrew Morton @ 2004-05-17 17:35 UTC (permalink / raw)
To: norberto+linux-kernel, linux-kernel; +Cc: Trond Myklebust
Andrew Morton <akpm@osdl.org> wrote:
>
> Norberto Bensa <norberto+linux-kernel@bensa.ath.cx> wrote:
> >
> > Well, I'm getting this with kcalc after upgrading to 2.6.6-mm3:
> >
> > $ kcalc
> > KCrash: Application 'kcalc' crashing...
> >
> > strace shows lots of
> > ...
> > close(1002) = -1 EBADF (Bad file descriptor)
> > close(1003) = -1 EBADF (Bad file descriptor)
> > close(1004) = -1 EBADF (Bad file descriptor)
> > close(1005) = -1 EBADF (Bad file descriptor)
> > ...
>
> Send the whole thing, please: `strace -f -o log kcalc', and send `log'. If
> it's too big to post please mail it to me direct and I'll stick it on a
> public server.
>
Norberto's strace log is at
http://www.zip.com.au/~akpm/linux/patches/stuff/log.txt
* Re: 2.6.6 breaks kmail (nfs related?)
From: Trond Myklebust @ 2004-05-17 18:01 UTC (permalink / raw)
To: Andrew Morton; +Cc: norberto+linux-kernel, linux-kernel
On Mon, 17/05/2004 at 13:35, Andrew Morton wrote:
> Norberto's strace log is at
> http://www.zip.com.au/~akpm/linux/patches/stuff/log.txt
A priori, it looks very different from Andreas' problem. This beast is
crashing due to a SIGSEGV.
The EBADF errors here appear to be correct: the application or glibc or
whatever appears to be trying to close files more than once. Duh...
Cheers,
Trond
* Re: 2.6.6 breaks kmail (nfs related?)
From: Matthias Urlichs @ 2004-05-17 21:35 UTC (permalink / raw)
To: linux-kernel
Hi, Andreas Amann wrote:
> (I also tried "strace -f", but apparently exim does not like to be traced?)
Exim's setuid. Tracing setuid programs generally is fraught with peril,
especially if that program changes uids, drops privileges and then
re-execs itself (as exim does, IIRC). :-/
--
Matthias Urlichs
* Re: 2.6.6 breaks kmail (nfs related?)
From: Andreas Amann @ 2004-05-21 15:27 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Linus Torvalds, Kernel Mailing List
On Monday 17 May 2004 17:55, Trond Myklebust wrote:
> On Mon, 17/05/2004 at 07:31, Andreas Amann wrote:
> > fstat64(8, 0xbfffe650) = -1 ESTALE (Stale NFS file handle)
> > _llseek(8, 0, [373], SEEK_END) = 0
> > write(8, "X\1\0\0", 4) = -1 ESTALE (Stale NFS file handle)
> > write(8, "\5\0\0\0,\0\0a\0z\0/\0w\0t\0z\0t\0+\0N\0S\0+\0001\0s"..., 344) = -1 ESTALE (Stale NFS file handle)
>
> That's weird... Where could that be coming from? The client is *never*
> supposed to generate that on its own. If an ESTALE turns up, it should
> always be generated by the server.
>
> Does that same ESTALE show up on a tcpdump/ethereal dump? If so, could
> you please check that the filehandle that is contained in the reply to
> LOOKUP(".outbox.index") is the same as that which is sent on the
> offending GETATTR call?
I have now produced the "ethereal" dumps and put them at:
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/kmail_etheral_cut_2.6.4
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/kmail_etheral_cut_2.6.6
together with the pertinent "strace"s at:
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/kmail_trace_2.6.4_new
http://wwwnlds.physik.tu-berlin.de/~amann/kmail_bug/kmail_trace_2.6.6_new
I interpret the dump in the failing case with the 2.6.6 client as follows:
First the filehandle for ".outbox.index" (0xdc36f60a) is delivered by a
READDIRPLUS reply (Frame 4 in kmail_etheral_cut_2.6.6). Then the client does
GETATTR, ACCESS, SETATTR, READ (Frames 100-114) without any problems.
The client subsequently issues a WRITE and a COMMIT (Frame 741 + 751) command,
which are still successful. But the immediately following GETATTR (Frame
743) fails with ERR_STALE.
In the case where the client is 2.6.4, the dumps look very similar, except
that a lot of the GETATTR calls are missing. In particular, the GETATTR
which failed in the 2.6.6 case is not present, and therefore cannot fail.
In both cases the server was 2.4.25. So who is wrong in this case, the client
or the server? To me it now looks as if the server needs to be fixed, but I
am no expert.
Andreas
* Re: 2.6.6 breaks kmail (nfs related?)
From: Trond Myklebust @ 2004-05-21 16:40 UTC (permalink / raw)
To: Andreas Amann; +Cc: Linus Torvalds, Kernel Mailing List
On Fri, 21/05/2004 at 11:27, Andreas Amann wrote:
> In both cases the server was 2.4.25. So who is wrong in this case, the client
> or the server? To me it now looks as if the server needs to be fixed, but I
> am no expert.
Yep. This is a server side bug. I just checked the dump, and the client
is indeed sending the correct filehandle (exactly the same one as in the
COMMIT just before it).
Hmm... It looks to me as if you are exporting that filesystem with the
"subtree_check" option enabled. Could you try to set "no_subtree_check"?
The subtree checking stuff breaks NFS in various subtle ways (including
renames etc), and is one of the more common sources of ESTALE errors.
Cheers,
Trond
* Re: 2.6.6 breaks kmail (nfs related?)
From: Andreas Amann @ 2004-05-21 23:05 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Linus Torvalds, Kernel Mailing List
On Fri, May 21, 2004 at 12:40:02PM -0400, Trond Myklebust wrote:
>
> Hmm... It looks to me as if you are exporting that filesystem with the
> "subtree_check" option enabled. Could you try to set "no_subtree_check"?
Thanks for that one, with "no_subtree_check" the problem disappears!
What is the disadvantage of this option?
Andreas
* Re: 2.6.6 breaks kmail (nfs related?)
From: J. Bruce Fields @ 2004-05-22 3:40 UTC (permalink / raw)
To: Andreas Amann; +Cc: Trond Myklebust, Linus Torvalds, Kernel Mailing List
On Sat, May 22, 2004 at 01:05:45AM +0200, Andreas Amann wrote:
> On Fri, May 21, 2004 at 12:40:02PM -0400, Trond Myklebust wrote:
> >
> > Hmm... It looks to me as if you are exporting that filesystem with the
> > "subtree_check" option enabled. Could you try to set "no_subtree_check"?
>
> Thanks for that one, with "no_subtree_check" the problem disappears!
> What is the disadvantage of this option?
With "no_subtree_check" the server will not attempt to verify that a
given filehandle points to a file that is beneath an exported directory;
thus an attacker can guess filehandles of files not beneath any exported
directory and use those guessed filehandles to access files you didn't
mean to export.
Even with "no_subtree_check", the server can still recognize which
filesystem a filehandle belongs to; so you're only in trouble if you
have files you don't want exported on the same partition as files you do
want exported.
See "man exports" for more.
--Bruce Fields
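The trade-off described above can be pictured with a toy model. This is only a sketch: a real NFS server works on filehandles and inodes, not path strings, and the function names and example paths below are made up for illustration. It shows just the two checks mentioned: "is the file on the exported filesystem" (always enforced) and "is the file beneath the exported directory" (enforced only with subtree_check).

```python
import os.path


def is_beneath(path: str, root: str) -> bool:
    """True if path is root itself or lies somewhere below it."""
    root = os.path.normpath(root)
    return os.path.commonpath([os.path.normpath(path), root]) == root


def server_accepts(path: str, export_root: str, filesystem_root: str,
                   subtree_check: bool) -> bool:
    """Toy model of whether a guessed handle for path would be honoured."""
    if not is_beneath(path, filesystem_root):
        return False  # a different filesystem is rejected in either mode
    if subtree_check:
        return is_beneath(path, export_root)
    return True  # no_subtree_check: anything on the same filesystem passes
```

For example, with a hypothetical export of `/srv/home` on a filesystem mounted at `/srv`, a guessed handle for `/srv/secret/key` is refused under subtree_check but accepted under no_subtree_check, while `/etc/passwd` on another filesystem is refused in both modes. Keeping exported directories on their own partition removes the difference.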