LKML Archive on lore.kernel.org
* Re: Bug#502583: scary messages in dmesg
       [not found] <2bfaaf400810172145g4aba4827l87a19e67b9742815@mail.gmail.com>
@ 2008-10-18  6:04 ` Rogério Brito
  2008-10-20 23:54   ` Alexandre Lymberopoulos
  0 siblings, 1 reply; 9+ messages in thread
From: Rogério Brito @ 2008-10-18  6:04 UTC (permalink / raw)
  To: Alexandre Lymberopoulos, 502583; +Cc: linux-kernel

Hi, Alexandre.

On Oct 18 2008, Alexandre Lymberopoulos wrote:
> Package: usbmount
> Severity: normal
(...)
> [32282.607205] wmnetload[6372]: segfault at 1 ip b7db75a9 sp bfdfc288 error 4 in libc-2.7.so[b7d41000+155000]
> [39070.466297] usb 5-2: USB disconnect, address 3
> [39071.159613] Buffer I/O error on device sda1, logical block 1545
> [39071.159625] lost page write due to I/O error on sda1
> [39071.159642] ------------[ cut here ]------------
> [39071.159646] WARNING: at fs/buffer.c:1186 mark_buffer_dirty+0x20/0x6a()
> [39071.159649] Modules linked in: nls_utf8 nls_cp437 vfat fat nls_base sd_mod usb_storage i915 drm ipv6 loop joydev pcmcia snd_intel8x0 snd_intel8x0m snd_ac97_codec ac97_bus ieee80211 ieee80211_crypt snd_seq_dummy firmware_class yenta_socket rsrc_nonstatic pcmcia_core serio_raw snd_pcm_oss snd_mixer_oss i2c_i801 pcspkr psmouse i2c_core snd_pcm snd_seq_midi snd_seq_oss iTCO_wdt snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd rng_core soundcore snd_page_alloc video output button battery ac intel_agp agpgart evdev dcdbas ext3 jbd mbcache ide_cd_mod cdrom ide_disk ide_pci_generic piix ide_core ata_generic libata scsi_mod dock e100 mii ehci_hcd uhci_hcd usbcore thermal processor fan thermal_sys [last unloaded: ipw2200]
> [39071.159742] Pid: 8890, comm: umount Not tainted 2.6.26-1-686 #1
> [39071.159751]  [<c012242b>] warn_on_slowpath+0x40/0x66
> [39071.159770]  [<c01d1c0c>] generic_make_request+0x34d/0x37b
> [39071.159787]  [<f892d304>] ext3_getblk+0x9f/0x17d [ext3]
> [39071.159811]  [<c0158949>] mempool_alloc+0x1c/0xba
> [39071.159822]  [<c01d2ce8>] submit_bio+0xc6/0xcd
> [39071.159831]  [<c0158929>] mempool_free+0x63/0x67
> [39071.159843]  [<c01900a8>] mark_buffer_dirty+0x20/0x6a
> [39071.159849]  [<f887ec2c>] journal_update_superblock+0x59/0x97 [jbd]
> [39071.159865]  [<f887db43>] cleanup_journal_tail+0xac/0xb1 [jbd]
> [39071.159877]  [<f887de27>] log_do_checkpoint+0x2a8/0x2ee [jbd]
> [39071.159890]  [<c01564fe>] find_get_pages_tag+0x2a/0x6e
> [39071.159901]  [<c01564fe>] find_get_pages_tag+0x2a/0x6e
> [39071.159908]  [<c0135eec>] getnstimeofday+0x37/0xbc
> [39071.159919]  [<c01df8c4>] rb_insert_color+0x4c/0xad
> [39071.159929]  [<c0133c2e>] enqueue_hrtimer+0xc9/0xd4
> [39071.159938]  [<c01341f6>] hrtimer_start+0xf7/0x110
> [39071.159948]  [<c011d36d>] hrtick_set+0x8f/0xd8
> [39071.159957]  [<c02b7eb8>] schedule+0x64e/0x66f
> [39071.159976]  [<f887f6fc>] journal_destroy+0xc7/0x163 [jbd]
> [39071.160040]  [<c013177c>] autoremove_wake_function+0x0/0x2d
> [39071.160051]  [<f8933f16>] ext3_put_super+0x1f/0x169 [ext3]
> [39071.160071]  [<c0175a85>] generic_shutdown_super+0x4f/0xc8
> [39071.160078]  [<c019ebe3>] vfs_quota_off+0x0/0x518
> [39071.160084]  [<c0175b0a>] kill_block_super+0xc/0x1b
> [39071.160090]  [<c0175ba9>] deactivate_super+0x4b/0x60
> [39071.160097]  [<c0186ca7>] sys_umount+0x282/0x2c8
> [39071.160105]  [<c010f6a1>] flush_tlb_mm+0x39/0x60
> [39071.160119]  [<c0115afb>] do_page_fault+0x29b/0x5b8
> [39071.160132]  [<c0103853>] sysenter_past_esp+0x78/0xb1
> [39071.160144]  [<c02b0000>] virtcons_probe+0xd6/0xdd
> [39071.160154]  =======================
> [39071.160157] ---[ end trace c45f2bacd26f4247 ]---
> [39071.160167] Buffer I/O error on device sda1, logical block 1545
> [39071.160170] lost page write due to I/O error on sda1
> [39071.160184] Buffer I/O error on device sda1, logical block 1545
> [39071.160187] lost page write due to I/O error on sda1
> [39071.160197] Buffer I/O error on device sda1, logical block 1545
> [39071.160201] lost page write due to I/O error on sda1
> [39071.160210] Buffer I/O error on device sda1, logical block 1545
> [39071.160213] lost page write due to I/O error on sda1
> [39071.160223] Buffer I/O error on device sda1, logical block 1545
> [39071.160226] lost page write due to I/O error on sda1
> [39071.160235] Buffer I/O error on device sda1, logical block 1545
> [39071.160238] lost page write due to I/O error on sda1
> [39071.160247] Buffer I/O error on device sda1, logical block 1545
> [39071.160251] lost page write due to I/O error on sda1
> [39071.160259] Buffer I/O error on device sda1, logical block 1545
> [39071.160262] lost page write due to I/O error on sda1
> [39071.160272] Buffer I/O error on device sda1, logical block 1545
> [39071.160275] lost page write due to I/O error on sda1
> [39140.700073] usb 5-1: new high speed USB device using ehci_hcd and address 4
> [39140.946003] usb 5-1: configuration #1 chosen from 1 choice
> [39140.948029] scsi2 : SCSI emulation for USB Mass Storage devices
> [39140.949899] usb 5-1: New USB device found, idVendor=067b, idProduct=3507
> [39140.949909] usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
> [39140.949914] usb 5-1: Product: PL-3507C USB Storage Device
> [39140.949917] usb 5-1: Manufacturer: Prolific
> [39140.949920] usb 5-1: SerialNumber: 01201290
> [39140.950566] usb-storage: device found at 4
> [39140.950573] usb-storage: waiting for device to settle before scanning
> [39145.948337] usb-storage: device scan complete

Since the stack trace in these messages indicates something going wild in
kernelland, I'm CC'ing the linux-kernel mailing list. The problem is
probably triggered by something not related to usbmount.

Just for extra information, what kind of fs do you have on your memory
stick? Can you provide extra details on the situation?


Regards, Rogério Brito.

P.S.: Alexandre's complete dmesg is available at
http://bugs.debian.org/502583

-- 
Rogério Brito : rbrito@{mackenzie,ime.usp}.br : GPG key 1024D/7C2CAEB8
http://www.ime.usp.br/~rbrito : http://meusite.mackenzie.com.br/rbrito
Projects: algorithms.berlios.de : lame.sf.net : vrms.alioth.debian.org


* Re: Bug#502583: scary messages in dmesg
  2008-10-18  6:04 ` Bug#502583: scary messages in dmesg Rogério Brito
@ 2008-10-20 23:54   ` Alexandre Lymberopoulos
  2008-10-21 10:15     ` Rogério Brito
  0 siblings, 1 reply; 9+ messages in thread
From: Alexandre Lymberopoulos @ 2008-10-20 23:54 UTC (permalink / raw)
  To: Rogério Brito; +Cc: 502583, linux-kernel

On Sat, Oct 18, 2008 at 4:04 AM, Rogério Brito <rbrito@ime.usp.br> wrote:
> Hi, Alexandre.

Hi there, Rogério!

> As these messages indicate something going wild in kernelland (due to the
> stack trace), I'm CC'ing the linux-kernel mailing list. It is probably
> triggered by something probably not related to usbmount.

Ok...

> Just for extra information, what kind of fs do you have on your memory
> stick? Can you provide extra details on the situation?

It's not a memory stick, it's a hard disk with an ext3 file system. I
just plugged it in and the disk was automatically mounted in /dev/ext3
with no abnormal messages in dmesg. Those weird messages appeared when
I unplugged the disk (without umounting it, as it should be done when
using usbmount, right?).

When mounted I got a message asking for a fsck to be run on disk
because of many mount/umount processes without performing that
procedure.

By the way, I think I've lost no data, but those messages are pretty
scary. I'm sorry I can't help that much, since I'm no more than a user
and big fan of Linux.

> Regards, Rogério Brito.
>
> P.S.: Alexandre's complete dmesg is available at
> http://bugs.debian.org/502583

Thanks for caring about this bug (if it is a bug).

Regards, Alexandre
-- 
===============================================================================
Alexandre Lymberopoulos - lymber@gmail.com
===============================================================================


* Re: Bug#502583: scary messages in dmesg
  2008-10-20 23:54   ` Alexandre Lymberopoulos
@ 2008-10-21 10:15     ` Rogério Brito
  2008-10-21 12:33       ` Theodore Tso
  2008-10-21 20:55       ` Alexandre Lymberopoulos
  0 siblings, 2 replies; 9+ messages in thread
From: Rogério Brito @ 2008-10-21 10:15 UTC (permalink / raw)
  To: Alexandre Lymberopoulos; +Cc: 502583, 502583-submitter, linux-kernel

On Oct 20 2008, Alexandre Lymberopoulos wrote:
> It's not a memory stick, it's a hard disk with ext3 file system. I
> just plugged it in and the disk was automatically mounted in /dev/ext3
> with no abnormal messages in dmesg.

Ok, this far.

> Those weird messages appeared when I unplugged the disk (without
> umounting it, as it should be done when using usbmount, right?).

Did you sync the device? From the messages, it seems that some data were
still to be written to the device, but the device was already gone by that
time. I'm not a specialist in the filesystem subsystem, though, and perhaps
others could say more about it.

OTOH, the device might have been mounted with the sync option and I don't
know how it could have happened in this latter case.

> When mounted I got a message asking for a fsck to be run on disk
> because of many mount/umount processes without performing that
> procedure.

In a subsequent mount, I guess.

> By the way, I think I've lost no data, but those messages are pretty
> scary. I'm sorry I can't help that much, since I'm no more than a user
> and big fan of Linux.

Ok. I'm putting here the messages from Alexandre's dmesg, so that it would
be helpful for the filesystem developers to tell us more about the
situation:

> > [32282.607205] wmnetload[6372]: segfault at 1 ip b7db75a9 sp bfdfc288 error 4 in libc-2.7.so[b7d41000+155000]
> > [39070.466297] usb 5-2: USB disconnect, address 3
> > [39071.159613] Buffer I/O error on device sda1, logical block 1545
> > [39071.159625] lost page write due to I/O error on sda1
> > [39071.159642] ------------[ cut here ]------------
> > [39071.159646] WARNING: at fs/buffer.c:1186 mark_buffer_dirty+0x20/0x6a()
> > [39071.159649] Modules linked in: nls_utf8 nls_cp437 vfat fat nls_base sd_mod usb_storage i915 drm ipv6 loop joydev pcmcia snd_intel8x0 snd_intel8x0m snd_ac97_codec ac97_bus ieee80211 ieee80211_crypt snd_seq_dummy firmware_class yenta_socket rsrc_nonstatic pcmcia_core serio_raw snd_pcm_oss snd_mixer_oss i2c_i801 pcspkr psmouse i2c_core snd_pcm snd_seq_midi snd_seq_oss iTCO_wdt snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd rng_core soundcore snd_page_alloc video output button battery ac intel_agp agpgart evdev dcdbas ext3 jbd mbcache ide_cd_mod cdrom ide_disk ide_pci_generic piix ide_core ata_generic libata scsi_mod dock e100 mii ehci_hcd uhci_hcd usbcore thermal processor fan thermal_sys [last unloaded: ipw2200]
> > [39071.159742] Pid: 8890, comm: umount Not tainted 2.6.26-1-686 #1
> > [39071.159751]  [<c012242b>] warn_on_slowpath+0x40/0x66
> > [39071.159770]  [<c01d1c0c>] generic_make_request+0x34d/0x37b
> > [39071.159787]  [<f892d304>] ext3_getblk+0x9f/0x17d [ext3]
> > [39071.159811]  [<c0158949>] mempool_alloc+0x1c/0xba
> > [39071.159822]  [<c01d2ce8>] submit_bio+0xc6/0xcd
> > [39071.159831]  [<c0158929>] mempool_free+0x63/0x67
> > [39071.159843]  [<c01900a8>] mark_buffer_dirty+0x20/0x6a
> > [39071.159849]  [<f887ec2c>] journal_update_superblock+0x59/0x97 [jbd]
> > [39071.159865]  [<f887db43>] cleanup_journal_tail+0xac/0xb1 [jbd]
> > [39071.159877]  [<f887de27>] log_do_checkpoint+0x2a8/0x2ee [jbd]
> > [39071.159890]  [<c01564fe>] find_get_pages_tag+0x2a/0x6e
> > [39071.159901]  [<c01564fe>] find_get_pages_tag+0x2a/0x6e
> > [39071.159908]  [<c0135eec>] getnstimeofday+0x37/0xbc
> > [39071.159919]  [<c01df8c4>] rb_insert_color+0x4c/0xad
> > [39071.159929]  [<c0133c2e>] enqueue_hrtimer+0xc9/0xd4
> > [39071.159938]  [<c01341f6>] hrtimer_start+0xf7/0x110
> > [39071.159948]  [<c011d36d>] hrtick_set+0x8f/0xd8
> > [39071.159957]  [<c02b7eb8>] schedule+0x64e/0x66f
> > [39071.159976]  [<f887f6fc>] journal_destroy+0xc7/0x163 [jbd]
> > [39071.160040]  [<c013177c>] autoremove_wake_function+0x0/0x2d
> > [39071.160051]  [<f8933f16>] ext3_put_super+0x1f/0x169 [ext3]
> > [39071.160071]  [<c0175a85>] generic_shutdown_super+0x4f/0xc8
> > [39071.160078]  [<c019ebe3>] vfs_quota_off+0x0/0x518
> > [39071.160084]  [<c0175b0a>] kill_block_super+0xc/0x1b
> > [39071.160090]  [<c0175ba9>] deactivate_super+0x4b/0x60
> > [39071.160097]  [<c0186ca7>] sys_umount+0x282/0x2c8
> > [39071.160105]  [<c010f6a1>] flush_tlb_mm+0x39/0x60
> > [39071.160119]  [<c0115afb>] do_page_fault+0x29b/0x5b8
> > [39071.160132]  [<c0103853>] sysenter_past_esp+0x78/0xb1
> > [39071.160144]  [<c02b0000>] virtcons_probe+0xd6/0xdd
> > [39071.160154]  =======================
> > [39071.160157] ---[ end trace c45f2bacd26f4247 ]---
> > [39071.160167] Buffer I/O error on device sda1, logical block 1545
> > [39071.160170] lost page write due to I/O error on sda1
> > [39071.160184] Buffer I/O error on device sda1, logical block 1545
> > [39071.160187] lost page write due to I/O error on sda1
> > [39071.160197] Buffer I/O error on device sda1, logical block 1545
> > [39071.160201] lost page write due to I/O error on sda1
> > [39071.160210] Buffer I/O error on device sda1, logical block 1545
> > [39071.160213] lost page write due to I/O error on sda1
> > [39071.160223] Buffer I/O error on device sda1, logical block 1545
> > [39071.160226] lost page write due to I/O error on sda1
> > [39071.160235] Buffer I/O error on device sda1, logical block 1545
> > [39071.160238] lost page write due to I/O error on sda1
> > [39071.160247] Buffer I/O error on device sda1, logical block 1545
> > [39071.160251] lost page write due to I/O error on sda1
> > [39071.160259] Buffer I/O error on device sda1, logical block 1545
> > [39071.160262] lost page write due to I/O error on sda1
> > [39071.160272] Buffer I/O error on device sda1, logical block 1545
> > [39071.160275] lost page write due to I/O error on sda1
> > [39140.700073] usb 5-1: new high speed USB device using ehci_hcd and address 4
> > [39140.946003] usb 5-1: configuration #1 chosen from 1 choice
> > [39140.948029] scsi2 : SCSI emulation for USB Mass Storage devices
> > [39140.949899] usb 5-1: New USB device found, idVendor=067b, idProduct=3507
> > [39140.949909] usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
> > [39140.949914] usb 5-1: Product: PL-3507C USB Storage Device
> > [39140.949917] usb 5-1: Manufacturer: Prolific
> > [39140.949920] usb 5-1: SerialNumber: 01201290
> > [39140.950566] usb-storage: device found at 4
> > [39140.950573] usb-storage: waiting for device to settle before scanning
> > [39145.948337] usb-storage: device scan complete
(...)
> Thanks for caring about this bug (if it is a bug).

Well, it is a bug. It just needs more investigation to see where the bug
lies.


Regards, Rogério.

-- 
Rogério Brito : rbrito@{mackenzie,ime.usp}.br : GPG key 1024D/7C2CAEB8
http://www.ime.usp.br/~rbrito : http://meusite.mackenzie.com.br/rbrito
Projects: algorithms.berlios.de : lame.sf.net : vrms.alioth.debian.org


* Re: Bug#502583: scary messages in dmesg
  2008-10-21 10:15     ` Rogério Brito
@ 2008-10-21 12:33       ` Theodore Tso
  2008-10-21 14:35         ` Rogério Brito
  2008-10-21 21:21         ` Alexandre Lymberopoulos
  2008-10-21 20:55       ` Alexandre Lymberopoulos
  1 sibling, 2 replies; 9+ messages in thread
From: Theodore Tso @ 2008-10-21 12:33 UTC (permalink / raw)
  To: Rogério Brito
  Cc: Alexandre Lymberopoulos, 502583, 502583-submitter, linux-kernel

On Tue, Oct 21, 2008 at 08:15:10AM -0200, Rogério Brito wrote:
> On Oct 20 2008, Alexandre Lymberopoulos wrote:
> > It's not a memory stick, it's a hard disk with ext3 file system. I
> > just plugged it in and the disk was automatically mounted in /dev/ext3
> > with no abnormal messages in dmesg.
> 
> Ok, this far.
> 
> > Those weird messages appeared when I unplugged the disk (without
> > umounting it, as it should be done when using usbmount, right?).
> 
> Did you sync the device? From the message, it seems that some data were to
> be written to the device, but the device was already gone by that time, but
> I'm not a specialist on the filesystem subsystem and perhaps others could
> say more about it.

A patch to suppress the WARN information when the user does something
stupid (i.e., yanks out a USB stick without unmounting the filesystem
first) will be in 2.6.28.  This was done mainly to suppress the "scary
message" in dmesg, which was cluttering the reports on distributions
that support uploading such messages to http://www.kerneloops.org for
analysis.

However, the patch does not make it any *safer* to unceremoniously
yank out a USB stick without unmounting it first.  This can still lead
to data loss, unless you're *sure* that no process is writing to the
stick and you issued the sync command, and you know enough time has
passed so all of the data has been written to the USB stick. 
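
As a rough illustration of "enough time has passed", one can look at the
Dirty and Writeback counters in /proc/meminfo after issuing sync. This is
only a sketch: the helper names are made up for the example, the counters
are system-wide rather than per-device, and a value of zero says nothing
about the stick's own internal write cache.

```python
import re


def parse_meminfo_kb(text):
    """Return {field: kB} for lines of the form 'Name:   123 kB'."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)\s*kB$", line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def pending_writeback_kb(meminfo_text):
    """Sum Dirty and Writeback (system-wide, not per device)."""
    fields = parse_meminfo_kb(meminfo_text)
    return fields.get("Dirty", 0) + fields.get("Writeback", 0)


# Made-up sample in /proc/meminfo format, for illustration only.
SAMPLE = """MemTotal:  1024000 kB
Dirty:         1536 kB
Writeback:      128 kB
"""

print(pending_writeback_kb(SAMPLE))  # 1664
```

On a live system the text would come from reading /proc/meminfo. Even
then, unmounting remains the reliable way to know the writes are done.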

> > > [39071.160167] Buffer I/O error on device sda1, logical block 1545
> > > [39071.160170] lost page write due to I/O error on sda1
> > > [39071.160184] Buffer I/O error on device sda1, logical block 1545
> > > [39071.160187] lost page write due to I/O error on sda1

These errors you'd still get, since these messages are the sound of
users' data being irretrievably lost.
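
To see how many writes were dropped, the "lost page write" lines can be
counted per device. The following is a small sketch, not an existing
tool: the function name is hypothetical, and the sample text is copied
from the dmesg output quoted in this thread.

```python
import re
from collections import Counter

# Matches the kernel message quoted in this thread and captures the device.
LOST_WRITE = re.compile(r"lost page write due to I/O error on (\S+)")


def count_lost_writes(dmesg_text):
    """Count 'lost page write' messages per block device."""
    return Counter(LOST_WRITE.findall(dmesg_text))


SAMPLE = """[39071.160167] Buffer I/O error on device sda1, logical block 1545
[39071.160170] lost page write due to I/O error on sda1
[39071.160184] Buffer I/O error on device sda1, logical block 1545
[39071.160187] lost page write due to I/O error on sda1
"""

for device, count in count_lost_writes(SAMPLE).items():
    print(device, count)  # sda1 2
```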

> Well, it is a bug. It just needs more investigation to see where the bug
> lies.

I don't know if you would call it a bug or not.  Fundamentally,
yanking out a USB stick without unmounting it first is dangerous, and
can lead to data loss.  Printing the stack trace when this happens
implies it's a problem which can be fixed by a developer, so perhaps
that could be considered a bug, and in any case, that should be
"fixed" in 2.6.28.  (I'm pretty sure akpm has sent the ext3 version of
that patch to Linus by now, but if not, it should make the 2.6.28
merge window.)

If you see "lost page write due to I/O error", then you will have lost
data due to premature removal of the USB stick, and fundamentally
*that* bug exists between the keyboard and the chair.

Regards,

						- Ted


* Re: Bug#502583: scary messages in dmesg
  2008-10-21 12:33       ` Theodore Tso
@ 2008-10-21 14:35         ` Rogério Brito
  2008-10-21 15:31           ` Theodore Tso
  2008-10-21 21:21         ` Alexandre Lymberopoulos
  1 sibling, 1 reply; 9+ messages in thread
From: Rogério Brito @ 2008-10-21 14:35 UTC (permalink / raw)
  To: Theodore Tso, Alexandre Lymberopoulos, 502583, 502583-submitter,
	linux-kernel, rafael

First of all, thanks, Theodore.

On Oct 21 2008, Theodore Tso wrote:
> On Tue, Oct 21, 2008 at 08:15:10AM -0200, Rogério Brito wrote:
> > Did you sync the device? From the message, it seems that some data were
> > to be written to the device, but the device was already gone by that
> > time, but I'm not a specialist on the filesystem subsystem and perhaps
> > others could say more about it.
> 
> A patch to suppress the WARN information will be in 2.6.28 when the
> user does something stupid (i.e., yank out a USB stick without
> unmounting the filesystem first).

Right. I will put a big, fat warning on the installation of usbmount and
tell the users about it. Besides that, depending on the filesystem, the
superblock may be marked as dirty.

> This was done mainly to suppress the "scary message" in dmesg, which on
> distributions that support uploading such messages to
> http://www.kerneloops.org for analysis, was cluttering the reports.

Right...

> > > > [39071.160167] Buffer I/O error on device sda1, logical block 1545
> > > > [39071.160170] lost page write due to I/O error on sda1
> > > > [39071.160184] Buffer I/O error on device sda1, logical block 1545
> > > > [39071.160187] lost page write due to I/O error on sda1
> 
> These errors you'd still get, since these messages are the sound of
> users' data being irretrievably lost.

Good that not all messages will be going... BTW, this reminded me of a
patch that eliminated a whole lot of messages (BUG()'s) for embedded
devices... I didn't find it anymore...

Also, the same thing with the hash tables that could turn into linked lists
eventually...

But I'm drifting away from the main topic.

> > Well, it is a bug. It just needs more investigation to see where the bug
> > lies.
> 
> I don't know if you would call it a bug or not.  Fundamentally,
> yanking out a USB stick without unmounting it first is dangerous, and
> can lead to data loss.

At least sync'ing...

[snip]

> If you see "lost page write due to I/O error", then you will have lost
> data due to premature removal of the USB stick, and fundamentally
> *that* bug exists between the keyboard and the chair.

Ok, so I'm closing this bug in the next upload.


Thanks, Rogério Brito.

-- 
Rogério Brito : rbrito@{mackenzie,ime.usp}.br : GPG key 1024D/7C2CAEB8
http://www.ime.usp.br/~rbrito : http://meusite.mackenzie.com.br/rbrito
Projects: algorithms.berlios.de : lame.sf.net : vrms.alioth.debian.org


* Re: Bug#502583: scary messages in dmesg
  2008-10-21 14:35         ` Rogério Brito
@ 2008-10-21 15:31           ` Theodore Tso
  2008-10-21 15:43             ` Jon Smirl
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Tso @ 2008-10-21 15:31 UTC (permalink / raw)
  To: Rogério Brito
  Cc: Alexandre Lymberopoulos, 502583, 502583-submitter, linux-kernel, rafael

On Tue, Oct 21, 2008 at 12:35:00PM -0200, Rogério Brito wrote:
> > I don't know if you would call it a bug or not.  Fundamentally,
> > yanking out a USB stick without unmounting it first is dangerous, and
> > can lead to data loss.
> 
> At least sync'ing...

I'm loath to suggest "just syncing", because there may be some program
still writing to the USB stick.  Example: you fire up OpenOffice and
ask it to "export to PDF".  That command returns instantly, and the
PDF export happens in the background --- as you will discover if you
try to exit OpenOffice.  Well, if a user doesn't try to exit
OpenOffice, and just uses the "sync" command, they could still end up
trashing data.

Also note that for older flash drives, pulling the power while it is
writing could potentially lead to corruption of the USB stick's flash
translation layer (FTL), which could cause the device to become
totally non-functional.  It's for that reason that one particular
digital SLR stops writing to the compact flash card the instant the
access door to the flash card is opened, throwing away all of the last
7-8 pictures in the digital camera's write buffer.  I'm assuming they
did this because some users ejected the flash card while it was
writing leading to loss of the flash card plus *all* of the pictures
on the flash card, and they decided the risk of having a Very Unhappy
User was worth the tradeoff of throwing away only the recently taken
photographs that hadn't made it onto the card yet.

If you have a USB stick that has a flashing light to indicate write
activity, and you type the "sync" command, and you wait for the
flashing light to stop, and you *know* nothing else might be trying to
write to the device, sure it might be safe to eject.  But if we have a
command-line user who knows enough to type "sync" into a command-line
shell, wouldn't it be better to create a setuid shell command, call it
something like "usbeject" which finds any USB storage device that was
auto-mounted (i.e., not mounted manually by the user or via an
/etc/fstab entry), and if there is only one such device,
automatically tries to unmount it?  If there is more than one such
device, it should ask the user (or maybe use lsof to see if there is
only one that appears not to be in use), and if it fails, it should
print a huge warning message as well as the output of "lsof" so the
user knows which program(s) to close before unmounting the device.
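
The selection logic of such a "usbeject" command could be sketched as
below. Everything here is an assumption made for illustration: the
function names are hypothetical, auto-mounted devices are taken to be
/dev/sd* devices mounted under /media/usb, and the setuid handling, the
interactive prompt and the lsof report are left out.

```python
def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (device, mount_point, fstype)."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def automounted_candidates(mounts_text, prefix="/media/usb"):
    """Mount points that look auto-mounted (assumed prefix, see above)."""
    return [
        mount_point
        for device, mount_point, _fstype in parse_mounts(mounts_text)
        if device.startswith("/dev/sd") and mount_point.startswith(prefix)
    ]


def usbeject_command(mounts_text):
    """Return the umount command if exactly one candidate exists, else None."""
    candidates = automounted_candidates(mounts_text)
    if len(candidates) != 1:
        return None  # zero or several devices: the user has to choose
    return ["umount", candidates[0]]


# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE = """/dev/hda1 / ext3 rw,errors=remount-ro 0 0
/dev/sda1 /media/usb0 ext3 rw,sync 0 0
"""

print(usbeject_command(SAMPLE))  # ['umount', '/media/usb0']
```

A real tool would run the returned command, for example with
subprocess.run, and would print the warning and the list of open files
when the unmount fails.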

Solving this problem for desktop users is harder; probably the best
thing you can do is to throw up a "shame" dialog box telling them that
they did Something Wrong, and while they may have gotten lucky this
time, that next time they should close all programs using the USB
storage device, and then right-click on the mounted disk icon and
select "eject".  That's what Windows does, and IIRC, what Mac OS X
does; there really isn't much else that can be done.

Regards,

						- Ted


* Re: Bug#502583: scary messages in dmesg
  2008-10-21 15:31           ` Theodore Tso
@ 2008-10-21 15:43             ` Jon Smirl
  0 siblings, 0 replies; 9+ messages in thread
From: Jon Smirl @ 2008-10-21 15:43 UTC (permalink / raw)
  To: Theodore Tso, Rogério Brito, Alexandre Lymberopoulos,
	502583, 502583-submitter, linux-kernel, rafael

On Tue, Oct 21, 2008 at 11:31 AM, Theodore Tso <tytso@mit.edu> wrote:
> Solving this problem for desktop users is harder; probably the best
> thing you can do is to throw up a "shame" dialog box telling them that
> they did Something Wrong, and while they may have gotten lucky this
> time, that next time they should close all programs using the USB
> storage device, and then right-click on the mounted disk icon and
> select "eject".  That's what Windows does, and IIRC, what Mac OS X
> does; there really isn't much else that can be done.

Can we do something about atime on removable media? It is
non-intuitive to most users that sticking a drive in and copying a
couple files off from it is going to cause writes to the device. A
normal user would think that this is read-only access and it is ok to
yank the drive.  I've burnt myself several times from this.

Another thing that gets normal users is yanking out a drive that was
definitely idle and then not having the icon for the drive disappear
on the desktop.

Maybe change distro mount defaults for removable media to noatime? And
add an event when the drive is yanked with no pending writes to tell
the desktop the drive is gone?
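
To check which removable mounts would still get atime writes, the mount
options can be inspected. A minimal sketch follows, assuming removable
media are mounted under /media/ (the function name and the sample text
are made up for the example):

```python
def mounts_without_noatime(mounts_text, prefix="/media/"):
    """List writable mount points under prefix that lack the noatime option."""
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete /proc/mounts-style line
        mount_point = fields[1]
        options = fields[3].split(",")
        if not mount_point.startswith(prefix):
            continue
        if "ro" in options or "noatime" in options:
            continue  # read-only or already noatime: reads cause no atime writes
        flagged.append(mount_point)
    return flagged


# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE = """/dev/hda1 / ext3 rw,errors=remount-ro 0 0
/dev/sda1 /media/usb0 ext3 rw,sync 0 0
/dev/sdb1 /media/usb1 vfat rw,noatime 0 0
"""

print(mounts_without_noatime(SAMPLE))  # ['/media/usb0']
```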

Of course yanking with writes pending should generate a big error box.
Can it ask the user to reinsert the drive and pick up where it left
off?

-- 
Jon Smirl
jonsmirl@gmail.com


* Re: Bug#502583: scary messages in dmesg
  2008-10-21 10:15     ` Rogério Brito
  2008-10-21 12:33       ` Theodore Tso
@ 2008-10-21 20:55       ` Alexandre Lymberopoulos
  1 sibling, 0 replies; 9+ messages in thread
From: Alexandre Lymberopoulos @ 2008-10-21 20:55 UTC (permalink / raw)
  To: Rogério Brito; +Cc: 502583, 502583-submitter, linux-kernel

On Tue, Oct 21, 2008 at 7:15 AM, Rogério Brito <rbrito@ime.usp.br> wrote:

> Ok, this far.

Just till that point... ;-)

> Did you sync the device? From the message, it seems that some data were to
> be written to the device, but the device was already gone by that time, but
> I'm not a specialist on the filesystem subsystem and perhaps others could
> say more about it.

No, not that time. I'm used to syncing devices via the command line, but
I thought that if a device is mounted with the sync option, the data is
actually written to the disk whenever a command to do so is given. I'm
almost convinced that the drive was not in use when I unplugged it (I was
rsync'ing mp3 files between devices).

By the way, I have purged usbmount for now, although I'm well aware that
it may not be safe to unplug the device without unmounting it (which
can only be done as root when you use usbmount, since the device is
not in /etc/fstab).

> In a subsequent mount, I guess.

In a previous one and in the later one.

> Well, it is a bug. It just needs more investigation to see where the bug
> lies.

Not everyone agrees that it's a bug in usbmount; maybe it's in me... ;-)

I'm not a total newbie, so it is good that this happened to
me and not to anyone else...

> Regards, Rogério.

Yours, Alexandre
-- 
===============================================================================
Alexandre Lymberopoulos - lymber@gmail.com
===============================================================================


* Re: Bug#502583: scary messages in dmesg
  2008-10-21 12:33       ` Theodore Tso
  2008-10-21 14:35         ` Rogério Brito
@ 2008-10-21 21:21         ` Alexandre Lymberopoulos
  1 sibling, 0 replies; 9+ messages in thread
From: Alexandre Lymberopoulos @ 2008-10-21 21:21 UTC (permalink / raw)
  To: Theodore Tso, Rogério Brito, Alexandre Lymberopoulos,
	502583, 502583-submitter, linux-kernel

On Tue, Oct 21, 2008 at 9:33 AM, Theodore Tso <tytso@mit.edu> wrote:

> A patch to suppress the WARN information will be in 2.6.28 when the
> user does something stupid (i.e., yank out a USB stick without
> unmounting the filesystem first).  This was done mainly to suppress
> the "scary message" in dmesg, which on distributions that support
> uploading such messages to http://www.kerneloops.org for analysis, was
> cluttering the reports.

I don't think removing these messages is a good idea. It's good even
for non-developer users to have information about what is happening on
their systems. At least to know that they (me in that case) did
something stupid.

> However, the patch does not make it any *safer* to unceremoniously
> yank out a USB stick without unmounting it first.  This can still lead
> to data loss, unless you're *sure* that no process is writing to the
> stick and you issued the sync command, and you know enough time has
> passed so all of the data has been written to the USB stick.

Another reason to keep these messages. Hiding information without
providing a solution (I can see none here; it's impossible to prevent
anyone from doing stupid things) is not a good idea. It sounds like the
philosophy of other operating systems...

> If you see "lost page write due to I/O error", then you will have lost
> data due to premature removal of the USB stick, and fundamentally
> *that* bug exists between the keyboard and the chair.

;-)

> Regards,
>
>                                                - Ted

Sincerely, Alexandre
-- 
===============================================================================
Alexandre Lymberopoulos - lymber@gmail.com
===============================================================================

