LKML Archive on lore.kernel.org
* broken device locking, sg vs. sg_io on block devices
       [not found] <200703261811.21448.gerald@itzgrund.net>
@ 2007-03-30 11:17 ` Eduard Bloch
  2007-03-30 13:43   ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Eduard Bloch @ 2007-03-30 11:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: debburn-devel

Hello,

I am taking this issue to LKML now.

Short story: using O_EXCL on /dev/srX alone does not prevent another
process from killing your burn process by just reading the /dev/sgX
device associated with yours, and vice versa. We have done the best we
could to make operation safe (in contrast to Schilling's
kill-this-evil-hald-thing bitching) but that is not enough; the locking
has to be established at the kernel layer.

Long story:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=413960
https://bugzilla.novell.com/show_bug.cgi?id=226019
http://lists.alioth.debian.org/pipermail/debburn-devel/2007-February/000297.html
and other error messages.

There is AFAICS no simple way to establish locking across the driver
borders. If kernel developers have a good idea, any help is appreciated.

Below are the typical symptoms: wodim operates via /dev/sgX because the
user chose it this way, some other process (most likely hald) comes
along and reads from /dev/sr0 and the drive gets confused. Boom.

Regards,
Eduard.

* Gerald Lutter [Mon, Mar 26 2007, 06:10:44PM]:
> Hello List,
> 
> i've tried to burn the grml_0.9.iso to a cdr/cd-rw medium using wodim from 
> cdrkit 1.1.2 and the burner HL-DT-ST GMA-4082N. The iso can be fetched 
> from "http://www.grml.org".
> 
> To make a long story short, burning this image to dvd+r and dvd+rw media works 
> without any problems but I need this image on a cdr or cd-rw medium. I've 
> attached the output of the command I used to this mail:
> 
> /usr/bin/wodim -vvv -VVV gracetime=2 dev=2,0,0 speed=10 
                                       ^^^^^^^^^
> driveropts=burnfree -eject -overburn -data /tmp/grml_0.9.iso
...
> 
Track 01:    0 of  692 MB written.
> write track data: error after 317440 bytes
...
> Sense Key: 0x2 Not Ready, Segment 0
> Sense Code: 0x04 Qual 0x08 (logical unit not ready, long write in progress) Fru 0x0
> Sense flags: Blk 0 (not valid) 


-- 
* Amaya cuddles Ganneff because of his email to private :*
<Ganneff> I knew it would bring something bad with it somewhere.


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-30 11:17 ` broken device locking, sg vs. sg_io on block devices Eduard Bloch
@ 2007-03-30 13:43   ` Christoph Hellwig
  2007-03-30 14:21     ` Eduard Bloch
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2007-03-30 13:43 UTC (permalink / raw)
  To: Eduard Bloch; +Cc: linux-kernel, debburn-devel

On Fri, Mar 30, 2007 at 01:17:44PM +0200, Eduard Bloch wrote:
> Hello,
> 
> I am taking this issue to LKML now.
> 
> Short story: using O_EXCL on /dev/srX alone does not help to prevent
> other process from killing your burn process by just reading the
> /dev/sgX device associated with yours, and vice versa. We have done the
> best we could to make safe operation (in contrary to Schilling's
> kill-this-evil-hald-thing bitching) but that is not enough, the locking
> has to be established on kernel layer.
> 
> Long story:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=413960
> https://bugzilla.novell.com/show_bug.cgi?id=226019
> http://lists.alioth.debian.org/pipermail/debburn-devel/2007-February/000297.html
> and other error messages.
> 
> There is AFAICS no simple way to establish locking across the driver
> borders. If kernel developers have a good idea, any help is appreciated.
> 
> Below are the typical symptoms: wodim operates via /dev/sgX because the
> user chose it this way, some other process (most likely hald) comes
> along and reads from /dev/sr0 and the drive gets confused. Boom.

You have three problems here, and none of them are in the kernel :)

First the hardware is broken when it can't deal with concurrent requests,
I'd try to get a refund for it.  Second wodim should never ever use
/dev/sg if the sr node is available.  And third HAL should stop poking
devices all the time.  Then again hald is a totally lost cause and
I can only recommend to uninstall it ASAP.


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-30 13:43   ` Christoph Hellwig
@ 2007-03-30 14:21     ` Eduard Bloch
  2007-03-30 18:10       ` Alan Cox
  2007-03-30 19:09       ` Jan Engelhardt
  0 siblings, 2 replies; 14+ messages in thread
From: Eduard Bloch @ 2007-03-30 14:21 UTC (permalink / raw)
  To: linux-kernel, debburn-devel

#include <hallo.h>
* Christoph Hellwig [Fri, Mar 30 2007, 02:43:27PM]:

> > Long story:
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=413960
> > https://bugzilla.novell.com/show_bug.cgi?id=226019
> > http://lists.alioth.debian.org/pipermail/debburn-devel/2007-February/000297.html
> > and other error messages.
> > 
> > There is AFAICS no simple way to establish locking across the driver
> > borders. If kernel developers have a good idea, any help is appreciated.
> > 
> > Below are the typical symptoms: wodim operates via /dev/sgX because the
> > user chose it this way, some other process (most likely hald) comes
> > along and reads from /dev/sr0 and the drive gets confused. Boom.
> 
> You have three problems here, and none of them are in the kernel :)
> 
> First the hardware is broken when it can't deal with concurrent requests,
> I'd try to get a refund for it.  Second wodim should never ever use
> /dev/sg if the sr node is available.  And third HAL should stop poking
> devices all the time.  Then again hald is a totally lost cause and
> I can only recommend to uninstall it ASAP.

Then make /dev/sg* unusable when something opens /dev/sr. Please.
Otherwise it is just another assumption of how things might or should
work, not really matching the use cases in the wild. And here are
some real world facts:

 - there is a fixed mapping between the fake-SCSI numbers and devices
   established by Schilling. People are using it, it is hardcoded into
   frontend applications. We cannot change that overnight without
   breaking even more stuff.
   
   If there is a simple way to get the mapping between the sg and sr
   devices that would be great and almost solve the problems, but I
   cannot find such a thing in the kernel.

 - hald is using O_EXCL already, as wodim does, and this seems to work
   well as long as they are acting on the same virtual devices.

 - hald is installed together with KDE. Do you suggest to remove KDE as
   well? I don't think so.

 - of course the hardware does not handle concurrent requests, it is
   designed that way. It is the kernel's burden to channel the access and
   deal with concurrency issues. Obviously the kernel can and shall not
   do all the work but at least the basic safety mechanisms must work
   reliably and currently they don't.

Eduard.
-- 
<LGS> Hello there, you nutters, up so early?
<nusse> no, we are all sleeping in the collective
<knorke> my alcove is broken
<teq> alcohol broken?


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-30 14:21     ` Eduard Bloch
@ 2007-03-30 18:10       ` Alan Cox
  2007-03-31 17:07         ` Eduard Bloch
  2007-03-30 19:09       ` Jan Engelhardt
  1 sibling, 1 reply; 14+ messages in thread
From: Alan Cox @ 2007-03-30 18:10 UTC (permalink / raw)
  To: Eduard Bloch; +Cc: linux-kernel, debburn-devel

>    If there is a simple way to get the mapping between the sg and sr
>    devices that would be great and almost solve the problems, but I
>    cannot discover a such thing in the kernel.

You can go trying to match bus values but we have SG_IO on /dev/sr. This
is an old known problem with /dev/sg, and it is one reason Jens and co
fixed it with the SG_IO interface.
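
Matching "bus values" between an sg node and an sr node could be sketched
roughly as below. This is only an illustration under stated assumptions:
the two ioctl names are the ones that come up later in this thread, their
numeric values and the bit packing of dev_id are given as I understand the
kernel's SCSI headers, and the names struct idlun, struct scsi_addr,
decode_idlun, query_addr and same_device are hypothetical.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* ioctl numbers as found in the kernel's SCSI headers; defined here only
 * when the installed headers have not already provided them. */
#ifndef SCSI_IOCTL_GET_IDLUN
#define SCSI_IOCTL_GET_IDLUN 0x5382
#endif
#ifndef SCSI_IOCTL_GET_BUS_NUMBER
#define SCSI_IOCTL_GET_BUS_NUMBER 0x5386
#endif

/* Buffer filled in by SCSI_IOCTL_GET_IDLUN (assumed layout). */
struct idlun {
    uint32_t dev_id;          /* id | lun << 8 | channel << 16 | host << 24 */
    uint32_t host_unique_id;
};

/* Hypothetical illustration type: one SCSI address. */
struct scsi_addr {
    int host, channel, id, lun;
};

/* Split the packed dev_id word into its four 8-bit fields. */
struct scsi_addr decode_idlun(uint32_t dev_id)
{
    struct scsi_addr a;
    a.id      = (int)(dev_id & 0xff);
    a.lun     = (int)((dev_id >> 8) & 0xff);
    a.channel = (int)((dev_id >> 16) & 0xff);
    a.host    = (int)((dev_id >> 24) & 0xff);
    return a;
}

/* Ask the driver behind fd (an open sg or sr node) for its address.
 * Returns 0 on success, -1 with errno set on failure. */
int query_addr(int fd, struct scsi_addr *out)
{
    struct idlun raw;
    int bus = 0;

    if (ioctl(fd, SCSI_IOCTL_GET_IDLUN, &raw) < 0)
        return -1;
    *out = decode_idlun(raw.dev_id);
    /* dev_id carries only 8 bits of the host number; prefer the full one. */
    if (ioctl(fd, SCSI_IOCTL_GET_BUS_NUMBER, &bus) == 0)
        out->host = bus;
    return 0;
}

/* Two nodes refer to the same device when all four fields agree. */
int same_device(const struct scsi_addr *a, const struct scsi_addr *b)
{
    return a->host == b->host && a->channel == b->channel &&
           a->id == b->id && a->lun == b->lun;
}
```

A caller would run query_addr() on the descriptor it was given and on each
candidate /dev/srN, then compare the results with same_device(). Note that
merely opening the candidates may itself disturb a running burn.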

>  - hald is using O_EXCL already, as wodim does, and this seems to work
>    well as long as they are acting on the same virtual devices.

You might want to try hal polling the master while burning on the slave
and vice versa; that also used to cause problems on some systems.

>  - of course the hardware does not handle concurrent requests, it is
>    designed that way. It is burden of kernel to canalize the access and
>    deal with concurrency issues. Obviously the kernel can and shall not

It's the job of the kernel to serialize requests coming in and it does
that for you both with SCSI and even old IDE. If you ask it to do
something stupid then that becomes the desktop's problem.

>    do all the work but at least the basic safety mechanisms must work
>    reliably and currently they don't.

The kernel does not have sufficient information to handle /dev/sg locking
by itself. That is one reason /dev/sg is a privileged interface. It's
designed to let you do anything however crazy you like, as root. If you
do something stupid it breaks. The whole point of /dev/sg is that you can
do anything with it. You shouldn't be using /dev/sg for normal CD burning
applications on 2.6. 
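
For reference, sending a command through SG_IO on an already opened sr
node, rather than through /dev/sg, can look roughly like the sketch below.
It issues TEST UNIT READY, a six-byte command with no data phase. The
helper names prepare_test_unit_ready and test_unit_ready are hypothetical,
and the 5 second timeout is an arbitrary choice for illustration.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

/* Fill an sg_io_hdr_t for TEST UNIT READY (opcode 0x00, six-byte CDB,
 * no data transfer). The caller owns the cdb and sense buffers. */
void prepare_test_unit_ready(sg_io_hdr_t *hdr, unsigned char cdb[6],
                             unsigned char *sense, unsigned char sense_len)
{
    memset(cdb, 0, 6);               /* all-zero CDB is TEST UNIT READY */
    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';         /* required marker for the sg v3 header */
    hdr->cmd_len = 6;
    hdr->cmdp = cdb;
    hdr->dxfer_direction = SG_DXFER_NONE;
    hdr->sbp = sense;                /* where sense data is written on error */
    hdr->mx_sb_len = sense_len;
    hdr->timeout = 5000;             /* milliseconds; arbitrary for this sketch */
}

/* Send TEST UNIT READY on fd (e.g. an open /dev/sr0).
 * Returns -1 if the ioctl itself fails, else the SCSI status byte
 * (0 means GOOD). */
int test_unit_ready(int fd)
{
    unsigned char cdb[6];
    unsigned char sense[32];
    sg_io_hdr_t hdr;

    prepare_test_unit_ready(&hdr, cdb, sense, sizeof(sense));
    if (ioctl(fd, SG_IO, &hdr) < 0)
        return -1;
    return hdr.status;
}
```

Because the request goes through the block device node, it shares that
node's open and locking state, which is the point being made about
avoiding /dev/sg.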

For sane systems use the SG_IO interface on the proper device file. That
also fixes the need for setuid access and the like except when issuing
"dangerous" commands.

Christoph is wrong about one point: the hardware isn't broken. The IDE bus
is merely badly designed in this area. 

HAL breaks systems, it causes some laptops to burn power and on a few
boxes kills your disk performance. It'll also stop a few boxes reading
video-cd disks. Most of that appears to be HAL problems or indirectly
through problems in the locking scheme chosen (O_EXCL not fcntl locks).

The desktop user space should really know what it is doing with the CD
device if it wants to do things like CD burning. If the serial port
people could get this right in 1977 then there is no excuse for the CD
using people not getting it right in 2007.

Alan


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-30 14:21     ` Eduard Bloch
  2007-03-30 18:10       ` Alan Cox
@ 2007-03-30 19:09       ` Jan Engelhardt
  1 sibling, 0 replies; 14+ messages in thread
From: Jan Engelhardt @ 2007-03-30 19:09 UTC (permalink / raw)
  To: Eduard Bloch; +Cc: linux-kernel, debburn-devel


On Mar 30 2007 16:21, Eduard Bloch wrote:
>> 
>> First the hardware is broken when it can't deal with concurrent requests,
>> I'd try to get a refund for it.  Second wodim should never ever use
>> /dev/sg if the sr node is available.  And third HAL should stop poking
>> devices all the time.  Then again hald is a totally lost cause and
>> I can only recommend to uninstall it ASAP.

As part of opensuse 10.2, I do have hald running. It has not interfered
with cd writing so far. Though, I do not run any automounter like autofs or
ivman.

> - there is a fixed mapping between the fake-SCSI numbers and devices
>   established by Schilling.

It accepts dev=/dev/hdc, even if no one wants to admit it.

> - hald is installed together with KDE. Do you suggest to remove KDE as
>   well? I don't think so.

hald is installed as part of the base system, not kde. (If I uninstall it,
stuff like battery status on textconsole ceases to work here.)


Jan
-- 


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-30 18:10       ` Alan Cox
@ 2007-03-31 17:07         ` Eduard Bloch
  2007-03-31 22:20           ` Alan Cox
  0 siblings, 1 reply; 14+ messages in thread
From: Eduard Bloch @ 2007-03-31 17:07 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel, debburn-devel

#include <hallo.h>
* Alan Cox [Fri, Mar 30 2007, 07:10:38PM]:
> >    If there is a simple way to get the mapping between the sg and sr
> >    devices that would be great and almost solve the problems, but I
> >    cannot discover a such thing in the kernel.
> 
> You can go trying to match bus values but we have SG_IO on /dev/sr. This
> is an old known problem with /dev/sg, and it is one reason Jens and co
> fixed it with the SG_IO interface.

I am trying a different way now, fishing the associated device name from
sysfs' symlinks and then reassigning the device access to /dev/sr. Not
that I like it very much but it seems to be the best workaround, even
independent from potential fixes in kernel. I do not count on them
either, considering the hostile tone here.

> >  - of course the hardware does not handle concurrent requests, it is
> >    designed that way. It is burden of kernel to canalize the access and
> >    deal with concurrency issues. Obviously the kernel can and shall not
> 
> It's the job of the kernel to serialize requests coming in and it does
> that for you both with SCSI and even old IDE. If you ask it to do
> something stupid then that becomes the desktops problem.

But the desktop needs some means to deal with that. AFAICS the only
feasible way for applications to communicate about device usage policy
is locking with O_EXCL. Many people do not realize that even read-only
actions do harm when a delicate operation is in progress. This bad
assumption is already hardcoded into crucial components like libblkid
(mount), increasing the risk of creating coasters up to 100 percent for
no good reason, IMHO.

> >    do all the work but at least the basic safety mechanisms must work
> >    reliably and currently they don't.
> 
> The kernel does not have sufficient information to handle /dev/sg locking

But the kernel knows already that there is a block device behind it. It
is displayed in sysfs. It shall "just" reuse the lock mechanism of that
device, not more and not less. Naturally this "just" definition is
bendable and that is why I initially asked here.

> by itself. That is one reason /dev/sg is a privileged interface. It's
> designed to let you do anything however crazy you like, as root. If you
> do something stupid it breaks. The whole point of /dev/sg is that you can
> do anything with it. You shouldn't be using /dev/sg for normal CD burning
> applications on 2.6. 

The sad thing is, this is just another assumption. At least on Debian
/dev/sgX belongs to the cdrom group when it's a cdrom device and the
permissions practically invite you to work with it.

> For sane systems use the SG_IO interface on the proper device file. That
> also fixes the need for setuid access and the like except when issuing
> "dangerous" commands.

IIRC there were similar issues with the SCSI command filtering with
both but I am not sure on that.

> The desktop user space should really know what it is doing with the CD
> device if it wants to do things like CD burning. If the serial port
> people could get this right in 1977 then there is no excuse for the CD

Serial port? Do we have multiple drivers with multiple interfaces
accessing the same hardware simultaneously and independently? I don't
think so.

> using people not getting it right in 2007

CD burning people (at least most of them) do get it right, please
realize that. It is others who come along and disturb the burning
process. And sometimes it is even not their fault, e.g. hald is assumed
to do proper locking with O_EXCL. They may just happen to use different
userspace interfaces, and the kernel lets them clash.

Oh, and please don't kill the messenger, I am not Schilling.

And if you think it is just simple and sufficient to tell people "you
have to use /dev/sr now and everything else is your problem", please
compare that with the time line of devfs deprecation, for example.

The use of /dev/sg* is still common practice, its invention predates
devfs by many years and there is no big campaign telling to switch to
udev and there is no automatic fallback to a safe system (like static
device files) and no obvious way to see what is going on before the
burning process starts. You just get a coaster in the beginning and no
clear way to see why it happens. I guess it will take years for the
you-shall-not-use-sg message to settle down in the heads of users.

Eduard.
-- 
Being clever has never kept anyone from doing stupid things.
		-- Stefan Zweig


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-31 17:07         ` Eduard Bloch
@ 2007-03-31 22:20           ` Alan Cox
  2007-03-31 22:40             ` Eduard Bloch
  2007-04-07 11:21             ` Eduard Bloch
  0 siblings, 2 replies; 14+ messages in thread
From: Alan Cox @ 2007-03-31 22:20 UTC (permalink / raw)
  To: Eduard Bloch; +Cc: linux-kernel, debburn-devel

> But the desktop needs some means to deal with that. AFAICS the only
> feasible way for applications to communicate about device usage policy
> is locking with O_EXCL. Many people do not realize that even read-only

Serial ports and mail both use fcntl file locking, which is much more
flexible.

> > The kernel does not have sufficient information to handle /dev/sg locking
> 
> But the kernel knows already that there is a block device behind it. It
> is displayed in sysfs. It shall "just" reuse the lock mechanism of that
> device, not more and not less. Naturally this "just" definition is
> bendable and that is why I initially asked here.

This doesn't help. There are legitimate reasons to use /dev/sg on a
device which is active. For most subsystems this actually makes a lot of
sense when doing things like enclosure control.

> The sad thing is, this is just another assumption. At least on Debian
> /dev/sgX belongs to the cdrom group when it's a cdrom device and the
> permissions do just invite to work with it.

Which means it is privileged.

> IIRC there were similar issues with the SCSI command filtering with
> both but I am not sure on that.

SG_IO does command filtering, /dev/sg is intended to be assigned
correctly.

> > The desktop user space should really know what it is doing with the CD
> > device if it wants to do things like CD burning. If the serial port
> > people could get this right in 1977 then there is no excuse for the CD
> 
> Serial port? Do we have multiple drivers with multiple interfaces
> accessing the same hardware simultaneously and independently? I don't
> think so.

getty/modem/uucp/terminal emulator/slip/ppp/.. 

I do think so.

> The use of /dev/sg* is still common practice, its invention predates

The /dev/sg interface cannot do the locking. If you use /dev/sg you are
telling the kernel you know what you are doing. If you don't then you'll make
coasters or even bigger messes. 

If you are prepared to fix the apps then I'd suggest fixing them to use
fcntl locks with exclusive lock/shared lock according to their need for
exclusivity. That would fix some of the HAL problems (open has side
effects relocking doesnt), but there are still corner cases with mounted
file systems that need handling and I can see those might need some kernel
helping hands.
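
A minimal sketch of the fcntl-based scheme suggested here, assuming every
cooperating program agrees to lock the same device node: a burner would
take an exclusive lock (F_WRLCK) for the duration of the write, a poller
would try a shared lock (F_RDLCK) and back off if it is refused. The
helper name set_device_lock is hypothetical. These locks are advisory, so
a program that never asks for one is not stopped by them.

```c
#include <fcntl.h>
#include <unistd.h>

/* Try to take or drop a whole-file advisory lock on an open descriptor.
 * type is F_RDLCK (shared), F_WRLCK (exclusive) or F_UNLCK (release).
 * Returns 0 on success, -1 with errno set; with F_SETLK the call never
 * blocks and fails with EACCES or EAGAIN when another process holds a
 * conflicting lock. F_WRLCK needs a descriptor opened for writing,
 * F_RDLCK one opened for reading. */
int set_device_lock(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;        /* 0 means "until end of file": the whole file */
    return fcntl(fd, F_SETLK, &fl);
}
```

Unlike O_EXCL, changing or dropping such a lock does not require closing
and reopening the device, so the state can change without a fresh open().
A lock taken on /dev/sr0 is not visible through /dev/sg0, however, so this
only helps once all parties use the same node.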

Alan


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-31 22:20           ` Alan Cox
@ 2007-03-31 22:40             ` Eduard Bloch
  2007-04-01  0:14               ` Alan Cox
  2007-04-07 11:21             ` Eduard Bloch
  1 sibling, 1 reply; 14+ messages in thread
From: Eduard Bloch @ 2007-03-31 22:40 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel, debburn-devel

#include <hallo.h>
* Alan Cox [Sat, Mar 31 2007, 11:20:02PM]:
> > But the desktop needs some means to deal with that. AFAICS the only
> > feasible way for applications to communicate about device usage policy
> > is locking with O_EXCL. Many people do not realize that even read-only
> 
> serial ports and mail both use fcntl file locking , which is much more
> flexible.

Again, what does that have to do with the problem at hand? Our problem is
not about locking on a single file (no matter which mechanism is used)
but the coordination of locks _behind_ the userspace access. Or
alternatively reassigning all access to one device file.

> > > The kernel does not have sufficient information to handle /dev/sg locking
> > 
> > But the kernel knows already that there is a block device behind it. It
> > is displayed in sysfs. It shall "just" reuse the lock mechanism of that
> > device, not more and not less. Naturally this "just" definition is
> > bendable and that is why I initially asked here.
> 
> This doesn't help. There are legitimate reasons to use /dev/sg on a
> device which is active. For most subsystems this actually makes a lot of
> sense when doing things like enclosure control.

For such uses one can omit the locking. Problem solved.

> > The sad thing is, this is just another assumption. At least on Debian
> > /dev/sgX belongs to the cdrom group when it's a cdrom device and the
> > permissions do just invite to work with it.
> 
> Which means it is privileged.

So? Then let's make /etc/shadow privileged too: chmod a+r /etc/shadow

> > > The desktop user space should really know what it is doing with the CD
> > > device if it wants to do things like CD burning. If the serial port
> > > people could get this right in 1977 then there is no excuse for the CD
> > 
> > Serial port? Do we have multiple drivers with multiple interfaces
> > accessing the same hardware simultaneously and independently? I don't
> > think so.
> 
> getty/modem/uucp/terminal emulator/slip/ppp/.. 
> 
> I do think so.

Nice try, but where are the different conflicting drivers with different
userspace interfaces? Do you have some more flawed comparisons of that
kind?

> > The use of /dev/sg* is still common practice, its invention predates
> 
> The /dev/sg interface cannot do the locking. If you use /dev/sg you are

Again, it doesn't have to. It can pass the locking operations to the
related block device driver.

The alternative is finding a mapping to the correct block device and acting
on that one (with O_EXCL or with fcntl, or both). Sysfs looks like a
good method to get information for such mapping but unfortunately you
(kernel developers) are going to cut even this last path soon (see
CONFIG_SYSFS_DEPRECATED and its bold description).

Is there any other way I need to know about? Some Voodoo ioctl?

Regards,
Eduard.

-- 
<alphascorpii> hm, what can you actually make out of bread ...
<maxx> crispy duck (with a little patience)


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-31 22:40             ` Eduard Bloch
@ 2007-04-01  0:14               ` Alan Cox
  2007-04-01  2:34                 ` Oleg Verych
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Cox @ 2007-04-01  0:14 UTC (permalink / raw)
  To: Eduard Bloch; +Cc: linux-kernel, debburn-devel

> > > The use of /dev/sg* is still common practice, its invention predates
> > 
> > The /dev/sg interface cannot do the locking. If you use /dev/sg you are
> 
> Again, it doesn't have to. It can pass the locking operations to the
> related block device driver.

No it can't. The driver has no idea what the locking rules are for
arbitrary command blocks sent to arbitrary devices. /dev/sg is a *raw*
interface. You can send anything to anyone, and the locking rules for
that are far too complex for a giant morass of kernel code to get added.

The mess begins because you use /dev/sg and put it in a cdrom group
instead of using SG_IO on the /dev/sr device. The mess continues because
of the use of O_EXCL locking thus forcing re-open/close by HAL instead
of fcntl based co-operative locking.

The job of the kernel is not and never has been to anticipate and correct
everything stupid someone tries to do in user space. 

As I said before, the people wanting to arbitrate serial ports got this
right in the mid 1970s; your situation is not much more complicated,
unless you persist in using /dev/sg - which yes does make it hard, but so
does writing it in COBOL, or while standing on your head. And the
solution to all three cases is the same: *DON'T DO IT*

Alan


* Re: broken device locking, sg vs. sg_io on block devices
  2007-04-01  0:14               ` Alan Cox
@ 2007-04-01  2:34                 ` Oleg Verych
  0 siblings, 0 replies; 14+ messages in thread
From: Oleg Verych @ 2007-04-01  2:34 UTC (permalink / raw)
  To: Alan Cox; +Cc: Eduard Bloch, linux-kernel, debburn-devel

> From: Alan Cox
> Newsgroups: gmane.linux.kernel
> Subject: Re: broken device locking, sg vs. sg_io on block devices
> Date: Sun, 1 Apr 2007 01:14:52 +0100
>
[]
>> Again, it doesn't have to. It can pass the locking operations to the
>> related block device driver.
>
> No it can't. The driver has no idea what the locking rules are for
> arbitrary command blocks sent to arbitrary devices. /dev/sg is a *raw*
> interface. You can send anything to anyone, and the locking rules for
> that are far too complex for a giant morass of kernel code to get added.
>
> The mess begins because you use /dev/sg and put it in a cdrom group
> instead of using SG_IO on the /dev/sr device.

(offtop: 'cdrom' is as ugly as 'floppy' for anything like usb,
firewire connected storage, why not use 'optics' and 'external' or
something?)

> The mess continues because of the use of O_EXCL locking thus forcing
> re-open/close by HAL

Manpage states something bad about it also...

> instead of fcntl based co-operative locking.

> > > getty/modem/uucp/terminal emulator/slip/ppp/..

Programs you've mentioned may have co-operative locking, but 'dd' or
'cat' have no knowledge of it for sure. Yet nothing prevents a permitted
user program from using these tools on /dev/tty*.

AFAIK kernel developers are always ready for very broken userspace, yet
co-operative locking is a job of the userspace programmers of very
different tools.

> The job of the kernel is not and never has been to anticipate and correct
> everything stupid someone tries to do in user space.


> As I said before the people wanting to arbitrate serial ports got this
> right in the mid 1970's your situation is not much more complicated,

Do you mean co-operative locking or carrier detection as a pre-hotplug
thing (:?

Could somebody please tell me why non-exclusive co-operative locking (if it
is implemented at all) and the racy O_EXCL already used in userspace
applications are better than _mandatory locking_? I've found the latter
helpful against any broken userspace trying to hijack my device and read or
write bytes to it.
____


* Re: broken device locking, sg vs. sg_io on block devices
  2007-03-31 22:20           ` Alan Cox
  2007-03-31 22:40             ` Eduard Bloch
@ 2007-04-07 11:21             ` Eduard Bloch
  2007-04-11 10:12               ` Eduard Bloch
  1 sibling, 1 reply; 14+ messages in thread
From: Eduard Bloch @ 2007-04-07 11:21 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel, debburn-devel

#include <hallo.h>

First, we (Thomas Schmidt and I) are working on a draft for a mandatory
locking scheme which will take care of the raciest situations even
without having a proper in-kernel solution. But you need to explain some
things, otherwise we cannot rely on your words.

> (open has side effects relocking doesnt)

What exactly does that mean in our scope?

Can we do the following without having side effects:

open("/dev/sr0",O_EXCL|O_RDWR); /* no matter what it returns */
fcntl(..., F_SETLK); /* no matter what it returns */
ioctl(f, SCSI_IOCTL_GET_IDLUN, &x);
ioctl(f, SCSI_IOCTL_GET_BUS_NUMBER, &jo);

Can you guarantee us that bit? 

Or shall we really implement ugly workarounds to avoid every open call?
Note that "just do like UUCP guys" is not as easy or reliable as people
may pretend.

Eduard.

-- 
Well, it's a garbage collector. It even fetches the trash from the sky.
       (Heise troll forum on Java in aircraft control)


* Re: broken device locking, sg vs. sg_io on block devices
  2007-04-07 11:21             ` Eduard Bloch
@ 2007-04-11 10:12               ` Eduard Bloch
  2007-04-11 11:31                 ` Alan Cox
  0 siblings, 1 reply; 14+ messages in thread
From: Eduard Bloch @ 2007-04-11 10:12 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel, debburn-devel, Jens Axboe

#include <hallo.h>
* Eduard Bloch [Sat, Apr 07 2007, 01:21:31PM]:

> Can we do following without having side effects:
> 
> open("/dev/sr0",O_EXCL|O_RDWR); /* no matter what it returns */
> fcntl(..., F_SETLK); /* no matter what it returns */
> ioctl(f, SCSI_IOCTL_GET_IDLUN, &x);
> ioctl(f, SCSI_IOCTL_GET_BUS_NUMBER, &jo);
> 
> Can you guarantee us that bit? 
> 
> Or shall we really implement ugly workarounds to avoid every open call?
> Note that "just do like UUCP guys" is not as easy or reliable as people
> may pretend.

Excuse me, but is there ANYBODY willing to give a binding statement on
that? First you tell "us CD writing guys" to manage that in user space
[1] and when we get real critical questions then everything we get is
radio silence? Very kindly. NOT.

Background: there are two good ways we can go:

a) carefully collecting device properties, mapping and opening
   additional devices
b) additional lockfiles delegating the locking operations

Unfortunately b leads to major problems in practice, while a looks
more promising but needs a guarantee that it will keep working and stay
harmless in future kernel releases. We need to know that before
moving on to missionary work.

Regards,
Eduard.

[1] which IMO still sucks because it's allowed to have device driver
access from any file, thus dealing with locking problems at file level is
like fighting a hydra. And with multiple semi-autonomous drivers for the
same hardware even the previous silver bullet (O_EXCL) does not help.

-- 
I am prepared to go anywhere, provided it be forward.
		-- David Livingstone


* Re: broken device locking, sg vs. sg_io on block devices
  2007-04-11 10:12               ` Eduard Bloch
@ 2007-04-11 11:31                 ` Alan Cox
  2007-04-11 12:19                   ` Eduard Bloch
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Cox @ 2007-04-11 11:31 UTC (permalink / raw)
  To: Eduard Bloch; +Cc: linux-kernel, debburn-devel, Jens Axboe

> > Can we do following without having side effects:
> > 
> > open("/dev/sr0",O_EXCL|O_RDWR); /* no matter what it returns */
> > fcntl(..., F_SETLK); /* no matter what it returns */
> > ioctl(f, SCSI_IOCTL_GET_IDLUN, &x);
> > ioctl(f, SCSI_IOCTL_GET_BUS_NUMBER, &jo);
> > 
> > Can you guarantee us that bit? 

open() has side effects. The CD layer allows you to open with O_NDELAY if
you want to avoid them.

> [1] and when we get real critical questions then everything we get is
> radio silence? Very kindly. NOT.

Given the attitude that was shown before, be glad I'm even bothering to
reply to this.



* Re: broken device locking, sg vs. sg_io on block devices
  2007-04-11 11:31                 ` Alan Cox
@ 2007-04-11 12:19                   ` Eduard Bloch
  0 siblings, 0 replies; 14+ messages in thread
From: Eduard Bloch @ 2007-04-11 12:19 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel, debburn-devel

#include <hallo.h>
* Alan Cox [Wed, Apr 11 2007, 12:31:02PM]:
> > > Can we do following without having side effects:
> > > 
> > > open("/dev/sr0",O_EXCL|O_RDWR); /* no matter what it returns */
> > > fcntl(..., F_SETLK); /* no matter what it returns */
> > > ioctl(f, SCSI_IOCTL_GET_IDLUN, &x);
> > > ioctl(f, SCSI_IOCTL_GET_BUS_NUMBER, &jo);
> > > 
> > > Can you guarantee us that bit? 
> 
> open() has side effects. The CD layer allows you to open with O_NDELAY if
> you want to avoid them.

Okay, thanks.

> > [1] and when we get real critical questions then everything we get is
> > radio silence? Very kindly. NOT.
> 
> Given the attitude that was shown before be glad I'm even bothering to
> reply to this.

My attitude? I hope you realize that our hands are tied if we get that
great kind of support here (like SEP roundtrips, not applicable
exemplary solutions and explanations without relevant details like the
one above) while the problem exists. Meaning sure death to real CDR/DVDR
disks every day.

You don't do the end user support, we do.

Eduard.
-- 
<Salz> jjFux: Ted used to be called Walther, too
<Salz> winkiller: hm... there are 8... the 7 candidates and NOTA
<Madkiss> Is he actually a split personality now, but one where
  both parts are nuts?
