LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
@ 2007-01-04  0:14 Sami Farin
  2007-01-07 21:37 ` David Chinner
  0 siblings, 1 reply; 18+ messages in thread
From: Sami Farin @ 2007-01-04  0:14 UTC (permalink / raw)
  To: linux-kernel Mailing List

just a simple test I did...
xfs_freeze -f /mnt/newtest
cp /etc/fstab /mnt/newtest
xfs_freeze -u /mnt/newtest

2007-01-04 01:44:30.341979500 <4>BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
2007-01-04 01:44:30.385771500 <4> [<c0103cfb>] dump_trace+0x215/0x21a
2007-01-04 01:44:30.385774500 <4> [<c0103da3>] show_trace_log_lvl+0x1a/0x30
2007-01-04 01:44:30.385775500 <4> [<c0103dcb>] show_trace+0x12/0x14
2007-01-04 01:44:30.385777500 <4> [<c0103ec8>] dump_stack+0x19/0x1b
2007-01-04 01:44:30.385778500 <4> [<c013a3af>] debug_mutex_unlock+0x69/0x120
2007-01-04 01:44:30.385779500 <4> [<c04b4aac>] __mutex_unlock_slowpath+0x44/0xf0
2007-01-04 01:44:30.385780500 <4> [<c04b4887>] mutex_unlock+0x8/0xa
2007-01-04 01:44:30.385782500 <4> [<c018d0ba>] thaw_bdev+0x57/0x6e
2007-01-04 01:44:30.385791500 <4> [<c026a6cf>] xfs_ioctl+0x7ce/0x7d3
2007-01-04 01:44:30.385793500 <4> [<c0269158>] xfs_file_ioctl+0x33/0x54
2007-01-04 01:44:30.385794500 <4> [<c01793f2>] do_ioctl+0x76/0x85
2007-01-04 01:44:30.385795500 <4> [<c0179570>] vfs_ioctl+0x59/0x1aa
2007-01-04 01:44:30.385796500 <4> [<c0179728>] sys_ioctl+0x67/0x77
2007-01-04 01:44:30.385797500 <4> [<c0102e73>] syscall_call+0x7/0xb
2007-01-04 01:44:30.385799500 <4> [<001be410>] 0x1be410
2007-01-04 01:44:30.385800500 <4> =======================

fstab was there just fine after -u.

Linux 2.6.19.1 SMP on Pentium D.

-- 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
  2007-01-04  0:14 xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock() Sami Farin
@ 2007-01-07 21:37 ` David Chinner
  2007-01-08 11:03   ` Sami Farin
  2007-01-08 23:56   ` xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock() Andrew Morton
  0 siblings, 2 replies; 18+ messages in thread
From: David Chinner @ 2007-01-07 21:37 UTC (permalink / raw)
  To: linux-kernel Mailing List; +Cc: xfs

On Thu, Jan 04, 2007 at 02:14:21AM +0200, Sami Farin wrote:
> just a simple test I did...
> xfs_freeze -f /mnt/newtest
> cp /etc/fstab /mnt/newtest
> xfs_freeze -u /mnt/newtest
> 
> 2007-01-04 01:44:30.341979500 <4>BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
> 2007-01-04 01:44:30.385771500 <4> [<c0103cfb>] dump_trace+0x215/0x21a
> 2007-01-04 01:44:30.385774500 <4> [<c0103da3>] show_trace_log_lvl+0x1a/0x30
> 2007-01-04 01:44:30.385775500 <4> [<c0103dcb>] show_trace+0x12/0x14
> 2007-01-04 01:44:30.385777500 <4> [<c0103ec8>] dump_stack+0x19/0x1b
> 2007-01-04 01:44:30.385778500 <4> [<c013a3af>] debug_mutex_unlock+0x69/0x120
> 2007-01-04 01:44:30.385779500 <4> [<c04b4aac>] __mutex_unlock_slowpath+0x44/0xf0
> 2007-01-04 01:44:30.385780500 <4> [<c04b4887>] mutex_unlock+0x8/0xa
> 2007-01-04 01:44:30.385782500 <4> [<c018d0ba>] thaw_bdev+0x57/0x6e
> 2007-01-04 01:44:30.385791500 <4> [<c026a6cf>] xfs_ioctl+0x7ce/0x7d3
> 2007-01-04 01:44:30.385793500 <4> [<c0269158>] xfs_file_ioctl+0x33/0x54
> 2007-01-04 01:44:30.385794500 <4> [<c01793f2>] do_ioctl+0x76/0x85
> 2007-01-04 01:44:30.385795500 <4> [<c0179570>] vfs_ioctl+0x59/0x1aa
> 2007-01-04 01:44:30.385796500 <4> [<c0179728>] sys_ioctl+0x67/0x77
> 2007-01-04 01:44:30.385797500 <4> [<c0102e73>] syscall_call+0x7/0xb
> 2007-01-04 01:44:30.385799500 <4> [<001be410>] 0x1be410
> 2007-01-04 01:44:30.385800500 <4> =======================
> 
> fstab was there just fine after -u.

Oh, that still hasn't been fixed? Generic bug, not XFS - the global
semaphore->mutex cleanup converted the bd_mount_sem to a mutex, and
mutexes complain loudly when a the process unlocking the mutex is
not the process that locked it.

Basically, the generic code is broken - the bd_mount_mutex needs to
be reverted back to a semaphore because it is locked and unlocked
by different processes. The following patch does this....

BTW, Sami, can you cc xfs@oss.sgi.com on XFS bug reports in future;
you'll get more XFS savvy eyes there.....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---

Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f /mnt/newtest;
xfs_freeze -u /mnt/newtest works safely and doesn't produce lockdep warnings.

Signed-off-by: Dave Chinner <dgc@sgi.com>


---
 fs/block_dev.c       |    2 +-
 fs/buffer.c          |    6 +++---
 fs/gfs2/ops_fstype.c |    4 ++--
 fs/super.c           |    4 ++--
 include/linux/fs.h   |    2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

Index: 2.6.x-xfs-new/fs/block_dev.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/block_dev.c	2006-12-22 10:53:20.000000000 +1100
+++ 2.6.x-xfs-new/fs/block_dev.c	2007-01-08 08:26:15.843378600 +1100
@@ -263,7 +263,7 @@ static void init_once(void * foo, kmem_c
 	{
 		memset(bdev, 0, sizeof(*bdev));
 		mutex_init(&bdev->bd_mutex);
-		mutex_init(&bdev->bd_mount_mutex);
+		sema_init(&bdev->bd_mount_sem, 1);
 		INIT_LIST_HEAD(&bdev->bd_inodes);
 		INIT_LIST_HEAD(&bdev->bd_list);
 #ifdef CONFIG_SYSFS
Index: 2.6.x-xfs-new/fs/buffer.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/buffer.c	2006-12-12 12:04:51.000000000 +1100
+++ 2.6.x-xfs-new/fs/buffer.c	2007-01-08 08:28:40.832542651 +1100
@@ -179,7 +179,7 @@ int fsync_bdev(struct block_device *bdev
  * freeze_bdev  --  lock a filesystem and force it into a consistent state
  * @bdev:	blockdevice to lock
  *
- * This takes the block device bd_mount_mutex to make sure no new mounts
+ * This takes the block device bd_mount_sem to make sure no new mounts
  * happen on bdev until thaw_bdev() is called.
  * If a superblock is found on this device, we take the s_umount semaphore
  * on it to make sure nobody unmounts until the snapshot creation is done.
@@ -188,7 +188,7 @@ struct super_block *freeze_bdev(struct b
 {
 	struct super_block *sb;
 
-	mutex_lock(&bdev->bd_mount_mutex);
+	down(&bdev->bd_mount_sem);
 	sb = get_super(bdev);
 	if (sb && !(sb->s_flags & MS_RDONLY)) {
 		sb->s_frozen = SB_FREEZE_WRITE;
@@ -230,7 +230,7 @@ void thaw_bdev(struct block_device *bdev
 		drop_super(sb);
 	}
 
-	mutex_unlock(&bdev->bd_mount_mutex);
+	up(&bdev->bd_mount_sem);
 }
 EXPORT_SYMBOL(thaw_bdev);
 
Index: 2.6.x-xfs-new/fs/gfs2/ops_fstype.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/gfs2/ops_fstype.c	2006-12-12 12:04:58.000000000 +1100
+++ 2.6.x-xfs-new/fs/gfs2/ops_fstype.c	2007-01-08 08:27:12.847973663 +1100
@@ -867,9 +867,9 @@ static int gfs2_get_sb_meta(struct file_
 		error = -EBUSY;
 		goto error;
 	}
-	mutex_lock(&sb->s_bdev->bd_mount_mutex);
+	down(&sb->s_bdev->bd_mount_sem);
 	new = sget(fs_type, test_bdev_super, set_bdev_super, sb->s_bdev);
-	mutex_unlock(&sb->s_bdev->bd_mount_mutex);
+	up(&sb->s_bdev->bd_mount_sem);
 	if (IS_ERR(new)) {
 		error = PTR_ERR(new);
 		goto error;
Index: 2.6.x-xfs-new/fs/super.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/super.c	2006-12-22 11:45:59.000000000 +1100
+++ 2.6.x-xfs-new/fs/super.c	2007-01-08 08:24:20.718330640 +1100
@@ -736,9 +736,9 @@ int get_sb_bdev(struct file_system_type 
 	 * will protect the lockfs code from trying to start a snapshot
 	 * while we are mounting
 	 */
-	mutex_lock(&bdev->bd_mount_mutex);
+	down(&bdev->bd_mount_sem);
 	s = sget(fs_type, test_bdev_super, set_bdev_super, bdev);
-	mutex_unlock(&bdev->bd_mount_mutex);
+	up(&bdev->bd_mount_sem);
 	if (IS_ERR(s))
 		goto error_s;
 
Index: 2.6.x-xfs-new/include/linux/fs.h
===================================================================
--- 2.6.x-xfs-new.orig/include/linux/fs.h	2006-12-12 12:06:31.000000000 +1100
+++ 2.6.x-xfs-new/include/linux/fs.h	2007-01-08 08:24:53.602060200 +1100
@@ -456,7 +456,7 @@ struct block_device {
 	struct inode *		bd_inode;	/* will die */
 	int			bd_openers;
 	struct mutex		bd_mutex;	/* open/close mutex */
-	struct mutex		bd_mount_mutex;	/* mount mutex */
+	struct semaphore	bd_mount_sem;
 	struct list_head	bd_inodes;
 	void *			bd_holder;
 	int			bd_holders;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
  2007-01-07 21:37 ` David Chinner
@ 2007-01-08 11:03   ` Sami Farin
  2007-01-08 16:40     ` Eric Sandeen
  2007-01-08 23:56   ` xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock() Andrew Morton
  1 sibling, 1 reply; 18+ messages in thread
From: Sami Farin @ 2007-01-08 11:03 UTC (permalink / raw)
  To: linux-kernel Mailing List, xfs

On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
...
> > fstab was there just fine after -u.
> 
> Oh, that still hasn't been fixed?

Looked like it =)

> Generic bug, not XFS - the global
> semaphore->mutex cleanup converted the bd_mount_sem to a mutex, and
> mutexes complain loudly when a the process unlocking the mutex is
> not the process that locked it.
> 
> Basically, the generic code is broken - the bd_mount_mutex needs to
> be reverted back to a semaphore because it is locked and unlocked
> by different processes. The following patch does this....
> 
> BTW, Sami, can you cc xfs@oss.sgi.com on XFS bug reports in future;
> you'll get more XFS savvy eyes there.....

Forgot to.

Thanks for patch.  It fixed the issue, no more warnings.

BTW. the fix is not in 2.6.git, either.

-- 
Do what you love because life is too short for anything else.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
  2007-01-08 11:03   ` Sami Farin
@ 2007-01-08 16:40     ` Eric Sandeen
  2007-01-08 23:47       ` bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()) David Chinner
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sandeen @ 2007-01-08 16:40 UTC (permalink / raw)
  To: linux-kernel Mailing List, xfs

Sami Farin wrote:
> On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
> ...
>>> fstab was there just fine after -u.
>> Oh, that still hasn't been fixed?
> 
> Looked like it =)

Hm, it was proposed upstream a while ago:

http://lkml.org/lkml/2006/9/27/137

I guess it got lost?

-Eric

>> Generic bug, not XFS - the global
>> semaphore->mutex cleanup converted the bd_mount_sem to a mutex, and
>> mutexes complain loudly when a the process unlocking the mutex is
>> not the process that locked it.
>>
>> Basically, the generic code is broken - the bd_mount_mutex needs to
>> be reverted back to a semaphore because it is locked and unlocked
>> by different processes. The following patch does this....
>>
>> BTW, Sami, can you cc xfs@oss.sgi.com on XFS bug reports in future;
>> you'll get more XFS savvy eyes there.....
> 
> Forgot to.
> 
> Thanks for patch.  It fixed the issue, no more warnings.
> 
> BTW. the fix is not in 2.6.git, either.
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-08 16:40     ` Eric Sandeen
@ 2007-01-08 23:47       ` David Chinner
  2007-01-09  0:19         ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: David Chinner @ 2007-01-08 23:47 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-kernel Mailing List, xfs, akpm

On Mon, Jan 08, 2007 at 10:40:54AM -0600, Eric Sandeen wrote:
> Sami Farin wrote:
> > On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
> > ...
> >>> fstab was there just fine after -u.
> >> Oh, that still hasn't been fixed?
> > 
> > Looked like it =)
> 
> Hm, it was proposed upstream a while ago:
> 
> http://lkml.org/lkml/2006/9/27/137
> 
> I guess it got lost?

Seems like it. Andrew, did this ever get queued for merge?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
  2007-01-07 21:37 ` David Chinner
  2007-01-08 11:03   ` Sami Farin
@ 2007-01-08 23:56   ` Andrew Morton
  2007-01-09  6:41     ` Ingo Molnar
  1 sibling, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2007-01-08 23:56 UTC (permalink / raw)
  To: David Chinner; +Cc: linux-kernel Mailing List, xfs, Ingo Molnar

On Mon, 8 Jan 2007 08:37:34 +1100
David Chinner <dgc@sgi.com> wrote:

> On Thu, Jan 04, 2007 at 02:14:21AM +0200, Sami Farin wrote:
> > just a simple test I did...
> > xfs_freeze -f /mnt/newtest
> > cp /etc/fstab /mnt/newtest
> > xfs_freeze -u /mnt/newtest
> > 
> > 2007-01-04 01:44:30.341979500 <4>BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
> > 2007-01-04 01:44:30.385771500 <4> [<c0103cfb>] dump_trace+0x215/0x21a
> > 2007-01-04 01:44:30.385774500 <4> [<c0103da3>] show_trace_log_lvl+0x1a/0x30
> > 2007-01-04 01:44:30.385775500 <4> [<c0103dcb>] show_trace+0x12/0x14
> > 2007-01-04 01:44:30.385777500 <4> [<c0103ec8>] dump_stack+0x19/0x1b
> > 2007-01-04 01:44:30.385778500 <4> [<c013a3af>] debug_mutex_unlock+0x69/0x120
> > 2007-01-04 01:44:30.385779500 <4> [<c04b4aac>] __mutex_unlock_slowpath+0x44/0xf0
> > 2007-01-04 01:44:30.385780500 <4> [<c04b4887>] mutex_unlock+0x8/0xa
> > 2007-01-04 01:44:30.385782500 <4> [<c018d0ba>] thaw_bdev+0x57/0x6e
> > 2007-01-04 01:44:30.385791500 <4> [<c026a6cf>] xfs_ioctl+0x7ce/0x7d3
> > 2007-01-04 01:44:30.385793500 <4> [<c0269158>] xfs_file_ioctl+0x33/0x54
> > 2007-01-04 01:44:30.385794500 <4> [<c01793f2>] do_ioctl+0x76/0x85
> > 2007-01-04 01:44:30.385795500 <4> [<c0179570>] vfs_ioctl+0x59/0x1aa
> > 2007-01-04 01:44:30.385796500 <4> [<c0179728>] sys_ioctl+0x67/0x77
> > 2007-01-04 01:44:30.385797500 <4> [<c0102e73>] syscall_call+0x7/0xb
> > 2007-01-04 01:44:30.385799500 <4> [<001be410>] 0x1be410
> > 2007-01-04 01:44:30.385800500 <4> =======================
> > 
> > fstab was there just fine after -u.
> 
> Oh, that still hasn't been fixed? Generic bug, not XFS - the global
> semaphore->mutex cleanup converted the bd_mount_sem to a mutex, and
> mutexes complain loudly when a the process unlocking the mutex is
> not the process that locked it.
> 
> Basically, the generic code is broken - the bd_mount_mutex needs to
> be reverted back to a semaphore because it is locked and unlocked
> by different processes. The following patch does this....
> 
> ...
> 
> Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f /mnt/newtest;
> xfs_freeze -u /mnt/newtest works safely and doesn't produce lockdep warnings.

Sad.  The alternative would be to implement
mutex_unlock_dont_warn_if_a_different_task_did_it().  Ingo?  Possible?


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-08 23:47       ` bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()) David Chinner
@ 2007-01-09  0:19         ` Andrew Morton
  2007-01-09  3:12           ` Eric Sandeen
  2007-01-09 10:02           ` Christoph Hellwig
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2007-01-09  0:19 UTC (permalink / raw)
  To: David Chinner; +Cc: Eric Sandeen, linux-kernel Mailing List, xfs

On Tue, 9 Jan 2007 10:47:28 +1100
David Chinner <dgc@sgi.com> wrote:

> On Mon, Jan 08, 2007 at 10:40:54AM -0600, Eric Sandeen wrote:
> > Sami Farin wrote:
> > > On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
> > > ...
> > >>> fstab was there just fine after -u.
> > >> Oh, that still hasn't been fixed?
> > > 
> > > Looked like it =)
> > 
> > Hm, it was proposed upstream a while ago:
> > 
> > http://lkml.org/lkml/2006/9/27/137
> > 
> > I guess it got lost?
> 
> Seems like it. Andrew, did this ever get queued for merge?

Seems not.  I think people were hoping that various nasties in there
would go away.  We return to userspace with a kernel lock held??

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  0:19         ` Andrew Morton
@ 2007-01-09  3:12           ` Eric Sandeen
  2007-01-09  3:18             ` Andrew Morton
  2007-01-09 10:02           ` Christoph Hellwig
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Sandeen @ 2007-01-09  3:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Chinner, linux-kernel Mailing List, xfs

Andrew Morton wrote:
> On Tue, 9 Jan 2007 10:47:28 +1100
> David Chinner <dgc@sgi.com> wrote:
> 
>> On Mon, Jan 08, 2007 at 10:40:54AM -0600, Eric Sandeen wrote:
>>> Sami Farin wrote:
>>>> On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
>>>> ...
>>>>>> fstab was there just fine after -u.
>>>>> Oh, that still hasn't been fixed?
>>>> Looked like it =)
>>> Hm, it was proposed upstream a while ago:
>>>
>>> http://lkml.org/lkml/2006/9/27/137
>>>
>>> I guess it got lost?
>> Seems like it. Andrew, did this ever get queued for merge?
> 
> Seems not.  I think people were hoping that various nasties in there
> would go away.  We return to userspace with a kernel lock held??

Is a semaphore any worse than the current mutex in this respect?  At 
least unlocking from another thread doesn't violate semaphore rules.  :)

-Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  3:12           ` Eric Sandeen
@ 2007-01-09  3:18             ` Andrew Morton
  2007-01-09  3:38               ` Eric Sandeen
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2007-01-09  3:18 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: David Chinner, linux-kernel Mailing List, xfs

On Mon, 08 Jan 2007 21:12:40 -0600
Eric Sandeen <sandeen@sandeen.net> wrote:

> Andrew Morton wrote:
> > On Tue, 9 Jan 2007 10:47:28 +1100
> > David Chinner <dgc@sgi.com> wrote:
> > 
> >> On Mon, Jan 08, 2007 at 10:40:54AM -0600, Eric Sandeen wrote:
> >>> Sami Farin wrote:
> >>>> On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
> >>>> ...
> >>>>>> fstab was there just fine after -u.
> >>>>> Oh, that still hasn't been fixed?
> >>>> Looked like it =)
> >>> Hm, it was proposed upstream a while ago:
> >>>
> >>> http://lkml.org/lkml/2006/9/27/137
> >>>
> >>> I guess it got lost?
> >> Seems like it. Andrew, did this ever get queued for merge?
> > 
> > Seems not.  I think people were hoping that various nasties in there
> > would go away.  We return to userspace with a kernel lock held??
> 
> Is a semaphore any worse than the current mutex in this respect?  At 
> least unlocking from another thread doesn't violate semaphore rules.  :)

I assume that if we weren't returning to userspace with a lock held, this
mutex problem would simply go away.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  3:18             ` Andrew Morton
@ 2007-01-09  3:38               ` Eric Sandeen
  2007-01-09  3:51                 ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sandeen @ 2007-01-09  3:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Chinner, linux-kernel Mailing List, xfs

Andrew Morton wrote:
> On Mon, 08 Jan 2007 21:12:40 -0600
> Eric Sandeen <sandeen@sandeen.net> wrote:
> 
>> Andrew Morton wrote:
>>> On Tue, 9 Jan 2007 10:47:28 +1100
>>> David Chinner <dgc@sgi.com> wrote:
>>>
>>>> On Mon, Jan 08, 2007 at 10:40:54AM -0600, Eric Sandeen wrote:
>>>>> Sami Farin wrote:
>>>>>> On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
>>>>>> ...
>>>>>>>> fstab was there just fine after -u.
>>>>>>> Oh, that still hasn't been fixed?
>>>>>> Looked like it =)
>>>>> Hm, it was proposed upstream a while ago:
>>>>>
>>>>> http://lkml.org/lkml/2006/9/27/137
>>>>>
>>>>> I guess it got lost?
>>>> Seems like it. Andrew, did this ever get queued for merge?
>>> Seems not.  I think people were hoping that various nasties in there
>>> would go away.  We return to userspace with a kernel lock held??
>> Is a semaphore any worse than the current mutex in this respect?  At 
>> least unlocking from another thread doesn't violate semaphore rules.  :)
> 
> I assume that if we weren't returning to userspace with a lock held, this
> mutex problem would simply go away.
> 

Well nobody's asserting that the filesystem must always be locked & 
unlocked by the same thread, are they?  That'd be a strange rule to 
enforce upon the userspace doing the filesystem management wouldn't it? 
  Or am I thinking about this wrong...

-Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  3:38               ` Eric Sandeen
@ 2007-01-09  3:51                 ` Andrew Morton
  2007-01-09  4:17                   ` Nathan Scott
  2007-01-09 10:04                   ` Christoph Hellwig
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2007-01-09  3:51 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: David Chinner, linux-kernel Mailing List, xfs

On Mon, 08 Jan 2007 21:38:05 -0600
Eric Sandeen <sandeen@sandeen.net> wrote:

> Andrew Morton wrote:
> > On Mon, 08 Jan 2007 21:12:40 -0600
> > Eric Sandeen <sandeen@sandeen.net> wrote:
> > 
> >> Andrew Morton wrote:
> >>> On Tue, 9 Jan 2007 10:47:28 +1100
> >>> David Chinner <dgc@sgi.com> wrote:
> >>>
> >>>> On Mon, Jan 08, 2007 at 10:40:54AM -0600, Eric Sandeen wrote:
> >>>>> Sami Farin wrote:
> >>>>>> On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
> >>>>>> ...
> >>>>>>>> fstab was there just fine after -u.
> >>>>>>> Oh, that still hasn't been fixed?
> >>>>>> Looked like it =)
> >>>>> Hm, it was proposed upstream a while ago:
> >>>>>
> >>>>> http://lkml.org/lkml/2006/9/27/137
> >>>>>
> >>>>> I guess it got lost?
> >>>> Seems like it. Andrew, did this ever get queued for merge?
> >>> Seems not.  I think people were hoping that various nasties in there
> >>> would go away.  We return to userspace with a kernel lock held??
> >> Is a semaphore any worse than the current mutex in this respect?  At 
> >> least unlocking from another thread doesn't violate semaphore rules.  :)
> > 
> > I assume that if we weren't returning to userspace with a lock held, this
> > mutex problem would simply go away.
> > 
> 
> Well nobody's asserting that the filesystem must always be locked & 
> unlocked by the same thread, are they?  That'd be a strange rule to 
> enforce upon the userspace doing the filesystem management wouldn't it? 
>   Or am I thinking about this wrong...

I don't even know what code we're talking about here...

I'm under the impression that XFS will return to userspace with a
filesystem lock held, under the expectation (ie: requirement) that
userspace will later come in and release that lock.

If that's not true, then what _is_ happening in there?

If that _is_ true then, well, that sucks a bit.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  3:51                 ` Andrew Morton
@ 2007-01-09  4:17                   ` Nathan Scott
  2007-01-09  4:49                     ` David Chinner
  2007-01-09 10:04                   ` Christoph Hellwig
  1 sibling, 1 reply; 18+ messages in thread
From: Nathan Scott @ 2007-01-09  4:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Eric Sandeen, David Chinner, linux-kernel Mailing List, xfs

On Mon, 2007-01-08 at 19:51 -0800, Andrew Morton wrote:
> On Mon, 08 Jan 2007 21:38:05 -0600
> Eric Sandeen <sandeen@sandeen.net> wrote:
> 
> > Andrew Morton wrote:
> > > On Mon, 08 Jan 2007 21:12:40 -0600
> > > Eric Sandeen <sandeen@sandeen.net> wrote:
> > > 
> > >> Andrew Morton wrote:
> > >>> On Tue, 9 Jan 2007 10:47:28 +1100
> > >>> David Chinner <dgc@sgi.com> wrote:
> > >>>
> > >>>> On Mon, Jan 08, 2007 at 10:40:54AM -0600, Eric Sandeen wrote:
> > >>>>> Sami Farin wrote:
> > >>>>>> On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
> > >>>>>> ...
> > >>>>>>>> fstab was there just fine after -u.
> > >>>>>>> Oh, that still hasn't been fixed?
> > >>>>>> Looked like it =)
> > >>>>> Hm, it was proposed upstream a while ago:
> > >>>>>
> > >>>>> http://lkml.org/lkml/2006/9/27/137
> > >>>>>
> > >>>>> I guess it got lost?
> > >>>> Seems like it. Andrew, did this ever get queued for merge?
> > >>> Seems not.  I think people were hoping that various nasties in there
> > >>> would go away.  We return to userspace with a kernel lock held??
> > >> Is a semaphore any worse than the current mutex in this respect?  At 
> > >> least unlocking from another thread doesn't violate semaphore rules.  :)
> > > 
> > > I assume that if we weren't returning to userspace with a lock held, this
> > > mutex problem would simply go away.
> > > 
> > 
> > Well nobody's asserting that the filesystem must always be locked & 
> > unlocked by the same thread, are they?  That'd be a strange rule to 
> > enforce upon the userspace doing the filesystem management wouldn't it? 
> >   Or am I thinking about this wrong...
> 
> I don't even know what code we're talking about here...
> 
> I'm under the impression that XFS will return to userspace with a
> filesystem lock held, under the expectation (ie: requirement) that
> userspace will later come in and release that lock.

Its not really XFS, its more the generic device freezing code
(freeze_bdev) which is called by both XFS and the device mapper
suspend interface (both of which are exposed to userspace via
ioctls).  These interfaces are used when doing block level
snapshots which are "filesystem coherent".

> If that's not true, then what _is_ happening in there?

This particular case was a device mapper stack trace, hence the
confusion, I think.  Both XFS and DM are making the same generic
block layer call here though (freeze_bdev).

> If that _is_ true then, well, that sucks a bit.

Indeed, its a fairly ordinary interface, but thats too late to go
fix now I guess (since its exposed to userspace already).  A remount
flag along the lines of readonly may have been a better way to go...
perhaps.  *shrug*... not clear - I guess the problem the original
authors had there (whoever they were, I dunno), was that the block
layer wants to call up to the filesystem to quiesce itself, and at
some later user-defined point to unquiesce itself... which is a bit
of a layering violation.

>From a quick look, there seems to be a bug in the original patch - it
is passing -EAGAIN back without wrapping it up in ERR_PTR(), which
it needs to since freeze_bdev returns a struct super_block pointer.

cheers.

-- 
Nathan


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  4:17                   ` Nathan Scott
@ 2007-01-09  4:49                     ` David Chinner
  2007-01-09  6:02                       ` [**BULK SPAM**] " Nathan Scott
  0 siblings, 1 reply; 18+ messages in thread
From: David Chinner @ 2007-01-09  4:49 UTC (permalink / raw)
  To: Nathan Scott
  Cc: Andrew Morton, Eric Sandeen, David Chinner,
	linux-kernel Mailing List, xfs

On Tue, Jan 09, 2007 at 03:17:03PM +1100, Nathan Scott wrote:
> On Mon, 2007-01-08 at 19:51 -0800, Andrew Morton wrote:
> > On Mon, 08 Jan 2007 21:38:05 -0600
> > Eric Sandeen <sandeen@sandeen.net> wrote:
> > 
> > > Andrew Morton wrote:
> > > > On Mon, 08 Jan 2007 21:12:40 -0600
> > > > Eric Sandeen <sandeen@sandeen.net> wrote:
> > > > 
> > > >> Andrew Morton wrote:
> > > >>> On Tue, 9 Jan 2007 10:47:28 +1100
> > > >>> David Chinner <dgc@sgi.com> wrote:
> > > >>>
> > > >>>> On Mon, Jan 08, 2007 at 10:40:54AM -0600, Eric Sandeen wrote:
> > > >>>>> Sami Farin wrote:
> > > >>>>>> On Mon, Jan 08, 2007 at 08:37:34 +1100, David Chinner wrote:
> > > >>>>>> ...
> > > >>>>>>>> fstab was there just fine after -u.
> > > >>>>>>> Oh, that still hasn't been fixed?
> > > >>>>>> Looked like it =)
> > > >>>>> Hm, it was proposed upstream a while ago:
> > > >>>>>
> > > >>>>> http://lkml.org/lkml/2006/9/27/137
> > > >>>>>
> > > >>>>> I guess it got lost?
> > > >>>> Seems like it. Andrew, did this ever get queued for merge?
> > > >>> Seems not.  I think people were hoping that various nasties in there
> > > >>> would go away.  We return to userspace with a kernel lock held??
> > > >> Is a semaphore any worse than the current mutex in this respect?  At 
> > > >> least unlocking from another thread doesn't violate semaphore rules.  :)
> > > > 
> > > > I assume that if we weren't returning to userspace with a lock held, this
> > > > mutex problem would simply go away.
> > > > 
> > > 
> > > Well nobody's asserting that the filesystem must always be locked & 
> > > unlocked by the same thread, are they?  That'd be a strange rule to 
> > > enforce upon the userspace doing the filesystem management wouldn't it? 
> > >   Or am I thinking about this wrong...
> > 
> > I don't even know what code we're talking about here...
> > 
> > I'm under the impression that XFS will return to userspace with a
> > filesystem lock held, under the expectation (ie: requirement) that
> > userspace will later come in and release that lock.
> 
> Its not really XFS, its more the generic device freezing code
> (freeze_bdev) which is called by both XFS and the device mapper
> suspend interface (both of which are exposed to userspace via
> ioctls).  These interfaces are used when doing block level
> snapshots which are "filesystem coherent".
> 
> > If that's not true, then what _is_ happening in there?
> 
> This particular case was a device mapper stack trace, hence the
> confusion, I think.  Both XFS and DM are making the same generic
> block layer call here though (freeze_bdev).

Yup. it's the freeze_bdev/thaw_bdev use of the bd_mount_mutex()
that's the problem. I fail to see _why_ we need to hold a lock
across the freeze/thaw - the only reason i can think of is to
hold out new calls to sget() (via get_sb_bdev()) while the
filesystem is frozen though I'm not sure why you'd need to
do that. Can someone explain why we are holding the lock from
freeze to thaw?

FWIW, the comment in get_sb_bdev() seems to imply s_umount is supposed
to ensure that we don't get unmounted while frozen.

Indeed, in the comment above freeze_bdev:

 * If a superblock is found on this device, we take the s_umount semaphore
 * on it to make sure nobody unmounts until the snapshot creation is done.

implies this as well, but freeze_bdev does not take the s_umount
semaphore, nor does any filesystem that implements ->write_super_lockfs()

So the code is clearly at odds with the comments here.

IMO, you should be able to unmount a frozen filesystem - behaviour
should be the same as crashing while frozen, so i think the comments
about "snapshots" are pretty dodgy as well.

> > If that _is_ true then, well, that sucks a bit.
> 
> Indeed, its a fairly ordinary interface, but thats too late to go
> fix now I guess (since its exposed to userspace already). 

Userspace knows nothing about that lock, so we can change that without
changing the the userspace API.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [**BULK SPAM**]  Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  4:49                     ` David Chinner
@ 2007-01-09  6:02                       ` Nathan Scott
  0 siblings, 0 replies; 18+ messages in thread
From: Nathan Scott @ 2007-01-09  6:02 UTC (permalink / raw)
  To: David Chinner; +Cc: Andrew Morton, Eric Sandeen, linux-kernel Mailing List, xfs

On Tue, 2007-01-09 at 15:49 +1100, David Chinner wrote:
> On Tue, Jan 09, 2007 at 03:17:03PM +1100, Nathan Scott wrote:
> > On Mon, 2007-01-08 at 19:51 -0800, Andrew Morton wrote:
> > > If that's not true, then what _is_ happening in there?
> > 
> > This particular case was a device mapper stack trace, hence the
> > confusion, I think.  Both XFS and DM are making the same generic
> > block layer call here though (freeze_bdev).
> 
> Yup. it's the freeze_bdev/thaw_bdev use of the bd_mount_mutex()
> that's the problem. I fail to see _why_ we need to hold a lock
> across the freeze/thaw - the only reason i can think of is to
> hold out new calls to sget() (via get_sb_bdev()) while the
> filesystem is frozen though I'm not sure why you'd need to
> do that. Can someone explain why we are holding the lock from
> freeze to thaw?

Not me.  If it's really not needed, then...

> > > If that _is_ true then, well, that sucks a bit.
> > 
> > Indeed, its a fairly ordinary interface, but thats too late to go
> > fix now I guess (since its exposed to userspace already). 
> 
> Userspace knows nothing about that lock, so we can change that without
> changing the the userspace API.

...that would be true, AFAICS.

cheers.

-- 
Nathan


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()
  2007-01-08 23:56   ` xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock() Andrew Morton
@ 2007-01-09  6:41     ` Ingo Molnar
  0 siblings, 0 replies; 18+ messages in thread
From: Ingo Molnar @ 2007-01-09  6:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Chinner, linux-kernel Mailing List, xfs, Arjan van de Ven,
	Peter Zijlstra


* Andrew Morton <akpm@osdl.org> wrote:

> > Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f 
> > /mnt/newtest; xfs_freeze -u /mnt/newtest works safely and doesn't 
> > produce lockdep warnings.
> 
> Sad.  The alternative would be to implement 
> mutex_unlock_dont_warn_if_a_different_task_did_it().  Ingo?  Possible?

i'd like to avoid it as much as i'd like to avoid having to add 
spin_unlock_dont_warn_if_a_different_task_did_it(). Unlocking by a 
different task is usually a sign of messy locking and bugs lurking. Is 
it really true that XFS's use of bd_mount_mutex is safe and justified?

	Ingo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  0:19         ` Andrew Morton
  2007-01-09  3:12           ` Eric Sandeen
@ 2007-01-09 10:02           ` Christoph Hellwig
  1 sibling, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2007-01-09 10:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Chinner, Eric Sandeen, linux-kernel Mailing List, xfs

On Mon, Jan 08, 2007 at 04:19:17PM -0800, Andrew Morton wrote:
> Seems not.  I think people were hoping that various nasties in there
> would go away.  We return to userspace with a kernel lock held??

Well, there might be nicer solutions, but for now we should revert the
broken commit to change the lock type.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09  3:51                 ` Andrew Morton
  2007-01-09  4:17                   ` Nathan Scott
@ 2007-01-09 10:04                   ` Christoph Hellwig
  2007-01-10  1:34                     ` David Chinner
  1 sibling, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2007-01-09 10:04 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Eric Sandeen, David Chinner, linux-kernel Mailing List, xfs

On Mon, Jan 08, 2007 at 07:51:27PM -0800, Andrew Morton wrote:
> I don't even know what code we're talking about here...
> 
> I'm under the impression that XFS will return to userspace with a
> filesystem lock held, under the expectation (ie: requirement) that
> userspace will later come in and release that lock.
> 
> If that's not true, then what _is_ happening in there?
> 
> If that _is_ true then, well, that sucks a bit.

It's not a filesystem lock.  It's a per-blockdevice lock that prevents
multiple people from freezing the filesystem at the same time, aswell
as providing exclusion between a frozen filesystem an mount-related
activity.  It's a traditional text-box example for a semaphore.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock())
  2007-01-09 10:04                   ` Christoph Hellwig
@ 2007-01-10  1:34                     ` David Chinner
  0 siblings, 0 replies; 18+ messages in thread
From: David Chinner @ 2007-01-10  1:34 UTC (permalink / raw)
  To: Christoph Hellwig, Andrew Morton, Eric Sandeen, David Chinner,
	linux-kernel Mailing List, xfs

On Tue, Jan 09, 2007 at 10:04:20AM +0000, Christoph Hellwig wrote:
> On Mon, Jan 08, 2007 at 07:51:27PM -0800, Andrew Morton wrote:
> > I don't even know what code we're talking about here...
> > 
> > I'm under the impression that XFS will return to userspace with a
> > filesystem lock held, under the expectation (ie: requirement) that
> > userspace will later come in and release that lock.
> > 
> > If that's not true, then what _is_ happening in there?
> > 
> > If that _is_ true then, well, that sucks a bit.
> 
> It's not a filesystem lock.  It's a per-blockdevice lock that prevents
> multiple people from freezing the filesystem at the same time, aswell
> as providing exclusion between a frozen filesystem an mount-related
> activity.  It's a traditional text-box example for a semaphore.

This can be done without needing to hold a semaphore across the
freeze/thaw.

In the XFS case, we never try to lock the semaphore a second
time - the freeze code checks if the filesystem is not already
(being) frozen before calling freeze_bdev(). On thaw it also
checks that the filesystem is frozen before calling thaw_bdev().

IOWs, you can safely do:

# xfs_freeze -f /dev/sda1; xfs_freeze -f /dev/sda1; xfs_freeze -f /dev/sda1;
# xfs_freeze -u /dev/sda1; xfs_freeze -u /dev/sda1; xfs_freeze -u /dev/sda1;

And the filesystem will only be frozen once and thawed once. The second
and subsequent incantations of the freeze/thaw are effectively ignored
and don't block.

IMO, if we need to prevent certain operations from occurring when the
filesystem is frozen, those operations need to explicitly check the
frozen state and block i.e. do something like:

	wait_event(sb->s_wait_unfrozen, (sb->s_frozen < SB_FREEZE_WRITE));

If you need to prevent unmounts from occurring while snapshotting a
frozen filesystem, then the snapshot code needs to take the s_umount
semaphore while the snapshot is in progress. We should not be making
frozen filesystems unmountable....

Thoughts?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-01-10  1:35 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-01-04  0:14 xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock() Sami Farin
2007-01-07 21:37 ` David Chinner
2007-01-08 11:03   ` Sami Farin
2007-01-08 16:40     ` Eric Sandeen
2007-01-08 23:47       ` bd_mount_mutex -> bd_mount_sem (was Re: xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock()) David Chinner
2007-01-09  0:19         ` Andrew Morton
2007-01-09  3:12           ` Eric Sandeen
2007-01-09  3:18             ` Andrew Morton
2007-01-09  3:38               ` Eric Sandeen
2007-01-09  3:51                 ` Andrew Morton
2007-01-09  4:17                   ` Nathan Scott
2007-01-09  4:49                     ` David Chinner
2007-01-09  6:02                       ` [**BULK SPAM**] " Nathan Scott
2007-01-09 10:04                   ` Christoph Hellwig
2007-01-10  1:34                     ` David Chinner
2007-01-09 10:02           ` Christoph Hellwig
2007-01-08 23:56   ` xfs_file_ioctl / xfs_freeze: BUG: warning at kernel/mutex-debug.c:80/debug_mutex_unlock() Andrew Morton
2007-01-09  6:41     ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).