* Re: Fw: 2.6.6-rc3 ia64 smp_call_function() called with interrupts disabled
       [not found] <20040502214525.5ad05bed.akpm@osdl.org>
@ 2004-05-03 12:29 ` Matthew Wilcox
  2004-05-03 20:35   ` Matthew Wilcox
  0 siblings, 1 reply; 4+ messages in thread
From: Matthew Wilcox @ 2004-05-03 12:29 UTC
  To: Andrew Morton; +Cc: linux-scsi, kaos, linux-kernel, linux-ia64

On Sun, May 02, 2004 at 09:45:25PM -0700, Andrew Morton wrote:
> Begin forwarded message:
> 
> Date: Mon, 03 May 2004 12:39:44 +1000
> From: Keith Owens <kaos@sgi.com>
> To: linux-kernel@vger.kernel.org
> Subject: 2.6.6-rc3 ia64 smp_call_function() called with interrupts disabled
> 
> 
> 2.6.6-rc3, modprobe sg calls vfree() with interrupts disabled.  On
> ia64, vfree calls smp_flush_tlb_all() which calls smp_call_function().
> Calling smp_call_function() with interrupts disabled can deadlock.
> 
> Badness in smp_call_function at arch/ia64/kernel/smp.c:312
> 
> Call Trace:
>  [<a000000100142fb0>] __vunmap+0x50/0x1e0
>                                 sp=e00001307811fe30 bsp=e0000130781190a0
>  [<a0000002002ae4d0>] sg_add+0x2d0/0xbe0 [sg]
>                                 sp=e00001307811fe30 bsp=e000013078119038

How about the following patch?  Note that vfree() handles a NULL argument,
so it's not necessary to check the pointer.
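
The idea behind the fix can be sketched in userspace C.  This is only an
illustration, not the sg.c code: table, table_max, table_lock and
table_grow() are hypothetical stand-ins, a pthread mutex plays the role of
the irq-disabling lock, and free() plays the role of vfree().  The old
array is only remembered while the lock is held and is released after the
lock has been dropped.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for sg_dev_arr, sg_dev_max and sg_dev_arr_lock. */
static void **table;
static size_t table_max;
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Grow the table to new_max slots.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int table_grow(size_t new_max)
{
	void **bigger, **stale;

	/* Allocate before taking the lock. */
	bigger = calloc(new_max, sizeof(*bigger));
	if (!bigger)
		return -1;

	pthread_mutex_lock(&table_lock);
	if (new_max > table_max) {
		if (table_max)
			memcpy(bigger, table, table_max * sizeof(*bigger));
		stale = table;		/* remember it, do not free it here */
		table = bigger;
		table_max = new_max;
	} else {
		stale = bigger;		/* table is already large enough */
	}
	pthread_mutex_unlock(&table_lock);

	/* Free only after the lock is dropped; free(NULL) is a no-op. */
	free(stale);
	return 0;
}
```

Like vfree(), free() accepts NULL, so the first call (when there is no
old table yet) needs no special case.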

Index: drivers/scsi/sg.c
===================================================================
RCS file: /var/cvs/linux-2.6/drivers/scsi/sg.c,v
retrieving revision 1.13
diff -u -p -r1.13 sg.c
--- a/drivers/scsi/sg.c	15 Apr 2004 18:04:45 -0000	1.13
+++ b/drivers/scsi/sg.c	3 May 2004 12:28:12 -0000
@@ -1341,6 +1341,7 @@ sg_add(struct class_device *cl_dev)
 	Sg_device *sdp = NULL;
 	unsigned long iflags;
 	struct cdev * cdev = NULL;
+	void *old_sg_dev_arr = NULL;
 	int k, error;
 
 	disk = alloc_disk(1);
@@ -1368,7 +1369,7 @@ sg_add(struct class_device *cl_dev)
 		memset(tmp_da, 0, tmp_dev_max * sizeof (Sg_device *));
 		memcpy(tmp_da, sg_dev_arr,
 		       sg_dev_max * sizeof (Sg_device *));
-		vfree((char *) sg_dev_arr);
+		old_sg_dev_arr = sg_dev_arr;
 		sg_dev_arr = tmp_da;
 		sg_dev_max = tmp_dev_max;
 	}
@@ -1384,8 +1385,7 @@ find_empty_slot:
 		       " type=%d, minor number exceeds %d\n",
 		       scsidp->host->host_no, scsidp->channel, scsidp->id,
 		       scsidp->lun, scsidp->type, SG_MAX_DEVS - 1);
-		if (NULL != sdp)
-			vfree((char *) sdp);
+		vfree(sdp);
 		error = -ENODEV;
 		goto out;
 	}
@@ -1459,6 +1459,7 @@ find_empty_slot:
 	return 0;
 
 out:
+	vfree(old_sg_dev_arr);
 	put_disk(disk);
 	if (cdev)
 		cdev_del(cdev);

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain


* Re: Fw: 2.6.6-rc3 ia64 smp_call_function() called with interrupts disabled
  2004-05-03 12:29 ` Fw: 2.6.6-rc3 ia64 smp_call_function() called with interrupts disabled Matthew Wilcox
@ 2004-05-03 20:35   ` Matthew Wilcox
  2004-05-04  9:41     ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Matthew Wilcox @ 2004-05-03 20:35 UTC
  To: Matthew Wilcox; +Cc: Andrew Morton, linux-scsi, kaos, linux-kernel, linux-ia64

On Mon, May 03, 2004 at 01:29:48PM +0100, Matthew Wilcox wrote:
> How about the following patch?  Note that vfree() handles a NULL argument,
> so it's not necessary to check the pointer.

That patch is crap -- it only frees the memory on the error path, not
the normal exit.  Since I got confused by this function, it struck me
as not unreasonable that somebody else might also get confused by it,
so I split it into two parts.

I simplified some of the code.  The old code took the lock, scanned
through looking for a free slot, dropped the lock, allocated an sdp,
grabbed the lock and checked the slot was still free, branching back
if it had raced.  This rewrite assumes that we will find a slot and
allocates an sdp in advance.
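
The "allocate in advance" approach can be sketched in userspace C.  This
is an illustration under assumptions, not the code in the patch: struct
dev, slots, MAX_SLOTS and dev_alloc() are hypothetical names, and a
pthread mutex stands in for sg_dev_arr_lock.  Because the object exists
before the lock is taken, the lock is never dropped in the middle of the
search and there is nothing to re-check.

```c
#include <pthread.h>
#include <stdlib.h>

#define MAX_SLOTS 4		/* hypothetical, stands in for sg_dev_max */

struct dev {			/* hypothetical stand-in for Sg_device */
	int minor;
};

static struct dev *slots[MAX_SLOTS];
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns the index of the claimed slot, or -1 on failure. */
static int dev_alloc(void)
{
	struct dev *d;
	int k;

	/* Optimistically allocate before taking the lock. */
	d = calloc(1, sizeof(*d));
	if (!d)
		return -1;

	pthread_mutex_lock(&slots_lock);
	for (k = 0; k < MAX_SLOTS; k++)
		if (!slots[k])
			break;
	if (k < MAX_SLOTS) {
		d->minor = k;
		slots[k] = d;
		d = NULL;	/* ownership has moved to the table */
	} else {
		k = -1;		/* no free slot */
	}
	pthread_mutex_unlock(&slots_lock);

	/* Non-NULL only when no slot was found; free(NULL) is a no-op. */
	free(d);
	return k;
}
```

The cost is one wasted allocation when the table turns out to be full,
which is the rare case.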

Does anybody like this patch?  It survived booting on my test box which
only has one scsi device.  More testing welcomed.

Index: drivers/scsi/sg.c
===================================================================
RCS file: /var/cvs/linux-2.6/drivers/scsi/sg.c,v
retrieving revision 1.13
diff -u -p -r1.13 sg.c
--- a/drivers/scsi/sg.c	15 Apr 2004 18:04:45 -0000	1.13
+++ b/drivers/scsi/sg.c	3 May 2004 20:27:36 -0000
@@ -1333,85 +1333,44 @@ static struct class_simple * sg_sysfs_cl
 
 static int sg_sysfs_valid = 0;
 
-static int
-sg_add(struct class_device *cl_dev)
+static int sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
 {
-	struct scsi_device *scsidp = to_scsi_device(cl_dev->dev);
-	struct gendisk *disk;
-	Sg_device *sdp = NULL;
+	Sg_device *sdp;
 	unsigned long iflags;
-	struct cdev * cdev = NULL;
+	void *old_sg_dev_arr = NULL;
 	int k, error;
 
-	disk = alloc_disk(1);
-	if (!disk)
+	sdp = vmalloc(sizeof(Sg_device));
+	if (!sdp)
 		return -ENOMEM;
 
-	cdev = cdev_alloc();
-	if (! cdev)
-		return -ENOMEM;
 	write_lock_irqsave(&sg_dev_arr_lock, iflags);
-	if (sg_nr_dev >= sg_dev_max) {	/* try to resize */
+	if (unlikely(sg_nr_dev >= sg_dev_max)) {	/* try to resize */
 		Sg_device **tmp_da;
 		int tmp_dev_max = sg_nr_dev + SG_DEV_ARR_LUMP;
-
 		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
-		tmp_da = (Sg_device **)vmalloc(
-				tmp_dev_max * sizeof(Sg_device *));
-		if (NULL == tmp_da) {
-			printk(KERN_ERR
-			       "sg_add: device array cannot be resized\n");
-			error = -ENOMEM;
-			goto out;
-		}
+
+		tmp_da = vmalloc(tmp_dev_max * sizeof(Sg_device *));
+		if (unlikely(!tmp_da))
+			goto expand_failed;
+
 		write_lock_irqsave(&sg_dev_arr_lock, iflags);
-		memset(tmp_da, 0, tmp_dev_max * sizeof (Sg_device *));
-		memcpy(tmp_da, sg_dev_arr,
-		       sg_dev_max * sizeof (Sg_device *));
-		vfree((char *) sg_dev_arr);
+		memset(tmp_da, 0, tmp_dev_max * sizeof(Sg_device *));
+		memcpy(tmp_da, sg_dev_arr, sg_dev_max * sizeof(Sg_device *));
+		old_sg_dev_arr = sg_dev_arr;
 		sg_dev_arr = tmp_da;
 		sg_dev_max = tmp_dev_max;
 	}
 
-find_empty_slot:
 	for (k = 0; k < sg_dev_max; k++)
 		if (!sg_dev_arr[k])
 			break;
-	if (k >= SG_MAX_DEVS) {
-		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
-		printk(KERN_WARNING
-		       "Unable to attach sg device <%d, %d, %d, %d>"
-		       " type=%d, minor number exceeds %d\n",
-		       scsidp->host->host_no, scsidp->channel, scsidp->id,
-		       scsidp->lun, scsidp->type, SG_MAX_DEVS - 1);
-		if (NULL != sdp)
-			vfree((char *) sdp);
-		error = -ENODEV;
-		goto out;
-	}
-	if (k < sg_dev_max) {
-		if (NULL == sdp) {
-			write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
-			sdp = (Sg_device *)vmalloc(sizeof(Sg_device));
-			write_lock_irqsave(&sg_dev_arr_lock, iflags);
-			if (!sg_dev_arr[k])
-				goto find_empty_slot;
-		}
-	} else
-		sdp = NULL;
-	if (NULL == sdp) {
-		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
-		printk(KERN_ERR "sg_add: Sg_device cannot be allocated\n");
-		error = -ENOMEM;
-		goto out;
-	}
+	if (unlikely(k >= SG_MAX_DEVS))
+		goto overflow;
 
-	SCSI_LOG_TIMEOUT(3, printk("sg_add: dev=%d \n", k));
 	memset(sdp, 0, sizeof(*sdp));
+	SCSI_LOG_TIMEOUT(3, printk("sg_add: dev=%d \n", k));
 	sprintf(disk->disk_name, "sg%d", k);
-	cdev->owner = THIS_MODULE;
-	cdev->ops = &sg_fops;
-	disk->major = SCSI_GENERIC_MAJOR;
 	disk->first_minor = k;
 	sdp->disk = disk;
 	sdp->device = scsidp;
@@ -1421,6 +1380,55 @@ find_empty_slot:
 	sg_nr_dev++;
 	sg_dev_arr[k] = sdp;
 	write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
+	error = k;
+
+ out:
+	if (error < 0)
+		vfree(sdp);
+	vfree(old_sg_dev_arr);
+	return error;
+
+ expand_failed:
+	printk(KERN_ERR "sg_add: device array cannot be resized\n");
+	error = -ENOMEM;
+	goto out;
+
+ overflow:
+	write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
+	printk(KERN_WARNING
+	       "Unable to attach sg device <%d, %d, %d, %d> type=%d, minor "
+	       "number exceeds %d\n", scsidp->host->host_no, scsidp->channel,
+	       scsidp->id, scsidp->lun, scsidp->type, SG_MAX_DEVS - 1);
+	error = -ENODEV;
+	goto out;
+}
+
+static int
+sg_add(struct class_device *cl_dev)
+{
+	struct scsi_device *scsidp = to_scsi_device(cl_dev->dev);
+	struct gendisk *disk;
+	Sg_device *sdp = NULL;
+	struct cdev * cdev = NULL;
+	int error, k;
+
+	disk = alloc_disk(1);
+	if (!disk)
+		return -ENOMEM;
+	disk->major = SCSI_GENERIC_MAJOR;
+
+	error = -ENOMEM;
+	cdev = cdev_alloc();
+	if (!cdev)
+		goto out;
+	cdev->owner = THIS_MODULE;
+	cdev->ops = &sg_fops;
+
+	error = sg_alloc(disk, scsidp);
+	if (error < 0)
+		goto out;
+	k = error;
+	sdp = sg_dev_arr[k];
 
 	devfs_mk_cdev(MKDEV(SCSI_GENERIC_MAJOR, k),
 			S_IFCHR | S_IRUSR | S_IWUSR | S_IRGRP,



* Re: Fw: 2.6.6-rc3 ia64 smp_call_function() called with interrupts disabled
  2004-05-03 20:35   ` Matthew Wilcox
@ 2004-05-04  9:41     ` Christoph Hellwig
  2004-05-14 20:00       ` Patrick Mansfield
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2004-05-04  9:41 UTC
  To: Matthew Wilcox; +Cc: Andrew Morton, linux-scsi, kaos, linux-kernel, linux-ia64

On Mon, May 03, 2004 at 09:35:13PM +0100, Matthew Wilcox wrote:
> That patch is crap -- it only frees the memory on the error path, not
> the normal exit.  Since I got confused by this function, it struck me
> as not unreasonable that somebody else might also get confused by it,
> so I split it into two parts.
> 
> I simplified some of the code.  The old code took the lock, scanned
> through looking for a free slot, dropped the lock, allocated an sdp,
> grabbed the lock and checked the slot was still free, branching back
> if it had raced.  This rewrite assumes that we will find a slot and
> allocates an sdp in advance.
> 
> Does anybody like this patch?  It survived booting on my test box which
> only has one scsi device.  More testing welcomed.

Better than what was there, but I still don't like it.  A global array
of devices is just utter crap.  Every entry point from scsi already has
struct scsi_device from which we can derive the sg-specific portion easily,
and for anything else (from a quick look that seems to be only procfs
stuff which should fade out anyway) a linear search on a linked list
is okay.
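
A rough sketch of the linked-list alternative, in plain C.  Everything
here is hypothetical (struct scsi_dev, struct sg_dev, sg_list and the
helper names are stand-ins, not existing sg.c symbols), and locking is
left out to keep it short; real code would need to protect the list.

```c
#include <stddef.h>

struct scsi_dev {		/* hypothetical stand-in for struct scsi_device */
	int id;
};

struct sg_dev {			/* hypothetical stand-in for Sg_device */
	struct scsi_dev *device;
	int minor;
	struct sg_dev *next;
};

static struct sg_dev *sg_list;	/* head of the list, replaces the array */

/* Push a device onto the front of the list. */
static void sg_list_add(struct sg_dev *sdp)
{
	sdp->next = sg_list;
	sg_list = sdp;
}

/* Linear search by minor number, e.g. for procfs-style lookups. */
static struct sg_dev *sg_find_by_minor(int minor)
{
	struct sg_dev *sdp;

	for (sdp = sg_list; sdp; sdp = sdp->next)
		if (sdp->minor == minor)
			return sdp;
	return NULL;
}

/* Linear search by the underlying SCSI device. */
static struct sg_dev *sg_find_by_device(const struct scsi_dev *dev)
{
	struct sg_dev *sdp;

	for (sdp = sg_list; sdp; sdp = sdp->next)
		if (sdp->device == dev)
			return sdp;
	return NULL;
}

/* Unlink a device; returns 0 if it was on the list, -1 otherwise. */
static int sg_list_del(struct sg_dev *victim)
{
	struct sg_dev **pp;

	for (pp = &sg_list; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			return 0;
		}
	}
	return -1;
}
```

With a list there is no array to resize, so the free-under-lock problem
from the original report does not come up on this path.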

btw, why are we vmalloc()ing Sg_device?


* Re: Fw: 2.6.6-rc3 ia64 smp_call_function() called with interrupts disabled
  2004-05-04  9:41     ` Christoph Hellwig
@ 2004-05-14 20:00       ` Patrick Mansfield
  0 siblings, 0 replies; 4+ messages in thread
From: Patrick Mansfield @ 2004-05-14 20:00 UTC
  To: Christoph Hellwig, Matthew Wilcox, Andrew Morton, linux-scsi,
	kaos, linux-kernel, linux-ia64

Christoph -

On Tue, May 04, 2004 at 10:41:43AM +0100, Christoph Hellwig wrote:

> Better than what was there, but I still don't like it.  A global array
> of devices is just utter crap.  Every entry point from scsi already has
> struct scsi_device from which we can derive the sg-specific portion easily,
> and for anything else (from a quick look that seems to be only procfs
> stuff which should fade out anyway) a linear search on a linked list
> is okay.
> 
> btw, why are we vmalloc()ing Sg_device?

With Doug's latest version of the patch, and changing the vmalloc of
Sg_device to a kmalloc, I was able to get sg to attach to 16k devices.
(I'm still debugging major/minor issues, hopefully just userspace stuff.)

I was trying to get rid of the sg_dev_arr, but there is no connection from
a scsi_device to a Sg_device, there is only the pointer from Sg_device to
scsi_device. The sg simple class class_data is set but never used
(class_set_devdata is used but not class_get_devdata).

We have a scsi_device class_data that could store the Sg_device, but that
is a bit of a hack, since it is supposed to hold scsi core data, and in
theory we could have multiple scsi_device class interfaces (st and any
other upper level character drivers would not use this).

There is no cdev private_data, similar to the block_device gendisk
private data, so we can't use that in sg_open. The other upper level char
devices (st) could also use this, as well as other char drivers. I am told
this is a 2.7 item.

Do you think we should do anything else (besides losing the one vmalloc)
here for sg in 2.6?

Specifically -

Do you think we should try for a cdev private_data? With only this added,
we could have a global list of all Sg_devices, and not have to do a linear
search on open (I don't know how bad that would be for large numbers of
devices). We would still need a linear search of the list on removal (not
that bad).
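
To make the question concrete, here is a minimal sketch of what a
per-device private pointer would buy on the open path.  It is purely
hypothetical: struct chardev and its private_data field are invented for
illustration (as noted above, cdev has no such field), and struct sg_dev
is a stand-in for Sg_device.

```c
#include <stddef.h>

struct sg_dev {			/* hypothetical stand-in for Sg_device */
	int minor;
};

/*
 * Hypothetical char-device object that carries a driver-owned pointer.
 * This is an assumption for illustration, not an existing kernel struct.
 */
struct chardev {
	int minor;
	void *private_data;
};

/* At attach time, record the driver object in the char device. */
static void chardev_bind(struct chardev *cd, struct sg_dev *sdp)
{
	cd->minor = sdp->minor;
	cd->private_data = sdp;
}

/* On open, follow the pointer instead of walking a list or array. */
static struct sg_dev *sg_from_chardev(const struct chardev *cd)
{
	return cd->private_data;
}
```

Open becomes a constant-time pointer dereference; only removal would
still have to walk the global list.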

Should we hack the scsi_device class_data to hold a Sg_device?

-- Patrick Mansfield


