LKML Archive on lore.kernel.org help / color / mirror / Atom feed
* Re: remove_proc_entry and read_proc @ 2007-02-01 16:09 Alexey Dobriyan 2007-02-02 7:31 ` Duncan Sands 0 siblings, 1 reply; 8+ messages in thread From: Alexey Dobriyan @ 2007-02-01 16:09 UTC (permalink / raw) To: Duncan Sands; +Cc: linux-kernel, adobriyan Duncan Sands wrote: > On Wednesday 31 January 2007 19:42:51 Alexey Dobriyan wrote: > > On Wed, Jan 31, 2007 at 11:54:35AM +0100, Duncan Sands wrote: > > > Can read_proc still be executing when remove_proc_entry returns? > > > > > > In my driver [*] I allocate some data and create a proc entry using > > > create_proc_entry. My read method reads from my allocated data. When > > > shutting down, I call remove_proc_entry and immediately free the data. > > > If some call to read_proc is still executing at this point then it will > > > be accessing freed memory. Can this happen? I've been rummaging around > > > in fs/proc to see what prevents it, but didn't find anything yet. > > > > This should be fixed by the following patch (in -mm currently): > > http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/fix-rmmod-read-write-races-in-proc-entries.patch > > > > Tell me if you're unsure it will. > > Excellent! But tell me, > > + atomic_inc(&dp->pde_users); > + if (!dp->proc_fops) > > don't you need a memory barrier between these two? Also a corresponding > one where proc_fops is set to NULL. I believe, barriers not needed, not now. This scheme relies on the fact that remove_proc_entry() will be the only place that will clear ->proc_fops and, once cleared, ->proc_fops will never be resurrected. Clearing of ->proc_fops will eventually propagate to CPU doing first check, thus preveting refcount bumps from this CPU. What can be missed is some "rogue" readers or writers¹. Big deal. > + /* > + * Stop accepting new readers/writers. If you're dynamically > + * allocating ->proc_fops, save a pointer somewhere. > + */ > + de->proc_fops = NULL; > + /* Wait until all readers/writers are done. */ > + if (atomic_read(&de->pde_users) > 0) { > + spin_unlock(&proc_subdir_lock); > + msleep(1); > + goto again; > + } > > I don't understand how this is supposed to work. Consider > > CPU1 CPU2 > > atomic_inc(&dp->pde_users); > if (dp->proc_fops) > de->proc_fops = NULL; > use_proc_fops <= BOOM > if (atomic_read(&de->pde_users) > 0) { > > what prevents dereference of a NULL proc_fops value? ¹ Sigh, modules should do removals of proc entries first. And I should check for that. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: remove_proc_entry and read_proc 2007-02-01 16:09 remove_proc_entry and read_proc Alexey Dobriyan @ 2007-02-02 7:31 ` Duncan Sands 2007-02-05 11:39 ` Alexey Dobriyan 0 siblings, 1 reply; 8+ messages in thread From: Duncan Sands @ 2007-02-02 7:31 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: linux-kernel, adobriyan Hi Alexey, > I believe, barriers not needed, not now. > > This scheme relies on the fact that remove_proc_entry() will be the only > place that will clear ->proc_fops and, once cleared, ->proc_fops will > never be resurrected. Clearing of ->proc_fops will eventually propagate > to CPU doing first check, thus preveting refcount bumps from this CPU. > What can be missed is some "rogue" readers or writers¹. Big deal. I don't understand you. Without memory barriers, remove_proc_entry will most of the time, but not all of the time, wait for all readers and writers to finish before exiting. Since the whole point of your patch was to ensure that all readers and writers finish before remove_proc_entry exits, I don't understand why you don't just put the memory barriers in and make it correct. Also, I do consider it a big deal: > ¹ Sigh, modules should do removals of proc entries first. And I should > check for that. Modules should of course call remove_proc_entry before exiting. However right now, even with your patch, a read or write method can still be running when remove_proc_entry returns [1], so could still be running when the module is removed (if they sleep; I guess this applies mostly to write methods). This is very bad - why not put in memory barriers and fix it? Also, plenty of proc read and write methods access private data that is allocated before calling create_proc_entry and freed after calling remove_proc_entry. If a read or write method is still running after remove_proc_entry returns, then it can access freed memory - very bad. Ciao, Duncan. [1] proc_get_inode does a try_module_get, so it is possible that module unloading is not a problem - not sure. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: remove_proc_entry and read_proc 2007-02-02 7:31 ` Duncan Sands @ 2007-02-05 11:39 ` Alexey Dobriyan 2007-02-05 12:05 ` Duncan Sands 0 siblings, 1 reply; 8+ messages in thread From: Alexey Dobriyan @ 2007-02-05 11:39 UTC (permalink / raw) To: Duncan Sands; +Cc: linux-kernel, adobriyan On Fri, Feb 02, 2007 at 08:31:57AM +0100, Duncan Sands wrote: > > I believe, barriers not needed, not now. > > This scheme relies on the fact that remove_proc_entry() will be the only > > place that will clear ->proc_fops and, once cleared, ->proc_fops will > > never be resurrected. Clearing of ->proc_fops will eventually propagate > > to CPU doing first check, thus preveting refcount bumps from this CPU. > > What can be missed is some "rogue" readers or writers¹. Big deal. > > I don't understand you. Without memory barriers, remove_proc_entry will > most of the time, but not all of the time, wait for all readers and writers > to finish before exiting. Since the whole point of your patch was to ensure > that all readers and writers finish before remove_proc_entry exits, I don't > understand why you don't just put the memory barriers in and make it correct. Gee, thanks. I sat and wrote code side-by-side, and it looks like, even barriers won't fix anything, because they don't affect other CPUs. There will be new patch soon. ->proc_fops is valid ->proc_fops is valid ->pde_users is 0 ->pde_users is 0 ------------------------------------------------------------ if (!pde->proc_fops) goto out; ->proc_fops = NULL; if (atomic_read(->pde_users) > 0) goto again; | | atomic_inc(->pde_users); | | | V > Also, I do consider it a big deal: > > > ¹ Sigh, modules should do removals of proc entries first. And I should > > check for that. > > Modules should of course call remove_proc_entry before exiting. However > right now, even with your patch, a read or write method can still be > running when remove_proc_entry returns [1], so could still be running when > the module is removed (if they sleep; I guess this applies mostly to > write methods). This is very bad - why not put in memory barriers and > fix it? Also, plenty of proc read and write methods access private data > that is allocated before calling create_proc_entry and freed after calling > remove_proc_entry. If a read or write method is still running after > remove_proc_entry returns, then it can access freed memory - very bad. > [1] proc_get_inode does a try_module_get, so it is possible that module > unloading is not a problem - not sure. Modules forget to set ->owner sometimes. Also, it's still racy, because of the typical pde = create_proc_entry(); /* * * ->owner is NULL here, effectively, PDE without ->owner. * */ if (pde) pde->owner = THIS_MODULE; ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: remove_proc_entry and read_proc 2007-02-05 11:39 ` Alexey Dobriyan @ 2007-02-05 12:05 ` Duncan Sands 0 siblings, 0 replies; 8+ messages in thread From: Duncan Sands @ 2007-02-05 12:05 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: linux-kernel, adobriyan > Gee, thanks. I sat and wrote code side-by-side, and it looks like, even barriers > won't fix anything, because they don't affect other CPUs. ?! The whole point of memory barriers is that they affect other CPUs. Maybe you are thinking of compiler barriers? > ->proc_fops is valid ->proc_fops is valid > ->pde_users is 0 ->pde_users is 0 > ------------------------------------------------------------ > > > if (!pde->proc_fops) > goto out; > > ->proc_fops = NULL; > if (atomic_read(->pde_users) > 0) > goto again; > > | > | atomic_inc(->pde_users); > | > | > | > V The proc_fops check *before* the atomic_inc is indeed pointless (notice how I removed it in the patch I sent?). It's the one after the atomic_inc that prevents this race, but only if there is a memory barrier between the atomic_inc and the check... because otherwise they could be reordered (i.e. seen in reverse order by another CPU) giving the race. > Modules forget to set ->owner sometimes. Also, it's still racy, because > of the typical > > pde = create_proc_entry(); > /* > * > * ->owner is NULL here, effectively, PDE without ->owner. > * > */ > if (pde) > pde->owner = THIS_MODULE; As long as the module calls remove_proc_entry before being unloaded, this should be ok. Best wishes, Duncan. ^ permalink raw reply [flat|nested] 8+ messages in thread
* remove_proc_entry and read_proc @ 2007-01-31 10:54 Duncan Sands 2007-01-31 18:42 ` Alexey Dobriyan 0 siblings, 1 reply; 8+ messages in thread From: Duncan Sands @ 2007-01-31 10:54 UTC (permalink / raw) To: Linux Kernel Mailing List Can read_proc still be executing when remove_proc_entry returns? In my driver [*] I allocate some data and create a proc entry using create_proc_entry. My read method reads from my allocated data. When shutting down, I call remove_proc_entry and immediately free the data. If some call to read_proc is still executing at this point then it will be accessing freed memory. Can this happen? I've been rummaging around in fs/proc to see what prevents it, but didn't find anything yet. Thanks a lot, Duncan. [*] Actually it's the ATM layer that does all this (net/atm); my driver uses that layer. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: remove_proc_entry and read_proc 2007-01-31 10:54 Duncan Sands @ 2007-01-31 18:42 ` Alexey Dobriyan 2007-01-31 19:26 ` Duncan Sands 0 siblings, 1 reply; 8+ messages in thread From: Alexey Dobriyan @ 2007-01-31 18:42 UTC (permalink / raw) To: Duncan Sands; +Cc: linux-kernel On Wed, Jan 31, 2007 at 11:54:35AM +0100, Duncan Sands wrote: > Can read_proc still be executing when remove_proc_entry returns? > > In my driver [*] I allocate some data and create a proc entry using > create_proc_entry. My read method reads from my allocated data. When > shutting down, I call remove_proc_entry and immediately free the data. > If some call to read_proc is still executing at this point then it will > be accessing freed memory. Can this happen? I've been rummaging around > in fs/proc to see what prevents it, but didn't find anything yet. This should be fixed by the following patch (in -mm currently): http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/fix-rmmod-read-write-races-in-proc-entries.patch Tell me if you're unsure it will. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: remove_proc_entry and read_proc 2007-01-31 18:42 ` Alexey Dobriyan @ 2007-01-31 19:26 ` Duncan Sands 2007-02-01 10:15 ` Duncan Sands 0 siblings, 1 reply; 8+ messages in thread From: Duncan Sands @ 2007-01-31 19:26 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: linux-kernel On Wednesday 31 January 2007 19:42:51 Alexey Dobriyan wrote: > On Wed, Jan 31, 2007 at 11:54:35AM +0100, Duncan Sands wrote: > > Can read_proc still be executing when remove_proc_entry returns? > > > > In my driver [*] I allocate some data and create a proc entry using > > create_proc_entry. My read method reads from my allocated data. When > > shutting down, I call remove_proc_entry and immediately free the data. > > If some call to read_proc is still executing at this point then it will > > be accessing freed memory. Can this happen? I've been rummaging around > > in fs/proc to see what prevents it, but didn't find anything yet. > > This should be fixed by the following patch (in -mm currently): > http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/fix-rmmod-read-write-races-in-proc-entries.patch > > Tell me if you're unsure it will. Excellent! But tell me, + atomic_inc(&dp->pde_users); + if (!dp->proc_fops) don't you need a memory barrier between these two? Also a corresponding one where proc_fops is set to NULL. + /* + * Stop accepting new readers/writers. If you're dynamically + * allocating ->proc_fops, save a pointer somewhere. + */ + de->proc_fops = NULL; + /* Wait until all readers/writers are done. */ + if (atomic_read(&de->pde_users) > 0) { + spin_unlock(&proc_subdir_lock); + msleep(1); + goto again; + } I don't understand how this is supposed to work. Consider CPU1 CPU2 atomic_inc(&dp->pde_users); if (dp->proc_fops) de->proc_fops = NULL; use_proc_fops <= BOOM if (atomic_read(&de->pde_users) > 0) { what prevents dereference of a NULL proc_fops value? Best wishes, Duncan. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: remove_proc_entry and read_proc 2007-01-31 19:26 ` Duncan Sands @ 2007-02-01 10:15 ` Duncan Sands 0 siblings, 0 replies; 8+ messages in thread From: Duncan Sands @ 2007-02-01 10:15 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: linux-kernel > I don't understand how this is supposed to work. Consider > > CPU1 CPU2 > > atomic_inc(&dp->pde_users); > if (dp->proc_fops) > de->proc_fops = NULL; > use_proc_fops <= BOOM > if (atomic_read(&de->pde_users) > 0) { > > what prevents dereference of a NULL proc_fops value? I was wrong: what prevents it is that proc_fops isn't used at all in these paths! However it is used in proc_get_inode thusly: if (de->proc_fops) inode->i_fop = de->proc_fops; What if proc_fops is set to NULL between these two? Putting it into a local variable should take care of this one, but since I'm not sure if it's really needed I'll leave that to you! Otherwise, here's a patch that adds memory barriers (maybe these can be SMP memory barriers) since the atomic ops you are using do not imply such barriers according to Documentation/atomic_ops.txt. The memory barriers are needed, since you need to know that if you saw a non-NULL proc_fops, then remove_proc_entry saw your increased pde_users count. If the proc_fops write was re-ordered after the pde_users read, or the proc_fops read was re-ordered before the pde_users write, then this may not be true. Also, I removed the early checks for NULL proc_fops. True, they avoid an atomic operation and a memory barrier, however they only avoid them in the error path, when -EIO is going to be returned, which is hardly a fast path. In the common case, they represent a pointless check. Ciao, Duncan. PS: Compiles, but otherwise untested. Index: Linux/fs/proc/generic.c =================================================================== --- Linux.orig/fs/proc/generic.c 2006-12-09 17:18:32.000000000 +0100 +++ Linux/fs/proc/generic.c 2007-02-01 10:54:36.000000000 +0100 @@ -19,6 +19,7 @@ #include <linux/idr.h> #include <linux/namei.h> #include <linux/bitops.h> +#include <linux/delay.h> #include <linux/spinlock.h> #include <asm/uaccess.h> @@ -76,6 +77,19 @@ if (!(page = (char*) __get_free_page(GFP_KERNEL))) return -ENOMEM; + /* + * We are going to call into module's code via ->get_info or + * ->read_proc. Bump refcount so that remove_proc_entry() will + * wait for read to complete. + */ + atomic_inc(&dp->pde_users); + mb(); + if (!dp->proc_fops) + /* + * remove_proc_entry() marked PDE as "going away". Obey. + */ + goto out_dec; + while ((nbytes > 0) && !eof) { count = min_t(size_t, PROC_BLOCK_SIZE, nbytes); @@ -195,6 +209,8 @@ buf += n; retval += n; } +out_dec: + atomic_dec(&dp->pde_users); free_page((unsigned long) page); return retval; } @@ -205,14 +221,28 @@ { struct inode *inode = file->f_path.dentry->d_inode; struct proc_dir_entry * dp; + ssize_t rv; dp = PDE(inode); if (!dp->write_proc) return -EIO; - /* FIXME: does this routine need ppos? probably... */ - return dp->write_proc(file, buffer, count, dp->data); + rv = -EIO; + /* + * We are going to call into module's code via ->write_proc . + * Bump refcount so that module won't dissapear while ->write_proc + * sleeps in copy_from_user(). remove_proc_entry() will wait for + * write to complete. + */ + atomic_inc(&dp->pde_users); + mb(); + if (dp->proc_fops) + /* PDE is ready, refcount bumped, call into module. */ + /* FIXME: does this routine need ppos? probably... */ + rv = dp->write_proc(file, buffer, count, dp->data); + atomic_dec(&dp->pde_users); + return rv; } @@ -717,12 +747,26 @@ if (!parent && xlate_proc_name(name, &parent, &fn) != 0) goto out; len = strlen(fn); - +again: spin_lock(&proc_subdir_lock); for (p = &parent->subdir; *p; p=&(*p)->next ) { if (!proc_match(len, fn, *p)) continue; de = *p; + + /* + * Stop accepting new readers/writers. If you're dynamically + * allocating ->proc_fops, save a pointer somewhere. + */ + de->proc_fops = NULL; + mb(); + /* Wait until all readers/writers are done. */ + if (atomic_read(&de->pde_users) > 0) { + spin_unlock(&proc_subdir_lock); + msleep(1); + goto again; + } + *p = de->next; de->next = NULL; if (S_ISDIR(de->mode)) Index: Linux/include/linux/proc_fs.h =================================================================== --- Linux.orig/include/linux/proc_fs.h 2006-10-03 15:39:31.000000000 +0200 +++ Linux/include/linux/proc_fs.h 2007-02-01 10:42:07.000000000 +0100 @@ -56,6 +56,19 @@ gid_t gid; loff_t size; struct inode_operations * proc_iops; + /* + * NULL ->proc_fops means "PDE is going away RSN" or + * "PDE is just created". In either case ->get_info, ->read_proc, + * ->write_proc won't be called because it's too late or too early, + * respectively. + * + * Valid ->proc_fops means "use this file_operations" or + * "->data is setup, it's safe to call ->read_proc, ->write_proc which + * dereference it". + * + * If you're allocating ->proc_fops dynamically, save a pointer + * somewhere. + */ const struct file_operations * proc_fops; get_info_t *get_info; struct module *owner; @@ -66,6 +79,8 @@ atomic_t count; /* use count */ int deleted; /* delete flag */ void *set; + atomic_t pde_users; /* number of readers + number of writers via + * ->read_proc, ->write_proc, ->get_info */ }; struct kcore_list { ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-02-05 12:05 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-02-01 16:09 remove_proc_entry and read_proc Alexey Dobriyan 2007-02-02 7:31 ` Duncan Sands 2007-02-05 11:39 ` Alexey Dobriyan 2007-02-05 12:05 ` Duncan Sands -- strict thread matches above, loose matches on Subject: below -- 2007-01-31 10:54 Duncan Sands 2007-01-31 18:42 ` Alexey Dobriyan 2007-01-31 19:26 ` Duncan Sands 2007-02-01 10:15 ` Duncan Sands
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).