LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [BUG] __copy_to_user_inatomic broken on non Pentium machines
@ 2007-03-25 8:07 Thomas Gleixner
2007-03-25 18:14 ` Linus Torvalds
0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2007-03-25 8:07 UTC (permalink / raw)
To: LKML
Cc: Stable Kernel Team, Linus Torvalds, Ingo Molnar, Andrew Morton,
Adrian Bunk
Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
Last known working kernel: 2.6.18 (did not try 2.6.19 yet)
Enabling CONFIG_PREEMPT on latest mainline as well as 2.6.20 trigger
[ 14.150000] BUG: sleeping function called from invalid context at /home/tglx/work/kernel/vanilla/linux-2.6.20/kernel/rwsem.c:20
[ 14.160000] in_atomic():1, irqs_disabled():0
[ 14.160000] no locks held by init/1.
[ 14.170000] [<c0103346>] show_trace_log_lvl+0x1a/0x2f
[ 14.180000] [<c0103441>] show_trace+0x12/0x14
[ 14.190000] [<c0103cf5>] dump_stack+0x16/0x18
[ 14.190000] [<c010aa62>] __might_sleep+0xc7/0xcd
[ 14.200000] [<c01213a1>] down_read+0x18/0x47
[ 14.210000] [<c01a01e4>] __copy_to_user_ll+0x5e/0x1b6
[ 14.220000] [<c012cf85>] file_read_actor+0x10b/0x149
[ 14.230000] [<c012d7b2>] do_generic_mapping_read+0x187/0x433
[ 14.240000] [<c012f64b>] generic_file_aio_read+0x191/0x1ca
[ 14.240000] [<c0141657>] do_sync_read+0xc2/0xff
[ 14.250000] [<c0141eb6>] vfs_read+0x90/0x145
[ 14.260000] [<c014227e>] sys_read+0x3f/0x63
[ 14.270000] [<c0102fb0>] syscall_call+0x7/0xb
[ 14.270000] =======================
and
[ 22.660000] BUG: scheduling while atomic: e2fsck/0x10000001/272
[ 22.670000] 1 lock held by e2fsck/272:
[ 22.680000] #0: (&mm->mmap_sem){----}, at: [<c01a01e4>] __copy_to_user_ll+0x5e/0x1b6
[ 22.690000] [<c0103346>] show_trace_log_lvl+0x1a/0x2f
[ 22.700000] [<c0103441>] show_trace+0x12/0x14
[ 22.710000] [<c0103cf5>] dump_stack+0x16/0x18
[ 22.720000] [<c024a189>] __sched_text_start+0x71/0x57f
[ 22.720000] [<c010b49f>] __cond_resched+0x21/0x3b
[ 22.730000] [<c024aca7>] cond_resched+0x26/0x31
[ 22.740000] [<c0137ae5>] get_user_pages+0x1e1/0x23c
[ 22.750000] [<c01a021e>] __copy_to_user_ll+0x98/0x1b6
[ 22.760000] [<c012cf85>] file_read_actor+0x10b/0x149
[ 22.770000] [<c012d7b2>] do_generic_mapping_read+0x187/0x433
[ 22.780000] [<c012f64b>] generic_file_aio_read+0x191/0x1ca
[ 22.790000] [<c0141657>] do_sync_read+0xc2/0xff
[ 22.790000] [<c0141eb6>] vfs_read+0x90/0x145
[ 22.800000] [<c014227e>] sys_read+0x3f/0x63
[ 22.810000] [<c0102fb0>] syscall_call+0x7/0xb
[ 22.820000] =======================
which is not surprising.
int file_read_actor(read_descriptor_t *desc, struct page *page,
unsigned long offset, unsigned long size)
{
....
/*
* Faults on the destination of a read are common, so do it before
* taking the kmap.
*/
if (!fault_in_pages_writeable(desc->arg.buf, size)) {
kaddr = kmap_atomic(page, KM_USER0);
----> left = __copy_to_user_inatomic(desc->arg.buf,
kaddr + offset, size);
is called with preempt_count == 1, due to the kmap_atomic() above.
Now __copy_to_user_ll() takes the (boot_cpu_data.wp_works_ok == 0) path,
which in turn calls
down_read(current->mm->mmap_sem) - which might sleep
and
get_user_pages() - which has a cond_resched() inside.
Not sure how to fix that.
tglx
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
2007-03-25 8:07 [BUG] __copy_to_user_inatomic broken on non Pentium machines Thomas Gleixner
@ 2007-03-25 18:14 ` Linus Torvalds
2007-03-25 18:30 ` Ingo Molnar
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Linus Torvalds @ 2007-03-25 18:14 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Stable Kernel Team, Ingo Molnar, Andrew Morton, Adrian Bunk
On Sun, 25 Mar 2007, Thomas Gleixner wrote:
>
> Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I think
only the original i386 had this bug ("feature").
But I agree, it does seem to be broken on such machines (I assume you
don't actually have one, but just tested by forcing it by hand ;)
> Now __copy_to_user_ll() takes the (boot_cpu_data.wp_works_ok == 0) path,
> which in turn calls
>
> down_read(current->mm->mmap_sem) - which might sleep
>
> and
>
> get_user_pages() - which has a cond_resched() inside.
>
> Not sure how to fix that.
I agree. Nasty. But the thing is, it's actually much worse. We use
"__put_user()" earlier to try to fault it in writably, and that one is
totally broken on a CPU where wp_works_ok isn't set.
The whole notion that we should do this at access time is broken.
We should go back to doing it at "access_ok()", or we should just state
that we don't support original-i386 CPU's any more. As it is, we don't do
it right *anyway*, since we only do the tests properly in
__copy_to_user(), and totally miss them in __put_user() and friends.
So it's buggy on i386 however you try to fix it. The only way to fix it
properly is to move the i386 fixup early, into "access_ok()", the way it
used to be.
Linus
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
2007-03-25 18:14 ` Linus Torvalds
@ 2007-03-25 18:30 ` Ingo Molnar
2007-03-25 19:41 ` Thomas Gleixner
2007-04-02 12:25 ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
2 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2007-03-25 18:30 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, LKML, Stable Kernel Team, Andrew Morton, Adrian Bunk
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Sun, 25 Mar 2007, Thomas Gleixner wrote:
> >
> > Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
>
> This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I
> think only the original i386 had this bug ("feature").
>
> But I agree, it does seem to be broken on such machines (I assume you
> don't actually have one, but just tested by forcing it by hand ;)
actually, AFAIK this is a genuine i386 box Thomas has (an embedded
board). Our hardware legacies and the resulting dependencies _really_
stick around for quite long time :-/
Ingo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
2007-03-25 18:14 ` Linus Torvalds
2007-03-25 18:30 ` Ingo Molnar
@ 2007-03-25 19:41 ` Thomas Gleixner
2007-04-02 12:25 ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
2 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2007-03-25 19:41 UTC (permalink / raw)
To: Linus Torvalds
Cc: LKML, Stable Kernel Team, Ingo Molnar, Andrew Morton, Adrian Bunk
On Sun, 2007-03-25 at 11:14 -0700, Linus Torvalds wrote:
> > Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
>
> This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I think
> only the original i386 had this bug ("feature").
>
> But I agree, it does seem to be broken on such machines (I assume you
> don't actually have one, but just tested by forcing it by hand ;)
Yes, it's a genuine i386 embedded system and AFAIK the same feature is
available on 486 clones. i386 and Co are still in used in the embedded
space.
tglx
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems
2007-03-25 18:14 ` Linus Torvalds
2007-03-25 18:30 ` Ingo Molnar
2007-03-25 19:41 ` Thomas Gleixner
@ 2007-04-02 12:25 ` Thomas Gleixner
2007-04-02 12:34 ` Ingo Molnar
2007-04-13 23:06 ` Adrian Bunk
2 siblings, 2 replies; 7+ messages in thread
From: Thomas Gleixner @ 2007-04-02 12:25 UTC (permalink / raw)
To: Linus Torvalds
Cc: LKML, Stable Kernel Team, Ingo Molnar, Andrew Morton,
Adrian Bunk, Manfred Spraul
The __copy_to_user_inatomic() calls in file_read_actor() and pipe_read()
are broken on original i386 machines, where WP-works-ok == false, as
__copy_to_user_inatomic() on such systems calls functions which might
sleep and/or contain cond_resched() calls inside of a kmap_atomic()
region.
The original check for WP-works-ok was in access_ok(), but got moved
during the 2.5 series to fix a race vs. swap.
Return the number of bytes to copy in the case where we are in an atomic
region, so the non atomic code pathes in file_read_actor() and
pipe_read() are taken.
This could be optimized to avoid the kmap_atomicby moving the check for
WP-works-ok into fault_in_pages_writeable(), but this is more intrusive
and can be done later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
diff --git a/arch/i386/lib/usercopy.c b/arch/i386/lib/usercopy.c
index d22cfc9..1950277 100644
--- a/arch/i386/lib/usercopy.c
+++ b/arch/i386/lib/usercopy.c
@@ -10,6 +10,7 @@
#include <linux/blkdev.h>
#include <linux/module.h>
#include <linux/backing-dev.h>
+#include <linux/interrupt.h>
#include <asm/uaccess.h>
#include <asm/mmx.h>
@@ -719,6 +720,14 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from,
#ifndef CONFIG_X86_WP_WORKS_OK
if (unlikely(boot_cpu_data.wp_works_ok == 0) &&
((unsigned long )to) < TASK_SIZE) {
+ /*
+ * When we are in an atomic section (see
+ * mm/filemap.c:file_read_actor), return the full
+ * length to take the slow path.
+ */
+ if (in_atomic())
+ return n;
+
/*
* CPU does not honor the WP bit when writing
* from supervisory mode, and due to preemption or SMP,
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems
2007-04-02 12:25 ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
@ 2007-04-02 12:34 ` Ingo Molnar
2007-04-13 23:06 ` Adrian Bunk
1 sibling, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2007-04-02 12:34 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Linus Torvalds, LKML, Stable Kernel Team, Andrew Morton,
Adrian Bunk, Manfred Spraul
* Thomas Gleixner <tglx@linutronix.de> wrote:
> The __copy_to_user_inatomic() calls in file_read_actor() and
> pipe_read() are broken on original i386 machines, where WP-works-ok ==
> false, as __copy_to_user_inatomic() on such systems calls functions
> which might sleep and/or contain cond_resched() calls inside of a
> kmap_atomic() region.
>
> The original check for WP-works-ok was in access_ok(), but got moved
> during the 2.5 series to fix a race vs. swap.
>
> Return the number of bytes to copy in the case where we are in an
> atomic region, so the non atomic code pathes in file_read_actor() and
> pipe_read() are taken.
neat trick! (v2.6.21 must-have me thinks.)
> This could be optimized to avoid the kmap_atomicby moving the check
^---- add space
> for WP-works-ok into fault_in_pages_writeable(), but this is more
> intrusive and can be done later.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Ingo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems
2007-04-02 12:25 ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
2007-04-02 12:34 ` Ingo Molnar
@ 2007-04-13 23:06 ` Adrian Bunk
1 sibling, 0 replies; 7+ messages in thread
From: Adrian Bunk @ 2007-04-13 23:06 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Linus Torvalds, LKML, Stable Kernel Team, Ingo Molnar,
Andrew Morton, Manfred Spraul
Thanks, applied to 2.6.16.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2007-04-13 23:06 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-03-25 8:07 [BUG] __copy_to_user_inatomic broken on non Pentium machines Thomas Gleixner
2007-03-25 18:14 ` Linus Torvalds
2007-03-25 18:30 ` Ingo Molnar
2007-03-25 19:41 ` Thomas Gleixner
2007-04-02 12:25 ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
2007-04-02 12:34 ` Ingo Molnar
2007-04-13 23:06 ` Adrian Bunk
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).