LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] futex: Fix fault_in_user_writeable()
@ 2021-08-16  6:54 Huacai Chen
  2021-08-16 18:27 ` Davidlohr Bueso
  0 siblings, 1 reply; 10+ messages in thread
From: Huacai Chen @ 2021-08-16  6:54 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, Thomas Bogendoerfer
  Cc: linux-mips, linux-kernel, Xuefeng Li, Huacai Chen, Jiaxun Yang,
	Huacai Chen, Hongchen Zhang

fault_in_user_writeable() should verify R/W access but only verify W. In
most archs W implies R, but not true in MIPS and LoongArch, so fix it.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
---
 kernel/futex.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 2ecb07575055..c3b68be31bf3 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -672,10 +672,15 @@ static int get_futex_key(u32 __user *uaddr, bool fshared, union futex_key *key,
  */
 static int fault_in_user_writeable(u32 __user *uaddr)
 {
-	struct mm_struct *mm = current->mm;
 	int ret;
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
 
 	mmap_read_lock(mm);
+	vma = find_vma(mm, (unsigned long)uaddr);
+	if (!(vma->vm_flags & VM_READ))
+		ret = -EFAULT;
+
 	ret = fixup_user_fault(mm, (unsigned long)uaddr,
 			       FAULT_FLAG_WRITE, NULL);
 	mmap_read_unlock(mm);
-- 
2.27.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-16  6:54 [PATCH] futex: Fix fault_in_user_writeable() Huacai Chen
@ 2021-08-16 18:27 ` Davidlohr Bueso
  2021-08-16 19:03   ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Davidlohr Bueso @ 2021-08-16 18:27 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
	Thomas Bogendoerfer, linux-mips, linux-kernel, Xuefeng Li,
	Huacai Chen, Jiaxun Yang, Hongchen Zhang

On Mon, 16 Aug 2021, Huacai Chen wrote:

>fault_in_user_writeable() should verify R/W access but only verify W. In
>most archs W implies R, but not true in MIPS and LoongArch, so fix it.

Yuck for a find_vma() in futex.c. If this is a problem in MIPS, shouldn't
the fix be there? Furthermore it's stated that fault_in_user_writeable():

"Fault in user address and verify RW access"

And you guys seem to have proposed it already:

https://lore.kernel.org/linux-mips/20200630005845.1239974-1-liulichao@loongson.cn/

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-16 18:27 ` Davidlohr Bueso
@ 2021-08-16 19:03   ` Thomas Gleixner
  2021-08-17  1:53     ` Huacai Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2021-08-16 19:03 UTC (permalink / raw)
  To: Davidlohr Bueso, Huacai Chen
  Cc: Ingo Molnar, Peter Zijlstra, Darren Hart, Thomas Bogendoerfer,
	linux-mips, linux-kernel, Xuefeng Li, Huacai Chen, Jiaxun Yang,
	Hongchen Zhang

On Mon, Aug 16 2021 at 11:27, Davidlohr Bueso wrote:
> On Mon, 16 Aug 2021, Huacai Chen wrote:
>
>>fault_in_user_writeable() should verify R/W access but only verify W. In
>>most archs W implies R, but not true in MIPS and LoongArch, so fix it.
>
> Yuck for a find_vma() in futex.c. If this is a problem in MIPS, shouldn't
> the fix be there? Furthermore it's stated that fault_in_user_writeable():
>
> "Fault in user address and verify RW access"

That seems to be wishful thinking given the fact that some architectures
do not imply R for FLAG_FAULT_WRITE.

> And you guys seem to have proposed it already:
>
> https://lore.kernel.org/linux-mips/20200630005845.1239974-1-liulichao@loongson.cn/

That's surely one way to fix that. If that does not work for whatever
reason, then we really don't want this find_vma() hack there, but rather
something like:

    if (IS_ENABLED(CONFIG_ARCH_USER_FAULT_VOODOO) && get_user(&tmp, uaddr))
	return -EFAULT;

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-16 19:03   ` Thomas Gleixner
@ 2021-08-17  1:53     ` Huacai Chen
  2021-08-17  7:07       ` Thomas Gleixner
  2021-08-17  9:22       ` Peter Zijlstra
  0 siblings, 2 replies; 10+ messages in thread
From: Huacai Chen @ 2021-08-17  1:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Davidlohr Bueso, Huacai Chen, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Thomas Bogendoerfer, open list:MIPS, LKML,
	Xuefeng Li, Jiaxun Yang, Hongchen Zhang

Hi, Davidlohr and Thomas,

On Tue, Aug 17, 2021 at 3:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Aug 16 2021 at 11:27, Davidlohr Bueso wrote:
> > On Mon, 16 Aug 2021, Huacai Chen wrote:
> >
> >>fault_in_user_writeable() should verify R/W access but only verify W. In
> >>most archs W implies R, but not true in MIPS and LoongArch, so fix it.
> >
> > Yuck for a find_vma() in futex.c. If this is a problem in MIPS, shouldn't
> > the fix be there? Furthermore it's stated that fault_in_user_writeable():
> >
> > "Fault in user address and verify RW access"
>
> That seems to be wishful thinking given the fact that some architectures
> do not imply R for FLAG_FAULT_WRITE.
>
> > And you guys seem to have proposed it already:
> >
> > https://lore.kernel.org/linux-mips/20200630005845.1239974-1-liulichao@loongson.cn/
This works, but I don't think this is a MIPS problem, so does Thomas
Bogendoerfer. Because write-only page is valid in MIPS (so Thomas
rejected this patch).

>
> That's surely one way to fix that. If that does not work for whatever
> reason, then we really don't want this find_vma() hack there, but rather
> something like:
I don't know why find_vma() is unacceptable here, there is also
find_vma() in fixup_user_fault().

>
>     if (IS_ENABLED(CONFIG_ARCH_USER_FAULT_VOODOO) && get_user(&tmp, uaddr))
>         return -EFAULT;
get_user() may be better than find_vma(), but can we drop
CONFIG_ARCH_USER_FAULT_VOODOO here? On those "W implies R" archs,
get_user() always success, this can simplify the logic.

Huacai

>
> Thanks,
>
>         tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-17  1:53     ` Huacai Chen
@ 2021-08-17  7:07       ` Thomas Gleixner
  2021-08-17  7:38         ` Huacai Chen
  2021-08-17  9:22       ` Peter Zijlstra
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2021-08-17  7:07 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Davidlohr Bueso, Huacai Chen, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Thomas Bogendoerfer, open list:MIPS, LKML,
	Xuefeng Li, Jiaxun Yang, Hongchen Zhang

On Tue, Aug 17 2021 at 09:53, Huacai Chen wrote:
> On Tue, Aug 17, 2021 at 3:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> That's surely one way to fix that. If that does not work for whatever
>> reason, then we really don't want this find_vma() hack there, but rather
>> something like:
> I don't know why find_vma() is unacceptable here, there is also
> find_vma() in fixup_user_fault().

Wrong. find_extend_vma() != find_vma(). Aside of that fixup_user_fault()
does way more than that.

>>     if (IS_ENABLED(CONFIG_ARCH_USER_FAULT_VOODOO) && get_user(&tmp, uaddr))
>>         return -EFAULT;
>
> get_user() may be better than find_vma(), but can we drop
> CONFIG_ARCH_USER_FAULT_VOODOO here? On those "W implies R" archs,
> get_user() always success, this can simplify the logic.

For architectures which imply R fixup_user_fault() is way more
effinicient than taking the fault on get_user() and then invoking
fixup_user_fault() to ensure that the mapping is writeable.

No, we are not making stuff less efficient just because of MIPS.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-17  7:07       ` Thomas Gleixner
@ 2021-08-17  7:38         ` Huacai Chen
  2021-08-17  9:05           ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Huacai Chen @ 2021-08-17  7:38 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Davidlohr Bueso, Huacai Chen, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Thomas Bogendoerfer, open list:MIPS, LKML,
	Xuefeng Li, Jiaxun Yang, Hongchen Zhang

Hi, Thomas,

On Tue, Aug 17, 2021 at 3:07 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Tue, Aug 17 2021 at 09:53, Huacai Chen wrote:
> > On Tue, Aug 17, 2021 at 3:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> That's surely one way to fix that. If that does not work for whatever
> >> reason, then we really don't want this find_vma() hack there, but rather
> >> something like:
> > I don't know why find_vma() is unacceptable here, there is also
> > find_vma() in fixup_user_fault().
>
> Wrong. find_extend_vma() != find_vma(). Aside of that fixup_user_fault()
> does way more than that.
>
> >>     if (IS_ENABLED(CONFIG_ARCH_USER_FAULT_VOODOO) && get_user(&tmp, uaddr))
> >>         return -EFAULT;
> >
> > get_user() may be better than find_vma(), but can we drop
> > CONFIG_ARCH_USER_FAULT_VOODOO here? On those "W implies R" archs,
> > get_user() always success, this can simplify the logic.
>
> For architectures which imply R fixup_user_fault() is way more
> effinicient than taking the fault on get_user() and then invoking
> fixup_user_fault() to ensure that the mapping is writeable.
>
> No, we are not making stuff less efficient just because of MIPS.
>

We use this program to test MIPS and X86:

int main(int argc, char** argv)
{
        int fd;
        void *ptr;
        int ret;
        int syscall_nr = 98;

        fd = open("/dev/zero", O_RDWR);
        if (fd == -1)
                exit(EXIT_FAILURE);

        ptr = mmap(NULL, 16384, PROT_WRITE, MAP_ANONYMOUS | MAP_SHARED, fd, 0);
        close(fd);
        /*
         * futex syscall nr:
         * x86_64: 202
         * MIPS84: 5194
         */
#ifdef __mips__
        syscall_nr = 5194;
#elif __x86_64__
        syscall_nr = 202;
#endif

        ret = syscall(syscall_nr,ptr,FUTEX_LOCK_PI,0, NULL, NULL, 0,0);
        printf("syscall %d ret = %d\n",syscall_nr,ret);

        return 0;
}

On X86, it returns 0; on MIPS64 without patch, it hangs in kernel; on
MIPS64 with this patch, it returns -1.

Then, I want to know, on "W implies R" archs (such as X86), should it
return 0? Maybe return -1 is more reasonable? (because the VMA is
marked as write-only). If this program should return -1, then I don't
think this is a MIPS-specific problem.

Huacai

> Thanks,
>
>         tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-17  7:38         ` Huacai Chen
@ 2021-08-17  9:05           ` Thomas Gleixner
  2021-08-17  9:45             ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2021-08-17  9:05 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Davidlohr Bueso, Huacai Chen, Ingo Molnar, Peter Zijlstra,
	Darren Hart, Thomas Bogendoerfer, open list:MIPS, LKML,
	Xuefeng Li, Jiaxun Yang, Hongchen Zhang

Huacai,

On Tue, Aug 17 2021 at 15:38, Huacai Chen wrote:
> On Tue, Aug 17, 2021 at 3:07 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> On X86, it returns 0; on MIPS64 without patch, it hangs in kernel; on
> MIPS64 with this patch, it returns -1.

As expected.

> Then, I want to know, on "W implies R" archs (such as X86), should it
> return 0? Maybe return -1 is more reasonable? (because the VMA is
> marked as write-only). If this program should return -1, then I don't
> think this is a MIPS-specific problem.

No. mmap(.., PROT_WRITE...) is simply impossible on x86 and implies
PROT_READ as documented in mmap(2).

So why should this fail and only fail in the fault case, but succeed
when the PTE is already established?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-17  1:53     ` Huacai Chen
  2021-08-17  7:07       ` Thomas Gleixner
@ 2021-08-17  9:22       ` Peter Zijlstra
  1 sibling, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2021-08-17  9:22 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Thomas Gleixner, Davidlohr Bueso, Huacai Chen, Ingo Molnar,
	Darren Hart, Thomas Bogendoerfer, open list:MIPS, LKML,
	Xuefeng Li, Jiaxun Yang, Hongchen Zhang, vbabka, Mel Gorman,
	david, willy, Dave Hansen

On Tue, Aug 17, 2021 at 09:53:14AM +0800, Huacai Chen wrote:
> Hi, Davidlohr and Thomas,
> 
> On Tue, Aug 17, 2021 at 3:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > On Mon, Aug 16 2021 at 11:27, Davidlohr Bueso wrote:
> > > On Mon, 16 Aug 2021, Huacai Chen wrote:
> > >
> > >>fault_in_user_writeable() should verify R/W access but only verify W. In
> > >>most archs W implies R, but not true in MIPS and LoongArch, so fix it.
> > >
> > > Yuck for a find_vma() in futex.c. If this is a problem in MIPS, shouldn't
> > > the fix be there? Furthermore it's stated that fault_in_user_writeable():
> > >
> > > "Fault in user address and verify RW access"
> >
> > That seems to be wishful thinking given the fact that some architectures
> > do not imply R for FLAG_FAULT_WRITE.
> >
> > > And you guys seem to have proposed it already:
> > >
> > > https://lore.kernel.org/linux-mips/20200630005845.1239974-1-liulichao@loongson.cn/
> This works, but I don't think this is a MIPS problem, so does Thomas
> Bogendoerfer. Because write-only page is valid in MIPS (so Thomas
> rejected this patch).
> 
> >
> > That's surely one way to fix that. If that does not work for whatever
> > reason, then we really don't want this find_vma() hack there, but rather
> > something like:
> I don't know why find_vma() is unacceptable here, there is also
> find_vma() in fixup_user_fault().
> 
> >
> >     if (IS_ENABLED(CONFIG_ARCH_USER_FAULT_VOODOO) && get_user(&tmp, uaddr))
> >         return -EFAULT;
> get_user() may be better than find_vma(), but can we drop
> CONFIG_ARCH_USER_FAULT_VOODOO here? On those "W implies R" archs,
> get_user() always success, this can simplify the logic.

AFAICT that whole W implies R thing goes much deeper,
mm/gup.c:vma_permits_fault() has:

	vm_flags_t vm_flags = write ? VM_WRITE : VM_READ;

So unless someone wants to go fix the core MM and eradicate all such
assumptions, I'd suggest going with the 'easy' route and fix the arch.

This patch is probably broken and will likely break lots of things...

---
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7ca22e6e694a..fc587dbb90b4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -478,6 +478,7 @@ enum fault_flag {
 	FAULT_FLAG_REMOTE =		1 << 7,
 	FAULT_FLAG_INSTRUCTION =	1 << 8,
 	FAULT_FLAG_INTERRUPTIBLE =	1 << 9,
+	FAULT_FLAG_READ =		1 << 10,
 };
 
 /*
diff --git a/kernel/futex.c b/kernel/futex.c
index fcc0570868b7..2c0970759919 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -735,7 +735,7 @@ static int fault_in_user_writeable(u32 __user *uaddr)
 
 	mmap_read_lock(mm);
 	ret = fixup_user_fault(mm, (unsigned long)uaddr,
-			       FAULT_FLAG_WRITE, NULL);
+			       FAULT_FLAG_READ|FAULT_FLAG_WRITE, NULL);
 	mmap_read_unlock(mm);
 
 	return ret < 0 ? ret : 0;
diff --git a/mm/gup.c b/mm/gup.c
index e805c1748bbf..37c8bfbe5196 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1222,11 +1222,17 @@ static long __get_user_pages(struct mm_struct *mm,
 static bool vma_permits_fault(struct vm_area_struct *vma,
 			      unsigned int fault_flags)
 {
+	bool read    = !!(fault_flags & FAULT_FLAG_READ);
 	bool write   = !!(fault_flags & FAULT_FLAG_WRITE);
 	bool foreign = !!(fault_flags & FAULT_FLAG_REMOTE);
-	vm_flags_t vm_flags = write ? VM_WRITE : VM_READ;
+	vm_flags_t vm_flags = 0;
 
-	if (!(vm_flags & vma->vm_flags))
+	if (read)
+		vm_flags |= VM_READ;
+	if (write)
+		vm_flags |= VM_WRITE;
+
+	if ((vma->vm_flags & vm_flags) != vm_flags)
 		return false;
 
 	/*

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-17  9:05           ` Thomas Gleixner
@ 2021-08-17  9:45             ` Peter Zijlstra
  2021-08-17 12:27               ` Huacai Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2021-08-17  9:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Huacai Chen, Davidlohr Bueso, Huacai Chen, Ingo Molnar,
	Darren Hart, Thomas Bogendoerfer, open list:MIPS, LKML,
	Xuefeng Li, Jiaxun Yang, Hongchen Zhang

On Tue, Aug 17, 2021 at 11:05:15AM +0200, Thomas Gleixner wrote:
> Huacai,
> 
> On Tue, Aug 17 2021 at 15:38, Huacai Chen wrote:
> > On Tue, Aug 17, 2021 at 3:07 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > On X86, it returns 0; on MIPS64 without patch, it hangs in kernel; on
> > MIPS64 with this patch, it returns -1.
> 
> As expected.
> 
> > Then, I want to know, on "W implies R" archs (such as X86), should it
> > return 0? Maybe return -1 is more reasonable? (because the VMA is
> > marked as write-only). If this program should return -1, then I don't
> > think this is a MIPS-specific problem.
> 
> No. mmap(.., PROT_WRITE...) is simply impossible on x86 and implies
> PROT_READ as documented in mmap(2).
> 
> So why should this fail and only fail in the fault case, but succeed
> when the PTE is already established?

I wouldn't actually mind if it failed on fault -- it's the 'best' we can
do on x86. Doing a RmW op on PROT_WRITE is silly and deserves all the
wreckage it can get.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] futex: Fix fault_in_user_writeable()
  2021-08-17  9:45             ` Peter Zijlstra
@ 2021-08-17 12:27               ` Huacai Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Huacai Chen @ 2021-08-17 12:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Davidlohr Bueso, Huacai Chen, Ingo Molnar,
	Darren Hart, Thomas Bogendoerfer, open list:MIPS, LKML,
	Xuefeng Li, Jiaxun Yang, Hongchen Zhang

Hi, all,

On Tue, Aug 17, 2021 at 5:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Aug 17, 2021 at 11:05:15AM +0200, Thomas Gleixner wrote:
> > Huacai,
> >
> > On Tue, Aug 17 2021 at 15:38, Huacai Chen wrote:
> > > On Tue, Aug 17, 2021 at 3:07 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > On X86, it returns 0; on MIPS64 without patch, it hangs in kernel; on
> > > MIPS64 with this patch, it returns -1.
> >
> > As expected.
> >
> > > Then, I want to know, on "W implies R" archs (such as X86), should it
> > > return 0? Maybe return -1 is more reasonable? (because the VMA is
> > > marked as write-only). If this program should return -1, then I don't
> > > think this is a MIPS-specific problem.
> >
> > No. mmap(.., PROT_WRITE...) is simply impossible on x86 and implies
> > PROT_READ as documented in mmap(2).
> >
> > So why should this fail and only fail in the fault case, but succeed
> > when the PTE is already established?
>
> I wouldn't actually mind if it failed on fault -- it's the 'best' we can
> do on x86. Doing a RmW op on PROT_WRITE is silly and deserves all the
> wreckage it can get.
If we must fix it in arch code, there are two methods: 1, don't use
write-only map (modify protection_map as Liu Lichao did); 2, override
arch_vma_access_permitted() to do extra checks. Thomas, which is
better?

Huacai

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-08-17 12:27 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-16  6:54 [PATCH] futex: Fix fault_in_user_writeable() Huacai Chen
2021-08-16 18:27 ` Davidlohr Bueso
2021-08-16 19:03   ` Thomas Gleixner
2021-08-17  1:53     ` Huacai Chen
2021-08-17  7:07       ` Thomas Gleixner
2021-08-17  7:38         ` Huacai Chen
2021-08-17  9:05           ` Thomas Gleixner
2021-08-17  9:45             ` Peter Zijlstra
2021-08-17 12:27               ` Huacai Chen
2021-08-17  9:22       ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).