LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread
@ 2008-10-21 15:14 hooanon05
  2008-10-21 15:46 ` Alan Cox
  0 siblings, 1 reply; 9+ messages in thread
From: hooanon05 @ 2008-10-21 15:14 UTC (permalink / raw)
  To: linux-kernel


Hello all,

When sysctl_overcommit_memory is set OVERCOMMIT_NEVER,
__vm_enough_memory() refers current->mm.

For example,
# exportfs -i -o ... localhost:/tmpfs
# mkdir /tmp/w
# mount -o ... localhost:/tmpfs /tmp/w
# yes > /tmp/w/fileA

In this case, NFSD (kernel thread) calls shmem_file_write() or
shmem_write_begin() and __vm_enough_memory() is called. But current->mm
is NULL and the kernel crashes.
If a user have to set OVERCOMMIT_NEVER, where should we fix?


Junjiro R. Okajima

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread
  2008-10-21 15:14 __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread hooanon05
@ 2008-10-21 15:46 ` Alan Cox
  2008-10-21 23:10   ` hooanon05
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Cox @ 2008-10-21 15:46 UTC (permalink / raw)
  To: hooanon05; +Cc: linux-kernel

> In this case, NFSD (kernel thread) calls shmem_file_write() or
> shmem_write_begin() and __vm_enough_memory() is called. But current->mm
> is NULL and the kernel crashes.
> If a user have to set OVERCOMMIT_NEVER, where should we fix?

Calling into the file system code assuming that current->mm is
NULL isn't safe and hasn't been for a very long time since someone added
the 3% hack.

The shmem case is actually a bit special so my thoughts are:

Make security_vm_enough_memory() WARN() if current->mm = NULL
Make security_vm_enough_memory_mm() WARN() if the passed mm = NULL
Add security_vm_enough_memory_fs() which does not do the warning test

All would still call security->ops->vm_enough_memory and then
__vm_enough_memory() would skip the 3% adjustment when the passed mm was
NULL

Does that sound sensible ?

Alan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 
  2008-10-21 15:46 ` Alan Cox
@ 2008-10-21 23:10   ` hooanon05
  2008-10-22  5:35     ` hooanon05
  2008-10-22  8:27     ` Alan Cox
  0 siblings, 2 replies; 9+ messages in thread
From: hooanon05 @ 2008-10-21 23:10 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel


Alan Cox:
> Calling into the file system code assuming that current->mm is
> NULL isn't safe and hasn't been for a very long time since someone added
> the 3% hack.

I guess
- people don't care overcommit and leave it as default, so they don't
  meet the problem 
- people who cares overcommit has rich memory, and they don't meet the
  problem too.


> The shmem case is actually a bit special so my thoughts are:
> 
> Make security_vm_enough_memory() WARN() if current->mm = NULL
> Make security_vm_enough_memory_mm() WARN() if the passed mm = NULL
> Add security_vm_enough_memory_fs() which does not do the warning test
> 
> All would still call security->ops->vm_enough_memory and then
> __vm_enough_memory() would skip the 3% adjustment when the passed mm was
> NULL
> 
> Does that sound sensible ?

In your first option, write() to the exported tmpfs will produce the
warning on nfs server even if much memory is left. I don't think it is a
good idea.
I'd suggest to make __vm_enough_memory() would skip the 3% adjustment
only.

--- /src/linux-2.6/linux-2.6.27/mm/mmap.c	2008-10-10 07:13:53.000000000 +0900
+++ /tmp/mmap.c	2008-10-22 08:07:09.000000000 +0900
@@ -173,9 +173,10 @@
 		allowed -= allowed / 32;
 	allowed += total_swap_pages;
 
-	/* Don't let a single process grow too big:
+	/* Don't let a single user process grow too big:
 	   leave 3% of the size of this process for other processes */
-	allowed -= mm->total_vm / 32;
+	if (mm)
+		allowed -= mm->total_vm / 32;
 
 	/*
 	 * cast `allowed' as a signed long because vm_committed_space


Junjiro R. Okajima

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 
  2008-10-21 23:10   ` hooanon05
@ 2008-10-22  5:35     ` hooanon05
  2008-10-22  8:27     ` Alan Cox
  1 sibling, 0 replies; 9+ messages in thread
From: hooanon05 @ 2008-10-22  5:35 UTC (permalink / raw)
  To: Alan Cox, linux-kernel


> I guess
> - people don't care overcommit and leave it as default, so they don't
>   meet the problem 
> - people who cares overcommit has rich memory, and they don't meet the
>   problem too.

Correction: the problem is unrelated to the memory size.
I was sleepy when I wrote that.
Regardless the memory size, when nfsd writes to tmpfs the system surely
crashes.


Junjiro R. Okajima

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread
  2008-10-21 23:10   ` hooanon05
  2008-10-22  5:35     ` hooanon05
@ 2008-10-22  8:27     ` Alan Cox
  2008-10-22 11:26       ` hooanon05
  1 sibling, 1 reply; 9+ messages in thread
From: Alan Cox @ 2008-10-22  8:27 UTC (permalink / raw)
  To: hooanon05; +Cc: linux-kernel

> In your first option, write() to the exported tmpfs will produce the
> warning on nfs server even if much memory is left. I don't think it is a

No - the shmem routine would use vm_enough_memory_fs which wouldn't care
if current->mm is NULL, but the others would check.

> good idea.
> I'd suggest to make __vm_enough_memory() would skip the 3% adjustment
> only.
> 
> --- /src/linux-2.6/linux-2.6.27/mm/mmap.c	2008-10-10 07:13:53.000000000 +0900
> +++ /tmp/mmap.c	2008-10-22 08:07:09.000000000 +0900
> @@ -173,9 +173,10 @@
>  		allowed -= allowed / 32;
>  	allowed += total_swap_pages;
>  
> -	/* Don't let a single process grow too big:
> +	/* Don't let a single user process grow too big:
>  	   leave 3% of the size of this process for other processes */
> -	allowed -= mm->total_vm / 32;
> +	if (mm)
> +		allowed -= mm->total_vm / 32;

Doing this means we lose the ability to trap incorrect calls to
vm_enough_memory.

Alan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 
  2008-10-22  8:27     ` Alan Cox
@ 2008-10-22 11:26       ` hooanon05
  2008-10-22 12:09         ` Alan Cox
  0 siblings, 1 reply; 9+ messages in thread
From: hooanon05 @ 2008-10-22 11:26 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel


Alan Cox:
> No - the shmem routine would use vm_enough_memory_fs which wouldn't care
> if current->mm is NULL, but the others would check.

Unfortunately I cannot imagine what new vm_enough_memory_fs() will be.
Would you show me its draft or pseudo code?


Junjiro R. Okajima

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread
  2008-10-22 11:26       ` hooanon05
@ 2008-10-22 12:09         ` Alan Cox
  2008-10-22 12:46           ` hooanon05
  2008-10-22 13:37           ` James Morris
  0 siblings, 2 replies; 9+ messages in thread
From: Alan Cox @ 2008-10-22 12:09 UTC (permalink / raw)
  To: hooanon05; +Cc: linux-kernel

On Wed, 22 Oct 2008 20:26:19 +0900
hooanon05@yahoo.co.jp wrote:

> 
> Alan Cox:
> > No - the shmem routine would use vm_enough_memory_fs which wouldn't care
> > if current->mm is NULL, but the others would check.
> 
> Unfortunately I cannot imagine what new vm_enough_memory_fs() will be.
> Would you show me its draft or pseudo code?

This is the patch I propose:

nfsd: Fix vm overcommit crash

From: Alan Cox <alan@redhat.com>

Junjiro R. Okajima reported a problem where knfsd crashes if you are using
it to export shmemfs objects and run strict overcommit. In this situation
the current->mm based modifier to the overcommit goes through a NULL
pointer.

We could simply check for NULL and skip the modifier but we've caught other
real bugs in the past from mm being NULL here - cases where we did need a
valid mm set up (eg the exec bug about a year ago).

To preserve the checks and get the logic we want shuffle the checking
around and add a new helper to the vm_ security wrappers

Also fix a current->mm reference in nommu that should use the passed mm
---

 include/linux/security.h |    1 +
 mm/mmap.c                |    3 ++-
 mm/nommu.c               |    3 ++-
 mm/shmem.c               |    4 ++--
 security/security.c      |    9 +++++++++
 5 files changed, 16 insertions(+), 4 deletions(-)


diff --git a/include/linux/security.h b/include/linux/security.h
index f5c4a51..a2b8430 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1585,6 +1585,7 @@ int security_syslog(int type);
 int security_settime(struct timespec *ts, struct timezone *tz);
 int security_vm_enough_memory(long pages);
 int security_vm_enough_memory_mm(struct mm_struct *mm, long pages);
+int security_vm_enough_memory_kern(long pages);
 int security_bprm_alloc(struct linux_binprm *bprm);
 void security_bprm_free(struct linux_binprm *bprm);
 void security_bprm_apply_creds(struct linux_binprm *bprm, int unsafe);
diff --git a/mm/mmap.c b/mm/mmap.c
index 74f4d15..de14ac2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -175,7 +175,8 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
 
 	/* Don't let a single process grow too big:
 	   leave 3% of the size of this process for other processes */
-	allowed -= mm->total_vm / 32;
+	if (mm)
+		allowed -= mm->total_vm / 32;
 
 	/*
 	 * cast `allowed' as a signed long because vm_committed_space
diff --git a/mm/nommu.c b/mm/nommu.c
index 2696b24..7695dc8 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1454,7 +1454,8 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
 
 	/* Don't let a single process grow too big:
 	   leave 3% of the size of this process for other processes */
-	allowed -= current->mm->total_vm / 32;
+	if (mm)
+		allowed -= mm->total_vm / 32;
 
 	/*
 	 * cast `allowed' as a signed long because vm_committed_space
diff --git a/mm/shmem.c b/mm/shmem.c
index d38d7e6..1677b3e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -162,7 +162,7 @@ static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb)
 static inline int shmem_acct_size(unsigned long flags, loff_t size)
 {
 	return (flags & VM_ACCOUNT)?
-		security_vm_enough_memory(VM_ACCT(size)): 0;
+		security_vm_enough_memory_kern(VM_ACCT(size)): 0;
 }
 
 static inline void shmem_unacct_size(unsigned long flags, loff_t size)
@@ -180,7 +180,7 @@ static inline void shmem_unacct_size(unsigned long flags, loff_t size)
 static inline int shmem_acct_block(unsigned long flags)
 {
 	return (flags & VM_ACCOUNT)?
-		0: security_vm_enough_memory(VM_ACCT(PAGE_CACHE_SIZE));
+		0: security_vm_enough_memory_kern(VM_ACCT(PAGE_CACHE_SIZE));
 }
 
 static inline void shmem_unacct_blocks(unsigned long flags, long pages)
diff --git a/security/security.c b/security/security.c
index 255b085..c0acfa7 100644
--- a/security/security.c
+++ b/security/security.c
@@ -198,14 +198,23 @@ int security_settime(struct timespec *ts, struct timezone *tz)
 
 int security_vm_enough_memory(long pages)
 {
+	WARN_ON(current->mm == NULL);
 	return security_ops->vm_enough_memory(current->mm, pages);
 }
 
 int security_vm_enough_memory_mm(struct mm_struct *mm, long pages)
 {
+	WARN_ON(mm == NULL);
 	return security_ops->vm_enough_memory(mm, pages);
 }
 
+int security_vm_enough_memory_kern(long pages)
+{
+	/* If current->mm is a kernel thread then we will pass NULL,
+	   for this specific case that is fine */
+	return security_ops->vm_enough_memory(current->mm, pages);
+}
+
 int security_bprm_alloc(struct linux_binprm *bprm)
 {
 	return security_ops->bprm_alloc_security(bprm);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread 
  2008-10-22 12:09         ` Alan Cox
@ 2008-10-22 12:46           ` hooanon05
  2008-10-22 13:37           ` James Morris
  1 sibling, 0 replies; 9+ messages in thread
From: hooanon05 @ 2008-10-22 12:46 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel


Alan Cox:
> This is the patch I propose:
	:::
> We could simply check for NULL and skip the modifier but we've caught other
> real bugs in the past from mm being NULL here - cases where we did need a
> valid mm set up (eg the exec bug about a year ago).

OK.
I didn't know there was an actual bug but now I understand why you want
to insert WARN_ON().

I have nothing to say about your patch.


Thank you
Junjiro R. Okajima

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread
  2008-10-22 12:09         ` Alan Cox
  2008-10-22 12:46           ` hooanon05
@ 2008-10-22 13:37           ` James Morris
  1 sibling, 0 replies; 9+ messages in thread
From: James Morris @ 2008-10-22 13:37 UTC (permalink / raw)
  To: Alan Cox; +Cc: hooanon05, linux-kernel, linux-security-module

On Wed, 22 Oct 2008, Alan Cox wrote:

[Adding lsm list to the cc]

> On Wed, 22 Oct 2008 20:26:19 +0900
> hooanon05@yahoo.co.jp wrote:
> 
> > 
> > Alan Cox:
> > > No - the shmem routine would use vm_enough_memory_fs which wouldn't care
> > > if current->mm is NULL, but the others would check.
> > 
> > Unfortunately I cannot imagine what new vm_enough_memory_fs() will be.
> > Would you show me its draft or pseudo code?
> 
> This is the patch I propose:
> 
> nfsd: Fix vm overcommit crash
> 
> From: Alan Cox <alan@redhat.com>
> 
> Junjiro R. Okajima reported a problem where knfsd crashes if you are using
> it to export shmemfs objects and run strict overcommit. In this situation
> the current->mm based modifier to the overcommit goes through a NULL
> pointer.
> 
> We could simply check for NULL and skip the modifier but we've caught other
> real bugs in the past from mm being NULL here - cases where we did need a
> valid mm set up (eg the exec bug about a year ago).
> 
> To preserve the checks and get the logic we want shuffle the checking
> around and add a new helper to the vm_ security wrappers
> 
> Also fix a current->mm reference in nommu that should use the passed mm

Looks ok to me.

Acked-by: James Morris <jmorris@namei.org>


> ---
> 
>  include/linux/security.h |    1 +
>  mm/mmap.c                |    3 ++-
>  mm/nommu.c               |    3 ++-
>  mm/shmem.c               |    4 ++--
>  security/security.c      |    9 +++++++++
>  5 files changed, 16 insertions(+), 4 deletions(-)
> 
> 
> diff --git a/include/linux/security.h b/include/linux/security.h
> index f5c4a51..a2b8430 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -1585,6 +1585,7 @@ int security_syslog(int type);
>  int security_settime(struct timespec *ts, struct timezone *tz);
>  int security_vm_enough_memory(long pages);
>  int security_vm_enough_memory_mm(struct mm_struct *mm, long pages);
> +int security_vm_enough_memory_kern(long pages);
>  int security_bprm_alloc(struct linux_binprm *bprm);
>  void security_bprm_free(struct linux_binprm *bprm);
>  void security_bprm_apply_creds(struct linux_binprm *bprm, int unsafe);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 74f4d15..de14ac2 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -175,7 +175,8 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
>  
>  	/* Don't let a single process grow too big:
>  	   leave 3% of the size of this process for other processes */
> -	allowed -= mm->total_vm / 32;
> +	if (mm)
> +		allowed -= mm->total_vm / 32;
>  
>  	/*
>  	 * cast `allowed' as a signed long because vm_committed_space
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 2696b24..7695dc8 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -1454,7 +1454,8 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
>  
>  	/* Don't let a single process grow too big:
>  	   leave 3% of the size of this process for other processes */
> -	allowed -= current->mm->total_vm / 32;
> +	if (mm)
> +		allowed -= mm->total_vm / 32;
>  
>  	/*
>  	 * cast `allowed' as a signed long because vm_committed_space
> diff --git a/mm/shmem.c b/mm/shmem.c
> index d38d7e6..1677b3e 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -162,7 +162,7 @@ static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb)
>  static inline int shmem_acct_size(unsigned long flags, loff_t size)
>  {
>  	return (flags & VM_ACCOUNT)?
> -		security_vm_enough_memory(VM_ACCT(size)): 0;
> +		security_vm_enough_memory_kern(VM_ACCT(size)): 0;
>  }
>  
>  static inline void shmem_unacct_size(unsigned long flags, loff_t size)
> @@ -180,7 +180,7 @@ static inline void shmem_unacct_size(unsigned long flags, loff_t size)
>  static inline int shmem_acct_block(unsigned long flags)
>  {
>  	return (flags & VM_ACCOUNT)?
> -		0: security_vm_enough_memory(VM_ACCT(PAGE_CACHE_SIZE));
> +		0: security_vm_enough_memory_kern(VM_ACCT(PAGE_CACHE_SIZE));
>  }
>  
>  static inline void shmem_unacct_blocks(unsigned long flags, long pages)
> diff --git a/security/security.c b/security/security.c
> index 255b085..c0acfa7 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -198,14 +198,23 @@ int security_settime(struct timespec *ts, struct timezone *tz)
>  
>  int security_vm_enough_memory(long pages)
>  {
> +	WARN_ON(current->mm == NULL);
>  	return security_ops->vm_enough_memory(current->mm, pages);
>  }
>  
>  int security_vm_enough_memory_mm(struct mm_struct *mm, long pages)
>  {
> +	WARN_ON(mm == NULL);
>  	return security_ops->vm_enough_memory(mm, pages);
>  }
>  
> +int security_vm_enough_memory_kern(long pages)
> +{
> +	/* If current->mm is a kernel thread then we will pass NULL,
> +	   for this specific case that is fine */
> +	return security_ops->vm_enough_memory(current->mm, pages);
> +}
> +
>  int security_bprm_alloc(struct linux_binprm *bprm)
>  {
>  	return security_ops->bprm_alloc_security(bprm);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-10-22 13:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-21 15:14 __vm_enough_memory(), OVERCOMMIT_NEVER, current->mm, kernel thread hooanon05
2008-10-21 15:46 ` Alan Cox
2008-10-21 23:10   ` hooanon05
2008-10-22  5:35     ` hooanon05
2008-10-22  8:27     ` Alan Cox
2008-10-22 11:26       ` hooanon05
2008-10-22 12:09         ` Alan Cox
2008-10-22 12:46           ` hooanon05
2008-10-22 13:37           ` James Morris

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).