LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Joe Korty <joe.korty@ccur.com>
To: akpm@digeo.com
Cc: linux-kernel@vger.kernel.org, riel@redhat.com, andrea@suse.de
Subject: mlockall and mmap of IO devices don't mix
Date: Fri, 3 Oct 2003 17:44:12 -0400	[thread overview]
Message-ID: <20031003214411.GA25802@rudolph.ccur.com> (raw)

2.6.0-test6: the use of mlockall(2) in a process that has mmap(2)ed
the registers of an IO device will hang that process uninterruptibly.
The task runs in an infinite loop in get_user_pages(), invoking
follow_page() forever.

Using binary search I discovered that the problem was introduced
in 2.5.14, specifically in ChangeSetKey

    zippel@linux-m68k.org|ChangeSet|20020503210330|37095

The following patch backs out those lines of the above changeset
which are causing the problem:

--- a/mm/memory.c	2003-10-02 15:32:51.000000000 -0400
+++ b/mm/memory.c	2003-10-02 17:02:48.000000000 -0400
@@ -485,9 +485,7 @@
 	if (pte_present(pte)) {
 		if (!write ||
 		    (pte_write(pte) && pte_dirty(pte))) {
-			pfn = pte_pfn(pte);
-			if (pfn_valid(pfn))
-				return pfn_to_page(pfn);
+			return pte_page(pte);
 		}
 	}
 
The equivalent backout patch for 2.6.0-test6 is:

--- linux-2.6.0-test6/mm/memory.c.orig	2003-09-27 20:50:19.000000000 -0400
+++ linux-2.6.0-test6/mm/memory.c	2003-10-03 16:17:25.000000000 -0400
@@ -618,7 +618,6 @@
 	pgd_t *pgd;
 	pmd_t *pmd;
 	pte_t *ptep, pte;
-	unsigned long pfn;
 	struct vm_area_struct *vma;
 
 	vma = hugepage_vma(mm, address);
@@ -645,13 +644,11 @@
 	pte_unmap(ptep);
 	if (pte_present(pte)) {
 		if (!write || (pte_write(pte) && pte_dirty(pte))) {
-			pfn = pte_pfn(pte);
-			if (pfn_valid(pfn)) {
-				struct page *page = pfn_to_page(pfn);
-
+			struct page *page = pte_page(pte);
+			if (pfn_valid(pte_pfn(pte))) {
 				mark_page_accessed(page);
-				return page;
 			}
+			return page;
 		}
 	}
 
I do not believe that the above constitutes a correct fix.  The
problem is that follow_pages() is fundamentally not able to handle a
mapping which does not have a 'struct page' backing it up, and a
mapping to IO memory by definition has no 'struct page' structure to
back it up.

IMO the original 2.5.14 patch which 'broke' the kernel is correct.
It made follow_page() return zero for a 'struct page' address, which
is entirely reasonable when a mapping has no 'struct page'.  Prior to
2.5.14 follow_page() would return some meaningless nonzero value
which, nevertheless, allowed its caller to continue and establish an
apparently correct mapping.

Joe

PS: my test program.  Nice and short.

/* test mlockall() in conjunction with mmap(2) of an IO device.
 * success -- program exits.
 * failure -- program hangs up, unkillable.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/mman.h>

int main(int argc, char **argv)
{
	int stat, fd;
	unsigned char *p;
	unsigned long off;

	if (argc > 1) {
		off = strtoul(argv[1], NULL, 16);
	} else {
		fprintf(stderr, "arg is video card addr in /dev/mem");
		exit(1);
	}

	stat = mlockall(MCL_CURRENT | MCL_FUTURE);
	if (stat == -1) {
		perror("mlockall");
		exit(1);
	}

	fd = open ("/dev/mem", O_RDWR, 0666);
	if (fd == -1) {
		perror("open");
		exit(1);
	}

	p = (unsigned char *)mmap(NULL, 0x4096,
		PROT_READ | PROT_WRITE, MAP_SHARED, fd, (off_t)off);
	if (!p) {
		perror("mmap");
		exit(1);
	}
	printf("[%08lx] %02x%02x%02x%02x\n",
		off, p[0], p[1], p[2], p[3]);
	return 0;
}

             reply	other threads:[~2003-10-03 21:45 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-10-03 21:44 Joe Korty [this message]
2003-10-03 22:23 ` Andrew Morton
2003-10-03 22:55   ` Joe Korty
2003-10-03 23:06     ` Andrew Morton
2003-10-03 23:28       ` Joe Korty
2003-10-03 23:15     ` Andrew Morton
2003-10-03 23:54       ` Joe Korty
2003-10-04  0:27         ` Andrew Morton
2003-10-04  5:47           ` David S. Miller
2003-10-04  9:29             ` Ingo Oeser
2004-05-21 11:34 ` Mark Hounschell
2004-05-22  2:13   ` Andrew Morton
2004-05-22 10:47     ` Mark Hounschell
2004-05-23 12:58       ` Mark Hounschell
2004-05-25 14:27     ` Joe Korty
2004-05-25 19:47       ` Andrew Morton
2004-05-25 21:31         ` Joe Korty
2004-07-16 21:01         ` Mark Hounschell
     [not found] <CFYv.787.23@gated-at.bofh.it>
2003-10-04  7:02 ` Andi Kleen
2003-10-04  7:42   ` Andrew Morton
2003-10-04  8:29     ` Andi Kleen
2003-10-04  8:47   ` Ingo Oeser
2003-10-04  9:17     ` Andi Kleen
2003-10-04  9:22       ` Russell King
2003-10-04 10:02         ` Ingo Oeser
2003-10-04 10:13           ` Russell King
2003-10-04 14:19             ` Ingo Oeser
2003-10-04 14:32           ` Martin J. Bligh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20031003214411.GA25802@rudolph.ccur.com \
    --to=joe.korty@ccur.com \
    --cc=akpm@digeo.com \
    --cc=andrea@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=riel@redhat.com \
    --subject='Re: mlockall and mmap of IO devices don'\''t mix' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).