LKML Archive on lore.kernel.org
From: "Martin J. Bligh" <mbligh@aracnet.com>
To: William Lee Irwin III <wli@holomorphy.com>
Cc: Rik van Riel <riel@redhat.com>, Jack Steiner <steiner@sgi.com>,
	Anton Blanchard <anton@samba.org>,
	Jes Sorensen <jes@trained-monkey.org>,
	Alexander Viro <viro@math.psu.edu>, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, jbarnes@sgi.com
Subject: Re: hash table sizes
Date: Wed, 26 Nov 2003 08:17:41 -0800
Message-ID: <9060000.1069863460@[10.10.2.4]>
In-Reply-To: <20031126095114.GH8039@holomorphy.com>

>> However, I'm curious as to why this crashes X, as I don't see how this
>> code change makes a difference in practice. I didn't think we had any i386
>> NUMA with memory holes between nodes at the moment, though perhaps the x440
>> does.
>> M.
>> PS. No, I haven't tested my rephrasing of your patch either.
> 
> mmap() of framebuffers. It takes the box out, not just X. There are
> holes just below 4GB regardless. This has actually been reported by
> rml and some others.
>
> False positives on pfn_valid() result in manipulations of purported page
> structures beyond the bounds of actual allocated pgdat->node_mem_map[]'s,
> potentially either corrupting memory or accessing areas outside memory's
> limits (the case causing oopsen).

OK. But the hole from 3.75 - 4GB you're referring to doesn't seem to
fall under that definition. 

1) It still has a valid pfn, though the backing memory itself isn't there.
2) It's covered by node_start_pfn ... node_start_pfn + node_spanned_pages,
	which is what your patch tests for. Maybe not if you have = 4GB
	per node? Will be if you have more than that.
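
To make point 2 concrete, here is a rough user-space sketch of the range
test the patch performs, applied to a pfn inside the hole. The struct
node_span type and both helper names are made up for illustration (they
are not kernel symbols), and a node 0 spanning 0 - 4GB is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4KB pages, as on i386 */

/* Hypothetical stand-in for the two pg_data_t fields the patch reads. */
struct node_span {
	unsigned long start_pfn;	/* plays the role of node_start_pfn */
	unsigned long spanned_pages;	/* plays the role of node_spanned_pages */
};

/* Convert a physical address to a page frame number. */
static unsigned long phys_to_pfn(uint64_t phys)
{
	return (unsigned long)(phys >> PAGE_SHIFT);
}

/* Same range test as the quoted patch, for one node. */
static int pfn_in_span(const struct node_span *n, unsigned long pfn)
{
	assert(n != NULL);
	if (pfn < n->start_pfn)
		return 0;
	return pfn - n->start_pfn < n->spanned_pages;
}
```

With start_pfn = 0 and spanned_pages = phys_to_pfn(4GB) = 0x100000, the
pfn for 3.75GB is 0xf0000 and pfn_in_span() returns 1, so under that
assumption the span test accepts the hole just as the old definition did.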

I agree that pfn_valid absolutely has to be correct. The current definition
was, I thought, correct unless we have holes *between* nodes. I was under
the impression that no box we had uses that setup, but I guess we might
do - were you seeing this on x440 or NUMA-Q?
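
For contrast, a sketch of the case where the two definitions do differ:
a hole *between* nodes. The layout below is invented (node 0 at 0 - 1GB,
a hole at 1 - 2GB, node 1 at 2 - 3GB), the names are hypothetical, and a
linear scan stands in for pfn_to_nid(). It also assumes the global bound
is simply the highest pfn plus one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node span, in page frame numbers (4KB pages). */
struct node_span {
	unsigned long start_pfn;
	unsigned long spanned_pages;
};

/* Invented layout: node 0 = 0 - 1GB, hole = 1 - 2GB, node 1 = 2 - 3GB. */
static const struct node_span example_nodes[] = {
	{ 0x00000, 0x40000 },
	{ 0x80000, 0x40000 },
};
#define EXAMPLE_NR_NODES (sizeof(example_nodes) / sizeof(example_nodes[0]))
#define EXAMPLE_NUM_PHYSPAGES 0xc0000UL	/* assumed: highest pfn + 1 */

/* Old-style check: a single global upper bound. */
static int pfn_valid_global(unsigned long pfn, unsigned long num_physpages)
{
	return pfn < num_physpages;
}

/* Per-node check: the pfn must fall inside some node's span. */
static int pfn_valid_per_node(const struct node_span *nodes, size_t nr,
			      unsigned long pfn)
{
	size_t i;

	assert(nodes != NULL);
	for (i = 0; i < nr; i++) {
		if (pfn >= nodes[i].start_pfn &&
		    pfn - nodes[i].start_pfn < nodes[i].spanned_pages)
			return 1;
	}
	return 0;
}
```

For pfn 0x50000 (inside the invented hole) the global check says valid
while the per-node check says invalid; for pfn 0x90000 both agree it is
valid. That false positive is the only case where the patch changes the
answer in this sketch.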

NUMA-Q looks like this:

BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 0000000000100000 - 00000000e0000000 (usable)
BIOS-e820: 00000000fec00000 - 00000000fec09000 (reserved)
BIOS-e820: 00000000ffe80000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000400000000 (usable)

which should create mem_map from 0 - 4GB contiguous for node 0, AFAICS
(I believe struct pages are created for reserved areas still). Maybe
the srat stuff for x440 does something else.
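
As a quick check of the arithmetic on that map, a small user-space
sketch, assuming 4KB pages and a map sorted by start address. The struct
and function names are invented for illustration and are not the
kernel's e820 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4KB pages, as on i386 */

/* Hypothetical representation of one BIOS-e820 line: [start, end). */
struct e820_range {
	uint64_t start;
	uint64_t end;
	int usable;	/* 1 = usable, 0 = reserved */
};

/* The NUMA-Q map quoted above. */
static const struct e820_range numaq_map[] = {
	{ 0x0000000000000000ULL, 0x000000000009fc00ULL, 1 },
	{ 0x0000000000100000ULL, 0x00000000e0000000ULL, 1 },
	{ 0x00000000fec00000ULL, 0x00000000fec09000ULL, 0 },
	{ 0x00000000ffe80000ULL, 0x0000000100000000ULL, 0 },
	{ 0x0000000100000000ULL, 0x0000000400000000ULL, 1 },
};
#define NUMAQ_MAP_LEN (sizeof(numaq_map) / sizeof(numaq_map[0]))

/* First pfn past the highest usable range. */
static uint64_t max_usable_pfn(const struct e820_range *map, size_t n)
{
	uint64_t max_end = 0;
	size_t i;

	assert(map != NULL);
	for (i = 0; i < n; i++)
		if (map[i].usable && map[i].end > max_end)
			max_end = map[i].end;
	return max_end >> PAGE_SHIFT;
}

/* Largest gap, in whole pages, between consecutive usable ranges. */
static uint64_t largest_usable_gap_pages(const struct e820_range *map, size_t n)
{
	uint64_t prev_end = 0, best = 0;
	size_t i;

	assert(map != NULL);
	for (i = 0; i < n; i++) {
		if (!map[i].usable)
			continue;
		if (map[i].start > prev_end && map[i].start - prev_end > best)
			best = map[i].start - prev_end;
		if (map[i].end > prev_end)
			prev_end = map[i].end;
	}
	return best >> PAGE_SHIFT;
}
```

On the map above, max_usable_pfn() gives 0x400000 (16GB), and the largest
gap in usable RAM is 0x20000 pages (512MB), running from 0xe0000000 to
0x100000000.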

Your patch is correct, I just don't see that it'll fix the X problem.

> diff -prauN linux-2.6.0-test10/include/asm-i386/mmzone.h pfn_valid-2.6.0-test10/include/asm-i386/mmzone.h
> --- linux-2.6.0-test10/include/asm-i386/mmzone.h	2003-11-23 17:31:56.000000000 -0800
> +++ pfn_valid-2.6.0-test10/include/asm-i386/mmzone.h	2003-11-26 01:40:36.000000000 -0800
> @@ -84,14 +84,30 @@ extern struct pglist_data *node_data[];
>  		+ __zone->zone_start_pfn;				\
>  })
>  #define pmd_page(pmd)		(pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
> +
> +static inline int pfn_to_nid(unsigned long);
>  /*
> - * pfn_valid should be made as fast as possible, and the current definition 
> - * is valid for machines that are NUMA, but still contiguous, which is what
> - * is currently supported. A more generalised, but slower definition would
> - * be something like this - mbligh:
> - * ( pfn_to_pgdat(pfn) && ((pfn) < node_end_pfn(pfn_to_nid(pfn))) ) 
> + * pfn_valid must absolutely be correct, regardless of speed concerns.
>   */ 
> -#define pfn_valid(pfn)          ((pfn) < num_physpages)
> +static inline int pfn_valid(unsigned long pfn)
> +{
> +	u8 nid = pfn_to_nid(pfn);
> +	pg_data_t *pgdat;
> +
> +	if (nid < MAX_NUMNODES)
> +		pgdat = NODE_DATA(nid);
> +	else
> +		return 0;
> +
> +	if (!pgdat)
> +		return 0;
> +	else if (pfn < pgdat->node_start_pfn)
> +		return 0;
> +	else if (pfn - pgdat->node_start_pfn >= pgdat->node_spanned_pages)
> +		return 0;
> +	else
> +		return 1;
> +}
>  
>  /*
>   * generic node memory support, the following assumptions apply:


Cool, thanks. I'll try runtesting it.

M.
