* [PATCH 0/4]x86: allocate up to 32 tlb invalidate vectors -resend
@ 2011-01-17  2:51 Shaohua Li
  2011-02-14  2:38 ` Shaohua Li
From: Shaohua Li @ 2011-01-17  2:51 UTC
  To: lkml; +Cc: Ingo Molnar, Andi Kleen, hpa, Andrew Morton, Eric Dumazet

The last post was lost, so I am resending it.

Hi,
In workloads with heavy page reclaim, flush_tlb_page() is called
frequently. We currently have 8 vectors for TLB flush, which is fine for
small machines. But on big machines with a lot of CPUs, the 8 vectors
are shared by all CPUs and we need a lock to protect them. This causes a
lot of lock contention. Please see patch 3 for detailed lock contention
numbers.
Andi Kleen suggests we use 32 vectors for TLB flush, which should be
fine even for 8-socket machines. Testing shows this reduces lock
contention dramatically (see patch 3 for numbers).
One might argue that this wastes too many vectors and leaves fewer
vectors for devices. This could be a problem. But even if we use 32
vectors, we still leave 78 vectors for devices. And now that we have
per-CPU vectors, vectors aren't scarce any more. I'm open to discussion
if anybody has objections.
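
To illustrate the sharing argument, here is a rough userspace sketch. It
is not code from the patch set. It assumes each CPU picks its flush
vector as cpu % nr_vectors; that mapping and both helper names are
hypothetical and are used only to show how many CPUs end up contending
for one vector's lock.

```c
/*
 * Illustrative sketch only, not kernel code.
 * Assumption: a CPU selects its TLB flush vector as cpu % nr_vectors.
 * Both helpers below are hypothetical names made up for this example.
 */

/* Which flush vector a given CPU would use under the assumed mapping. */
static unsigned int flush_vector_for_cpu(unsigned int cpu,
					 unsigned int nr_vectors)
{
	return cpu % nr_vectors;
}

/*
 * Worst-case number of CPUs sharing one vector, and therefore one lock,
 * when nr_cpus CPUs are spread over nr_vectors vectors (rounded up).
 */
static unsigned int max_cpus_per_vector(unsigned int nr_cpus,
					unsigned int nr_vectors)
{
	return (nr_cpus + nr_vectors - 1) / nr_vectors;
}
```

For a hypothetical 64-CPU machine, max_cpus_per_vector(64, 8) is 8 and
max_cpus_per_vector(64, 32) is 2, so under this assumed mapping far
fewer CPUs compete for each lock when 32 vectors are available.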

Thanks,
Shaohua



* Re: [PATCH 0/4]x86: allocate up to 32 tlb invalidate vectors -resend
  2011-01-17  2:51 [PATCH 0/4]x86: allocate up to 32 tlb invalidate vectors -resend Shaohua Li
@ 2011-02-14  2:38 ` Shaohua Li
From: Shaohua Li @ 2011-02-14  2:38 UTC
  To: lkml; +Cc: Ingo Molnar, Andi Kleen, hpa, Andrew Morton, Eric Dumazet

Any comments on this patch set?

Thanks,
Shaohua


