LKML Archive on lore.kernel.org
From: Avi Kivity <avi@redhat.com>
To: Rik van Riel <riel@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Mike Galbraith <efault@gmx.de>,
	Chris Wright <chrisw@sous-sol.org>,
	ttracy@redhat.com, "Nakajima, Jun" <jun.nakajima@intel.com>
Subject: Re: [RFC -v7 PATCH 0/7] directed yield for Pause Loop Exiting
Date: Thu, 27 Jan 2011 15:20:53 +0200
Message-ID: <4D417135.7050903@redhat.com>
In-Reply-To: <20110126165657.2ddd2ac9@annuminas.surriel.com>

On 01/26/2011 11:56 PM, Rik van Riel wrote:
> When running SMP virtual machines, it is possible for one VCPU to be
> spinning on a spinlock, while the VCPU that holds the spinlock is not
> currently running, because the host scheduler preempted it to run
> something else.
>
> Both Intel and AMD CPUs have a feature that detects when a virtual
> CPU is spinning on a lock and will trap to the host.
>
> The current KVM code sleeps for a bit whenever that happens, which
> results in e.g. a 64 VCPU Windows guest taking forever and a bit to
> boot up.  This is because the VCPU holding the lock is actually
> running and not sleeping, so the pause is counter-productive.
>
> In other workloads a pause can also be counter-productive, with
> spinlock detection resulting in one guest giving up its CPU time
> to the others.  Instead of spinning, it ends up simply not running
> much at all.
>
> This patch series aims to fix that, by having a VCPU that spins
> give the remainder of its timeslice to another VCPU in the same
> guest before yielding the CPU - one that is runnable but got
> preempted, hopefully the lock holder.
>
> v7:
> - move the vcpu to pid mapping to inside the vcpu->mutex
> - rename ->yield to ->skip
> - merge patch 5 into patch 4
> v6:
> - implement yield_task_fair in a way that works with task groups,
>    this allows me to actually get a performance improvement!
> - fix another race Avi pointed out, the code should be good now
> v5:
> - fix the race condition Avi pointed out, by tracking vcpu->pid
> - also allows us to yield to vcpu tasks that got preempted while in qemu
>    userspace
> v4:
> - change to newer version of Mike Galbraith's yield_to implementation
> - chainsaw out some code from Mike that looked like a great idea, but
>    turned out to give weird interactions in practice
> v3:
> - more cleanups
> - change to Mike Galbraith's yield_to implementation
> - yield to spinning VCPUs, this seems to work better in some
>    situations and has little downside potential
> v2:
> - make lots of cleanups and improvements suggested
> - do not implement timeslice scheduling or fairness stuff
>    yet, since it is not entirely clear how to do that right
>    (suggestions welcome)
>
>
> Benchmark results:
>
> Two 4-CPU KVM guests are pinned to the same 4 physical CPUs.
>
> One guest runs the AMQP performance test, the other guest runs
> 0, 2 or 4 infinite loops, for CPU overcommit factors of 1 (none),
> 1.5 and 2.
>
> The AMQP perftest is run 30 times, with message payloads of 8 and 16 bytes.
>
> size8    no overcommit    1.5x overcommit    2x overcommit
>
> no PLE   223801           135137             104951
> PLE      224135           141105             118744
>
> size16   no overcommit    1.5x overcommit    2x overcommit
>
> no PLE   222424           126175             105299
> PLE      222534           138082             132945
>
> Note: this is with the KVM guests NOT running inside cgroups.  There
> seems to be a CPU load balancing issue with cgroup fair group scheduling,
> which often results in one guest getting only 80% CPU time and the other
> guest 320%.  That will have to be fixed to get meaningful results with
> cgroups.
>
> CPU time division between the AMQP guest and the infinite loop guest
> was not exactly fair, but the guests got close to the same amount
> of CPU time in each test run.
>
> There is a substantial amount of randomness in CPU time division between
> guests, but the performance improvement is consistent between multiple
> runs.
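
For anyone reading along without the patches at hand, the target
selection described in the cover letter can be sketched roughly as
below. This is only a toy model, not the code from the series: struct
toy_vcpu, pick_yield_target() and the round-robin start after a
last_boosted index are all hypothetical names and assumptions made for
illustration. It picks a VCPU of the same guest that is runnable but
not currently on a CPU, which is hopefully the preempted lock holder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified VCPU state; not the kernel's struct kvm_vcpu. */
struct toy_vcpu {
	bool runnable;	/* wants CPU time (not halted or blocked) */
	bool on_cpu;	/* currently executing on a physical CPU */
};

/*
 * Pick a VCPU for the spinning VCPU `me` to yield to.
 *
 * Scans round-robin, starting just after `last_boosted`, so repeated
 * calls do not always favour the same VCPU (an assumption of this
 * sketch). Returns the index of the first VCPU that is runnable but
 * preempted, or -1 if there is no such VCPU.
 */
static int pick_yield_target(const struct toy_vcpu *vcpus, size_t n,
			     size_t me, size_t last_boosted)
{
	assert(n > 0 && me < n && last_boosted < n);

	for (size_t step = 1; step <= n; step++) {
		size_t i = (last_boosted + step) % n;

		if (i == me)
			continue;	/* never yield to ourselves */
		if (!vcpus[i].runnable)
			continue;	/* not waiting for CPU time */
		if (vcpus[i].on_cpu)
			continue;	/* already running, nothing to donate */
		return (int)i;
	}
	return -1;
}
```

A caller would hand the rest of its timeslice to the returned VCPU and
fall back to a plain yield when the result is -1. Starting the scan
after the last boosted VCPU is one simple way to spread the boost
around when several VCPUs are preempted at once.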

The kvm bits look fine; I'll be happy to apply them after the scheduler 
changes are accepted into tip.git.

-- 
error compiling committee.c: too many arguments to function

