From: "Luis R. Rodriguez" <mcgrof@do-not-panic.com>
To: david.vrabel@citrix.com, konrad.wilk@oracle.com,
	boris.ostrovsky@oracle.com, xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	kvm@vger.kernel.org, paulmck@linux.vnet.ibm.com,
	rostedt@goodmis.org, "Luis R. Rodriguez" <mcgrof@suse.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Borislav Petkov <bp@suse.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: [PATCH v5 2/2] x86/xen: allow privcmd hypercalls to be preempted on 64-bit
Date: Mon, 26 Jan 2015 17:51:07 -0800
Message-ID: <1422323467-16713-3-git-send-email-mcgrof@do-not-panic.com>
In-Reply-To: <1422323467-16713-1-git-send-email-mcgrof@do-not-panic.com>

From: "Luis R. Rodriguez" <mcgrof@suse.com>

Xen has support for splitting heavy work into a series of
hypercalls, called multicalls, and preempting them through what
Xen calls continuation [0]. Despite this, preemption won't happen
without CONFIG_PREEMPT, and without preemption a system can become
pretty useless on heavy-handed hypercalls. Such is the case, for
example, when creating a > 50 GiB HVM guest, where we can get
soft lockups [1] with:

kernel: [  802.084335] BUG: soft lockup - CPU#1 stuck for 22s! [xend:31351]

The soft lockup triggers on the TASK_UNINTERRUPTIBLE hung task check
(default 120 seconds). On the Xen side, in this particular case,
this happens when the following Xen hypervisor code is used:

xc_domain_set_pod_target() -->
  do_memory_op() -->
    arch_memory_op() -->
      p2m_pod_set_mem_target()
	-- long delay (real or emulated) --

This happens in arch_memory_op() on the XENMEM_set_pod_target memory
op, even though arch_memory_op() can handle continuation via
hypercall_create_continuation(), for example.
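
To illustrate what continuation means here, below is a minimal
userspace sketch, not Xen code. It assumes a long operation that does
a bounded amount of work per invocation, records its progress, and
asks to be restarted. Every name in it (model_op, model_long_op(),
MODEL_RESTART, the budget parameter) is hypothetical and exists only
for this illustration. The caller loop stands in for the hypercall
being re-entered after each restart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not Xen code: every name here is illustrative. */
enum model_status { MODEL_DONE = 0, MODEL_RESTART = 1 };

struct model_op {
	size_t next;	/* first unit of work not yet done (saved progress) */
	size_t total;	/* total units of work requested */
};

/*
 * Do at most `budget` units of work. If work remains, return
 * MODEL_RESTART; progress is kept in op->next so that the next
 * invocation resumes where this one stopped.
 */
static enum model_status model_long_op(struct model_op *op, size_t budget)
{
	size_t done = 0;

	assert(budget > 0);	/* a zero budget would never make progress */

	while (op->next < op->total) {
		if (done == budget)
			return MODEL_RESTART;
		op->next++;	/* one unit of real work would go here */
		done++;
	}
	return MODEL_DONE;
}

/* Re-issue the operation until it completes; return the number of calls. */
static unsigned int model_run_to_completion(struct model_op *op, size_t budget)
{
	unsigned int calls = 1;

	while (model_long_op(op, budget) == MODEL_RESTART)
		calls++;
	return calls;
}
```

The gaps between invocations are the points where other work could
run. The rest of this description is about making the guest kernel
take advantage of them when CONFIG_PREEMPT=n.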

Machines with over 50 GiB of memory are in high demand and hard to
come by, so to help replicate this sort of issue, long delays on
select hypercalls have been emulated in order to be able to test
this on smaller machines [2].

On one hand this issue can be considered expected, given that
CONFIG_PREEMPT=n is used. However, the kernel has precedent for
forcing voluntary preemption even for CONFIG_PREEMPT=n through the
usage of cond_resched() sprinkled in many places. To address this
issue with Xen hypercalls, though, we need to find a way to aid the
scheduler in the middle of hypercalls. We are motivated to address
this issue on CONFIG_PREEMPT=n as otherwise the system becomes
rather unresponsive for long periods of time. In the worst case, so
far seen only by emulating long delays on select disk-I/O-bound
hypercalls, this can lead to filesystem corruption if the delay
happens, for example, on SCHEDOP_remote_shutdown (when we call
'xl <domain> shutdown').

We can address this problem by checking, on the return from the Xen
timer interrupt, whether we should schedule in the middle of a
hypercall. We want to be careful not to always force voluntary
preemption, though, so we only selectively enable preemption on
very specific Xen hypercalls.

This enables hypercall preemption by selectively forcing checks for
voluntary preemption only on ioctl-initiated private hypercalls,
where we know some folks have run into reported issues [1].
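
The selective check can be modeled in userspace as follows. This is
a sketch under stated assumptions, not the kernel implementation. The
real xen_is_preemptible_hypercall() is defined outside this patch, so
the address-range test used here is only an assumed stand-in for it.
struct model_regs, the MODEL_PREEMPT_* bounds and the counting
model_cond_resched() are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs: only what the model needs. */
struct model_regs {
	uintptr_t ip;		/* interrupted instruction pointer */
	bool user_mode;		/* true if the interrupt hit user space */
};

/* Assumed bounds of the code that issues preemptible hypercalls. */
#define MODEL_PREEMPT_BEGIN	((uintptr_t)0x1000)
#define MODEL_PREEMPT_END	((uintptr_t)0x2000)

/* Counts how often the model would have offered to reschedule. */
static unsigned int model_resched_calls;

/*
 * Assumption: a hypercall counts as preemptible when the interrupt hit
 * kernel code inside the designated address range.
 */
static bool model_is_preemptible_hypercall(const struct model_regs *regs)
{
	return !regs->user_mode &&
	       regs->ip >= MODEL_PREEMPT_BEGIN &&
	       regs->ip < MODEL_PREEMPT_END;
}

/* Stand-in for _cond_resched(): only records that it was called. */
static void model_cond_resched(void)
{
	model_resched_calls++;
}

/* Mirrors the shape of xen_end_upcall(): check first, then maybe yield. */
static void model_end_upcall(const struct model_regs *regs)
{
	assert(regs != NULL);

	if (model_is_preemptible_hypercall(regs))
		model_cond_resched();
}
```

In this model, an upcall that interrupts any other kernel code, or
user space, returns without touching the scheduler, so ordinary
interrupt paths are left unchanged.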

[0] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=42217cbc5b3e84b8c145d8cfb62dd5de0134b9e8;hp=3a0b9c57d5c9e82c55dd967c84dd06cb43c49ee9
[1] https://bugzilla.novell.com/show_bug.cgi?id=861093
[2] http://ftp.suse.com/pub/people/mcgrof/xen/emulate-long-xen-hypercalls.patch

Based on original work by: David Vrabel <david.vrabel@citrix.com>
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 arch/x86/kernel/entry_64.S       |  2 ++
 drivers/xen/events/events_base.c | 14 ++++++++++++++
 include/xen/events.h             |  1 +
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 9ebaf63..ee28733 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1198,6 +1198,8 @@ ENTRY(xen_do_hypervisor_callback)   # do_hypervisor_callback(struct *pt_regs)
 	popq %rsp
 	CFI_DEF_CFA_REGISTER rsp
 	decl PER_CPU_VAR(irq_count)
+	movq %rsp, %rdi  /* pass pt_regs as first argument */
+	call xen_end_upcall
 	jmp  error_exit
 	CFI_ENDPROC
 END(xen_do_hypervisor_callback)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index b4bca2d..bf207f2 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -32,6 +32,8 @@
 #include <linux/slab.h>
 #include <linux/irqnr.h>
 #include <linux/pci.h>
+#include <linux/sched.h>
+#include <linux/kprobes.h>
 
 #ifdef CONFIG_X86
 #include <asm/desc.h>
@@ -1243,6 +1245,18 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 	set_irq_regs(old_regs);
 }
 
+/*
+ * Some hypercalls issued by the toolstack can take many 10s of
+ * seconds. Allow tasks running hypercalls via the privcmd driver to be
+ * voluntarily preempted even if full kernel preemption is disabled.
+ */
+void xen_end_upcall(struct pt_regs *regs)
+{
+	if (xen_is_preemptible_hypercall(regs))
+		_cond_resched();
+}
+NOKPROBE_SYMBOL(xen_end_upcall);
+
 void xen_hvm_evtchn_do_upcall(void)
 {
 	__xen_evtchn_do_upcall();
diff --git a/include/xen/events.h b/include/xen/events.h
index 5321cd9..f08df87 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -95,6 +95,7 @@ void xen_hvm_callback_vector(void);
 extern int xen_have_vector_callback;
 int xen_set_callback_via(uint64_t via);
 void xen_evtchn_do_upcall(struct pt_regs *regs);
+void xen_end_upcall(struct pt_regs *regs);
 void xen_hvm_evtchn_do_upcall(void);
 
 /* Bind a pirq for a physical interrupt to an irq. */
-- 
2.1.1

