LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Ian Campbell <ijc@hellion.org.uk>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "SUZUKI, Kazuhiro" <kaz@jp.fujitsu.com>,
	linux-pm@lists.linux-foundation.org,
	xen-devel@lists.xensource.com, Greg KH <greg@kroah.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/2] Fix hangup after creating checkpoint on Xen.
Date: Tue, 08 Feb 2011 11:22:58 +0000	[thread overview]
Message-ID: <1297164178.9388.65.camel@zakaz.uk.xensource.com> (raw)
In-Reply-To: <201102071035.29922.rjw@sisk.pl>

On Mon, 2011-02-07 at 10:35 +0100, Rafael J. Wysocki wrote:
> On Monday, February 07, 2011, SUZUKI, Kazuhiro wrote:
> > Hi,
> 
> Hi,
> 
> > The following patch series fixes hangup after creating checkpoint on
> > Xen. The Linux Xen guest can be saved the state to restore later, and
> > also created snapshot like checkpoint via the hypervisor.
> > But, when the snapshot is created for the PV guest, it will hangup.
> > 
> > We added 'PMSG_CANCEL' message and 'cancel' handler in dev_pm_ops
> > struct in the pm-linux part.
> 
> Please don't do that, unless you can convince me there's no other way to fix
> the problem you're trying to address.

Sorry, it was my advise to Kazuhiro that solving the underlying issue by
extending the core would be preferable to making Xen specific hacks.

The problem is that currently we have:

        dpm_suspend_start(PMSG_SUSPEND);
        
                dpm_suspend_noirq(PMSG_SUSPEND);
                        
                        sysdev_suspend(PMSG_SUSPEND);
                        /* suspend hypercall */
                        sysdev_resume();
                
                dpm_resume_noirq(PMSG_RESUME);
        
        dpm_resume_end(PMSG_RESUME);

However the suspend hypercall can return a value indicating that the
suspend didn't actually happen (e.g. was cancelled). This is used e.g.
when checkpointing guests, because in that case you want the original
guest to continue. When the suspend didn't happen the drivers need to
recover differently from if it did.

The originally proposed solution was to only call dpm_resume_end if the
suspend was not cancelled. My concern with this was that unbalancing the
dpm_suspend_* and dpm_resume_* did not seem like a correct interaction
with the core. For example dpm_suspend_* adds stuff to
dpm_suspended_list and if dpm_resume_* is not called presumably this all
gets out of sync for next time. Hence my suggestion to add a cancel
message type.

> In my opinion it's highly unrealistic to assume that device drivers
> (or even subsystems) will implement the ->cancel() callback just for the
> benefit of Xen.

I did vague wonder if a similar message might be of interest to e.g.
cancelling hibernations or similar.

> And if the only subsystem that needs to implement ->cancel() is Xen, then the
> issue should be addressed without modifying the device core code, in a different
> way.

I thought it would be preferable to make use of/extend core
functionality where possible but if that's not the case we can find
another way.

Do you have any suggestions for how to correctly interact with the core
functions?

Is adding a suspend_cancel operation to just at the struct xenbus_driver
level and introducing a xen specific function to walk to the bus the
sort of thing you were thinking of? (it seems reasonable). Should we be
doing anything with dpm_*_list in that case?

(FWIW original thread is on xen-devel at
http://thread.gmane.org/gmane.comp.emulators.xen.devel/95265/)

Ian.

-- 
Ian Campbell
Current Noise: Crowbar - Dead Sun

Why on earth do people buy old bottles of wine when they can get a
fresh one for a quarter of the price?


  reply	other threads:[~2011-02-08 11:23 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20110207.180718.885783824287845425.kaz@jp.fujitsu.com>
2011-02-07  9:35 ` Rafael J. Wysocki
2011-02-08 11:22   ` Ian Campbell [this message]
2011-02-08 16:46     ` [linux-pm] " Alan Stern
2011-02-08 17:35       ` Ian Campbell
2011-02-09 23:16         ` Brendan Cully
2011-02-09 23:42           ` Alan Stern
2011-02-10 11:40             ` [Xen-devel] " Ian Campbell
2011-02-10 16:00               ` Alan Stern
2011-02-10 16:26                 ` Rafael J. Wysocki
2011-02-10 16:34                 ` Ian Campbell
2011-02-10 17:01                   ` Rafael J. Wysocki
2011-02-10 18:56                   ` Alan Stern
2011-02-10 11:31           ` Ian Campbell
2011-02-10 17:53             ` Brendan Cully

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1297164178.9388.65.camel@zakaz.uk.xensource.com \
    --to=ijc@hellion.org.uk \
    --cc=greg@kroah.com \
    --cc=kaz@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@lists.linux-foundation.org \
    --cc=rjw@sisk.pl \
    --cc=xen-devel@lists.xensource.com \
    --subject='Re: [PATCH 0/2] Fix hangup after creating checkpoint on Xen.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).