LKML Archive on lore.kernel.org
From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Claudio Imbrenda <imbrenda@linux.ibm.com>, kvm@vger.kernel.org
Cc: cohuck@redhat.com, frankja@linux.ibm.com, thuth@redhat.com,
	pasic@linux.ibm.com, david@redhat.com,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	Ulrich.Weigand@de.ibm.com
Subject: Re: [PATCH v4 05/14] KVM: s390: pv: leak the ASCE page when destroy fails
Date: Mon, 6 Sep 2021 17:32:36 +0200
Message-ID: <36ce2f10-a65d-ff2a-3a11-8f2cd853f3e9@de.ibm.com>
In-Reply-To: <20210818132620.46770-6-imbrenda@linux.ibm.com>

The subject should say:

KVM: s390: pv: leak the topmost page table when destroy fails


On 18.08.21 15:26, Claudio Imbrenda wrote:
> When a protected VM is created, the topmost level of page tables of its
> ASCE is marked by the Ultravisor; any attempt to use that memory for
> protected virtualization will result in failure.


Maybe rephrase that to:

Each secure guest must have a unique address space control element (ASCE),
and we must prevent new guests from reusing an ASCE that is still in use,
as that would result in an error. As the ASCE mostly consists of the
address of the topmost page table (plus flags), we must not return that
memory to the pool until the ASCE is no longer used.

Only a successful Destroy Configuration UVC will make the ASCE no longer
collide. When the Destroy Configuration UVC fails, the ASCE cannot be
reused for a secure guest. To avoid a collision, it must not be used again.

  
> Only a successful Destroy Configuration UVC will remove the marking.
> 
> When the Destroy Configuration UVC fails, the topmost level of page
> tables of the VM does not get its marking cleared; to avoid issues it
> must not be used again.
> 
> This is a permanent error and the page becomes in practice unusable, so
> we set it aside and leak it.

Maybe add: on failure we already leak other memory that carries an
Ultravisor marking (the variable and base storage for a guest); not setting
the ASCE aside (by leaking the topmost page table) was an oversight.

Or something like that.

Maybe also add that we do not expect to see such an error under normal
circumstances.

> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
>   arch/s390/include/asm/gmap.h |  2 ++
>   arch/s390/kvm/pv.c           |  4 ++-
>   arch/s390/mm/gmap.c          | 55 ++++++++++++++++++++++++++++++++++++
>   3 files changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
> index 40264f60b0da..746e18bf8984 100644
> --- a/arch/s390/include/asm/gmap.h
> +++ b/arch/s390/include/asm/gmap.h
> @@ -148,4 +148,6 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4],
>   			     unsigned long gaddr, unsigned long vmaddr);
>   int gmap_mark_unmergeable(void);
>   void s390_reset_acc(struct mm_struct *mm);
> +void s390_remove_old_asce(struct gmap *gmap);
> +int s390_replace_asce(struct gmap *gmap);
>   #endif /* _ASM_S390_GMAP_H */
> diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
> index 00d272d134c2..76b0d64ce8fa 100644
> --- a/arch/s390/kvm/pv.c
> +++ b/arch/s390/kvm/pv.c
> @@ -168,9 +168,11 @@ int kvm_s390_pv_deinit_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
>   	atomic_set(&kvm->mm->context.is_protected, 0);
>   	KVM_UV_EVENT(kvm, 3, "PROTVIRT DESTROY VM: rc %x rrc %x", *rc, *rrc);
>   	WARN_ONCE(cc, "protvirt destroy vm failed rc %x rrc %x", *rc, *rrc);
> -	/* Inteded memory leak on "impossible" error */
> +	/* Intended memory leak on "impossible" error */
>   	if (!cc)
>   		kvm_s390_pv_dealloc_vm(kvm);
> +	else
> +		s390_replace_asce(kvm->arch.gmap);
>   	return cc ? -EIO : 0;
>   }
>   
> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
> index 9bb2c7512cd5..5a138f6220c4 100644
> --- a/arch/s390/mm/gmap.c
> +++ b/arch/s390/mm/gmap.c
> @@ -2706,3 +2706,58 @@ void s390_reset_acc(struct mm_struct *mm)
>   	mmput(mm);
>   }
>   EXPORT_SYMBOL_GPL(s390_reset_acc);
> +
> +/*
> + * Remove the topmost level of page tables from the list of page tables of
> + * the gmap.
> + * This means that it will not be freed when the VM is torn down, and needs
> + * to be handled separately by the caller, unless an intentional leak is
> + * intended.
> + */
> +void s390_remove_old_asce(struct gmap *gmap)
> +{
> +	struct page *old;
> +
> +	old = virt_to_page(gmap->table);
> +	spin_lock(&gmap->guest_table_lock);
> +	list_del(&old->lru);
> +	spin_unlock(&gmap->guest_table_lock);
> +	/* in case the ASCE needs to be "removed" multiple times */
> +	INIT_LIST_HEAD(&old->lru);

Shouldn't the INIT_LIST_HEAD() also be done under the spin_lock?
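
To illustrate, here is a minimal userspace sketch, not kernel code. All
names (toy_gmap, toy_lock, list_del_and_init, ...) are hypothetical
stand-ins, and the atomic_flag only models guest_table_lock. The point is
that the unlink and the re-initialisation happen in one critical section,
which is what the kernel's list_del_init() would give you in a single call:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *e) { e->next = e; e->prev = e; }

static void list_add_head(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* Unlink and re-initialise in one step, modelled on list_del_init(). */
static void list_del_and_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* Hypothetical toy model of the relevant gmap fields. */
struct toy_gmap {
	atomic_flag lock;		/* stands in for guest_table_lock */
	struct list_head crst_list;
};

static void toy_gmap_init(struct toy_gmap *g)
{
	atomic_flag_clear(&g->lock);
	init_list_head(&g->crst_list);
}

static void toy_lock(struct toy_gmap *g)
{
	while (atomic_flag_test_and_set(&g->lock))
		; /* spin */
}

static void toy_unlock(struct toy_gmap *g) { atomic_flag_clear(&g->lock); }

/* Both the unlink and the re-init are done while holding the lock. */
static void toy_remove_old_asce(struct toy_gmap *g, struct list_head *top)
{
	toy_lock(g);
	list_del_and_init(top);
	toy_unlock(g);
}

static bool toy_is_unlinked(const struct list_head *e)
{
	return e->next == e && e->prev == e;
}
```

With the entry self-linked after removal, calling toy_remove_old_asce() a
second time is harmless, which matches the "removed multiple times" comment
in the patch.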

> +}
> +EXPORT_SYMBOL_GPL(s390_remove_old_asce);
> +
> +/*
> + * Try to replace the current ASCE with another equivalent one.
> + * If the allocation of the new top level page table fails, the ASCE is not
> + * replaced.
> + * In any case, the old ASCE is removed from the list, therefore the caller
> + * has to make sure to save a pointer to it beforehands, unless an
> + * intentional leak is intended.
> + */
> +int s390_replace_asce(struct gmap *gmap)
> +{
> +	unsigned long asce;
> +	struct page *page;
> +	void *table;
> +
> +	s390_remove_old_asce(gmap);
> +
> +	page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER);
> +	if (!page)
> +		return -ENOMEM;

It seems that our caller does not handle this error? kvm_s390_pv_deinit_vm()
ignores the return value of s390_replace_asce().

> +	table = page_to_virt(page);
> +	memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));
> +
> +	spin_lock(&gmap->guest_table_lock);
> +	list_add(&page->lru, &gmap->crst_list);
> +	spin_unlock(&gmap->guest_table_lock);
> +
> +	asce = (gmap->asce & ~PAGE_MASK) | __pa(table);

Better to use _ASCE_ORIGIN instead of PAGE_MASK?
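
As a rough userspace sketch of the intent (not the kernel code):
ASCE_ORIGIN_MASK below is a stand-in that I am assuming mirrors
_ASCE_ORIGIN, i.e. everything above the low 12 bits, and
asce_replace_origin() is a hypothetical helper. With 4K pages I believe
both masks select the same bits, so this is about naming the intent
(table origin vs. flag bits) rather than changing behaviour:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's _ASCE_ORIGIN (table origin bits). */
#define ASCE_ORIGIN_MASK (~UINT64_C(0xfff))

/*
 * Hypothetical helper: keep the non-origin (flag) bits of the old ASCE
 * and swap in the physical address of the new topmost table.
 */
static uint64_t asce_replace_origin(uint64_t old_asce, uint64_t new_table_pa)
{
	/* The new table must be aligned so it cannot spill into the flag bits. */
	assert((new_table_pa & ~ASCE_ORIGIN_MASK) == 0);
	return (old_asce & ~ASCE_ORIGIN_MASK) | new_table_pa;
}
```
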
> +	WRITE_ONCE(gmap->asce, asce);
> +	WRITE_ONCE(gmap->mm->context.gmap_asce, asce);
> +	WRITE_ONCE(gmap->table, table);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(s390_replace_asce);
> 
