LKML Archive on lore.kernel.org
* [PATCH v3 0/3] s390x: KVM: CPU Topology
@ 2021-08-03  8:26 Pierre Morel
  2021-08-03  8:26 ` [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information Pierre Morel
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Pierre Morel @ 2021-08-03  8:26 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, david,
	thuth, imbrenda, hca, gor, pmorel

Hi all,

This new series adds the implementation of interpretation for
the PTF instruction.

The series is divided into three parts:
1- handling of the STSI instruction forwarding the CPU topology
2- implementation of the interpretation of the PTF instruction
3- use of the PTF interpretation to optimize topology change callback

1- STSI
To provide Topology information to the guest through the STSI
instruction, we need to forward STSI with Function Code 15 to
QEMU which will take care to provide the right information to
the guest.

To let the guest use both the PTF instruction, to check whether a topology
change occurred, and the STSI_15.x.x instruction, we add a new KVM
capability to enable the topology facility.
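
The STSI forwarding rule described above can be illustrated with a small
standalone model. This is only a sketch: the enum and the function name
stsi_dispatch are hypothetical and do not exist in the kernel, and the
guest_has_fac11 flag stands in for test_kvm_facility(vcpu->kvm, 11).

```c
#include <stdbool.h>

/* Hypothetical outcome codes for this sketch; not kernel identifiers. */
enum stsi_action {
	STSI_REJECT_CC3,	/* set condition code 3, store nothing */
	STSI_HANDLE_KERNEL,	/* fc 0..3: handled inside the kernel */
	STSI_FORWARD_USER,	/* fc 15: exit to userland (e.g. QEMU) */
};

/*
 * Mirrors the condition used in patch 1/3: function codes above 3 are
 * rejected, except fc 15, which is accepted only when the guest has
 * facility 11 (configuration topology).
 */
static enum stsi_action stsi_dispatch(unsigned int fc, bool guest_has_fac11)
{
	if ((fc > 3 && fc != 15) || (fc == 15 && !guest_has_fac11))
		return STSI_REJECT_CC3;
	if (fc == 15)
		return STSI_FORWARD_USER;
	return STSI_HANDLE_KERNEL;
}
```

In this model, fc 15 without facility 11 behaves exactly like any other
unsupported function code, which is what an older guest configuration
would expect.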

2- PTF
To implement PTF interpretation we make the MTCR pending when the
host CPU backing the vCPU has changed from one socket to another.

The PTF instruction will report a topology change if there is any change
with a previous STSI_15_2 SYSIB.
Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or cleared
inside the CPU Topology List Entry CPU mask field, which happens with
changes in CPU polarization, dedication, CPU types and adding or
removing CPUs in a socket.

The reporting to the guest is done using the Multiprocessor
Topology-Change-Report (MTCR) bit of the utility entry of the guest's
SCA which will be cleared during the interpretation of PTF.

To check if the topology has been modified we use a new field of the
arch vCPU to save the previous real CPU ID at the end of a schedule
and verify on next schedule that the CPU used is in the same socket.

We deliberately ignore:
- polarization: only horizontal polarization is currently used in Linux.
- CPU Type: only the IFL type is supported in Linux.
- Dedication: we consider that only a completely dedicated CPU stack can
  benefit from the CPU Topology and let the admin take care of that.
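
The socket check described above can be sketched as a standalone model.
Everything here is an assumption made for illustration: the struct
vcpu_model, the socket_of table and the 8-CPU layout are hypothetical, and
the mtcr flag merely stands in for the MTCR bit in the guest's SCA. The
guard on prev_cpu >= 0 is also an addition of this sketch and is not part
of the patch.

```c
#include <stdbool.h>

#define NR_CPUS 8	/* hypothetical host size for this sketch */

/* Hypothetical host layout: CPUs 0-3 on socket 0, CPUs 4-7 on socket 1. */
static const int socket_of[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

struct vcpu_model {
	int cpu;		/* host CPU backing the vCPU, -1 if none */
	int prev_cpu;		/* host CPU saved at the last schedule-out */
	bool ptf_interpreted;	/* stands in for ECB_PTF being set */
	bool mtcr;		/* stands in for the MTCR bit in the SCA */
};

/* Schedule-in: report a topology change if the socket differs. */
static void vcpu_load(struct vcpu_model *v, int cpu)
{
	v->cpu = cpu;
	if (v->ptf_interpreted && v->prev_cpu >= 0 && cpu != v->prev_cpu &&
	    socket_of[cpu] != socket_of[v->prev_cpu])
		v->mtcr = true;
}

/* Schedule-out: remember which host CPU was used. */
static void vcpu_put(struct vcpu_model *v)
{
	v->prev_cpu = v->cpu;
	v->cpu = -1;
}
```

Moving between two CPUs of the same socket leaves mtcr untouched; only a
move across sockets makes the report pending.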


Regards,
Pierre


Pierre Morel (3):
  s390x: KVM: accept STSI for CPU topology information
  s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  s390x: optimization of the check for CPU topology change

 arch/s390/include/asm/kvm_host.h | 14 +++++++---
 arch/s390/kernel/topology.c      |  3 ++
 arch/s390/kvm/kvm-s390.c         | 48 +++++++++++++++++++++++++++++++-
 arch/s390/kvm/priv.c             |  7 ++++-
 arch/s390/kvm/vsie.c             |  3 ++
 include/uapi/linux/kvm.h         |  1 +
 6 files changed, 70 insertions(+), 6 deletions(-)

-- 
2.25.1

Changelog:

from v2 to v3

- use PTF interpretation
  (Christian)

- optimize arch_update_cpu_topology using PTF
  (Pierre)

from v1 to v2:

- Add a KVM capability to let QEMU know we support PTF and STSI 15
  (David)

- check KVM facility 11 before accepting STSI fc 15
  (David)

- handle all we can in userland
  (David)

- add tracing to STSI fc 15
  (Connie)



* [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information
  2021-08-03  8:26 [PATCH v3 0/3] s390x: KVM: CPU Topology Pierre Morel
@ 2021-08-03  8:26 ` Pierre Morel
  2021-08-31 13:59   ` David Hildenbrand
  2021-08-03  8:26 ` [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report Pierre Morel
  2021-08-03  8:26 ` [PATCH v3 3/3] s390x: optimization of the check for CPU topology change Pierre Morel
  2 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-08-03  8:26 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, david,
	thuth, imbrenda, hca, gor, pmorel

STSI(15.1.x) gives information on the CPU configuration topology.
Let's accept the interception of STSI with the function code 15 and
let the userland part of the hypervisor handle it when userland
supports the CPU Topology facility.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 arch/s390/kvm/priv.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 9928f785c677..8581b6881212 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -856,7 +856,8 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
 		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
 
-	if (fc > 3) {
+	if ((fc > 3 && fc != 15) ||
+	    (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))) {
 		kvm_s390_set_psw_cc(vcpu, 3);
 		return 0;
 	}
@@ -893,6 +894,10 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
 			goto out_no_data;
 		handle_stsi_3_2_2(vcpu, (void *) mem);
 		break;
+	case 15:
+		trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
+		insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
+		return -EREMOTE;
 	}
 	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
 		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
-- 
2.25.1



* [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-08-03  8:26 [PATCH v3 0/3] s390x: KVM: CPU Topology Pierre Morel
  2021-08-03  8:26 ` [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information Pierre Morel
@ 2021-08-03  8:26 ` Pierre Morel
  2021-08-31 14:03   ` David Hildenbrand
  2021-09-06 18:37   ` David Hildenbrand
  2021-08-03  8:26 ` [PATCH v3 3/3] s390x: optimization of the check for CPU topology change Pierre Morel
  2 siblings, 2 replies; 25+ messages in thread
From: Pierre Morel @ 2021-08-03  8:26 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, david,
	thuth, imbrenda, hca, gor, pmorel

We let the userland hypervisor know if the machine supports the CPU
topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.

The PTF instruction will report a topology change if there is any change
with a previous STSI_15_2 SYSIB.
Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or cleared
inside the CPU Topology List Entry CPU mask field, which happens with
changes in CPU polarization, dedication, CPU types and adding or
removing CPUs in a socket.

The reporting to the guest is done using the Multiprocessor
Topology-Change-Report (MTCR) bit of the utility entry of the guest's
SCA which will be cleared during the interpretation of PTF.

To check if the topology has been modified we use a new field of the
arch vCPU to save the previous real CPU ID at the end of a schedule
and verify on next schedule that the CPU used is in the same socket.

We deliberately ignore:
- polarization: only horizontal polarization is currently used in Linux.
- CPU Type: only the IFL type is supported in Linux.
- Dedication: we consider that only a completely dedicated CPU stack can
  benefit from the CPU Topology.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 arch/s390/include/asm/kvm_host.h | 14 +++++++---
 arch/s390/kvm/kvm-s390.c         | 48 +++++++++++++++++++++++++++++++-
 arch/s390/kvm/vsie.c             |  3 ++
 include/uapi/linux/kvm.h         |  1 +
 4 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 9b4473f76e56..b7effdc96a7a 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -95,15 +95,19 @@ struct bsca_block {
 	union ipte_control ipte_control;
 	__u64	reserved[5];
 	__u64	mcn;
-	__u64	reserved2;
+#define ESCA_UTILITY_MTCR	0x8000
+	__u16	utility;
+	__u8	reserved2[6];
 	struct bsca_entry cpu[KVM_S390_BSCA_CPU_SLOTS];
 };
 
 struct esca_block {
 	union ipte_control ipte_control;
-	__u64   reserved1[7];
+	__u64   reserved1[6];
+	__u16	utility;
+	__u8	reserved2[6];
 	__u64   mcn[4];
-	__u64   reserved2[20];
+	__u64   reserved3[20];
 	struct esca_entry cpu[KVM_S390_ESCA_CPU_SLOTS];
 };
 
@@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
 	__u8	icptcode;		/* 0x0050 */
 	__u8	icptstatus;		/* 0x0051 */
 	__u16	ihcpu;			/* 0x0052 */
-	__u8	reserved54;		/* 0x0054 */
+	__u8	mtcr;			/* 0x0054 */
 #define IICTL_CODE_NONE		 0x00
 #define IICTL_CODE_MCHK		 0x01
 #define IICTL_CODE_EXT		 0x02
@@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
 #define ECB_TE		0x10
 #define ECB_SRSI	0x04
 #define ECB_HOSTPROTINT	0x02
+#define ECB_PTF		0x01
 	__u8	ecb;			/* 0x0061 */
 #define ECB2_CMMA	0x80
 #define ECB2_IEP	0x20
@@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
 	bool skey_enabled;
 	struct kvm_s390_pv_vcpu pv;
 	union diag318_info diag318_info;
+	int prev_cpu;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index b655a7d82bf0..ff6d8a2b511c 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_S390_VCPU_RESETS:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_S390_DIAG318:
+	case KVM_CAP_S390_CPU_TOPOLOGY:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
@@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 		icpt_operexc_on_all_vcpus(kvm);
 		r = 0;
 		break;
+	case KVM_CAP_S390_CPU_TOPOLOGY:
+		mutex_lock(&kvm->lock);
+		if (kvm->created_vcpus) {
+			r = -EBUSY;
+		} else {
+			set_kvm_facility(kvm->arch.model.fac_mask, 11);
+			set_kvm_facility(kvm->arch.model.fac_list, 11);
+			r = 0;
+		}
+		mutex_unlock(&kvm->lock);
+		VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
+			 r ? "(not available)" : "(success)");
+		break;
+
+		r = -EINVAL;
+		break;
+
 	default:
 		r = -EINVAL;
 		break;
@@ -3067,18 +3085,41 @@ __u64 kvm_s390_get_cpu_timer(struct kvm_vcpu *vcpu)
 	return value;
 }
 
-void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+static void kvm_s390_set_mtcr(struct kvm_vcpu *vcpu)
 {
+	struct esca_block *esca = vcpu->kvm->arch.sca;
+
+	if (vcpu->arch.sie_block->ecb & ECB_PTF) {
+		ipte_lock(vcpu);
+		WRITE_ONCE(esca->utility, ESCA_UTILITY_MTCR);
+		ipte_unlock(vcpu);
+	}
+}
 
+void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+{
 	gmap_enable(vcpu->arch.enabled_gmap);
 	kvm_s390_set_cpuflags(vcpu, CPUSTAT_RUNNING);
 	if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
 		__start_cpu_timer_accounting(vcpu);
 	vcpu->cpu = cpu;
+
+	/*
+	 * With PTF interpretation the guest will be aware of topology
+	 * change by the Multiprocessor Topology-Change-Report is pending.
+	 * Check for reasons to make the MTCR pending and make it pending.
+	 */
+	if ((vcpu->arch.sie_block->ecb & ECB_PTF) &&
+	    cpu != vcpu->arch.prev_cpu) {
+		if (cpu_topology[cpu].socket_id !=
+		    cpu_topology[vcpu->arch.prev_cpu].socket_id)
+			kvm_s390_set_mtcr(vcpu);
+	}
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	vcpu->arch.prev_cpu = vcpu->cpu;
 	vcpu->cpu = -1;
 	if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
 		__stop_cpu_timer_accounting(vcpu);
@@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
 		vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
 	if (test_kvm_facility(vcpu->kvm, 9))
 		vcpu->arch.sie_block->ecb |= ECB_SRSI;
+
+	/* PTF needs both host and guest facilities to enable interpretation */
+	if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
+		vcpu->arch.sie_block->ecb |= ECB_PTF;
+
 	if (test_kvm_facility(vcpu->kvm, 73))
 		vcpu->arch.sie_block->ecb |= ECB_TE;
 
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index 4002a24bc43a..50d67190bf65 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 	/* Host-protection-interruption introduced with ESOP */
 	if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
 		scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
+	/* CPU Topology */
+	if (test_kvm_facility(vcpu->kvm, 11))
+		scb_s->ecb |= scb_o->ecb & ECB_PTF;
 	/* transactional execution */
 	if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
 		/* remap the prefix is tx is toggled on */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d9e4aabcb31a..081ce0cd44b9 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_BINARY_STATS_FD 203
 #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
 #define KVM_CAP_ARM_MTE 205
+#define KVM_CAP_S390_CPU_TOPOLOGY 206
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.25.1



* [PATCH v3 3/3] s390x: optimization of the check for CPU topology change
  2021-08-03  8:26 [PATCH v3 0/3] s390x: KVM: CPU Topology Pierre Morel
  2021-08-03  8:26 ` [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information Pierre Morel
  2021-08-03  8:26 ` [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report Pierre Morel
@ 2021-08-03  8:26 ` Pierre Morel
  2021-08-03  8:42   ` Heiko Carstens
  2 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-08-03  8:26 UTC (permalink / raw)
  To: kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, david,
	thuth, imbrenda, hca, gor, pmorel

Now that the PTF instruction is interpreted by the SIE we can optimize
the arch_update_cpu_topology callback to check if there is a real need
to update the topology by using the PTF instruction.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 arch/s390/kernel/topology.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
index 26aa2614ee35..741cb447e78e 100644
--- a/arch/s390/kernel/topology.c
+++ b/arch/s390/kernel/topology.c
@@ -322,6 +322,9 @@ int arch_update_cpu_topology(void)
 	struct device *dev;
 	int cpu, rc;
 
+	if (!ptf(PTF_CHECK))
+		return 0;
+
 	rc = __arch_update_cpu_topology();
 	on_each_cpu(__arch_update_dedicated_flag, NULL, 0);
 	for_each_online_cpu(cpu) {
-- 
2.25.1



* Re: [PATCH v3 3/3] s390x: optimization of the check for CPU topology change
  2021-08-03  8:26 ` [PATCH v3 3/3] s390x: optimization of the check for CPU topology change Pierre Morel
@ 2021-08-03  8:42   ` Heiko Carstens
  2021-08-03  8:57     ` Pierre Morel
  0 siblings, 1 reply; 25+ messages in thread
From: Heiko Carstens @ 2021-08-03  8:42 UTC (permalink / raw)
  To: Pierre Morel
  Cc: kvm, linux-s390, linux-kernel, borntraeger, frankja, cohuck,
	david, thuth, imbrenda, gor

On Tue, Aug 03, 2021 at 10:26:46AM +0200, Pierre Morel wrote:
> Now that the PTF instruction is interpreted by the SIE we can optimize
> the arch_update_cpu_topology callback to check if there is a real need
> to update the topology by using the PTF instruction.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  arch/s390/kernel/topology.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
> index 26aa2614ee35..741cb447e78e 100644
> --- a/arch/s390/kernel/topology.c
> +++ b/arch/s390/kernel/topology.c
> @@ -322,6 +322,9 @@ int arch_update_cpu_topology(void)
>  	struct device *dev;
>  	int cpu, rc;
>  
> +	if (!ptf(PTF_CHECK))
> +		return 0;
> +

We have a timer which checks if topology changed and then triggers a
call to arch_update_cpu_topology() via rebuild_sched_domains().
With this change topology changes would get lost.


* Re: [PATCH v3 3/3] s390x: optimization of the check for CPU topology change
  2021-08-03  8:42   ` Heiko Carstens
@ 2021-08-03  8:57     ` Pierre Morel
  2021-08-03  9:28       ` Pierre Morel
  0 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-08-03  8:57 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: kvm, linux-s390, linux-kernel, borntraeger, frankja, cohuck,
	david, thuth, imbrenda, gor



On 8/3/21 10:42 AM, Heiko Carstens wrote:
> On Tue, Aug 03, 2021 at 10:26:46AM +0200, Pierre Morel wrote:
>> Now that the PTF instruction is interpreted by the SIE we can optimize
>> the arch_update_cpu_topology callback to check if there is a real need
>> to update the topology by using the PTF instruction.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   arch/s390/kernel/topology.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
>> index 26aa2614ee35..741cb447e78e 100644
>> --- a/arch/s390/kernel/topology.c
>> +++ b/arch/s390/kernel/topology.c
>> @@ -322,6 +322,9 @@ int arch_update_cpu_topology(void)
>>   	struct device *dev;
>>   	int cpu, rc;
>>   
>> +	if (!ptf(PTF_CHECK))
>> +		return 0;
>> +
> 
> We have a timer which checks if topology changed and then triggers a
> call to arch_update_cpu_topology() via rebuild_sched_domains().
> With this change topology changes would get lost.

As I understand it, if the PTF check returns 0, it means that there are no
topology changes.
So they could not get lost.

What did I miss?


-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v3 3/3] s390x: optimization of the check for CPU topology change
  2021-08-03  8:57     ` Pierre Morel
@ 2021-08-03  9:28       ` Pierre Morel
  0 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-08-03  9:28 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: kvm, linux-s390, linux-kernel, borntraeger, frankja, cohuck,
	david, thuth, imbrenda, gor



On 8/3/21 10:57 AM, Pierre Morel wrote:
> 
> 
> On 8/3/21 10:42 AM, Heiko Carstens wrote:
>> On Tue, Aug 03, 2021 at 10:26:46AM +0200, Pierre Morel wrote:
>>> Now that the PTF instruction is interpreted by the SIE we can optimize
>>> the arch_update_cpu_topology callback to check if there is a real need
>>> to update the topology by using the PTF instruction.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>> ---
>>>   arch/s390/kernel/topology.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
>>> index 26aa2614ee35..741cb447e78e 100644
>>> --- a/arch/s390/kernel/topology.c
>>> +++ b/arch/s390/kernel/topology.c
>>> @@ -322,6 +322,9 @@ int arch_update_cpu_topology(void)
>>>       struct device *dev;
>>>       int cpu, rc;
>>> +    if (!ptf(PTF_CHECK))
>>> +        return 0;
>>> +
>>
>> We have a timer which checks if topology changed and then triggers a
>> call to arch_update_cpu_topology() via rebuild_sched_domains().
>> With this change topology changes would get lost.
> 
> As I understand it, if the PTF check returns 0, it means that there are no
> topology changes.
> So they could not get lost.
> 
> What did I miss?
> 
> 
I missed that PTF clears the MTCR... and only one of the two calls will
return 1 while we need both to return 1...
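
The problem can be reproduced with a toy model. This is a sketch under
assumptions: ptf_check_model() is a hypothetical stand-in for
ptf(PTF_CHECK) that returns the pending report and clears it, and the two
other functions only imitate the timer and the guarded
arch_update_cpu_topology() from patch 3/3; none of these names exist in
the kernel.

```c
#include <stdbool.h>

static bool mtcr_pending;	/* models the topology-change report */
static int rebuilds;		/* counts topology rebuilds actually done */

/* Toy stand-in for ptf(PTF_CHECK): test the report, then clear it. */
static int ptf_check_model(void)
{
	bool was_pending = mtcr_pending;

	mtcr_pending = false;
	return was_pending;
}

/* Models arch_update_cpu_topology() with the early return from 3/3. */
static void update_topology_guarded(void)
{
	if (!ptf_check_model())
		return;		/* second check sees 0: update is skipped */
	rebuilds++;
}

/* Models the timer: it checks first, then triggers the update. */
static void topology_timer_model(void)
{
	if (ptf_check_model())	/* first check consumes the report */
		update_topology_guarded();
}
```

With one pending report, the timer's check returns 1 and clears it, so the
guarded update sees 0 and returns early: the change is lost, as Heiko
pointed out.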


-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information
  2021-08-03  8:26 ` [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information Pierre Morel
@ 2021-08-31 13:59   ` David Hildenbrand
  2021-09-01  9:43     ` Pierre Morel
  0 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2021-08-31 13:59 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor

On 03.08.21 10:26, Pierre Morel wrote:
> STSI(15.1.x) gives information on the CPU configuration topology.
> Let's accept the interception of STSI with the function code 15 and
> let the userland part of the hypervisor handle it when userland
> supports the CPU Topology facility.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>   arch/s390/kvm/priv.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 9928f785c677..8581b6881212 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -856,7 +856,8 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>   	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>   		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>   
> -	if (fc > 3) {
> +	if ((fc > 3 && fc != 15) ||
> +	    (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))) {
>   		kvm_s390_set_psw_cc(vcpu, 3);
>   		return 0;
>   	}
> @@ -893,6 +894,10 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>   			goto out_no_data;
>   		handle_stsi_3_2_2(vcpu, (void *) mem);
>   		break;
> +	case 15:
> +		trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
> +		insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
> +		return -EREMOTE;
>   	}
>   	if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>   		memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
> 

Sorry, I'm a bit rusty on s390x kvm facility handling.


For test_kvm_facility() to succeed, the facility has to be in both:

a) fac_mask: actually available on the HW and supported by KVM 
(kvm_s390_fac_base via FACILITIES_KVM, kvm_s390_fac_ext via 
FACILITIES_KVM_CPUMODEL)

b) fac_list: enabled for a VM

AFAIU, facility 11 is neither in FACILITIES_KVM nor 
FACILITIES_KVM_CPUMODEL, and I remember it's a hypervisor-managed bit.

So unless we unlock facility 11 in FACILITIES_KVM_CPUMODEL, will 
test_kvm_facility(vcpu->kvm, 11) ever successfully trigger here?
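
The two-list check described above can be sketched as follows. This is a
simplified model, not the kernel code: struct vm_model, fac_test(),
fac_set(), vm_test_facility() and FAC_BYTES are hypothetical names and
sizes. It assumes the usual s390 convention that facility bits are
numbered from the most significant bit of byte 0.

```c
#include <assert.h>
#include <stdbool.h>

#define FAC_BYTES 32	/* hypothetical list size: 256 facility bits */

/* Facility nr lives in byte nr / 8, counted from the MSB of that byte. */
static bool fac_test(const unsigned char *fac, unsigned int nr)
{
	assert(nr < FAC_BYTES * 8);
	return (fac[nr >> 3] & (0x80 >> (nr & 7))) != 0;
}

static void fac_set(unsigned char *fac, unsigned int nr)
{
	assert(nr < FAC_BYTES * 8);
	fac[nr >> 3] |= (unsigned char)(0x80 >> (nr & 7));
}

struct vm_model {
	unsigned char fac_mask[FAC_BYTES];	/* what KVM may offer */
	unsigned char fac_list[FAC_BYTES];	/* what this VM enabled */
};

/* A facility counts for the guest only if both lists have the bit. */
static bool vm_test_facility(const struct vm_model *vm, unsigned int nr)
{
	return fac_test(vm->fac_mask, nr) && fac_test(vm->fac_list, nr);
}
```

In this model, setting bit 11 in fac_list alone is not enough; the check
succeeds only once the bit is also present in fac_mask.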


I'm pretty sure I am messing something up :)

-- 
Thanks,

David / dhildenb



* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-08-03  8:26 ` [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report Pierre Morel
@ 2021-08-31 14:03   ` David Hildenbrand
  2021-09-01  9:46     ` Pierre Morel
  2021-09-06 18:37   ` David Hildenbrand
  1 sibling, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2021-08-31 14:03 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor

On 03.08.21 10:26, Pierre Morel wrote:
> We let the userland hypervisor know if the machine supports the CPU
> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
> 
> The PTF instruction will report a topology change if there is any change
> with a previous STSI_15_2 SYSIB.
> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or cleared
> inside the CPU Topology List Entry CPU mask field, which happens with
> changes in CPU polarization, dedication, CPU types and adding or
> removing CPUs in a socket.
> 
> The reporting to the guest is done using the Multiprocessor
> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
> SCA which will be cleared during the interpretation of PTF.
> 
> To check if the topology has been modified we use a new field of the
> arch vCPU to save the previous real CPU ID at the end of a schedule
> and verify on next schedule that the CPU used is in the same socket.
> 
> We deliberately ignore:
> - polarization: only horizontal polarization is currently used in Linux.
> - CPU Type: only the IFL type is supported in Linux.
> - Dedication: we consider that only a completely dedicated CPU stack can
>    benefit from the CPU Topology.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_host.h | 14 +++++++---
>   arch/s390/kvm/kvm-s390.c         | 48 +++++++++++++++++++++++++++++++-
>   arch/s390/kvm/vsie.c             |  3 ++
>   include/uapi/linux/kvm.h         |  1 +
>   4 files changed, 61 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 9b4473f76e56..b7effdc96a7a 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -95,15 +95,19 @@ struct bsca_block {
>   	union ipte_control ipte_control;
>   	__u64	reserved[5];
>   	__u64	mcn;
> -	__u64	reserved2;
> +#define ESCA_UTILITY_MTCR	0x8000
> +	__u16	utility;
> +	__u8	reserved2[6];
>   	struct bsca_entry cpu[KVM_S390_BSCA_CPU_SLOTS];
>   };
>   
>   struct esca_block {
>   	union ipte_control ipte_control;
> -	__u64   reserved1[7];
> +	__u64   reserved1[6];
> +	__u16	utility;
> +	__u8	reserved2[6];
>   	__u64   mcn[4];
> -	__u64   reserved2[20];
> +	__u64   reserved3[20];
>   	struct esca_entry cpu[KVM_S390_ESCA_CPU_SLOTS];
>   };
>   
> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>   	__u8	icptcode;		/* 0x0050 */
>   	__u8	icptstatus;		/* 0x0051 */
>   	__u16	ihcpu;			/* 0x0052 */
> -	__u8	reserved54;		/* 0x0054 */
> +	__u8	mtcr;			/* 0x0054 */
>   #define IICTL_CODE_NONE		 0x00
>   #define IICTL_CODE_MCHK		 0x01
>   #define IICTL_CODE_EXT		 0x02
> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>   #define ECB_TE		0x10
>   #define ECB_SRSI	0x04
>   #define ECB_HOSTPROTINT	0x02
> +#define ECB_PTF		0x01
>   	__u8	ecb;			/* 0x0061 */
>   #define ECB2_CMMA	0x80
>   #define ECB2_IEP	0x20
> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>   	bool skey_enabled;
>   	struct kvm_s390_pv_vcpu pv;
>   	union diag318_info diag318_info;
> +	int prev_cpu;
>   };
>   
>   struct kvm_vm_stat {
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index b655a7d82bf0..ff6d8a2b511c 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>   	case KVM_CAP_S390_VCPU_RESETS:
>   	case KVM_CAP_SET_GUEST_DEBUG:
>   	case KVM_CAP_S390_DIAG318:
> +	case KVM_CAP_S390_CPU_TOPOLOGY:
>   		r = 1;
>   		break;
>   	case KVM_CAP_SET_GUEST_DEBUG2:
> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>   		icpt_operexc_on_all_vcpus(kvm);
>   		r = 0;
>   		break;
> +	case KVM_CAP_S390_CPU_TOPOLOGY:
> +		mutex_lock(&kvm->lock);
> +		if (kvm->created_vcpus) {
> +			r = -EBUSY;
> +		} else {
> +			set_kvm_facility(kvm->arch.model.fac_mask, 11);
> +			set_kvm_facility(kvm->arch.model.fac_list, 11);
> +			r = 0;
> +		}
> +		mutex_unlock(&kvm->lock);
> +		VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
> +			 r ? "(not available)" : "(success)");
> +		break;
> +
> +		r = -EINVAL;
> +		break;
> +
>   	default:
>   		r = -EINVAL;
>   		break;
> @@ -3067,18 +3085,41 @@ __u64 kvm_s390_get_cpu_timer(struct kvm_vcpu *vcpu)
>   	return value;
>   }
>   
> -void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> +static void kvm_s390_set_mtcr(struct kvm_vcpu *vcpu)
>   {
> +	struct esca_block *esca = vcpu->kvm->arch.sca;
> +
> +	if (vcpu->arch.sie_block->ecb & ECB_PTF) {
> +		ipte_lock(vcpu);
> +		WRITE_ONCE(esca->utility, ESCA_UTILITY_MTCR);
> +		ipte_unlock(vcpu);
> +	}
> +}
>   
> +void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> +{
>   	gmap_enable(vcpu->arch.enabled_gmap);
>   	kvm_s390_set_cpuflags(vcpu, CPUSTAT_RUNNING);
>   	if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>   		__start_cpu_timer_accounting(vcpu);
>   	vcpu->cpu = cpu;
> +
> +	/*
> +	 * With PTF interpretation the guest will be aware of topology
> +	 * change by the Multiprocessor Topology-Change-Report is pending.
> +	 * Check for reasons to make the MTCR pending and make it pending.
> +	 */
> +	if ((vcpu->arch.sie_block->ecb & ECB_PTF) &&
> +	    cpu != vcpu->arch.prev_cpu) {
> +		if (cpu_topology[cpu].socket_id !=
> +		    cpu_topology[vcpu->arch.prev_cpu].socket_id)
> +			kvm_s390_set_mtcr(vcpu);
> +	}
>   }
>   
>   void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>   {
> +	vcpu->arch.prev_cpu = vcpu->cpu;
>   	vcpu->cpu = -1;
>   	if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>   		__stop_cpu_timer_accounting(vcpu);
> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>   		vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>   	if (test_kvm_facility(vcpu->kvm, 9))
>   		vcpu->arch.sie_block->ecb |= ECB_SRSI;
> +
> +	/* PTF needs both host and guest facilities to enable interpretation */
> +	if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
> +		vcpu->arch.sie_block->ecb |= ECB_PTF;


Again, doesn't test_kvm_facility(vcpu->kvm, 11) imply that we have host 
support by checking fac_mask?

-- 
Thanks,

David / dhildenb



* Re: [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information
  2021-08-31 13:59   ` David Hildenbrand
@ 2021-09-01  9:43     ` Pierre Morel
  2021-09-06 18:14       ` David Hildenbrand
  0 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-01  9:43 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor



On 8/31/21 3:59 PM, David Hildenbrand wrote:
> On 03.08.21 10:26, Pierre Morel wrote:
>> STSI(15.1.x) gives information on the CPU configuration topology.
>> Let's accept the interception of STSI with the function code 15 and
>> let the userland part of the hypervisor handle it when userland
>> supports the CPU Topology facility.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   arch/s390/kvm/priv.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 9928f785c677..8581b6881212 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -856,7 +856,8 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>       if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>           return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>> -    if (fc > 3) {
>> +    if ((fc > 3 && fc != 15) ||
>> +        (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))) {
>>           kvm_s390_set_psw_cc(vcpu, 3);
>>           return 0;
>>       }
>> @@ -893,6 +894,10 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>               goto out_no_data;
>>           handle_stsi_3_2_2(vcpu, (void *) mem);
>>           break;
>> +    case 15:
>> +        trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
>> +        insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
>> +        return -EREMOTE;
>>       }
>>       if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>>           memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>>
> 
> Sorry, I'm a bit rusty on s390x kvm facility handling.
> 
> 
> For test_kvm_facility() to succeed, the facility has to be in both:
> 
> a) fac_mask: actually available on the HW and supported by KVM 
> (kvm_s390_fac_base via FACILITIES_KVM, kvm_s390_fac_ext via 
> FACILITIES_KVM_CPUMODEL)
> 
> b) fac_list: enabled for a VM
> 
> AFAIU, facility 11 is neither in FACILITIES_KVM nor 
> FACILITIES_KVM_CPUMODEL, and I remember it's a hypervisor-managed bit.
> 
> So unless we unlock facility 11 in FACILITIES_KVM_CPUMODEL, will 
> test_kvm_facility(vcpu->kvm, 11) ever successfully trigger here?
> 
> 
> I'm pretty sure I am messing something up :)
> 

I think this is the same remark Christian made when he wanted me to use
arch/s390/tools/gen_facilities.c to activate the facility.

The point is that CONFIGURATION_TOPOLOGY, STFL, 11, has been defined
inside QEMU since full_GEN10_GA1, so test_kvm_facility() will
succeed once the next patch sets facility 11 in the mask when it
receives KVM_CAP_S390_CPU_TOPOLOGY from userland.

But if we activate it in KVM via any of the FACILITIES_KVM_xxx in
gen_facilities.c, we will activate it for the guest whatever userland
hypervisor we have, including an old QEMU, which will generate an exception.


In these circumstances we have the choice between:

- use FACILITY_KVM and handle everything in kernel
- use FACILITY_KVM and use an extra CAPABILITY to handle part in kernel 
to avoid guest crash and part in userland
- use only the extra CAPABILITY and handle almost everything in userland

I want to avoid kernel code when not really necessary so I eliminated 
the first option.

The last two are not very different, but I found that the last one 
integrates better, since it allows using the standard test_[kvm_]facility().

Thanks for reviewing.

Regards,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-08-31 14:03   ` David Hildenbrand
@ 2021-09-01  9:46     ` Pierre Morel
  0 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-09-01  9:46 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor



On 8/31/21 4:03 PM, David Hildenbrand wrote:
> On 03.08.21 10:26, Pierre Morel wrote:

...snip...


>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, 
>> struct kvm_enable_cap *cap)
>>           icpt_operexc_on_all_vcpus(kvm);
>>           r = 0;
>>           break;
>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>> +        mutex_lock(&kvm->lock);
>> +        if (kvm->created_vcpus) {
>> +            r = -EBUSY;
>> +        } else {
>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>> +            r = 0;
>> +        }
>> +        mutex_unlock(&kvm->lock);
>> +        VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>> +             r ? "(not available)" : "(success)");
>> +        break;
>> +
>> +        r = -EINVAL;
>> +        break;
>> +
>>       default:
>>           r = -EINVAL;
>>           break;

The above enables facility 11.

...snip...

>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu 
>> *vcpu)
>>           vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>       if (test_kvm_facility(vcpu->kvm, 9))
>>           vcpu->arch.sie_block->ecb |= ECB_SRSI;
>> +
>> +    /* PTF needs both host and guest facilities to enable 
>> interpretation */
>> +    if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>> +        vcpu->arch.sie_block->ecb |= ECB_PTF;
> 
> 
> Again, doesn't test_kvm_facility(vcpu->kvm, 11) imply that we have host 
> support by checking fac_mask?
> 

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information
  2021-09-01  9:43     ` Pierre Morel
@ 2021-09-06 18:14       ` David Hildenbrand
  2021-09-07 10:11         ` Pierre Morel
  0 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2021-09-06 18:14 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor

On 01.09.21 11:43, Pierre Morel wrote:
> 
> 
> On 8/31/21 3:59 PM, David Hildenbrand wrote:
>> On 03.08.21 10:26, Pierre Morel wrote:
>>> STSI(15.1.x) gives information on the CPU configuration topology.
>>> Let's accept the interception of STSI with the function code 15 and
>>> let the userland part of the hypervisor handle it when userland
>>> support the CPU Topology facility.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>> ---
>>>    arch/s390/kvm/priv.c | 7 ++++++-
>>>    1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>>> index 9928f785c677..8581b6881212 100644
>>> --- a/arch/s390/kvm/priv.c
>>> +++ b/arch/s390/kvm/priv.c
>>> @@ -856,7 +856,8 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>        if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>>            return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>> -    if (fc > 3) {
>>> +    if ((fc > 3 && fc != 15) ||
>>> +        (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))) {
>>>            kvm_s390_set_psw_cc(vcpu, 3);
>>>            return 0;
>>>        }
>>> @@ -893,6 +894,10 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>                goto out_no_data;
>>>            handle_stsi_3_2_2(vcpu, (void *) mem);
>>>            break;
>>> +    case 15:
>>> +        trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
>>> +        insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
>>> +        return -EREMOTE;
>>>        }
>>>        if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>>>            memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>>>
>>
>> Sorry, I'm a bit rusty on s390x kvm facility handling.
>>
>>
>> For test_kvm_facility() to succeed, the facility has to be in both:
>>
>> a) fac_mask: actually available on the HW and supported by KVM
>> (kvm_s390_fac_base via FACILITIES_KVM, kvm_s390_fac_ext via
>> FACILITIES_KVM_CPUMODEL)
>>
>> b) fac_list: enabled for a VM
>>
>> AFAIU, facility 11 is neither in FACILITIES_KVM nor
>> FACILITIES_KVM_CPUMODEL, and I remember it's a hypervisor-managed bit.
>>
>> So unless we unlock facility 11 in FACILITIES_KVM_CPUMODEL, will
>> test_kvm_facility(vcpu->kvm, 11) ever successfully trigger here?
>>
>>
>> I'm pretty sure I am messing something up :)
>>
> 
> I think it is the same remark that Christian did as wanted me to use the
> arch/s390/tools/gen_facilities.c to activate the facility.
> 
> The point is that CONFIGURATION_TOPOLOGY, STFL, 11, is already defined
> inside QEMU since full_GEN10_GA1, so the test_kvm_facility() will
> succeed with the next patch setting the facility 11 in the mask when
> getting the KVM_CAP_S390_CPU_TOPOLOGY from userland.

Ok, I see ...

QEMU knows the facility and as soon as we present it to QEMU, QEMU will 
want to automatically enable it in the "host" model.

However, we'd like QEMU to join in and handle some part of it.

So indeed, handling it like KVM_CAP_S390_VECTOR_REGISTERS or 
KVM_CAP_S390_RI looks like a reasonable approach.

> 
> But if we activate it in KVM via any of the FACILITIES_KVM_xxx in the
> gen_facilities.c we will activate it for the guest what ever userland
> hypervizor we have, including old QEMU which will generate an exception.
> 
> 
> In this circumstances we have the choice between:
> 
> - use FACILITY_KVM and handle everything in kernel
> - use FACILITY_KVM and use an extra CAPABILITY to handle part in kernel
> to avoid guest crash and part in userland

This sounds quite nice to me. Implement minimal kernel support and 
indicate the facility via stfl to user space.

In addition, add a new capability that intercepts to user space instead.


... but I can understand that it might not be worth it.


This patch as it stands doesn't make any sense on its own. Either 
document how it's supposed to work and why it is currently dead code, or 
simply squash into the next patch (preferred IMHO).
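
To illustrate why the new branch cannot be reached with this patch alone, the
function-code filter can be reduced to a tiny stand-alone model. The names
stsi_fc_rejected and guest_has_fac11 are made up for this sketch;
guest_has_fac11 stands for the result of test_kvm_facility(vcpu->kvm, 11).

```c
#include <stdbool.h>

/*
 * Hypothetical reduction of the check at the top of handle_stsi():
 * returns true when the request is answered with condition code 3.
 */
static bool stsi_fc_rejected(unsigned int fc, bool guest_has_fac11)
{
	return (fc > 3 && fc != 15) || (fc == 15 && !guest_has_fac11);
}
```

As long as nothing sets facility 11 for the guest, guest_has_fac11 is always
false, so fc 15 is always rejected and the "case 15:" branch is never taken.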

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-08-03  8:26 ` [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report Pierre Morel
  2021-08-31 14:03   ` David Hildenbrand
@ 2021-09-06 18:37   ` David Hildenbrand
  2021-09-07 10:24     ` Pierre Morel
                       ` (2 more replies)
  1 sibling, 3 replies; 25+ messages in thread
From: David Hildenbrand @ 2021-09-06 18:37 UTC (permalink / raw)
  To: Pierre Morel, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor

On 03.08.21 10:26, Pierre Morel wrote:
> We let the userland hypervisor know if the machine support the CPU
> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
> 
> The PTF instruction will report a topology change if there is any change
> with a previous STSI_15_2 SYSIB.
> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
> inside the CPU Topology List Entry CPU mask field, which happens with
> changes in CPU polarization, dedication, CPU types and adding or
> removing CPUs in a socket.
> 
> The reporting to the guest is done using the Multiprocessor
> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
> SCA which will be cleared during the interpretation of PTF.
> 
> To check if the topology has been modified we use a new field of the
> arch vCPU to save the previous real CPU ID at the end of a schedule
> and verify on next schedule that the CPU used is in the same socket.
> 
> We deliberatly ignore:
> - polarization: only horizontal polarization is currently used in linux.
> - CPU Type: only IFL Type are supported in Linux
> - Dedication: we consider that only a complete dedicated CPU stack can
>    take benefit of the CPU Topology.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>


> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>   	__u8	icptcode;		/* 0x0050 */
>   	__u8	icptstatus;		/* 0x0051 */
>   	__u16	ihcpu;			/* 0x0052 */
> -	__u8	reserved54;		/* 0x0054 */
> +	__u8	mtcr;			/* 0x0054 */
>   #define IICTL_CODE_NONE		 0x00
>   #define IICTL_CODE_MCHK		 0x01
>   #define IICTL_CODE_EXT		 0x02
> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>   #define ECB_TE		0x10
>   #define ECB_SRSI	0x04
>   #define ECB_HOSTPROTINT	0x02
> +#define ECB_PTF		0x01

From below I understand that ECB_PTF can be used with stfl(11) in the 
hypervisor.

What happens if the hypervisor doesn't support stfl(11) and we 
consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?


>   	__u8	ecb;			/* 0x0061 */
>   #define ECB2_CMMA	0x80
>   #define ECB2_IEP	0x20
> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>   	bool skey_enabled;
>   	struct kvm_s390_pv_vcpu pv;
>   	union diag318_info diag318_info;
> +	int prev_cpu;
>   };
>   
>   struct kvm_vm_stat {
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index b655a7d82bf0..ff6d8a2b511c 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>   	case KVM_CAP_S390_VCPU_RESETS:
>   	case KVM_CAP_SET_GUEST_DEBUG:
>   	case KVM_CAP_S390_DIAG318:
> +	case KVM_CAP_S390_CPU_TOPOLOGY:

I would have expected instead

r = test_facility(11);
break;

...

>   		r = 1;
>   		break;
>   	case KVM_CAP_SET_GUEST_DEBUG2:
> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>   		icpt_operexc_on_all_vcpus(kvm);
>   		r = 0;
>   		break;
> +	case KVM_CAP_S390_CPU_TOPOLOGY:
> +		mutex_lock(&kvm->lock);
> +		if (kvm->created_vcpus) {
> +			r = -EBUSY;
> +		} else {

...
} else if (test_facility(11)) {
	set_kvm_facility(kvm->arch.model.fac_mask, 11);
	set_kvm_facility(kvm->arch.model.fac_list, 11);
	r = 0;
} else {
	r = -EINVAL;
}

similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.

But I assume you want to be able to support hosts without ECB_PTF, correct?
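
For reference, the behaviour of the variant suggested above can be written as
a stand-alone model. This is a sketch, not kernel code: struct model_vm and
model_enable_cpu_topology() are names invented here, and the booleans stand in
for test_facility(11) and for bit 11 in fac_mask/fac_list.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the state the enable path looks at. */
struct model_vm {
	unsigned int created_vcpus;	/* vCPUs already created for this VM */
	bool host_has_fac11;		/* stands for test_facility(11) */
	bool guest_fac11;		/* stands for bit 11 in fac_mask and fac_list */
};

/* Returns 0 on success or a negative errno, like the ioctl handler would. */
static int model_enable_cpu_topology(struct model_vm *vm)
{
	if (vm->created_vcpus)
		return -EBUSY;	/* too late, the CPU model is already in use */
	if (!vm->host_has_fac11)
		return -EINVAL;	/* host cannot interpret PTF */
	vm->guest_fac11 = true;
	return 0;
}
```

Note that this variant refuses hosts without facility 11, which is exactly the
case the question above is about.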


> +			set_kvm_facility(kvm->arch.model.fac_mask, 11);
> +			set_kvm_facility(kvm->arch.model.fac_list, 11);
> +			r = 0;
> +		}
> +		mutex_unlock(&kvm->lock);
> +		VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
> +			 r ? "(not available)" : "(success)");
> +		break;
> +
> +		r = -EINVAL;
> +		break;

^ dead code

[...]

>   }
>   
>   void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>   {
> +	vcpu->arch.prev_cpu = vcpu->cpu;
>   	vcpu->cpu = -1;
>   	if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>   		__stop_cpu_timer_accounting(vcpu);
> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>   		vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>   	if (test_kvm_facility(vcpu->kvm, 9))
>   		vcpu->arch.sie_block->ecb |= ECB_SRSI;
> +
> +	/* PTF needs both host and guest facilities to enable interpretation */
> +	if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
> +		vcpu->arch.sie_block->ecb |= ECB_PTF;

Here you say we need both ...

> +
>   	if (test_kvm_facility(vcpu->kvm, 73))
>   		vcpu->arch.sie_block->ecb |= ECB_TE;
>   
> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
> index 4002a24bc43a..50d67190bf65 100644
> --- a/arch/s390/kvm/vsie.c
> +++ b/arch/s390/kvm/vsie.c
> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>   	/* Host-protection-interruption introduced with ESOP */
>   	if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>   		scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
> +	/* CPU Topology */
> +	if (test_kvm_facility(vcpu->kvm, 11))
> +		scb_s->ecb |= scb_o->ecb & ECB_PTF;

but here you don't check?

>   	/* transactional execution */
>   	if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>   		/* remap the prefix is tx is toggled on */
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index d9e4aabcb31a..081ce0cd44b9 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>   #define KVM_CAP_BINARY_STATS_FD 203
>   #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>   #define KVM_CAP_ARM_MTE 205
> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>   

We'll need a Documentation/virt/kvm/api.rst description.

I'm not completely confident that the way we're handling the 
capability+facility is the right approach. It all feels a bit suboptimal.

Except stfl(74) -- STHYI --, we never enable a facility via 
set_kvm_facility() that's not available in the host. And STHYI is 
special such that it is never implemented in hardware.

I'll think about what might be cleaner once I get some more details 
about the interaction with stfl(11) in the hypervisor.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information
  2021-09-06 18:14       ` David Hildenbrand
@ 2021-09-07 10:11         ` Pierre Morel
  0 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-09-07 10:11 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor



On 9/6/21 8:14 PM, David Hildenbrand wrote:
> On 01.09.21 11:43, Pierre Morel wrote:
>>
>>
>> On 8/31/21 3:59 PM, David Hildenbrand wrote:
>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>> STSI(15.1.x) gives information on the CPU configuration topology.
>>>> Let's accept the interception of STSI with the function code 15 and
>>>> let the userland part of the hypervisor handle it when userland
>>>> support the CPU Topology facility.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>> ---
>>>>    arch/s390/kvm/priv.c | 7 ++++++-
>>>>    1 file changed, 6 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>>>> index 9928f785c677..8581b6881212 100644
>>>> --- a/arch/s390/kvm/priv.c
>>>> +++ b/arch/s390/kvm/priv.c
>>>> @@ -856,7 +856,8 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>>        if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>>>            return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>>> -    if (fc > 3) {
>>>> +    if ((fc > 3 && fc != 15) ||
>>>> +        (fc == 15 && !test_kvm_facility(vcpu->kvm, 11))) {
>>>>            kvm_s390_set_psw_cc(vcpu, 3);
>>>>            return 0;
>>>>        }
>>>> @@ -893,6 +894,10 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>>>                goto out_no_data;
>>>>            handle_stsi_3_2_2(vcpu, (void *) mem);
>>>>            break;
>>>> +    case 15:
>>>> +        trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
>>>> +        insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
>>>> +        return -EREMOTE;
>>>>        }
>>>>        if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>>>>            memcpy((void *)sida_origin(vcpu->arch.sie_block), (void 
>>>> *)mem,
>>>>
>>>
>>> Sorry, I'm a bit rusty on s390x kvm facility handling.
>>>
>>>
>>> For test_kvm_facility() to succeed, the facility has to be in both:
>>>
>>> a) fac_mask: actually available on the HW and supported by KVM
>>> (kvm_s390_fac_base via FACILITIES_KVM, kvm_s390_fac_ext via
>>> FACILITIES_KVM_CPUMODEL)
>>>
>>> b) fac_list: enabled for a VM
>>>
>>> AFAIU, facility 11 is neither in FACILITIES_KVM nor
>>> FACILITIES_KVM_CPUMODEL, and I remember it's a hypervisor-managed bit.
>>>
>>> So unless we unlock facility 11 in FACILITIES_KVM_CPUMODEL, will
>>> test_kvm_facility(vcpu->kvm, 11) ever successfully trigger here?
>>>
>>>
>>> I'm pretty sure I am messing something up :)
>>>
>>
>> I think it is the same remark that Christian did as wanted me to use the
>> arch/s390/tools/gen_facilities.c to activate the facility.
>>
>> The point is that CONFIGURATION_TOPOLOGY, STFL, 11, is already defined
>> inside QEMU since full_GEN10_GA1, so the test_kvm_facility() will
>> succeed with the next patch setting the facility 11 in the mask when
>> getting the KVM_CAP_S390_CPU_TOPOLOGY from userland.
> 
> Ok, I see ...
> 
> QEMU knows the facility and as soon as we present it to QEMU, QEMU will 
> want to automatically enable it in the "host" model.
> 
> However, we'd like QEMU to join in and handle some part of it.
> 
> So indeed, handling it like KVM_CAP_S390_VECTOR_REGISTERS or 
> KVM_CAP_S390_RI looks like a reasonable approach.
> 
>>
>> But if we activate it in KVM via any of the FACILITIES_KVM_xxx in the
>> gen_facilities.c we will activate it for the guest what ever userland
>> hypervizor we have, including old QEMU which will generate an exception.
>>
>>
>> In this circumstances we have the choice between:
>>
>> - use FACILITY_KVM and handle everything in kernel
>> - use FACILITY_KVM and use an extra CAPABILITY to handle part in kernel
>> to avoid guest crash and part in userland
> 
> This sounds quite nice to me. Implement minimal kernel support and 
> indicate the facility via stfl to user space.
> 
> In addition, add a new capability that intercepts to user space instead.
> 
> 
> ... but I can understand that it might not be worth it.

Yes, since we need a CAPABILITY anyway, I find it makes things more 
complicated.
> 
> 
> This patch as it stands doesn't make any sense on its own. Either 
> document how it's supposed to work and why it is currently dead code, or 
> simply squash into the next patch (preferred IMHO).
> 

Yes, you are right, I will squash it with the next patch.

Thanks,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-06 18:37   ` David Hildenbrand
@ 2021-09-07 10:24     ` Pierre Morel
  2021-09-08  7:04       ` Christian Borntraeger
  2021-09-07 12:28     ` Pierre Morel
  2021-09-09  9:03     ` Pierre Morel
  2 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-07 10:24 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor



On 9/6/21 8:37 PM, David Hildenbrand wrote:
> On 03.08.21 10:26, Pierre Morel wrote:
>> We let the userland hypervisor know if the machine support the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_2 SYSIB.
>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>>
>> We deliberatly ignore:
>> - polarization: only horizontal polarization is currently used in linux.
>> - CPU Type: only IFL Type are supported in Linux
>> - Dedication: we consider that only a complete dedicated CPU stack can
>>    take benefit of the CPU Topology.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> 
> 
>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>       __u8    icptcode;        /* 0x0050 */
>>       __u8    icptstatus;        /* 0x0051 */
>>       __u16    ihcpu;            /* 0x0052 */
>> -    __u8    reserved54;        /* 0x0054 */
>> +    __u8    mtcr;            /* 0x0054 */
>>   #define IICTL_CODE_NONE         0x00
>>   #define IICTL_CODE_MCHK         0x01
>>   #define IICTL_CODE_EXT         0x02
>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>   #define ECB_TE        0x10
>>   #define ECB_SRSI    0x04
>>   #define ECB_HOSTPROTINT    0x02
>> +#define ECB_PTF        0x01
> 
>  From below I understand, that ECB_PTF can be used with stfl(11) in the 
> hypervisor.
> 
> What is to happen if the hypervisor doesn't support stfl(11) and we 
> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?

Yes.

> 
> 
>>       __u8    ecb;            /* 0x0061 */
>>   #define ECB2_CMMA    0x80
>>   #define ECB2_IEP    0x20
>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>       bool skey_enabled;
>>       struct kvm_s390_pv_vcpu pv;
>>       union diag318_info diag318_info;
>> +    int prev_cpu;
>>   };
>>   struct kvm_vm_stat {
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index b655a7d82bf0..ff6d8a2b511c 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, 
>> long ext)
>>       case KVM_CAP_S390_VCPU_RESETS:
>>       case KVM_CAP_SET_GUEST_DEBUG:
>>       case KVM_CAP_S390_DIAG318:
>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
> 
> I would have expected instead
> 
> r = test_facility(11);
> break

The idea is that QEMU will emulate both PTF and SYSIB_15 in this case.

> 
> ...
> 
>>           r = 1;
>>           break;
>>       case KVM_CAP_SET_GUEST_DEBUG2:
>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, 
>> struct kvm_enable_cap *cap)
>>           icpt_operexc_on_all_vcpus(kvm);
>>           r = 0;
>>           break;
>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>> +        mutex_lock(&kvm->lock);
>> +        if (kvm->created_vcpus) {
>> +            r = -EBUSY;
>> +        } else {
> 
> ...
> } else if (test_facility(11)) {
>      set_kvm_facility(kvm->arch.model.fac_mask, 11);
>      set_kvm_facility(kvm->arch.model.fac_list, 11);
>      r = 0;
> } else {
>      r = -EINVAL;
> }
> 
> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
> 
> But I assume you want to be able to support hosts without ECB_PTF, correct?

yes, this was the idea.

> 
> 
>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>> +            r = 0;
>> +        }
>> +        mutex_unlock(&kvm->lock);
>> +        VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>> +             r ? "(not available)" : "(success)");
>> +        break;
>> +
>> +        r = -EINVAL;
>> +        break;
> 
> ^ dead code
> 

:) indeed, sorry.

> [...]
> 
>>   }
>>   void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>   {
>> +    vcpu->arch.prev_cpu = vcpu->cpu;
>>       vcpu->cpu = -1;
>>       if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>           __stop_cpu_timer_accounting(vcpu);
>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu 
>> *vcpu)
>>           vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>       if (test_kvm_facility(vcpu->kvm, 9))
>>           vcpu->arch.sie_block->ecb |= ECB_SRSI;
>> +
>> +    /* PTF needs both host and guest facilities to enable 
>> interpretation */
>> +    if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>> +        vcpu->arch.sie_block->ecb |= ECB_PTF;
> 
> Here you say we need both ...

Yes, because for interpretation we need both.
But if PTF is not interpreted, we will emulate it in QEMU.
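
Roughly, the intended split looks like the model below. It is only a sketch:
the enum and ptf_handler_for() are names made up for this mail, and the
"not offered" case is my assumption for a guest that does not have facility 11
at all.

```c
#include <stdbool.h>

/* Hypothetical labels for who ends up handling a guest PTF. */
enum ptf_handler {
	PTF_NOT_OFFERED,	/* guest facility 11 off (assumed case) */
	PTF_INTERPRETED,	/* ECB_PTF set: handled without leaving SIE */
	PTF_EMULATED,		/* intercepted and emulated by userland */
};

/*
 * guest_fac11 stands for test_kvm_facility(vcpu->kvm, 11),
 * host_fac11 for test_facility(11).
 */
static enum ptf_handler ptf_handler_for(bool guest_fac11, bool host_fac11)
{
	if (!guest_fac11)
		return PTF_NOT_OFFERED;
	return host_fac11 ? PTF_INTERPRETED : PTF_EMULATED;
}
```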

> 
>> +
>>       if (test_kvm_facility(vcpu->kvm, 73))
>>           vcpu->arch.sie_block->ecb |= ECB_TE;
>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>> index 4002a24bc43a..50d67190bf65 100644
>> --- a/arch/s390/kvm/vsie.c
>> +++ b/arch/s390/kvm/vsie.c
>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, 
>> struct vsie_page *vsie_page)
>>       /* Host-protection-interruption introduced with ESOP */
>>       if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>           scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>> +    /* CPU Topology */
>> +    if (test_kvm_facility(vcpu->kvm, 11))
>> +        scb_s->ecb |= scb_o->ecb & ECB_PTF;
> 
> but here you don't check?

Arrrg, yes, this is wrong; we must check both here too.
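
Something along these lines, shown here as a stand-alone model rather than the
final vsie.c change. shadow_ecb_ptf() is a hypothetical helper; the ECB_PTF
value is the one from the patch, and the two booleans stand for
test_kvm_facility(vcpu->kvm, 11) and test_facility(11).

```c
#include <stdbool.h>
#include <stdint.h>

#define ECB_PTF 0x01	/* value taken from the patch */

/*
 * Hypothetical helper: compute the PTF bit for the shadow control block.
 * ecb_o is the ECB requested by the guest hypervisor.
 */
static uint8_t shadow_ecb_ptf(uint8_t ecb_o, bool guest_fac11, bool host_fac11)
{
	if (guest_fac11 && host_fac11)
		return ecb_o & ECB_PTF;
	return 0;	/* never pass the bit on without both facilities */
}
```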

> 
>>       /* transactional execution */
>>       if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>           /* remap the prefix is tx is toggled on */
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index d9e4aabcb31a..081ce0cd44b9 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>   #define KVM_CAP_BINARY_STATS_FD 203
>>   #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>   #define KVM_CAP_ARM_MTE 205
>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
> 
> We'll need a Documentation/virt/kvm/api.rst description.
> 
> I'm not completely confident that the way we're handling the 
> capability+facility is the right approach. It all feels a bit suboptimal.
> 
> Except stfl(74) -- STHYI --, we never enable a facility via 
> set_kvm_facility() that's not available in the host. And STHYI is 
> special such that it is never implemented in hardware.

Then we can fall back to a KVM facility plus in-kernel emulation, but while 
that will be quite simple for PTF, it will be much bigger for STSI_15.

> 
> I'll think about what might be cleaner once I get some more details 
> about the interaction with stfl(11) in the hypervisor.
> 

And I just saw that, for an unknown reason, I forgot two patches in the QEMU 
series:

s390x: kvm: make topology change report pending
s390x: kvm: enable CPU Topology Function

So I will publish a new QEMU series this afternoon, including the comments 
from Thomas.

thanks,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-06 18:37   ` David Hildenbrand
  2021-09-07 10:24     ` Pierre Morel
@ 2021-09-07 12:28     ` Pierre Morel
  2021-09-08  7:07       ` Christian Borntraeger
  2021-09-09  9:03     ` Pierre Morel
  2 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-07 12:28 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor



On 9/6/21 8:37 PM, David Hildenbrand wrote:
> On 03.08.21 10:26, Pierre Morel wrote:
>> We let the userland hypervisor know if the machine support the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_2 SYSIB.
>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>>
>> We deliberatly ignore:
>> - polarization: only horizontal polarization is currently used in linux.
>> - CPU Type: only IFL Type are supported in Linux
>> - Dedication: we consider that only a complete dedicated CPU stack can
>>    take benefit of the CPU Topology.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> 
> 
>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>       __u8    icptcode;        /* 0x0050 */
>>       __u8    icptstatus;        /* 0x0051 */
>>       __u16    ihcpu;            /* 0x0052 */
>> -    __u8    reserved54;        /* 0x0054 */
>> +    __u8    mtcr;            /* 0x0054 */
>>   #define IICTL_CODE_NONE         0x00
>>   #define IICTL_CODE_MCHK         0x01
>>   #define IICTL_CODE_EXT         0x02
>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>   #define ECB_TE        0x10
>>   #define ECB_SRSI    0x04
>>   #define ECB_HOSTPROTINT    0x02
>> +#define ECB_PTF        0x01
> 
>  From below I understand, that ECB_PTF can be used with stfl(11) in the 
> hypervisor.
> 
> What is to happen if the hypervisor doesn't support stfl(11) and we 
> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
> 
> 
>>       __u8    ecb;            /* 0x0061 */
>>   #define ECB2_CMMA    0x80
>>   #define ECB2_IEP    0x20
>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>       bool skey_enabled;
>>       struct kvm_s390_pv_vcpu pv;
>>       union diag318_info diag318_info;
>> +    int prev_cpu;
>>   };
>>   struct kvm_vm_stat {
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index b655a7d82bf0..ff6d8a2b511c 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, 
>> long ext)
>>       case KVM_CAP_S390_VCPU_RESETS:
>>       case KVM_CAP_SET_GUEST_DEBUG:
>>       case KVM_CAP_S390_DIAG318:
>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
> 
> I would have expected instead
> 
> r = test_facility(11);
> break
> 
> ...
> 
>>           r = 1;
>>           break;
>>       case KVM_CAP_SET_GUEST_DEBUG2:
>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, 
>> struct kvm_enable_cap *cap)
>>           icpt_operexc_on_all_vcpus(kvm);
>>           r = 0;
>>           break;
>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>> +        mutex_lock(&kvm->lock);
>> +        if (kvm->created_vcpus) {
>> +            r = -EBUSY;
>> +        } else {
> 
> ...
> } else if (test_facility(11)) {
>      set_kvm_facility(kvm->arch.model.fac_mask, 11);
>      set_kvm_facility(kvm->arch.model.fac_list, 11);
>      r = 0;
> } else {
>      r = -EINVAL;
> }
> 
> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
> 
> But I assume you want to be able to support hosts without ECB_PTF, correct?
> 
> 
>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>> +            r = 0;
>> +        }
>> +        mutex_unlock(&kvm->lock);
>> +        VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>> +             r ? "(not available)" : "(success)");
>> +        break;
>> +
>> +        r = -EINVAL;
>> +        break;
> 
> ^ dead code
> 
> [...]
> 
>>   }
>>   void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>   {
>> +    vcpu->arch.prev_cpu = vcpu->cpu;
>>       vcpu->cpu = -1;
>>       if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>           __stop_cpu_timer_accounting(vcpu);
>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu 
>> *vcpu)
>>           vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>       if (test_kvm_facility(vcpu->kvm, 9))
>>           vcpu->arch.sie_block->ecb |= ECB_SRSI;
>> +
>> +    /* PTF needs both host and guest facilities to enable 
>> interpretation */
>> +    if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>> +        vcpu->arch.sie_block->ecb |= ECB_PTF;
> 
> Here you say we need both ...
> 
>> +
>>       if (test_kvm_facility(vcpu->kvm, 73))
>>           vcpu->arch.sie_block->ecb |= ECB_TE;
>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>> index 4002a24bc43a..50d67190bf65 100644
>> --- a/arch/s390/kvm/vsie.c
>> +++ b/arch/s390/kvm/vsie.c
>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, 
>> struct vsie_page *vsie_page)
>>       /* Host-protection-interruption introduced with ESOP */
>>       if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>           scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>> +    /* CPU Topology */
>> +    if (test_kvm_facility(vcpu->kvm, 11))
>> +        scb_s->ecb |= scb_o->ecb & ECB_PTF;
> 
> but here you don't check?
> 
>>       /* transactional execution */
>>       if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>           /* remap the prefix is tx is toggled on */
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index d9e4aabcb31a..081ce0cd44b9 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>   #define KVM_CAP_BINARY_STATS_FD 203
>>   #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>   #define KVM_CAP_ARM_MTE 205
>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
> 
> We'll need a Documentation/virt/kvm/api.rst description.
> 
> I'm not completely confident that the way we're handling the 
> capability+facility is the right approach. It all feels a bit suboptimal.
> 
> Except stfl(74) -- STHYI --, we never enable a facility via 
> set_kvm_facility() that's not available in the host. And STHYI is 
> special such that it is never implemented in hardware.
> 
> I'll think about what might be cleaner once I get some more details 
> about the interaction with stfl(11) in the hypervisor.
> 

OK, maybe we do not need to handle the case where stfl(11) is not present in
the host; these are pre-GA10...



-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-07 10:24     ` Pierre Morel
@ 2021-09-08  7:04       ` Christian Borntraeger
  2021-09-08 12:00         ` Pierre Morel
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2021-09-08  7:04 UTC (permalink / raw)
  To: Pierre Morel, David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor



On 07.09.21 12:24, Pierre Morel wrote:
> 
> 
> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>> On 03.08.21 10:26, Pierre Morel wrote:
>>> We let the userland hypervisor know if the machine supports the CPU
>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>
>>> The PTF instruction will report a topology change if there is any change
>>> with a previous STSI_15_2 SYSIB.
>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>> changes in CPU polarization, dedication, CPU types and adding or
>>> removing CPUs in a socket.
>>>
>>> The reporting to the guest is done using the Multiprocessor
>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>> SCA which will be cleared during the interpretation of PTF.
>>>
>>> To check if the topology has been modified we use a new field of the
>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>> and verify on next schedule that the CPU used is in the same socket.
>>>
>>> We deliberately ignore:
>>> - polarization: only horizontal polarization is currently used in linux.
>>> - CPU Type: only IFL Type are supported in Linux
>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>    take benefit of the CPU Topology.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>
>>
>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>       __u8    icptcode;        /* 0x0050 */
>>>       __u8    icptstatus;        /* 0x0051 */
>>>       __u16    ihcpu;            /* 0x0052 */
>>> -    __u8    reserved54;        /* 0x0054 */
>>> +    __u8    mtcr;            /* 0x0054 */
>>>   #define IICTL_CODE_NONE         0x00
>>>   #define IICTL_CODE_MCHK         0x01
>>>   #define IICTL_CODE_EXT         0x02
>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>   #define ECB_TE        0x10
>>>   #define ECB_SRSI    0x04
>>>   #define ECB_HOSTPROTINT    0x02
>>> +#define ECB_PTF        0x01
>>
>>  From below I understand, that ECB_PTF can be used with stfl(11) in the hypervisor.
>>
>> What is to happen if the hypervisor doesn't support stfl(11) and we consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
> 
> Yes.

Do we want that? I do not think so. Other OSes (like z/OS) use PTF in their low-level interrupt handler, so PTF must be really fast.
I think I would prefer that, in that case, the guest simply does not see stfle(11).
So the user can still specify the topology, but the guest will have no interface to query it.
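
As a rough sketch of that behaviour (a standalone model, not the actual KVM code): the capability is reported, and the facility is set for the guest, only when the host itself has stfl(11). This follows the `else if (test_facility(11))` variant suggested earlier in the thread. `struct vm_model`, `host_has_facility()`, `check_cpu_topology()` and `enable_cpu_topology()` are hypothetical stand-ins so that the example compiles on its own, and the kvm->lock handling is left out.

```c
#include <errno.h>
#include <stdbool.h>

#define FAC_CPU_TOPOLOGY 11   /* stfl(11), the facility discussed in this thread */
#define MAX_FACILITIES   256  /* arbitrary size for this model */

/* Hypothetical stand-in for the per-VM state touched by the patch. */
struct vm_model {
	bool host_fac[MAX_FACILITIES]; /* models what test_facility() would report */
	bool fac_mask[MAX_FACILITIES]; /* models kvm->arch.model.fac_mask */
	bool fac_list[MAX_FACILITIES]; /* models kvm->arch.model.fac_list */
	int created_vcpus;
};

/* Hypothetical helper modelling test_facility(nr) on the host. */
static bool host_has_facility(const struct vm_model *vm, int nr)
{
	return vm->host_fac[nr];
}

/* Models the check-extension path: report the capability only if the host has stfl(11). */
static int check_cpu_topology(const struct vm_model *vm)
{
	return host_has_facility(vm, FAC_CPU_TOPOLOGY) ? 1 : 0;
}

/* Models the enable-cap path: never expose stfl(11) that the host cannot interpret. */
static int enable_cpu_topology(struct vm_model *vm)
{
	if (vm->created_vcpus)
		return -EBUSY;  /* too late, vCPUs already exist */
	if (!host_has_facility(vm, FAC_CPU_TOPOLOGY))
		return -EINVAL; /* no interpretation available, so no emulation either */

	vm->fac_mask[FAC_CPU_TOPOLOGY] = true;
	vm->fac_list[FAC_CPU_TOPOLOGY] = true;
	return 0;
}
```

With this shape, a guest on a host without stfl(11) never sees the facility, so PTF is never emulated on a slow path.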

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-07 12:28     ` Pierre Morel
@ 2021-09-08  7:07       ` Christian Borntraeger
  2021-09-08 13:09         ` Pierre Morel
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2021-09-08  7:07 UTC (permalink / raw)
  To: Pierre Morel, David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor



On 07.09.21 14:28, Pierre Morel wrote:
> 
> 
> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>> On 03.08.21 10:26, Pierre Morel wrote:
>>> We let the userland hypervisor know if the machine supports the CPU
>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>
>>> The PTF instruction will report a topology change if there is any change
>>> with a previous STSI_15_2 SYSIB.
>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>> changes in CPU polarization, dedication, CPU types and adding or
>>> removing CPUs in a socket.
>>>
>>> The reporting to the guest is done using the Multiprocessor
>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>> SCA which will be cleared during the interpretation of PTF.
>>>
>>> To check if the topology has been modified we use a new field of the
>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>> and verify on next schedule that the CPU used is in the same socket.
>>>
>>> We deliberately ignore:
>>> - polarization: only horizontal polarization is currently used in linux.
>>> - CPU Type: only IFL Type are supported in Linux
>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>    take benefit of the CPU Topology.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>
>>
>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>       __u8    icptcode;        /* 0x0050 */
>>>       __u8    icptstatus;        /* 0x0051 */
>>>       __u16    ihcpu;            /* 0x0052 */
>>> -    __u8    reserved54;        /* 0x0054 */
>>> +    __u8    mtcr;            /* 0x0054 */
>>>   #define IICTL_CODE_NONE         0x00
>>>   #define IICTL_CODE_MCHK         0x01
>>>   #define IICTL_CODE_EXT         0x02
>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>   #define ECB_TE        0x10
>>>   #define ECB_SRSI    0x04
>>>   #define ECB_HOSTPROTINT    0x02
>>> +#define ECB_PTF        0x01
>>
>>  From below I understand, that ECB_PTF can be used with stfl(11) in the hypervisor.
>>
>> What is to happen if the hypervisor doesn't support stfl(11) and we consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>
>>
>>>       __u8    ecb;            /* 0x0061 */
>>>   #define ECB2_CMMA    0x80
>>>   #define ECB2_IEP    0x20
>>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>>       bool skey_enabled;
>>>       struct kvm_s390_pv_vcpu pv;
>>>       union diag318_info diag318_info;
>>> +    int prev_cpu;
>>>   };
>>>   struct kvm_vm_stat {
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index b655a7d82bf0..ff6d8a2b511c 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>>       case KVM_CAP_S390_VCPU_RESETS:
>>>       case KVM_CAP_SET_GUEST_DEBUG:
>>>       case KVM_CAP_S390_DIAG318:
>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>
>> I would have expected instead
>>
>> r = test_facility(11);
>> break
>>
>> ...
>>
>>>           r = 1;
>>>           break;
>>>       case KVM_CAP_SET_GUEST_DEBUG2:
>>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>>           icpt_operexc_on_all_vcpus(kvm);
>>>           r = 0;
>>>           break;
>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>> +        mutex_lock(&kvm->lock);
>>> +        if (kvm->created_vcpus) {
>>> +            r = -EBUSY;
>>> +        } else {
>>
>> ...
>> } else if (test_facility(11)) {
>>      set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>      set_kvm_facility(kvm->arch.model.fac_list, 11);
>>      r = 0;
>> } else {
>>      r = -EINVAL;
>> }
>>
>> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>>
>> But I assume you want to be able to support hosts without ECB_PTF, correct?
>>
>>
>>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>>> +            r = 0;
>>> +        }
>>> +        mutex_unlock(&kvm->lock);
>>> +        VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>>> +             r ? "(not available)" : "(success)");
>>> +        break;
>>> +
>>> +        r = -EINVAL;
>>> +        break;
>>
>> ^ dead code
>>
>> [...]
>>
>>>   }
>>>   void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>   {
>>> +    vcpu->arch.prev_cpu = vcpu->cpu;
>>>       vcpu->cpu = -1;
>>>       if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>>           __stop_cpu_timer_accounting(vcpu);
>>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>>>           vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>>       if (test_kvm_facility(vcpu->kvm, 9))
>>>           vcpu->arch.sie_block->ecb |= ECB_SRSI;
>>> +
>>> +    /* PTF needs both host and guest facilities to enable interpretation */
>>> +    if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>>> +        vcpu->arch.sie_block->ecb |= ECB_PTF;
>>
>> Here you say we need both ...
>>
>>> +
>>>       if (test_kvm_facility(vcpu->kvm, 73))
>>>           vcpu->arch.sie_block->ecb |= ECB_TE;
>>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>>> index 4002a24bc43a..50d67190bf65 100644
>>> --- a/arch/s390/kvm/vsie.c
>>> +++ b/arch/s390/kvm/vsie.c
>>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>>>       /* Host-protection-interruption introduced with ESOP */
>>>       if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>>           scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>>> +    /* CPU Topology */
>>> +    if (test_kvm_facility(vcpu->kvm, 11))
>>> +        scb_s->ecb |= scb_o->ecb & ECB_PTF;
>>
>> but here you don't check?
>>
>>>       /* transactional execution */
>>>       if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>>           /* remap the prefix is tx is toggled on */
>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>> index d9e4aabcb31a..081ce0cd44b9 100644
>>> --- a/include/uapi/linux/kvm.h
>>> +++ b/include/uapi/linux/kvm.h
>>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>>   #define KVM_CAP_BINARY_STATS_FD 203
>>>   #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>>   #define KVM_CAP_ARM_MTE 205
>>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>>
>> We'll need a Documentation/virt/kvm/api.rst description.
>>
>> I'm not completely confident that the way we're handling the capability+facility is the right approach. It all feels a bit suboptimal.
>>
>> Except stfl(74) -- STHYI --, we never enable a facility via set_kvm_facility() that's not available in the host. And STHYI is special such that it is never implemented in hardware.
>>
>> I'll think about what might be cleaner once I get some more details about the interaction with stfl(11) in the hypervisor.
>>
> 
> OK, maybe we do not need to handle the case where stfl(11) is not present in the host; these are pre-GA10...

What about VSIE? For all existing KVM guests, stfl(11) is off.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-08  7:04       ` Christian Borntraeger
@ 2021-09-08 12:00         ` Pierre Morel
  2021-09-08 12:01           ` Christian Borntraeger
  0 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-08 12:00 UTC (permalink / raw)
  To: Christian Borntraeger, David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor



On 9/8/21 9:04 AM, Christian Borntraeger wrote:
> 
> 
> On 07.09.21 12:24, Pierre Morel wrote:
>>
>>
>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>> We let the userland hypervisor know if the machine supports the CPU
>>>> topology facility using a new KVM capability: 
>>>> KVM_CAP_S390_CPU_TOPOLOGY.
>>>>
>>>> The PTF instruction will report a topology change if there is any 
>>>> change
>>>> with a previous STSI_15_2 SYSIB.
>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>> removing CPUs in a socket.
>>>>
>>>> The reporting to the guest is done using the Multiprocessor
>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>> SCA which will be cleared during the interpretation of PTF.
>>>>
>>>> To check if the topology has been modified we use a new field of the
>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>
>>>> We deliberately ignore:
>>>> - polarization: only horizontal polarization is currently used in 
>>>> linux.
>>>> - CPU Type: only IFL Type are supported in Linux
>>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>>    take benefit of the CPU Topology.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>
>>>
>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>       __u8    icptcode;        /* 0x0050 */
>>>>       __u8    icptstatus;        /* 0x0051 */
>>>>       __u16    ihcpu;            /* 0x0052 */
>>>> -    __u8    reserved54;        /* 0x0054 */
>>>> +    __u8    mtcr;            /* 0x0054 */
>>>>   #define IICTL_CODE_NONE         0x00
>>>>   #define IICTL_CODE_MCHK         0x01
>>>>   #define IICTL_CODE_EXT         0x02
>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>   #define ECB_TE        0x10
>>>>   #define ECB_SRSI    0x04
>>>>   #define ECB_HOSTPROTINT    0x02
>>>> +#define ECB_PTF        0x01
>>>
>>>  From below I understand, that ECB_PTF can be used with stfl(11) in 
>>> the hypervisor.
>>>
>>> What is to happen if the hypervisor doesn't support stfl(11) and we 
>>> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>
>> Yes.
> 
> Do we want that? I do not think so. Other OSes (like z/OS) use PTF in
> their low-level interrupt handler, so PTF must be really fast.
> I think I would prefer that in that case the guest will simply not see 
> stfle(11).
> So the user can still specify the topology but the guest will have no 
> interface to query it.

I do not understand.
If the host supports stfle(11), we interpret PTF.

The proposal was to emulate only when it is not supported. What you
propose is to not advertise stfl(11) if the host does not support it,
and consequently to never emulate. Is that right?

In this case, as STSI_15 is linked to stfl(11) too, the guest will not 
be aware of the topology.

OK for me.

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-08 12:00         ` Pierre Morel
@ 2021-09-08 12:01           ` Christian Borntraeger
  2021-09-08 12:52             ` Pierre Morel
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2021-09-08 12:01 UTC (permalink / raw)
  To: Pierre Morel, David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor



On 08.09.21 14:00, Pierre Morel wrote:
> 
> 
> On 9/8/21 9:04 AM, Christian Borntraeger wrote:
>>
>>
>> On 07.09.21 12:24, Pierre Morel wrote:
>>>
>>>
>>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>>> We let the userland hypervisor know if the machine supports the CPU
>>>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>
>>>>> The PTF instruction will report a topology change if there is any change
>>>>> with a previous STSI_15_2 SYSIB.
>>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>>> removing CPUs in a socket.
>>>>>
>>>>> The reporting to the guest is done using the Multiprocessor
>>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>>> SCA which will be cleared during the interpretation of PTF.
>>>>>
>>>>> To check if the topology has been modified we use a new field of the
>>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>>
>>>>> We deliberately ignore:
>>>>> - polarization: only horizontal polarization is currently used in linux.
>>>>> - CPU Type: only IFL Type are supported in Linux
>>>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>>>    take benefit of the CPU Topology.
>>>>>
>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>
>>>>
>>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>>       __u8    icptcode;        /* 0x0050 */
>>>>>       __u8    icptstatus;        /* 0x0051 */
>>>>>       __u16    ihcpu;            /* 0x0052 */
>>>>> -    __u8    reserved54;        /* 0x0054 */
>>>>> +    __u8    mtcr;            /* 0x0054 */
>>>>>   #define IICTL_CODE_NONE         0x00
>>>>>   #define IICTL_CODE_MCHK         0x01
>>>>>   #define IICTL_CODE_EXT         0x02
>>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>>   #define ECB_TE        0x10
>>>>>   #define ECB_SRSI    0x04
>>>>>   #define ECB_HOSTPROTINT    0x02
>>>>> +#define ECB_PTF        0x01
>>>>
>>>>  From below I understand, that ECB_PTF can be used with stfl(11) in the hypervisor.
>>>>
>>>> What is to happen if the hypervisor doesn't support stfl(11) and we consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>>
>>> Yes.
>>
>> Do we want that? I do not think so. Other OSes (like z/OS) use PTF in their low-level interrupt handler, so PTF must be really fast.
>> I think I would prefer that in that case the guest will simply not see stfle(11).
>> So the user can still specify the topology but the guest will have no interface to query it.
> 
> I do not understand.
> If the host supports stfle(11), we interpret PTF.
> 
> The proposal was to emulate only when it is not supported. What you propose is to not advertise stfl(11) if the host does not support it, and consequently to never emulate. Is that right?

Yes, exactly. My idea is to provide it to guests if we can do it fast, but not to provide it if it would cause a performance issue.
> 
> In this case, as STSI_15 is linked to stfl(11) too, the guest will not be aware of the topology.
> 
> OK for me.
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-08 12:01           ` Christian Borntraeger
@ 2021-09-08 12:52             ` Pierre Morel
  0 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-09-08 12:52 UTC (permalink / raw)
  To: Christian Borntraeger, David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor



On 9/8/21 2:01 PM, Christian Borntraeger wrote:
> 
> 
> On 08.09.21 14:00, Pierre Morel wrote:
>>
>>
>> On 9/8/21 9:04 AM, Christian Borntraeger wrote:
>>>
>>>
>>> On 07.09.21 12:24, Pierre Morel wrote:
>>>>
>>>>
>>>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>>>> We let the userland hypervisor know if the machine supports the CPU
>>>>>> topology facility using a new KVM capability: 
>>>>>> KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>>
>>>>>> The PTF instruction will report a topology change if there is any 
>>>>>> change
>>>>>> with a previous STSI_15_2 SYSIB.
>>>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>>>> removing CPUs in a socket.
>>>>>>
>>>>>> The reporting to the guest is done using the Multiprocessor
>>>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>>>> SCA which will be cleared during the interpretation of PTF.
>>>>>>
>>>>>> To check if the topology has been modified we use a new field of the
>>>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>>>
>>>>>> We deliberately ignore:
>>>>>> - polarization: only horizontal polarization is currently used in 
>>>>>> linux.
>>>>>> - CPU Type: only IFL Type are supported in Linux
>>>>>> - Dedication: we consider that only a complete dedicated CPU stack 
>>>>>> can
>>>>>>    take benefit of the CPU Topology.
>>>>>>
>>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>>
>>>>>
>>>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>>>       __u8    icptcode;        /* 0x0050 */
>>>>>>       __u8    icptstatus;        /* 0x0051 */
>>>>>>       __u16    ihcpu;            /* 0x0052 */
>>>>>> -    __u8    reserved54;        /* 0x0054 */
>>>>>> +    __u8    mtcr;            /* 0x0054 */
>>>>>>   #define IICTL_CODE_NONE         0x00
>>>>>>   #define IICTL_CODE_MCHK         0x01
>>>>>>   #define IICTL_CODE_EXT         0x02
>>>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>>>   #define ECB_TE        0x10
>>>>>>   #define ECB_SRSI    0x04
>>>>>>   #define ECB_HOSTPROTINT    0x02
>>>>>> +#define ECB_PTF        0x01
>>>>>
>>>>>  From below I understand, that ECB_PTF can be used with stfl(11) in 
>>>>> the hypervisor.
>>>>>
>>>>> What is to happen if the hypervisor doesn't support stfl(11) and we 
>>>>> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF 
>>>>> fully?
>>>>
>>>> Yes.
>>>
>>> Do we want that? I do not think so. Other OSes (like z/OS) use PTF
>>> in their low-level interrupt handler, so PTF must be really fast.
>>> I think I would prefer that in that case the guest will simply not 
>>> see stfle(11).
>>> So the user can still specify the topology but the guest will have no 
>>> interface to query it.
>>
>> I do not understand.
>> If the host supports stfle(11), we interpret PTF.
>>
>> The proposal was to emulate only when it is not supported. What you
>> propose is to not advertise stfl(11) if the host does not support it,
>> and consequently to never emulate. Is that right?
> 
> Yes, exactly. My idea is to provide it to guests if we can do it fast,
> but not to provide it if it would cause a performance issue.

OK, understood, I will update this and the QEMU part too as we do not 
need emulation there anymore.

Thanks,
Pierre

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-08  7:07       ` Christian Borntraeger
@ 2021-09-08 13:09         ` Pierre Morel
  2021-09-08 13:16           ` Christian Borntraeger
  0 siblings, 1 reply; 25+ messages in thread
From: Pierre Morel @ 2021-09-08 13:09 UTC (permalink / raw)
  To: Christian Borntraeger, David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor



On 9/8/21 9:07 AM, Christian Borntraeger wrote:
> 
> 
> On 07.09.21 14:28, Pierre Morel wrote:
>>
>>
>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>> We let the userland hypervisor know if the machine supports the CPU
>>>> topology facility using a new KVM capability: 
>>>> KVM_CAP_S390_CPU_TOPOLOGY.
>>>>
>>>> The PTF instruction will report a topology change if there is any 
>>>> change
>>>> with a previous STSI_15_2 SYSIB.
>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or clear
>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>> removing CPUs in a socket.
>>>>
>>>> The reporting to the guest is done using the Multiprocessor
>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>> SCA which will be cleared during the interpretation of PTF.
>>>>
>>>> To check if the topology has been modified we use a new field of the
>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>
>>>> We deliberately ignore:
>>>> - polarization: only horizontal polarization is currently used in 
>>>> linux.
>>>> - CPU Type: only IFL Type are supported in Linux
>>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>>    take benefit of the CPU Topology.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>
>>>
>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>       __u8    icptcode;        /* 0x0050 */
>>>>       __u8    icptstatus;        /* 0x0051 */
>>>>       __u16    ihcpu;            /* 0x0052 */
>>>> -    __u8    reserved54;        /* 0x0054 */
>>>> +    __u8    mtcr;            /* 0x0054 */
>>>>   #define IICTL_CODE_NONE         0x00
>>>>   #define IICTL_CODE_MCHK         0x01
>>>>   #define IICTL_CODE_EXT         0x02
>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>   #define ECB_TE        0x10
>>>>   #define ECB_SRSI    0x04
>>>>   #define ECB_HOSTPROTINT    0x02
>>>> +#define ECB_PTF        0x01
>>>
>>>  From below I understand, that ECB_PTF can be used with stfl(11) in 
>>> the hypervisor.
>>>
>>> What is to happen if the hypervisor doesn't support stfl(11) and we 
>>> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>>
>>>
>>>>       __u8    ecb;            /* 0x0061 */
>>>>   #define ECB2_CMMA    0x80
>>>>   #define ECB2_IEP    0x20
>>>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>>>       bool skey_enabled;
>>>>       struct kvm_s390_pv_vcpu pv;
>>>>       union diag318_info diag318_info;
>>>> +    int prev_cpu;
>>>>   };
>>>>   struct kvm_vm_stat {
>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>> index b655a7d82bf0..ff6d8a2b511c 100644
>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm 
>>>> *kvm, long ext)
>>>>       case KVM_CAP_S390_VCPU_RESETS:
>>>>       case KVM_CAP_SET_GUEST_DEBUG:
>>>>       case KVM_CAP_S390_DIAG318:
>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>
>>> I would have expected instead
>>>
>>> r = test_facility(11);
>>> break
>>>
>>> ...
>>>
>>>>           r = 1;
>>>>           break;
>>>>       case KVM_CAP_SET_GUEST_DEBUG2:
>>>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, 
>>>> struct kvm_enable_cap *cap)
>>>>           icpt_operexc_on_all_vcpus(kvm);
>>>>           r = 0;
>>>>           break;
>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>> +        mutex_lock(&kvm->lock);
>>>> +        if (kvm->created_vcpus) {
>>>> +            r = -EBUSY;
>>>> +        } else {
>>>
>>> ...
>>> } else if (test_facility(11)) {
>>>      set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>      set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>      r = 0;
>>> } else {
>>>      r = -EINVAL;
>>> }
>>>
>>> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>>>
>>> But I assume you want to be able to support hosts without ECB_PTF, 
>>> correct?
>>>
>>>
>>>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>> +            r = 0;
>>>> +        }
>>>> +        mutex_unlock(&kvm->lock);
>>>> +        VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>>>> +             r ? "(not available)" : "(success)");
>>>> +        break;
>>>> +
>>>> +        r = -EINVAL;
>>>> +        break;
>>>
>>> ^ dead code
>>>
>>> [...]
>>>
>>>>   }
>>>>   void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>>   {
>>>> +    vcpu->arch.prev_cpu = vcpu->cpu;
>>>>       vcpu->cpu = -1;
>>>>       if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>>>           __stop_cpu_timer_accounting(vcpu);
>>>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct 
>>>> kvm_vcpu *vcpu)
>>>>           vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>>>       if (test_kvm_facility(vcpu->kvm, 9))
>>>>           vcpu->arch.sie_block->ecb |= ECB_SRSI;
>>>> +
>>>> +    /* PTF needs both host and guest facilities to enable 
>>>> interpretation */
>>>> +    if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>>>> +        vcpu->arch.sie_block->ecb |= ECB_PTF;
>>>
>>> Here you say we need both ...
>>>
>>>> +
>>>>       if (test_kvm_facility(vcpu->kvm, 73))
>>>>           vcpu->arch.sie_block->ecb |= ECB_TE;
>>>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>>>> index 4002a24bc43a..50d67190bf65 100644
>>>> --- a/arch/s390/kvm/vsie.c
>>>> +++ b/arch/s390/kvm/vsie.c
>>>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, 
>>>> struct vsie_page *vsie_page)
>>>>       /* Host-protection-interruption introduced with ESOP */
>>>>       if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>>>           scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>>>> +    /* CPU Topology */
>>>> +    if (test_kvm_facility(vcpu->kvm, 11))
>>>> +        scb_s->ecb |= scb_o->ecb & ECB_PTF;
>>>
>>> but here you don't check?
>>>
>>>>       /* transactional execution */
>>>>       if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>>>           /* remap the prefix is tx is toggled on */
>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>> index d9e4aabcb31a..081ce0cd44b9 100644
>>>> --- a/include/uapi/linux/kvm.h
>>>> +++ b/include/uapi/linux/kvm.h
>>>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>>>   #define KVM_CAP_BINARY_STATS_FD 203
>>>>   #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>>>   #define KVM_CAP_ARM_MTE 205
>>>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>>>
>>> We'll need a Documentation/virt/kvm/api.rst description.
>>>
>>> I'm not completely confident that the way we're handling the 
>>> capability+facility is the right approach. It all feels a bit 
>>> suboptimal.
>>>
>>> Except stfl(74) -- STHYI --, we never enable a facility via 
>>> set_kvm_facility() that's not available in the host. And STHYI is 
>>> special such that it is never implemented in hardware.
>>>
>>> I'll think about what might be cleaner once I get some more details 
>>> about the interaction with stfl(11) in the hypervisor.
>>>
>>
>> OK, maybe we do not need to handle the case where stfl(11) is not present
>> in the host; these are pre-GA10...
> 
> What about VSIE? For all existing KVM guests, stfl(11) is off.

In VSIE the patch activates stfl(11) only if the host has stfl(11).

I do not see any problem with activating the interpretation in VSIE with
ECB_PTF (ECB.7) when the host has stfl(11) and QEMU asks to enable it
for the guest using the capability, as is done in this patch.

If any intermediate hypervisor decides not to advertise stfl(11) to the
guest, such as an old QEMU that lacks the capability or a QEMU with
ctop=off, KVM will not set ECB_PTF and the PTF instruction will trigger
a program check as before.

Is that OK, or did I miss something?
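
To make the cases above concrete, here is a minimal user-space sketch (not
kernel code) of the gating as described in this mail. ECB_PTF and its value
come from the patch; model_ptf_outcome(), its parameters and the
enum ptf_outcome names are hypothetical, and the sketch assumes that the
host facility and the QEMU-enabled capability are the only inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ECB_PTF 0x01	/* value from the patch (ECB.7) */

/* Hypothetical outcome of a guest PTF instruction in this model. */
enum ptf_outcome { PTF_INTERPRETED, PTF_PROGRAM_CHECK };

/*
 * host_has_stfl11 models test_facility(11) in the host;
 * cap_enabled models QEMU having enabled the capability for the guest.
 */
static enum ptf_outcome model_ptf_outcome(bool host_has_stfl11, bool cap_enabled)
{
	uint8_t ecb = 0;

	/* Interpretation is only switched on when both conditions hold. */
	if (host_has_stfl11 && cap_enabled)
		ecb |= ECB_PTF;

	return (ecb & ECB_PTF) ? PTF_INTERPRETED : PTF_PROGRAM_CHECK;
}

/* Self-check for the cases discussed above. */
static void model_ptf_selfcheck(void)
{
	assert(model_ptf_outcome(true, true) == PTF_INTERPRETED);
	/* old QEMU without the capability, or ctop=off */
	assert(model_ptf_outcome(true, false) == PTF_PROGRAM_CHECK);
	/* host without stfl(11) */
	assert(model_ptf_outcome(false, true) == PTF_PROGRAM_CHECK);
}
```

Calling model_ptf_selfcheck() from a test program exercises all three cases.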




-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-08 13:09         ` Pierre Morel
@ 2021-09-08 13:16           ` Christian Borntraeger
  2021-09-08 14:17             ` Pierre Morel
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2021-09-08 13:16 UTC (permalink / raw)
  To: Pierre Morel, David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor



On 08.09.21 15:09, Pierre Morel wrote:
> 
> 
> On 9/8/21 9:07 AM, Christian Borntraeger wrote:
>>
>>
>> On 07.09.21 14:28, Pierre Morel wrote:
>>>
>>>
>>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>>> We let the userland hypervisor know if the machine supports the CPU
>>>>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>
>>>>> The PTF instruction will report a topology change if there is any change
>>>>> with a previous STSI_15_2 SYSIB.
>>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or cleared
>>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>>> removing CPUs in a socket.
>>>>>
>>>>> The reporting to the guest is done using the Multiprocessor
>>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>>> SCA which will be cleared during the interpretation of PTF.
>>>>>
>>>>> To check if the topology has been modified we use a new field of the
>>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>>
>>>>> We deliberately ignore:
>>>>> - polarization: only horizontal polarization is currently used in Linux.
>>>>> - CPU Type: only the IFL type is supported in Linux
>>>>> - Dedication: we consider that only a complete dedicated CPU stack can
>>>>>    benefit from the CPU Topology.
>>>>>
>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>
>>>>
>>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>>       __u8    icptcode;        /* 0x0050 */
>>>>>       __u8    icptstatus;        /* 0x0051 */
>>>>>       __u16    ihcpu;            /* 0x0052 */
>>>>> -    __u8    reserved54;        /* 0x0054 */
>>>>> +    __u8    mtcr;            /* 0x0054 */
>>>>>   #define IICTL_CODE_NONE         0x00
>>>>>   #define IICTL_CODE_MCHK         0x01
>>>>>   #define IICTL_CODE_EXT         0x02
>>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>>   #define ECB_TE        0x10
>>>>>   #define ECB_SRSI    0x04
>>>>>   #define ECB_HOSTPROTINT    0x02
>>>>> +#define ECB_PTF        0x01
>>>>
>>>>  From below I understand, that ECB_PTF can be used with stfl(11) in the hypervisor.
>>>>
>>>> What is to happen if the hypervisor doesn't support stfl(11) and we consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
>>>>
>>>>
>>>>>       __u8    ecb;            /* 0x0061 */
>>>>>   #define ECB2_CMMA    0x80
>>>>>   #define ECB2_IEP    0x20
>>>>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>>>>       bool skey_enabled;
>>>>>       struct kvm_s390_pv_vcpu pv;
>>>>>       union diag318_info diag318_info;
>>>>> +    int prev_cpu;
>>>>>   };
>>>>>   struct kvm_vm_stat {
>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>> index b655a7d82bf0..ff6d8a2b511c 100644
>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>>>>       case KVM_CAP_S390_VCPU_RESETS:
>>>>>       case KVM_CAP_SET_GUEST_DEBUG:
>>>>>       case KVM_CAP_S390_DIAG318:
>>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>
>>>> I would have expected instead
>>>>
>>>> r = test_facility(11);
>>>> break
>>>>
>>>> ...
>>>>
>>>>>           r = 1;
>>>>>           break;
>>>>>       case KVM_CAP_SET_GUEST_DEBUG2:
>>>>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>>>>           icpt_operexc_on_all_vcpus(kvm);
>>>>>           r = 0;
>>>>>           break;
>>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>> +        mutex_lock(&kvm->lock);
>>>>> +        if (kvm->created_vcpus) {
>>>>> +            r = -EBUSY;
>>>>> +        } else {
>>>>
>>>> ...
>>>> } else if (test_facility(11)) {
>>>>      set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>>      set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>>      r = 0;
>>>> } else {
>>>>      r = -EINVAL;
>>>> }
>>>>
>>>> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>>>>
>>>> But I assume you want to be able to support hosts without ECB_PTF, correct?
>>>>
>>>>
>>>>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>>> +            r = 0;
>>>>> +        }
>>>>> +        mutex_unlock(&kvm->lock);
>>>>> +        VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>>>>> +             r ? "(not available)" : "(success)");
>>>>> +        break;
>>>>> +
>>>>> +        r = -EINVAL;
>>>>> +        break;
>>>>
>>>> ^ dead code
>>>>
>>>> [...]
>>>>
>>>>>   }
>>>>>   void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>>>   {
>>>>> +    vcpu->arch.prev_cpu = vcpu->cpu;
>>>>>       vcpu->cpu = -1;
>>>>>       if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>>>>           __stop_cpu_timer_accounting(vcpu);
>>>>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>>>>>           vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>>>>       if (test_kvm_facility(vcpu->kvm, 9))
>>>>>           vcpu->arch.sie_block->ecb |= ECB_SRSI;
>>>>> +
>>>>> +    /* PTF needs both host and guest facilities to enable interpretation */
>>>>> +    if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>>>>> +        vcpu->arch.sie_block->ecb |= ECB_PTF;
>>>>
>>>> Here you say we need both ...
>>>>
>>>>> +
>>>>>       if (test_kvm_facility(vcpu->kvm, 73))
>>>>>           vcpu->arch.sie_block->ecb |= ECB_TE;
>>>>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>>>>> index 4002a24bc43a..50d67190bf65 100644
>>>>> --- a/arch/s390/kvm/vsie.c
>>>>> +++ b/arch/s390/kvm/vsie.c
>>>>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
>>>>>       /* Host-protection-interruption introduced with ESOP */
>>>>>       if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>>>>           scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>>>>> +    /* CPU Topology */
>>>>> +    if (test_kvm_facility(vcpu->kvm, 11))
>>>>> +        scb_s->ecb |= scb_o->ecb & ECB_PTF;
>>>>
>>>> but here you don't check?
>>>>
>>>>>       /* transactional execution */
>>>>>       if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>>>>           /* remap the prefix is tx is toggled on */
>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>> index d9e4aabcb31a..081ce0cd44b9 100644
>>>>> --- a/include/uapi/linux/kvm.h
>>>>> +++ b/include/uapi/linux/kvm.h
>>>>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>>>>   #define KVM_CAP_BINARY_STATS_FD 203
>>>>>   #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>>>>   #define KVM_CAP_ARM_MTE 205
>>>>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>>>>
>>>> We'll need a Documentation/virt/kvm/api.rst description.
>>>>
>>>> I'm not completely confident that the way we're handling the capability+facility is the right approach. It all feels a bit suboptimal.
>>>>
>>>> Except stfl(74) -- STHYI --, we never enable a facility via set_kvm_facility() that's not available in the host. And STHYI is special such that it is never implemented in hardware.
>>>>
>>>> I'll think about what might be cleaner once I get some more details about the interaction with stfl(11) in the hypervisor.
>>>>
>>>
>>> OK, maybe we do not need to handle the case where stfl(11) is not present in the host; these are pre-GA10...
>>
>> What about VSIE? For all existing KVM guests, stfl11 is off.
> 
> In VSIE the patch activates stfl(11) only if the host has stfl(11).
> 
> I do not see any problem with activating the interpretation in VSIE with ECB_PTF (ECB.7) when the host has stfl(11) and QEMU asks to enable it for the guest using the capability, as is done in this patch.
> 
> If any intermediate hypervisor decides not to advertise stfl(11) to the guest, such as an old QEMU that lacks the capability or a QEMU with ctop=off, KVM will not set ECB_PTF and the PTF instruction will trigger a program check as before.
> 
> Is that OK, or did I miss something?

Yes, sure.
My point was about the pre-z10 statement. We will see hosts without stfl(e)11 when running nested on z14, z15 and so on.


* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-08 13:16           ` Christian Borntraeger
@ 2021-09-08 14:17             ` Pierre Morel
  0 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-09-08 14:17 UTC (permalink / raw)
  To: Christian Borntraeger, David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, frankja, cohuck, thuth, imbrenda, hca, gor



On 9/8/21 3:16 PM, Christian Borntraeger wrote:
> 
> 
> On 08.09.21 15:09, Pierre Morel wrote:
>>
>>
>> On 9/8/21 9:07 AM, Christian Borntraeger wrote:
>>>
>>>
>>> On 07.09.21 14:28, Pierre Morel wrote:
>>>>
>>>>
>>>> On 9/6/21 8:37 PM, David Hildenbrand wrote:
>>>>> On 03.08.21 10:26, Pierre Morel wrote:
>>>>>> We let the userland hypervisor know if the machine supports the CPU
>>>>>> topology facility using a new KVM capability: 
>>>>>> KVM_CAP_S390_CPU_TOPOLOGY.
>>>>>>
>>>>>> The PTF instruction will report a topology change if there is any 
>>>>>> change
>>>>>> with a previous STSI_15_2 SYSIB.
>>>>>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or cleared
>>>>>> inside the CPU Topology List Entry CPU mask field, which happens with
>>>>>> changes in CPU polarization, dedication, CPU types and adding or
>>>>>> removing CPUs in a socket.
>>>>>>
>>>>>> The reporting to the guest is done using the Multiprocessor
>>>>>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>>>>>> SCA which will be cleared during the interpretation of PTF.
>>>>>>
>>>>>> To check if the topology has been modified we use a new field of the
>>>>>> arch vCPU to save the previous real CPU ID at the end of a schedule
>>>>>> and verify on next schedule that the CPU used is in the same socket.
>>>>>>
>>>>>> We deliberately ignore:
>>>>>> - polarization: only horizontal polarization is currently used in
>>>>>> Linux.
>>>>>> - CPU Type: only the IFL type is supported in Linux
>>>>>> - Dedication: we consider that only a complete dedicated CPU stack
>>>>>> can
>>>>>>    benefit from the CPU Topology.
>>>>>>
>>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>>
>>>>>
>>>>>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>>>>>       __u8    icptcode;        /* 0x0050 */
>>>>>>       __u8    icptstatus;        /* 0x0051 */
>>>>>>       __u16    ihcpu;            /* 0x0052 */
>>>>>> -    __u8    reserved54;        /* 0x0054 */
>>>>>> +    __u8    mtcr;            /* 0x0054 */
>>>>>>   #define IICTL_CODE_NONE         0x00
>>>>>>   #define IICTL_CODE_MCHK         0x01
>>>>>>   #define IICTL_CODE_EXT         0x02
>>>>>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>>>>>   #define ECB_TE        0x10
>>>>>>   #define ECB_SRSI    0x04
>>>>>>   #define ECB_HOSTPROTINT    0x02
>>>>>> +#define ECB_PTF        0x01
>>>>>
>>>>>  From below I understand, that ECB_PTF can be used with stfl(11) in 
>>>>> the hypervisor.
>>>>>
>>>>> What is to happen if the hypervisor doesn't support stfl(11) and we 
>>>>> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF 
>>>>> fully?
>>>>>
>>>>>
>>>>>>       __u8    ecb;            /* 0x0061 */
>>>>>>   #define ECB2_CMMA    0x80
>>>>>>   #define ECB2_IEP    0x20
>>>>>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>>>>>       bool skey_enabled;
>>>>>>       struct kvm_s390_pv_vcpu pv;
>>>>>>       union diag318_info diag318_info;
>>>>>> +    int prev_cpu;
>>>>>>   };
>>>>>>   struct kvm_vm_stat {
>>>>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>>>>> index b655a7d82bf0..ff6d8a2b511c 100644
>>>>>> --- a/arch/s390/kvm/kvm-s390.c
>>>>>> +++ b/arch/s390/kvm/kvm-s390.c
>>>>>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm 
>>>>>> *kvm, long ext)
>>>>>>       case KVM_CAP_S390_VCPU_RESETS:
>>>>>>       case KVM_CAP_SET_GUEST_DEBUG:
>>>>>>       case KVM_CAP_S390_DIAG318:
>>>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>>
>>>>> I would have expected instead
>>>>>
>>>>> r = test_facility(11);
>>>>> break
>>>>>
>>>>> ...
>>>>>
>>>>>>           r = 1;
>>>>>>           break;
>>>>>>       case KVM_CAP_SET_GUEST_DEBUG2:
>>>>>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, 
>>>>>> struct kvm_enable_cap *cap)
>>>>>>           icpt_operexc_on_all_vcpus(kvm);
>>>>>>           r = 0;
>>>>>>           break;
>>>>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>>>>> +        mutex_lock(&kvm->lock);
>>>>>> +        if (kvm->created_vcpus) {
>>>>>> +            r = -EBUSY;
>>>>>> +        } else {
>>>>>
>>>>> ...
>>>>> } else if (test_facility(11)) {
>>>>>      set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>>>      set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>>>      r = 0;
>>>>> } else {
>>>>>      r = -EINVAL;
>>>>> }
>>>>>
>>>>> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
>>>>>
>>>>> But I assume you want to be able to support hosts without ECB_PTF, 
>>>>> correct?
>>>>>
>>>>>
>>>>>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>>>>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>>>>>> +            r = 0;
>>>>>> +        }
>>>>>> +        mutex_unlock(&kvm->lock);
>>>>>> +        VM_EVENT(kvm, 3, "ENABLE: CPU TOPOLOGY %s",
>>>>>> +             r ? "(not available)" : "(success)");
>>>>>> +        break;
>>>>>> +
>>>>>> +        r = -EINVAL;
>>>>>> +        break;
>>>>>
>>>>> ^ dead code
>>>>>
>>>>> [...]
>>>>>
>>>>>>   }
>>>>>>   void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>>>>   {
>>>>>> +    vcpu->arch.prev_cpu = vcpu->cpu;
>>>>>>       vcpu->cpu = -1;
>>>>>>       if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
>>>>>>           __stop_cpu_timer_accounting(vcpu);
>>>>>> @@ -3198,6 +3239,11 @@ static int kvm_s390_vcpu_setup(struct 
>>>>>> kvm_vcpu *vcpu)
>>>>>>           vcpu->arch.sie_block->ecb |= ECB_HOSTPROTINT;
>>>>>>       if (test_kvm_facility(vcpu->kvm, 9))
>>>>>>           vcpu->arch.sie_block->ecb |= ECB_SRSI;
>>>>>> +
>>>>>> +    /* PTF needs both host and guest facilities to enable 
>>>>>> interpretation */
>>>>>> +    if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>>>>>> +        vcpu->arch.sie_block->ecb |= ECB_PTF;
>>>>>
>>>>> Here you say we need both ...
>>>>>
>>>>>> +
>>>>>>       if (test_kvm_facility(vcpu->kvm, 73))
>>>>>>           vcpu->arch.sie_block->ecb |= ECB_TE;
>>>>>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>>>>>> index 4002a24bc43a..50d67190bf65 100644
>>>>>> --- a/arch/s390/kvm/vsie.c
>>>>>> +++ b/arch/s390/kvm/vsie.c
>>>>>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, 
>>>>>> struct vsie_page *vsie_page)
>>>>>>       /* Host-protection-interruption introduced with ESOP */
>>>>>>       if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>>>>>           scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>>>>>> +    /* CPU Topology */
>>>>>> +    if (test_kvm_facility(vcpu->kvm, 11))
>>>>>> +        scb_s->ecb |= scb_o->ecb & ECB_PTF;
>>>>>
>>>>> but here you don't check?
>>>>>
>>>>>>       /* transactional execution */
>>>>>>       if (test_kvm_facility(vcpu->kvm, 73) && wants_tx) {
>>>>>>           /* remap the prefix is tx is toggled on */
>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>> index d9e4aabcb31a..081ce0cd44b9 100644
>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>>>>>>   #define KVM_CAP_BINARY_STATS_FD 203
>>>>>>   #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>>>>>>   #define KVM_CAP_ARM_MTE 205
>>>>>> +#define KVM_CAP_S390_CPU_TOPOLOGY 206
>>>>>
>>>>> We'll need a Documentation/virt/kvm/api.rst description.
>>>>>
>>>>> I'm not completely confident that the way we're handling the 
>>>>> capability+facility is the right approach. It all feels a bit 
>>>>> suboptimal.
>>>>>
>>>>> Except stfl(74) -- STHYI --, we never enable a facility via 
>>>>> set_kvm_facility() that's not available in the host. And STHYI is 
>>>>> special such that it is never implemented in hardware.
>>>>>
>>>>> I'll think about what might be cleaner once I get some more details 
>>>>> about the interaction with stfl(11) in the hypervisor.
>>>>>
>>>>
>>>> OK, maybe we do not need to handle the case where stfl(11) is not present
>>>> in the host; these are pre-GA10...
>>>
>>> What about VSIE? For all existing KVM guests, stfl11 is off.
>>
>> In VSIE the patch activates stfl(11) only if the host has stfl(11).
>>
>> I do not see any problem with activating the interpretation in VSIE with
>> ECB_PTF (ECB.7) when the host has stfl(11) and QEMU asks to enable it
>> for the guest using the capability, as is done in this patch.
>>
>> If any intermediate hypervisor decides not to advertise stfl(11) to
>> the guest, such as an old QEMU that lacks the capability or a QEMU with
>> ctop=off, KVM will not set ECB_PTF and the PTF instruction will
>> trigger a program check as before.
>>
>> Is that OK, or did I miss something?
> 
> Yes, sure.
> My point was about the pre-z10 statement. We will see hosts without
> stfl(e)11 when running nested on z14, z15 and so on.

Ah OK, yes.
Understood.

Thanks,
Pierre


-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report
  2021-09-06 18:37   ` David Hildenbrand
  2021-09-07 10:24     ` Pierre Morel
  2021-09-07 12:28     ` Pierre Morel
@ 2021-09-09  9:03     ` Pierre Morel
  2 siblings, 0 replies; 25+ messages in thread
From: Pierre Morel @ 2021-09-09  9:03 UTC (permalink / raw)
  To: David Hildenbrand, kvm
  Cc: linux-s390, linux-kernel, borntraeger, frankja, cohuck, thuth,
	imbrenda, hca, gor



On 9/6/21 8:37 PM, David Hildenbrand wrote:
> On 03.08.21 10:26, Pierre Morel wrote:
>> We let the userland hypervisor know if the machine supports the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_2 SYSIB.
>> Changes inside a STSI_15_2 SYSIB occur if CPU bits are set or cleared
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>>
>> We deliberately ignore:
>> - polarization: only horizontal polarization is currently used in Linux.
>> - CPU Type: only the IFL type is supported in Linux
>> - Dedication: we consider that only a complete dedicated CPU stack can
>>    benefit from the CPU Topology.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> 
> 
>> @@ -228,7 +232,7 @@ struct kvm_s390_sie_block {
>>       __u8    icptcode;        /* 0x0050 */
>>       __u8    icptstatus;        /* 0x0051 */
>>       __u16    ihcpu;            /* 0x0052 */
>> -    __u8    reserved54;        /* 0x0054 */
>> +    __u8    mtcr;            /* 0x0054 */
>>   #define IICTL_CODE_NONE         0x00
>>   #define IICTL_CODE_MCHK         0x01
>>   #define IICTL_CODE_EXT         0x02
>> @@ -246,6 +250,7 @@ struct kvm_s390_sie_block {
>>   #define ECB_TE        0x10
>>   #define ECB_SRSI    0x04
>>   #define ECB_HOSTPROTINT    0x02
>> +#define ECB_PTF        0x01
> 
>  From below I understand, that ECB_PTF can be used with stfl(11) in the 
> hypervisor.
> 
> What is to happen if the hypervisor doesn't support stfl(11) and we 
> consequently cannot use ECB_PTF? Will QEMU be able to emulate PTF fully?
> 
> 
>>       __u8    ecb;            /* 0x0061 */
>>   #define ECB2_CMMA    0x80
>>   #define ECB2_IEP    0x20
>> @@ -747,6 +752,7 @@ struct kvm_vcpu_arch {
>>       bool skey_enabled;
>>       struct kvm_s390_pv_vcpu pv;
>>       union diag318_info diag318_info;
>> +    int prev_cpu;
>>   };
>>   struct kvm_vm_stat {
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index b655a7d82bf0..ff6d8a2b511c 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -568,6 +568,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, 
>> long ext)
>>       case KVM_CAP_S390_VCPU_RESETS:
>>       case KVM_CAP_SET_GUEST_DEBUG:
>>       case KVM_CAP_S390_DIAG318:
>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
> 
> I would have expected instead
> 
> r = test_facility(11);
> break

I will change to this, as we decided not to support emulation if the host
does not support facility 11.
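
A minimal user-space sketch of the resulting check-extension behaviour,
assuming the capability simply mirrors the host facility as suggested above.
The enum model_cap values, model_check_extension() and the host_has_stfl11
parameter (standing in for test_facility(11)) are hypothetical names used
only for illustration; this is not the kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability identifiers, only for this sketch. */
enum model_cap {
	MODEL_CAP_S390_DIAG318,
	MODEL_CAP_S390_CPU_TOPOLOGY,
	MODEL_CAP_UNKNOWN,
};

/* Returns 1 if the capability is reported as available, 0 otherwise. */
static int model_check_extension(enum model_cap cap, bool host_has_stfl11)
{
	switch (cap) {
	case MODEL_CAP_S390_DIAG318:
		return 1;	/* unconditionally available in this model */
	case MODEL_CAP_S390_CPU_TOPOLOGY:
		/* models "r = test_facility(11);" */
		return host_has_stfl11 ? 1 : 0;
	default:
		return 0;
	}
}

/* Self-check: the topology capability follows the host facility. */
static void model_check_extension_selfcheck(void)
{
	assert(model_check_extension(MODEL_CAP_S390_CPU_TOPOLOGY, true) == 1);
	assert(model_check_extension(MODEL_CAP_S390_CPU_TOPOLOGY, false) == 0);
	assert(model_check_extension(MODEL_CAP_S390_DIAG318, false) == 1);
}
```

With this shape, userspace on a host without facility 11 never sees the
capability and therefore has no reason to try to enable it.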


> 
> ...
> 
>>           r = 1;
>>           break;
>>       case KVM_CAP_SET_GUEST_DEBUG2:
>> @@ -819,6 +820,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, 
>> struct kvm_enable_cap *cap)
>>           icpt_operexc_on_all_vcpus(kvm);
>>           r = 0;
>>           break;
>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>> +        mutex_lock(&kvm->lock);
>> +        if (kvm->created_vcpus) {
>> +            r = -EBUSY;
>> +        } else {
> 
> ...
> } else if (test_facility(11)) {
>      set_kvm_facility(kvm->arch.model.fac_mask, 11);
>      set_kvm_facility(kvm->arch.model.fac_list, 11);
>      r = 0;
> } else {
>      r = -EINVAL;
> }
> 
> similar to how we handle KVM_CAP_S390_VECTOR_REGISTERS.
> 
> But I assume you want to be able to support hosts without ECB_PTF, correct?

Not anymore; after Christian's comments we do not want to support emulation
at all.
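
A minimal user-space sketch of the enable path once emulation is dropped,
following the structure David suggested above. struct model_vm, its fields
and model_enable_cpu_topology() are hypothetical; the single guest_fac11
flag stands in for both set_kvm_facility() calls, and the kvm->lock
handling is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the VM state used by this sketch. */
struct model_vm {
	int created_vcpus;	/* models kvm->created_vcpus */
	bool host_has_stfl11;	/* models test_facility(11) */
	bool guest_fac11;	/* models facility 11 in fac_mask/fac_list */
};

/* Returns 0 on success, -EBUSY or -EINVAL as in the suggested code. */
static int model_enable_cpu_topology(struct model_vm *vm)
{
	if (vm->created_vcpus)
		return -EBUSY;		/* too late, vCPUs already exist */
	if (!vm->host_has_stfl11)
		return -EINVAL;		/* no emulation without host support */

	vm->guest_fac11 = true;
	return 0;
}

/* Self-check for the three branches. */
static void model_enable_selfcheck(void)
{
	struct model_vm ok = { 0, true, false };
	struct model_vm busy = { 1, true, false };
	struct model_vm nohost = { 0, false, false };

	assert(model_enable_cpu_topology(&ok) == 0 && ok.guest_fac11);
	assert(model_enable_cpu_topology(&busy) == -EBUSY && !busy.guest_fac11);
	assert(model_enable_cpu_topology(&nohost) == -EINVAL && !nohost.guest_fac11);
}
```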

> 
> 

...snip...

>> +
>> +    /* PTF needs both host and guest facilities to enable 
>> interpretation */
>> +    if (test_kvm_facility(vcpu->kvm, 11) && test_facility(11))
>> +        vcpu->arch.sie_block->ecb |= ECB_PTF;
> 
> Here you say we need both ...
> 
>> +
>>       if (test_kvm_facility(vcpu->kvm, 73))
>>           vcpu->arch.sie_block->ecb |= ECB_TE;
>> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
>> index 4002a24bc43a..50d67190bf65 100644
>> --- a/arch/s390/kvm/vsie.c
>> +++ b/arch/s390/kvm/vsie.c
>> @@ -503,6 +503,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, 
>> struct vsie_page *vsie_page)
>>       /* Host-protection-interruption introduced with ESOP */
>>       if (test_kvm_cpu_feat(vcpu->kvm, KVM_S390_VM_CPU_FEAT_ESOP))
>>           scb_s->ecb |= scb_o->ecb & ECB_HOSTPROTINT;
>> +    /* CPU Topology */
>> +    if (test_kvm_facility(vcpu->kvm, 11))
>> +        scb_s->ecb |= scb_o->ecb & ECB_PTF;
> 
> but here you don't check?

Do we really need to check at all, even for test_kvm_facility()?
Facilities do not change during a guest session, and we already checked
when setting it the first time.
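
For reference while weighing this question, a minimal user-space sketch of
what the guarded line in shadow_scb() computes. model_shadow_ecb() and its
parameters are hypothetical. One possible reason to keep the guard, stated
here as an assumption rather than a conclusion of this thread: scb_o is
supplied by the guest hypervisor, so its ECB bits are not under the host's
control.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ECB_PTF 0x01	/* value from the patch */

/*
 * ecb_o models scb_o->ecb (provided by the guest hypervisor);
 * guest_fac11 models test_kvm_facility(vcpu->kvm, 11).
 * Returns the PTF contribution to the shadow ECB.
 */
static uint8_t model_shadow_ecb(uint8_t ecb_o, bool guest_fac11)
{
	uint8_t ecb_s = 0;

	/* Only ECB_PTF is passed through, and only if the facility is on. */
	if (guest_fac11)
		ecb_s |= ecb_o & ECB_PTF;

	return ecb_s;
}

/* Self-check: the bit is propagated only when the guard allows it. */
static void model_shadow_selfcheck(void)
{
	assert(model_shadow_ecb(0xff, true) == ECB_PTF);
	assert(model_shadow_ecb(0xff, false) == 0);
	assert(model_shadow_ecb(0xfe, true) == 0);
}
```

Without the guard, the middle case would yield ECB_PTF even though facility
11 was never enabled for that guest.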

Regards,
Pierre


-- 
Pierre Morel
IBM Lab Boeblingen


end of thread, other threads:[~2021-09-09  9:03 UTC | newest]

Thread overview: 25+ messages
2021-08-03  8:26 [PATCH v3 0/3] s390x: KVM: CPU Topology Pierre Morel
2021-08-03  8:26 ` [PATCH v3 1/3] s390x: KVM: accept STSI for CPU topology information Pierre Morel
2021-08-31 13:59   ` David Hildenbrand
2021-09-01  9:43     ` Pierre Morel
2021-09-06 18:14       ` David Hildenbrand
2021-09-07 10:11         ` Pierre Morel
2021-08-03  8:26 ` [PATCH v3 2/3] s390x: KVM: Implementation of Multiprocessor Topology-Change-Report Pierre Morel
2021-08-31 14:03   ` David Hildenbrand
2021-09-01  9:46     ` Pierre Morel
2021-09-06 18:37   ` David Hildenbrand
2021-09-07 10:24     ` Pierre Morel
2021-09-08  7:04       ` Christian Borntraeger
2021-09-08 12:00         ` Pierre Morel
2021-09-08 12:01           ` Christian Borntraeger
2021-09-08 12:52             ` Pierre Morel
2021-09-07 12:28     ` Pierre Morel
2021-09-08  7:07       ` Christian Borntraeger
2021-09-08 13:09         ` Pierre Morel
2021-09-08 13:16           ` Christian Borntraeger
2021-09-08 14:17             ` Pierre Morel
2021-09-09  9:03     ` Pierre Morel
2021-08-03  8:26 ` [PATCH v3 3/3] s390x: optimization of the check for CPU topology change Pierre Morel
2021-08-03  8:42   ` Heiko Carstens
2021-08-03  8:57     ` Pierre Morel
2021-08-03  9:28       ` Pierre Morel
