LKML Archive on lore.kernel.org help / color / mirror / Atom feed
* recent patch "x86/acpi: Prevent X2APIC id 0xffffffff from being accounted" @ 2018-04-27 7:21 Jan Beulich 2018-04-27 8:32 ` Dou Liyang 0 siblings, 1 reply; 6+ messages in thread From: Jan Beulich @ 2018-04-27 7:21 UTC (permalink / raw) To: douly.fnst, tglx; +Cc: linux-kernel Hello, I've just stumbled across this commit, and I'm wondering if that's actually correct (I too have at least one system where such IDs are reported in MADT): For offline/absent CPUs, the firmware may not know the APIC IDs at the point MADT is built, so I think it is quite reasonable to put ~0 in there. The ACPID spec specifically calls out that the IDs must not change across sleep states, which implies to me that they may change across an offline period of a CPU. IOW I think such entries still need to contribute to the count of disabled CPUs. I notice a similar change has been done for the xAPIC case a while ago by you, Thomas. Jan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: recent patch "x86/acpi: Prevent X2APIC id 0xffffffff from being accounted" 2018-04-27 7:21 recent patch "x86/acpi: Prevent X2APIC id 0xffffffff from being accounted" Jan Beulich @ 2018-04-27 8:32 ` Dou Liyang 2018-04-27 12:09 ` Jan Beulich 0 siblings, 1 reply; 6+ messages in thread From: Dou Liyang @ 2018-04-27 8:32 UTC (permalink / raw) To: Jan Beulich, tglx; +Cc: linux-kernel Hi Jan, At 04/27/2018 03:21 PM, Jan Beulich wrote: > Hello, > > I've just stumbled across this commit, and I'm wondering if that's actually > correct (I too have at least one system where such IDs are reported in > MADT): For offline/absent CPUs, the firmware may not know the APIC IDs The MADT table is not reasonable, the Local APIC entries(xAPIC/x2APIC) in MADT is not always correct as OS want. So, we should do this sanity check. > at the point MADT is built, so I think it is quite reasonable to put ~0 in > there. The ACPID spec specifically calls out that the IDs must not change > across sleep states, which implies to me that they may change across an > offline period of a CPU. IOW I think such entries still need to contribute to > the count of disabled CPUs. > Aha, there are no CPU devices which will map it's physical ID to the illegal number 0xffffffff, So it will never be used. BTW, Limiting the number of the disabled CPUs is also a goal. Thanks, dou > I notice a similar change has been done for the xAPIC case a while ago > by you, Thomas. > > Jan > > > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: recent patch "x86/acpi: Prevent X2APIC id 0xffffffff from being accounted" 2018-04-27 8:32 ` Dou Liyang @ 2018-04-27 12:09 ` Jan Beulich 2018-05-02 1:56 ` Dou Liyang 0 siblings, 1 reply; 6+ messages in thread From: Jan Beulich @ 2018-04-27 12:09 UTC (permalink / raw) To: douly.fnst; +Cc: tglx, linux-kernel >>> On 27.04.18 at 10:32, <douly.fnst@cn.fujitsu.com> wrote: > At 04/27/2018 03:21 PM, Jan Beulich wrote: >> I've just stumbled across this commit, and I'm wondering if that's actually >> correct (I too have at least one system where such IDs are reported in >> MADT): For offline/absent CPUs, the firmware may not know the APIC IDs > > The MADT table is not reasonable, the Local APIC entries(xAPIC/x2APIC) > in MADT is not always correct as OS want. So, we should do this sanity > check. Of course, sanity checking is necessary. >> at the point MADT is built, so I think it is quite reasonable to put ~0 in >> there. The ACPID spec specifically calls out that the IDs must not change >> across sleep states, which implies to me that they may change across an >> offline period of a CPU. IOW I think such entries still need to contribute to >> the count of disabled CPUs. > > Aha, there are no CPU devices which will map it's physical ID to the > illegal number 0xffffffff, So it will never be used. The ID will never be used, yes, but the CPU may be (after firmware has assigned it a valid ID). > BTW, Limiting the number of the disabled CPUs is also a goal. I'm afraid I don't understand: Limiting the number of disabled CPUs is certainly desirable when those can never be used, but why would you want to do this when they might later get hotplugged? I'm not aware of a way to tell, by just looking at the MADT, which of the two cases apply. Jan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: recent patch "x86/acpi: Prevent X2APIC id 0xffffffff from being accounted" 2018-04-27 12:09 ` Jan Beulich @ 2018-05-02 1:56 ` Dou Liyang 2018-05-02 6:39 ` Jan Beulich 0 siblings, 1 reply; 6+ messages in thread From: Dou Liyang @ 2018-05-02 1:56 UTC (permalink / raw) To: Jan Beulich; +Cc: tglx, linux-kernel At 04/27/2018 08:09 PM, Jan Beulich wrote: >>>> On 27.04.18 at 10:32, <douly.fnst@cn.fujitsu.com> wrote: >> At 04/27/2018 03:21 PM, Jan Beulich wrote: >>> I've just stumbled across this commit, and I'm wondering if that's actually >>> correct (I too have at least one system where such IDs are reported in >>> MADT): For offline/absent CPUs, the firmware may not know the APIC IDs >> >> The MADT table is not reasonable, the Local APIC entries(xAPIC/x2APIC) >> in MADT is not always correct as OS want. So, we should do this sanity >> check. > > Of course, sanity checking is necessary. > >>> at the point MADT is built, so I think it is quite reasonable to put ~0 in >>> there. The ACPID spec specifically calls out that the IDs must not change >>> across sleep states, which implies to me that they may change across an >>> offline period of a CPU. IOW I think such entries still need to contribute to >>> the count of disabled CPUs. >> >> Aha, there are no CPU devices which will map it's physical ID to the >> illegal number 0xffffffff, So it will never be used. > > The ID will never be used, yes, but the CPU may be (after firmware has > assigned it a valid ID). > >> BTW, Limiting the number of the disabled CPUs is also a goal. > > I'm afraid I don't understand: Limiting the number of disabled CPUs is > certainly desirable when those can never be used, but why would you > want to do this when they might later get hotplugged? I'm not aware Let's see the workflow of CPU hotplug: 1) get the CPU device info from ACPI namespace - it contains logical processor id 2) through the logical processor id, get the LACPI entry in MADT. 3) generate the CPU for kernel(will create a CPU id, can see by lscpu) Normally, there are no valid CPU devices in 1) which are mapped to the LACPI entries(0xff or 0xffffffff). The actually number of hotplugged CPUs depends on CPU devices/processors in ACPI namespace. The number calculated from MADT is the maximum situation which can be cut and doesn't affect CPU hotplug. Don't worry about it. Now, in ACPI-based system, Linux gets the number of possible CPUs by MADT, We are going to use the ACPI namespace to make the number accurate. But it is so hard, because it's so late to initialize the ACPI namespace. Thanks, dou > of a way to tell, by just looking at the MADT, which of the two cases > apply. > > Jan > > > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: recent patch "x86/acpi: Prevent X2APIC id 0xffffffff from being accounted" 2018-05-02 1:56 ` Dou Liyang @ 2018-05-02 6:39 ` Jan Beulich 2018-05-02 8:10 ` Dou Liyang 0 siblings, 1 reply; 6+ messages in thread From: Jan Beulich @ 2018-05-02 6:39 UTC (permalink / raw) To: douly.fnst; +Cc: tglx, linux-kernel >>> On 02.05.18 at 03:56, <douly.fnst@cn.fujitsu.com> wrote: > At 04/27/2018 08:09 PM, Jan Beulich wrote: >> I'm afraid I don't understand: Limiting the number of disabled CPUs is >> certainly desirable when those can never be used, but why would you >> want to do this when they might later get hotplugged? I'm not aware > > Let's see the workflow of CPU hotplug: > > 1) get the CPU device info from ACPI namespace > - it contains logical processor id > > 2) through the logical processor id, get the LACPI entry in MADT. > > 3) generate the CPU for kernel(will create a CPU id, can see by lscpu) > > Normally, there are no valid CPU devices in 1) which are mapped to > the LACPI entries(0xff or 0xffffffff). > > The actually number of hotplugged CPUs depends on CPU devices/processors > in ACPI namespace. The number calculated from MADT is the maximum > situation which can be cut and doesn't affect CPU hotplug. Don't worry > about it. > > Now, in ACPI-based system, Linux gets the number of possible CPUs by > MADT, We are going to use the ACPI namespace to make the number > accurate. But it is so hard, because it's so late to initialize the ACPI > namespace. So are you envisioning a model when the number of disabled CPUs can be increased once the ACPI interpreter has been enabled? Otherwise the maximum recorded during early boot may be too low with the changes in question. (And FTR, I agree this number may also be way too large without them, but it being too large is a resource efficiency problem, while it being too low is a functionality one.) Also, for background, besides wanting to clarify the correctness of these two changes, I'm also trying to understand whether we want to mirror them into the Xen hypervisor, which relies on the Dom0 kernel's ACPI interpreter (i.e. it can and does parse MADT, but can't and doesn't parse the ACPI name space). Hence late adjustment of the number of hotpluggable CPUs would be even more problematic in that environment. Jan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: recent patch "x86/acpi: Prevent X2APIC id 0xffffffff from being accounted" 2018-05-02 6:39 ` Jan Beulich @ 2018-05-02 8:10 ` Dou Liyang 0 siblings, 0 replies; 6+ messages in thread From: Dou Liyang @ 2018-05-02 8:10 UTC (permalink / raw) To: Jan Beulich; +Cc: tglx, linux-kernel Hi Jan, At 05/02/2018 02:39 PM, Jan Beulich wrote: >>>> On 02.05.18 at 03:56, <douly.fnst@cn.fujitsu.com> wrote: >> At 04/27/2018 08:09 PM, Jan Beulich wrote: >>> I'm afraid I don't understand: Limiting the number of disabled CPUs is >>> certainly desirable when those can never be used, but why would you >>> want to do this when they might later get hotplugged? I'm not aware >> >> Let's see the workflow of CPU hotplug: >> >> 1) get the CPU device info from ACPI namespace >> - it contains logical processor id >> >> 2) through the logical processor id, get the LACPI entry in MADT. >> >> 3) generate the CPU for kernel(will create a CPU id, can see by lscpu) >> >> Normally, there are no valid CPU devices in 1) which are mapped to >> the LACPI entries(0xff or 0xffffffff). >> >> The actually number of hotplugged CPUs depends on CPU devices/processors >> in ACPI namespace. The number calculated from MADT is the maximum >> situation which can be cut and doesn't affect CPU hotplug. Don't worry >> about it. >> >> Now, in ACPI-based system, Linux gets the number of possible CPUs by >> MADT, We are going to use the ACPI namespace to make the number >> accurate. But it is so hard, because it's so late to initialize the ACPI >> namespace. > > So are you envisioning a model when the number of disabled CPUs can be > increased once the ACPI interpreter has been enabled? Otherwise the Yes, good idea, but, As Thomas said: It needs to run _before_ setup_percpu() is invoked to scale everything correctly. > maximum recorded during early boot may be too low with the changes in > question. (And FTR, I agree this number may also be way too large without > them, but it being too large is a resource efficiency problem, while it being > too low is a functionality one.) Too large number will waste vectors, and even cause bugs. IMO, the number will just be more accurate by excluding 0xffffffff, it only equal to, even be larger than the real number of possible CPUs. > > Also, for background, besides wanting to clarify the correctness of these > two changes, I'm also trying to understand whether we want to mirror > them into the Xen hypervisor, which relies on the Dom0 kernel's ACPI > interpreter (i.e. it can and does parse MADT, but can't and doesn't parse > the ACPI name space). Hence late adjustment of the number of > hotpluggable CPUs would be even more problematic in that environment. > I am not familiar with Xen, I will consider that. Thanks, dou > Jan > > > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-05-02 8:11 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-04-27 7:21 recent patch "x86/acpi: Prevent X2APIC id 0xffffffff from being accounted" Jan Beulich 2018-04-27 8:32 ` Dou Liyang 2018-04-27 12:09 ` Jan Beulich 2018-05-02 1:56 ` Dou Liyang 2018-05-02 6:39 ` Jan Beulich 2018-05-02 8:10 ` Dou Liyang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).