LKML Archive on lore.kernel.org
* [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes
@ 2021-11-08 14:00 Lang Yu
2021-11-15 13:51 ` Lang Yu
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Lang Yu @ 2021-11-08 14:00 UTC (permalink / raw)
To: linux-mm, David Hildenbrand, Oscar Salvador
Cc: linux-kernel, Catalin Marinas, Andrew Morton, Lang Yu
When using devm_request_free_mem_region() and devm_memremap_pages()
to add ZONE_DEVICE memory, if the requested free mem region's end pfn
is huge (e.g., 0x400000000), node_end_pfn() will also be huge
(see move_pfn_range_to_zone()). This creates a huge hole between
node_start_pfn() and node_end_pfn().
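For illustration, a minimal sketch of the driver-side sequence that
can produce such a hole (hypothetical device struct and size, error
handling omitted; not the actual amdkfd code):

	struct dev_pagemap *pgmap = &kfddev->pgmap;	/* hypothetical */
	struct resource *res;

	/* Reserve a free physical range; it may land far above the
	 * node's existing RAM, e.g. ending at pfn 0x400000000. */
	res = devm_request_free_mem_region(dev, &iomem_resource, size);

	pgmap->type = MEMORY_DEVICE_PRIVATE;
	pgmap->range.start = res->start;
	pgmap->range.end = res->end;
	pgmap->nr_range = 1;

	/* Via move_pfn_range_to_zone(), this stretches node_end_pfn()
	 * to cover the new range, leaving a huge hole below it. */
	devm_memremap_pages(dev, pgmap);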
We found that on some AMD APUs, amdkfd requested such a free mem
region and created a huge hole. In that case, the following code
snippet was just busy-looping on test_bit() over the huge hole.
for (pfn = start_pfn; pfn < end_pfn; pfn++) {
struct page *page = pfn_to_online_page(pfn);
if (!page)
continue;
...
}
So we got a soft lockup:
watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221]
CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1
RIP: 0010:pfn_to_online_page+0x5/0xd0
Call Trace:
? kmemleak_scan+0x16a/0x440
kmemleak_write+0x306/0x3a0
? common_file_perm+0x72/0x170
full_proxy_write+0x5c/0x90
vfs_write+0xb9/0x260
ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
I did some tests with the patch.
(1) amdgpu module unloaded
before the patch:
real 0m0.976s
user 0m0.000s
sys 0m0.968s
after the patch:
real 0m0.981s
user 0m0.000s
sys 0m0.973s
(2) amdgpu module loaded
before the patch:
real 0m35.365s
user 0m0.000s
sys 0m35.354s
after the patch:
real 0m1.049s
user 0m0.000s
sys 0m1.042s
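For scale, assuming 4 KiB pages: a hole reaching pfn 0x400000000 spans
on the order of 2^34 pfns (about 64 TiB of address space), so even a
cheap pfn_to_online_page() call per pfn plausibly accounts for the
~35s measured above.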
v2:
- Only scan pages belonging to the zone. (David Hildenbrand)
- Use __maybe_unused to make compilers happy.
Signed-off-by: Lang Yu <lang.yu@amd.com>
---
mm/kmemleak.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index b57383c17cf6..adbe5aa01184 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1403,7 +1403,8 @@ static void kmemleak_scan(void)
{
unsigned long flags;
struct kmemleak_object *object;
- int i;
+ struct zone *zone;
+ int __maybe_unused i;
int new_leaks = 0;
jiffies_last_scan = jiffies;
@@ -1443,9 +1444,9 @@ static void kmemleak_scan(void)
* Struct page scanning for each node.
*/
get_online_mems();
- for_each_online_node(i) {
- unsigned long start_pfn = node_start_pfn(i);
- unsigned long end_pfn = node_end_pfn(i);
+ for_each_populated_zone(zone) {
+ unsigned long start_pfn = zone->zone_start_pfn;
+ unsigned long end_pfn = zone_end_pfn(zone);
unsigned long pfn;
for (pfn = start_pfn; pfn < end_pfn; pfn++) {
@@ -1454,8 +1455,8 @@ static void kmemleak_scan(void)
if (!page)
continue;
- /* only scan pages belonging to this node */
- if (page_to_nid(page) != i)
+ /* only scan pages belonging to this zone */
+ if (page_zone(page) != zone)
continue;
/* only scan if page is in use */
if (page_count(page) == 0)
--
2.25.1
* Re: [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes
2021-11-08 14:00 [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes Lang Yu
@ 2021-11-15 13:51 ` Lang Yu
2021-11-24 2:58 ` Lang Yu
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Lang Yu @ 2021-11-15 13:51 UTC (permalink / raw)
To: linux-mm, Andrew Morton
Cc: linux-kernel, Catalin Marinas, David Hildenbrand, Oscar Salvador
Ping for review. Thanks!
On Mon, Nov 08, 2021 at 10:00:29PM +0800, Lang Yu wrote:
> [...]
* Re: [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes
2021-11-08 14:00 [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes Lang Yu
2021-11-15 13:51 ` Lang Yu
@ 2021-11-24 2:58 ` Lang Yu
2021-11-24 9:07 ` David Hildenbrand
2022-01-28 19:29 ` Catalin Marinas
3 siblings, 0 replies; 8+ messages in thread
From: Lang Yu @ 2021-11-24 2:58 UTC (permalink / raw)
To: linux-mm, Catalin Marinas, Andrew Morton
Cc: linux-kernel, David Hildenbrand, Oscar Salvador
On Mon, Nov 08, 2021 at 10:00:29PM +0800, Lang Yu wrote:
Ping.
> [...]
* Re: [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes
2021-11-08 14:00 [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes Lang Yu
2021-11-15 13:51 ` Lang Yu
2021-11-24 2:58 ` Lang Yu
@ 2021-11-24 9:07 ` David Hildenbrand
2021-11-24 10:31 ` Lang Yu
2022-01-28 19:29 ` Catalin Marinas
3 siblings, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2021-11-24 9:07 UTC (permalink / raw)
To: Lang Yu, linux-mm, Oscar Salvador
Cc: linux-kernel, Catalin Marinas, Andrew Morton
On 08.11.21 15:00, Lang Yu wrote:
> [...]
I think in theory we could optimize further; there really isn't that
much need to skip single pages ... we can usually skip whole
pageblocks (in some corner cases we might have to back off
one pageblock and continue the search page-wise). But that's a
different story and there might be no need to optimize.
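For illustration, an untested sketch of that pageblock-skip idea,
ignoring the back-off corner case above (pageblock_nr_pages and
ALIGN() as in mm internals):

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		struct page *page = pfn_to_online_page(pfn);

		if (!page) {
			/* no online page here: jump to the start of
			 * the next pageblock instead of advancing
			 * pfn by pfn */
			pfn = ALIGN(pfn + 1, pageblock_nr_pages) - 1;
			continue;
		}
		...
	}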
Also, I wonder if we should adjust the cond_resched() logic instead.
While your code makes the "sparse node" case faster, I think we could
still run into the same issue in the "sparse zone" case now.
Acked-by: David Hildenbrand <david@redhat.com>
to this patch.
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index b57383c17cf6..1cd1df3cb01b 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1451,6 +1451,9 @@ static void kmemleak_scan(void)
for (pfn = start_pfn; pfn < end_pfn; pfn++) {
struct page *page = pfn_to_online_page(pfn);
+ if (!(pfn & 63))
+ cond_resched();
+
if (!page)
continue;
@@ -1461,8 +1464,6 @@ static void kmemleak_scan(void)
if (page_count(page) == 0)
continue;
scan_block(page, page + 1, NULL);
- if (!(pfn & 63))
- cond_resched();
}
}
put_online_mems();
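Note that moving the cond_resched() ahead of the !page check makes the
scan yield every 64 pfns even while walking a hole, not only while
scanning online pages.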
What do you think?
--
Thanks,
David / dhildenb
* Re: [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes
2021-11-24 9:07 ` David Hildenbrand
@ 2021-11-24 10:31 ` Lang Yu
0 siblings, 0 replies; 8+ messages in thread
From: Lang Yu @ 2021-11-24 10:31 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-mm, Oscar Salvador, linux-kernel, Catalin Marinas, Andrew Morton
On Wed, Nov 24, 2021 at 10:07:57AM +0100, David Hildenbrand wrote:
> On 08.11.21 15:00, Lang Yu wrote:
> > [...]
>
> I think in theory we could optimize further, there really isn't that
> much need to skip single pages ... we can usually skip whole
> pageblocks. (in some corner cases we might have to back off
> one pageblock and continue the search page-wise). But that's a
> different story and there might not be need to optimize.
I agree with you.
>
> Also, I wonder if we should adjust the cond_resched() logic instead.
> While your code makes the "sparse node" case faster, I think we could
> still run into the same issue in the "sparse zone" case now.
>
> Acked-by: David Hildenbrand <david@redhat.com>
>
> to this patch.
>
>
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index b57383c17cf6..1cd1df3cb01b 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -1451,6 +1451,9 @@ static void kmemleak_scan(void)
> for (pfn = start_pfn; pfn < end_pfn; pfn++) {
> struct page *page = pfn_to_online_page(pfn);
>
> + if (!(pfn & 63))
> + cond_resched();
> +
> if (!page)
> continue;
>
> @@ -1461,8 +1464,6 @@ static void kmemleak_scan(void)
> if (page_count(page) == 0)
> continue;
> scan_block(page, page + 1, NULL);
> - if (!(pfn & 63))
> - cond_resched();
> }
> }
> put_online_mems();
>
>
> What do you think?
Yes, I think that will avoid any potential soft lockup. But I wonder
whether there are still such huge contiguous page ranges, and the run
time may increase a little.
Regards,
Lang
* Re: [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes
2021-11-08 14:00 [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes Lang Yu
` (2 preceding siblings ...)
2021-11-24 9:07 ` David Hildenbrand
@ 2022-01-28 19:29 ` Catalin Marinas
2022-02-01 0:51 ` Andrew Morton
3 siblings, 1 reply; 8+ messages in thread
From: Catalin Marinas @ 2022-01-28 19:29 UTC (permalink / raw)
To: Lang Yu
Cc: linux-mm, David Hildenbrand, Oscar Salvador, linux-kernel, Andrew Morton
On Mon, Nov 08, 2021 at 10:00:29PM +0800, Lang Yu wrote:
> [...]
>
> Signed-off-by: Lang Yu <lang.yu@amd.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes
2022-01-28 19:29 ` Catalin Marinas
@ 2022-02-01 0:51 ` Andrew Morton
2022-02-03 14:55 ` Catalin Marinas
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2022-02-01 0:51 UTC (permalink / raw)
To: Catalin Marinas
Cc: Lang Yu, linux-mm, David Hildenbrand, Oscar Salvador, linux-kernel
On Fri, 28 Jan 2022 19:29:26 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote:
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Thanks.
I think this deserves a cc:stable? Triggering the soft lockup detector
is bad behavior.
* Re: [PATCH v2] mm/kmemleak: Avoid scanning potential huge holes
2022-02-01 0:51 ` Andrew Morton
@ 2022-02-03 14:55 ` Catalin Marinas
0 siblings, 0 replies; 8+ messages in thread
From: Catalin Marinas @ 2022-02-03 14:55 UTC (permalink / raw)
To: Andrew Morton
Cc: Lang Yu, linux-mm, David Hildenbrand, Oscar Salvador, linux-kernel
On Mon, Jan 31, 2022 at 04:51:41PM -0800, Andrew Morton wrote:
> On Fri, 28 Jan 2022 19:29:26 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
>
> Thanks.
>
> I think this deserves a cc:stable? Triggering the soft lockup detector
> is bad behavior.
Yes, I think it should. I guess the problem is not widespread, as
no one reported it until recently.
--
Catalin