LKML Archive on lore.kernel.org
* [PATCH v8 0/6] optimize memblock_next_valid_pfn and early_pfn_valid on arm and arm64
@ 2018-04-11  7:21 Jia He
From: Jia He @ 2018-04-11  7:21 UTC
  To: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Andrew Morton, Michal Hocko
  Cc: Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, AKASHI Takahiro, James Morse, Steve Capper,
	Pavel Tatashin, Gioh Kim, Vlastimil Babka, Mel Gorman,
	Johannes Weiner, Kemi Wang, Petr Tesarik, YASUAKI ISHIMATSU,
	Andrey Ryabinin, Nikolay Borisov, Daniel Jordan, Daniel Vacek,
	Eugeniu Rosca, linux-arm-kernel, linux-kernel, linux-mm, Jia He

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") tried to optimize the loop in memmap_init_zone(). But
there is still some room for improvement.

Patch 1 introduces a new config option to make the code more generic.
Patch 2 retains memblock_next_valid_pfn() on arm and arm64.
Patch 3 optimizes memblock_next_valid_pfn().
Patches 4-6 optimize early_pfn_valid().
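
The idea behind "reduce unnecessary binary search" (patches 3 and 6) can
be illustrated with a minimal userspace sketch. This is not the code from
the patches: struct pfn_region, search_region(), pfn_valid_cached() and
search_count are hypothetical names invented for the illustration, and
the sketch assumes the regions are sorted and non-overlapping. It checks
the most recently matched region first and falls back to a binary search
only when the pfn lies outside it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory region, expressed in pfns. */
struct pfn_region {
	unsigned long start_pfn;	/* first valid pfn */
	unsigned long end_pfn;		/* one past the last valid pfn */
};

/* Counts how often the binary search fallback is taken. */
static int search_count;

/* Binary search over sorted, non-overlapping regions; index or -1. */
static int search_region(const struct pfn_region *r, int nr,
			 unsigned long pfn)
{
	int lo = 0, hi = nr;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;

		if (pfn < r[mid].start_pfn)
			hi = mid;
		else if (pfn >= r[mid].end_pfn)
			lo = mid + 1;
		else
			return mid;
	}
	return -1;
}

/*
 * Returns 1 if pfn lies in some region, 0 otherwise. *last_idx caches
 * the index of the last region that matched (-1 if none yet), so that
 * consecutive pfns in the same region skip the binary search.
 */
static int pfn_valid_cached(const struct pfn_region *r, int nr,
			    unsigned long pfn, int *last_idx)
{
	int idx;

	assert(nr > 0 && last_idx != NULL);

	if (*last_idx >= 0 && *last_idx < nr &&
	    pfn >= r[*last_idx].start_pfn && pfn < r[*last_idx].end_pfn)
		return 1;	/* fast path: same region as last time */

	search_count++;
	idx = search_region(r, nr, pfn);
	if (idx < 0)
		return 0;	/* pfn falls into a hole */

	*last_idx = idx;
	return 1;
}
```

A loop that walks pfns in ascending order mostly stays inside one region
at a time, so in this sketch the fallback search runs roughly once per
region boundary rather than once per pfn.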

As for the performance improvement, with this series applied the time
spent in memmap_init() drops from 41313 us to 24389 us on my armv8a
server (QDF2400 with 96G memory).

Without this patchset:
[  117.113677] before memmap_init
[  117.118195] after  memmap_init
>>> memmap_init takes 4518 us
[  117.121446] before memmap_init
[  117.154992] after  memmap_init
>>> memmap_init takes 33546 us
[  117.158241] before memmap_init
[  117.161490] after  memmap_init
>>> memmap_init takes 3249 us
>>> totally takes 41313 us

With this patchset:
[  123.222962] before memmap_init
[  123.226819] after  memmap_init
>>> memmap_init takes 3857 us
[  123.230070] before memmap_init
[  123.247354] after  memmap_init
>>> memmap_init takes 17284 us
[  123.250604] before memmap_init
[  123.253852] after  memmap_init
>>> memmap_init takes 3248 us
>>> totally takes 24389 us
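
The totals in the two logs can be cross-checked from the printk
timestamps with a short calculation. The helpers below (span_us() and
reduction_percent()) are hypothetical names used only for this check;
timestamps are taken as whole microseconds, e.g. 117.113677 s becomes
117113677.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed time between two timestamps, both in microseconds. */
static uint64_t span_us(uint64_t before_us, uint64_t after_us)
{
	assert(after_us >= before_us);
	return after_us - before_us;
}

/* Relative saving of new_us compared with old_us, in percent. */
static double reduction_percent(uint64_t old_us, uint64_t new_us)
{
	assert(old_us > 0 && new_us <= old_us);
	return 100.0 * (double)(old_us - new_us) / (double)old_us;
}
```

Summing the three spans of each log reproduces the 41313 us and 24389 us
totals, a saving of 16924 us, or roughly 41% of the original time.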

The memblock region information for my server is attached below.
[   86.956758] Zone ranges:
[   86.959452]   DMA      [mem 0x0000000000200000-0x00000000ffffffff]
[   86.966041]   Normal   [mem 0x0000000100000000-0x00000017ffffffff]
[   86.972631] Movable zone start for each node
[   86.977179] Early memory node ranges
[   86.980985]   node   0: [mem 0x0000000000200000-0x000000000021ffff]
[   86.987666]   node   0: [mem 0x0000000000820000-0x000000000307ffff]
[   86.994348]   node   0: [mem 0x0000000003080000-0x000000000308ffff]
[   87.001029]   node   0: [mem 0x0000000003090000-0x00000000031fffff]
[   87.007710]   node   0: [mem 0x0000000003200000-0x00000000033fffff]
[   87.014392]   node   0: [mem 0x0000000003410000-0x000000000563ffff]
[   87.021073]   node   0: [mem 0x0000000005640000-0x000000000567ffff]
[   87.027754]   node   0: [mem 0x0000000005680000-0x00000000056dffff]
[   87.034435]   node   0: [mem 0x00000000056e0000-0x00000000086fffff]
[   87.041117]   node   0: [mem 0x0000000008700000-0x000000000871ffff]
[   87.047798]   node   0: [mem 0x0000000008720000-0x000000000894ffff]
[   87.054479]   node   0: [mem 0x0000000008950000-0x0000000008baffff]
[   87.061161]   node   0: [mem 0x0000000008bb0000-0x0000000008bcffff]
[   87.067842]   node   0: [mem 0x0000000008bd0000-0x0000000008c4ffff]
[   87.074524]   node   0: [mem 0x0000000008c50000-0x0000000008e2ffff]
[   87.081205]   node   0: [mem 0x0000000008e30000-0x0000000008e4ffff]
[   87.087886]   node   0: [mem 0x0000000008e50000-0x0000000008fcffff]
[   87.094568]   node   0: [mem 0x0000000008fd0000-0x000000000910ffff]
[   87.101249]   node   0: [mem 0x0000000009110000-0x00000000092effff]
[   87.107930]   node   0: [mem 0x00000000092f0000-0x000000000930ffff]
[   87.114612]   node   0: [mem 0x0000000009310000-0x000000000963ffff]
[   87.121293]   node   0: [mem 0x0000000009640000-0x000000000e61ffff]
[   87.127975]   node   0: [mem 0x000000000e620000-0x000000000e64ffff]
[   87.134657]   node   0: [mem 0x000000000e650000-0x000000000fffffff]
[   87.141338]   node   0: [mem 0x0000000010800000-0x0000000017feffff]
[   87.148019]   node   0: [mem 0x000000001c000000-0x000000001c00ffff]
[   87.154701]   node   0: [mem 0x000000001c010000-0x000000001c7fffff]
[   87.161383]   node   0: [mem 0x000000001c810000-0x000000007efbffff]
[   87.168064]   node   0: [mem 0x000000007efc0000-0x000000007efdffff]
[   87.174746]   node   0: [mem 0x000000007efe0000-0x000000007efeffff]
[   87.181427]   node   0: [mem 0x000000007eff0000-0x000000007effffff]
[   87.188108]   node   0: [mem 0x000000007f000000-0x00000017ffffffff]
[   87.194791] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
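
The node ranges above are not all contiguous, and the holes between them
are where the invalid pfns come from. The sketch below, using the
hypothetical names struct mem_range, count_holes() and first_ranges,
counts the holes between adjacent ranges and adds up their size in
bytes. It assumes the ranges are sorted by address with inclusive end
addresses, as they appear in the log, and is fed only the first six
ranges listed above.

```c
#include <assert.h>
#include <stddef.h>

/* One "node 0: [mem start-end]" line; end is inclusive. */
struct mem_range {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Counts the holes between adjacent sorted ranges and stores the total
 * number of bytes they cover in *hole_bytes.
 */
static size_t count_holes(const struct mem_range *r, size_t nr,
			  unsigned long long *hole_bytes)
{
	size_t holes = 0;
	size_t i;

	assert(hole_bytes != NULL);
	*hole_bytes = 0;

	for (i = 1; i < nr; i++) {
		/* first address after the previous range */
		unsigned long long next = r[i - 1].end + 1;

		if (r[i].start > next) {
			holes++;
			*hole_bytes += r[i].start - next;
		}
	}
	return holes;
}

/* The first six ranges from the log above. */
static const struct mem_range first_ranges[] = {
	{ 0x0000000000200000ULL, 0x000000000021ffffULL },
	{ 0x0000000000820000ULL, 0x000000000307ffffULL },
	{ 0x0000000003080000ULL, 0x000000000308ffffULL },
	{ 0x0000000003090000ULL, 0x00000000031fffffULL },
	{ 0x0000000003200000ULL, 0x00000000033fffffULL },
	{ 0x0000000003410000ULL, 0x000000000563ffffULL },
};
```

For these six ranges the sketch finds two holes, 0x220000-0x81ffff and
0x3400000-0x340ffff, covering 0x610000 bytes in total.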

Changelog:
V8: - introduce a new config option and move generic code to early_pfn.h
    - optimize memblock_next_valid_pfn as suggested by Matthew Wilcox
V7: - fix i386 compilation error; refine the commit description
V6: - simplify the code, move arm/arm64 common code to one file
    - refine patches as suggested by Daniel Vacek and Ard Biesheuvel
V5: - further refinement as suggested by Daniel Vacek; make the
      arm/arm64 code more arch-specific
V4: - refine patches as suggested by Daniel Vacek and Wei Yang
    - also optimize on arm, in addition to arm64
V3: - fix 2 issues reported by kbuild test robot
V2: - rebase onto the latest mmotm
    - retain memblock_next_valid_pfn on arm64
    - refine memblock_search_pfn_regions and pfn_valid_region

Jia He (6):
  arm: arm64: introduce CONFIG_HAVE_MEMBLOCK_PFN_VALID
  mm: page_alloc: remain memblock_next_valid_pfn() on arm/arm64
  arm: arm64: page_alloc: reduce unnecessary binary search in
    memblock_next_valid_pfn()
  mm/memblock: introduce memblock_search_pfn_regions()
  arm: arm64: introduce pfn_valid_region()
  mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()

 arch/arm/Kconfig          |  4 +++
 arch/arm/mm/init.c        |  1 +
 arch/arm64/Kconfig        |  4 +++
 arch/arm64/mm/init.c      |  1 +
 include/linux/early_pfn.h | 79 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/memblock.h  |  2 ++
 include/linux/mmzone.h    | 18 ++++++++++-
 mm/Kconfig                |  3 ++
 mm/memblock.c             |  9 ++++++
 mm/page_alloc.c           |  5 ++-
 10 files changed, 124 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/early_pfn.h

-- 
2.7.4
