LKML Archive on lore.kernel.org
* [PATCH 4/8][for -mm] mem_notify v6: memory_pressure_notify() caller
@ 2008-02-09 15:24 KOSAKI Motohiro
2008-02-12 22:56 ` Andrew Morton
From: KOSAKI Motohiro @ 2008-02-09 15:24 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: kosaki.motohiro, Marcelo Tosatti, Daniel Spang, Rik van Riel,
Andrew Morton, Alan Cox, linux-fsdevel, Pavel Machek, Al Boldi,
Jon Masters, Zan Lynx
The notification point happens whenever the VM moves an
anonymous page to the inactive list - this is a pretty good indication
that there are unused anonymous pages present which will very likely
be swapped out soon.
And the zone is judged to be out of trouble in the following situations:
o memory pressure decreases and the VM stops moving anonymous pages to
the inactive list.
o free pages increase above (pages_high + lowmem_reserve) * 2.
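As a consumer-side illustration (not part of this patch), here is a minimal
sketch of how an application might wait for these notifications through the
/dev/mem_notify device provided by this series; the device path and the
poll-based interface are assumptions taken from the earlier mem_notify posts.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

int main(void)
{
	struct pollfd pfd;

	/* device name assumed from the mem_notify series */
	pfd.fd = open("/dev/mem_notify", O_RDONLY);
	if (pfd.fd < 0) {
		perror("open /dev/mem_notify");
		return EXIT_FAILURE;
	}
	pfd.events = POLLIN;

	for (;;) {
		/* block until the kernel reports memory pressure */
		if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
			/* application-specific reaction: drop caches,
			 * shrink pools, trigger GC, ... */
			fprintf(stderr, "low memory notification received\n");
		}
	}
	/* not reached */
	close(pfd.fd);
	return 0;
}

The idea is that the application frees memory voluntarily before the kernel
is forced to swap or invoke the OOM killer.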
ChangeLog:
v5: add out-of-trouble notification at the exit of balance_pgdat().
Signed-off-by: Marcelo Tosatti <marcelo@kvack.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
mm/page_alloc.c | 12 ++++++++++++
mm/vmscan.c | 26 ++++++++++++++++++++++++++
2 files changed, 38 insertions(+)
Index: b/mm/vmscan.c
===================================================================
--- a/mm/vmscan.c 2008-01-23 22:06:08.000000000 +0900
+++ b/mm/vmscan.c 2008-01-23 22:07:57.000000000 +0900
@@ -39,6 +39,7 @@
#include <linux/kthread.h>
#include <linux/freezer.h>
#include <linux/memcontrol.h>
+#include <linux/mem_notify.h>
#include <asm/tlbflush.h>
#include <asm/div64.h>
@@ -1089,10 +1090,14 @@ static void shrink_active_list(unsigned
struct page *page;
struct pagevec pvec;
int reclaim_mapped = 0;
+ bool inactivated_anon = 0;
if (sc->may_swap)
reclaim_mapped = calc_reclaim_mapped(sc, zone, priority);
+ if (!reclaim_mapped)
+ memory_pressure_notify(zone, 0);
+
lru_add_drain();
spin_lock_irq(&zone->lru_lock);
pgmoved = sc->isolate_pages(nr_pages, &l_hold, &pgscanned, sc->order,
@@ -1116,6 +1121,13 @@ static void shrink_active_list(unsigned
if (!reclaim_mapped ||
(total_swap_pages == 0 && PageAnon(page)) ||
page_referenced(page, 0, sc->mem_cgroup)) {
+ /* deal with the case where there is no
+ * swap but an anonymous page would be
+ * moved to the inactive list.
+ */
+ if (!total_swap_pages && reclaim_mapped &&
+ PageAnon(page))
+ inactivated_anon = 1;
list_add(&page->lru, &l_active);
continue;
}
@@ -1123,8 +1135,12 @@ static void shrink_active_list(unsigned
list_add(&page->lru, &l_active);
continue;
}
+ if (PageAnon(page))
+ inactivated_anon = 1;
list_add(&page->lru, &l_inactive);
}
+ if (inactivated_anon)
+ memory_pressure_notify(zone, 1);
pagevec_init(&pvec, 1);
pgmoved = 0;
@@ -1158,6 +1174,8 @@ static void shrink_active_list(unsigned
pagevec_strip(&pvec);
spin_lock_irq(&zone->lru_lock);
}
+ if (!reclaim_mapped)
+ memory_pressure_notify(zone, 0);
pgmoved = 0;
while (!list_empty(&l_active)) {
@@ -1659,6 +1677,14 @@ out:
goto loop_again;
}
+ for (i = pgdat->nr_zones - 1; i >= 0; i--) {
+ struct zone *zone = pgdat->node_zones + i;
+
+ if (!populated_zone(zone))
+ continue;
+ memory_pressure_notify(zone, 0);
+ }
+
return nr_reclaimed;
}
Index: b/mm/page_alloc.c
===================================================================
--- a/mm/page_alloc.c 2008-01-23 22:06:08.000000000 +0900
+++ b/mm/page_alloc.c 2008-01-23 23:09:32.000000000 +0900
@@ -44,6 +44,7 @@
#include <linux/fault-inject.h>
#include <linux/page-isolation.h>
#include <linux/memcontrol.h>
+#include <linux/mem_notify.h>
#include <asm/tlbflush.h>
#include <asm/div64.h>
@@ -435,6 +436,8 @@ static inline void __free_one_page(struc
unsigned long page_idx;
int order_size = 1 << order;
int migratetype = get_pageblock_migratetype(page);
+ unsigned long prev_free;
+ unsigned long notify_threshold;
if (unlikely(PageCompound(page)))
destroy_compound_page(page, order);
@@ -444,6 +447,7 @@ static inline void __free_one_page(struc
VM_BUG_ON(page_idx & (order_size - 1));
VM_BUG_ON(bad_range(zone, page));
+ prev_free = zone_page_state(zone, NR_FREE_PAGES);
__mod_zone_page_state(zone, NR_FREE_PAGES, order_size);
while (order < MAX_ORDER-1) {
unsigned long combined_idx;
@@ -465,6 +469,14 @@ static inline void __free_one_page(struc
list_add(&page->lru,
&zone->free_area[order].free_list[migratetype]);
zone->free_area[order].nr_free++;
+
+ notify_threshold = (zone->pages_high +
+ zone->lowmem_reserve[MAX_NR_ZONES-1]) * 2;
+
+ if (unlikely((zone->mem_notify_status == 1) &&
+ (prev_free <= notify_threshold) &&
+ (zone_page_state(zone, NR_FREE_PAGES) > notify_threshold)))
+ memory_pressure_notify(zone, 0);
}
static inline int free_pages_check(struct page *page)
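The page_alloc.c hunk above raises the "out of trouble" notification only on
the upward crossing of the threshold, and only while pressure is currently
signalled (mem_notify_status == 1). A minimal standalone sketch of that edge
detection, with purely hypothetical numbers:

#include <stdbool.h>
#include <stdio.h>

/* Purely illustrative: mirrors the crossing check in __free_one_page(). */
static bool out_of_trouble(unsigned long prev_free, unsigned long new_free,
			   unsigned long threshold, int mem_notify_status)
{
	/* fire only when pressure was signalled and the free page count
	 * climbs from at-or-below the threshold to above it */
	return mem_notify_status == 1 &&
	       prev_free <= threshold &&
	       new_free > threshold;
}

int main(void)
{
	/* hypothetical values: pages_high = 1024, lowmem_reserve = 256 */
	unsigned long threshold = (1024 + 256) * 2;	/* 2560 */

	printf("%d\n", out_of_trouble(2560, 2561, threshold, 1)); /* 1: crossing */
	printf("%d\n", out_of_trouble(2600, 2601, threshold, 1)); /* 0: already above */
	printf("%d\n", out_of_trouble(2000, 2100, threshold, 1)); /* 0: still below */
	return 0;
}

Checking prev_free as well avoids re-notifying on every page free once the
zone is already above the threshold.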
* Re: [PATCH 4/8][for -mm] mem_notify v6: memory_pressure_notify() caller
2008-02-09 15:24 [PATCH 4/8][for -mm] mem_notify v6: memory_pressure_notify() caller KOSAKI Motohiro
@ 2008-02-12 22:56 ` Andrew Morton
2008-02-13 6:37 ` KOSAKI Motohiro
From: Andrew Morton @ 2008-02-12 22:56 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: linux-mm, linux-kernel, kosaki.motohiro, marcelo, daniel.spang,
riel, alan, linux-fsdevel, pavel, a1426z, jonathan, zlynx
On Sun, 10 Feb 2008 00:24:28 +0900
"KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com> wrote:
> The notification point happens whenever the VM moves an
> anonymous page to the inactive list - this is a pretty good indication
> that there are unused anonymous pages present which will very likely
> be swapped out soon.
>
> And the zone is judged to be out of trouble in the following situations:
> o memory pressure decreases and the VM stops moving anonymous pages to
> the inactive list.
> o free pages increase above (pages_high + lowmem_reserve) * 2.
This seems rather arbitrary. Why choose this stage in the page
reclamation process rather than some other stage?
If this feature is useful then I'd expect that some applications would want
notification at different times, or at different levels of VM distress. So
this semi-randomly-chosen notification point just won't be strong enough in
real-world use.
Does this change work correctly and appropriately for processes which are
running in a cgroup memory controller?
Given the amount of code which these patches add, and the subsequent
maintenance burden, and the unlikelihood of getting many applications to
actually _use_ the interface, it is not obvious to me that inclusion in the
kernel is justifiable, sorry.
memory_pressure_notify() is far too large to be inlined.
Some of the patches were wordwrapped.
* Re: [PATCH 4/8][for -mm] mem_notify v6: memory_pressure_notify() caller
2008-02-12 22:56 ` Andrew Morton
@ 2008-02-13 6:37 ` KOSAKI Motohiro
2008-02-13 15:03 ` Andi Kleen
From: KOSAKI Motohiro @ 2008-02-13 6:37 UTC (permalink / raw)
To: Andrew Morton
Cc: kosaki.motohiro, linux-mm, linux-kernel, marcelo, daniel.spang,
riel, alan, linux-fsdevel, pavel, a1426z, jonathan, zlynx
Hi Andrew,
> > And the zone is judged to be out of trouble in the following situations:
> > o memory pressure decreases and the VM stops moving anonymous pages to
> > the inactive list.
> > o free pages increase above (pages_high + lowmem_reserve) * 2.
>
> This seems rather arbitrary. Why choose this stage in the page
> reclamation process rather than some other stage?
>
> If this feature is useful then I'd expect that some applications would want
> notification at different times, or at different levels of VM distress. So
> this semi-randomly-chosen notification point just won't be strong enough in
> real-world use.
Hmmm
actually, this portion became code bloat through some bug reports.
Yes, I will think it over again and implement it more simply.
Thanks!
> Does this change work correctly and appropriately for processes which are
> running in a cgroup memory controller?
Nice point.
To be honest, I hadn't thought about mem-cgroup until now.
I will address it in the next post.
> Given the amount of code which these patches add, and the subsequent
> maintenance burden, and the unlikelihood of getting many applications to
> actually _use_ the interface, it is not obvious to me that inclusion in the
> kernel is justifiable, sorry.
OK.
I'll implement it again, more simply.
Thanks.
> memory_pressure_notify() is far too large to be inlined.
OK.
I will fix it.
> Some of the patches were wordwrapped.
Agghh..
I won't use gmail for the next post.
Sorry.
And,
I hope you will merge only poll_wait_exclusive() and wake_up_locked_nr(),
if you don't mind.
Nobody has objected to these two pieces for about two months,
and I think they are functions that many people will find useful.
* Re: [PATCH 4/8][for -mm] mem_notify v6: memory_pressure_notify() caller
2008-02-13 6:37 ` KOSAKI Motohiro
@ 2008-02-13 15:03 ` Andi Kleen
2008-02-14 0:25 ` KOSAKI Motohiro
From: Andi Kleen @ 2008-02-13 15:03 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Andrew Morton, linux-mm, linux-kernel, marcelo, daniel.spang,
riel, alan, linux-fsdevel, pavel, a1426z, jonathan, zlynx
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> writes:
>
> To be honest, I hadn't thought about mem-cgroup until now.
There is not only mem-cgroup BTW, but also NUMA node restrictions from
NUMA memory policy. So this means a process might not be able to access
all memory.
-Andi
* Re: [PATCH 4/8][for -mm] mem_notify v6: memory_pressure_notify() caller
2008-02-13 15:03 ` Andi Kleen
@ 2008-02-14 0:25 ` KOSAKI Motohiro
From: KOSAKI Motohiro @ 2008-02-14 0:25 UTC (permalink / raw)
To: Andi Kleen
Cc: kosaki.motohiro, Andrew Morton, linux-mm, linux-kernel, marcelo,
daniel.spang, riel, alan, linux-fsdevel, pavel, a1426z, jonathan,
zlynx
Hi Andi,
> > To be honest, I hadn't thought about mem-cgroup until now.
>
> There is not only mem-cgroup BTW, but also NUMA node restrictions from
> NUMA memory policy. So this means a process might not be able to access
> all memory.
You are right, good point.
The current implementation may wake up processes unrelated to the
zone under memory shortage ;-)
But unfortunately, we can't know per-zone RSS.
(/proc/[pid]/numa_maps is very slow; we can't use it
in a memory shortage emergency.)
I think we need to develop per-zone RSS accounting.
It would not only improve mem_notify, but also make the
OOM killer's process choice more intelligent.
But it is a bit difficult (at least for me ;-).
Maybe I will implement it a bit later...
Thanks again!
Your good opinion will help improve my patch.
- kosaki