LKML Archive on lore.kernel.org
* Re: why swap at all?
@ 2004-05-26 12:24 Nick Piggin
  2004-05-26 13:03 ` Buddy Lumpkin
  0 siblings, 1 reply; 146+ messages in thread
From: Nick Piggin @ 2004-05-26 12:24 UTC (permalink / raw)
  To: Buddy Lumpkin
  Cc: 'John Bradford', 'William Lee Irwin III',
	orders, linux-kernel

Buddy Lumpkin wrote:
>>>3) once physical memory is full, file system I/O will only benefit from
>>>reads that incur a minor fault. All other file system operations are
>>>bound by the rate you can reclaim pages from physical memory.
>>>
>>>
> 
> 
>>No, typically we can reclaim memory very quickly and the operations
>>are bound by the speed of the block device.
> 
> 
> So if all physical memory is full with either pagecache or anonymous memory,
> where are you going to put these operations that are bound by the speed of
> the block device?
> 
> You have to evict pages at the same rate you're reading them in or writing to
> the filesystem, else you have nowhere to put them. This means that the rate
> you can access the filesystem is governed by the rate you can evict pages
> from memory.
> 

... and the speed of the block device. The minimum of the two actually.
Usually we can reclaim memory *much* faster than the block device can
fill it. Didn't you read what I had said?
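To put rough numbers on the "minimum of the two" point, here is a toy
calculation. The rates are made-up figures for illustration only, not
measurements from any system discussed in this thread.

```python
def sustained_io_rate(reclaim_pages_per_s, device_pages_per_s):
    """Pages per second that can flow into a full page cache.

    Each incoming page needs a reclaimed frame to land in, so the
    slower of the two stages sets the pace.
    """
    return min(reclaim_pages_per_s, device_pages_per_s)


# Hypothetical figures: reclaim at 1,000,000 pages/s, disk at 12,000 pages/s.
rate = sustained_io_rate(1_000_000, 12_000)
print(rate)  # the device, not reclaim, is the limit with these figures
```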

> Couple that with the fact that there are many pte's pointing at the same
> physical page (shared page) in many cases where many processes are running
> on the system. Because all of the references to that page must be removed
> before the page can be evicted, there are some absolute limitations in the
> rate that pages can be evicted from memory as the number of processes
> running on the system and the total amount of memory increases.
> 

This is still many orders of magnitude faster than filling the page
from disk, and you typically don't reclaim much of mapped memory anyway.

We are sort of spamming lkml now so let's get this finished up.

If you want to talk about memory management basics, there should
be some more helpful lists.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 12:24 why swap at all? Nick Piggin
@ 2004-05-26 13:03 ` Buddy Lumpkin
  2004-05-26 13:27   ` Helge Hafting
  0 siblings, 1 reply; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 13:03 UTC (permalink / raw)
  To: 'Nick Piggin'
  Cc: 'John Bradford', 'William Lee Irwin III',
	orders, linux-kernel

>> Couple that with the fact that there are many pte's pointing at the same
>> physical page (shared page) in many cases where many processes are running
>> on the system. Because all of the references to that page must be removed
>> before the page can be evicted, there are some absolute limitations in the
>> rate that pages can be evicted from memory as the number of processes
>> running on the system and the total amount of memory increases.
>> 

> This is still many orders of magnitude faster than filling the page
> from disk, and you typically don't reclaim much of mapped memory anyway.

This discussion went broke-minded again. You're still picturing that single
IDE hard drive in your workstation and I'm talking about big iron, large
databases, etc., where the total amount of aggregate disk I/O is completely
limited by the rate you can evict pages from the pagecache.

Picture 6 to 7 fibre channel cards with over 70% utilization during peak
usage times connected to a large EMC storage array with 64GB of non-volatile
cache.

--Buddy


* Re: why swap at all?
  2004-05-26 13:03 ` Buddy Lumpkin
@ 2004-05-26 13:27   ` Helge Hafting
  0 siblings, 0 replies; 146+ messages in thread
From: Helge Hafting @ 2004-05-26 13:27 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: linux-kernel

Buddy Lumpkin wrote:

>>>Couple that with the fact that there are many pte's pointing at the same
>>>physical page (shared page) in many cases where many processes are running
>>>on the system. Because all of the references to that page must be removed
>>>before the page can be evicted, there are some absolute limitations in the
>>>rate that pages can be evicted from memory as the number of processes
>>>running on the system and the total amount of memory increases.
>>>
>
>>This is still many orders of magnitude faster than filling the page
>>from disk, and you typically don't reclaim much of mapped memory anyway.
>>
>
>This discussion went broke-minded again. You're still picturing that single
>IDE hard drive in your workstation and I'm talking about big iron, large
>databases, etc., where the total amount of aggregate disk I/O is completely
>limited by the rate you can evict pages from the pagecache.
>
The eviction speed should not be a limitation, unless the machine is
ill-configured. Some pages aren't dirty, and can be dropped instantly.
That is way faster than any storage solution.

Other pages have to be written out (to swap, or to some file because
a write is pending).  This is not a problem, because I/O out
is as fast as I/O in.  If you have big iron with a superfast array - sure,
your I/O comes in at tremendous speed.  But swap and other writes
go out at the same tremendous speed too.  So no problem.

Now if you have a big-iron machine with filesystems on a fast array
and swap on a single slow disk then you're in trouble.  But that
is a bad setup, not a kernel problem.
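The argument above can be written down as a small model. This is only a
sketch under simplifying assumptions that are not part of the original
claim: clean pages cost nothing to drop, every dirty page costs one page
of write bandwidth on the device it goes to, and the bandwidth figures
are invented.

```python
def sustained_read_rate(read_bw, writeout_bw, dirty_fraction):
    """Pages/s that can be read into a full cache.

    read_bw        -- pages/s the filesystem device can deliver
    writeout_bw    -- pages/s the swap/writeback device can absorb
    dirty_fraction -- share of evicted pages that must be written first
    """
    if not 0.0 <= dirty_fraction <= 1.0:
        raise ValueError("dirty_fraction must be between 0 and 1")
    if dirty_fraction == 0.0:
        return read_bw            # clean pages are simply dropped
    # Evicting N pages/s requires writing N * dirty_fraction pages/s.
    eviction_limit = writeout_bw / dirty_fraction
    return min(read_bw, eviction_limit)


# Fast array for both data and swap: reads are not held back.
print(sustained_read_rate(100_000, 100_000, 0.5))   # 100000
# Same array for data, swap on one slow disk: eviction is the bottleneck.
print(sustained_read_rate(100_000, 5_000, 0.5))     # 10000.0
```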

Helge Hafting

* Re: why swap at all?
  2004-06-04 17:08                         ` Bill Davidsen
@ 2004-06-15 14:55                           ` Charles Shannon Hendrix
  0 siblings, 0 replies; 146+ messages in thread
From: Charles Shannon Hendrix @ 2004-06-15 14:55 UTC (permalink / raw)
  To: Linux Kernel

Fri, 04 Jun 2004 @ 13:08 -0400, Bill Davidsen said:

> But I fail to make my point... I want to limit how much memory is used 
> for i/o buffers, cache, or anything else which will produce memory 
> pressure on my programs. 

I would love to be able to limit this kind of memory use.

I've always liked how BSD works in this area, never using over a certain
amount.

I find the Linux behavior of using all memory for things like
buffercache is less than optimal.  While there are situations where it
helps, there are a great many where it hurts.

I frequently do work which fills memory with data I'll never use again,
and it makes things slow.

Desktop work tends to do this kind of thing as well.
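For data that is read once and never needed again, one knob that comes
up elsewhere in this thread is fadvise. A minimal sketch, assuming a
Linux system and Python 3.3 or later, where os.posix_fadvise is
available; the function name read_once and the chunk size are made up
for illustration. POSIX_FADV_DONTNEED is only advice to the kernel, and
dirty pages are generally not dropped until they have been written back.

```python
import os


def read_once(path, chunk_size=1 << 20):
    """Read a file sequentially, then advise the kernel to drop its pages.

    Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)      # real work on the data would go here
        # os.posix_fadvise is Unix-only; skip quietly where it is missing.
        if hasattr(os, "posix_fadvise"):
            # offset 0, length 0 means "the whole file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```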

> That's what would be nice with tuning, the admin can optimize what is 
> important on that system. I am usually happy with what the system does 
> on i/o, but I want my 500MB or so of programs to stay resident in a 2GB 
> machine, and if that adds a ms or two to i/o I can live with it, so that 
> when I change windows it happens now, not eventually. And I bet there 
> are a lot of others who would like better response to focus changes as well.

Not only that, but I wish certain bits of code could be locked into
memory.  Generally any code and data associated with the user interface
should always be there.

It's annoying when a menu in X takes ten seconds of swapping to appear.
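Applications can already pin their own memory with mlock() or
mlockall(), subject to privilege and RLIMIT_MEMLOCK. A rough sketch of
the mechanism, calling libc through ctypes; it pins a single buffer
rather than a whole user interface, the helper name lock_buffer is made
up, and it assumes a Unix-like libc that exports mlock and munlock.

```python
import ctypes


def lock_buffer(size=4096):
    """Try to pin a freshly allocated buffer in RAM; return True on success."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)   # the already-loaded C library
        mlock, munlock = libc.mlock, libc.munlock
    except (OSError, AttributeError, TypeError):
        return False                               # no usable libc symbols here
    for fn in (mlock, munlock):
        fn.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
        fn.restype = ctypes.c_int

    buf = ctypes.create_string_buffer(size)
    addr = ctypes.addressof(buf)
    if mlock(addr, size) != 0:
        # Typically ENOMEM or EPERM when over RLIMIT_MEMLOCK or unprivileged.
        return False
    try:
        buf.raw = b"\0" * size      # touch the pinned pages
    finally:
        munlock(addr, size)
    return True


print("locked:", lock_buffer())
```

Whether the call succeeds depends on the limits of the account running
it, which is why the sketch reports failure instead of raising.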

-- 
shannon "AT" widomaker.com -- [javalin: an unwieldy programming weapon used
to stab a software project through the heart until dead]

* Re: why swap at all?
  2004-06-08 15:15   ` Ray Bryant
@ 2004-06-09 19:24     ` Bill Davidsen
  0 siblings, 0 replies; 146+ messages in thread
From: Bill Davidsen @ 2004-06-09 19:24 UTC (permalink / raw)
  To: Ray Bryant
  Cc: Buddy Lumpkin, 'Con Kolivas', 'FabF',
	'Bernd Eckenfels',
	linux-kernel, lse-tech, linux-mm

Ray Bryant wrote:
> 
> Buddy Lumpkin wrote:
> 
>>  <snip> One method would be to keep the
>> pagecache on its own list, and move pages to the head of the list any time
>> they are modified or referenced, and reclaim from the tail.
>> All pages on this list can be considered as "free memory", because any new
>> memory requests would just cause pages to be evicted from the tail of the
>> list.
>>
> 
> We have code running on Altix that does exactly this.  (Please note,
> however, that this is for our version of Linux 2.4.21 -- Yeah, it's
> old, but that is what the product runs at the moment -- we are in
> the process of switching over to Linux 2.6 when all of this will
> have to be re-evaluated.)  The changes are in three parts:
> 
> (1)  We added a new page list, the reclaim list.  Pages are put
> onto the reclaim list when they are inserted into the page cache.
> They are removed from the list when they are marked dirty (buffers
> from the page go on to the LRU dirty list) or when the pages are
> mmap'd into an address space, since in either of these situations,
> the pages are not reclaimable.  (This list is per node in our
> NUMA system.)
> 
> (2)  We added code in __alloc_pages() so that if the local node
> allocation is going to fail (remember that Altix is a NUMA machine),
> we call out to a routine to scan the reclaim list on that node and
> to release enough clean buffer cache pages to make the local
> allocation succeed (plus a few pages, for efficiency).  If this
> doesn't work, we most likely end up spilling the allocation over
> to another node.
> 
> (3)  We added code in generic_file_write() to limit the size of
> the page cache on buffered file I/O write operations.  If the
> current size of the page cache is larger than the limit, we
> call the same routine as above to release some page cache pages.
> If we can't free enough pages to get below the limit, we throttle
> the write process by delaying it for a bit.  This was all to
> avoid the problem of a large buffered file I/O request causing
> the page cache to grow to the point where the system would start
> to swap.  (On our large memory systems, dropping into the
> swapping code can cause the system to freeze for tens of seconds,
> and that is something we would like to avoid).
> 
> (We actually don't enforce the page cache limit unless the amount
> of free memory has dropped below a certain threshold.  This is to
> keep the page cache from being limited if there is lots of free
> memory -- even though we only limit the page cache on writes,
> it turns out that the kernel is constantly writing to the disk,
> so this also effectively causes the page cache to be limited
> for reads as well.)
> 
> This code was also written in response to customer demand.  They
> don't like the fact that the buffer cache grows and grows on our
> Altix systems, and they want old buffer cache pages to be cleared
> out when they are no longer needed.  Since we almost never suffer
> memory pressure on our systems (and if we do, we are likely in
> trouble), kswapd almost never does this.  Buffer cache pages can
> sit around for days with no one removing them.  The above was one
> approach to solve that problem.
> 
> Please note: YMMV.  An Altix is not a desktop system and I make
> no claims that the above approach is appropriate for everyone.
> For us, it turns out to work better to bias storage allocation
> against unbridled growth of the page cache.  Indeed, we have
> spent a lot of time trying to solve problems related to page
> cache on Altix systems.  Assuming we get our OLS paper done
> in time, you can read more about this in our paper at OLS.
> (If not, we intend to post our experiences paper on the
> oss.sgi.com website.)
> 
> Finally, let me reiterate that we are beginning the process of
> evaluating the 2.6 memory manager wrt the same problem as above.
> Before we will propose a change such as above for 2.6, we have
> to convince ourselves that (1) setting vm_swappiness appropriately
> doesn't solve the problem, and (2) that patches such as the ones
> that Nick Piggin has been proposing don't solve the problem
> either, and that (3) there isn't some other mechanism to deal
> with this in 2.6.

I have to admit that the definition of "desktop machine" has changed a 
lot in the last few years, in terms of hardware, but I have been running 
since 486 days with "what can I build/buy for <$2k which best fits my 
overall computing?" With the onset of cheap memory and Opteron, NUMA 
will be a factor in the next few years in all probability, and SMP has 
been since the dual pentium systems were new.

That said, I think that your work will be useful, even if it is used 
piecemeal or as inspiration to Nick, Andrea, and others who have been 
working in the area. I find Nick's work as of 2.6.7-rc1-mm1 so good I 
haven't moved any of my desktop machines beyond it, but it sounds as if 
your work addresses the issue I mentioned about limiting buffer usage, 
and Rik's comment that the code lacks checks and balances. You seem to 
have a balance, I'd love to see it.


-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

* Re: why swap at all?
       [not found] ` <fa.kfm8lru.1l2mdp4@ifi.uio.no>
  2004-06-08 15:12   ` Ray Bryant
@ 2004-06-08 15:15   ` Ray Bryant
  2004-06-09 19:24     ` Bill Davidsen
  1 sibling, 1 reply; 146+ messages in thread
From: Ray Bryant @ 2004-06-08 15:15 UTC (permalink / raw)
  To: Buddy Lumpkin
  Cc: 'Bill Davidsen', 'Con Kolivas', 'FabF',
	'Bernd Eckenfels',
	linux-kernel, lse-tech, linux-mm


Buddy Lumpkin wrote:
>  <snip> One method would be to keep the
> pagecache on its own list, and move pages to the head of the list any time
> they are modified or referenced, and reclaim from the tail. 
> 
> All pages on this list can be considered as "free memory", because any new
> memory requests would just cause pages to be evicted from the tail of the
> list.
> 

We have code running on Altix that does exactly this.  (Please note,
however, that this is for our version of Linux 2.4.21 -- Yeah, it's
old, but that is what the product runs at the moment -- we are in
the process of switching over to Linux 2.6 when all of this will
have to be re-evaluated.)  The changes are in three parts:

(1)  We added a new page list, the reclaim list.  Pages are put
onto the reclaim list when they are inserted into the page cache.
They are removed from the list when they are marked dirty (buffers
from the page go on to the LRU dirty list) or when the pages are
mmap'd into an address space, since in either of these situations,
the pages are not reclaimable.  (This list is per node in our
NUMA system.)

(2)  We added code in __alloc_pages() so that if the local node
allocation is going to fail (remember that Altix is a NUMA machine),
we call out to a routine to scan the reclaim list on that node and
to release enough clean buffer cache pages to make the local
allocation succeed (plus a few pages, for efficiency).  If this
doesn't work, we most likely end up spilling the allocation over
to another node.

(3)  We added code in generic_file_write() to limit the size of
the page cache on buffered file I/O write operations.  If the
current size of the page cache is larger than the limit, we
call the same routine as above to release some page cache pages.
If we can't free enough pages to get below the limit, we throttle
the write process by delaying it for a bit.  This was all to
avoid the problem of a large buffered file I/O request causing
the page cache to grow to the point where the system would start
to swap.  (On our large memory systems, dropping into the
swapping code can cause the system to freeze for tens of seconds,
and that is something we would like to avoid).

(We actually don't enforce the page cache limit unless the amount
of free memory has dropped below a certain threshold.  This is to
keep the page cache from being limited if there is lots of free
memory -- even though we only limit the page cache on writes,
it turns out that the kernel is constantly writing to the disk,
so this also effectively causes the page cache to be limited
for reads as well.)

This code was also written in response to customer demand.  They
don't like the fact that the buffer cache grows and grows on our
Altix systems, and they want old buffer cache pages to be cleared
out when they are no longer needed.  Since we almost never suffer
memory pressure on our systems (and if we do, we are likely in
trouble), kswapd almost never does this.  Buffer cache pages can
sit around for days with no one removing them.  The above was one
approach to solve that problem.

Please note: YMMV.  An Altix is not a desktop system and I make
no claims that the above approach is appropriate for everyone.
For us, it turns out to work better to bias storage allocation
against unbridled growth of the page cache.  Indeed, we have
spent a lot of time trying to solve problems related to page
cache on Altix systems.  Assuming we get our OLS paper done
in time, you can read more about this in our paper at OLS.
(If not, we intend to post our experiences paper on the
oss.sgi.com website.)

Finally, let me reiterate that we are beginning the process of
evaluating the 2.6 memory manager wrt the same problem as above.
Before we will propose a change such as above for 2.6, we have
to convince ourselves that (1) setting vm_swappiness appropriately
doesn't solve the problem, and (2) that patches such as the ones
that Nick Piggin has been proposing don't solve the problem
either, and that (3) there isn't some other mechanism to deal
with this in 2.6.
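The three parts described above can be modelled in a few lines. This is
a toy user-space sketch, not the Altix patch: the class and method
names, the oldest-first reclaim order, and all thresholds are
assumptions made for illustration.

```python
from collections import OrderedDict


class NodeReclaimList:
    """Toy model of one node's list of clean, unmapped page-cache pages."""

    def __init__(self, cache_limit, low_free_threshold):
        self.pages = OrderedDict()        # page id -> None, oldest first
        self.cache_limit = cache_limit    # max page-cache pages on writes
        self.low_free = low_free_threshold

    # Part 1: list membership
    def page_cached(self, page):
        self.pages[page] = None           # newly cached pages are reclaimable

    def page_dirtied_or_mapped(self, page):
        self.pages.pop(page, None)        # no longer cheaply reclaimable

    def reclaim(self, count):
        """Free up to `count` pages, oldest first; return how many were freed."""
        freed = 0
        while freed < count and self.pages:
            self.pages.popitem(last=False)
            freed += 1
        return freed

    # Part 2: local allocation about to fail
    def alloc_fallback(self, needed, extra=4):
        """Return True if reclaim freed enough for the local allocation."""
        return self.reclaim(needed + extra) >= needed

    # Part 3: buffered-write throttle
    def before_write(self, cache_size, free_pages):
        """Return 'ok', 'reclaimed' or 'throttle' for a buffered write."""
        if free_pages >= self.low_free or cache_size <= self.cache_limit:
            return "ok"                   # limit not enforced
        over = cache_size - self.cache_limit
        if self.reclaim(over) >= over:
            return "reclaimed"
        return "throttle"                 # caller should delay the writer


node = NodeReclaimList(cache_limit=100, low_free_threshold=50)
for p in range(10):
    node.page_cached(p)
node.page_dirtied_or_mapped(3)
print(node.alloc_fallback(needed=2))                      # True
print(node.before_write(cache_size=120, free_pages=10))   # throttle
```

In the last call only three reclaimable pages remain, so the model
cannot get back under the limit and asks the writer to wait, which
mirrors the throttling step in part (3).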

Stay tuned for results of same.

 > <snip>
-- 
Best Regards,
Ray
-----------------------------------------------
                   Ray Bryant
512-453-9679 (work)         512-507-7807 (cell)
raybry@sgi.com             raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
            so I installed Linux.
-----------------------------------------------


* Re:  why swap at all?
  2004-06-08  1:18               ` Tim Connors
@ 2004-06-08  5:29                 ` Denis Vlasenko
  0 siblings, 0 replies; 146+ messages in thread
From: Denis Vlasenko @ 2004-06-08  5:29 UTC (permalink / raw)
  To: Tim Connors, John Bradford
  Cc: William Lee Irwin III, Nick Piggin, Michael Brennan, linux-kernel

On Tuesday 08 June 2004 04:18, Tim Connors wrote:
> I just got an interesting problem - possibly (or not?) related to
> this:
>
> I have my laptop with 256MB of RAM, running 2.4.25-pre7, uptime 48
> days. Every morning, I come in and have to wait 2 minutes while
> everything comes back into RAM, after the daily slocate. So I did a
> swapoff -a, and it failed despite all the applications and cache and
> tmpfs adding up to far less than 256MB (more like 128).
>
> I closed mozilla, which let me do a swapoff -a.
>
> All was well for a few days, but then this morning, my partitions were
> mounted ro, and an oops was in syslog at the same time as all the
> slocate work:
[alloc failures + oops snipped]

A prolonged OOM condition triggers lots of rarely used error paths
in the kernel (and applications). Most probably slocate hit one of the bugs
still living in one of them.

> So OOM - but why? The cache was registering 65MB used.
> 24353,23> cat /proc/meminfo
>         total:    used:    free:  shared: buffers:  cached:
> Mem:  262647808 256618496  6029312        0  4820992 67239936
> Swap:        0        0        0
> MemTotal:       256492 kB
> MemFree:          5888 kB
> MemShared:           0 kB
> Buffers:          4708 kB
> Cached:          65664 kB
> SwapCached:          0 kB
> Active:          77944 kB
> Inactive:       142308 kB
> HighTotal:           0 kB
> HighFree:            0 kB
> LowTotal:       256492 kB
> LowFree:          5888 kB
> SwapTotal:           0 kB
> SwapFree:            0 kB

Maybe this is not the state of meminfo at the time of the OOM
condition. The oops killed the task and freed its memory,
which is now used by cache.

> Why was it so eager to kill applications, and not reclaim some of that
> swap space? Is this a problem that is known on 2.4, and can't be fixed
> (I can't use 2.6 on my laptop yet, far too many problems to even
> start - eg the suspend to ram on APM thread).
>
> Is there another output of a /proc file you want? I'll try not to get
> the urge to use/reboot the box in the meantime.

vmstat log of this event may be useful.
-- 
vda

* Re:  why swap at all?
  2004-06-01  9:10             ` John Bradford
@ 2004-06-08  1:18               ` Tim Connors
  2004-06-08  5:29                 ` Denis Vlasenko
  0 siblings, 1 reply; 146+ messages in thread
From: Tim Connors @ 2004-06-08  1:18 UTC (permalink / raw)
  To: John Bradford
  Cc: William Lee Irwin III, Nick Piggin, Michael Brennan, linux-kernel

John Bradford <john@grabjohn.com> said on Tue, 1 Jun 2004 10:10:32 +0100:
> Quote from William Lee Irwin III <wli@holomorphy.com>:
> > Quote from William Lee Irwin III <wli@holomorphy.com>:
> > > to reproduce these 'swap increases performance even with untouched RAM'
> > > claims.
> > 
> > Because ZONE_DMA, the lower 16MB is not all of RAM.
> 
> Ah, OK, this isn't really my area of expertise so maybe this is a stupid, (for
> LKML), question, but can we only migrate data from low RAM via swap!?
> 
> Also, surely this is only relevant to X86 architectures?


I just got an interesting problem - possibly (or not?) related to
this:

I have my laptop with 256MB of RAM, running 2.4.25-pre7, uptime 48
days. Every morning, I come in and have to wait 2 minutes while
everything comes back into RAM, after the daily slocate. So I did a
swapoff -a, and it failed despite all the applications and cache and
tmpfs adding up to far less than 256MB (more like 128).

I closed mozilla, which let me do a swapoff -a.

All was well for a few days, but then this morning, my partitions were
mounted ro, and an oops was in syslog at the same time as all the
slocate work:

Jun  8 07:55:24 scuzzie kernel: VM: killing process xemacs
Jun  8 07:56:17 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:56:17 scuzzie kernel: VM: killing process mailoops
Jun  8 07:57:30 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d0/0)
Jun  8 07:57:34 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:35 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:35 scuzzie kernel: VM: killing process sh
Jun  8 07:57:35 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:35 scuzzie kernel: VM: killing process sh
Jun  8 07:57:38 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:38 scuzzie kernel: VM: killing process ssh-agent
Jun  8 07:57:38 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:38 scuzzie kernel: VM: killing process gunzip
Jun  8 07:57:40 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:40 scuzzie kernel: VM: killing process gunzip
Jun  8 07:57:40 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:40 scuzzie kernel: VM: killing process gunzip
Jun  8 07:57:41 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d0/0)
Jun  8 07:57:41 scuzzie kernel: ERROR: (device ide0(3,5)): txAbortCommit
Jun  8 07:57:41 scuzzie kernel: BUG at jfs_txnmgr.c:509 assert(tblk->next == 0)
Jun  8 07:57:41 scuzzie kernel: kernel BUG at jfs_txnmgr.c:509!
Jun  8 07:57:41 scuzzie kernel: invalid operand: 0000
Jun  8 07:57:41 scuzzie kernel: CPU:    0
Jun  8 07:57:41 scuzzie kernel: EIP:    0010:[txEnd+235/256]    Not tainted
Jun  8 07:57:41 scuzzie kernel: EFLAGS: 00010282
Jun  8 07:57:41 scuzzie kernel: eax: 00000033   ebx: d0811c98   ecx: c12dc000   edx: cd9f5f7c
Jun  8 07:57:41 scuzzie kernel: esi: 00000102   edi: cfec1d60   ebp: cf4eb000   esp: c12ddf18
Jun  8 07:57:41 scuzzie kernel: ds: 0018   es: 0018   ss: 0018
Jun  8 07:57:41 scuzzie kernel: Process keventd (pid: 2, stackpage=c12dd000)
Jun  8 07:57:41 scuzzie kernel: Stack: c028ed43 c028f190 000001fd c028f180 fffffffb 00000102 cf4eb05c c017db9e 
Jun  8 07:57:41 scuzzie kernel:        00000102 00000001 c12ddf54 00000000 00000000 c35c5220 c017dc0e c35c5220 
Jun  8 07:57:41 scuzzie kernel:        00000000 00000001 c014d3f2 c35c5220 00000000 00000003 c12ddf88 c12ddf88 
Jun  8 07:57:41 scuzzie kernel: Call Trace:    [jfs_commit_inode+222/256] [jfs_write_inode+78/96] [try_to_sync_unused_inodes+466/480] [__run_task_queue+90/112] [context_thread+442/448]
Jun  8 07:57:41 scuzzie kernel:   [context_thread+0/448] [rest_init+0/64] [arch_kernel_thread+46/64] [context_thread+0/448]
Jun  8 07:57:41 scuzzie kernel: 
Jun  8 07:57:41 scuzzie kernel: Code: 0f 0b fd 01 90 f1 28 c0 e9 71 ff ff ff 90 8d b4 26 00 00 00 
Jun  8 07:57:41 scuzzie kernel:  <5>__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:41 scuzzie kernel: VM: killing process gunzip
Jun  8 07:57:49 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jun  8 07:57:49 scuzzie kernel: VM: killing process mozilla-bin
Jun  8 07:57:49 scuzzie kernel: __alloc_pages: 0-order allocation failed (gfp=0x1f0/0)
Jun  8 07:58:09 scuzzie wwwoffles[25878]: Cannot make a lock file for 'http/www.bom.gov.au/LkdPT+cpeS73Lf-picjpvUw' [Read-only file system].
Jun  8 07:58:58 scuzzie anacron[16529]: Job `cron.weekly' terminated (exit status: 1) (mailing output)
Jun  8 07:58:59 scuzzie modprobe: modprobe: cannot create /var/log/ksymoops/20040608.log Read-only file system
Jun  8 07:58:59 scuzzie modprobe: modprobe: cannot create /var/log/ksymoops/20040608.log Read-only file system
Jun  8 07:58:59 scuzzie anacron[16529]: Tried to mail output of job `cron.weekly', but mailer process (/usr/sbin/sendmail) exited with ststus 1
Jun  8 07:58:59 scuzzie anacron[16529]: Normal exit (2 jobs run)



So OOM - but why? The cache was registering 65MB used.
24353,23> cat /proc/meminfo 
        total:    used:    free:  shared: buffers:  cached:
Mem:  262647808 256618496  6029312        0  4820992 67239936
Swap:        0        0        0
MemTotal:       256492 kB
MemFree:          5888 kB
MemShared:           0 kB
Buffers:          4708 kB
Cached:          65664 kB
SwapCached:          0 kB
Active:          77944 kB
Inactive:       142308 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       256492 kB
LowFree:          5888 kB
SwapTotal:           0 kB
SwapFree:            0 kB

Why was it so eager to kill applications, and not reclaim some of that
swap space? Is this a problem that is known on 2.4, and can't be fixed
(I can't use 2.6 on my laptop yet, far too many problems to even
start - eg the suspend to ram on APM thread).

Is there another output of a /proc file you want? I'll try not to get
the urge to use/reboot the box in the meantime.
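As a quick sanity check on the numbers above, the kB figures from the
/proc/meminfo dump can be tallied. A small sketch; the values are copied
from the dump above, and the helper name parse_kb is invented for
illustration.

```python
MEMINFO = """\
MemTotal:       256492 kB
MemFree:          5888 kB
Buffers:          4708 kB
Cached:          65664 kB
"""


def parse_kb(text):
    """Turn 'Name:   1234 kB' lines into a {name: kilobytes} dict."""
    values = {}
    for line in text.splitlines():
        name, rest = line.split(":", 1)
        values[name] = int(rest.split()[0])
    return values


m = parse_kb(MEMINFO)
other = m["MemTotal"] - m["MemFree"] - m["Buffers"] - m["Cached"]
print(other, "kB is neither free, buffers nor page cache")
```

That comes to 180232 kB, roughly 176 MB of the 250 MB total, at the
moment the snapshot was taken, which may not be the moment the
allocations failed.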

-- 
TimC -- http://astronomy.swin.edu.au/staff/tconnors/
Anyone seeking the "Relativistic Quantum Mechanics" soft option
course, may wish to leave now. -- Intro lecture to RQM

* Re: why swap at all?
  2004-06-03 14:14                     ` Bill Davidsen
  2004-06-04  7:23                       ` Buddy Lumpkin
  2004-06-04  9:11                       ` Catalin BOIE
@ 2004-06-06 14:39                       ` Rik van Riel
  2 siblings, 0 replies; 146+ messages in thread
From: Rik van Riel @ 2004-06-06 14:39 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Con Kolivas, FabF, Bernd Eckenfels, linux-kernel

On Thu, 3 Jun 2004, Bill Davidsen wrote:

> My perception is that the system is really bad at recognizing 
> diminishing returns to be had by paging programs for the benefit of i/o. 

Currently the kernel has no mechanisms at all to do any
kind of detection of bad pageout decisions it made in
the past, and consequently no way to learn for the future.

Checks and balances ... those are what's missing ;)

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-04  9:11                       ` Catalin BOIE
@ 2004-06-04 17:24                         ` Bill Davidsen
  0 siblings, 0 replies; 146+ messages in thread
From: Bill Davidsen @ 2004-06-04 17:24 UTC (permalink / raw)
  To: Catalin BOIE; +Cc: Con Kolivas, FabF, Bernd Eckenfels, linux-kernel

Catalin BOIE wrote:
> Hello!
> 
>> But swap behaviour kills performance even when memory is more than 
>> adequate. Consider building a DVD image in a 4GB system. The i/o 
>> forces all of the unused programs out, in spite of the fact that an 
>> extra 100MB doesn't make a measurable difference in performance. But 
>> when I click Mozilla paging most of it in from disk make a big 
>> difference in performance to the user.
> 
> 
> I think that kernel cannot know that you need some data once or more.
> This is fadvise for.
> With my wrapper (http://kernel.umbrella.ro) for fadvise you can do this:
> NOCA_SIZE=128 NOCA_READ=1 NOCA_WRITE=1 NOCA_RA=1 \
>     noca mkisofs -R -o /tmp/1.iso /tmp/data
> 
> This means:
> NOCA_SIZE: Call fadvise only after 128KiB was read/wrote.
> NOCA_RA: call fadvise with POSIX_FADV_SEQUENTIAL
> NOCA_READ: use fadvise(POSIX_FADV_DONTNEED) for reads (because you don't 
> need anymore the source files)
> NOCA_WRITE: use fadvise(POSIX_FADV_DONTNEED) for writes (because it's 
> useless to cache the end of the ISO)
> 
> Do this program resolve your problem?

It addresses one of the cases which trigger problems, certainly. Thank you.
> 
>> The problems with small memory are different in kind, when not even 
>> the programs will fit in memory at the same time, or will leave next 
>> to nothing for i/o, swap is required for performance. But on a large 
>> memory system I believe the gain to pain ratio is way too low with the 
>> current VM. The solution at the moment is to turn off swap, which as 
>> you note has other problems (can't move between zones without swap?) 
>> which in theory could really hang a system.


-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-04  7:23                       ` Buddy Lumpkin
@ 2004-06-04 17:08                         ` Bill Davidsen
  2004-06-15 14:55                           ` Charles Shannon Hendrix
  0 siblings, 1 reply; 146+ messages in thread
From: Bill Davidsen @ 2004-06-04 17:08 UTC (permalink / raw)
  To: Buddy Lumpkin
  Cc: 'Con Kolivas', 'FabF', 'Bernd Eckenfels',
	linux-kernel

Buddy Lumpkin wrote:
>>But swap behaviour kills performance even when memory is more than 
>>adequate. Consider building a DVD image in a 4GB system. The i/o forces 
>>all of the unused programs out, in spite of the fact that an extra 100MB 
>>doesn't make a measurable difference in performance. But when I click 
>>Mozilla paging most of it in from disk make a big difference in 
>>performance to the user.
> 
> 
> 
> We really need a server option. Something that ages out file backed pages
> naturally with less overhead than kswapd. One method would be to keep the
> pagecache on it's own list, and move pages to the head of the list any time
> they are modified or referenced, and reclaim from the tail. 
> 
> All pages on this list can be considered as "free memory", because any new
> memory requests would just cause pages to be evicted from the tail of the
> list.
> 
> Anonymous memory would *not* be on this list. This way any time anonymous
> memory is allocated, the pages can be readily stolen from the pagecache
> list.
> 
> Lastly one nifty configuration parameter that could exist as a knob for
> sys-admins is the ability to tell the VM not to add file backed pages with
> the execute bit set to the page cache list but rather, leave them to be
> reclaimed if kswapd wakes up in a true low memory situation (pagecache is
> exhausted and memory is still low). This would require a sys-admin to make
> sure only executables have the execute bit set and "data files", etc... do
> not have the execute bit set.

Or have the exec() call set a "part of a process" flag. That means that 
if I read an executable in as data it doesn't get locked, other than 
what part might be in my i/o buffers. And mmap can produce different 
effects than read/write which may be good, if they are GOOD different 
effects ;-) Before you ask, think 'strings' as to why the average user does this.

But I fail to make my point... I want to limit how much memory is used 
for i/o buffers, cache, or anything else which will produce memory 
pressure on my programs. The quick solution might be just a number from 
the admin, like the 2.2 patch, but some kernel logic to understand that 
while 20MB is much better than 10MB in a tiny system, 2GB is not a lot 
better than 1GB in a large memory system, and having a sync() bog the 
system for tens of seconds is undesirable. Well, maybe some folks don't 
agree, it could be that the admin set limit is really the way to go.

I regard this as a desktop issue, trading some i/o performance to keep 
window changes fast.
> 
> 
> A system that works like this is nice for the following reasons:
> 
> 1) The system administrator can size a system so that all programs
>     Safely run within physical RAM. Extra RAM
>     Could be added and sized based on the need
>     for caching files.
> 
> 2) Anonymous pages (and possibly executable if you read 
>      the last paragraph above) will only be evicted if kswapd is
>      awaken due to a true memory shortage (1/128th pagable memory?).
> 
> 
> I like to view the VM system as always being full, because if enough unique
> file system IO takes place, that is exactly what eventually happens. A
> system that counts page cache as free memory and uses a gentler mechanism to
> evict pages from the page cache would benefit IO bound servers significantly
> IMHO.

That's what would be nice with tuning, the admin can optimize what is 
important on that system. I am usually happy with what the system does 
on i/o, but I want my 500MB or so of programs to stay resident in a 2GB 
machine, and if that adds a ms or two to i/o I can live with it, so that 
when I change windows it happens now, not eventually. And I bet there 
are a lot of others who would like better response to focus changes as well.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-03 14:14                     ` Bill Davidsen
  2004-06-04  7:23                       ` Buddy Lumpkin
@ 2004-06-04  9:11                       ` Catalin BOIE
  2004-06-04 17:24                         ` Bill Davidsen
  2004-06-06 14:39                       ` Rik van Riel
  2 siblings, 1 reply; 146+ messages in thread
From: Catalin BOIE @ 2004-06-04  9:11 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Con Kolivas, FabF, Bernd Eckenfels, linux-kernel

Hello!

> But swap behaviour kills performance even when memory is more than adequate. 
> Consider building a DVD image in a 4GB system. The i/o forces all of the 
> unused programs out, in spite of the fact that an extra 100MB doesn't make a 
> measurable difference in performance. But when I click Mozilla paging most of 
> it in from disk make a big difference in performance to the user.

I think the kernel cannot know whether you need some data once or more than once.
This is what fadvise is for.
With my wrapper (http://kernel.umbrella.ro) for fadvise you can do this:
NOCA_SIZE=128 NOCA_READ=1 NOCA_WRITE=1 NOCA_RA=1 \
 	noca mkisofs -R -o /tmp/1.iso /tmp/data

This means:
NOCA_SIZE: Call fadvise only after 128KiB was read/written.
NOCA_RA: call fadvise with POSIX_FADV_SEQUENTIAL
NOCA_READ: use fadvise(POSIX_FADV_DONTNEED) for reads (because you no 
longer need the source files)
NOCA_WRITE: use fadvise(POSIX_FADV_DONTNEED) for writes (because it's 
useless to cache the end of the ISO)

Does this program solve your problem?
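
A minimal sketch of the kind of calls described above, assuming a plain sequential reader. This is not the noca source: the function name read_without_caching and the advise_every parameter are hypothetical, and only posix_fadvise() with POSIX_FADV_SEQUENTIAL and POSIX_FADV_DONTNEED comes from the description.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose posix_fadvise() */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of noca): read a file front to back and,
 * every `advise_every` bytes, tell the kernel that the range already
 * consumed is no longer needed. Returns the bytes read, or -1 on error.
 */
long read_without_caching(const char *path, size_t advise_every)
{
    char buf[65536];
    off_t done = 0, advised = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* Comparable to NOCA_RA as described: hint sequential access. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        done += n;
        if ((size_t)(done - advised) >= advise_every) {
            /* Comparable to NOCA_READ: drop the range [advised, done). */
            (void)posix_fadvise(fd, advised, done - advised,
                                POSIX_FADV_DONTNEED);
            advised = done;
        }
    }

    close(fd);
    return n < 0 ? -1 : (long)done;
}
```

The hints are advisory, so the return value of posix_fadvise() is ignored here. Passing 128 * 1024 as advise_every would correspond to NOCA_SIZE=128 in the command above.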

> The problems with small memory are different in kind, when not even the 
> programs will fit in memory at the same time, or will leave next to nothing 
> for i/o, swap is required for performance. But on a large memory system I 
> believe the gain to pain ratio is way too low with the current VM. The 
> solution at the moment is to turn off swap, which as you note has other 
> problems (can't move between zones without swap?) which in theory could 
> really hang a system.
>
> -- 
>   -bill davidsen (davidsen@tmr.com)
> "The secret to procrastination is to put things off until the
> last possible moment - but no longer"  -me
>

---
Catalin(ux aka Dino) BOIE
catab at deuroconsult.ro
http://kernel.umbrella.ro/

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-06-03 14:14                     ` Bill Davidsen
@ 2004-06-04  7:23                       ` Buddy Lumpkin
  2004-06-04 17:08                         ` Bill Davidsen
  2004-06-04  9:11                       ` Catalin BOIE
  2004-06-06 14:39                       ` Rik van Riel
  2 siblings, 1 reply; 146+ messages in thread
From: Buddy Lumpkin @ 2004-06-04  7:23 UTC (permalink / raw)
  To: 'Bill Davidsen', 'Con Kolivas'
  Cc: 'FabF', 'Bernd Eckenfels', linux-kernel

> But swap behaviour kills performance even when memory is more than 
> adequate. Consider building a DVD image in a 4GB system. The i/o forces 
> all of the unused programs out, in spite of the fact that an extra 100MB 
> doesn't make a measurable difference in performance. But when I click 
> Mozilla paging most of it in from disk make a big difference in 
> performance to the user.


We really need a server option. Something that ages out file backed pages
naturally with less overhead than kswapd. One method would be to keep the
pagecache on its own list, and move pages to the head of the list any time
they are modified or referenced, and reclaim from the tail. 

All pages on this list can be considered as "free memory", because any new
memory requests would just cause pages to be evicted from the tail of the
list.

Anonymous memory would *not* be on this list. This way any time anonymous
memory is allocated, the pages can be readily stolen from the pagecache
list.

Lastly one nifty configuration parameter that could exist as a knob for
sys-admins is the ability to tell the VM not to add file backed pages with
the execute bit set to the page cache list but rather, leave them to be
reclaimed if kswapd wakes up in a true low memory situation (pagecache is
exhausted and memory is still low). This would require a sys-admin to make
sure only executables have the execute bit set and "data files", etc... do
not have the execute bit set.


A system that works like this is nice for the following reasons:

1) The system administrator can size a system so that all programs
    safely run within physical RAM. Extra RAM
    could be added and sized based on the need
    for caching files.

2) Anonymous pages (and possibly executable if you read 
     the last paragraph above) will only be evicted if kswapd is
     awakened due to a true memory shortage (1/128th pageable memory?).


I like to view the VM system as always being full, because if enough unique
file system IO takes place, that is exactly what eventually happens. A
system that counts page cache as free memory and uses a gentler mechanism to
evict pages from the page cache would benefit IO bound servers significantly
IMHO.
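
The separate pagecache list proposed above could be sketched roughly as follows. This is a userspace illustration under stated assumptions, not kernel code: struct pc_page, struct pc_list and the pc_* functions are hypothetical names, and a real implementation would need locking and a link to the actual page structures.

```c
#include <stddef.h>

/* Hypothetical stand-in for a file-backed page; not the kernel's struct page. */
struct pc_page {
    int id;
    struct pc_page *prev, *next;
};

/* The pagecache-only list: head = most recently used, tail = oldest. */
struct pc_list {
    struct pc_page *head, *tail;
    size_t count;
};

/* Remove a page that is currently on the list. */
static void pc_unlink(struct pc_list *l, struct pc_page *p)
{
    if (p->prev) p->prev->next = p->next; else l->head = p->next;
    if (p->next) p->next->prev = p->prev; else l->tail = p->prev;
    p->prev = p->next = NULL;
    l->count--;
}

static void pc_push_head(struct pc_list *l, struct pc_page *p)
{
    p->prev = NULL;
    p->next = l->head;
    if (l->head) l->head->prev = p; else l->tail = p;
    l->head = p;
    l->count++;
}

/* A newly cached file page enters at the head. */
void pc_add(struct pc_list *l, struct pc_page *p)
{
    pc_push_head(l, p);
}

/* Called whenever a cached page is referenced or modified. */
void pc_touch(struct pc_list *l, struct pc_page *p)
{
    if (l->head == p)
        return;
    pc_unlink(l, p);
    pc_push_head(l, p);
}

/* Reclaim for a new allocation: take the oldest page, or NULL if empty. */
struct pc_page *pc_reclaim(struct pc_list *l)
{
    struct pc_page *victim = l->tail;

    if (victim)
        pc_unlink(l, victim);
    return victim;
}
```

Anonymous pages never enter this list in the sketch, so an allocation that needs memory only ever steals from the tail of the file cache.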

--Buddy


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-03 23:56                           ` Con Kolivas
@ 2004-06-04  0:16                             ` Con Kolivas
  0 siblings, 0 replies; 146+ messages in thread
From: Con Kolivas @ 2004-06-04  0:16 UTC (permalink / raw)
  To: FabF; +Cc: Valdis.Kletnieks, Bernd Eckenfels, linux-kernel

On Fri, 4 Jun 2004 09:56, Con Kolivas wrote:
> On Fri, 4 Jun 2004 02:16, FabF wrote:
> > On Thu, 2004-06-03 at 01:54, Con Kolivas wrote:
> > > Try this version instead which biases it downwards.
> >
> > I've been unhappy with this one.sw range : 19->60.
> > So I've been playing slightly with sw curve replacing nerve centre with
>
> Are you unhappy with the numbers for swappiness it gives or the feel of it?
> It gives a range of 0-100 in meaningful ways. Your version gives swappiness
> > 100 at times (oops). If this version does not feel good, the last linear
> one is better and you simply dont have enough ram for it to feel good after
> updatedb.

Oh and I forgot to say, if that's the case then you should try Nick's patches 
which are far more sophisticated than this.

Con

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-03 13:54                     ` Bill Davidsen
@ 2004-06-04  0:01                       ` Nick Piggin
  0 siblings, 0 replies; 146+ messages in thread
From: Nick Piggin @ 2004-06-04  0:01 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: linux-kernel

Bill Davidsen wrote:

> Just a thought, I'm pretty well convinced that Nick's latest patches 
> have reduced the problem, at least for me. I'll try to get some metrics 
> on the measured effect, but the "feel" is better by far.
> 

Well thanks for testing them. I have another version with
minor bug fixes and a sync up to recent -mm changes.

http://www.kerneltrap.org/~npiggin/nickvm-267r2m2.gz

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-03 16:16                         ` FabF
@ 2004-06-03 23:56                           ` Con Kolivas
  2004-06-04  0:16                             ` Con Kolivas
  0 siblings, 1 reply; 146+ messages in thread
From: Con Kolivas @ 2004-06-03 23:56 UTC (permalink / raw)
  To: FabF; +Cc: Valdis.Kletnieks, Bernd Eckenfels, linux-kernel

On Fri, 4 Jun 2004 02:16, FabF wrote:
> On Thu, 2004-06-03 at 01:54, Con Kolivas wrote:
> > Try this version instead which biases it downwards.
> I've been unhappy with this one.sw range : 19->60.
> So I've been playing slightly with sw curve replacing nerve centre with

Are you unhappy with the numbers for swappiness it gives or the feel of it? It 
gives a range of 0-100 in meaningful ways. Your version gives swappiness > 
100 at times (oops). If this version does not feel good, the last linear one 
is better and you simply don't have enough ram for it to feel good after 
updatedb.

Con

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-02 23:54                       ` Con Kolivas
@ 2004-06-03 16:16                         ` FabF
  2004-06-03 23:56                           ` Con Kolivas
  0 siblings, 1 reply; 146+ messages in thread
From: FabF @ 2004-06-03 16:16 UTC (permalink / raw)
  To: Con Kolivas; +Cc: Valdis.Kletnieks, Bernd Eckenfels, linux-kernel

On Thu, 2004-06-03 at 01:54, Con Kolivas wrote:
> On Thu, 3 Jun 2004 04:30, FabF wrote:
> > On Wed, 2004-06-02 at 19:59, Valdis.Kletnieks@vt.edu wrote:
> > > On Wed, 02 Jun 2004 07:38:41 +0200, FabF said:
> > > > > Yes but: your wm is so  often used/activated it will not get swaped 
> > > > > out. But if your mouse passes over mozilla and tries to focus it,
> > > > > then you will feel the pain of a swapped-out x program.
> > > >
> > > > Exactly !
> > > > Does autoregulated VM swap. patch could help here ?
> > >
> > > Con's auto-adjusting swappiness patch did in fact help that quite a bit,
> > > especially for the case of heavy file I/O causing process images to be
> > > swapped out.  I need to do some comparisons of that to Nick's MM work...
> >
> > It helps inactive applications to re-ermerge smoothly, heavy I/O and
> > global tuning.I've got 20 swapping delta from start to high usage.
> > That patch rock'n'roll my box until updatedb makes sw climbs up to 80
> > and freezes my box :(
> 
> Try this version instead which biases it downwards.
> 
> Con
> 
> ______________________________________________________________________
> diff -Naurp --exclude-from=dontdiff linux-2.6.7-rc2-base/include/linux/swap.h linux-2.6.7-rc2-am11/include/linux/swap.h
> --- linux-2.6.7-rc2-base/include/linux/swap.h	2004-05-31 21:29:21.000000000 +1000
> +++ linux-2.6.7-rc2-am11/include/linux/swap.h	2004-05-31 23:39:26.020055153 +1000
> @@ -175,6 +175,7 @@ extern void swap_setup(void);
>  extern int try_to_free_pages(struct zone **, unsigned int, unsigned int);
>  extern int shrink_all_memory(int);
>  extern int vm_swappiness;
> +extern int auto_swappiness;
>  
>  #ifdef CONFIG_MMU
>  /* linux/mm/shmem.c */
> diff -Naurp --exclude-from=dontdiff linux-2.6.7-rc2-base/include/linux/sysctl.h linux-2.6.7-rc2-am11/include/linux/sysctl.h
> --- linux-2.6.7-rc2-base/include/linux/sysctl.h	2004-05-31 21:29:21.000000000 +1000
> +++ linux-2.6.7-rc2-am11/include/linux/sysctl.h	2004-05-31 23:39:26.021054997 +1000
> @@ -164,6 +164,7 @@ enum
>  	VM_LAPTOP_MODE=23,	/* vm laptop mode */
>  	VM_BLOCK_DUMP=24,	/* block dump mode */
>  	VM_HUGETLB_GROUP=25,	/* permitted hugetlb group */
> +	VM_AUTO_SWAPPINESS=26,	/* Make vm_swappiness autoregulated */
>  };
>  
> 
> diff -Naurp --exclude-from=dontdiff linux-2.6.7-rc2-base/kernel/sysctl.c linux-2.6.7-rc2-am11/kernel/sysctl.c
> --- linux-2.6.7-rc2-base/kernel/sysctl.c	2004-05-31 21:29:24.000000000 +1000
> +++ linux-2.6.7-rc2-am11/kernel/sysctl.c	2004-05-31 23:40:57.658756170 +1000
> @@ -727,6 +727,14 @@ static ctl_table vm_table[] = {
>  		.extra1		= &zero,
>  		.extra2		= &one_hundred,
>  	},
> +	{
> +		.ctl_name	= VM_AUTO_SWAPPINESS,
> +		.procname	= "autoswappiness",
> +		.data		= &auto_swappiness,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= &proc_dointvec,
> +	},
>  #ifdef CONFIG_HUGETLB_PAGE
>  	 {
>  		.ctl_name	= VM_HUGETLB_PAGES,
> diff -Naurp --exclude-from=dontdiff linux-2.6.7-rc2-base/mm/vmscan.c linux-2.6.7-rc2-am11/mm/vmscan.c
> --- linux-2.6.7-rc2-base/mm/vmscan.c	2004-05-31 21:29:24.000000000 +1000
> +++ linux-2.6.7-rc2-am11/mm/vmscan.c	2004-05-31 23:39:26.051050316 +1000
> @@ -43,6 +43,7 @@
>   * From 0 .. 100.  Higher means more swappy.
>   */
>  int vm_swappiness = 60;
> +int auto_swappiness = 1;
>  static long total_memory;
>  
>  #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
> @@ -634,6 +635,41 @@ refill_inactive_zone(struct zone *zone, 
>  	 */
>  	mapped_ratio = (ps->nr_mapped * 100) / total_memory;
>  
> +	if (auto_swappiness) {
> +#ifdef CONFIG_SWAP
> +		int app_percent;
> +		struct sysinfo i;
> +		
> +		si_swapinfo(&i);
> +			
> +		if (likely(i.totalswap >= 100)) {
> +			int swap_centile;
> +	
> +			/*
> +			 * app_percent is the percentage of physical ram used
> +			 * by application pages.
> +			 */
> +			si_meminfo(&i);
> +			app_percent = 100 - ((i.freeram + get_page_cache_size() -
> +				swapper_space.nrpages) / (i.totalram / 100));
> +	
> +			/*
> +			 * swap_centile is the percentage of the last (sizeof physical
> +			 * ram) of swap free.
> +			 */
> +			swap_centile = i.freeswap / 
> +				(min(i.totalswap, i.totalram) / 100);
> +			/*
> +			 * Autoregulate vm_swappiness to be equal to the lowest of
> +			 * app_percent and swap_centile.  Bias it downwards -ck
> +			 */
> +			vm_swappiness = min(app_percent, swap_centile);
> +			vm_swappiness = vm_swappiness * vm_swappiness / 100;
> +		} else 
> +			vm_swappiness = 0;
> +#endif
> +	}
> +	
>  	/*
>  	 * Now decide how much we really want to unmap some pages.  The mapped
>  	 * ratio is downgraded - just because there's a lot of mapped memory

I've been unhappy with this one. sw range: 19->60.
So I've been playing slightly with the sw curve, replacing the nerve centre with:
vm_swappiness = vm_swappiness * vm_swappiness / 50;
if (vm_swappiness > 100)
        vm_swappiness = 100;

Results:
Warmup: smooth, 30.
Under pressure, swap grows roughly 60*60/50->72, 70*70/50->98 .... and
... rock'n'roll .... Box seems to keep the road this time.
updatedb gives side effects after 30 sec. though.... sw drops to 70 ...
but I've got good global response. I guess we could expose the curve
parameter. That one seems determinant:

vm_swappiness = vm_swappiness * vm_swappiness / swapcurve;
if (vm_swappiness > 100)
        vm_swappiness = 100;

and sysctl stuff as well :

	{
		.ctl_name	= VM_AUTO_SWAPPINESS,
		.procname	= "swapcurvature",
		.data		= &swapcurve,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= &proc_curvature,
	},
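
For illustration, the biasing step with a tunable divisor can be tried in userspace. This is only a sketch of the arithmetic, not kernel code: the function name biased_swappiness is hypothetical, and the clamping of the inputs is an added assumption.

```c
/*
 * Userspace sketch of the biasing step only.
 * raw   : the unbiased value, min(app_percent, swap_centile), 0..100
 * curve : the divisor ("swapcurve" above); 100 matches the quoted patch,
 *         50 is the variant tried here
 */
int biased_swappiness(int raw, int curve)
{
    int sw;

    if (raw < 0)
        raw = 0;
    if (raw > 100)
        raw = 100;
    if (curve < 1)
        curve = 1;          /* assumption: guard against division by zero */

    sw = raw * raw / curve;
    return sw > 100 ? 100 : sw;
}
```

With curve = 50 this gives 72 for a raw value of 60 and 98 for 70, and anything from 71 upwards saturates at 100. With curve = 100 a raw value of 60 gives 36.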

Regards,
FabF


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-03 14:18                     ` Bill Davidsen
@ 2004-06-03 14:27                       ` Con Kolivas
  0 siblings, 0 replies; 146+ messages in thread
From: Con Kolivas @ 2004-06-03 14:27 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Valdis.Kletnieks, linux-kernel

On Fri, 4 Jun 2004 00:18, Bill Davidsen wrote:
> Valdis.Kletnieks@vt.edu wrote:
> > On Wed, 02 Jun 2004 07:38:41 +0200, FabF said:
> >>>Yes but: your wm is so  often used/activated it will not get swaped 
> >>> out. But if your mouse passes over mozilla and tries to focus it, then
> >>> you will feel the pain of a swapped-out x program.
> >>
> >>Exactly !
> >>Does autoregulated VM swap. patch could help here ?
> >
> > Con's auto-adjusting swappiness patch did in fact help that quite a bit,
> > especially for the case of heavy file I/O causing process images to be
> > swapped out.  I need to do some comparisons of that to Nick's MM work...
>
> I haven't had a chance to try Con's stuff, the Nick patch is working
> VERY well for me, small memory and slow system, lots of memory pressure.
> Hopefully you can report a comparison.

Well note there are two revisions available now. The original linear design is 
here:
http://ck.kolivas.org/patches/2.6/2.6.7-rc2/patch-2.6.7-rc2-am11

and there is an exponential curve bias in this one which will probably 
deprecate the last one:
http://ck.kolivas.org/patches/2.6/2.6.7-rc2/patch-2.6.7-rc2-as

I am keen to get more feedback; apart from what I get off list there has been 
very little in the way of reports.
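
As a rough userspace illustration of the difference between the two, based on the patch text quoted elsewhere in this thread: struct mem_snapshot and auto_swappiness_sketch are hypothetical names, all sizes are in pages, and the extra check on totalram is an added assumption. Whether each URL contains exactly this arithmetic has not been verified here.

```c
/* Hypothetical snapshot of the counters the patch reads; sizes in pages. */
struct mem_snapshot {
    unsigned long totalram, freeram, pagecache, swapcache;
    unsigned long totalswap, freeswap;
};

static long clamp_percent(long v)
{
    return v < 0 ? 0 : (v > 100 ? 100 : v);
}

/*
 * bias_down == 0: linear version, swappiness = min(app_percent, swap_centile)
 * bias_down != 0: the result is squared and divided by 100
 */
int auto_swappiness_sketch(const struct mem_snapshot *m, int bias_down)
{
    unsigned long swap_window;
    long app_percent, swap_centile, sw;

    /* Too little swap means swappiness 0; the totalram check is an
     * added assumption to avoid dividing by zero below. */
    if (m->totalswap < 100 || m->totalram < 100)
        return 0;

    /* Percentage of physical ram used by application pages. */
    app_percent = 100 - (long)(m->freeram + m->pagecache - m->swapcache)
                        / (long)(m->totalram / 100);

    /* Percentage free of the last (size of physical ram) of swap. */
    swap_window = m->totalswap < m->totalram ? m->totalswap : m->totalram;
    swap_centile = (long)(m->freeswap / (swap_window / 100));

    sw = clamp_percent(app_percent < swap_centile ? app_percent : swap_centile);
    if (bias_down)
        sw = sw * sw / 100;
    return (int)sw;
}
```

For a machine with 60% of its ram in application pages and plenty of free swap, the linear form gives 60 and the biased form gives 36.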

Con

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all? 
  2004-06-02 17:59                   ` Valdis.Kletnieks
  2004-06-02 18:30                     ` FabF
@ 2004-06-03 14:18                     ` Bill Davidsen
  2004-06-03 14:27                       ` Con Kolivas
  1 sibling, 1 reply; 146+ messages in thread
From: Bill Davidsen @ 2004-06-03 14:18 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: linux-kernel

Valdis.Kletnieks@vt.edu wrote:
> On Wed, 02 Jun 2004 07:38:41 +0200, FabF said:
> 
> 
>>>Yes but: your wm is so  often used/activated it will not get swaped  out. 
>>>But if your mouse passes over mozilla and tries to focus it, then you will
>>>feel the pain of a swapped-out x program.
>>>
>>
>>Exactly !
>>Does autoregulated VM swap. patch could help here ?
> 
> 
> Con's auto-adjusting swappiness patch did in fact help that quite a bit,
> especially for the case of heavy file I/O causing process images to be swapped
> out.  I need to do some comparisons of that to Nick's MM work...

I haven't had a chance to try Con's stuff, the Nick patch is working 
VERY well for me, small memory and slow system, lots of memory pressure. 
Hopefully you can report a comparison.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-02 11:42                   ` Con Kolivas
  2004-06-02 12:22                     ` John Bradford
  2004-06-02 17:06                     ` FabF
@ 2004-06-03 14:14                     ` Bill Davidsen
  2004-06-04  7:23                       ` Buddy Lumpkin
                                         ` (2 more replies)
  2 siblings, 3 replies; 146+ messages in thread
From: Bill Davidsen @ 2004-06-03 14:14 UTC (permalink / raw)
  To: Con Kolivas; +Cc: FabF, Bernd Eckenfels, linux-kernel

Con Kolivas wrote:
> On Wed, 2 Jun 2004 15:38, FabF wrote:
> 
>>On Wed, 2004-06-02 at 01:17, Bernd Eckenfels wrote:
>>
>>>In article <200406012000.i51K0vor019011@turing-police.cc.vt.edu> you 
> 
> wrote:
> 
>>>>out (unlike some, I don't mind if Mozilla or OpenOffice end up out on
>>>>disk after extended inactivity - but if my window manager gets swapped
>>>>out, I get peeved when focus-follows-mouse doesn't and my typing goes
>>>>into the wrong window or some such... ;)
>>>
>>>Yes but: your wm is so  often used/activated it will not get swaped  out.
>>>But if your mouse passes over mozilla and tries to focus it, then you
>>>will feel the pain of a swapped-out x program.
>>
>>Exactly !
>>Does autoregulated VM swap. patch could help here ?
> 
> 
> Unless you are pushing the limits of your available ram by your usage pattern 
> then yes the autoregulated swappiness patch should help.
> 
> available here:
> http://ck.kolivas.org/patches/2.6/2.6.7-rc2/patch-2.6.7-rc2-am11
> 
> Just a brief word that might clarify things for people. It seems this huge 
> swap discussion centres around 2 different arguments. Akpm has said that the 
> correct way for the vm to behave is that of swappiness=100. Desktop users 
> note they have less swap out of the programs they use with swappiness 0 or 
> their swap turned off. When your swappiness is set high, the current vm 
> decisions are the fastest they can be, but when you go back to your 
> applications they will take longer to restart. When your swappiness is set 
> low your applications will restart rapidly, but the current vm will be doing 
> more work and be slower. Most benchmarks will show the latter, but most 
> desktop users will feel the former and not really notice the latter.
> 
> Try the little experiment to see: Boot with mem=128M and try to compile a 2.6 
> kernel with all the debugging symbols option enabled - do this with 
> swappiness set to 0 and then at 100. You'll see it compile much faster at 
> 100. Yet you know that if you set your swappiness to 0 mozilla will load 
> faster next time you use it on your desktop during your normal usage pattern 
> (of course you'd probably be using mozilla on a system with a bit more than 
> 128M ram but this helps demonstrate the point). 
> 
> Does this explain in coarse examples to the desktop users why ideal systems 
> shouldn't be swap disabled or swappiness=0 ?
> 
> The autoregulated swappiness patch tries to get some sort of common ground, 
> where it sacrifices performance slightly currently to improve what happens 
> the next time you use your machine substantially. Because it changes with the 
> amount of application pages in ram, it will not increasingly sacrifice 
> performance when your memory is full with application pages. What it will not 
> do is improve the swap thrash situation when you have grossly overloaded your 
> ram.

But swap behaviour kills performance even when memory is more than 
adequate. Consider building a DVD image in a 4GB system. The i/o forces 
all of the unused programs out, in spite of the fact that an extra 100MB 
doesn't make a measurable difference in performance. But when I click 
Mozilla, paging most of it in from disk makes a big difference in 
performance to the user.

My perception is that the system is really bad at recognizing 
diminishing returns to be had by paging programs for the benefit of i/o. 
Not to mention what happens if you get 2-3GB of dirty buffers and then 
do a sync(). In practice my little RAID array will take at most 40MB/s, 
so the creation of a DVD runs fast, but the system bogs right after.

The problems with small memory are different in kind: when not even the 
programs will fit in memory at the same time, or will leave next to 
nothing for i/o, swap is required for performance. But on a large memory 
system I believe the gain to pain ratio is way too low with the current 
VM. The solution at the moment is to turn off swap, which as you note 
has other problems (can't move between zones without swap?) which in 
theory could really hang a system.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 21:15                   ` FabF
  2004-06-01 21:40                     ` Valdis.Kletnieks
@ 2004-06-03 13:54                     ` Bill Davidsen
  2004-06-04  0:01                       ` Nick Piggin
  1 sibling, 1 reply; 146+ messages in thread
From: Bill Davidsen @ 2004-06-03 13:54 UTC (permalink / raw)
  To: linux-kernel

FabF wrote:

> 	As I said, I think this thread is "becoming offtopic" but what can be
> interesting is the swapping problem fragmentation :
> 
> 	1.Global inactivity (what you're talking about)
> 	2.Application isolation (what we're talking about).
> 
> Geek or not, someone backgrounding an application doesn't want it to
> down the box for X seconds some minutes later when it comes back and
> such things arrive many times a day.Maybe you've got an idea about a
> better rule(s) then ? (I mean for the 2 cases)

Maybe what we need is a per-process tuner like nice, for swap candidacy. 
Unfortunately doing it right is probably 2.7 material, you want users to 
be able to set it DOWN for seldom used things, but not UP where they 
could hog the system. And I think 'right' also means having a capability 
for setting it UP again, etc.

Note that there are some hooks which *might* be useful for quick use: 
there is a sticky bit which seems pretty unused in practice, and which 
might cause pages to be marked less likely to swap. You could implement 
in exec() to do the setting, with whatever access control seems useful.

Just a thought, I'm pretty well convinced that Nick's latest patches 
have reduced the problem, at least for me. I'll try to get some metrics 
on the measured effect, but the "feel" is better by far.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27 12:41 ` William Lee Irwin III
  2004-05-27 15:59   ` John Bradford
@ 2004-06-03 13:38   ` Bill Davidsen
  1 sibling, 0 replies; 146+ messages in thread
From: Bill Davidsen @ 2004-06-03 13:38 UTC (permalink / raw)
  To: linux-kernel

William Lee Irwin III wrote:
> On Thu, May 27, 2004 at 08:31:26AM -0400, Piszcz, Justin Michael wrote:
> 
>>If I have 16GB of ram should I use swap?
>>Would swap cause the machine to slow down?
> 
> 
> Yes. You want swap so you can physically relocate anonymous pages in the
> rare case one ends up somewhere it could cause memory pressure against
> allocations that can only be satisfied by a restricted range of memory.

It would seem that the o/s has enough information to separate pages into 
categories such as 'part of a program,' 'unwritten user write() data,' 
'user read() data sequential,' 'user read() data random' (read after seek) 
and the like. It would be nice if admins could do tuning on how the o/s 
weights giving these memory. The swappiness tuner is certainly a start, 
in practice it does help with atypical loads.

And Nick's latest stuff against 2.6.7-rc1-mm1 certainly seems to work 
very well on my little 96MB slow box with a few dozen windows open. I 
would call it the best I've run on this box, ever.


-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-02 18:30                     ` FabF
@ 2004-06-02 23:54                       ` Con Kolivas
  2004-06-03 16:16                         ` FabF
  0 siblings, 1 reply; 146+ messages in thread
From: Con Kolivas @ 2004-06-02 23:54 UTC (permalink / raw)
  To: FabF; +Cc: Valdis.Kletnieks, Bernd Eckenfels, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 975 bytes --]

On Thu, 3 Jun 2004 04:30, FabF wrote:
> On Wed, 2004-06-02 at 19:59, Valdis.Kletnieks@vt.edu wrote:
> > On Wed, 02 Jun 2004 07:38:41 +0200, FabF said:
> > > > Yes but: your wm is so often used/activated it will not get swapped
> > > > out. But if your mouse passes over mozilla and tries to focus it,
> > > > then you will feel the pain of a swapped-out X program.
> > >
> > > Exactly!
> > > Could the autoregulated VM swappiness patch help here?
> >
> > Con's auto-adjusting swappiness patch did in fact help that quite a bit,
> > especially for the case of heavy file I/O causing process images to be
> > swapped out.  I need to do some comparisons of that to Nick's MM work...
>
> It helps inactive applications re-emerge smoothly, and it helps with heavy
> I/O and global tuning. I've got a swappiness delta of 20 from start to high
> usage. That patch rock'n'rolls my box until updatedb makes swappiness climb
> up to 80 and freezes my box :(

Try this version instead which biases it downwards.

Con

[-- Attachment #2: patch-2.6.7-rc2-as --]
[-- Type: text/x-diff, Size: 3336 bytes --]

diff -Naurp --exclude-from=dontdiff linux-2.6.7-rc2-base/include/linux/swap.h linux-2.6.7-rc2-am11/include/linux/swap.h
--- linux-2.6.7-rc2-base/include/linux/swap.h	2004-05-31 21:29:21.000000000 +1000
+++ linux-2.6.7-rc2-am11/include/linux/swap.h	2004-05-31 23:39:26.020055153 +1000
@@ -175,6 +175,7 @@ extern void swap_setup(void);
 extern int try_to_free_pages(struct zone **, unsigned int, unsigned int);
 extern int shrink_all_memory(int);
 extern int vm_swappiness;
+extern int auto_swappiness;
 
 #ifdef CONFIG_MMU
 /* linux/mm/shmem.c */
diff -Naurp --exclude-from=dontdiff linux-2.6.7-rc2-base/include/linux/sysctl.h linux-2.6.7-rc2-am11/include/linux/sysctl.h
--- linux-2.6.7-rc2-base/include/linux/sysctl.h	2004-05-31 21:29:21.000000000 +1000
+++ linux-2.6.7-rc2-am11/include/linux/sysctl.h	2004-05-31 23:39:26.021054997 +1000
@@ -164,6 +164,7 @@ enum
 	VM_LAPTOP_MODE=23,	/* vm laptop mode */
 	VM_BLOCK_DUMP=24,	/* block dump mode */
 	VM_HUGETLB_GROUP=25,	/* permitted hugetlb group */
+	VM_AUTO_SWAPPINESS=26,	/* Make vm_swappiness autoregulated */
 };
 
 
diff -Naurp --exclude-from=dontdiff linux-2.6.7-rc2-base/kernel/sysctl.c linux-2.6.7-rc2-am11/kernel/sysctl.c
--- linux-2.6.7-rc2-base/kernel/sysctl.c	2004-05-31 21:29:24.000000000 +1000
+++ linux-2.6.7-rc2-am11/kernel/sysctl.c	2004-05-31 23:40:57.658756170 +1000
@@ -727,6 +727,14 @@ static ctl_table vm_table[] = {
 		.extra1		= &zero,
 		.extra2		= &one_hundred,
 	},
+	{
+		.ctl_name	= VM_AUTO_SWAPPINESS,
+		.procname	= "autoswappiness",
+		.data		= &auto_swappiness,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
 #ifdef CONFIG_HUGETLB_PAGE
 	 {
 		.ctl_name	= VM_HUGETLB_PAGES,
diff -Naurp --exclude-from=dontdiff linux-2.6.7-rc2-base/mm/vmscan.c linux-2.6.7-rc2-am11/mm/vmscan.c
--- linux-2.6.7-rc2-base/mm/vmscan.c	2004-05-31 21:29:24.000000000 +1000
+++ linux-2.6.7-rc2-am11/mm/vmscan.c	2004-05-31 23:39:26.051050316 +1000
@@ -43,6 +43,7 @@
  * From 0 .. 100.  Higher means more swappy.
  */
 int vm_swappiness = 60;
+int auto_swappiness = 1;
 static long total_memory;
 
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
@@ -634,6 +635,41 @@ refill_inactive_zone(struct zone *zone, 
 	 */
 	mapped_ratio = (ps->nr_mapped * 100) / total_memory;
 
+	if (auto_swappiness) {
+#ifdef CONFIG_SWAP
+		int app_percent;
+		struct sysinfo i;
+		
+		si_swapinfo(&i);
+			
+		if (likely(i.totalswap >= 100)) {
+			int swap_centile;
+	
+			/*
+			 * app_percent is the percentage of physical ram used
+			 * by application pages.
+			 */
+			si_meminfo(&i);
+			app_percent = 100 - ((i.freeram + get_page_cache_size() -
+				swapper_space.nrpages) / (i.totalram / 100));
+	
+			/*
+			 * swap_centile is the percentage of the last (sizeof physical
+			 * ram) of swap free.
+			 */
+			swap_centile = i.freeswap / 
+				(min(i.totalswap, i.totalram) / 100);
+			/*
+			 * Autoregulate vm_swappiness to be equal to the lowest of
+			 * app_percent and swap_centile.  Bias it downwards -ck
+			 */
+			vm_swappiness = min(app_percent, swap_centile);
+			vm_swappiness = vm_swappiness * vm_swappiness / 100;
+		} else 
+			vm_swappiness = 0;
+#endif
+	}
+	
 	/*
 	 * Now decide how much we really want to unmap some pages.  The mapped
 	 * ratio is downgraded - just because there's a lot of mapped memory

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 16:49             ` jlnance
@ 2004-06-02 18:38               ` John Hendrikx
  0 siblings, 0 replies; 146+ messages in thread
From: John Hendrikx @ 2004-06-02 18:38 UTC (permalink / raw)
  To: Linux Kernel Mailinglist

jlnance@unity.ncsu.edu wrote:

>On Tue, Jun 01, 2004 at 02:57:00PM +0300, Lenar Lõhmus wrote:
>  
>
>>jlnance@unity.ncsu.edu wrote:
>>
>>    
>>
>>>I'm not sure.  Copying a file is a pretty good indication that you
>>>are about to do something with either the new or the old file.
>>>
>>>      
>>>
>>Like taking the new file with me on USB dongle and deleting old one? 
>>Caching the file really doesn't help in this case.
>>    
>>
>
>No, it does not help in this case.
>
>Not putting things in cache is a solution for the problem of
>having useful stuff pushed out of the cache.  However, fixing
>the problem this way may create other problems if it causes
>us to fail to put useful things into the cache.
>
>The point I was trying (perhaps unsuccessfully) to make, is
>that we should be careful about not caching things.  We are
>likely to break other corner cases by fixing the ones we
>are discussing.
>  
>
I've experienced the problem where applications need to be swapped back 
in.  It's mainly caused by the dual role my machine has (desktop machine 
when I'm using it, server when it is serving files).   Whenever my 
machine has been sitting idly serving files for a while, when I get 
back, the desktop is slow.  However, there is no need for that, as the 
files are served at low speeds -- there's no real point in caching them 
apart from maybe preventing harddisk wear... the harddisk itself can 
serve these files again faster than they will be needed.

So perhaps it is possible to reduce caching of data that is simply not 
putting stress on the system (the harddisk in this case).  If the 
harddisk is not the bottleneck, it is probably not worth caching.  
Typical examples are letting a box play music all day (and then trying 
to read your mail...), having a webserver on a slow connection or 
watching a large movie file.  None of these really require much caching 
beyond a bit of read-ahead. 

I'm not sure how best to distinguish when something is fast I/O that 
would benefit from caching and when something is slow I/O that the 
harddisk can handle well enough on its own.
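
One thing an application can already do for itself is tell the kernel 
that it will not need the data again. The sketch below is only an 
illustration of that idea, not a fix for the kernel-side question: the 
function name is made up, posix_fadvise() with POSIX_FADV_DONTNEED is 
purely advisory, and the kernel is free to ignore it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a file front to back, as a slow streaming server or a media
 * player would, and after each chunk hint that the pages just read
 * will not be needed again.  Returns the number of bytes read, or -1
 * on error.
 */
long stream_without_caching(const char *path)
{
	char buf[64 * 1024];
	off_t done = 0;
	ssize_t n;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	while ((n = read(fd, buf, sizeof buf)) > 0) {
		/* ... hand buf to the network or the decoder here ... */

		/* Advisory only, so the return value is ignored. */
		(void)posix_fadvise(fd, done, n, POSIX_FADV_DONTNEED);
		done += n;
	}

	close(fd);
	return n < 0 ? -1 : (long)done;
}
```

This only covers programs that know they are streaming; it does not help 
the kernel guess which I/O is "slow enough" on its own.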

--John


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-02 17:59                   ` Valdis.Kletnieks
@ 2004-06-02 18:30                     ` FabF
  2004-06-02 23:54                       ` Con Kolivas
  2004-06-03 14:18                     ` Bill Davidsen
  1 sibling, 1 reply; 146+ messages in thread
From: FabF @ 2004-06-02 18:30 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Bernd Eckenfels, linux-kernel

On Wed, 2004-06-02 at 19:59, Valdis.Kletnieks@vt.edu wrote:
> On Wed, 02 Jun 2004 07:38:41 +0200, FabF said:
> 
> > > Yes but: your wm is so often used/activated it will not get swapped out.
> > > But if your mouse passes over mozilla and tries to focus it, then you will
> > > feel the pain of a swapped-out X program.
> > > 
> > Exactly!
> > Could the autoregulated VM swappiness patch help here?
> 
> Con's auto-adjusting swappiness patch did in fact help that quite a bit,
> especially for the case of heavy file I/O causing process images to be swapped
> out.  I need to do some comparisons of that to Nick's MM work...
It helps inactive applications re-emerge smoothly, and it helps with heavy
I/O and global tuning. I've got a swappiness delta of 20 from start to high
usage. That patch rock'n'rolls my box until updatedb makes swappiness climb
up to 80 and freezes my box :(

FabF


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all? 
  2004-06-02  5:38                 ` FabF
  2004-06-02 11:42                   ` Con Kolivas
@ 2004-06-02 17:59                   ` Valdis.Kletnieks
  2004-06-02 18:30                     ` FabF
  2004-06-03 14:18                     ` Bill Davidsen
  1 sibling, 2 replies; 146+ messages in thread
From: Valdis.Kletnieks @ 2004-06-02 17:59 UTC (permalink / raw)
  To: FabF; +Cc: Bernd Eckenfels, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 541 bytes --]

On Wed, 02 Jun 2004 07:38:41 +0200, FabF said:

> > Yes but: your wm is so often used/activated it will not get swapped out.
> > But if your mouse passes over mozilla and tries to focus it, then you will
> > feel the pain of a swapped-out X program.
> > 
> Exactly!
> Could the autoregulated VM swappiness patch help here?

Con's auto-adjusting swappiness patch did in fact help that quite a bit,
especially for the case of heavy file I/O causing process images to be swapped
out.  I need to do some comparisons of that to Nick's MM work...

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all? 
  2004-06-01 23:17               ` Bernd Eckenfels
  2004-06-02  5:38                 ` FabF
@ 2004-06-02 17:52                 ` Valdis.Kletnieks
  1 sibling, 0 replies; 146+ messages in thread
From: Valdis.Kletnieks @ 2004-06-02 17:52 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 901 bytes --]

On Wed, 02 Jun 2004 01:17:06 +0200, Bernd Eckenfels <ecki-news2004-05@lina.inka.de>  said:

> Yes but: your wm is so often used/activated it will not get swapped out.
> But if your mouse passes over mozilla and tries to focus it, then you will
> feel the pain of a swapped-out X program.

Yes, I'm quite familiar with what a swapped-out mozilla does to my laptop ;)

The point I was making (apparently poorly) was that if mozilla is swapping in,
*that window* is hosed, but if the WM or the X server is swapping in,
*everything* is hosed.

And I *have* had times when I've left for an extended period while Mozilla
is downloading a Knoppix .iso or similar beastly large thing, and it managed
to keep Mozilla pages hot because it was busy doing a download, and the WM
pages got swapped out because the WM wasn't actually doing anything...

Yes, it's a rare state of affairs, but it *has* happened...

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all? 
  2004-06-02  3:50           ` Tim Connors
@ 2004-06-02 17:45             ` Valdis.Kletnieks
  0 siblings, 0 replies; 146+ messages in thread
From: Valdis.Kletnieks @ 2004-06-02 17:45 UTC (permalink / raw)
  To: Tim Connors; +Cc: FabF, Bernd Eckenfels, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 945 bytes --]

On Wed, 02 Jun 2004 13:50:42 +1000, Tim Connors said:

> I do often get frustrated that the DoS card is brought up to kill a
> potentially useful solution. I think there should be a flag in KConfig
> saying "This machine will be a server"/"This machine will be mostly a
> single user desktop machine". In the latter, you can enable all these
> vm/etc heuristics that will help out mozilla/X/your favourite
> bloat-ware, but potentially enable a DoS attack, and in the former,
> you stay conservative.

And with that, you've worried about whether it's a potential DoS or
not.  I didn't bring it up to "kill" it - I brought it up to start a discussion,
because I felt that including that sort of feature without at least thinking
about the DoS issues was a bad idea.  Shipping it with a Kconfig or
sysctl flag, or using the capabilities framework, or any other similar
"allow the sysadmin to control it" feature is a different matter entirely...


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-02 11:42                   ` Con Kolivas
  2004-06-02 12:22                     ` John Bradford
@ 2004-06-02 17:06                     ` FabF
  2004-06-03 14:14                     ` Bill Davidsen
  2 siblings, 0 replies; 146+ messages in thread
From: FabF @ 2004-06-02 17:06 UTC (permalink / raw)
  To: Con Kolivas; +Cc: Bernd Eckenfels, linux-kernel

On Wed, 2004-06-02 at 13:42, Con Kolivas wrote:
> On Wed, 2 Jun 2004 15:38, FabF wrote:
> > On Wed, 2004-06-02 at 01:17, Bernd Eckenfels wrote:
> > > In article <200406012000.i51K0vor019011@turing-police.cc.vt.edu> you wrote:
> > > > out (unlike some, I don't mind if Mozilla or OpenOffice end up out on
> > > > disk after extended inactivity - but if my window manager gets swapped
> > > > out, I get peeved when focus-follows-mouse doesn't and my typing goes
> > > > into the wrong window or some such... ;)
> > >
> > > > Yes but: your wm is so often used/activated it will not get swapped out.
> > > > But if your mouse passes over mozilla and tries to focus it, then you
> > > > will feel the pain of a swapped-out X program.
> > >
> > > Exactly!
> > > Could the autoregulated VM swappiness patch help here?
> 
> Unless you are pushing the limits of your available ram by your usage pattern 
> then yes the autoregulated swappiness patch should help.
> 
> available here:
> http://ck.kolivas.org/patches/2.6/2.6.7-rc2/patch-2.6.7-rc2-am11
> 
> Just a brief word that might clarify things for people. It seems this huge 
> swap discussion centres around 2 different arguments. Akpm has said that the 
> correct way for the vm to behave is that of swappiness=100. Desktop users 
> note they have less swap out of the programs they use with swappiness 0 or 
> their swap turned off. When your swappiness is set high, the current vm 
> decisions are the fastest they can be, but when you go back to your 
> applications they will take longer to restart. When your swappiness is set 
> low your applications will restart rapidly, but the current vm will be doing 
> more work and be slower. Most benchmarks will show the latter, but most 
> desktop users will feel the former and not really notice the latter.
> 
> Try the little experiment to see: Boot with mem=128M and try to compile a 2.6 
> kernel with all the debugging symbols option enabled - do this with 
> swappiness set to 0 and then at 100. You'll see it compile much faster at 
> 100. Yet you know that if you set your swappiness to 0 mozilla will load 
> faster next time you use it on your desktop during your normal usage pattern 
> (of course you'd probably be using mozilla on a system with a bit more than 
> 128M ram but this helps demonstrate the point). 
> 
> Does this explain in coarse examples to the desktop users why ideal systems 
> shouldn't be swap disabled or swappiness=0 ?
> 
> The autoregulated swappiness patch tries to find some sort of common ground, 
> where it sacrifices a little performance now to substantially improve what 
> happens the next time you use your machine. Because it changes with the 
> amount of application pages in ram, it will not increasingly sacrifice 
> performance when your memory is full with application pages. What it will not 
> do is improve the swap thrash situation when you have grossly overloaded your 
> ram.
> 
> Con

My box rocks with your patch, Con! Swappiness is floating between 50 and 65.
I have never seen a 2.6 box so quick in rl5.

Thanks !
FabF





^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-02 11:42                   ` Con Kolivas
@ 2004-06-02 12:22                     ` John Bradford
  2004-06-02 12:22                       ` Con Kolivas
  2004-06-02 17:06                     ` FabF
  2004-06-03 14:14                     ` Bill Davidsen
  2 siblings, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-06-02 12:22 UTC (permalink / raw)
  To: Con Kolivas, FabF; +Cc: Bernd Eckenfels, linux-kernel

Quote from Con Kolivas <kernel@kolivas.org>:
> Does this explain in coarse examples to the desktop users why ideal systems 
> shouldn't be swap disabled or swappiness=0 ?

Yes, except in the case where you are processing a small, (relative to
physical RAM), dataset, and not even touching all physical RAM.

(I admit, this isn't really typical desktop usage, though).

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-02 12:22                     ` John Bradford
@ 2004-06-02 12:22                       ` Con Kolivas
  0 siblings, 0 replies; 146+ messages in thread
From: Con Kolivas @ 2004-06-02 12:22 UTC (permalink / raw)
  To: John Bradford; +Cc: FabF, Bernd Eckenfels, linux-kernel

On Wed, 2 Jun 2004 22:22, John Bradford wrote:
> Quote from Con Kolivas <kernel@kolivas.org>:
> > Does this explain in coarse examples to the desktop users why ideal
> > systems shouldn't be swap disabled or swappiness=0 ?
>
> Yes, except in the case where you are processing a small, (relative to
> physical RAM), dataset, and not even touching all physical RAM.
>
> (I admit, this isn't really typical desktop usage, though).

Well there is no doubt that there are some unique scenarios where an 
algorithmic setting will not be as good as a single static setting; and 
that's why I put in the option of disabling the auto swappiness. I believe 
our proc settings in the kernel should not need to be adjusted for the 
majority of cases, though.
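
For reference, the patch registers the switch with procname 
"autoswappiness" in vm_table, so on a patched kernel it should show up 
as /proc/sys/vm/autoswappiness next to the usual 
/proc/sys/vm/swappiness. Below is a small sketch of flipping such a 
value from a program; the helper names are made up for illustration and 
writing to the real files needs root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write one integer to a sysctl-style file.  Returns 0 or -1. */
int write_sysctl_int(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read one integer back from a sysctl-style file.  Returns 0 or -1. */
int read_sysctl_int(const char *path, int *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

To fall back to a static setting you would write 0 to 
/proc/sys/vm/autoswappiness and then your chosen value to 
/proc/sys/vm/swappiness, in that order, since the autoregulation 
overwrites vm_swappiness while it is enabled.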

Con

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-02  5:38                 ` FabF
@ 2004-06-02 11:42                   ` Con Kolivas
  2004-06-02 12:22                     ` John Bradford
                                       ` (2 more replies)
  2004-06-02 17:59                   ` Valdis.Kletnieks
  1 sibling, 3 replies; 146+ messages in thread
From: Con Kolivas @ 2004-06-02 11:42 UTC (permalink / raw)
  To: FabF; +Cc: Bernd Eckenfels, linux-kernel

On Wed, 2 Jun 2004 15:38, FabF wrote:
> On Wed, 2004-06-02 at 01:17, Bernd Eckenfels wrote:
> > In article <200406012000.i51K0vor019011@turing-police.cc.vt.edu> you wrote:
> > > out (unlike some, I don't mind if Mozilla or OpenOffice end up out on
> > > disk after extended inactivity - but if my window manager gets swapped
> > > out, I get peeved when focus-follows-mouse doesn't and my typing goes
> > > into the wrong window or some such... ;)
> >
> > Yes but: your wm is so often used/activated it will not get swapped out.
> > But if your mouse passes over mozilla and tries to focus it, then you
> > will feel the pain of a swapped-out X program.
>
> Exactly!
> Could the autoregulated VM swappiness patch help here?

Unless you are pushing the limits of your available ram by your usage pattern 
then yes the autoregulated swappiness patch should help.

available here:
http://ck.kolivas.org/patches/2.6/2.6.7-rc2/patch-2.6.7-rc2-am11

Just a brief word that might clarify things for people. It seems this huge 
swap discussion centres around 2 different arguments. Akpm has said that the 
correct way for the vm to behave is that of swappiness=100. Desktop users 
note they have less swap out of the programs they use with swappiness 0 or 
their swap turned off. When your swappiness is set high, the current vm 
decisions are the fastest they can be, but when you go back to your 
applications they will take longer to restart. When your swappiness is set 
low your applications will restart rapidly, but the current vm will be doing 
more work and be slower. Most benchmarks will show the latter, but most 
desktop users will feel the former and not really notice the latter.

Try the little experiment to see: Boot with mem=128M and try to compile a 2.6 
kernel with all the debugging symbols option enabled - do this with 
swappiness set to 0 and then at 100. You'll see it compile much faster at 
100. Yet you know that if you set your swappiness to 0 mozilla will load 
faster next time you use it on your desktop during your normal usage pattern 
(of course you'd probably be using mozilla on a system with a bit more than 
128M ram but this helps demonstrate the point). 

Does this explain in coarse examples to the desktop users why ideal systems 
shouldn't be swap disabled or swappiness=0 ?

The autoregulated swappiness patch tries to find some sort of common ground, 
where it sacrifices a little performance now to substantially improve what 
happens the next time you use your machine. Because it changes with the 
amount of application pages in ram, it will not increasingly sacrifice 
performance when your memory is full with application pages. What it will not 
do is improve the swap thrash situation when you have grossly overloaded your 
ram.

Con

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 23:17               ` Bernd Eckenfels
@ 2004-06-02  5:38                 ` FabF
  2004-06-02 11:42                   ` Con Kolivas
  2004-06-02 17:59                   ` Valdis.Kletnieks
  2004-06-02 17:52                 ` Valdis.Kletnieks
  1 sibling, 2 replies; 146+ messages in thread
From: FabF @ 2004-06-02  5:38 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

On Wed, 2004-06-02 at 01:17, Bernd Eckenfels wrote:
> In article <200406012000.i51K0vor019011@turing-police.cc.vt.edu> you wrote:
> > out (unlike some, I don't mind if Mozilla or OpenOffice end up out on
> > disk after extended inactivity - but if my window manager gets swapped
> > out, I get peeved when focus-follows-mouse doesn't and my typing goes
> > into the wrong window or some such... ;)
> 
> Yes but: your wm is so often used/activated it will not get swapped out.
> But if your mouse passes over mozilla and tries to focus it, then you will
> feel the pain of a swapped-out X program.
> 
Exactly!
Could the autoregulated VM swappiness patch help here?

FabF

> Greetings
> Bernd


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re:  why swap at all? 
  2004-06-01 19:02         ` Valdis.Kletnieks
  2004-06-01 19:53           ` FabF
@ 2004-06-02  3:50           ` Tim Connors
  2004-06-02 17:45             ` Valdis.Kletnieks
  1 sibling, 1 reply; 146+ messages in thread
From: Tim Connors @ 2004-06-02  3:50 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: FabF, Bernd Eckenfels, linux-kernel

Valdis.Kletnieks@vt.edu said on Tue, 01 Jun 2004 15:02:48 -0400:
> On Tue, 01 Jun 2004 20:36:23 +0200, FabF said:
> 
> > I guess we have a design problem right here. We could add a per-process
> > swappiness attribute. This swap thread is becoming boring because we're
> > looking globally at what's going wrong locally.
> 
> Hmm.. do we need to worry about the same DoS issues we need to worry about with
> mlock and friends?  I know I can trust myself to not do stupid things to said
> flags on my laptop (well... not twice anyhow ;).  On the other hand, I have
> systems with clueless users, and the even more dangerous half-clued users.  And
> then I have a bunch of machines in our security lab, where Bad Things happen
> all the time... 

I do often get frustrated that the DoS card is brought up to kill a
potentially useful solution. I think there should be a flag in KConfig
saying "This machine will be a server"/"This machine will be mostly a
single user desktop machine". In the latter, you can enable all these
vm/etc heuristics that will help out mozilla/X/your favourite
bloat-ware, but potentially enable a DoS attack, and in the former,
you stay conservative.

I can't remember the last situation in which I was annoyed by someone
saying "but what about a DoS?"...

-- 
TimC -- http://astronomy.swin.edu.au/staff/tconnors/
Entropy requires no maintenance.
                -- Markoff Chaney

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 20:00             ` Valdis.Kletnieks
  2004-06-01 20:14               ` FabF
@ 2004-06-01 23:17               ` Bernd Eckenfels
  2004-06-02  5:38                 ` FabF
  2004-06-02 17:52                 ` Valdis.Kletnieks
  1 sibling, 2 replies; 146+ messages in thread
From: Bernd Eckenfels @ 2004-06-01 23:17 UTC (permalink / raw)
  To: linux-kernel

In article <200406012000.i51K0vor019011@turing-police.cc.vt.edu> you wrote:
> out (unlike some, I don't mind if Mozilla or OpenOffice end up out on
> disk after extended inactivity - but if my window manager gets swapped
> out, I get peeved when focus-follows-mouse doesn't and my typing goes
> into the wrong window or some such... ;)

Yes but: your wm is so often used/activated it will not get swapped out.
But if your mouse passes over mozilla and tries to focus it, then you will
feel the pain of a swapped-out X program.

Greetings
Bernd
-- 
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all? 
  2004-06-01 21:15                   ` FabF
@ 2004-06-01 21:40                     ` Valdis.Kletnieks
  2004-06-03 13:54                     ` Bill Davidsen
  1 sibling, 0 replies; 146+ messages in thread
From: Valdis.Kletnieks @ 2004-06-01 21:40 UTC (permalink / raw)
  To: FabF; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1827 bytes --]

On Tue, 01 Jun 2004 23:15:36 +0200, FabF said:

> 	1.Global inactivity (what you're talking about)
> 	2.Application isolation (what we're talking about).

Again, be careful there - I wasn't the one who said inactive boxes should be in RL3. ;)

And just because I may not be typing on the keyboard doesn't mean that things
are in fact globally inactive - gkrellm is still running, and it has a plugin
monitoring the CPU temperature and adjusting the fan speed as needed, and new
mail is arriving in the background and causing status changes in my MUA.

And yes, said activity tends to keep the gkrellm and MUA pages "hot" and prevent
their swapping out.  The problem is that other processes are also doing stuff
in the background for me - but there's no really good way for the system to
know that I consider the gkrellm pages to be "more important" than those
pages taken up by xclock....

> Geek or not, someone backgrounding an application doesn't want it to
> down the box for X seconds some minutes later when it comes back and
> such things happen many times a day.

Yes, but a solution to that really *should* take into account that some things
will only down the *app* (if OpenOffice is paging in, I can still interact with
the system if X and my window manager and an xterm aren't paged out), whereas
other things will effectively down the *system* as far as the user is concerned (if
X and/or my window manager are paged out, I'm *stuck* till they come back in).

> Maybe you've got an idea about a
> better rule(s) then ? (I mean for the 2 cases)

I admit I have slacked and haven't tried Nick Piggin's MM patches - others have
commented that those work well.  I am however quite sure that the Really Right
Answer will require much greater subtlety than a rule like "if it uses libX it
shouldn't be swapped out"....


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 20:22                 ` Valdis.Kletnieks
@ 2004-06-01 21:15                   ` FabF
  2004-06-01 21:40                     ` Valdis.Kletnieks
  2004-06-03 13:54                     ` Bill Davidsen
  0 siblings, 2 replies; 146+ messages in thread
From: FabF @ 2004-06-01 21:15 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: linux-kernel

On Tue, 2004-06-01 at 22:22, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 01 Jun 2004 22:14:26 +0200, FabF said:
> 
> > Boring.... You can't have the X root layer swapped to disk, as it's often
> > used! A quick lsof | grep "libX" shows all the front-end applications that
> > are 'swap sensitive'. fuser can do 'user resource reverse'. The kernel _can_
> > do 'appl. resource reverse' as well.
> 
> The point you're missing is that if you use a rule such as "everything using
> libX* isn't swappable", then the X *server* is suddenly the prime candidate for
> swapping out (as it's quite likely the biggest user of memory not using libX*).
> (Anybody who ever had the OOM killer whomp their X server to free up space
> fast when the *real* problem was a cluster of 6 or 8 "large but still smaller
> than the X server" processes knows exactly what I mean... ;)
> 
> > PS: I'm not talking about an inactive desktop box. Such a box has to be in
> > rl 3 and is not meant to be user (geek) relevant :)
> 
> So you're saying that I should have kicked my laptop down to runlevel 3 just
> because I went across the hall to the machine room to help get a few servers
> into racks?  Or every time I go into a meeting, or get stuck on a longish phone
> call?
> 
> Also, be *very* careful equating "user" with "geek" - at least some of us are
> trying to produce systems that suit the needs of non-geek users....
> 
	As I said, I think this thread is "becoming offtopic", but what can be
interesting is breaking the swapping problem down:

	1. Global inactivity (what you're talking about)
	2. Application isolation (what we're talking about).

Geek or not, someone backgrounding an application doesn't want it to
down the box for X seconds some minutes later when it comes back, and
such things happen many times a day. Maybe you've got an idea for
better rule(s) then? (I mean for the 2 cases)


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all? 
  2004-06-01 20:14               ` FabF
@ 2004-06-01 20:22                 ` Valdis.Kletnieks
  2004-06-01 21:15                   ` FabF
  0 siblings, 1 reply; 146+ messages in thread
From: Valdis.Kletnieks @ 2004-06-01 20:22 UTC (permalink / raw)
  To: FabF; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1261 bytes --]

On Tue, 01 Jun 2004 22:14:26 +0200, FabF said:

> Boring.... You can't have the X root layer swapped to disk, as it's often
> used! A quick lsof | grep "libX" shows all the front-end applications that
> are 'swap sensitive'. fuser can do 'user resource reverse'. The kernel _can_
> do 'appl. resource reverse' as well.

The point you're missing is that if you use a rule such as "everything using
libX* isn't swappable", then the X *server* is suddenly the prime candidate for
swapping out (as it's quite likely the biggest user of memory not using libX*).
(Anybody who ever had the OOM killer whomp their X server to free up space
fast when the *real* problem was a cluster of 6 or 8 "large but still smaller
than the X server" processes knows exactly what I mean... ;)

> PS: I'm not talking about an inactive desktop box. Such a box has to be at
> runlevel 3 and is not meant to be user (geek) relevant :)

So you're saying that I should have kicked my laptop down to runlevel 3 just
because I went across the hall to the machine room to help get a few servers
into racks?  Or every time I go into a meeting, or get stuck on a longish phone
call?

Also, be *very* careful equating "user" with "geek" - at least some of us are
trying to produce systems that suit the needs of non-geek users....


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 20:00             ` Valdis.Kletnieks
@ 2004-06-01 20:14               ` FabF
  2004-06-01 20:22                 ` Valdis.Kletnieks
  2004-06-01 23:17               ` Bernd Eckenfels
  1 sibling, 1 reply; 146+ messages in thread
From: FabF @ 2004-06-01 20:14 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: linux-kernel

On Tue, 2004-06-01 at 22:00, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 01 Jun 2004 21:53:32 +0200, FabF said:
> 
> > I was thinking about some rule e.g. any process using libX* isn't
> > swapped to disk until OOM ...
> 
> Odd.. some of the processes that I'd want kept in memory use libX*,
> but others that also use it are at the top of my list of things to migrate
> out (unlike some, I don't mind if Mozilla or OpenOffice end up out on
> disk after extended inactivity - but if my window manager gets swapped
> out, I get peeved when focus-follows-mouse doesn't and my typing goes
> into the wrong window or some such... ;)
> 
> And that rule doesn't even help much - as it will cause at least some X
> servers themselves to get swapped out.  Here's the list for my X server
> at the moment, as reported by lsof:
> 
> X       13886 root  txt    REG      254,1 1960870       1966 /usr/X11R6/bin/Xorg
> X       13886 root  mem    REG      254,5  105700      12388 /lib/ld-2.3.3.so
> X       13886 root  mem    REG      254,5   50944      12530 /lib/libnss_files-2.3.3.so
> X       13886 root  mem    REG      254,1   64040       1347 /usr/lib/libz.so.1.2.1.1
> X       13886 root  mem    REG      254,5  212972      53335 /lib/tls/libm-2.3.3.so
> X       13886 root  mem    REG      254,5   28008      12513 /lib/libpam.so.0.77
> X       13886 root  mem    REG      254,5   15008      12471 /lib/libdl-2.3.3.so
> X       13886 root  mem    REG      254,5    8332      12515 /lib/libpam_misc.so.0.77
> X       13886 root  mem    REG      254,5   29660      12511 /lib/libgcc_s-3.3.3-20040413.so.1
> X       13886 root  mem    REG      254,5 1451868      53258 /lib/tls/libc-2.3.3.so
> X       13886 root  mem    REG      254,1  647652      32015 /usr/X11R6/lib/modules/extensions/libglx.so.1.0.5341
> X       13886 root  mem    REG      254,1 4954876       8362 /usr/lib/tls/libGLcore.so.1.0.5341
> 
> Nope, no libX* here... ;)
> 
> It's a lot harder than it looks, which explains why we haven't gotten it right
> yet...
> 
Boring... You can't have the X root layer swapped to disk, as it's used often!
A quick lsof | grep "libX" lists all the front-end applications that are
'swap-sensitive'. fuser can do a reverse lookup from resource to user; the
kernel _can_ do a reverse lookup from resource to application as well.

PS: I'm not talking about an inactive desktop box. Such a box has to be at
runlevel 3 and is not meant to be user (geek) relevant :)

FabF



^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all? 
  2004-06-01 19:53           ` FabF
@ 2004-06-01 20:00             ` Valdis.Kletnieks
  2004-06-01 20:14               ` FabF
  2004-06-01 23:17               ` Bernd Eckenfels
  0 siblings, 2 replies; 146+ messages in thread
From: Valdis.Kletnieks @ 2004-06-01 20:00 UTC (permalink / raw)
  To: FabF; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1895 bytes --]

On Tue, 01 Jun 2004 21:53:32 +0200, FabF said:

> I was thinking about some rule e.g. any process using libX* isn't
> swapped to disk until OOM ...

Odd.. some of the processes that I'd want kept in memory use libX*,
but others that also use it are at the top of my list of things to migrate
out (unlike some, I don't mind if Mozilla or OpenOffice end up out on
disk after extended inactivity - but if my window manager gets swapped
out, I get peeved when focus-follows-mouse doesn't and my typing goes
into the wrong window or some such... ;)

And that rule doesn't even help much - as it will cause at least some X
servers themselves to get swapped out.  Here's the list for my X server
at the moment, as reported by lsof:

X       13886 root  txt    REG      254,1 1960870       1966 /usr/X11R6/bin/Xorg
X       13886 root  mem    REG      254,5  105700      12388 /lib/ld-2.3.3.so
X       13886 root  mem    REG      254,5   50944      12530 /lib/libnss_files-2.3.3.so
X       13886 root  mem    REG      254,1   64040       1347 /usr/lib/libz.so.1.2.1.1
X       13886 root  mem    REG      254,5  212972      53335 /lib/tls/libm-2.3.3.so
X       13886 root  mem    REG      254,5   28008      12513 /lib/libpam.so.0.77
X       13886 root  mem    REG      254,5   15008      12471 /lib/libdl-2.3.3.so
X       13886 root  mem    REG      254,5    8332      12515 /lib/libpam_misc.so.0.77
X       13886 root  mem    REG      254,5   29660      12511 /lib/libgcc_s-3.3.3-20040413.so.1
X       13886 root  mem    REG      254,5 1451868      53258 /lib/tls/libc-2.3.3.so
X       13886 root  mem    REG      254,1  647652      32015 /usr/X11R6/lib/modules/extensions/libglx.so.1.0.5341
X       13886 root  mem    REG      254,1 4954876       8362 /usr/lib/tls/libGLcore.so.1.0.5341

Nope, no libX* here... ;)
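
To make the check concrete, here is a minimal Python sketch of the proposed
"uses libX*" rule applied to lsof-style output. The function name uses_libx,
the assumption that the path is the last whitespace-separated field, and the
xterm row are all made up for illustration; nothing here is an lsof or kernel
interface.

```python
import fnmatch
import os

def uses_libx(lsof_lines):
    """Return True if any listed file's basename matches libX* (hypothetical rule)."""
    for line in lsof_lines:
        fields = line.split()
        if not fields:
            continue
        path = fields[-1]  # assumption: the NAME column is the last field
        if fnmatch.fnmatchcase(os.path.basename(path), "libX*"):
            return True
    return False

# A few rows from the listing above.
xorg = [
    "X 13886 root txt REG 254,1 1960870 1966 /usr/X11R6/bin/Xorg",
    "X 13886 root mem REG 254,5 1451868 53258 /lib/tls/libc-2.3.3.so",
    "X 13886 root mem REG 254,1 647652 32015 /usr/X11R6/lib/modules/extensions/libglx.so.1.0.5341",
]
# A made-up client row, for contrast only.
client = ["xterm 4242 user mem REG 254,1 900000 777 /usr/X11R6/lib/libX11.so.6.2"]

print(uses_libx(xorg))    # False
print(uses_libx(client))  # True
```

Under such a rule the client would be pinned while the server it talks to
would remain an ordinary swap candidate.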

It's a lot harder than it looks, which explains why we haven't gotten it right
yet...


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 19:02         ` Valdis.Kletnieks
@ 2004-06-01 19:53           ` FabF
  2004-06-01 20:00             ` Valdis.Kletnieks
  2004-06-02  3:50           ` Tim Connors
  1 sibling, 1 reply; 146+ messages in thread
From: FabF @ 2004-06-01 19:53 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Bernd Eckenfels, linux-kernel

On Tue, 2004-06-01 at 21:02, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 01 Jun 2004 20:36:23 +0200, FabF said:
> 
> > I guess we have a design problem right here. We could add a per-process
> > swappiness attribute. This swap thread is becoming boring because we're
> > looking globally at what's going wrong locally.
> 
> Hmm.. do we need to worry about the same DoS issues we need to worry about with
> mlock and friends?  I know I can trust myself to not do stupid things to said
> flags on my laptop (well... not twice anyhow ;).  On the other hand, I have
> systems with clueless users, and the even more dangerous half-clued users.  And
> then I have a bunch of machines in our security lab, where Bad Things happen
> all the time... 

I was thinking about some rule e.g. any process using libX* isn't
swapped to disk until OOM ...


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all? 
  2004-06-01 18:36       ` FabF
@ 2004-06-01 19:02         ` Valdis.Kletnieks
  2004-06-01 19:53           ` FabF
  2004-06-02  3:50           ` Tim Connors
  0 siblings, 2 replies; 146+ messages in thread
From: Valdis.Kletnieks @ 2004-06-01 19:02 UTC (permalink / raw)
  To: FabF; +Cc: Bernd Eckenfels, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 641 bytes --]

On Tue, 01 Jun 2004 20:36:23 +0200, FabF said:

> I guess we have a design problem right here. We could add a per-process
> swappiness attribute. This swap thread is becoming boring because we're
> looking globally at what's going wrong locally.

Hmm.. do we need to worry about the same DoS issues we need to worry about with
mlock and friends?  I know I can trust myself to not do stupid things to said
flags on my laptop (well... not twice anyhow ;).  On the other hand, I have
systems with clueless users, and the even more dangerous half-clued users.  And
then I have a bunch of machines in our security lab, where Bad Things happen
all the time... 
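
For comparison, the mlock side of this is normally bounded by a per-process
resource limit. A minimal Python sketch, assuming a Linux system where the
resource module exposes RLIMIT_MEMLOCK; the helper name memlock_limit is
invented for illustration:

```python
import resource

def memlock_limit():
    """Return the (soft, hard) limits, in bytes, on memory this process may lock."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)

soft, hard = memlock_limit()
for name, value in (("soft", soft), ("hard", hard)):
    shown = "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"
    print(f"RLIMIT_MEMLOCK {name}: {shown}")
```

A per-process swappiness attribute would presumably need some similar cap
before unprivileged users could be allowed to set it.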


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-31 23:30     ` Bernd Eckenfels
@ 2004-06-01 18:36       ` FabF
  2004-06-01 19:02         ` Valdis.Kletnieks
  0 siblings, 1 reply; 146+ messages in thread
From: FabF @ 2004-06-01 18:36 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

On Tue, 2004-06-01 at 01:30, Bernd Eckenfels wrote:
> In article <40BBB5F7.1010407@yahoo.com.au> you wrote:
> > Well, at the "expense" of paging out unused memory. I don't see
> > any swapin.
> 
> On a slow system with small memory you quite often see swapped out
> applications, for example a Kopete messenger window. Once you click on
> it, it takes 10sec or more to get responsive again. Of course it's a slow
> system, but gradually paging out and forgetting image pages has that effect
> on faster systems too, and makes the desktop sluggish.

I guess we have a design problem right here. We could add a per-process
swappiness attribute. This swap thread is becoming boring because we're looking
globally at what's going wrong locally.

FabF



^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 11:57           ` Lenar Lõhmus
  2004-06-01 12:27             ` Robin Rosenberg
@ 2004-06-01 16:49             ` jlnance
  2004-06-02 18:38               ` John Hendrikx
  1 sibling, 1 reply; 146+ messages in thread
From: jlnance @ 2004-06-01 16:49 UTC (permalink / raw)
  To: Lenar Lõhmus; +Cc: Linux Kernel Mailinglist

On Tue, Jun 01, 2004 at 02:57:00PM +0300, Lenar Lõhmus wrote:
> jlnance@unity.ncsu.edu wrote:
> 
> >I'm not sure.  Copying a file is a pretty good indication that you
> >are about to do something with either the new or the old file.
> >
> Like taking the new file with me on USB dongle and deleting old one? 
> Caching the file really doesn't help in this case.

No, it does not help in this case.

Not putting things in cache is a solution for the problem of
having useful stuff pushed out of the cache.  However, fixing
the problem this way may create other problems if it causes
us to fail to put useful things into the cache.

The point I was trying (perhaps unsuccessfully) to make, is
that we should be careful about not caching things.  We are
likely to break other corner cases by fixing the ones we
are discussing.

Thanks,

Jim

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 11:57           ` Lenar Lõhmus
@ 2004-06-01 12:27             ` Robin Rosenberg
  2004-06-01 16:49             ` jlnance
  1 sibling, 0 replies; 146+ messages in thread
From: Robin Rosenberg @ 2004-06-01 12:27 UTC (permalink / raw)
  To: Lenar Lõhmus; +Cc: Linux Kernel Mailinglist

On Tuesday 01 June 2004 13.57, Lenar Lõhmus wrote:
> jlnance@unity.ncsu.edu wrote:
> >I'm not sure.  Copying a file is a pretty good indication that you
> >are about to do something with either the new or the old file.
>
> Like taking the new file with me on USB dongle and deleting old one?
> Caching the file really doesn't help in this case.

No, and most file copies are not going to be used in the "near" future, at
least on my machine. Caching on the second read (close in time) is OK, or if
there is unused RAM, but paging out things in use is bad. It's much more likely
that a page allocated to a program will be used than a newly read or written
file.

Of course, your mileage may vary. I'm thinking of my desktop now.

-- robin

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-31 10:49         ` jlnance
  2004-06-01 11:57           ` Lenar Lõhmus
@ 2004-06-01 12:21           ` David B. Stevens
  1 sibling, 0 replies; 146+ messages in thread
From: David B. Stevens @ 2004-06-01 12:21 UTC (permalink / raw)
  To: jlnance; +Cc: linux-kernel

jlnance@unity.ncsu.edu wrote:
>>>cp should use fadvise() and say that it _really_ does not need those pages.
>>
>>Yes, indeed. On the other hand the sequential read could be detected by the kernel, too.
> 
> 
> I'm not sure.  Copying a file is a pretty good indication that you
> are about to do something with either the new or the old file.
> 
> Thanks,
> 
> Jim

It is?

Sorry for butting in folks, but I've been reading this thread hoping to 
see some possible solutions.  Seems that a survey of best practices 
might have been suggested, however, I haven't seen such a suggestion.

So here goes: might it not be of some benefit to see how other operating
systems (there are a rather large number) handle the use of memory?  For
just one example you could look at MVS, where the application can
request through various means how and how much memory it uses.  Most
defaults can be overridden by the scripting language used to run the
application.  This is also true of other operating systems.

I would be more willing to say that the folks setting up the running of 
systems should have far more control over the use or non use of cache 
backed I/O data.

Now that I've said that you have to consider how and where this control 
should be based.
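
One per-application control that already exists is the fadvise() hint
mentioned earlier in the thread. A minimal Python sketch, assuming a Unix
system where os.posix_fadvise is available; the function name
copy_without_caching and the 1 MiB chunk size are choices made for
illustration, and the kernel is free to ignore the hints:

```python
import os
import tempfile

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice

def copy_without_caching(src, dst):
    """Copy src to dst, then hint that neither file's pages are needed again."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # Hint that the read is sequential (length 0 means "to end of file").
        os.posix_fadvise(fin.fileno(), 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = fin.read(CHUNK)
            if not chunk:
                break
            fout.write(chunk)
        fout.flush()
        os.fsync(fout.fileno())  # dirty pages cannot be dropped, so write them out first
        os.posix_fadvise(fin.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
        os.posix_fadvise(fout.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)

# Usage on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.bin")
    dst = os.path.join(tmp, "dst.bin")
    with open(src, "wb") as f:
        f.write(b"x" * 100_000)
    copy_without_caching(src, dst)
    with open(dst, "rb") as f:
        copied = f.read()

print(len(copied))
```

This puts the decision in the application that knows its own access pattern,
which is one possible answer to where such control should live.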

<soapbox>

SWAP is a solution for the age old whine, I caused the system to run out 
of memory and the big mean operating system terminated my application.

These days it allows the performance of the system to degrade to the 
point that the whine goes, The big mean operating system is taking 
forever to run my 10 TB backup and, by the way, it takes 3 days to wake 
up my openoffice application that I started a week ago.

Ain't progress grand ;-)

</soapbox>

Cheers,
   Dave




^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-31 10:49         ` jlnance
@ 2004-06-01 11:57           ` Lenar Lõhmus
  2004-06-01 12:27             ` Robin Rosenberg
  2004-06-01 16:49             ` jlnance
  2004-06-01 12:21           ` David B. Stevens
  1 sibling, 2 replies; 146+ messages in thread
From: Lenar Lõhmus @ 2004-06-01 11:57 UTC (permalink / raw)
  To: Linux Kernel Mailinglist

jlnance@unity.ncsu.edu wrote:

>I'm not sure.  Copying a file is a pretty good indication that you
>are about to do something with either the new or the old file.
>
>  
>
Like taking the new file with me on USB dongle and deleting old one? 
Caching the file really doesn't help in this case.

Lenar


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 10:24       ` William Lee Irwin III
@ 2004-06-01 11:19         ` Tim Connors
  0 siblings, 0 replies; 146+ messages in thread
From: Tim Connors @ 2004-06-01 11:19 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Linux Kernel Mailing List

On Tue, 1 Jun 2004, William Lee Irwin III wrote:

> On Tue, Jun 01, 2004 at 08:13:59PM +1000, Tim Connors wrote:
> > Incidentally, what happens when kswapd becomes a zombie? I've seen
> > this a few times, and I am currently posting on a machine that has
> > been up for 15 days, and which oopsed 10 or so days ago (something to
                                   ^^^^^^^^^^^^^^^^^^^^^^^^
> > do with nfs, but don't worry about that - the machine is running
> > 2.4.20, and is not exactly up-to-date), killing kswapd.
> > But I don't notice anything at all different about how the system is
> > behaving. However, I haven't been doing much more than running emacs
> > and mozilla recently - I haven't been running my visualisation
> > software that typically stresses the VM beyond usefulness.
>
> Check your syslog for oopsen. That's the only known reason for kswapd
> to become a zombie.

What ill effects are meant to happen? I haven't noticed anything ill about
the machine at all. No OOMs, no 'failed zero order allocation', etc. Swap
is currently 508388k used, 543860k free, 175060k cached, mem is 498908k
used, 15540k free.

-- 
TimC -- http://astronomy.swin.edu.au/staff/tconnors/
Error: Fuzzy Pointer Exception

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01 10:13     ` Tim Connors
@ 2004-06-01 10:24       ` William Lee Irwin III
  2004-06-01 11:19         ` Tim Connors
  0 siblings, 1 reply; 146+ messages in thread
From: William Lee Irwin III @ 2004-06-01 10:24 UTC (permalink / raw)
  To: Tim Connors
  Cc: Buddy Lumpkin, 'John Bradford', 'Michael Brennan',
	linux-kernel, riel

On Tue, Jun 01, 2004 at 08:13:59PM +1000, Tim Connors wrote:
> Incidentally, what happens when kswapd becomes a zombie? I've seen
> this a few times, and I am currently posting on a machine that has
> been up for 15 days, and which oopsed 10 or so days ago (something to
> do with nfs, but don't worry about that - the machine is running
> 2.4.20, and is not exactly up-to-date), killing kswapd.
> But I don't notice anything at all different about how the system is
> behaving. However, I haven't been doing much more than running emacs
> and mozilla recently - I haven't been running my visualisation
> software that typically stresses the VM beyond usefulness.

Check your syslog for oopsen. That's the only known reason for kswapd
to become a zombie.


-- wli

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re:  why swap at all?
  2004-06-01  9:38   ` Buddy Lumpkin
@ 2004-06-01 10:13     ` Tim Connors
  2004-06-01 10:24       ` William Lee Irwin III
  0 siblings, 1 reply; 146+ messages in thread
From: Tim Connors @ 2004-06-01 10:13 UTC (permalink / raw)
  To: Buddy Lumpkin
  Cc: 'John Bradford', 'Michael Brennan', linux-kernel, riel

"Buddy Lumpkin" <b.lumpkin@comcast.net> said on Tue, 1 Jun 2004 02:38:42 -0700:
> If I know in advance that filesystem I/O will eventually fill physical
> memory with filesystem pages (pagecache), then why would I allow file system
> I/O to force out anonymous pages on the system? Also, why wake up an
> expensive algorithm (kswapd) that walks all pages in physical memory in
> order to determine which pages are "Least Recently Used" on a system where
...

Incidentally, what happens when kswapd becomes a zombie? I've seen
this a few times, and I am currently posting on a machine that has
been up for 15 days, and which oopsed 10 or so days ago (something to
do with nfs, but don't worry about that - the machine is running
2.4.20, and is not exactly up-to-date), killing kswapd.

But I don't notice anything at all different about how the system is
behaving. However, I haven't been doing much more than running emacs
and mozilla recently - I haven't been running my visualisation
software that typically stresses the VM beyond usefulness.

-- 
TimC -- http://astronomy.swin.edu.au/staff/tconnors/
Whip me. Beat me. Make me maintain AIX.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-31 20:29 ` John Bradford
  2004-05-31 22:47   ` Nick Piggin
@ 2004-06-01  9:38   ` Buddy Lumpkin
  2004-06-01 10:13     ` Tim Connors
  1 sibling, 1 reply; 146+ messages in thread
From: Buddy Lumpkin @ 2004-06-01  9:38 UTC (permalink / raw)
  To: 'John Bradford', 'Michael Brennan', linux-kernel; +Cc: riel


> I'm not really sure what the above was intended to demonstrate, but 
> I assume that it was that having swap allowed the first grep to fill 
> physical RAM with
> cache at the expense of swapping other processes, which were 
> using physical RAM to disk.

> However, if 57 Mb of swap allows this, 57 Mb of extra physical RAM should
> also allow the grep to be cached, without having to swap out anything.

> Hence my comment about it not being a magical property of swap space.

> John.

Exactly! Swap will allow for room to evict anonymous pages (heap, stack,
shared memory, etc.) from memory to make room for other pages in memory.
Those pages could be file backed pages and therefore qualify as "pagecache".

When you read or write to/from a file at a given offset, the file operations
actually occur against memory.

Accessing an offset into the mapping of a file triggers a pagefault if a
page for that offset doesn't currently exist in memory. The point is, file
system I/O is actually achieved by accessing memory which triggers a major
pagefault (disk access) if the page doesn't already exist in memory. If the
page does already exist (minor fault) because the lookup of that
device,inode,offset succeeded, the already memory resident page is used
rather than incurring an I/O to disk.

This means that if you grep through the kernel tree, grep reads every line of
every file in the kernel tree to do its pattern matching, and afterwards every
file in the kernel tree is sitting in memory in its entirety. Any subsequent
reads of those files are quite snappy since they are already memory resident
(only minor faults are incurred).
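
The fault counters can be observed from userspace. A minimal Python sketch,
assuming a Unix system with the resource and mmap modules; it uses mmap so
that file access really is a memory access, and the helper names fault_counts
and touch_every_page are made up for illustration. The counts printed depend
on the system and on what is already cached.

```python
import mmap
import resource
import tempfile

def fault_counts():
    """Return (minor, major) page-fault counts for this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt

def touch_every_page(path):
    """Map a file read-only and read one byte per page; return the pages touched."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        touched = 0
        for offset in range(0, len(m), mmap.PAGESIZE):
            m[offset]  # a load from the mapping; faults if the page is not yet mapped
            touched += 1
        return touched

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"x" * (64 * mmap.PAGESIZE))
    tmp.flush()
    before = fault_counts()
    pages = touch_every_page(tmp.name)
    after = fault_counts()

print("pages touched:", pages)
print("minor faults:", after[0] - before[0], "major faults:", after[1] - before[1])
```

Because the file was just written, its pages are very likely still in the
pagecache, so one would expect few or no major faults here.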

The above scenario, where 57MB of swap allows the entire kernel source tree
to be memory resident, may provide tremendous value on a workstation or a
system where only a small amount of disk I/O takes place, but it assumes that
the anonymous pages aren't going to be faulted right back in as soon as the
program that uses these pages becomes runnable again. It's not uncommon to
have a system where certain pages continually are paged in and out to/from
the swap device, simply because the system is very low on RAM, and
filesystem I/O is filling up physical memory at a rate that exceeds the
frequency that a process that is allocating anonymous memory becomes
runnable.

The problem is that any system doing enough constant filesystem I/O, with
enough data, is eventually going to fill physical memory. This is not true
for anonymous memory. When I choose the amount of memory for a system, it is
first and foremost based on the resident set size of the application
processes. If I care about caching filesystem I/O, then all I have to do is
populate the system with enough "extra" physical RAM that files can be
cached in physical memory.

If I know in advance that filesystem I/O will eventually fill physical
memory with filesystem pages (pagecache), then why would I allow file system
I/O to force out anonymous pages on the system? Also, why wake up an
expensive algorithm (kswapd) that walks all pages in physical memory in
order to determine which pages are "Least Recently Used" on a system where
all resident set sizes of all processes add up to 100MB, and another 900MB
of physical RAM is full of filesystem backed pages?

On a server, where lots of I/O is taking place and you are willing to size
applications to fit completely within physical memory and add a little extra
for pagecache, I very much prefer the way that Solaris was modified to work
in Solaris 8 using a cyclical page cache.

The idea is "something" like this:

Filesystem backed pages are considered free memory. If you need to allocate
more anonymous memory, you just grab from (evict) the tail of a linked list
(called the cachelist) that represents the pagecache. If you update a page
or create a new page in the cachelist, then you simply move that page to the
head of the cachelist. This means there is always a small overhead in
maintaining the list, but nothing compared to the two-handed clock algorithm
that scans for pages to evict. The two-handed clock algorithm (the scanner)
is kept, but it only runs when freemem falls to lotsfree. In Solaris 8, freemem is
the size of the cachelist + the size of free memory (pages that are not
pagecache and are free).

This way, filesystem I/O CANNOT cause the scanner to wake up and start
traversing main memory, eating up valuable CPU time. Also, anonymous pages
will not be evicted on systems due to lots of filesystem I/O. 

I won't try to imply that the Solaris 8 or later VM system outperforms the
Linux VM because I haven't compared the two, but I can attest that the
Solaris 8 VM beats the pants off the Solaris 7 VM system on systems where
large amounts of filesystem I/O take place.

You don't get the supposed "benefit" of evicting anonymous memory to swap in
order to cache filesystem pages, but quite frankly on a server, I would not
want this bug ... err, ... I mean ...  feature :) I would much rather size
my system such that applications fit in physical memory and if I so desire,
add a little extra for pagecache.

--Buddy

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01  8:54           ` William Lee Irwin III
@ 2004-06-01  9:10             ` John Bradford
  2004-06-08  1:18               ` Tim Connors
  0 siblings, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-06-01  9:10 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Nick Piggin, Michael Brennan, linux-kernel

Quote from William Lee Irwin III <wli@holomorphy.com>:
> Quote from William Lee Irwin III <wli@holomorphy.com>:
> >> So you can move userspace pages out of ZONE_DMA as-needed.
> 
> On Tue, Jun 01, 2004 at 09:50:08AM +0100, John Bradford wrote:
> > But how does that improve performance before untouched RAM, (496788 in this
> > example), is exhausted?
> > In normal use, (almost always CPU bound), I've honestly never noticed any
> > performance gain from having swap configured.  I must admit I haven't put
> > a lot of effort recently in to looking at this, but I have never been able
> > to reproduce these 'swap increases performance even with untouched RAM'
> > claims.
> 
> Because ZONE_DMA, the lower 16MB, is not all of RAM.

Ah, OK, this isn't really my area of expertise so maybe this is a stupid, (for
LKML), question, but can we only migrate data from low RAM via swap!?

Also, surely this is only relevant to X86 architectures?

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01  8:50         ` John Bradford
@ 2004-06-01  8:54           ` William Lee Irwin III
  2004-06-01  9:10             ` John Bradford
  0 siblings, 1 reply; 146+ messages in thread
From: William Lee Irwin III @ 2004-06-01  8:54 UTC (permalink / raw)
  To: John Bradford; +Cc: Nick Piggin, Michael Brennan, linux-kernel

Quote from William Lee Irwin III <wli@holomorphy.com>:
>> So you can move userspace pages out of ZONE_DMA as-needed.

On Tue, Jun 01, 2004 at 09:50:08AM +0100, John Bradford wrote:
> But how does that improve performance before untouched RAM, (496788 in this
> example), is exhausted?
> In normal use, (almost always CPU bound), I've honestly never noticed any
> performance gain from having swap configured.  I must admit I haven't put
> a lot of effort recently in to looking at this, but I have never been able
> to reproduce these 'swap increases performance even with untouched RAM'
> claims.

Because ZONE_DMA, the lower 16MB, is not all of RAM.


-- wli

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01  8:32       ` William Lee Irwin III
@ 2004-06-01  8:50         ` John Bradford
  2004-06-01  8:54           ` William Lee Irwin III
  0 siblings, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-06-01  8:50 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Nick Piggin, Michael Brennan, linux-kernel

Quote from William Lee Irwin III <wli@holomorphy.com>:
> On Tue, Jun 01, 2004 at 09:34:01AM +0100, John Bradford wrote:
> > Sure, but tell me, for example, what is the point of having swap on a system
> > like this:
> > $ free
> >              total       used       free     shared    buffers     cached
> > Mem:        516688      19900     496788          0        628      11276
> > -/+ buffers/cache:       7996     508692
> > Swap:            0          0          0
> 
> So you can move userspace pages out of ZONE_DMA as-needed.

But how does that improve performance before untouched RAM, (496788 in this
example), is exhausted?

In normal use, (almost always CPU bound), I've honestly never noticed any
performance gain from having swap configured.  I must admit I haven't put
a lot of effort recently in to looking at this, but I have never been able
to reproduce these 'swap increases performance even with untouched RAM'
claims.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-31 22:47   ` Nick Piggin
  2004-05-31 23:30     ` Bernd Eckenfels
@ 2004-06-01  8:34     ` John Bradford
  2004-06-01  8:32       ` William Lee Irwin III
  1 sibling, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-06-01  8:34 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Michael Brennan, linux-kernel

> > However, if 57 Mb of swap allows this, 57 Mb of extra physical RAM should
> > also allow the grep to be cached, without having to swap out anything.
> > 
> 
> Well yes, but if I had another 57MB of physical memory then I would
> still turn on swap so that other 57MB of unused memory isn't wasted.

Sure, but tell me, for example, what is the point of having swap on a system
like this:

$ free
             total       used       free     shared    buffers     cached
Mem:        516688      19900     496788          0        628      11276
-/+ buffers/cache:       7996     508692
Swap:            0          0          0
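
For reference, the "-/+ buffers/cache" line follows directly from the "Mem:"
line. A minimal Python sketch of that arithmetic; the function name
buffers_cache_line is made up for illustration:

```python
def buffers_cache_line(used, free, buffers, cached):
    """Return the (used, free) pair shown on free's '-/+ buffers/cache' line."""
    return used - buffers - cached, free + buffers + cached

# Figures in KB from the 'Mem:' line above.
print(buffers_cache_line(used=19900, free=496788, buffers=628, cached=11276))
# (7996, 508692)
```

In other words, only about 8MB of this machine's RAM is used by anything
other than buffers and cache.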

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-06-01  8:34     ` John Bradford
@ 2004-06-01  8:32       ` William Lee Irwin III
  2004-06-01  8:50         ` John Bradford
  0 siblings, 1 reply; 146+ messages in thread
From: William Lee Irwin III @ 2004-06-01  8:32 UTC (permalink / raw)
  To: John Bradford; +Cc: Nick Piggin, Michael Brennan, linux-kernel

On Tue, Jun 01, 2004 at 09:34:01AM +0100, John Bradford wrote:
> Sure, but tell me, for example, what is the point of having swap on a system
> like this:
> $ free
>              total       used       free     shared    buffers     cached
> Mem:        516688      19900     496788          0        628      11276
> -/+ buffers/cache:       7996     508692
> Swap:            0          0          0

So you can move userspace pages out of ZONE_DMA as-needed.


-- wli

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-31 22:47   ` Nick Piggin
@ 2004-05-31 23:30     ` Bernd Eckenfels
  2004-06-01 18:36       ` FabF
  2004-06-01  8:34     ` John Bradford
  1 sibling, 1 reply; 146+ messages in thread
From: Bernd Eckenfels @ 2004-05-31 23:30 UTC (permalink / raw)
  To: linux-kernel

In article <40BBB5F7.1010407@yahoo.com.au> you wrote:
> Well, at the "expense" of paging out unused memory. I don't see
> any swapin.

On a slow system with small memory you quite often see swapped out
applications, for example a Kopete messenger window. Once you click on
it, it takes 10sec or more to get responsive again. Of course it's a slow
system, but gradually paging out and forgetting image pages has that effect
on faster systems too, and makes the desktop sluggish.

> Well yes, but if I had another 57MB of physical memory then I would
> still turn on swap so that other 57MB of unused memory isn't wasted.

Actually the amount of totally unused memory is quite small. Therefore the
pages get swapped in sooner or later anyway. And even if you turn off swap
completely, the image pages backed by binaries on disk still get freed
if the code is unused. So on my multimedia system I prefer to have no swap
(1GB RAM) and make sure the pages are not freed so aggressively, to keep
the system smooth and responsive (and allow spin down of the disk).

Bernd
-- 
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-31 20:29 ` John Bradford
@ 2004-05-31 22:47   ` Nick Piggin
  2004-05-31 23:30     ` Bernd Eckenfels
  2004-06-01  8:34     ` John Bradford
  2004-06-01  9:38   ` Buddy Lumpkin
  1 sibling, 2 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-31 22:47 UTC (permalink / raw)
  To: John Bradford; +Cc: Michael Brennan, linux-kernel

John Bradford wrote:
> Hi,
> 
> Quote from Michael Brennan <mbrennan@ezrs.com>:
> 
>>Hi!
>>I've recently started to follow this list.
>>I read the swap discussion here, and I was wondering about what Nick 
>>Piggin said about grepping the kernel tree.
>>
>>Nick Piggin wrote:
>> > For example, I have 57MB swapped right now. It allows me to instantly
>> > grep the kernel tree. If I turned swap off, each grep would probably
>> > take 30 seconds.
>>
>>Are the pages swapped to disk as a result of the grep run?
> 

The pages are gradually swapped to disk as I use the system.
> 
> I'm not really sure what the above was intended to demonstrate, but I assume
> that it was that having swap allowed the first grep to fill physical RAM with
> cache at the expense of swapping other processes, which were using physical
> RAM to disk.
> 

Well, at the "expense" of paging out unused memory. I don't see
any swapin.

> However, if 57 MB of swap allows this, 57 MB of extra physical RAM should
> also allow the grep to be cached, without having to swap out anything.
> 

Well yes, but if I had another 57MB of physical memory then I would
still turn on swap so that other 57MB of unused memory isn't wasted.
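
One way to check a claim like "I don't see any swapin" on a 2.6 kernel is
to watch the pswpin/pswpout counters in /proc/vmstat before and after a
workload. The following is only a sketch, assuming those counter names are
present on the running kernel; vmstat_counter() and read_vmstat_counter()
are made-up helper names for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the value of counter `name` in /proc/vmstat-style text
 * ("name value" per line), or -1 if the counter is not present.
 */
long vmstat_counter(const char *text, const char *name)
{
    assert(text != NULL && name != NULL);

    size_t len = strlen(name);
    const char *line = text;

    while (*line != '\0') {
        /* Match the whole name, not just a prefix of a longer one. */
        if (strncmp(line, name, len) == 0 && line[len] == ' ') {
            long value;
            return sscanf(line + len, "%ld", &value) == 1 ? value : -1;
        }
        const char *next = strchr(line, '\n');
        if (next == NULL)
            break;
        line = next + 1;
    }
    return -1;
}

/* Read one counter from the live system; -1 if it cannot be read. */
long read_vmstat_counter(const char *name)
{
    static char buf[65536];
    FILE *f = fopen("/proc/vmstat", "r");

    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return vmstat_counter(buf, name);
}
```

Sampling read_vmstat_counter("pswpin") before and after the grep run and
comparing the two values shows whether anything was actually swapped back
in; a growing pswpout with a flat pswpin would match the behaviour
described above.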

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-31 19:34 Michael Brennan
@ 2004-05-31 20:29 ` John Bradford
  2004-05-31 22:47   ` Nick Piggin
  2004-06-01  9:38   ` Buddy Lumpkin
  0 siblings, 2 replies; 146+ messages in thread
From: John Bradford @ 2004-05-31 20:29 UTC (permalink / raw)
  To: Michael Brennan, linux-kernel

Hi,

Quote from Michael Brennan <mbrennan@ezrs.com>:
> Hi!
> I've recently started to follow this list.
> I read the swap discussion here, and I was wondering about what Nick 
> Piggin said about grepping the kernel tree.
> 
> Nick Piggin wrote:
>  > For example, I have 57MB swapped right now. It allows me to instantly
>  > grep the kernel tree. If I turned swap off, each grep would probably
>  > take 30 seconds.
> 
> Are the pages swapped to disk as a result of the grep run?

I'm not really sure what the above was intended to demonstrate, but I assume
that it was that having swap allowed the first grep to fill physical RAM with
cache at the expense of swapping other processes, which were using physical
RAM to disk.

However, if 57 MB of swap allows this, 57 MB of extra physical RAM should
also allow the grep to be cached, without having to swap out anything.

Hence my comment about it not being a magical property of swap space.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
@ 2004-05-31 19:34 Michael Brennan
  2004-05-31 20:29 ` John Bradford
  0 siblings, 1 reply; 146+ messages in thread
From: Michael Brennan @ 2004-05-31 19:34 UTC (permalink / raw)
  To: linux-kernel

Hi!
I've recently started to follow this list.
I read the swap discussion here, and I was wondering about what Nick 
Piggin said about grepping the kernel tree.

Nick Piggin wrote:
 > For example, I have 57MB swapped right now. It allows me to instantly
 > grep the kernel tree. If I turned swap off, each grep would probably
 > take 30 seconds.

Are the pages swapped to disk as a result of the grep run?
I'm still running 2.4.25. And when I do a grep on the linux kernel tree, 
it always takes at least 2 minutes at every run. Almost all physical 
ram, and 21MB of swap is used. Should the files read by grep be cached 
in memory/swap?
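
File data read by grep is kept in the page cache in RAM, not in swap, so
one way to answer the question for a given file is to ask how many of its
pages are currently resident. The following is only a sketch, assuming a
Linux system with mmap() and mincore(); resident_pages() and
resident_pages_of() are made-up helper names for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Count how many pages of the readable, open file `fd` are resident in
 * memory. Stores the file's total page count in *total.
 * Returns the resident page count, or -1 on error.
 */
long resident_pages(int fd, long *total)
{
    struct stat st;
    long page = sysconf(_SC_PAGESIZE);

    assert(fd >= 0 && total != NULL);

    if (fstat(fd, &st) != 0)
        return -1;
    *total = (long)((st.st_size + page - 1) / page);
    if (*total == 0)
        return 0;

    /* Mapping the file does not read it; mincore() only inspects it. */
    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    long resident = -1;
    unsigned char *vec = malloc((size_t)*total);
    if (vec != NULL && mincore(map, (size_t)st.st_size, vec) == 0) {
        resident = 0;
        for (long i = 0; i < *total; i++)
            resident += vec[i] & 1;   /* low bit set: page is resident */
    }
    free(vec);
    munmap(map, (size_t)st.st_size);
    return resident;
}

/* Convenience wrapper that opens `path` read-only. */
long resident_pages_of(const char *path, long *total)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    long resident = resident_pages(fd, total);
    close(fd);
    return resident;
}
```

Running this over a few source files right after a grep, and again after
some other heavy I/O, gives a rough picture of whether the tree stays
cached between runs.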

Michael Brennan

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-28 22:28       ` Bernd Eckenfels
  2004-05-29  7:31         ` Denis Vlasenko
@ 2004-05-31 10:49         ` jlnance
  2004-06-01 11:57           ` Lenar Lõhmus
  2004-06-01 12:21           ` David B. Stevens
  1 sibling, 2 replies; 146+ messages in thread
From: jlnance @ 2004-05-31 10:49 UTC (permalink / raw)
  To: linux-kernel


> > cp should use fadvise() and say that it _really_ does not need those pages.
> 
> Yes, indeed. On the other hand the sequential read could be detected by the kernel, too.

I'm not sure.  Copying a file is a pretty good indication that you
are about to do something with either the new or the old file.

Thanks,

Jim

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-28 22:28       ` Bernd Eckenfels
@ 2004-05-29  7:31         ` Denis Vlasenko
  2004-05-31 10:49         ` jlnance
  1 sibling, 0 replies; 146+ messages in thread
From: Denis Vlasenko @ 2004-05-29  7:31 UTC (permalink / raw)
  To: Bernd Eckenfels, linux-kernel

On Saturday 29 May 2004 01:28, Bernd Eckenfels wrote:
> In article <200405290037.17775.vda@port.imtp.ilyichevsk.odessa.ua> you wrote:
> >> The benchmark involved was ls.  It took several seconds.  If I ran it
> >> again in 5 seconds or so, it was fine.  Much longer and it would take
> >> several seconds again.  Sounds like pages getting evicted in LRU order.
> >
> > By what magic can the system know that you are going to do ls again
> > in 2 minutes?
>
> The problem is more about the blocks cp touches, less  about predicting the
> ls workload.
>
> > cp should use fadvise() and say that it _really_ does not need those
> > pages.
>
> Yes, indeed. On the other hand the sequential read could be detected by the
> kernel, too.

Looks like it was. ls' read was sequential, too, so it did not get any
advantage. If you can definitely show that streaming io
(say, cat hugefile >/dev/null) flushes _non_ sequentially read data
(pages with program/library code, data of e.g. your Mozilla, etc),
please submit a report to lkml. VM gurus said more than once
that they _want_ to fix things, but need to know how to reproduce.
--
vda


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-28 21:37     ` Denis Vlasenko
@ 2004-05-28 22:28       ` Bernd Eckenfels
  2004-05-29  7:31         ` Denis Vlasenko
  2004-05-31 10:49         ` jlnance
  0 siblings, 2 replies; 146+ messages in thread
From: Bernd Eckenfels @ 2004-05-28 22:28 UTC (permalink / raw)
  To: linux-kernel

In article <200405290037.17775.vda@port.imtp.ilyichevsk.odessa.ua> you wrote:
>> The benchmark involved was ls.  It took several seconds.  If I ran it again
>> in 5 seconds or so, it was fine.  Much longer and it would take several
>> seconds again.  Sounds like pages getting evicted in LRU order.
> 
> By what magic can the system know that you are going to do ls again
> in 2 minutes?

The problem is more about the blocks cp touches, less  about predicting the ls workload.

> cp should use fadvise() and say that it _really_ does not need those pages.

Yes, indeed. On the other hand the sequential read could be detected by the kernel, too.

Greetings
Bernd
-- 
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27 11:39   ` Andy Lutomirski
@ 2004-05-28 21:37     ` Denis Vlasenko
  2004-05-28 22:28       ` Bernd Eckenfels
  0 siblings, 1 reply; 146+ messages in thread
From: Denis Vlasenko @ 2004-05-28 21:37 UTC (permalink / raw)
  To: Andy Lutomirski, Nick Piggin
  Cc: Tom Felker, Matthias Schniedermeyer, linux-kernel

On Thursday 27 May 2004 14:39, Andy Lutomirski wrote:
> Nick Piggin wrote:
> > Tom Felker wrote:
> >> On Wednesday 26 May 2004 7:37 am, Matthias Schniedermeyer wrote:
> >>> program to kernel: "i read ONCE though this file caching not useful".
> >>
> >> Very true.  The system is based on the assumption that just-used pages
> >> are more useful than older pages, and it slows when this isn't true.
> >> We need ways to tell the kernel whether the assumption holds.
> >
> > A streaming flag is great, but we usually do OK without it. There
> > is a "used once" heuristic that often gets it right as far as I
> > know. Basically, new pages that are only used once put almost zero
> > pressure on the rest of the memory.
>
> (Disclaimer: I don't know all that much about the current scheme.)
>
> First, I don't believe this works.  A couple weeks ago I did
>
> # cp -a <~100GB> <different physical disk>
>
> and my system was nearly unusable for a few hours.  This is Athlon 64
> 3200+, 512MB RAM, DMA on on both drives, iowait time around 90%.  So this
> was an io/pagecache problem.
>
> The benchmark involved was ls.  It took several seconds.  If I ran it again
> in 5 seconds or so, it was fine.  Much longer and it would take several
> seconds again.  Sounds like pages getting evicted in LRU order.

By what magic can the system know that you are going to do ls again
in 2 minutes?

Does it happen if you do ls several times in a row (to make needed pages
not-once-used), then wait a bit and do ls again?

> I have this problem not only on every linux kernel I've ever tried (on
> different computers) but on other OS's as well.  It's not an easy one to
> solve.
>
> For kicks, I checked out vmstat 1 (I don't have a copy right of the
> output).  It looked like cp -a dirtied pages as long as it could get them,
> and they got written out as quickly as they could.  And, for whatever
> reason, the writes lag behind the reads by an amount comparable to the size
> of my physical memory.

cp should use fadvise() and say that it _really_ does not need those pages.
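
As an illustration of the fadvise() suggestion, a copy loop could hint
sequential access up front and then tell the kernel it is done with the
pages. This is only a sketch, assuming a kernel and libc that provide
posix_fadvise(); copy_once() is a made-up helper name, not what cp actually
does, and the advice is purely advisory, so the kernel is free to ignore it.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Copy everything from in_fd to out_fd, then advise the kernel that the
 * cached pages of both files will not be needed again.
 * Returns 0 on success, -1 on a read, write or fsync error.
 */
int copy_once(int in_fd, int out_fd)
{
    char buf[65536];
    ssize_t n;

    assert(in_fd >= 0 && out_fd >= 0);

    /* Hint: we will read the source once, front to back. */
    (void)posix_fadvise(in_fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        char *p = buf;
        while (n > 0) {                 /* handle short writes */
            ssize_t w = write(out_fd, p, (size_t)n);
            if (w < 0)
                return -1;
            p += w;
            n -= w;
        }
    }
    if (n < 0)
        return -1;

    /* Dirty pages cannot be dropped, so write them back first. */
    if (fsync(out_fd) != 0)
        return -1;

    /* Offset 0, length 0 means "the whole file". */
    (void)posix_fadvise(in_fd, 0, 0, POSIX_FADV_DONTNEED);
    (void)posix_fadvise(out_fd, 0, 0, POSIX_FADV_DONTNEED);
    return 0;
}
```

The fsync() makes the copy slower to return, which is the trade-off
discussed elsewhere in this thread: the program takes longer, but it leaves
less of its data behind in the page cache.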

> It seems like some kind of limiting/balancing of what gets to use the cache
> is needed.  I bet that most workloads that touch data much larger than RAM
> don't benefit that much from caching it all.  (Yes, that kernel-tree-grep
> from cache is nice, but having glibc in cache is also nice.)
>
> Should there be something like a (small) limit to how many dirty,
> non-mmaped pages a task can have?  I have no objection to a program taking
> longer to finish because the 100MB it writes need to mostly hit the platter
> before it returns, since, in return, I get a usable system while it's
> running and it's not taking any more CPU time.
>
> Second (IMHO) a "used once" heuristic has a fundamental problem:
>
> If there are more pages "used more than once" _in roughly sequential order_
> than fit in memory, then trying to cache them all is absurd.  That is, if
> some program makes _multiple passes_ over that 100GB (mkisofs?), the system
> should never try to cache it all.  It would be better off taking a guess
> (even a wild-ass-guess) of which 200MB to cache plus a few MB for
> readahead, leaving pages from other programs in cache for more than a few
> seconds, and probably getting better performance (i.e. those 200MB are at
> least cached next time around).

Easier said than done... Why 200MB and not 400? etc...

> Is any of this reasonable?

If you think you see VM misbehavior,
1) verify that it is indeed MISbehaving
2) produce useful bug report
3) report it and track it until fixed

Apps take ages to start after cache being trashed because they are bloated.
Fight bloat. Join uclibc/dietlibc/etc efforts.

Random example: why on earth does the ntpd daemon have an RSS of ~1.5 MB???!
--
vda


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-27  5:37 Nick Piggin
@ 2004-05-27 17:27 ` Buddy Lumpkin
  0 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-27 17:27 UTC (permalink / raw)
  To: 'Nick Piggin'
  Cc: 'John Bradford', 'William Lee Irwin III',
	orders, linux-kernel


>I can picture it but I don't know how the kernel is going to handle
>it. All I am doing is speaking from what I have seen.

>http://marc.theaimsgroup.com/?l=linux-kernel&m=107817776322044&w=2

>This post for example, has profiles of a 32 CPU system with 16 FC
>controllers and over 1000 disks, doing some database workload. Does
>this qualify as big iron?

>In the bottom profile, you see the disks being kept busy with 50%
>idle time. The top 6 functions are all to do with generating IO
>requests and pushing them through the block layer, none of them
>involve memory reclaim.


They are using direct I/O ... therefore the DMA memory transfers are mapped
directly into the user address space bypassing the pagecache altogether.

--Buddy




There are profiles from a different setup in a related thread here:

http://groups.google.com.au/groups?q=g:thl3816668183d&dq=&hl=en&lr=&ie=UTF-8&selm=1yjKu-7qU-1%40gated-at.bofh.it&rnum=9

I think we see kmem_cache_alloc make a miserable showing for the
memory allocation team, but it wouldn't even be there if the
profile were sorted by ticks (the left hand column).


Now If you had some experiences of memory reclaim slowing down
block IO, I'd love to hear them because that is related to an area
that I am looking at currently. I'm not saying what you claim is
impossible, but it is something that shouldn't happen and we don't
really see... You're continuing to insist there is a problem but
that simply isn't helpful without further details.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27 15:59   ` John Bradford
@ 2004-05-27 16:16     ` William Lee Irwin III
  0 siblings, 0 replies; 146+ messages in thread
From: William Lee Irwin III @ 2004-05-27 16:16 UTC (permalink / raw)
  To: John Bradford
  Cc: Piszcz, Justin Michael, Andy Lutomirski, Nick Piggin, Tom Felker,
	Matthias Schniedermeyer, linux-kernel

Quote from William Lee Irwin III <wli@holomorphy.com>:
>> Yes. You want swap so you can physically relocate anonymous pages in the
>> rare case one ends up somewhere it could cause memory pressure against
>> allocations that can only be satisfied by a restricted range of memory.

On Thu, May 27, 2004 at 04:59:52PM +0100, John Bradford wrote:
> I think you are assuming a 100% perfect VM system.  In practice, if
> the machine isn't heavily loaded, unnecessary swap is more likely to
> cause (slight, and possibly negligible) slowdowns than bring any
> noticeable performance benefit.

First, the above is not a performance issue to begin with. It's a workload
feasibility issue. Second, the only overhead of swap when it's unused
is vmallocspace. Third, the only way to eliminate the runtime overhead
of the swap layer is CONFIG_SWAP=n.

The above scenario is not particularly common, but can be "fatal" to
the critical applications whose allocations were infeasible. I'd
recommend using a small amount of swapspace on your 16GB machine, e.g.
256MB or 512MB. One method of removing this requirement that swapspace
be configured so the kernel can get itself out of this pathological
situation is to implement page migration, so that memory movement e.g.
between zones need not be carried out through a backing store.

-- wli

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27 12:41 ` William Lee Irwin III
@ 2004-05-27 15:59   ` John Bradford
  2004-05-27 16:16     ` William Lee Irwin III
  2004-06-03 13:38   ` Bill Davidsen
  1 sibling, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-05-27 15:59 UTC (permalink / raw)
  To: William Lee Irwin III, Piszcz, Justin Michael
  Cc: Andy Lutomirski, Nick Piggin, Tom Felker,
	Matthias Schniedermeyer, linux-kernel

Quote from William Lee Irwin III <wli@holomorphy.com>:
> On Thu, May 27, 2004 at 08:31:26AM -0400, Piszcz, Justin Michael wrote:
> > If I have 16GB of ram should I use swap?
> > Would swap cause the machine to slow down?
> 
> Yes. You want swap so you can physically relocate anonymous pages in the
> rare case one ends up somewhere it could cause memory pressure against
> allocations that can only be satisfied by a restricted range of memory.

I think you are assuming a 100% perfect VM system.  In practice, if the machine
isn't heavily loaded, unnecessary swap is more likely to cause (slight, and
possibly negligible) slowdowns than bring any noticeable performance benefit.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27  5:59               ` Nick Piggin
@ 2004-05-27 14:34                 ` Wakko Warner
  0 siblings, 0 replies; 146+ messages in thread
From: Wakko Warner @ 2004-05-27 14:34 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel

> > I have a question about that.  I keep a debian mirror on one of my machines. 
> > there is over 70000 files.  If I run find on that tree while it's
> > downloading the file list, it doesn't take as long.  I thought it would be
> > nice if there was some way I could keep that in memory.  The box has 256mb
> > ram no swap.  It is configured as diskless.
> > 
> 
> You mean that if you prime the cache by running find on the tree,
> your actual operation doesn't take as long?

Yup.  Running the mirror doesn't matter really.  I start that before I
retire at the end of the day.

> I don't doubt this. Slab cache is shrunk aggressively compared to
> page cache. Traditionally I think this has been due at least in
> part to some failure cases in the balancing there resulting in slab
> growing out of control with some systems.

Where it gets me is the 2nd mirror I have on a usb disk.  Updating it takes
a while.  Although priming the cache on the machine where the usb disk is is
a bit quicker than where the mirror is (rsync over tcp/ip).  Both disks use
ext3, but the machine the usb is on has way more memory, usb2, and overall
quicker than the other.

> These failure cases should be fixed now, and slab vs pagecache is
> probably something that should be looked at again. I really need
> to get my hands on a 2GB+ system before I'd be game to start
> fiddling with too much stuff though.

I've been wanting to upgrade that machine to 768mb, but I don't know if
it'll handle it.

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27 12:31 Piszcz, Justin Michael
@ 2004-05-27 12:41 ` William Lee Irwin III
  2004-05-27 15:59   ` John Bradford
  2004-06-03 13:38   ` Bill Davidsen
  0 siblings, 2 replies; 146+ messages in thread
From: William Lee Irwin III @ 2004-05-27 12:41 UTC (permalink / raw)
  To: Piszcz, Justin Michael
  Cc: Andy Lutomirski, Nick Piggin, Tom Felker,
	Matthias Schniedermeyer, linux-kernel

On Thu, May 27, 2004 at 08:31:26AM -0400, Piszcz, Justin Michael wrote:
> If I have 16GB of ram should I use swap?
> Would swap cause the machine to slow down?

Yes. You want swap so you can physically relocate anonymous pages in the
rare case one ends up somewhere it could cause memory pressure against
allocations that can only be satisfied by a restricted range of memory.


-- wli

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
@ 2004-05-27 12:31 Piszcz, Justin Michael
  2004-05-27 12:41 ` William Lee Irwin III
  0 siblings, 1 reply; 146+ messages in thread
From: Piszcz, Justin Michael @ 2004-05-27 12:31 UTC (permalink / raw)
  To: Andy Lutomirski, Nick Piggin
  Cc: Tom Felker, Matthias Schniedermeyer, linux-kernel

If I have 16GB of ram should I use swap?
Would swap cause the machine to slow down?


-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Andy Lutomirski
Sent: Thursday, May 27, 2004 7:39 AM
To: Nick Piggin
Cc: Tom Felker; Matthias Schniedermeyer; linux-kernel@vger.kernel.org
Subject: Re: why swap at all?

Nick Piggin wrote:
> Tom Felker wrote:
> 
>> On Wednesday 26 May 2004 7:37 am, Matthias Schniedermeyer wrote:
>>
>>
>>> program to kernel: "i read ONCE though this file caching not useful".
>>
>>
>>
>> Very true.  The system is based on the assumption that just-used pages
>> are more useful than older pages, and it slows when this isn't true.
>> We need ways to tell the kernel whether the assumption holds.
>>
> 
> A streaming flag is great, but we usually do OK without it. There
> is a "used once" heuristic that often gets it right as far as I
> know. Basically, new pages that are only used once put almost zero
> pressure on the rest of the memory.

(Disclaimer: I don't know all that much about the current scheme.)

First, I don't believe this works.  A couple weeks ago I did

# cp -a <~100GB> <different physical disk>

and my system was nearly unusable for a few hours.  This is Athlon 64
3200+, 512MB RAM, DMA on on both drives, iowait time around 90%.  So this
was an io/pagecache problem.

The benchmark involved was ls.  It took several seconds.  If I ran it again
in 5 seconds or so, it was fine.  Much longer and it would take several
seconds again.  Sounds like pages getting evicted in LRU order.

I have this problem not only on every linux kernel I've ever tried (on
different computers) but on other OS's as well.  It's not an easy one to
solve.

For kicks, I checked out vmstat 1 (I don't have a copy right of the
output).  It looked like cp -a dirtied pages as long as it could get them,
and they got written out as quickly as they could.  And, for whatever
reason, the writes lag behind the reads by an amount comparable to the size
of my physical memory.

It seems like some kind of limiting/balancing of what gets to use the cache
is needed.  I bet that most workloads that touch data much larger than RAM
don't benefit that much from caching it all.  (Yes, that kernel-tree-grep
from cache is nice, but having glibc in cache is also nice.)

Should there be something like a (small) limit to how many dirty,
non-mmaped pages a task can have?  I have no objection to a program taking
longer to finish because the 100MB it writes need to mostly hit the platter
before it returns, since, in return, I get a usable system while it's
running and it's not taking any more CPU time.

Second (IMHO) a "used once" heuristic has a fundamental problem:

If there are more pages "used more than once" _in roughly sequential order_
than fit in memory, then trying to cache them all is absurd.  That is, if
some program makes _multiple passes_ over that 100GB (mkisofs?), the system
should never try to cache it all.  It would be better off taking a guess
(even a wild-ass-guess) of which 200MB to cache plus a few MB for
readahead, leaving pages from other programs in cache for more than a few
seconds, and probably getting better performance (i.e. those 200MB are at
least cached next time around).


Is any of this reasonable?

--Andy



^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
       [not found] ` <fa.bqpvcrs.u648jq@ifi.uio.no>
@ 2004-05-27 11:39   ` Andy Lutomirski
  2004-05-28 21:37     ` Denis Vlasenko
  0 siblings, 1 reply; 146+ messages in thread
From: Andy Lutomirski @ 2004-05-27 11:39 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Tom Felker, Matthias Schniedermeyer, linux-kernel

Nick Piggin wrote:
> Tom Felker wrote:
> 
>> On Wednesday 26 May 2004 7:37 am, Matthias Schniedermeyer wrote:
>>
>>
>>> program to kernel: "i read ONCE though this file caching not useful".
>>
>>
>>
>> Very true.  The system is based on the assumption that just-used pages 
>> are more useful than older pages, and it slows when this isn't true.  
>> We need ways to tell the kernel whether the assumption holds.
>>
> 
> A streaming flag is great, but we usually do OK without it. There
> is a "used once" heuristic that often gets it right as far as I
> know. Basically, new pages that are only used once put almost zero
> pressure on the rest of the memory.

(Disclaimer: I don't know all that much about the current scheme.)

First, I don't believe this works.  A couple weeks ago I did

# cp -a <~100GB> <different physical disk>

and my system was nearly unusable for a few hours.  This is Athlon 64 
3200+, 512MB RAM, DMA on on both drives, iowait time around 90%.  So this 
was an io/pagecache problem.

The benchmark involved was ls.  It took several seconds.  If I ran it again 
in 5 seconds or so, it was fine.  Much longer and it would take several 
seconds again.  Sounds like pages getting evicted in LRU order.

I have this problem not only on every linux kernel I've ever tried (on 
different computers) but on other OS's as well.  It's not an easy one to solve.

For kicks, I checked out vmstat 1 (I don't have a copy right of the 
output).  It looked like cp -a dirtied pages as long as it could get them, 
and they got written out as quickly as they could.  And, for whatever 
reason, the writes lag behind the reads by an amount comparable to the size 
of my physical memory.

It seems like some kind of limiting/balancing of what gets to use the cache 
is needed.  I bet that most workloads that touch data much larger than RAM 
don't benefit that much from caching it all.  (Yes, that kernel-tree-grep 
from cache is nice, but having glibc in cache is also nice.)

Should there be something like a (small) limit to how many dirty, 
non-mmaped pages a task can have?  I have no objection to a program taking 
longer to finish because the 100MB it writes need to mostly hit the platter 
before it returns, since, in return, I get a usable system while it's 
running and it's not taking any more CPU time.

Second (IMHO) a "used once" heuristic has a fundamental problem:

If there are more pages "used more than once" _in roughly sequential order_ 
than fit in memory, then trying to cache them all is absurd.  That is, if 
some program makes _multiple passes_ over that 100GB (mkisofs?), the system 
should never try to cache it all.  It would be better off taking a guess 
(even a wild-ass-guess) of which 200MB to cache plus a few MB for 
readahead, leaving pages from other programs in cache for more than a few 
seconds, and probably getting better performance (i.e. those 200MB are at 
least cached next time around).


Is any of this reasonable?

--Andy

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 23:32                     ` Kyle Moffett
@ 2004-05-27  8:05                       ` John Bradford
  0 siblings, 0 replies; 146+ messages in thread
From: John Bradford @ 2004-05-27  8:05 UTC (permalink / raw)
  To: Kyle Moffett; +Cc: linux-kernel, David Schwartz

Quote from Kyle Moffett <mrmacman_g4@mac.com>:
> On May 26, 2004, at 12:58, John Bradford wrote:
> >> 	A lot of people feel subjectively that swap makes a system slow. 
> >> There's
> >> anecdotal evidence that swap does horrible things or "must be badly 
> >> broken
> >> because the machine gets slow" on almost every operating system that
> >> supports swapping. In most cases, it's just a case where the real 
> >> working
> >> set has exceeded physical memory, and in that case, swap is just 
> >> doing what
> >> it's supposed to be doing.
> > It's true that physical RAM or swap, over and above the minimum needed 
> > for
> > the working set is usually beneficial.  However where there is 
> > physical RAM
> > which will never be touched during normal usage, adding swap will not 
> > be
> > beneficial.
> 
> If your RAM happens to be large enough to contain not only everything 
> on disk
> you ever want to even read *and* all the space you need for 
> calculations, then
> you have nothing to gain from using swap.  On the other hand, if you 
> are say,
> grepping through a kernel source tree, the first time it is read from 
> disk, but after
> that it is stored in cache in your RAM.  If you have swap, anonymous 
> pages of
> RAM that are not in use can be paged out while you do your grepping, 
> even if
> you are grepping through a 900MB+ dataset and only have 1GB RAM.  Swap
> allows non-filesystem-backed pages to be pushed to disk for some 
> filesystem
> backed pages to be loaded and used.

Think about it - you seem to be suggesting that adding more and more swap will
free up more and more physical RAM to be used as cache, but that is not really
true, because once you've freed up all of the physical RAM, there is no more
to free up.

That is not to say that there is no point in having more swap than physical
RAM at all, rather that once all non-filesystem-backed pages apart from cache
have been pushed out to swap, (and note that executables can and will be pushed
out to swap independently of swap space anyway), all that additional swap
space will allow is to run more processes, or move cache out to swap, which
admittedly could give a performance benefit in some instances, but in most
cases I think it would be minimal.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27  5:14                       ` Tom Felker
  2004-05-27  6:02                         ` Nick Piggin
  2004-05-27  7:04                         ` Bernd Eckenfels
@ 2004-05-27  7:16                         ` Oliver Neukum
  2 siblings, 0 replies; 146+ messages in thread
From: Oliver Neukum @ 2004-05-27  7:16 UTC (permalink / raw)
  To: Tom Felker; +Cc: Matthias Schniedermeyer, Nick Piggin, linux-kernel

On Thursday, 27 May 2004 07:14, Tom Felker wrote:
> Most drastic would be to change the way to choose pages to throw out.  
> Different processes or pages could have different priorities, so you could 
> mark interactive processes as keepers even if you haven't used them in days.

Do you really want that? Wouldn't you rather want pages of such tasks
swapped in very aggressively once the first page fault happens? Or even
preemptively?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27  5:14                       ` Tom Felker
  2004-05-27  6:02                         ` Nick Piggin
@ 2004-05-27  7:04                         ` Bernd Eckenfels
  2004-05-27  7:16                         ` Oliver Neukum
  2 siblings, 0 replies; 146+ messages in thread
From: Bernd Eckenfels @ 2004-05-27  7:04 UTC (permalink / raw)
  To: linux-kernel

In article <200405270014.10096.tcfelker@mtco.com> you wrote:
> O_STREAMING and a flag to not cache a file when it closes are a good start.  

Win32 API has a FILE_ATTRIBUTE_TEMPORARY to mark files which should
preferably be served from the buffer cache; FILE_FLAG_NO_BUFFERING allows raw
access (requires reads on block boundaries). FILE_FLAG_RANDOM_ACCESS is used
to hint the cache (don't know what it does, maybe reduce prefetching?) as
well as FILE_FLAG_SEQUENTIAL_SCAN as a hint for the other case where you read
the stream. There is also a write-through flag, which does not affect
caching. So basically I think the hints Win32 API offers are not the perfect
set of flags one can think about. Unless SEQUENTIAL_SCAN also implies "forget
blocks before current read position".

Greetings
Bernd
-- 
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-27  5:14                       ` Tom Felker
@ 2004-05-27  6:02                         ` Nick Piggin
  2004-05-27  7:04                         ` Bernd Eckenfels
  2004-05-27  7:16                         ` Oliver Neukum
  2 siblings, 0 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-27  6:02 UTC (permalink / raw)
  To: Tom Felker; +Cc: Matthias Schniedermeyer, linux-kernel

Tom Felker wrote:
> On Wednesday 26 May 2004 7:37 am, Matthias Schniedermeyer wrote:
> 
> 
>>program to kernel: "i read ONCE though this file caching not useful".
> 
> 
> Very true.  The system is based on the assumption that just-used pages are 
> more useful than older pages, and it slows when this isn't true.  We need 
> ways to tell the kernel whether the assumption holds.
> 

A streaming flag is great, but we usually do OK without it. There
is a "used once" heuristic that often gets it right as far as I
know. Basically, new pages that are only used once put almost zero
pressure on the rest of the memory.

It has a few corner cases where it breaks down. Hopefully they can
be improved...
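
For readers unfamiliar with the idea, here is a toy model of a "used once"
policy. It is a sketch under stated assumptions, not the kernel's code: the
class name, the two lists, and the eviction order are all made up for
illustration. A page starts on an inactive list and is promoted to an active
list only on a second reference, so a stream of read-once pages recycles the
inactive list and leaves the active pages alone.

```python
from collections import OrderedDict


class UseOnceCache:
    """Toy two-list cache (hypothetical): promote a page only on its second reference."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inactive = OrderedDict()  # referenced once; oldest entry first
        self.active = OrderedDict()    # referenced at least twice; oldest first

    def access(self, page):
        """Touch a page. Return True on a hit, False on a miss."""
        if page in self.active:
            self.active.move_to_end(page)
            return True
        if page in self.inactive:
            # Second reference: the page has proven useful, so promote it.
            del self.inactive[page]
            self.active[page] = True
            return True
        if len(self.inactive) + len(self.active) >= self.capacity:
            self._evict()
        self.inactive[page] = True
        return False

    def _evict(self):
        # Used-once pages go first; active pages only when nothing else is left.
        if self.inactive:
            self.inactive.popitem(last=False)
        else:
            self.active.popitem(last=False)


def surviving_hot_pages(capacity=8, hot_count=4, stream_len=100):
    """Touch hot pages twice, stream read-once pages, then count hot hits."""
    cache = UseOnceCache(capacity)
    hot = ["hot%d" % i for i in range(hot_count)]
    for _ in range(2):
        for page in hot:
            cache.access(page)
    for i in range(stream_len):
        cache.access("stream%d" % i)
    return sum(cache.access(page) for page in hot)
```

With the defaults, all four hot pages are still resident after the stream.
A workload that touches each streamed page twice would defeat this toy model,
which is one way to picture a corner case.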

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 20:11             ` Wakko Warner
@ 2004-05-27  5:59               ` Nick Piggin
  2004-05-27 14:34                 ` Wakko Warner
  0 siblings, 1 reply; 146+ messages in thread
From: Nick Piggin @ 2004-05-27  5:59 UTC (permalink / raw)
  To: Wakko Warner; +Cc: linux-kernel

Wakko Warner wrote:
>>>Come on, that is quite an exaggeration.  It can happen in a span of 
>>>minutes -- after rsyncing a dir to a backup dir, for example, which 
>>>fills ram rather quickly with cache I'll never use again.  Or after 
>>>configuring and compiling a package, which does the same thing.
>>>
>>
>>rsync is something known to break the VM's use-once heuristics.
>>I'm looking at that.
> 
> 
> I have a question about that.  I keep a Debian mirror on one of my machines. 
> There are over 70000 files.  If I run find on that tree while it's
> downloading the file list, it doesn't take as long.  I thought it would be
> nice if there was some way I could keep that in memory.  The box has 256MB
> of RAM and no swap.  It is configured as diskless.
> 

You mean that if you prime the cache by running find on the tree,
your actual operation doesn't take as long?

I don't doubt this. Slab cache is shrunk aggressively compared to
page cache. Traditionally I think this has been due at least in
part to some failure cases in the balancing there resulting in slab
growing out of control with some systems.

These failure cases should be fixed now, and slab vs pagecache is
probably something that should be looked at again. I really need
to get my hands on a 2GB+ system before I'd be game to start
fiddling with too much stuff though.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 12:27                     ` Matthias Schniedermeyer
@ 2004-05-27  5:38                       ` Nick Piggin
  0 siblings, 0 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-27  5:38 UTC (permalink / raw)
  To: Matthias Schniedermeyer; +Cc: linux-kernel

Matthias Schniedermeyer wrote:
> On Wed, May 26, 2004 at 09:19:40PM +1000, Nick Piggin wrote:

>>OK, this is obviously bad. Do you get this behaviour with 2.6.5
>>or 2.6.6? If so, can you strace the program while it is writing
>>an ISO? (just send 20 lines or so). Or tell me what program you
>>use to create them and how to create one?
> 
> 
> program: mkisofs
> kernel: 2.4.4-2.4.25, 2.6.4-2.6.6
> (To put it another way, I have never seen or felt a difference in 3 years.
> So if there is a difference, I just didn't realize there is one.)
> The current kernel is 2.6.5, as 2.6.6 sometimes just "hangs".
> 
> Just throw together some large files (my files are all >= 350MB; the
> "typical" case is about 4-5 files of 800-1000MB each) and then
> mkisofs -J -r -o <image> <source-dir>
> I store the image files on another HDD to get the best possible throughput.
> My HDDs (these are "normal" IDE-HDDs) are capable of delivering about
> 35-40MB/s, the last time i measured i got about 70MB/s aggregated
> throughput while creating an image-file.
> 

Thanks. I'll see if I can reproduce.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
@ 2004-05-27  5:37 Nick Piggin
  2004-05-27 17:27 ` Buddy Lumpkin
  0 siblings, 1 reply; 146+ messages in thread
From: Nick Piggin @ 2004-05-27  5:37 UTC (permalink / raw)
  To: Buddy Lumpkin
  Cc: 'John Bradford', 'William Lee Irwin III',
	orders, linux-kernel

Buddy Lumpkin wrote:
>>>Couple that with the fact that there are many pte's pointing at the same
>>>physical page (shared page) in many cases where many processes 
>>>
>>>are running
>>>on the system. Because all of the references to that page must be removed
>>>before the page can be evicted, there are some absolute 
>>>limitations in the
>>>rate that pages can be evicted from memory as the number of processes
>>>running on the system and the total amount of memory increases.
>>>
> 
> 
>>This is still many orders of magnitude faster than filling the page
>>from disk, and you typically don't reclaim much of mapped memory anyway.
> 
> 
> This discussion went broke-minded again. You're still picturing that single
> IDE hard drive in your workstation and I'm talking about big iron, large
> databases, etc.. where the total amount of aggregate disk I/O is completely
> limited by the rate you can evict pages from the pagecache.
> 
> Picture 6 to 7 fibre channel cards with over 70% utilization during peak
> usage times connected to a large EMC storage array with 64GB of non-volatile
> cache.
> 

I can picture it but I don't know how the kernel is going to handle
it. All I am doing is speaking from what I have seen.

http://marc.theaimsgroup.com/?l=linux-kernel&m=107817776322044&w=2

This post for example, has profiles of a 32 CPU system with 16 FC
controllers and over 1000 disks, doing some database workload. Does
this qualify as big iron?

In the bottom profile, you see the disks being kept busy with 50%
idle time. The top 6 functions are all to do with generating IO
requests and pushing them through the block layer, none of them
involve memory reclaim.

There are profiles from a different setup in a related thread here:

http://groups.google.com.au/groups?q=g:thl3816668183d&dq=&hl=en&lr=&ie=UTF-8&selm=1yjKu-7qU-1%40gated-at.bofh.it&rnum=9

I think we see kmem_cache_alloc make a miserable showing for the
memory allocation team, but it wouldn't even be there if the
profile were sorted by ticks (the left hand column).


Now if you had some experiences of memory reclaim slowing down
block IO, I'd love to hear them because that is related to an area
that I am looking at currently. I'm not saying what you claim is
impossible, but it is something that shouldn't happen and we don't
really see... You're continuing to insist there is a problem but
that simply isn't helpful without further details.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 12:37                     ` Matthias Schniedermeyer
  2004-05-26 13:06                       ` Gianni Tedesco
  2004-05-26 13:55                       ` Buddy Lumpkin
@ 2004-05-27  5:14                       ` Tom Felker
  2004-05-27  6:02                         ` Nick Piggin
                                           ` (2 more replies)
  2 siblings, 3 replies; 146+ messages in thread
From: Tom Felker @ 2004-05-27  5:14 UTC (permalink / raw)
  To: Matthias Schniedermeyer; +Cc: Nick Piggin, linux-kernel

On Wednesday 26 May 2004 7:37 am, Matthias Schniedermeyer wrote:

> program to kernel: "I read ONCE through this file; caching is not useful".

Very true.  The system is based on the assumption that just-used pages are 
more useful than older pages, and it slows when this isn't true.  We need 
ways to tell the kernel whether the assumption holds.

(What follows are progressively more impossible ideas that I have no idea how 
to implement.)

O_STREAMING and a flag to not cache a file when it closes are a good start.  

It would also be useful to do this on a per-process basis.  For example, you 
could set a running shell so that its (and its children's) files are 
O_STREAMING, and use that shell to launch your one-time greps.

Ulimit could set a limit on how much cache a process and its children could 
use.  (How much overhead would this entail?)  That would take the place 
of the above, and it might also be useful for shell server admins who don't 
want one user trashing everyone's interactivity.

Most drastic would be to change the way to choose pages to throw out.  
Different processes or pages could have different priorities, so you could 
mark interactive processes as keepers even if you haven't used them in days.

It's probably impossible because the kernel only knows about faults, but you 
could give frequently but not recently used pages (your day-old browser 
window) priority over recently but not frequently used pages (your one-time 
grep).  You'd also need a way to allow cache to grow, which this would 
otherwise curtail.
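
One userspace approximation of the read-once hint discussed above, where the
platform provides posix_fadvise(), is to advise the kernel per file
descriptor. The sketch below is not O_STREAMING itself; the function name and
chunk size are illustrative assumptions, and the calls are only hints that
the kernel may ignore.

```python
import os


def _advise(fd, offset, length, advice_name):
    """Issue a posix_fadvise hint if available; hints are advisory, so ignore errors."""
    advice = getattr(os, advice_name, None)
    if advice is None or not hasattr(os, "posix_fadvise"):
        return
    try:
        os.posix_fadvise(fd, offset, length, advice)
    except OSError:
        pass


def read_once(path, chunk_size=1 << 20):
    """Read a file front to back, asking the kernel to drop each chunk after use.

    Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Length 0 means "to the end of the file".
        _advise(fd, 0, 0, "POSIX_FADV_SEQUENTIAL")
        offset = 0
        while True:
            data = os.read(fd, chunk_size)
            if not data:
                break
            total += len(data)
            # We will not come back to this range, so its cached pages can go.
            _advise(fd, offset, len(data), "POSIX_FADV_DONTNEED")
            offset += len(data)
    finally:
        os.close(fd)
    return total
```

This only covers one program that opts in; it does nothing for the
per-process or per-user limits suggested above.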

-- 
Tom Felker, <tcfelker@mtco.com>
<http://vlevel.sourceforge.net> - Stop fiddling with the volume knob.

Alchemists became chemists when they stopped keeping secrets.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 16:58                   ` John Bradford
@ 2004-05-26 23:32                     ` Kyle Moffett
  2004-05-27  8:05                       ` John Bradford
  0 siblings, 1 reply; 146+ messages in thread
From: Kyle Moffett @ 2004-05-26 23:32 UTC (permalink / raw)
  To: John Bradford; +Cc: linux-kernel, David Schwartz

On May 26, 2004, at 12:58, John Bradford wrote:
>> 	A lot of people feel subjectively that swap makes a system slow. 
>> There's
>> anecdotal evidence that swap does horrible things or "must be badly 
>> broken
>> because the machine gets slow" on almost every operating system that
>> supports swapping. In most cases, it's just a case where the real 
>> working
>> set has exceeded physical memory, and in that case, swap is just 
>> doing what
>> it's supposed to be doing.
> It's true that physical RAM or swap, over and above the minimum needed 
> for
> the working set is usually beneficial.  However where there is 
> physical RAM
> which will never be touched during normal usage, adding swap will not 
> be
> beneficial.

If your RAM happens to be large enough to contain not only everything 
on disk
you ever want to even read *and* all the space you need for 
calculations, then
you have nothing to gain from using swap.  On the other hand, if you 
are say,
grepping through a kernel source tree, the first time it is read from 
disk, but after
that it is stored in cache in your RAM.  If you have swap, anonymous 
pages of
RAM that are not in use can be paged out while you do your grepping, 
even if
you are grepping through a 900MB+ dataset and only have 1GB RAM.  Swap
allows non-filesystem-backed pages to be pushed to disk for some 
filesystem
backed pages to be loaded and used.

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:58           ` Nick Piggin
@ 2004-05-26 20:11             ` Wakko Warner
  2004-05-27  5:59               ` Nick Piggin
  0 siblings, 1 reply; 146+ messages in thread
From: Wakko Warner @ 2004-05-26 20:11 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel

> > Come on, that is quite an exaggeration.  It can happen in a span of 
> > minutes -- after rsyncing a dir to a backup dir, for example, which 
> > fills ram rather quickly with cache I'll never use again.  Or after 
> > configuring and compiling a package, which does the same thing.
> > 
> 
> rsync is something known to break the VM's use-once heuristics.
> I'm looking at that.

I have a question about that.  I keep a Debian mirror on one of my machines. 
There are over 70000 files.  If I run find on that tree while it's
downloading the file list, it doesn't take as long.  I thought it would be
nice if there was some way I could keep that in memory.  The box has 256MB
of RAM and no swap.  It is configured as diskless.
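
The priming effect described above can be reproduced with a small sketch that
walks the tree and calls lstat on every entry, which warms the same metadata
caches a find run would. The function name is hypothetical, and nothing here
pins the entries in memory: the kernel remains free to reclaim them.

```python
import os


def prime_metadata_cache(root):
    """Walk a tree and lstat every entry; return how many entries were touched."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            count += 1
    return count
```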

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 16:33                 ` David Schwartz
@ 2004-05-26 16:58                   ` John Bradford
  2004-05-26 23:32                     ` Kyle Moffett
  0 siblings, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-05-26 16:58 UTC (permalink / raw)
  To: David Schwartz, linux-kernel

> 	A lot of people feel subjectively that swap makes a system slow. There's
> anecdotal evidence that swap does horrible things or "must be badly broken
> because the machine gets slow" on almost every operating system that
> supports swapping. In most cases, it's just a case where the real working
> set has exceeded physical memory, and in that case, swap is just doing what
> it's supposed to be doing.

It's true that physical RAM or swap, over and above the minimum needed for
the working set is usually beneficial.  However where there is physical RAM
which will never be touched during normal usage, adding swap will not be
beneficial.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 10:45               ` Martin Olsson
  2004-05-26 11:25                 ` Nick Piggin
@ 2004-05-26 16:33                 ` David Schwartz
  2004-05-26 16:58                   ` John Bradford
  1 sibling, 1 reply; 146+ messages in thread
From: David Schwartz @ 2004-05-26 16:33 UTC (permalink / raw)
  To: linux-kernel


> I agree with Anthony Disante, maybe not all users want swapping. I have
> myself felt very annoyed with swapping lately but I've not yet tried to
> disable it.

	You're probably really just annoyed with physical memory that's too small
to hold your working set. Believe it or not, having swap delays the onset of
this problem.

> In school I've studied the swapping concept from a theoretical point
> of view, and I fully understand the fact that swapping, if used
> properly, can both increase performance and provide a safe way to get
> out of a bad situation when the box runs out of memory. The problem is
> that in reality this does not work, not on Linux nor on Windows 2000
> which I use at home. Unfortunately I cannot provide a specific reason
> why it does not work, I'm very much a end-user/desktop-user, I'm not a
> kernel hacker (yet). But I see two things that needs improvement atm:

	I don't think you really do understand it from a theoretical point of view,
because you say:

> A) when I do large data processing operations the computer is always
> very very slow afterwards

	Of course, this is because the working set has changed. However, with swap,
the least used pages can be evicted from physical memory. Without it, there
may be no place to put the least used pages and more frequently used pages
have to be evicted.

> B) if I have X Mb of RAM then there should not be imho a single swap
> read/write until the whole of my X Mb RAM is completely stuffed, is this
> so today?

	It depends what you mean by "stuffed". On a modern operating system like
Linux, pretty much all of your physical memory is in use all the time.
Without swap, dirty pages cannot be evicted from physical memory, even if
they haven't been used for days. If your physical memory exceeds your
working set size, you win no matter what. But without swap, every dirty page
is part of your working set, even if it hasn't been read/written for days.

> Also, imagine that I disable swap today and start a large data
> processing operation. During this operation I try to start a new
> process, here ideally the program should not OOM but instead the memory
> allocated for the data processing operation should be decreased. Is this
> possible using today's technology? Can we divide memory into two sorts,
> one for processes (here to stay memory) and another sort for batch
> operations (where the amount of memory does not really matter but less
> memory means less performance). I see the problem with "taking memory
> back" though, I guess its impossible.

	No, it's not difficult. The OS takes physical memory back all the time by
swapping.

	You seem to be missing a fundamental concept. Physical memory will always
get full because the OS will always keep copies of file data in memory just
in case it needs them again. Because new pages are always being read in and
processes are always allocating new memory, the OS will have to make a
decision of what pages to evict from physical memory. If a page is dirty, it
can only be evicted if there's swap. So if you have dirty pages that are
very rarely used, swap allows you to keep more hot, clean pages in memory.

	A lot of people feel subjectively that swap makes a system slow. There's
anecdotal evidence that swap does horrible things or "must be badly broken
because the machine gets slow" on almost every operating system that
supports swapping. In most cases, it's just a case where the real working
set has exceeded physical memory, and in that case, swap is just doing what
it's supposed to be doing.
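
The argument above can be illustrated with a toy model. This is a sketch
under simplifying assumptions, not how the kernel's reclaim works: memory is
a strict LRU of page frames, file-backed pages are always clean and
evictable, and anonymous pages can be evicted only when swap exists. The
function names and the trace are made up for illustration.

```python
from collections import OrderedDict


def count_faults(trace, frames, swap):
    """Replay (kind, id) page accesses through an LRU memory; return the fault count."""
    resident = OrderedDict()  # page -> None, least recently used first
    faults = 0
    for page in trace:
        if page in resident:
            resident.move_to_end(page)
            continue
        faults += 1
        if len(resident) >= frames:
            # Pick the least recently used page that is allowed to leave memory.
            for victim in resident:
                if victim[0] == "file" or swap:
                    break
            else:
                raise MemoryError("nothing evictable: out of memory")
            del resident[victim]
        resident[page] = None
    return faults


def demo_trace(rounds=10):
    """Four anonymous pages touched once, then six file pages read in a loop."""
    idle = [("anon", i) for i in range(4)]
    hot = [("file", i) for i in range(6)]
    return idle + hot * rounds


def compare(frames=8):
    """Return (faults with swap, faults without swap) for the demo trace."""
    trace = demo_trace()
    return count_faults(trace, frames, True), count_faults(trace, frames, False)
```

With eight frames, compare() returns (10, 64). With swap, the four idle
anonymous pages are pushed out and the six hot file pages fit. Without swap,
the idle pages pin half of memory and every file access misses.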

	DS



^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 12:37                     ` Matthias Schniedermeyer
  2004-05-26 13:06                       ` Gianni Tedesco
@ 2004-05-26 13:55                       ` Buddy Lumpkin
  2004-05-27  5:14                       ` Tom Felker
  2 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 13:55 UTC (permalink / raw)
  To: 'Matthias Schniedermeyer', 'Nick Piggin'; +Cc: linux-kernel

Well for mmapped pages, man madvise. Specifically look at MADV_SEQUENTIAL
and MADV_DONTNEED.
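
A minimal sketch of those two hints on a read-only mapping, assuming a
platform and Python version where mmap objects expose madvise(). The function
name and chunk size are illustrative. Note that on a shared file mapping
MADV_DONTNEED drops this process's mapping of the pages; it does not by
itself guarantee that they leave the page cache.

```python
import mmap
import os


def checksum_mapped_file(path, chunk_size=1 << 20):
    """Sum every byte of a file through a mapping, hinting sequential use."""
    if chunk_size % mmap.PAGESIZE:
        raise ValueError("chunk_size must be a multiple of the page size")
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as m:
            can_advise = hasattr(m, "madvise")
            if can_advise and hasattr(mmap, "MADV_SEQUENTIAL"):
                # We read front to back, so aggressive readahead is welcome.
                m.madvise(mmap.MADV_SEQUENTIAL)
            for start in range(0, size, chunk_size):
                chunk = m[start:start + chunk_size]
                total += sum(chunk)
                if can_advise and hasattr(mmap, "MADV_DONTNEED"):
                    # Done with this range; start is page aligned by construction.
                    m.madvise(mmap.MADV_DONTNEED, start, len(chunk))
    return total
```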

--Buddy

http://lxr.linux.no/source/mm/madvise.c?v=2.6.5#L92
 

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Matthias
Schniedermeyer
Sent: Wednesday, May 26, 2004 5:38 AM
To: Nick Piggin
Cc: linux-kernel@vger.kernel.org
Subject: Re: why swap at all?

On Wed, May 26, 2004 at 09:19:40PM +1000, Nick Piggin wrote:
> Matthias Schniedermeyer wrote:
> >On Wed, May 26, 2004 at 08:33:28PM +1000, Nick Piggin wrote:
> 
> OK, this is obviously bad. Do you get this behaviour with 2.6.5
> or 2.6.6? If so, can you strace the program while it is writing
> an ISO? (just send 20 lines or so). Or tell me what program you
> use to create them and how to create one?

To use other words, this is the typical case where a "hint" would be
useful.

program to kernel: "I read ONCE through this file; caching is not useful".

The last thing I knew of in this area is that there exists a way to tell
the kernel to drop all cache after the file is closed. (IIRC!)

But this doesn't help in this case, as the image file is up to 4.4GB in
total, which means that it ALONE can fill up the whole cache. That is
leaving aside the files the image was created from, which can (with a size
of up to 2GB (size limit of iso9660-filesystem/linux-kernel)) also fill a
lot of cache until they are closed.

(The/My) typical case is this.
1 create image-file
2 remove source-files
3 burn image
4 remove image-file

Step 1 and 3 trash the cache without ANY positive effect.



Bis denn

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 12:19           ` Denis Vlasenko
@ 2004-05-26 13:48             ` Buddy Lumpkin
  0 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 13:48 UTC (permalink / raw)
  To: 'Denis Vlasenko'; +Cc: orders, linux-kernel

I like that, pull out the Hobby OS/small purpose card when it's convenient,
but a lot of the plans I see talked about for the kernel revolve around
features that are needed to support the type of applications that are being
stressed by Corporate America. That's where a significant amount of money
is being poured (directly or indirectly) into Linux right now, and that is
where the bulk of the next level of challenges in terms of kernel
development are going to come from. 

Linux is already a modern OS in most respects, the next logical programming
challenges are solving some of the vertical scaling issues, even as most of
these applications are moving in a growing trend toward scaling
horizontally.

--Buddy



-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Denis Vlasenko
Sent: Wednesday, May 26, 2004 5:20 AM
To: Buddy Lumpkin
Cc: orders@nodivisions.com; linux-kernel@vger.kernel.org
Subject: RE: why swap at all?

On Wednesday 26 May 2004 15:07, Buddy Lumpkin wrote:
> those environments horizontally in most cases. The biggest performance
> problems to solve (that people care about and are willing to pay $$ to
> solve) are for the large databases that run Corporate America. There are
> certainly scientific applications where performance is critical and there
> are dollars to fund improvement as well, but their numbers don't compare
to
> the number of Oracle instances out there running in the Enterprise.

Oh yeah, poor Corporate America. That's what we should care most about.

> Optimizing the performance of swap operations for even a small tradeoff in
> performance for memory operations that take place entirely in physical
> memory is just a broke minded, brain dead direction in the year 2004 IMHO.

Sorry Buddy. I am _not_ Corporate America.
I have 4 boxes at work and 5 boxes at home,
and only one of them can be safely run swapless. It's a router.
--
vda


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 13:06                       ` Gianni Tedesco
@ 2004-05-26 13:41                         ` Matt H.
  0 siblings, 0 replies; 146+ messages in thread
From: Matt H. @ 2004-05-26 13:41 UTC (permalink / raw)
  To: Gianni Tedesco; +Cc: Matthias Schniedermeyer, Nick Piggin, linux-kernel

I believe it was a 2.4 patch; it's still around somewhere. I can find it and
post it, if it's still relevant.

Matt H.

On Wednesday 26 May 2004 6:06 am, Gianni Tedesco wrote:
> On Wed, 2004-05-26 at 13:37, Matthias Schniedermeyer wrote:
> > On Wed, May 26, 2004 at 09:19:40PM +1000, Nick Piggin wrote:
> > > Matthias Schniedermeyer wrote:
> > > >On Wed, May 26, 2004 at 08:33:28PM +1000, Nick Piggin wrote:
> > >
> > > OK, this is obviously bad. Do you get this behaviour with 2.6.5
> > > or 2.6.6? If so, can you strace the program while it is writing
> > > an ISO? (just send 20 lines or so). Or tell me what program you
> > > use to create them and how to create one?
> >
> > To use other words, this is the typical case where a "hint" would be
> > useful.
> >
> > program to kernel: "I read ONCE through this file; caching is not useful".
>
> Wasn't there an O_STREAMING patch thrown around towards the beginning of
> the 2.5 development cycle?

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 13:00 ` Satoshi Oshima
@ 2004-05-26 13:38   ` William Lee Irwin III
  0 siblings, 0 replies; 146+ messages in thread
From: William Lee Irwin III @ 2004-05-26 13:38 UTC (permalink / raw)
  To: Satoshi Oshima; +Cc: orders, linux-kernel

On Wed, May 26, 2004 at 10:00:06PM +0900, Satoshi Oshima wrote:
> I really agree. And I think swapoff is not enough. Some of my
> customers have over 4GB of memory. RDBMS, Java Virtual Machine or Grid
> system (like Globus tool kit) run on the servers. Those kinds of
> application make a lot of threads and they have huge amount of shared
> memory. And those shared memory is sometimes mlocked. I think, in
> those systems, memory aging itself is useless or obstructive in worst
> case. Because mlocked pages which can't be swapped off are on the LRU
> list. In such case, aging-off (relevant to process) is effective, I
> think. Of course, I agree that swap-off or aging-off is NEVER always
> useful. On the contrary, these functions may be required by very
> small number of user. But it is very important that we can choose 
> how we use the OS.

Could you try CONFIG_SWAP=n to see if that makes a difference?
More aggressive non-paging methods could be devised if not, e.g.
CONFIG_MMU=n support of various kinds for hardware supporting paging
and virtual memory (this is a suggestion, not an offer to implement).


-- wli

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 12:33           ` Richard B. Johnson
@ 2004-05-26 13:25             ` Buddy Lumpkin
  0 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 13:25 UTC (permalink / raw)
  To: root; +Cc: 'Denis Vlasenko', orders, 'Linux kernel'


> Gentlemen,

> There is not enough RAM address-space in even 64-bit machines
> to do a sort/merge of even a typical inventory with all the
> keys present in RAM. So you need multiple tasks, each with
> as much of the 64-bit address-space occupied by RAM, as
> possible. Even then, you need to do partial sorts, etc.

Ironically, a Fortune 500 company I left very recently is famous for its
inventory system, which was implemented in the last three years. If
someone were to assume I am exaggerating, a search for my name in Google
Groups would likely reveal which company that is, and looking up news about
them at finance.yahoo.com would likely turn up a few articles about the
attaboys they have received for their inventory system and what it has done
for the company. 

I was the primary system admin/engineer for this system and it only occupied
roughly 1TB in a single database instance. 1TB would certainly fit in a
64-bit address space. While they didn't have a zillion sku's like a company
like Walmart would, their skus change on a regular basis and change at the
store level while information about a jar of mayonnaise or a desk in most
companies can stay quite static. Where I am going with this is I doubt there
are many inventory systems out there that run much in excess of a few
Terabytes.


> It's not "bloat-ware" that requires getting as much free RAM
> as possible for an application, but the business of doing business.
> So, performance of data-intensive work such as the sort/merge
> is improved by writing the contents of sleeping tasks RAM to
> a storage device and using that RAM. It's just that simple.

Again, my expectation is that most large database instances out there will
happily fit in a 64-bit address space. Ironically, while code tends to run
slower on 64-bit architectures because of reasons like having half as many
cache lines because of the larger word size, byte packing, etc.. The ability
to do a hash join in memory of two insanely large tables that wouldn't fit
into a 32-bit address space easily negates this small issue. So in that
regard, a 64-bit address space results in a performance boost provided the
DBA knows how to leverage the features.

--Buddy


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 10:40         ` Buddy Lumpkin
@ 2004-05-26 13:15           ` Helge Hafting
  0 siblings, 0 replies; 146+ messages in thread
From: Helge Hafting @ 2004-05-26 13:15 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: linux-kernel, linux-kernel

Buddy Lumpkin wrote:

>>Hi Buddy,
>>Even for systems that don't *need* the extra memory space, swap can
>>actually provide performance improvements by allowing unused memory
>>to be replaced with often-used memory.
>>    
>>
>
>  
>
>>For example, I have 57MB swapped right now. It allows me to instantly
>>grep the kernel tree. If I turned swap off, each grep would probably
>>take 30 seconds.
>>    
>>
>
>Your analogy is flawed. There are many reasons why this doesn't work in the
>real world.
>
>I don't think any modern and popular OS contains mechanisms that silently
>stage old pages to disk.
>
Linux is modern and popular . . .

> The constant twitching of the hard drive this
>causes for no apparent reason drives people insane 
>
Stupid people then. If they really expect the disk to work
only when they hit save or start up something.  Sheesh.

>and drains precious
>battery life on laptops. (see description for the pages_min, pages_low and
>pages_high watermarks for clarity)
>  
>
This is a valid concern. Laptop users may want to sacrifice performance
for battery life. Linux can be tweaked quite a bit for this, more
development is probably a good idea. We who use AC power don't
want a performance loss on our machines though, so any such tweaks
must be optional.


[...]

>One thing that can be done to minimize the problem where heavy filesystem
>I/O flushes important pages from memory like pages from shared libraries and
>executables only for them to fault back in as soon as they become runnable,
>is to implement something similar to what Sun implemented in Solaris 8
>called the cyclical page cache. The idea is that the pagecache pages against
>itself and is actually considered free memory from an anonymous memory
>perspective. The pagecache is free to grow all it wants, but since it is
>counted as free memory, anonymous memory allocation will cause the pagecache
>to shrink because it is considered free memory.
>  
>
Linux counts cache as free memory too, of course. 
Allocate memory, and cache will go away.
It has been like this for many years.

Helge Hafting

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 12:37                     ` Matthias Schniedermeyer
@ 2004-05-26 13:06                       ` Gianni Tedesco
  2004-05-26 13:41                         ` Matt H.
  2004-05-26 13:55                       ` Buddy Lumpkin
  2004-05-27  5:14                       ` Tom Felker
  2 siblings, 1 reply; 146+ messages in thread
From: Gianni Tedesco @ 2004-05-26 13:06 UTC (permalink / raw)
  To: Matthias Schniedermeyer; +Cc: Nick Piggin, linux-kernel

On Wed, 2004-05-26 at 13:37, Matthias Schniedermeyer wrote:
> On Wed, May 26, 2004 at 09:19:40PM +1000, Nick Piggin wrote:
> > Matthias Schniedermeyer wrote:
> > >On Wed, May 26, 2004 at 08:33:28PM +1000, Nick Piggin wrote:
> > 
> > OK, this is obviously bad. Do you get this behaviour with 2.6.5
> > or 2.6.6? If so, can you strace the program while it is writing
> > an ISO? (just send 20 lines or so). Or tell me what program you
> > use to create them and how to create one?
> 
> To use other words, this is the typical case where a "hint" would be
> useful.
> 
> program to kernel: "I read ONCE through this file; caching is not useful".

Wasn't there an O_STREAMING patch thrown around towards the beginning of
the 2.5 development cycle?

-- 
// Gianni Tedesco (gianni at scaramanga dot co dot uk)
lynx --source www.scaramanga.co.uk/scaramanga.asc | gpg --import
8646BE7D: 6D9F 2287 870E A2C9 8F60 3A3C 91B5 7669 8646 BE7D


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:40   ` John Bradford
@ 2004-05-26 13:06     ` Helge Hafting
  0 siblings, 0 replies; 146+ messages in thread
From: Helge Hafting @ 2004-05-26 13:06 UTC (permalink / raw)
  To: John Bradford; +Cc: orders, linux-kernel

John Bradford wrote:

>Quote from Helge Hafting <helgehaf@aitel.hist.no>:
>  
>
>>Anthony DiSante wrote:
>>
>>    
>>
>>>As a general question about ram/swap and relating to some of the 
>>>issues in this thread:
>>>
>>>    ~500 megs cached yet 2.6.5 goes into swap hell
>>>
>>>Consider this: I have a desktop system with 256MB ram, so I make a 
>>>256MB swap partition.  So I have 512MB "memory" and if some process 
>>>wants more, too bad, there is no more.
>>>
>>>Now I buy another 256MB of ram, so I have 512MB of real memory.  Why 
>>>not just disable my swap completely now?  I won't have increased my 
>>>memory's size at all, but won't I have increased its performance lots? 
>>>      
>>>
>>This is correct. You now have 512M of fast memory instead of
>>256M fast memory and 256M "slow" memory. You don't _need_ to have additional
>>swap, but it is usually a good idea.  If you keep your 256M of swap, then
>>you now have 512M fast memory + 256M slow memory for a total of 768M.
>>This is even better.
>>    
>>
>
>I strongly disagree on the last point.  It may be better, but it may also
>be a lot worse.  Too much swap can be a bad thing - see my example in another
>post about run-away processes on remote machines.
>  
>
Well, way too much swap is of course not good.  A swap that is half
the size of memory isn't that bad - at any time, at least two thirds of
what you want in memory is bound to be there. 10x as much swap as memory
might be bad though.
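
The "two thirds" figure above is simple arithmetic, and a small sketch
makes it easy to try other ratios. It assumes the worst case where RAM
and swap are both completely full; the function name ram_fraction is
made up for this illustration.

```python
from fractions import Fraction


def ram_fraction(ram_mb: int, swap_mb: int) -> Fraction:
    """Share of all allocated memory that sits in RAM when RAM and swap are both full."""
    return Fraction(ram_mb, ram_mb + swap_mb)


# Swap half the size of RAM: two thirds of everything is still in RAM.
print(ram_fraction(512, 256))   # 2/3
# Swap ten times the size of RAM: only one eleventh can be in RAM.
print(ram_fraction(512, 5120))  # 1/11
```

In practice the resident share is usually higher than this bound,
because swap is rarely full.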

>>Please note that  your machine _will_ do one kind of swapping even if you
>>don't configure any swap: Executable files are a kind of swap-files,
>>if memory pressure happens then (part of) your programs will be evicted
>>from memory _because_ they can be reloaded from their executables.
>>
>>This causes the same sort of performance degradation as swapping to
>>a swap partition.  Actually, it is worse because swapping to a swap partition
>>allows swapping out little-used writeable memory before discarding
>>program code that might see more use.  So if swapping happens, then
>>you're better off with a swap partition because then it is the least used
>>stuff that goes first. Without a swap partition, the least used program code
>>goes, but it may or may not be the least used memory overall.
>>    
>>
>
>Again, the user _may_ be better off swapping to a swap partition rather than
>having executable code paged out, but this is not necessarily true in all
>circumstances.
>  
>
The problem is that swapping happens.  A small swap (no more than
the amount you accept being swapped out) ensures that the
paging code can select the best page for eviction, rather than
being forced to evict code.

If you worry about runaway processes and/or troublesome users,
use ulimit.

Helge Hafting


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:23   ` John Bradford
  2004-05-26  9:30     ` Roger Luethi
@ 2004-05-26 13:01     ` Helge Hafting
  1 sibling, 0 replies; 146+ messages in thread
From: Helge Hafting @ 2004-05-26 13:01 UTC (permalink / raw)
  To: John Bradford; +Cc: Roger Luethi, Anthony DiSante, linux-kernel

John Bradford wrote:

>Quote from Roger Luethi <rl@hellgate.ch>:
>  
>
>>On Wed, 26 May 2004 02:38:23 -0400, Anthony DiSante wrote:
>>    
>>
>>>Now I buy another 256MB of ram, so I have 512MB of real memory.  Why not 
>>>just disable my swap completely now?  I won't have increased my memory's 
>>>size at all, but won't I have increased its performance lots?
>>>
>>>Or, to make it more appealing, say I initially had 512MB ram and now I have 
>>>1GB.  Wouldn't I much rather not use swap at all anymore, in this case, on 
>>>my desktop?
>>>      
>>>
>>Swap serves another (often underrated) purpose: Graceful degradation.
>>
>>If you have a reasonable amount of swap space mounted, you will know
>>you are running out of RAM because your system will become noticeably
>>slower. If you have no swap whatsoever, your first warning will quite
>>possibly be an application OOM killed or losing data due to a failed
>>memory allocation.
>>
>>Think of the slowness of swap as a _feature_.
>>    
>>
>
>There is a very negative side to this approach as well, especially if users
>allocate excessive swap space.
>
>A run-away process on a server with too much swap can cause it to grind to
>almost a complete halt, and become almost completely unresponsive to remote
>connections.
>
>If the total amount of storage is just enough for the tasks the server is
>expected to deal with, then a run-away process will likely be terminated
>quickly stopping it from causing the machine to grind to a halt.
>  
>
No.  Something will be terminated, not necessarily the "evil"
process. A runaway process can have quite a few server processes
killed before it eventually dies.   An attacker will of course
use processes that fork, so some remain and can keep
consuming memory and CPU time.

ulimit is the way of limiting how much memory a user can use,
preventing this scenario.  Not setting up swap isn't.
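
As a rough sketch of what such a limit does: the shell's "ulimit -v"
corresponds to the RLIMIT_AS resource limit, which can also be set from
a program before it starts a child. The helper name run_limited and the
1 GiB / 2 GiB figures are made up for this example, and it assumes a
64-bit Linux machine. Note that the limit applies per process and is
inherited by children; it is not a per-user total.

```python
import resource
import subprocess
import sys


def run_limited(argv, max_bytes):
    """Run argv with its address space capped at max_bytes; return the exit status."""
    def apply_limit():
        # Runs in the child just before exec. Lower only the soft limit
        # and never ask for more than the existing hard limit allows.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          stderr=subprocess.DEVNULL).returncode


GIB = 1 << 30
# A stand-in for a runaway process: it asks for 2 GiB in one allocation.
hog = [sys.executable, "-c", "buf = bytearray(2 * (1 << 30))"]
print("exit status of limited hog:", run_limited(hog, 1 * GIB))
```

With the cap in place the allocation fails inside the hog itself, so
the process that misbehaves is the one that dies, rather than whatever
the kernel picks when memory runs out.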

>If, on the other hand, there is excessive storage, it can continue running
>for a long time, often consuming a lot of CPU.
>
>When the excess storage is physical RAM, this might not be particularly
>disastrous, but if it's swap space, it's much more likely to cause a serious
>drop in performance.
>
>  
>
Well, the process (or processes) can consume lots of cpu time
without swap too.  It can cause server processes to get killed (or
not get memory they need) and it can cause slowdowns by
evicting nearly all program code from memory.

>For a desktop system, it might not be a big deal, but when it's an ISP's server
>in a remote data centre, it can create a lot of unnecessary work.
>
>  
>
Definitely use ulimit on such a machine.

Helge Hafting

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  6:38 Anthony DiSante
                   ` (5 preceding siblings ...)
  2004-05-26 10:02 ` Raphael Jacquot
@ 2004-05-26 13:00 ` Satoshi Oshima
  2004-05-26 13:38   ` William Lee Irwin III
  6 siblings, 1 reply; 146+ messages in thread
From: Satoshi Oshima @ 2004-05-26 13:00 UTC (permalink / raw)
  To: orders, linux-kernel

Anthony DiSante <orders@nodivisions.com> wrote:
> As a general question about ram/swap and relating to some of the issues in
> this thread:
> 
>       ~500 megs cached yet 2.6.5 goes into swap hell
> 
> Consider this: I have a desktop system with 256MB ram, so I make a 256MB 
> swap partition.  So I have 512MB "memory" and if some process wants more, 
> too bad, there is no more.
> 
> Now I buy another 256MB of ram, so I have 512MB of real memory.  Why not 
> just disable my swap completely now?  I won't have increased my memory's 
> size at all, but won't I have increased its performance lots?
> 
> Or, to make it more appealing, say I initially had 512MB ram and now I have
> 1GB.  Wouldn't I much rather not use swap at all anymore, in this case, on
> my desktop?

I really agree. And I think swapoff is not enough.

Some of my customers have over 4GB of memory. RDBMS,
Java Virtual Machine or Grid systems (like the Globus
Toolkit) run on the servers.
Those kinds of applications create a lot of threads and
they have a huge amount of shared memory. And that
shared memory is sometimes mlocked.

I think, in those systems, memory aging itself is
useless, or obstructive in the worst case, because mlocked
pages, which can't be swapped out, are on the LRU list.

In such a case, turning aging off (on a per-process basis)
would be effective, I think.

Of course, I agree that swap-off or aging-off is
NOT always useful. On the contrary, these functions
may be required by only a very small number of users.

But it is very important that we can choose
how we use the OS.


Satoshi Oshima


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 12:19       ` Rik van Riel
@ 2004-05-26 12:55         ` Buddy Lumpkin
  0 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 12:55 UTC (permalink / raw)
  To: 'Rik van Riel'
  Cc: 'William Lee Irwin III', orders, linux-kernel


> Executables and shared libraries live in the filesystem
> cache.  Evicting those from memory - because swapping is
> disabled and "the VM should remove something from cache
> instead" - will feel exactly the same as swapping ...

I totally agree, you can't get away from evicting pages from memory to disk
if you're doing file system I/O because you eventually fill up memory. Any
additional file system I/O requires evictions, period.

Trying to give preference to executables and shared libraries is difficult
because they are backed by named files, hence they are also pagecache. (kind of reminds
me of that little white speck on chicken poop. If it came out of the
chicken, it's chicken poop too :)

But there is the case where massive amounts of file system I/O (consider
several fibre cards connecting to SAN attached storage that saturates the
centerplane on some insanely large system) will force pages from running
executables to be evicted, only to be faulted back a few milliseconds later.
It's this thrashing effect that a separation eliminates if you have the
ability to distinguish between executables (not just files mapped in
executable but actual running processes) and non-executable pages.

Consider how silly it would be for a system running a single process that
consumes only 100k that generates so much filesystem I/O that the process is
constantly paged out. When it needs to wake up again, becomes runnable and
the program counter starts to access pages within the process address space,
it gets faulted back in.

Lather, Rinse, Repeat ...

--Buddy




^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 11:38         ` Buddy Lumpkin
  2004-05-26 12:12           ` Paulo Marques
  2004-05-26 12:14           ` Nick Piggin
@ 2004-05-26 12:40           ` Denis Vlasenko
  2 siblings, 0 replies; 146+ messages in thread
From: Denis Vlasenko @ 2004-05-26 12:40 UTC (permalink / raw)
  To: Buddy Lumpkin, 'William Lee Irwin III'; +Cc: orders, linux-kernel

On Wednesday 26 May 2004 10:55, William Lee Irwin III wrote:
> On Wed, May 26, 2004 at 12:31:16AM -0700, Buddy Lumpkin wrote:
> > This of course doesn't address the VM paging storms that happen due to
> > large amounts of file system writes. Once the pagecache fills up, dirty
> > pages must be evicted from the pagecache so that new pages can be added
> > to the pagecache.
>
> If you've got a real performance issue, please describe it properly
> instead of asserting without evidence the existence of one.

On Wed, May 26, 2004 at 01:30:09AM -0700, Buddy Lumpkin wrote:
> As for your short, two sentence comment below, let me save you the energy of
> insinuations and translate your message the way I read it: 
> -------------------------------------------------------------------------
> I don't recognize your name, therefore you can't possibly have a valuable
> opinion on the direction VM system development should go. I doubt you have
> an actual performance problem to share, but if you do, please share it and
> go away so that we can work on solving the problem.
> --------------------------------------------------------------------------
> My response:
> Get over yourself.

You were very wrong here. He did not say that. You pervert his words.

On Wednesday 26 May 2004 12:09, William Lee Irwin III wrote:
> >- My response:
> > Get over yourself.
>
> What the Hell? I have enough bugs I'm paid to fix that I'm not going to
> tolerate harassment for requesting that claims that the kernel behaves
> pathologically in some scenario be cast as comprehensible bugreports.
> It's also worth noting that paying customers don't respond so uncouthly.

wli, understandably, became angry.

On Wednesday 26 May 2004 14:38, Buddy Lumpkin wrote:
> If you follow the thread, you will see no claim from me that there is
> anything wrong with the kernel. I simply stated that the priority of VM
> system development should focus on physical memory...
...
> This situation isn't even remotely similar. In this case, you (a
> contributor to a very, very large FREE software project) misread a thread
> and made some surly comments that you ended up eating, and are so used to
> telling people that you owe them nothing, that you have somehow conjured
> up the image that I actually want something from you.
...
> This is classic, you have managed to put yourself in a position where you
> spend the majority of your time working on a free project that has some
> very ambitious goals. It has afforded you the ability to fulfill your own
> personal and professional goals as well, yet you reserve the right to
> discard all accountability for your actions when it's convenient because
> you get some frank feedback from someone that is not a paying customer.
>
> What a crutch.
>
> I can picture where this is going. Here is an interview between you and a
> popular Linux magazine in two years:

<joke>
Aha!
Now we all know that wli is evil. Thanks for your crystal ball.
</joke>
--
vda

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 11:19                   ` Nick Piggin
  2004-05-26 12:27                     ` Matthias Schniedermeyer
@ 2004-05-26 12:37                     ` Matthias Schniedermeyer
  2004-05-26 13:06                       ` Gianni Tedesco
                                         ` (2 more replies)
  1 sibling, 3 replies; 146+ messages in thread
From: Matthias Schniedermeyer @ 2004-05-26 12:37 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel

On Wed, May 26, 2004 at 09:19:40PM +1000, Nick Piggin wrote:
> Matthias Schniedermeyer wrote:
> >On Wed, May 26, 2004 at 08:33:28PM +1000, Nick Piggin wrote:
> 
> OK, this is obviously bad. Do you get this behaviour with 2.6.5
> or 2.6.6? If so, can you strace the program while it is writing
> an ISO? (just send 20 lines or so). Or tell me what program you
> use to create them and how to create one?

To use other words, this is the typical case where a "hint" would be
useful.

program to kernel: "I read ONCE through this file; caching is not useful".

The last thing I knew of in this area is that there exists a way to tell
the kernel to drop all cache after the file is closed. (IIRC!)

But this doesn't help in this case, as the image file is up to 4.4GB in
total, which means that it ALONE can fill up the whole cache. And that is
leaving aside the files the image was created from, which can (with a size
of up to 2GB each (size limit of the iso9660 filesystem/Linux kernel)) also
fill a lot of cache until they are closed.

(The/My) typical case is this.
1 create image-file
2 remove source-files
3 burn image
4 remove image-file

Step 1 and 3 trash the cache without ANY positive effect.
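
One existing interface for this kind of hint is POSIX posix_fadvise().
The following is only a sketch of how a read-once program (like the
burning step) could use it; the helper name read_once and the 1 MiB
chunk size are made up, and how far the kernel honours the advice
depends on the kernel version and filesystem.

```python
import os
import tempfile


def read_once(path, chunk_size=1 << 20):
    """Read a file sequentially, advising the kernel to drop each chunk after use."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        can_advise = hasattr(os, "posix_fadvise")
        if can_advise:
            # Length 0 means "to end of file": the whole file is read in order.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            if can_advise:
                # The bytes just read will not be needed again.
                os.posix_fadvise(fd, total, len(chunk), os.POSIX_FADV_DONTNEED)
            total += len(chunk)
    finally:
        os.close(fd)
    return total


# Small demonstration on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (3 * (1 << 20) + 123))
print(read_once(tmp.name), "bytes read")
os.unlink(tmp.name)
```

For the writing step (creating the image) the same advice is less
direct: as far as I know, POSIX_FADV_DONTNEED only drops clean pages,
so a writer would have to flush its data (fsync, for example) before
the advice can take effect.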



Until then

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
@ 2004-05-26 12:34 Piszcz, Justin Michael
  0 siblings, 0 replies; 146+ messages in thread
From: Piszcz, Justin Michael @ 2004-05-26 12:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: ap

If one has 16GB of ram, would he or she want to use swap?
Would it slow the system down?


-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Matthias
Schniedermeyer
Sent: Wednesday, May 26, 2004 8:27 AM
To: Nick Piggin
Cc: linux-kernel@vger.kernel.org
Subject: Re: why swap at all?

On Wed, May 26, 2004 at 09:19:40PM +1000, Nick Piggin wrote:
> Matthias Schniedermeyer wrote:
> >On Wed, May 26, 2004 at 08:33:28PM +1000, Nick Piggin wrote:
> >
> >>Matthias Schniedermeyer wrote:
> >>
> 
> >>>In my personal machine i have 3GB of RAM and i regularly create
> >>>DVD-ISO-Images (about 2 per day). After creating an image (reading up to
> >>>4.4GB and writing up to 4.4GB) the cache is 100% trashed(1). With swap
> >>>it would be even more trashed than it is without swap(1).
> >>>
> >>
> >>I don't disagree that you could find a situation where swap
> >>is worse than no swap. I don't understand what you mean by
> >>trashed and more trashed though :)
> >
> >
> >trashed means "everything i need(tm)" is paged out (mozilla/konsole/xine
> >...)
> >
> >with swap the data-part of running programs was swapped out, without
> >swap only the program-part is thrown out of memory as the data-part
> >can't be moved anywhere else.
> >
> >I have a 10K RPM SCSI-HDD, I can hear what my system is doing. :-)
> >
> 
> OK, this is obviously bad. Do you get this behaviour with 2.6.5
> or 2.6.6? If so, can you strace the program while it is writing
> an ISO? (just send 20 lines or so). Or tell me what program you
> use to create them and how to create one?

program: mkisofs
kernel: 2.4.4-2.4.25, 2.6.4-2.6.6
(To say it in other words, I have never (seen/felt) a difference in 3 years.
So if there is a difference, I just didn't notice it.)
The current kernel is 2.6.5, as 2.6.6 sometimes just "hangs".

Just throw together some large files (my files are all >= 350MB, the
"typical" case is about 4-5 files with 800-1000MB each) and then
mkisofs -J -r -o <image> <source-dir>
I store the image files on another HDD to get the best possible throughput.
My HDDs (these are "normal" IDE-HDDs) are capable of delivering about
35-40MB/s; the last time I measured, I got about 70MB/s aggregated
throughput while creating an image file.



Until then

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 12:07         ` Buddy Lumpkin
  2004-05-26 12:06           ` Marc-Christian Petersen
  2004-05-26 12:19           ` Denis Vlasenko
@ 2004-05-26 12:33           ` Richard B. Johnson
  2004-05-26 13:25             ` Buddy Lumpkin
  2 siblings, 1 reply; 146+ messages in thread
From: Richard B. Johnson @ 2004-05-26 12:33 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: 'Denis Vlasenko', orders, Linux kernel


Gentlemen,

There is not enough RAM address-space in even 64-bit machines
to do a sort/merge of even a typical inventory with all the
keys present in RAM. So you need multiple tasks, each with
as much of the 64-bit address-space occupied by RAM, as
possible. Even then, you need to do partial sorts, etc.

It's not "bloat-ware" that requires getting as much free RAM
as possible for an application, but the business of doing business.
So, performance of data-intensive work such as the sort/merge
is improved by writing the contents of sleeping tasks' RAM to
a storage device and using that RAM. It's just that simple.

Many years ago, there was a small company that tried to sell
a sort/merge engine (a dedicated CPU) to Digital because the
problems with handling large databases were well known and
interactive performance on VAX-11/750 machines sucked when
database applications were being run (because their pages
were being swapped). Of course the NIH syndrome took its
toll and nobody ever got such an engine. The result being
that everybody has performance problems when database
operations are being run --even today, with different
machines.

Any data-intensive application needs as much RAM as possible and
that's never quite enough for best performance.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.26 on an i686 machine (5570.56 BogoMips).
            Note 96.31% of all statistics are fiction.



^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  9:06 ` John Bradford
@ 2004-05-26 12:31   ` Buddy Lumpkin
  0 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 12:31 UTC (permalink / raw)
  To: 'John Bradford', 'Anthony DiSante', linux-kernel


> In my experience, it's perfectly possible to run a typical desktop system
> with no swap at all.  Certainly the 'double the amount of physical RAM' 
> guideline has been taken far too literally in my opinion.

--------snip---------

In older BSD systems like SunOS 4.x, malloc would literally fail if you did
not have enough physical memory and backing store (swap) to store that
anonymous memory segment. 

This meant that if you wanted to leverage swap to get additional virtual
memory beyond the amount of installed physical memory, you needed more than
1x physical memory. This is where the old 2x physical memory came from.

--Buddy


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 10:41       ` Denis Vlasenko
  2004-05-26 12:07         ` Buddy Lumpkin
@ 2004-05-26 12:30         ` Rik van Riel
  1 sibling, 0 replies; 146+ messages in thread
From: Rik van Riel @ 2004-05-26 12:30 UTC (permalink / raw)
  To: Denis Vlasenko
  Cc: Buddy Lumpkin, 'William Lee Irwin III', orders, linux-kernel

On Wed, 26 May 2004, Denis Vlasenko wrote:

> No. Unfortunately, userspace programs grow in size as fast
> as your RAM. Because typically developers do not think
> about size of their program until it starts to outgrow
> their RAM.

It's worse than that.  Way worse.

The speed of hard disks doesn't grow anywhere near as
fast as the size of memory and applications. This means
that over the last years, swapping in any particular
application has gotten SLOWER than it used to be ...

This means that even though the VM is way smarter than
it used to be, the visibility of any wrong decision has
increased.
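
To put rough numbers on that: if the disk streams at a fixed rate, the
time to bring an application back in is just its size divided by that
rate. The sizes and rates below are made-up illustrations (only the
40MB/s figure matches what was reported elsewhere in this thread), and
seeks are ignored, which flatters the disk.

```python
def swap_in_seconds(size_mb: float, disk_mb_per_s: float) -> float:
    """Best-case time to read size_mb back from disk at a steady streaming rate."""
    return size_mb / disk_mb_per_s


# Hypothetical "then": a 16MB application on a 5MB/s disk.
then = swap_in_seconds(16, 5)
# Hypothetical "now": a 256MB application on a 40MB/s disk.
now = swap_in_seconds(256, 40)

# Memory use grew 16x while disk speed grew 8x, so the wait doubles.
print(f"then: {then:.1f}s  now: {now:.1f}s  ratio: {now / then:.1f}x")
```

Whatever the real figures are on a given machine, the ratio only
improves if disk throughput grows at least as fast as the working set.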

I wonder if there's a way we could change the VM so it
could recover faster from any mistakes it made, instead
of trying to prevent it from making any mistakes (those
will happen anyway, the VM can't predict the future).

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 11:19                   ` Nick Piggin
@ 2004-05-26 12:27                     ` Matthias Schniedermeyer
  2004-05-27  5:38                       ` Nick Piggin
  2004-05-26 12:37                     ` Matthias Schniedermeyer
  1 sibling, 1 reply; 146+ messages in thread
From: Matthias Schniedermeyer @ 2004-05-26 12:27 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel

On Wed, May 26, 2004 at 09:19:40PM +1000, Nick Piggin wrote:
> Matthias Schniedermeyer wrote:
> >On Wed, May 26, 2004 at 08:33:28PM +1000, Nick Piggin wrote:
> >
> >>Matthias Schniedermeyer wrote:
> >>
> 
> >>>In my personal machine i have 3GB of RAM and i regularly create
> >>>DVD-ISO-Images (about 2 per day). After creating an image (reading up to
> >>>4.4GB and writing up to 4.4GB) the cache is 100% trashed(1). With swap
> >>>it would be even more trashed than it is without swap(1).
> >>>
> >>
> >>I don't disagree that you could find a situation where swap
> >>is worse than no swap. I don't understand what you mean by
> >>trashed and more trashed though :)
> >
> >
> >trashed means "everything i need(tm)" is paged out (mozilla/konsole/xine
> >...)
> >
> >with swap the data-part of running programs was swapped out, without
> >swap only the program-part is thrown out of memory as the data-part
> >can't be moved anywhere else.
> >
> >I have a 10K RPM SCSI-HDD, I can hear what my system is doing. :-)
> >
> 
> OK, this is obviously bad. Do you get this behaviour with 2.6.5
> or 2.6.6? If so, can you strace the program while it is writing
> an ISO? (just send 20 lines or so). Or tell me what program you
> use to create them and how to create one?

program: mkisofs
kernel: 2.4.4-2.4.25, 2.6.4-2.6.6
(To say it in other words, I have never (seen/felt) a difference in 3 years.
So if there is a difference, I just didn't notice it.)
The current kernel is 2.6.5, as 2.6.6 sometimes just "hangs".

Just throw together some large files (my files are all >= 350MB, the
"typical" case is about 4-5 files with 800-1000MB each) and then
mkisofs -J -r -o <image> <source-dir>
I store the image files on another HDD to get the best possible throughput.
My HDDs (these are "normal" IDE-HDDs) are capable of delivering about
35-40MB/s; the last time I measured, I got about 70MB/s aggregated
throughput while creating an image file.



Until then

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 12:07         ` Buddy Lumpkin
  2004-05-26 12:06           ` Marc-Christian Petersen
@ 2004-05-26 12:19           ` Denis Vlasenko
  2004-05-26 13:48             ` Buddy Lumpkin
  2004-05-26 12:33           ` Richard B. Johnson
  2 siblings, 1 reply; 146+ messages in thread
From: Denis Vlasenko @ 2004-05-26 12:19 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: orders, linux-kernel

On Wednesday 26 May 2004 15:07, Buddy Lumpkin wrote:
> those environments horizontally in most cases. The biggest performance
> problems to solve (that people care about and are willing to pay $$ to
> solve) are for the large databases that run Corporate America. There are
> certainly scientific applications where performance is critical and there
> are dollars to fund improvement as well, but their numbers don't compare to
> the number of Oracle instances out there running in the Enterprise.

Oh yeah, poor Corporate America. That's what we should care about most.

> Optimizing the performance of swap operations for even a small tradeoff in
> performance for memory operations that take place entirely in physical
> memory is just a broke minded, brain dead direction in the year 2004 IMHO.

Sorry Buddy. I am _not_ Corporate America.
I have 4 boxes at work and 5 boxes at home,
and only one of them can be safely run swapless. It's a router.
--
vda

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  8:30     ` Buddy Lumpkin
                         ` (3 preceding siblings ...)
  2004-05-26 10:44       ` Denis Vlasenko
@ 2004-05-26 12:19       ` Rik van Riel
  2004-05-26 12:55         ` Buddy Lumpkin
  4 siblings, 1 reply; 146+ messages in thread
From: Rik van Riel @ 2004-05-26 12:19 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: 'William Lee Irwin III', orders, linux-kernel

On Wed, 26 May 2004, Buddy Lumpkin wrote:

> No. I am not making any assertions whatsoever. I am just calling out
> that systems that run happily from physical memory and are not in need
> of swap should never sacrifice an ounce of performance

Executables and shared libraries live in the filesystem
cache.  Evicting those from memory - because swapping is
disabled and "the VM should remove something from cache
instead" - will feel exactly the same as swapping ...

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 11:57 Nick Piggin
@ 2004-05-26 12:19 ` Buddy Lumpkin
  0 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 12:19 UTC (permalink / raw)
  To: 'Nick Piggin'
  Cc: 'John Bradford', 'William Lee Irwin III',
	orders, linux-kernel

>> 
>> 3) once physical memory is full, file system I/O will only benefit from
>> reads that incur a minor fault. All other file system operations 
>> are bound
>> by the rate you can reclaim pages from physical memory.
>> 

> No, typically we can reclaim memory very quickly and the operations
> are bound by the speed of the block device.

So if all physical memory is full with either pagecache or anonymous memory,
where are you going to put these operations that are bound by the speed of
the block device?

You have to evict pages at the same rate you're reading them in or writing to
the filesystem else you have nowhere to put them. This means that the rate
you can access the filesystem is governed by the rate you can evict pages
from memory.

Couple that with the fact that there are many pte's pointing at the same
physical page (shared page) in many cases where many processes are running
on the system. Because all of the references to that page must be removed
before the page can be evicted, there are some absolute limitations in the
rate that pages can be evicted from memory as the number of processes
running on the system and the total amount of memory increases.

--Buddy


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 11:38         ` Buddy Lumpkin
  2004-05-26 12:12           ` Paulo Marques
@ 2004-05-26 12:14           ` Nick Piggin
  2004-05-26 12:40           ` Denis Vlasenko
  2 siblings, 0 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-26 12:14 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: 'William Lee Irwin III', orders, linux-kernel

Buddy Lumpkin wrote:
> 
> 
> -----Original Message-----
> From: William Lee Irwin III [mailto:wli@holomorphy.com] 
> Sent: Wednesday, May 26, 2004 2:09 AM
> To: Buddy Lumpkin
> Cc: orders@nodivisions.com; linux-kernel@vger.kernel.org
> Subject: Re: why swap at all?
> 
> On Wed, May 26, 2004 at 01:30:09AM -0700, Buddy Lumpkin wrote:
> 
>>As for your short, two sentence comment below, let me save you the energy of
>>insinuations and translate your message the way I read it: 
>>-------------------------------------------------------------------------
>>I don't recognize your name, therefore you can't possibly have a valuable
>>opinion on the direction VM system development should go. I doubt you have
>>an actual performance problem to share, but if you do, please share it and
>>go away so that we can work on solving the problem.
>>--------------------------------------------------------------------------
>>My response:
>>Get over yourself.
> 
> 
>>What the Hell? I have enough bugs I'm paid to fix that I'm not going to
>>tolerate harassment for requesting that claims that the kernel behaves
>>pathologically in some scenario be cast as comprehensible bugreports.
>>It's also worth noting that paying customers don't respond so uncouthly.
> 
> 
> 
>>-- wli
> 
> 
> If you follow the thread, you will see no claim from me that there is
> anything wrong with the kernel. I simply stated that the priority of VM
> system development should focus on physical memory, and that physical memory
> access should not suffer as a result of some tradeoff that improves the
> performance of the VM system when free physical memory is low and there is
> heavy use of the swap device.
> 

You also went on to say:
 > This of course doesn't address the VM paging storms that happen due to large
 > amounts of file system writes. Once the pagecache fills up, dirty pages must
 > be evicted from the pagecache so that new pages can be added to the
 > pagecache.

By and large, Linux doesn't reclaim dirty pages from the pagecache,
and it should not have paging storms due to large amounts of file
system writes.

If you had a workload where it does, we would be interested to see
it. I pointed out to you that this is what Bill was asking you to
file a detailed report about.

> I can't speak whether or not a case like this currently exists, but I know
> optimizing swap performance is a very complicated yet captivating subject
> that has consumed many a post on this list. People have tried to optimize
> every part of the VM before, so I was just calling out what I believe to be
> a very reasonable and practical goal and put a little bit of substance
> around why I think it's practical.
> 

Actually, during the 2.5 development cycle, swapping performance
got fairly neglected to the point where we were performing twice
as badly as 2.4 for most things. I (and others) recently improved
this because real people doing real things were complaining.

[snip rant]

> 
> I can picture where this is going. Here is an interview between you and a
> popular Linux magazine in two years:
> 
> 
> Linux Magazine: You have contributed to linux for quite some time, correct?
> 
> William: Oh yes, it is my hobby and occupation. I love my work.
> 
> Linux Magazine: You have done all these wonderful things!
> 
> William: Thanks, I am very proud of that
> 
> Linux Magazine: Why did you make such and such decision that backfired?
> 
> William: I don't have to answer that, I don't owe you anything and you're not
> a paying customer.
> 
> Give me a break.
> 

What?? Give *you* a break? From a fictional interview you concocted?

Give me a break.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 11:38         ` Buddy Lumpkin
@ 2004-05-26 12:12           ` Paulo Marques
  2004-05-26 12:14           ` Nick Piggin
  2004-05-26 12:40           ` Denis Vlasenko
  2 siblings, 0 replies; 146+ messages in thread
From: Paulo Marques @ 2004-05-26 12:12 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: 'William Lee Irwin III', orders, linux-kernel


I really should not feed the trolls, but...

William Lee Irwin wrote:

> If you've got a real performance issue, please describe it properly
> instead of asserting without evidence the existence of one.

> On Wed, May 26, 2004 at 01:30:09AM -0700, Buddy Lumpkin wrote:

> > insinuations and translate your message the way I read it: 
> > -------------------------------------------------------------------------
> > I don't recognize your name, therefore you can't possibly have a valuable
> > opinion on the direction VM system development should go. I doubt you have
> > an actual performance problem to share, but if you do, please share it and
> > go away so that we can work on solving the problem.
> > --------------------------------------------------------------------------

Conclusion:

You really should learn how to read :)


This is a *technical* discussion list. So far you have managed to post 7
mails of vague ramblings about what you think the VM should do and
what swapping is (not to mention unjustified personal attacks).

If you really think there is a problem with the VM, post benchmarks
demonstrating the problem.

If you don't think there is a problem, don't waste our time.

If you want to continue this discussion, please do so off-list.

Best regards,

-- 
Paulo Marques - www.grupopie.com
"In a world without walls and fences who needs windows and gates?"


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 10:41       ` Denis Vlasenko
@ 2004-05-26 12:07         ` Buddy Lumpkin
  2004-05-26 12:06           ` Marc-Christian Petersen
                             ` (2 more replies)
  2004-05-26 12:30         ` Rik van Riel
  1 sibling, 3 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 12:07 UTC (permalink / raw)
  To: 'Denis Vlasenko'; +Cc: orders, linux-kernel

640k? who wrote that?

In the last three years, I have witnessed many large Oracle databases where
the maximum SGA size of roughly 4GB + all shadow processes, parallel slaves,
dbwr, etc. all run completely within physical memory with the most
aggressive settings available. Previously, Oracle databases were much
smaller, but I never saw databases sized this way such that they could exist
entirely in physical memory.

In fact, the SGA is commonly configured to use large, locked 4MB pages (ISM
in Solaris, not sure if hugepages are swappable in Linux) that couldn't be
swapped to disk even if you wanted to.

Again, we are not talking about the bloatware that is developed using some
rad tool for a workstation that has continued to grow over the years. I am
talking about where the industry is dumping tons of money on performance
where it really, really counts. The middle-ware that connects to a database
may continue to grow in terms of bloat, but people are happily scaling those
environments horizontally in most cases. The biggest performance problems to
solve (that people care about and are willing to pay $$ to solve) are for
the large databases that run Corporate America. There are certainly
scientific applications where performance is critical and there are dollars
to fund improvement as well, but their numbers don't compare to the number
of Oracle instances out there running in the Enterprise.

Optimizing the performance of swap operations for even a small tradeoff in
performance for memory operations that take place entirely in physical
memory is just a broke minded, brain dead direction in the year 2004 IMHO.

--Buddy   






-----Original Message-----
From: Denis Vlasenko [mailto:vda@port.imtp.ilyichevsk.odessa.ua] 
Sent: Wednesday, May 26, 2004 3:41 AM
To: Buddy Lumpkin; 'William Lee Irwin III'
Cc: orders@nodivisions.com; linux-kernel@vger.kernel.org
Subject: RE: why swap at all?

On Wednesday 26 May 2004 11:30, Buddy Lumpkin wrote:
> I have worked at large fortune 500 companies with deep pockets though, so
> this may not be the case for many. I make this point though because I think
> if it isn't the case yet, it will be in the near future as memory becomes
> even cheaper because the trend certainly exists.

"640k will be enough for anyone" ?

No. Unfortunately, userspace programs grow in size as fast
as your RAM. Because typically developers do not think
about size of their program until it starts to outgrow
their RAM.

Today, 128M RAM swapless is barely enough to run the full
spectrum of apps. OpenOffice and Mozilla "lead" the pack,
followed by KDE/Gnome etc.
-- 
vda


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 12:07         ` Buddy Lumpkin
@ 2004-05-26 12:06           ` Marc-Christian Petersen
  2004-05-26 12:19           ` Denis Vlasenko
  2004-05-26 12:33           ` Richard B. Johnson
  2 siblings, 0 replies; 146+ messages in thread
From: Marc-Christian Petersen @ 2004-05-26 12:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: Buddy Lumpkin, 'Denis Vlasenko', orders

On Wednesday 26 May 2004 14:07, Buddy Lumpkin wrote:

Hi Buddy,

> 640k? who wrote that?


Bill Gates, who else ...

ciao, Marc

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
@ 2004-05-26 11:57 Nick Piggin
  2004-05-26 12:19 ` Buddy Lumpkin
  0 siblings, 1 reply; 146+ messages in thread
From: Nick Piggin @ 2004-05-26 11:57 UTC (permalink / raw)
  To: Buddy Lumpkin
  Cc: 'John Bradford', 'William Lee Irwin III',
	orders, linux-kernel

Buddy Lumpkin wrote:
>>>That's true, but it's not a magical property of swap space
>>>- extra physical
>>>RAM would do more or less the same thing.
>>>
> 
> 
>>Well it is a magical property of swap space, because extra RAM
>>doesn't allow you to replace unused memory with often used memory.
> 
> 
>>The theory holds true no matter how much RAM you have. Swap can
>>improve performance. It can be trivially demonstrated.
> 
> 
> I bet you have demonstrated this. It strikes me as an observation that could
> be made in a lab environment. But you're failing to realize that:
> 
> 1) you will fill physical memory with pages eventually or you're not doing
> work.
> 
> 2) pages do not just silently move to the swap device. They move as a result
> of a memory shortfall
> 
> 3) once physical memory is full, file system I/O will only benefit from
> reads that incur a minor fault. All other file system operations are bound
> by the rate you can reclaim pages from physical memory.
> 

No, typically we can reclaim memory very quickly and the operations
are bound by the speed of the block device.
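
As a rough illustration of that point (a toy model, not a measurement): once
RAM is full, sustained file I/O can go no faster than the slower of page
reclaim and the block device. The function name and the figures below are
made up for the example.

```python
def sustained_io_rate(reclaim_mb_s: float, device_mb_s: float) -> float:
    """Toy model: steady-state file I/O rate once physical memory is full.

    A page must be reclaimed before it can be refilled from disk, so the
    sustained rate is the smaller of the two rates.
    """
    if reclaim_mb_s < 0 or device_mb_s < 0:
        raise ValueError("rates must be non-negative")
    return min(reclaim_mb_s, device_mb_s)


if __name__ == "__main__":
    # Hypothetical figures: reclaim far outpaces a single disk.
    print(sustained_io_rate(reclaim_mb_s=2000.0, device_mb_s=50.0))
```

With reclaim much faster than the disk, the model just returns the disk's
rate, which is the usual case being described here.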

> 4) non-filesystem backed pages are still affected the same way, nothing has
> changed. When you run your next filesystem related operation, those pages
> will be faulted into physical memory, and something will be evicted to its
> backing store (remember, memory is full).
> 

I haven't failed to realise 1, 2 or 4 and I don't know what you are
arguing about. All I said was basically "no matter how much ram you
have, swap can increase performance by allowing unused anonymous
memory to be paged out, thereby increasing your maximum effective RAM".

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26 10:44       ` Denis Vlasenko
@ 2004-05-26 11:49         ` Buddy Lumpkin
  0 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 11:49 UTC (permalink / raw)
  To: 'Denis Vlasenko', 'William Lee Irwin III'
  Cc: orders, linux-kernel

I have no bug to report.

-----Original Message-----
From: Denis Vlasenko [mailto:vda@port.imtp.ilyichevsk.odessa.ua] 
Sent: Wednesday, May 26, 2004 3:45 AM
To: Buddy Lumpkin; 'William Lee Irwin III'
Cc: orders@nodivisions.com; linux-kernel@vger.kernel.org
Subject: RE: why swap at all?

On Wednesday 26 May 2004 11:30, Buddy Lumpkin wrote:
> As for your short, two sentence comment below, let me save you the energy
> of insinuations and translate your message the way I read it:
> 
> -------------------------------------------------------------------------
> I don't recognize your name, therefore you can't possibly have a valuable
> opinion on the direction VM system development should go. I doubt you have
> an actual performance problem to share, but if you do, please share it and
> go away so that we can work on solving the problem.
> --------------------------------------------------------------------------

He was asking for a proper bug report.

Preparing bug report:
=====================
How To Ask Questions The Smart Way:
    http://www.catb.org/~esr/faqs/smart-questions.html
        Before asking a technical question by email, or in
        a newsgroup, or on a website chat board, do the following:
        * Try to find an answer by searching the Web.
        * Try to find an answer by reading the manual.
        * Try to find an answer by reading a FAQ.
        * Try to find an answer by inspection or experimentation.
        * Try to find an answer by reading the source code.
How to Report Bugs Effectively:
    http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
        Anybody who has written software for public use will
        probably have received at least one bad bug report.
        Reports that say nothing ("It doesn't work!");
        reports that make no sense; reports that don't give
        enough information; reports that give wrong information.
Compile problems: report GCC output and result of
        "grep '^CONFIG_' .config"
Oops: decode it with ksymoops (or use 2.6 with kksymoops enabled ;).
Unkillable process: Alt-SysRq-T and ksymoops relevant part.
Yes it means you should have ksymoops installed and tested,
which is easy to get wrong. I've done that too often.
-- 
vda


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  9:48           ` Nick Piggin
  2004-05-26 10:10             ` Matthias Schniedermeyer
  2004-05-26 10:46             ` John Bradford
@ 2004-05-26 11:46             ` Buddy Lumpkin
  2 siblings, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 11:46 UTC (permalink / raw)
  To: 'Nick Piggin', 'John Bradford'
  Cc: 'William Lee Irwin III', orders, linux-kernel

>> 
>> That's true, but it's not a magical property of swap space
>> - extra physical
>> RAM would do more or less the same thing.
>> 

> Well it is a magical property of swap space, because extra RAM
> doesn't allow you to replace unused memory with often used memory.

> The theory holds true no matter how much RAM you have. Swap can
> improve performance. It can be trivially demonstrated.

I bet you have demonstrated this. It strikes me as an observation that could
be made in a lab environment. But you're failing to realize that:

1) you will fill physical memory with pages eventually or you're not doing
work.

2) pages do not just silently move to the swap device. They move as a result
of a memory shortfall

3) once physical memory is full, file system I/O will only benefit from
reads that incur a minor fault. All other file system operations are bound
by the rate you can reclaim pages from physical memory.

4) non-filesystem backed pages are still affected the same way, nothing has
changed. When you run your next filesystem related operation, those pages
will be faulted into physical memory, and something will be evicted to its
backing store (remember, memory is full).

--Buddy


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  9:34         ` John Bradford
  2004-05-26  9:48           ` Nick Piggin
@ 2004-05-26 11:39           ` Buddy Lumpkin
  1 sibling, 0 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 11:39 UTC (permalink / raw)
  To: 'John Bradford', 'Nick Piggin'
  Cc: 'William Lee Irwin III', orders, linux-kernel

Exactly ...

-----Original Message-----
From: John Bradford [mailto:john@grabjohn.com] 
Sent: Wednesday, May 26, 2004 2:35 AM
To: Nick Piggin; Buddy Lumpkin
Cc: 'William Lee Irwin III'; orders@nodivisions.com;
linux-kernel@vger.kernel.org
Subject: Re: why swap at all?

Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
> Even for systems that don't *need* the extra memory space, swap can
> actually provide performance improvements by allowing unused memory
> to be replaced with often-used memory.

That's true, but it's not a magical property of swap space - extra physical
RAM would do more or less the same thing.

John.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  9:09       ` William Lee Irwin III
@ 2004-05-26 11:38         ` Buddy Lumpkin
  2004-05-26 12:12           ` Paulo Marques
                             ` (2 more replies)
  0 siblings, 3 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 11:38 UTC (permalink / raw)
  To: 'William Lee Irwin III'; +Cc: orders, linux-kernel




-----Original Message-----
From: William Lee Irwin III [mailto:wli@holomorphy.com] 
Sent: Wednesday, May 26, 2004 2:09 AM
To: Buddy Lumpkin
Cc: orders@nodivisions.com; linux-kernel@vger.kernel.org
Subject: Re: why swap at all?

On Wed, May 26, 2004 at 01:30:09AM -0700, Buddy Lumpkin wrote:
> As for your short, two sentence comment below, let me save you the energy of
> insinuations and translate your message the way I read it: 
> -------------------------------------------------------------------------
> I don't recognize your name, therefore you can't possibly have a valuable
> opinion on the direction VM system development should go. I doubt you have
> an actual performance problem to share, but if you do, please share it and
> go away so that we can work on solving the problem.
> --------------------------------------------------------------------------
> My response:
> Get over yourself.

> What the Hell? I have enough bugs I'm paid to fix that I'm not going to
> tolerate harassment for requesting that claims that the kernel behaves
> pathologically in some scenario be cast as comprehensible bugreports.
> It's also worth noting that paying customers don't respond so uncouthly.


> -- wli

If you follow the thread, you will see no claim from me that there is
anything wrong with the kernel. I simply stated that the priority of VM
system development should focus on physical memory, and that physical memory
access should not suffer as a result of some tradeoff that improves the
performance of the VM system when free physical memory is low and there is
heavy use of the swap device.

I can't speak whether or not a case like this currently exists, but I know
optimizing swap performance is a very complicated yet captivating subject
that has consumed many a post on this list. People have tried to optimize
every part of the VM before, so I was just calling out what I believe to be
a very reasonable and practical goal and put a little bit of substance
around why I think it's practical.

Anthony DiSante's post was merely a catalyst for discussion as far as I was
concerned, I wasn't implying that I had witnessed any VM system performance
problems as of late.

To address your ranting about paying customers, etc ... 

After reading your message I had to check whether my original post was
addressed to you directly (it wasn't). One might gain the impression that
you were actually directly solicited for your opinion the way you're carrying
on about harassment and paying customers ... sheesh, give me a break.

I have seen many cases where one or more persons create a free application
that many people like, then after a while some of the user base starts to
demand features, and show various signs that they have become too
comfortable with the expectation that the application will continue to
improve and forget that they should be grateful.

This situation isn't even remotely similar. In this case, you (a contributor
to a very, very large FREE software project) misread a thread and made some
surly comments that you ended up eating, and are so used to telling people
that you owe them nothing, that you have somehow conjured up the image that
I actually want something from you.

This is classic, you have managed to put yourself in a position where you
spend the majority of your time working on a free project that has some very
ambitious goals. It has afforded you the ability to fulfill your own
personal and professional goals as well, yet you reserve the right to
discard all accountability for your actions when it's convenient because you
get some frank feedback from someone that is not a paying customer.

What a crutch.

I can picture where this is going. Here is an interview between you and a
popular Linux magazine in two years:


Linux Magazine: You have contributed to linux for quite some time, correct?

William: Oh yes, it is my hobby and occupation. I love my work.

Linux Magazine: You have done all these wonderful things!

William: Thanks, I am very proud of that

Linux Magazine: Why did you make such and such decision that backfired?

William: I don't have to answer that, I don't owe you anything and you're not
a paying customer.

Give me a break.

--Buddy


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 10:45               ` Martin Olsson
@ 2004-05-26 11:25                 ` Nick Piggin
  2004-05-26 16:33                 ` David Schwartz
  1 sibling, 0 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-26 11:25 UTC (permalink / raw)
  To: Martin Olsson; +Cc: linux-kernel

Martin Olsson wrote:
> Hi Linux-gurus,
> 
> I agree with Anthony Disante, maybe not all users want swapping. I have 
> myself been very annoyed by swapping lately but I've not yet tried to 
> disable it.
> 
> In school I've studied the swapping concept from a theoretical point
> of view, and I fully understand the fact that swapping, if used 
> properly, can both increase performance and provide a safe way to get 
> out of a bad situation when the box runs out of memory. The problem is 
> that in reality this does not work, not on Linux nor on Windows 2000 
> which I use at home. Unfortunately I cannot provide a specific reason 
> why it does not work, I'm very much an end-user/desktop-user, I'm not a 
> kernel hacker (yet). But I see two things that need improvement atm:
> 

You don't need to provide a specific reason, a report would be
valuable too.

> A) when I do large data processing operations the computer is always 
> very very slow afterwards
> 

Time how long the large data processing operations take, then turn
swap off and time them again.
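
A minimal sketch of such a comparison, assuming the workload can be started
as a single command: run it once with swap enabled and once after
`swapoff -a` (as root), and compare the wall-clock times. The helper name
`time_command` is invented for this example.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the workload itself fails.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Usage: python timeit_swap.py <command> [args...]
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    for i, seconds in enumerate(time_command(command), start=1):
        print(f"run {i}: {seconds:.2f}s")
```

Several runs per configuration help, since the first run after a reboot or a
cache flush is not comparable with later ones.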

> B) if I have X Mb of RAM then there should not be imho a single swap 
> read/write until the whole of my X Mb RAM is completely stuffed, is this 
> so today?
> 

Yes, Linux doesn't start swapping or reclaiming at all until your
RAM is full.
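
One way to watch this yourself, sketched here under the assumption that
/proc/meminfo has the usual `Name: value kB` lines: compare MemFree and
Cached with how much swap is in use. The function names are made up for the
example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: number} dict (kB)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap in use, from the SwapTotal and SwapFree fields."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    for key in ("MemTotal", "MemFree", "Cached", "SwapTotal", "SwapFree"):
        print(key, info.get(key), "kB")
    print("swap in use:", swap_used_kb(info), "kB")
```

A small MemFree with a large Cached figure is normal: that memory is
pagecache, which can be reclaimed when something else needs it.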

> ---
> 
> Also, imagine that I disable swap today and start a large data 
> processing operation. During this operation I try to start a new 
> process, here ideally the program should not OOM but instead the memory 
> allocated for the data processing operation should be decreased. Is this 
> possible using today's technology? Can we divide memory into two sorts, 
> one for processes (here to stay memory) and another sort for batch 
> operations (where the amount of memory does not really matter but less 
> memory means less performance). I see the problem with "taking memory 
> back" though, I guess it's impossible.
> 

File backed data will be able to be reclaimed, yes.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 10:58                 ` Matthias Schniedermeyer
@ 2004-05-26 11:19                   ` Nick Piggin
  2004-05-26 12:27                     ` Matthias Schniedermeyer
  2004-05-26 12:37                     ` Matthias Schniedermeyer
  0 siblings, 2 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-26 11:19 UTC (permalink / raw)
  To: Matthias Schniedermeyer; +Cc: linux-kernel

Matthias Schniedermeyer wrote:
> On Wed, May 26, 2004 at 08:33:28PM +1000, Nick Piggin wrote:
> 
>>Matthias Schniedermeyer wrote:
>>

>>>In my personal machine i have 3GB of RAM and i regularly create
>>>DVD-ISO-Images (about 2 per day). After creating an image (reading up to
>>>4,4GB and writing up to 4,4GB) the cache is 100% trashed(1). With swap
>>>it would be even more trashed than it is without swap(1).
>>>
>>
>>I don't disagree that you could find a situation where swap
>>is worse than no swap. I don't understand what you mean by
>>trashed and more trashed though :)
> 
> 
> trashed means "everything i need(tm)" is paged out (mozilla/konsole/xine
> ...)
> 
> with swap the data-part of running programs was swapped out, without
> swap only the program-part is thrown out of memory as the data-part
> can't be moved anywhere else.
> 
> I have a 10K RPM SCSI-HDD, I can hear what my system is doing. :-)
> 

OK, this is obviously bad. Do you get this behaviour with 2.6.5
or 2.6.6? If so, can you strace the program while it is writing
an ISO? (just send 20 lines or so). Or tell me what program you
use to create them and how to create one?

Thanks

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
@ 2004-05-26 11:04 Nick Piggin
  0 siblings, 0 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-26 11:04 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: 'William Lee Irwin III', orders, linux-kernel

Buddy Lumpkin wrote:
>>Hi Buddy,
>>Even for systems that don't *need* the extra memory space, swap can
>>actually provide performance improvements by allowing unused memory
>>to be replaced with often-used memory.
> 
> 
>>For example, I have 57MB swapped right now. It allows me to instantly
>>grep the kernel tree. If I turned swap off, each grep would probably
>>take 30 seconds.
> 
> 
> Your analogy is flawed. There are many reasons why this doesn't work in the
> real world.
> 

It is not an analogy.

[snip]

I understand the basics of how Linux's memory management works.

> Your grep analogy incorrectly assumes that you have a bunch of vacant memory
> just waiting to store those filesystem pages, but that simply isn't the
> case. Rather 57MB of anonymous memory was evicted to make room for 57MB of
> anonymous or file system backed pages. Unless you have freed anonymous
> memory on the system by closing applications. Your physical memory pages are
> still mostly occupied. 
> 

Yes the 57MB of anonymous memory *was* evicted to make room for 57MB
of file system backed pages that grep pulled in presumably.

I tend to use grep rather often. I'm very glad that crud from mozilla,
XFree86, nautilus, gnome-settin, x-session-ma, etc has been paged out.
It allows me to grep the kernel source instantly.

> This means your grep is only going to run faster if you already read those
> files recently and they are already in the pagecache. You still have the
> burden of pushing pages that have not been used recently out of ram before
> you can read in the new ones. And as long as you are performing a sufficient
> amount of file system I/O, this is guaranteed to happen.
> 

What would you have it do? Push out pages that have been recently used?

> One thing that can be done to minimize the problem where heavy filesystem
> I/O flushes important pages from memory like pages from shared libraries and
> executables only for them to fault back in as soon as they become runnable,
> is to implement something similar to what Sun implemented in Solaris 8
> called the cyclical page cache. The idea is that the pagecache pages against
> itself and is actually considered free memory from an anonymous memory
> perspective. The pagecache is free to grow all it wants, but since it is
> counted as free memory, anonymous memory allocation will cause the pagecache
> to shrink because it is considered free memory.
>

"the pagecache pages against itself", what does that mean?

> As these pages are evicted from the pagecache, they are placed on the
> opposite side of the cachelist (linked list that stores pages that have a
> vnode+offset already) than the side where pages are being overwritten. This
> way frequently re-accessed pages that were placed on the cache list and were
> eligible to be reclaimed, are found when the next minor fault occurs for
> that vnode+offset and moved back to the opposite side of the list so that
> they are not evicted.
> 

I failed to grasp the mechanics of the cachelist and its opposite sides.
And why does one side have pages being overwritten? Sounds strange. But
I don't know Solaris.

Linux has an approximately-LRU ordered list. Newly accessed pages go in
the top and come out the bottom where they are reclaimed (or in the front
and out the back).
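
To make the "in the top, out the bottom" picture concrete, here is a toy
strict-LRU list. It is only a sketch: the kernel's lists are approximately
LRU and considerably more involved, and the class name `ToyLRU` is invented
for this example.

```python
from collections import OrderedDict


class ToyLRU:
    """Strict LRU over page names: oldest entry first, newest last."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.pages = OrderedDict()

    def touch(self, page):
        """Access a page; return the evicted page name, or None."""
        if page in self.pages:
            # Re-accessed page goes back to the "top" of the list.
            self.pages.move_to_end(page)
            return None
        self.pages[page] = True
        if len(self.pages) > self.capacity:
            # The least recently used page falls off the "bottom".
            victim, _ = self.pages.popitem(last=False)
            return victim
        return None


if __name__ == "__main__":
    lru = ToyLRU(capacity=2)
    for name in ("a", "b", "a", "c"):
        print(name, "-> evicted:", lru.touch(name))
```

In the example run, "b" is the page evicted when "c" arrives, because "a"
was touched more recently.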

> Since the cache list is counted as free memory, there is no way to wake up
> the LRU mechanism to scan physical memory until 1/64 of physical memory is
> consumed by anonymous memory.  
> 

That assumes that file backed cache is worth zero compared to
anonymous memory, which is not the case.

In Linux, we actually do the replacement in terms of mapped and
unmapped pages and bias replacement toward unmapped pages. We
will still evict long term inactive mapped pages though, which is
a good thing.
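
A deliberately simplified sketch of that bias, not the kernel's actual
algorithm: prefer unmapped pages, but still take mapped pages that have been
idle for a long time. The strict two-pass order, the `idle` counter and the
threshold are all assumptions made for the illustration.

```python
def pick_victims(pages, want, mapped_idle_threshold):
    """Choose up to `want` page names to reclaim.

    pages: list of (name, mapped, idle_ticks) tuples, oldest first.
    Unmapped pages are taken first; mapped pages only qualify once they
    have been idle for at least `mapped_idle_threshold` ticks.
    """
    victims = []
    # Pass 1: unmapped (plain pagecache) pages, oldest first.
    for name, mapped, idle in pages:
        if len(victims) == want:
            break
        if not mapped:
            victims.append(name)
    # Pass 2: long-idle mapped pages, if more are still needed.
    for name, mapped, idle in pages:
        if len(victims) == want:
            break
        if mapped and idle >= mapped_idle_threshold:
            victims.append(name)
    return victims


if __name__ == "__main__":
    pages = [
        ("libc", True, 5),
        ("iso-chunk", False, 1),
        ("old-daemon", True, 500),
        ("log", False, 2),
    ]
    print(pick_victims(pages, want=3, mapped_idle_threshold=100))
```

Here the recently used mapped page ("libc") survives, while the long-idle
mapped one ("old-daemon") is still eligible once the unmapped pages run out.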

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 10:33               ` Nick Piggin
@ 2004-05-26 10:58                 ` Matthias Schniedermeyer
  2004-05-26 11:19                   ` Nick Piggin
  0 siblings, 1 reply; 146+ messages in thread
From: Matthias Schniedermeyer @ 2004-05-26 10:58 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel

On Wed, May 26, 2004 at 08:33:28PM +1000, Nick Piggin wrote:
> Matthias Schniedermeyer wrote:
> >On Wed, May 26, 2004 at 07:48:10PM +1000, Nick Piggin wrote:
> >
> >>John Bradford wrote:
> >>
> >>>Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
> >>>
> >>>
> >>>>Even for systems that don't *need* the extra memory space, swap can
> >>>>actually provide performance improvements by allowing unused memory
> >>>>to be replaced with often-used memory.
> >>>
> >>>
> >>>That's true, but it's not a magical property of swap space - extra 
> >>>physical
> >>>RAM would do more or less the same thing.
> >>>
> >>
> >>Well it is a magical property of swap space, because extra RAM
> >>doesn't allow you to replace unused memory with often used memory.
> >>
> >>The theory holds true no matter how much RAM you have. Swap can
> >>improve performance. It can be trivially demonstrated.
> >
> >
> >The other way around can be "demonstrated" equally trivially.
> >
> >In my personal machine i have 3GB of RAM and i regularly create
> >DVD-ISO-Images (about 2 per day). After creating an image (reading up to
> >4,4GB and writing up to 4,4GB) the cache is 100% trashed(1). With swap
> >it would be even more trashed than it is without swap(1).
> >
> 
> I don't disagree that you could find a situation where swap
> is worse than no swap. I don't understand what you mean by
> trashed and more trashed though :)

trashed means "everything i need(tm)" is paged out (mozilla/konsole/xine
...)

with swap the data-part of running programs was swapped out, without
swap only the program-part is thrown out of memory as the data-part
can't be moved anywhere else.

I have a 10K RPM SCSI-HDD, I can hear what my system is doing. :-)

> Creating your ISOs makes your system swap a lot when swap
> is enabled?

Transferring up to 8,8GB tends to trash the cache.

> >1: This has "always(tm)" been so since i began burning DVDs 3 years ago.
> >Beginning from kernel 2.4.4-2.4.25 and 2.6.4-2.6.6. Currently i use 2.6.5. 
> >(This is no typo!)
> >
> >I have only tested the "with swap"-case with 2.4.4 as i didn't use swap
> >after 2.4.4 trashed so badly with swap enabled. But i don't think that
> >things have changed so fundamentally that the "with swap"-case is
> >better(FOR ME!) than the "without swap"-case.
> >
> 
> The 2.6 VM has changed pretty fundamentally. It would be good
> if you could retest.





See you

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 10:37         ` Nick Piggin
@ 2004-05-26 10:48           ` John Bradford
  0 siblings, 0 replies; 146+ messages in thread
From: John Bradford @ 2004-05-26 10:48 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Roger Luethi, Anthony DiSante, linux-kernel

Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
> John Bradford wrote:
> > Quote from Roger Luethi <rl@hellgate.ch>:
> > 
> >>On Wed, 26 May 2004 10:23:32 +0100, John Bradford wrote:
> >>
> >>>A run-away process on a server with too much swap can cause it to grind to
> >>>almost a complete halt, and become almost completely unresponsive to remote
> >>>connections.
> >>>
> >>>If the total amount of storage is just enough for the tasks the server is
> >>>expected to deal with, then a run-away process will likely be terminated
> >>>quickly stopping it from causing the machine to grind to a halt.
> >>
> >>I'm not sure your optimism about the correct (run-away) process being
> >>terminated is justified. Granted, there are definitely scenarios
> >>where swapless operation is preferable, but in most circumstances --
> >>especially workstations as the original poster described -- I'd rather
> >>minimize the risk of losing data.
> > 
> > 
> > Well, I am basing this on experience.  I know an ISP who had their main
> > customer webserver down for hours because of this kind of problem - the whole
> > thing created a lot of work and wasted a lot of time.
> > 
> > In this particular scenario, I think the run-away process was probably using
> > up almost two thirds of the total RAM, so I'm pretty confident the correct
> > process would have been terminated.
> > 
> 
> I think this is somewhat orthogonal to whether swap should be
> used or not.
> 
> What we should be doing here is enforcing the RSS rlimit. I
> have a patch from Rik to do this which needs to be merged.
> 
> Hopefully this would give you the best case situation of
> having only the runaway process really slow down, without
> killing anything until the admin arrives.

Ideally, yes - by the way, this was some time ago, (I think the machine was
running a 2.2 kernel).

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:48           ` Nick Piggin
  2004-05-26 10:10             ` Matthias Schniedermeyer
@ 2004-05-26 10:46             ` John Bradford
  2004-05-26 11:46             ` Buddy Lumpkin
  2 siblings, 0 replies; 146+ messages in thread
From: John Bradford @ 2004-05-26 10:46 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Buddy Lumpkin, 'William Lee Irwin III', orders, linux-kernel

Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
> John Bradford wrote:
> > Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
> > 
> >>Even for systems that don't *need* the extra memory space, swap can
> >>actually provide performance improvements by allowing unused memory
> >>to be replaced with often-used memory.
> > 
> > 
> > That's true, but it's not a magical property of swap space - extra physical
> > RAM would do more or less the same thing.
> > 
> 
> Well it is a magical property of swap space, because extra RAM
> doesn't allow you to replace unused memory with often used memory.

Strictly speaking no, but instead of replacing unused memory with often used
memory, the often used memory has its own silicon, so the unused memory can
stay paged in as well.

Or to put it another way, however much swap a machine has, installing that
much extra physical RAM, and removing the swap space will almost never cause
a loss in performance.  There are some theoretical cases where it would,
such as where adding extra physical RAM requires the use of different memory
addressing schemes, and some data which would have, by chance, resided in
more quickly accessible RAM before the upgrade no longer does, but those
scenarios are not really anything to do with the original discussion.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 10:10             ` Matthias Schniedermeyer
  2004-05-26 10:33               ` Nick Piggin
@ 2004-05-26 10:45               ` Martin Olsson
  2004-05-26 11:25                 ` Nick Piggin
  2004-05-26 16:33                 ` David Schwartz
  1 sibling, 2 replies; 146+ messages in thread
From: Martin Olsson @ 2004-05-26 10:45 UTC (permalink / raw)
  To: linux-kernel

Hi Linux-gurus,

I agree with Anthony DiSante: maybe not all users want swapping. I have 
myself been quite annoyed by swapping lately, but I've not yet tried to 
disable it.

In school I've studied the swapping concept from a theoretical point
of view, and I fully understand the fact that swapping, if used 
properly, can both increase performance and provide a safe way to get 
out of a bad situation when the box runs out of memory. The problem is 
that in reality this does not work, not on Linux nor on Windows 2000 
which I use at home. Unfortunately I cannot provide a specific reason 
why it does not work; I'm very much an end-user/desktop-user, not a 
kernel hacker (yet). But I see two things that need improvement atm:

A) when I do large data processing operations the computer is always 
very very slow afterwards

B) if I have X MB of RAM then there should imho not be a single swap 
read/write until all of my X MB of RAM is completely full; is this 
so today?
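
One rough way to check point B on a running system is to compare the
free-memory and swap counters the kernel exports in /proc/meminfo. The
sketch below only parses text in that format. The sample figures are
invented for illustration, and the helper names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_in_use_kb(fields):
    """Swap currently occupied, in kB."""
    return fields["SwapTotal"] - fields["SwapFree"]


# Invented sample values, not measurements from any machine in this thread.
SAMPLE = """\
MemTotal:       515000 kB
MemFree:         12000 kB
Cached:         260000 kB
SwapTotal:      262000 kB
SwapFree:       205000 kB
"""

info = parse_meminfo(SAMPLE)
print(swap_in_use_kb(info), info["Cached"])
```

On a live Linux box, passing open("/proc/meminfo").read() instead of
SAMPLE shows whether swap is occupied while a large share of RAM is
still page cache.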

---

Also, imagine that I disable swap today and start a large data 
processing operation. During this operation I try to start a new 
process, here ideally the program should not OOM but instead the memory 
allocated for the data processing operation should be decreased. Is this 
possible using today's technology? Can we divide memory into two sorts, 
one for processes (here-to-stay memory) and another sort for batch 
operations (where the amount of memory does not really matter but less 
memory means less performance)? I see the problem with "taking memory 
back" though; I guess it's impossible.



Regards,
/m

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  8:30     ` Buddy Lumpkin
                         ` (2 preceding siblings ...)
  2004-05-26 10:41       ` Denis Vlasenko
@ 2004-05-26 10:44       ` Denis Vlasenko
  2004-05-26 11:49         ` Buddy Lumpkin
  2004-05-26 12:19       ` Rik van Riel
  4 siblings, 1 reply; 146+ messages in thread
From: Denis Vlasenko @ 2004-05-26 10:44 UTC (permalink / raw)
  To: Buddy Lumpkin, 'William Lee Irwin III'; +Cc: orders, linux-kernel

On Wednesday 26 May 2004 11:30, Buddy Lumpkin wrote:
> As for your short, two sentence comment below, let me save you the energy
> of insinuations and translate your message the way I read it:
> 
> -------------------------------------------------------------------------
> I don't recognize your name, therefore you can't possibly have a valuable
> opinion on the direction VM system development should go. I doubt you have
> an actual performance problem to share, but if you do, please share it and
> go away so that we can work on solving the problem.
> --------------------------------------------------------------------------

He was asking for a proper bug report.

Preparing bug report:
=====================
How To Ask Questions The Smart Way:
    http://www.catb.org/~esr/faqs/smart-questions.html
        Before asking a technical question by email, or in
        a newsgroup, or on a website chat board, do the following:
        * Try to find an answer by searching the Web.
        * Try to find an answer by reading the manual.
        * Try to find an answer by reading a FAQ.
        * Try to find an answer by inspection or experimentation.
        * Try to find an answer by reading the source code.
How to Report Bugs Effectively:
    http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
        Anybody who has written software for public use will
        probably have received at least one bad bug report.
        Reports that say nothing ("It doesn't work!");
        reports that make no sense; reports that don't give
        enough information; reports that give wrong information.
Compile problems: report GCC output and result of
        "grep '^CONFIG_' .config"
Oops: decode it with ksymoops (or use 2.6 with kksymoops enabled ;).
Unkillable process: Alt-SysRq-T and ksymoops relevant part.
Yes it means you should have ksymoops installed and tested,
which is easy to get wrong. I've done that too often.
-- 
vda

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  8:30     ` Buddy Lumpkin
  2004-05-26  8:44       ` Nick Piggin
  2004-05-26  9:09       ` William Lee Irwin III
@ 2004-05-26 10:41       ` Denis Vlasenko
  2004-05-26 12:07         ` Buddy Lumpkin
  2004-05-26 12:30         ` Rik van Riel
  2004-05-26 10:44       ` Denis Vlasenko
  2004-05-26 12:19       ` Rik van Riel
  4 siblings, 2 replies; 146+ messages in thread
From: Denis Vlasenko @ 2004-05-26 10:41 UTC (permalink / raw)
  To: Buddy Lumpkin, 'William Lee Irwin III'; +Cc: orders, linux-kernel

On Wednesday 26 May 2004 11:30, Buddy Lumpkin wrote:
> I have worked at large fortune 500 companies with deep pockets though, so
> this may not be the case for many. I make this point though because I think
> if it isn't the case yet, it will be in the near future as memory becomes
> even cheaper because the trend certainly exists.

"640k will be enough for anyone" ?

No. Unfortunately, userspace programs grow in size as fast
as your RAM. Because typically developers do not think
about size of their program until it starts to outgrow
their RAM.

Today, 128M of RAM without swap is barely enough to run the full
spectrum of apps. OpenOffice and Mozilla "lead" the pack,
followed by KDE/Gnome etc.
-- 
vda

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  8:44       ` Nick Piggin
  2004-05-26  9:34         ` John Bradford
  2004-05-26  9:42         ` Anthony DiSante
@ 2004-05-26 10:40         ` Buddy Lumpkin
  2004-05-26 13:15           ` Helge Hafting
  2 siblings, 1 reply; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26 10:40 UTC (permalink / raw)
  To: 'Nick Piggin'
  Cc: 'William Lee Irwin III', orders, linux-kernel


> Hi Buddy,
> Even for systems that don't *need* the extra memory space, swap can
> actually provide performance improvements by allowing unused memory
> to be replaced with often-used memory.

> For example, I have 57MB swapped right now. It allows me to instantly
> grep the kernel tree. If I turned swap off, each grep would probably
> take 30 seconds.

Your analogy is flawed. There are many reasons why this doesn't work in the
real world.

I don't think any modern and popular OS contains mechanisms that silently
stage old pages to disk. The constant twitching of the hard drive this
causes for no apparent reason drives people insane and drains precious
battery life on laptops. (see description for the pages_min, pages_low and
pages_high watermarks for clarity)
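
For readers unfamiliar with those watermarks, the usual description of
the 2.6 behaviour is roughly this: kswapd is woken once a zone's free
pages drop below pages_low and keeps reclaiming until pages_high is
reached, while an allocation that finds free pages below pages_min has to
reclaim directly. A toy model under that assumption (the function name
and the numbers are hypothetical, and this is not kernel code):

```python
def reclaim_action(free_pages, pages_min, pages_low, pages_high,
                   kswapd_running=False):
    """Return the reclaim activity a zone would be in (simplified model)."""
    if free_pages < pages_min:
        return "direct reclaim"    # the allocating task must free pages itself
    if free_pages < pages_low:
        return "kswapd reclaim"    # background reclaim gets woken up
    if kswapd_running and free_pages < pages_high:
        return "kswapd reclaim"    # once started, continue up to pages_high
    return "idle"                  # no shortfall, so nothing is evicted


print(reclaim_action(1000, 100, 200, 300))
print(reclaim_action(150, 100, 200, 300))
```

In this model nothing is written out while free memory stays above
pages_low, which is the point being made above: eviction is driven by a
shortfall, not by a background timer.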

Pages are evicted from memory due to a memory shortfall, plain and simple.
If you're actually benefiting from the 57MB of anonymous memory that was
evicted during a memory shortfall on your system then you're in the unique
position of not needing to do any more filesystem I/O, or allocate any
more anonymous memory space.

The fact is, if you're doing filesystem I/O, you will eventually exhaust all
available physical memory on the system. At that point, you have to evict
pages before you can read or write another page to/from the filesystem. The
page replacement algorithms being somewhat LRU based make this better than
FILO, but only as long as they don't get too clever and kill the corner
cases due to complexity.

Your grep analogy incorrectly assumes that you have a bunch of vacant memory
just waiting to store those filesystem pages, but that simply isn't the
case. Rather, 57MB of anonymous memory was evicted to make room for 57MB of
anonymous or file system backed pages. Unless you have freed anonymous
memory on the system by closing applications, your physical memory pages are
still mostly occupied.

This means your grep is only going to run faster if you already read those
files recently and they are already in the pagecache. You still have the
burden of pushing pages that have not been used recently out of ram before
you can read in the new ones. And as long as you are performing a sufficient
amount of file system I/O, this is guaranteed to happen.

One thing that can be done to minimize the problem where heavy filesystem
I/O flushes important pages from memory like pages from shared libraries and
executables only for them to fault back in as soon as they become runnable,
is to implement something similar to what Sun implemented in Solaris 8
called the cyclical page cache. The idea is that the pagecache pages only
against itself and is counted as free memory from an anonymous memory
perspective. The pagecache is free to grow all it wants, but since it is
counted as free memory, anonymous memory allocations simply cause it to
shrink.

As these pages are evicted from the pagecache, they are placed on the
opposite side of the cachelist (linked list that stores pages that have a
vnode+offset already) than the side where pages are being overwritten. This
way frequently re-accessed pages that were placed on the cache list and were
eligible to be reclaimed, are found when the next minor fault occurs for
that vnode+offset and moved back to the opposite side of the list so that
they are not evicted.

Since the cache list is counted as free memory, there is no way to wake up
the LRU mechanism to scan physical memory until 1/64 of physical memory is
consumed by anonymous memory.  
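
The cache list behaviour described above can be sketched as a small toy
model. This is only an illustration of the idea, not Solaris code: the
class and method names are hypothetical, and a real implementation deals
with locking, page state and much more.

```python
from collections import OrderedDict


class CacheList:
    """Toy cache list: freed file pages stay findable by (vnode, offset)."""

    def __init__(self):
        self._pages = OrderedDict()  # first item = next page to be overwritten

    def release(self, vnode, offset, data):
        """A page leaving the page cache is queued at the far end of the list."""
        self._pages.pop((vnode, offset), None)
        self._pages[(vnode, offset)] = data

    def minor_fault(self, vnode, offset):
        """Re-access: move the page back to the far end so it survives longer."""
        key = (vnode, offset)
        if key not in self._pages:
            return None              # already overwritten: needs a disk read
        self._pages.move_to_end(key)
        return self._pages[key]

    def allocate(self):
        """A new allocation overwrites the page at the near end, if any."""
        if not self._pages:
            return None
        key, _ = self._pages.popitem(last=False)
        return key


cl = CacheList()
for off in (0, 4096, 8192):
    cl.release("libc.so", off, b"...")
cl.minor_fault("libc.so", 0)   # frequently used page is rescued
print(cl.allocate())           # the oldest page not re-accessed goes first
```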

> The VM doesn't always get it right, and to make matters worse, desktop
> users don't appreciate their long running jobs finishing earlier, but
> *hate* having to wait a few seconds for a window to appear if it hasn't
> been used for 24 hours.

Again, if you have had enough file system I/O during that time, it would
eventually cause pages from your application to be paged to the swap device
as the processes that represent your window slept.

--Buddy


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 10:35       ` John Bradford
@ 2004-05-26 10:37         ` Nick Piggin
  2004-05-26 10:48           ` John Bradford
  0 siblings, 1 reply; 146+ messages in thread
From: Nick Piggin @ 2004-05-26 10:37 UTC (permalink / raw)
  To: John Bradford; +Cc: Roger Luethi, Anthony DiSante, linux-kernel

John Bradford wrote:
> Quote from Roger Luethi <rl@hellgate.ch>:
> 
>>On Wed, 26 May 2004 10:23:32 +0100, John Bradford wrote:
>>
>>>A run-away process on a server with too much swap can cause it to grind to
>>>almost a complete halt, and become almost completely unresponsive to remote
>>>connections.
>>>
>>>If the total amount of storage is just enough for the tasks the server is
>>>expected to deal with, then a run-away process will likely be terminated
>>>quickly stopping it from causing the machine to grind to a halt.
>>
>>I'm not sure your optimism about the correct (run-away) process being
>>terminated is justified. Granted, there are definitely scenarios
>>where swapless operation is preferable, but in most circumstances --
>>especially workstations as the original poster described -- I'd rather
>>minimize the risk of losing data.
> 
> 
> Well, I am basing this on experience.  I know an ISP who had their main
> customer webserver down for hours because of this kind of problem - the whole
> thing created a lot of work and wasted a lot of time.
> 
> In this particular scenario, I think the run-away process was probably using
> up almost two thirds of the total RAM, so I'm pretty confident the correct
> process would have been terminated.
> 

I think this is somewhat orthogonal to whether swap should be
used or not.

What we should be doing here is enforcing the RSS rlimit. I
have a patch from Rik to do this which needs to be merged.

Hopefully this would give you the best case situation of
having only the runaway process really slow down, without
killing anything until the admin arrives.
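
As a rough illustration of what an RSS rlimit looks like from userspace
(this is not Rik's patch, and the helper name is made up): a process can
already record a soft RSS limit with setrlimit(). The call below only
stores the limit; whether the kernel then acts on it is exactly the
enforcement question raised above.

```python
import resource


def cap_rss(soft_bytes):
    """Lower this process's soft RSS limit, never above the hard limit."""
    _, hard = resource.getrlimit(resource.RLIMIT_RSS)
    if hard != resource.RLIM_INFINITY:
        soft_bytes = min(soft_bytes, hard)
    resource.setrlimit(resource.RLIMIT_RSS, (soft_bytes, hard))
    # Read the value back so the caller sees what was actually recorded.
    return resource.getrlimit(resource.RLIMIT_RSS)[0]


print(cap_rss(256 * 1024 * 1024))
```

A runaway process started under such a limit would, with enforcement in
place, be the one that gets slowed down rather than the whole machine.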

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:30     ` Roger Luethi
@ 2004-05-26 10:35       ` John Bradford
  2004-05-26 10:37         ` Nick Piggin
  0 siblings, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-05-26 10:35 UTC (permalink / raw)
  To: Roger Luethi; +Cc: Anthony DiSante, linux-kernel

Quote from Roger Luethi <rl@hellgate.ch>:
> On Wed, 26 May 2004 10:23:32 +0100, John Bradford wrote:
> > A run-away process on a server with too much swap can cause it to grind to
> > almost a complete halt, and become almost completely unresponsive to remote
> > connections.
> > 
> > If the total amount of storage is just enough for the tasks the server is
> > expected to deal with, then a run-away process will likely be terminated
> > quickly stopping it from causing the machine to grind to a halt.
> 
> I'm not sure your optimism about the correct (run-away) process being
> terminated is justified. Granted, there are definitely scenarios
> where swapless operation is preferable, but in most circumstances --
> especially workstations as the original poster described -- I'd rather
> minimize the risk of losing data.

Well, I am basing this on experience.  I know an ISP who had their main
customer webserver down for hours because of this kind of problem - the whole
thing created a lot of work and wasted a lot of time.

In this particular scenario, I think the run-away process was probably using
up almost two thirds of the total RAM, so I'm pretty confident the correct
process would have been terminated.

I know that trusting the kernel to terminate the correct run-away process
might seem like a bit of a risky approach to take with respect to
losing data, especially where a little bit of swap space might come to the
rescue.

However, in my opinion, if a machine has insufficient storage for the intended
task then that's an error condition straight away.  So, I am not really
concerned with trying to make sure that a desktop system running an
application which the user has underestimated the memory usage of doesn't crash
no matter what.  The machine is operating in an error condition, so data loss
should be expected.

No, I am more concerned about preventing unexpected usage of a machine from
causing large scale slowdowns, and unavailability to other users.

For example, if a run-away process occurs, or one user on a multi-user system
uses up excessive resources.

Excessive swap space might create an illusion of protecting against data loss,
by allowing things to continue working no matter what, just a bit slower, but
for multi user systems, it's preventing normal usage of the system.  This can
indirectly lead to data loss if the machine is not accessible over the network
to perform a critical function.

Ultimately, once a machine is spending 99% of its time swapping, it's likely
to be well past the point where it's practical to log in remotely and fix it.

However, I think that there are probably more machines using excessive swap
which would benefit from reducing it, than the other way round, though, simply
because users are not as aware of the potential problems.

My opinion was that the machine was already in an error condition the minute
I couldn't access it remotely - a significant number of customers' webpages
were inaccessible, which potentially means lost business for them.

I assume that the scenario you were thinking of when you mentioned data loss
above was a system running a critical process which is using, for example,
90% of the available storage - in that case if another process starts up, and
uses up the rest of the available storage, then the first process will probably
be terminated, whereas if you increase the amount of storage, (either by adding
swap or physical RAM), then the second process can continue for longer.

However, in this situation, I can see two possibilities - either the second
process is genuine, (I.E. not a run-away process), in which case the machine
has insufficient storage for its intended purpose, which is an operator error
in my opinion, or the process is a run-away process, in which case a little
extra storage isn't going to do much other than buy time before the first
process is terminated.  This may give an operator a chance to log in and fix
the problem, (probably by terminating the run-away process), but if this extra
storage is swap space, the machine may well become unresponsive very quickly,
making it virtually impossible to log in remotely, and making other network
services on that machine virtually inaccessible.  Eventually the run-away
process may use up the swap space, and then the first process will probably
be terminated as before, just not as quickly.  If instead of a little extra
storage, a lot was added such that the first process was no longer using more
storage than the run-away process was when storage was full, then the kernel
will hopefully terminate the run-away process, but probably only after the
machine has been unresponsive for a long time, possibly causing other problems.
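
The three cases above can be put into numbers with a toy model. It
assumes, purely for illustration, that the kernel terminates whichever
process is using the most storage when storage runs out. The real
selection heuristics are more involved, and all names and figures here
are hypothetical.

```python
def first_victim(usage_mb):
    """Pick the process using the most storage (stand-in for OOM selection)."""
    return max(usage_mb, key=usage_mb.get)


def simulate(total_mb, critical_mb, runaway_step_mb=10):
    """Grow a run-away process until storage is full, then pick a victim."""
    usage = {"critical": critical_mb, "runaway": 0}
    while sum(usage.values()) + runaway_step_mb <= total_mb:
        usage["runaway"] += runaway_step_mb
    return first_victim(usage), usage["runaway"]


# Critical process at 900 MB; total storage is RAM plus swap.
for total in (1000, 1100, 2000):
    print(total, simulate(total, 900))
```

With 1000 MB or 1100 MB in total the critical process is still the larger
one when storage fills up, so the extra 100 MB only buys time. Only with
much more storage does the run-away process grow past it and become the
victim, and if that storage is swap the machine has been thrashing for
the whole of that growth.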

Basically, I would be skeptical about using a desktop system where one
terminated process could cause data loss to the extent that I couldn't easily
restore the data.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26 10:10             ` Matthias Schniedermeyer
@ 2004-05-26 10:33               ` Nick Piggin
  2004-05-26 10:58                 ` Matthias Schniedermeyer
  2004-05-26 10:45               ` Martin Olsson
  1 sibling, 1 reply; 146+ messages in thread
From: Nick Piggin @ 2004-05-26 10:33 UTC (permalink / raw)
  To: Matthias Schniedermeyer; +Cc: linux-kernel

Matthias Schniedermeyer wrote:
> On Wed, May 26, 2004 at 07:48:10PM +1000, Nick Piggin wrote:
> 
>>John Bradford wrote:
>>
>>>Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
>>>
>>>
>>>>Even for systems that don't *need* the extra memory space, swap can
>>>>actually provide performance improvements by allowing unused memory
>>>>to be replaced with often-used memory.
>>>
>>>
>>>That's true, but it's not a magical property of swap space - extra physical
>>>RAM would do more or less the same thing.
>>>
>>
>>Well it is a magical property of swap space, because extra RAM
>>doesn't allow you to replace unused memory with often used memory.
>>
>>The theory holds true no matter how much RAM you have. Swap can
>>improve performance. It can be trivially demonstrated.
> 
> 
> The other way around can be "demonstrated" equally trivially.
> 
> In my personal machine i have 3GB of RAM and i regularly create
> DVD-ISO-Images (about 2 per day). After creating an image (reading up to
> 4,4GB and writing up to 4,4GB) the cache is 100% trashed(1). With swap
> it would be even more trashed than it is without swap(1).
> 

I don't disagree that you could find a situation where swap
is worse than no swap. I don't understand what you mean by
trashed and more trashed though :)

Creating your ISOs makes your system swap a lot when swap
is enabled?

> 
> 
> 
> 1: This has "always(tm)" been so since i began burning DVDs 3 years ago.
> Beginning from kernel 2.4.4-2.4.25 and 2.6.4-2.6.6. Currently i use 2.6.5. (This is no typo!)
> 
> I have only tested the "with swap"-case with 2.4.4 as i didn't use swap
> after 2.4.4 trashed so badly with swap enabled. But i don't think that
> things have changed so fundamentally that the "with swap"-case is
> better(FOR ME!) than the "without swap"-case.
> 

The 2.6 VM has changed pretty fundamentally. It would be good
if you could retest.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:48           ` Nick Piggin
@ 2004-05-26 10:10             ` Matthias Schniedermeyer
  2004-05-26 10:33               ` Nick Piggin
  2004-05-26 10:45               ` Martin Olsson
  2004-05-26 10:46             ` John Bradford
  2004-05-26 11:46             ` Buddy Lumpkin
  2 siblings, 2 replies; 146+ messages in thread
From: Matthias Schniedermeyer @ 2004-05-26 10:10 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel

On Wed, May 26, 2004 at 07:48:10PM +1000, Nick Piggin wrote:
> John Bradford wrote:
> >Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
> >
> >>Even for systems that don't *need* the extra memory space, swap can
> >>actually provide performance improvements by allowing unused memory
> >>to be replaced with often-used memory.
> >
> >
> >That's true, but it's not a magical property of swap space - extra physical
> >RAM would do more or less the same thing.
> >
> 
> Well it is a magical property of swap space, because extra RAM
> doesn't allow you to replace unused memory with often used memory.
> 
> The theory holds true no matter how much RAM you have. Swap can
> improve performance. It can be trivially demonstrated.

The other way around can be "demonstrated" equally trivially.

In my personal machine i have 3GB of RAM and i regularly create
DVD-ISO-Images (about 2 per day). After creating an image (reading up to
4,4GB and writing up to 4,4GB) the cache is 100% trashed(1). With swap
it would be even more trashed than it is without swap(1).




1: This has "always(tm)" been so since i began burning DVDs 3 years ago.
Beginning from kernel 2.4.4-2.4.25 and 2.6.4-2.6.6. Currently i use 2.6.5. (This is no typo!)

I have only tested the "with swap"-case with 2.4.4 as i didn't use swap
after 2.4.4 trashed so badly with swap enabled. But i don't think that
things have changed so fundamentally that the "with swap"-case is
better(FOR ME!) than the "without swap"-case.


See you

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  6:38 Anthony DiSante
                   ` (4 preceding siblings ...)
  2004-05-26  9:06 ` John Bradford
@ 2004-05-26 10:02 ` Raphael Jacquot
  2004-05-26 13:00 ` Satoshi Oshima
  6 siblings, 0 replies; 146+ messages in thread
From: Raphael Jacquot @ 2004-05-26 10:02 UTC (permalink / raw)
  To: linux-kernel

Anthony DiSante wrote:
> Or, to make it more appealing, say I initially had 512MB ram and now I 
> have 1GB.  Wouldn't I much rather not use swap at all anymore, in this 
> case, on my desktop?

I do that on embedded systems: no swap. When there's no more memory, the 
OOM killer kicks in and removes a few extraneous processes...

> -Anthony
> http://nodivisions.com/


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:42         ` Anthony DiSante
@ 2004-05-26  9:58           ` Nick Piggin
  2004-05-26 20:11             ` Wakko Warner
  0 siblings, 1 reply; 146+ messages in thread
From: Nick Piggin @ 2004-05-26  9:58 UTC (permalink / raw)
  To: orders; +Cc: linux-kernel

Anthony DiSante wrote:
> Nick Piggin wrote:
> 
>> The VM doesn't always get it right, and to make matters worse, desktop
>> users don't appreciate their long running jobs finishing earlier, but
>> *hate* having to wait a few seconds for a window to appear if it hasn't
>> been used for 24 hours.
> 
> 
> Come on, that is quite an exaggeration.  It can happen in a span of 
> minutes -- after rsyncing a dir to a backup dir, for example, which 
> fills ram rather quickly with cache I'll never use again.  Or after 
> configuring and compiling a package, which does the same thing.
> 

rsync is something known to break the VM's use-once heuristics.
I'm looking at that.

> As you said, the VM doesn't, in fact, always get it right.  If 512MB 
> worked before when it was half swap, 512MB of pure ram will work too, 
> only faster.  I don't see how adding more swap at that point could 
> increase performance unless you are keeping your ram full of non-cached 
> pages, and that's never the case for me -- my ram is almost always half 
> cached pages.
> 

It can.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:34         ` John Bradford
@ 2004-05-26  9:48           ` Nick Piggin
  2004-05-26 10:10             ` Matthias Schniedermeyer
                               ` (2 more replies)
  2004-05-26 11:39           ` Buddy Lumpkin
  1 sibling, 3 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-26  9:48 UTC (permalink / raw)
  To: John Bradford
  Cc: Buddy Lumpkin, 'William Lee Irwin III', orders, linux-kernel

John Bradford wrote:
> Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
> 
>>Even for systems that don't *need* the extra memory space, swap can
>>actually provide performance improvements by allowing unused memory
>>to be replaced with often-used memory.
> 
> 
> That's true, but it's not a magical property of swap space - extra physical
> RAM would do more or less the same thing.
> 

Well it is a magical property of swap space, because extra RAM
doesn't allow you to replace unused memory with often used memory.

The theory holds true no matter how much RAM you have. Swap can
improve performance. It can be trivially demonstrated.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  8:44       ` Nick Piggin
  2004-05-26  9:34         ` John Bradford
@ 2004-05-26  9:42         ` Anthony DiSante
  2004-05-26  9:58           ` Nick Piggin
  2004-05-26 10:40         ` Buddy Lumpkin
  2 siblings, 1 reply; 146+ messages in thread
From: Anthony DiSante @ 2004-05-26  9:42 UTC (permalink / raw)
  To: linux-kernel

Nick Piggin wrote:
> The VM doesn't always get it right, and to make matters worse, desktop
> users don't appreciate their long running jobs finishing earlier, but
> *hate* having to wait a few seconds for a window to appear if it hasn't
> been used for 24 hours.

Come on, that is quite an exaggeration.  It can happen in a span of minutes 
-- after rsyncing a dir to a backup dir, for example, which fills ram rather 
quickly with cache I'll never use again.  Or after configuring and compiling 
a package, which does the same thing.

As you said, the VM doesn't, in fact, always get it right.  If 512MB worked 
before when it was half swap, 512MB of pure ram will work too, only faster. 
  I don't see how adding more swap at that point could increase performance 
unless you are keeping your ram full of non-cached pages, and that's never 
the case for me -- my ram is almost always half cached pages.

-Anthony
http://nodivisions.com/

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:00 ` Helge Hafting
@ 2004-05-26  9:40   ` John Bradford
  2004-05-26 13:06     ` Helge Hafting
  0 siblings, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-05-26  9:40 UTC (permalink / raw)
  To: Helge Hafting, orders; +Cc: linux-kernel

Quote from Helge Hafting <helgehaf@aitel.hist.no>:
> Anthony DiSante wrote:
> 
> > As a general question about ram/swap and relating to some of the 
> > issues in this thread:
> >
> >     ~500 megs cached yet 2.6.5 goes into swap hell
> >
> > Consider this: I have a desktop system with 256MB ram, so I make a 
> > 256MB swap partition.  So I have 512MB "memory" and if some process 
> > wants more, too bad, there is no more.
> >
> > Now I buy another 256MB of ram, so I have 512MB of real memory.  Why 
> > not just disable my swap completely now?  I won't have increased my 
> > memory's size at all, but won't I have increased its performance lots? 
> 
> This is correct. You now have 512M of fast memory instead of
> 256M fast memory and 256M "slow" memory. You don't _need_ to have additional
> swap, but it is usually a good idea.  If you keep your 256M of swap,
> then you now have 512M fast memory + 256M slow memory for a total of
> 768M.  This is even better.

I strongly disagree on the last point.  It may be better, but it may also
be a lot worse.  Too much swap can be a bad thing - see my example in another
post about run-away processes on remote machines.

> Please note that  your machine _will_ do one kind of swapping even if you
> don't configure any swap: Executable files are a kind of swap-files,
> if memory pressure happens then (part of) your programs will be evicted
> from memory _because_ they can be reloaded from their executables.
> 
> This causes the same sort of performance degradation as swapping to
> a swap partition.  Actually, it is worse because swapping to a swap
> partition allows swapping out little-used writeable memory before discarding
> program code that might see more use.  So if swapping happens, then
> you're better off with a swap partition because then it is the least used
> stuff that goes first. Without a swap partition, the least used program code
> goes, but it may or may not be the least used memory overall.

Again, the user _may_ be better off swapping to a swap partition rather than
having executable code paged out, but this is not necessarily true in all
circumstances.
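
The point under discussion can be made concrete with a toy model. It
assumes that clean file-backed pages (such as program code) can always be
dropped, that anonymous pages can only be evicted when swap exists, and
that the longest-idle evictable page goes first. The page names and idle
times are invented.

```python
from collections import namedtuple

# kind is "file" (re-readable from disk) or "anon" (needs swap to evict)
Page = namedtuple("Page", "name kind idle_seconds")


def pick_victim(pages, swap_enabled):
    """Evict the longest-idle page among those that can be evicted at all."""
    candidates = [p for p in pages if p.kind == "file" or swap_enabled]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.idle_seconds).name


PAGES = [
    Page("editor code", "file", 30),
    Page("idle daemon heap", "anon", 86400),
    Page("shell code", "file", 600),
]

print(pick_victim(PAGES, swap_enabled=True))
print(pick_victim(PAGES, swap_enabled=False))
```

With swap, the day-old daemon heap goes first; without it, program code
that was used ten minutes ago is dropped instead. Whether that matters in
practice depends on the workload, which is the caveat made above.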

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  8:44       ` Nick Piggin
@ 2004-05-26  9:34         ` John Bradford
  2004-05-26  9:48           ` Nick Piggin
  2004-05-26 11:39           ` Buddy Lumpkin
  2004-05-26  9:42         ` Anthony DiSante
  2004-05-26 10:40         ` Buddy Lumpkin
  2 siblings, 2 replies; 146+ messages in thread
From: John Bradford @ 2004-05-26  9:34 UTC (permalink / raw)
  To: Nick Piggin, Buddy Lumpkin
  Cc: 'William Lee Irwin III', orders, linux-kernel

Quote from Nick Piggin <nickpiggin@yahoo.com.au>:
> Even for systems that don't *need* the extra memory space, swap can
> actually provide performance improvements by allowing unused memory
> to be replaced with often-used memory.

That's true, but it's not a magical property of swap space - extra physical
RAM would do more or less the same thing.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  9:23   ` John Bradford
@ 2004-05-26  9:30     ` Roger Luethi
  2004-05-26 10:35       ` John Bradford
  2004-05-26 13:01     ` Helge Hafting
  1 sibling, 1 reply; 146+ messages in thread
From: Roger Luethi @ 2004-05-26  9:30 UTC (permalink / raw)
  To: John Bradford; +Cc: Anthony DiSante, linux-kernel

On Wed, 26 May 2004 10:23:32 +0100, John Bradford wrote:
> A run-away process on a server with too much swap can cause it to grind to
> almost a complete halt, and become almost completely unresponsive to remote
> connections.
> 
> If the total amount of storage is just enough for the tasks the server is
> expected to deal with, then a run-away process will likely be terminated
> quickly stopping it from causing the machine to grind to a halt.

I'm not sure your optimism about the correct (run-away) process being
terminated is justified. Granted, there are definitely scenarios
where swapless operation is preferable, but in most circumstances --
especially workstations as the original poster described -- I'd rather
minimize the risk of losing data.

Roger

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  8:27 ` Roger Luethi
@ 2004-05-26  9:23   ` John Bradford
  2004-05-26  9:30     ` Roger Luethi
  2004-05-26 13:01     ` Helge Hafting
  0 siblings, 2 replies; 146+ messages in thread
From: John Bradford @ 2004-05-26  9:23 UTC (permalink / raw)
  To: Roger Luethi, Anthony DiSante; +Cc: linux-kernel

Quote from Roger Luethi <rl@hellgate.ch>:
> On Wed, 26 May 2004 02:38:23 -0400, Anthony DiSante wrote:
> > Now I buy another 256MB of ram, so I have 512MB of real memory.  Why not 
> > just disable my swap completely now?  I won't have increased my memory's 
> > size at all, but won't I have increased its performance lots?
> > 
> > Or, to make it more appealing, say I initially had 512MB ram and now I have 
> > 1GB.  Wouldn't I much rather not use swap at all anymore, in this case, on 
> > my desktop?
> 
> Swap serves another (often underrated) purpose: Graceful degradation.
> 
> If you have a reasonable amount of swap space mounted, you will know
> you are running out of RAM because your system will become noticeably
> slower. If you have no swap whatsoever, your first warning will quite
> possibly be an application OOM killed or losing data due to a failed
> memory allocation.
> 
> Think of the slowness of swap as a _feature_.

There is a very negative side to this approach as well, especially if users
allocate excessive swap space.

A run-away process on a server with too much swap can cause it to grind to
almost a complete halt, and become almost completely unresponsive to remote
connections.

If the total amount of storage is just enough for the tasks the server is
expected to deal with, then a run-away process will likely be terminated
quickly stopping it from causing the machine to grind to a halt.

If, on the other hand, there is excessive storage, it can continue running
for a long time, often consuming a lot of CPU.

When the excess storage is physical RAM, this might not be particularly
disastrous, but if it's swap space, it's much more likely to cause a serious
drop in performance.

For a desktop system, it might not be a big deal, but when it's an ISP's server
in a remote data centre, it can create a lot of unnecessary work.

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  8:30     ` Buddy Lumpkin
  2004-05-26  8:44       ` Nick Piggin
@ 2004-05-26  9:09       ` William Lee Irwin III
  2004-05-26 11:38         ` Buddy Lumpkin
  2004-05-26 10:41       ` Denis Vlasenko
                         ` (2 subsequent siblings)
  4 siblings, 1 reply; 146+ messages in thread
From: William Lee Irwin III @ 2004-05-26  9:09 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: orders, linux-kernel

On Wed, May 26, 2004 at 01:30:09AM -0700, Buddy Lumpkin wrote:
> As for your short, two sentence comment below, let me save you the energy of
> insinuations and translate your message the way I read it: 
> -------------------------------------------------------------------------
> I don't recognize your name, therefore you can't possibly have a valuable
> opinion on the direction VM system development should go. I doubt you have
> an actual performance problem to share, but if you do, please share it and
> go away so that we can work on solving the problem.
> --------------------------------------------------------------------------
> My response:
> Get over yourself.

What the Hell? I have enough bugs I'm paid to fix that I'm not going to
tolerate harassment for requesting that claims that the kernel behaves
pathologically in some scenario be cast as comprehensible bugreports.
It's also worth noting that paying customers don't respond so uncouthly.


-- wli

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  6:38 Anthony DiSante
                   ` (3 preceding siblings ...)
  2004-05-26  9:00 ` Helge Hafting
@ 2004-05-26  9:06 ` John Bradford
  2004-05-26 12:31   ` Buddy Lumpkin
  2004-05-26 10:02 ` Raphael Jacquot
  2004-05-26 13:00 ` Satoshi Oshima
  6 siblings, 1 reply; 146+ messages in thread
From: John Bradford @ 2004-05-26  9:06 UTC (permalink / raw)
  To: Anthony DiSante, linux-kernel

Quote from Anthony DiSante <orders@nodivisions.com>:
> As a general question about ram/swap and relating to some of the issues in 
> this thread:
> 
> 	~500 megs cached yet 2.6.5 goes into swap hell
> 
> Consider this: I have a desktop system with 256MB ram, so I make a 256MB 
> swap partition.  So I have 512MB "memory" and if some process wants more, 
> too bad, there is no more.
> 
> Now I buy another 256MB of ram, so I have 512MB of real memory.  Why not 
> just disable my swap completely now?  I won't have increased my memory's 
> size at all, but won't I have increased its performance lots?
> 
> Or, to make it more appealing, say I initially had 512MB ram and now I have 
> 1GB.  Wouldn't I much rather not use swap at all anymore, in this case, on 
> my desktop?

In my experience, it's perfectly possible to run a typical desktop system with
no swap at all.  Certainly the 'double the amount of physical RAM' guideline
has been taken far too literally in my opinion.

As you point out, if a typical system works fine with 512 MB of storage, it
shouldn't matter what the mix of physical and virtual memory is.  Of course,
it will make a difference to performance, and there is a minimum practical
amount of real RAM because some things are unswappable, but in my opinion it
is absolutely wrong to size swap partitions simply by looking at the amount of
physical RAM in a system and not considering the requirements of the workload.

Double the physical RAM is usually more than enough these days, and because
most of the time the negative effects of too little swap are more noticeable
than too much swap, this 'rule' seems to work well enough.

See my recent posts in another thread about solving computing problems by
learning solutions to common scenarios and not learning about computers in a
generic way - in my opinion, the widespread inefficiency of swap space is a
classic example of how this simply doesn't work very well.

Instead of trying to work out how much swap space is needed for a particular
hardware configuration, I suggest that users look at the workload first.

In fact, it's generally sensible to think about the workload before buying any
hardware.
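
The workload-first approach can be sketched roughly as follows. This is only
an illustration under stated assumptions: the helper name swap_needed_mb, the
unswappable_mb parameter, and every number used with it are hypothetical and
do not come from the kernel or from any tool mentioned in this thread.

```python
def swap_needed_mb(peak_workload_mb, ram_mb, unswappable_mb=0):
    """Return how much swap (in MB) a workload needs beyond physical RAM.

    Hypothetical helper: it sizes swap from the workload's peak memory
    footprint instead of from a multiple of the installed RAM.
    """
    # Memory that can never be swapped out must fit in real RAM.
    if unswappable_mb > ram_mb:
        raise ValueError("unswappable memory alone exceeds physical RAM")
    # Whatever the workload needs beyond RAM has to live in swap.
    return max(0, peak_workload_mb - ram_mb)


# A workload peaking at 512 MB on a 256 MB machine needs 256 MB of swap.
print(swap_needed_mb(512, 256))
# The same workload on a 512 MB machine needs none.
print(swap_needed_mb(512, 512))
```

By contrast, the "double the RAM" guideline would allocate 1024 MB of swap on
the 512 MB machine regardless of what the machine is used for.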

John.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  6:38 Anthony DiSante
                   ` (2 preceding siblings ...)
  2004-05-26  8:32 ` Denis Vlasenko
@ 2004-05-26  9:00 ` Helge Hafting
  2004-05-26  9:40   ` John Bradford
  2004-05-26  9:06 ` John Bradford
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 146+ messages in thread
From: Helge Hafting @ 2004-05-26  9:00 UTC (permalink / raw)
  To: orders; +Cc: linux-kernel

Anthony DiSante wrote:

> As a general question about ram/swap and relating to some of the 
> issues in this thread:
>
>     ~500 megs cached yet 2.6.5 goes into swap hell
>
> Consider this: I have a desktop system with 256MB ram, so I make a 
> 256MB swap partition.  So I have 512MB "memory" and if some process 
> wants more, too bad, there is no more.
>
> Now I buy another 256MB of ram, so I have 512MB of real memory.  Why 
> not just disable my swap completely now?  I won't have increased my 
> memory's size at all, but won't I have increased its performance lots? 

This is correct. You now have 512M of fast memory instead of
256M fast memory and 256M "slow" memory. You don't _need_ to have additional
swap, but it is usually a good idea.  If you keep your 256M of swap, then you
now have 512M fast memory + 256M slow memory for a total of 768M.  This is
even better.

Please note that your machine _will_ do one kind of swapping even if you
don't configure any swap: Executable files are a kind of swap-files,
if memory pressure happens then (part of) your programs will be evicted
from memory _because_ they can be reloaded from their executables.

This causes the same sort of performance degradation as swapping to
a swap partition.  Actually, it is worse, because swapping to a swap
partition allows swapping out little-used writeable memory before discarding
program code that might see more use.  So if swapping happens, then
you're better off with a swap partition because then it is the least used
stuff that goes first. Without a swap partition, the least used program code
goes, but it may or may not be the least used memory overall.
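
The difference described above can be illustrated with a toy model. This is a
deliberately simplified sketch, not how the kernel's reclaim code works: it
assumes strict least-recently-used ordering and only two kinds of page, and
the names Page and pick_victim and the sample data are made up for the
example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    kind: str       # "code": file-backed, can be re-read from the executable
                    # "anon": writeable memory, can only be evicted to swap
    last_used: int  # larger value means more recently used


def pick_victim(pages, have_swap):
    """Pick the least recently used page that can actually be evicted."""
    # Without swap, only pages that can be reloaded from a file qualify.
    candidates = [p for p in pages if have_swap or p.kind == "code"]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.last_used)


pages = [
    Page("editor text segment", "code", last_used=90),
    Page("idle daemon heap", "anon", last_used=5),
    Page("shell text segment", "code", last_used=40),
]

# With swap, the least used page overall goes first.
print(pick_victim(pages, have_swap=True).name)   # idle daemon heap
# Without swap, program code goes even though the heap is used less.
print(pick_victim(pages, have_swap=False).name)  # shell text segment
```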

Helge Hafting




^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  8:30     ` Buddy Lumpkin
@ 2004-05-26  8:44       ` Nick Piggin
  2004-05-26  9:34         ` John Bradford
                           ` (2 more replies)
  2004-05-26  9:09       ` William Lee Irwin III
                         ` (3 subsequent siblings)
  4 siblings, 3 replies; 146+ messages in thread
From: Nick Piggin @ 2004-05-26  8:44 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: 'William Lee Irwin III', orders, linux-kernel

Buddy Lumpkin wrote:
> No. I am not making any assertions whatsoever. I am just calling out that
> systems that run happily from physical memory and are not in need of swap
> should never sacrifice an ounce of performance for even drastic improvements
> to swap performance. Swap is a band-aid for saving money on memory and a few
> years ago, it allowed you to save a substantial amount of money. 
> 

Hi Buddy,
Even for systems that don't *need* the extra memory space, swap can
actually provide performance improvements by allowing unused memory
to be replaced with often-used memory.

For example, I have 57MB swapped right now. It allows me to instantly
grep the kernel tree. If I turned swap off, each grep would probably
take 30 seconds.

The VM doesn't always get it right, and to make matters worse, desktop
users don't appreciate their long running jobs finishing earlier, but
*hate* having to wait a few seconds for a window to appear if it hasn't
been used for 24 hours.

> Whether the cost savings for utilizing swap vs buying more memory are
> substantial as of late is subject to opinion, but I cannot think of a system
> that I have sized in the last three years where swap was expected to be used
> except in unanticipated memory shortfalls. In fact, if I didn't plan to
> store crash dumps on the swap device, I think I would have omitted swap
> altogether in many configurations.
> 
> I have worked at large Fortune 500 companies with deep pockets though, so
> this may not be the case for many. I make this point though because I think
> if it isn't the case yet, it will be in the near future as memory becomes
> even cheaper because the trend certainly exists.
> 
> As for your short, two sentence comment below, let me save you the energy of
> insinuations and translate your message the way I read it: 
> 

[snip]

I think the comment was rather directed at a specific problem you
described:

 > This of course doesn't address the VM paging storms that happen due to large
 > amounts of file system writes. Once the pagecache fills up, dirty pages must
 > be evicted from the pagecache so that new pages can be added to the
 > pagecache.

This sounds like you are having a serious problem, and it would be
great if you could describe it in detail. kernel version, workload,
description of the system, vmstat output, etc.

Let's keep it nice.

Best regards
Nick

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  6:38 Anthony DiSante
  2004-05-26  7:31 ` Buddy Lumpkin
  2004-05-26  8:27 ` Roger Luethi
@ 2004-05-26  8:32 ` Denis Vlasenko
  2004-05-26  9:00 ` Helge Hafting
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 146+ messages in thread
From: Denis Vlasenko @ 2004-05-26  8:32 UTC (permalink / raw)
  To: orders, linux-kernel

On Wednesday 26 May 2004 09:38, Anthony DiSante wrote:
> As a general question about ram/swap and relating to some of the issues in
> this thread:
>
> 	~500 megs cached yet 2.6.5 goes into swap hell
>
> Consider this: I have a desktop system with 256MB ram, so I make a 256MB
> swap partition.  So I have 512MB "memory" and if some process wants more,
> too bad, there is no more.
>
> Now I buy another 256MB of ram, so I have 512MB of real memory.  Why not
> just disable my swap completely now?  I won't have increased my memory's
> size at all, but won't I have increased its performance lots?
>
> Or, to make it more appealing, say I initially had 512MB ram and now I have
> 1GB.  Wouldn't I much rather not use swap at all anymore, in this case, on
> my desktop?

Yes, you can run swapless. Nothing wrong with that.
--
vda

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  7:55   ` William Lee Irwin III
@ 2004-05-26  8:30     ` Buddy Lumpkin
  2004-05-26  8:44       ` Nick Piggin
                         ` (4 more replies)
  0 siblings, 5 replies; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26  8:30 UTC (permalink / raw)
  To: 'William Lee Irwin III'; +Cc: orders, linux-kernel

No. I am not making any assertions whatsoever. I am just calling out that
systems that run happily from physical memory and are not in need of swap
should never sacrifice an ounce of performance for even drastic improvements
to swap performance. Swap is a band-aid for saving money on memory and a few
years ago, it allowed you to save a substantial amount of money. 

Whether the cost savings for utilizing swap vs buying more memory are
substantial as of late is subject to opinion, but I cannot think of a system
that I have sized in the last three years where swap was expected to be used
except in unanticipated memory shortfalls. In fact, if I didn't plan to
store crash dumps on the swap device, I think I would have omitted swap
altogether in many configurations.

I have worked at large Fortune 500 companies with deep pockets though, so
this may not be the case for many. I make this point though because I think
if it isn't the case yet, it will be in the near future as memory becomes
even cheaper because the trend certainly exists.

As for your short, two sentence comment below, let me save you the energy of
insinuations and translate your message the way I read it: 

-------------------------------------------------------------------------
I don't recognize your name, therefore you can't possibly have a valuable
opinion on the direction VM system development should go. I doubt you have
an actual performance problem to share, but if you do, please share it and
go away so that we can work on solving the problem.
--------------------------------------------------------------------------

My response:

Get over yourself.

Regards,

--Buddy

-----Original Message-----
From: William Lee Irwin III [mailto:wli@holomorphy.com] 
Sent: Wednesday, May 26, 2004 12:55 AM
To: Buddy Lumpkin
Cc: orders@nodivisions.com; linux-kernel@vger.kernel.org
Subject: Re: why swap at all?

On Wed, May 26, 2004 at 12:31:16AM -0700, Buddy Lumpkin wrote:
> This is a really good point. I think the bar should be set at max
> performance for systems that never need to use the swap device. 
> If someone wants to tune swap performance to their heart's content, so be it.
> But given cheap prices for memory, and the horrible best case performance
> for swap, an increase in swap performance should never, ever come at the
> expense of performance for a system that has been sized such that executable
> address spaces, libraries and anonymous memory will fit easily within
> physical ram.
> This of course doesn't address the VM paging storms that happen due to large
> amounts of file system writes. Once the pagecache fills up, dirty pages must
> be evicted from the pagecache so that new pages can be added to the
> pagecache.

If you've got a real performance issue, please describe it properly
instead of asserting without evidence the existence of one.


-- wli


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  6:38 Anthony DiSante
  2004-05-26  7:31 ` Buddy Lumpkin
@ 2004-05-26  8:27 ` Roger Luethi
  2004-05-26  9:23   ` John Bradford
  2004-05-26  8:32 ` Denis Vlasenko
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 146+ messages in thread
From: Roger Luethi @ 2004-05-26  8:27 UTC (permalink / raw)
  To: Anthony DiSante; +Cc: linux-kernel

On Wed, 26 May 2004 02:38:23 -0400, Anthony DiSante wrote:
> Now I buy another 256MB of ram, so I have 512MB of real memory.  Why not 
> just disable my swap completely now?  I won't have increased my memory's 
> size at all, but won't I have increased its performance lots?
> 
> Or, to make it more appealing, say I initially had 512MB ram and now I have 
> 1GB.  Wouldn't I much rather not use swap at all anymore, in this case, on 
> my desktop?

Swap serves another (often underrated) purpose: Graceful degradation.

If you have a reasonable amount of swap space mounted, you will know
you are running out of RAM because your system will become noticeably
slower. If you have no swap whatsoever, your first warning will quite
possibly be an application OOM killed or losing data due to a failed
memory allocation.

Think of the slowness of swap as a _feature_.
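
As a rough sketch of treating swap usage as an early warning, the snippet
below parses /proc/meminfo-style text and reports when a share of swap is in
use. MemTotal, SwapTotal and SwapFree are real /proc/meminfo fields; the
function names, the sample values and the 25% threshold are assumptions made
for the example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        # Skip anything that is not a "Name:  <number> [kB]" line.
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def swap_used_fraction(info):
    """Return the fraction of swap in use, or None if no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return None
    return (total - info.get("SwapFree", 0)) / total


# Made-up sample data; on a live system read the text from /proc/meminfo.
SAMPLE = """\
MemTotal:         515576 kB
MemFree:            8120 kB
SwapTotal:        262136 kB
SwapFree:         131068 kB
"""

frac = swap_used_fraction(parse_meminfo(SAMPLE))
if frac is not None and frac > 0.25:  # threshold is an arbitrary assumption
    print("warning: %.0f%% of swap in use" % (frac * 100))
```

On a swapless machine swap_used_fraction() returns None, so no comparable
warning is available before allocations start to fail.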

Roger

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: why swap at all?
  2004-05-26  7:31 ` Buddy Lumpkin
@ 2004-05-26  7:55   ` William Lee Irwin III
  2004-05-26  8:30     ` Buddy Lumpkin
  0 siblings, 1 reply; 146+ messages in thread
From: William Lee Irwin III @ 2004-05-26  7:55 UTC (permalink / raw)
  To: Buddy Lumpkin; +Cc: orders, linux-kernel

On Wed, May 26, 2004 at 12:31:16AM -0700, Buddy Lumpkin wrote:
> This is a really good point. I think the bar should be set at max
> performance for systems that never need to use the swap device. 
> If someone wants to tune swap performance to their heart's content, so be it.
> But given cheap prices for memory, and the horrible best case performance
> for swap, an increase in swap performance should never, ever come at the
> expense of performance for a system that has been sized such that executable
> address spaces, libraries and anonymous memory will fit easily within
> physical ram.
> This of course doesn't address the VM paging storms that happen due to large
> amounts of file system writes. Once the pagecache fills up, dirty pages must
> be evicted from the pagecache so that new pages can be added to the
> pagecache.

If you've got a real performance issue, please describe it properly
instead of asserting without evidence the existence of one.


-- wli

^ permalink raw reply	[flat|nested] 146+ messages in thread

* RE: why swap at all?
  2004-05-26  6:38 Anthony DiSante
@ 2004-05-26  7:31 ` Buddy Lumpkin
  2004-05-26  7:55   ` William Lee Irwin III
  2004-05-26  8:27 ` Roger Luethi
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 146+ messages in thread
From: Buddy Lumpkin @ 2004-05-26  7:31 UTC (permalink / raw)
  To: orders, linux-kernel

This is a really good point. I think the bar should be set at max
performance for systems that never need to use the swap device. 

If someone wants to tune swap performance to their heart's content, so be it.
But given cheap prices for memory, and the horrible best-case performance
for swap, an increase in swap performance should never, ever come at the
expense of performance for a system that has been sized such that executable
address spaces, libraries and anonymous memory will fit easily within
physical ram.

This of course doesn't address the VM paging storms that happen due to large
amounts of file system writes. Once the pagecache fills up, dirty pages must
be evicted from the pagecache so that new pages can be added to the
pagecache.

--Buddy



-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Anthony DiSante
Sent: Tuesday, May 25, 2004 11:38 PM
To: linux-kernel@vger.kernel.org
Subject: why swap at all?

As a general question about ram/swap and relating to some of the issues in 
this thread:

	~500 megs cached yet 2.6.5 goes into swap hell

Consider this: I have a desktop system with 256MB ram, so I make a 256MB 
swap partition.  So I have 512MB "memory" and if some process wants more, 
too bad, there is no more.

Now I buy another 256MB of ram, so I have 512MB of real memory.  Why not 
just disable my swap completely now?  I won't have increased my memory's 
size at all, but won't I have increased its performance lots?

Or, to make it more appealing, say I initially had 512MB ram and now I have 
1GB.  Wouldn't I much rather not use swap at all anymore, in this case, on 
my desktop?

-Anthony
http://nodivisions.com/


^ permalink raw reply	[flat|nested] 146+ messages in thread

* why swap at all?
@ 2004-05-26  6:38 Anthony DiSante
  2004-05-26  7:31 ` Buddy Lumpkin
                   ` (6 more replies)
  0 siblings, 7 replies; 146+ messages in thread
From: Anthony DiSante @ 2004-05-26  6:38 UTC (permalink / raw)
  To: linux-kernel

As a general question about ram/swap and relating to some of the issues in 
this thread:

	~500 megs cached yet 2.6.5 goes into swap hell

Consider this: I have a desktop system with 256MB ram, so I make a 256MB 
swap partition.  So I have 512MB "memory" and if some process wants more, 
too bad, there is no more.

Now I buy another 256MB of ram, so I have 512MB of real memory.  Why not 
just disable my swap completely now?  I won't have increased my memory's 
size at all, but won't I have increased its performance lots?

Or, to make it more appealing, say I initially had 512MB ram and now I have 
1GB.  Wouldn't I much rather not use swap at all anymore, in this case, on 
my desktop?

-Anthony
http://nodivisions.com/

^ permalink raw reply	[flat|nested] 146+ messages in thread

end of thread, other threads:[~2004-06-15 20:14 UTC | newest]

Thread overview: 146+ messages
2004-05-26 12:24 why swap at all? Nick Piggin
2004-05-26 13:03 ` Buddy Lumpkin
2004-05-26 13:27   ` Helge Hafting
     [not found] <fa.amhil9e.o5kt1u@ifi.uio.no>
     [not found] ` <fa.kfm8lru.1l2mdp4@ifi.uio.no>
2004-06-08 15:12   ` Ray Bryant
2004-06-08 15:15   ` Ray Bryant
2004-06-09 19:24     ` Bill Davidsen
  -- strict thread matches above, loose matches on Subject: below --
2004-05-31 19:34 Michael Brennan
2004-05-31 20:29 ` John Bradford
2004-05-31 22:47   ` Nick Piggin
2004-05-31 23:30     ` Bernd Eckenfels
2004-06-01 18:36       ` FabF
2004-06-01 19:02         ` Valdis.Kletnieks
2004-06-01 19:53           ` FabF
2004-06-01 20:00             ` Valdis.Kletnieks
2004-06-01 20:14               ` FabF
2004-06-01 20:22                 ` Valdis.Kletnieks
2004-06-01 21:15                   ` FabF
2004-06-01 21:40                     ` Valdis.Kletnieks
2004-06-03 13:54                     ` Bill Davidsen
2004-06-04  0:01                       ` Nick Piggin
2004-06-01 23:17               ` Bernd Eckenfels
2004-06-02  5:38                 ` FabF
2004-06-02 11:42                   ` Con Kolivas
2004-06-02 12:22                     ` John Bradford
2004-06-02 12:22                       ` Con Kolivas
2004-06-02 17:06                     ` FabF
2004-06-03 14:14                     ` Bill Davidsen
2004-06-04  7:23                       ` Buddy Lumpkin
2004-06-04 17:08                         ` Bill Davidsen
2004-06-15 14:55                           ` Charles Shannon Hendrix
2004-06-04  9:11                       ` Catalin BOIE
2004-06-04 17:24                         ` Bill Davidsen
2004-06-06 14:39                       ` Rik van Riel
2004-06-02 17:59                   ` Valdis.Kletnieks
2004-06-02 18:30                     ` FabF
2004-06-02 23:54                       ` Con Kolivas
2004-06-03 16:16                         ` FabF
2004-06-03 23:56                           ` Con Kolivas
2004-06-04  0:16                             ` Con Kolivas
2004-06-03 14:18                     ` Bill Davidsen
2004-06-03 14:27                       ` Con Kolivas
2004-06-02 17:52                 ` Valdis.Kletnieks
2004-06-02  3:50           ` Tim Connors
2004-06-02 17:45             ` Valdis.Kletnieks
2004-06-01  8:34     ` John Bradford
2004-06-01  8:32       ` William Lee Irwin III
2004-06-01  8:50         ` John Bradford
2004-06-01  8:54           ` William Lee Irwin III
2004-06-01  9:10             ` John Bradford
2004-06-08  1:18               ` Tim Connors
2004-06-08  5:29                 ` Denis Vlasenko
2004-06-01  9:38   ` Buddy Lumpkin
2004-06-01 10:13     ` Tim Connors
2004-06-01 10:24       ` William Lee Irwin III
2004-06-01 11:19         ` Tim Connors
2004-05-27 12:31 Piszcz, Justin Michael
2004-05-27 12:41 ` William Lee Irwin III
2004-05-27 15:59   ` John Bradford
2004-05-27 16:16     ` William Lee Irwin III
2004-06-03 13:38   ` Bill Davidsen
     [not found] <fa.fegqf9v.kmidof@ifi.uio.no>
     [not found] ` <fa.bqpvcrs.u648jq@ifi.uio.no>
2004-05-27 11:39   ` Andy Lutomirski
2004-05-28 21:37     ` Denis Vlasenko
2004-05-28 22:28       ` Bernd Eckenfels
2004-05-29  7:31         ` Denis Vlasenko
2004-05-31 10:49         ` jlnance
2004-06-01 11:57           ` Lenar Lõhmus
2004-06-01 12:27             ` Robin Rosenberg
2004-06-01 16:49             ` jlnance
2004-06-02 18:38               ` John Hendrikx
2004-06-01 12:21           ` David B. Stevens
2004-05-27  5:37 Nick Piggin
2004-05-27 17:27 ` Buddy Lumpkin
2004-05-26 12:34 Piszcz, Justin Michael
2004-05-26 11:57 Nick Piggin
2004-05-26 12:19 ` Buddy Lumpkin
2004-05-26 11:04 Nick Piggin
2004-05-26  6:38 Anthony DiSante
2004-05-26  7:31 ` Buddy Lumpkin
2004-05-26  7:55   ` William Lee Irwin III
2004-05-26  8:30     ` Buddy Lumpkin
2004-05-26  8:44       ` Nick Piggin
2004-05-26  9:34         ` John Bradford
2004-05-26  9:48           ` Nick Piggin
2004-05-26 10:10             ` Matthias Schniedermeyer
2004-05-26 10:33               ` Nick Piggin
2004-05-26 10:58                 ` Matthias Schniedermeyer
2004-05-26 11:19                   ` Nick Piggin
2004-05-26 12:27                     ` Matthias Schniedermeyer
2004-05-27  5:38                       ` Nick Piggin
2004-05-26 12:37                     ` Matthias Schniedermeyer
2004-05-26 13:06                       ` Gianni Tedesco
2004-05-26 13:41                         ` Matt H.
2004-05-26 13:55                       ` Buddy Lumpkin
2004-05-27  5:14                       ` Tom Felker
2004-05-27  6:02                         ` Nick Piggin
2004-05-27  7:04                         ` Bernd Eckenfels
2004-05-27  7:16                         ` Oliver Neukum
2004-05-26 10:45               ` Martin Olsson
2004-05-26 11:25                 ` Nick Piggin
2004-05-26 16:33                 ` David Schwartz
2004-05-26 16:58                   ` John Bradford
2004-05-26 23:32                     ` Kyle Moffett
2004-05-27  8:05                       ` John Bradford
2004-05-26 10:46             ` John Bradford
2004-05-26 11:46             ` Buddy Lumpkin
2004-05-26 11:39           ` Buddy Lumpkin
2004-05-26  9:42         ` Anthony DiSante
2004-05-26  9:58           ` Nick Piggin
2004-05-26 20:11             ` Wakko Warner
2004-05-27  5:59               ` Nick Piggin
2004-05-27 14:34                 ` Wakko Warner
2004-05-26 10:40         ` Buddy Lumpkin
2004-05-26 13:15           ` Helge Hafting
2004-05-26  9:09       ` William Lee Irwin III
2004-05-26 11:38         ` Buddy Lumpkin
2004-05-26 12:12           ` Paulo Marques
2004-05-26 12:14           ` Nick Piggin
2004-05-26 12:40           ` Denis Vlasenko
2004-05-26 10:41       ` Denis Vlasenko
2004-05-26 12:07         ` Buddy Lumpkin
2004-05-26 12:06           ` Marc-Christian Petersen
2004-05-26 12:19           ` Denis Vlasenko
2004-05-26 13:48             ` Buddy Lumpkin
2004-05-26 12:33           ` Richard B. Johnson
2004-05-26 13:25             ` Buddy Lumpkin
2004-05-26 12:30         ` Rik van Riel
2004-05-26 10:44       ` Denis Vlasenko
2004-05-26 11:49         ` Buddy Lumpkin
2004-05-26 12:19       ` Rik van Riel
2004-05-26 12:55         ` Buddy Lumpkin
2004-05-26  8:27 ` Roger Luethi
2004-05-26  9:23   ` John Bradford
2004-05-26  9:30     ` Roger Luethi
2004-05-26 10:35       ` John Bradford
2004-05-26 10:37         ` Nick Piggin
2004-05-26 10:48           ` John Bradford
2004-05-26 13:01     ` Helge Hafting
2004-05-26  8:32 ` Denis Vlasenko
2004-05-26  9:00 ` Helge Hafting
2004-05-26  9:40   ` John Bradford
2004-05-26 13:06     ` Helge Hafting
2004-05-26  9:06 ` John Bradford
2004-05-26 12:31   ` Buddy Lumpkin
2004-05-26 10:02 ` Raphael Jacquot
2004-05-26 13:00 ` Satoshi Oshima
2004-05-26 13:38   ` William Lee Irwin III
