LKML Archive on lore.kernel.org
* [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc()
@ 2021-08-09 20:35 Rafael Aquini
  2021-08-10  1:48 ` Waiman Long
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Rafael Aquini @ 2021-08-09 20:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, Manfred Spraul, Davidlohr Bueso, Waiman Long

sysvipc_find_ipc() was left with a costly way to check if the offset
position fed to it is bigger than the total number of IPC IDs in use.
So much so that the time it takes to iterate over /proc/sysvipc/* files
grows exponentially for a custom benchmark that creates "N" SYSV shm
segments and then times the read of /proc/sysvipc/shm (milliseconds):

    12 msecs to read   1024 segs from /proc/sysvipc/shm
    18 msecs to read   2048 segs from /proc/sysvipc/shm
    65 msecs to read   4096 segs from /proc/sysvipc/shm
   325 msecs to read   8192 segs from /proc/sysvipc/shm
  1303 msecs to read  16384 segs from /proc/sysvipc/shm
  5182 msecs to read  32768 segs from /proc/sysvipc/shm

The root problem lies with the loop that counts the total number of ids
in use in order to check if the "pos" fed to sysvipc_find_ipc() grew bigger
than "ids->in_use". That is a quite inefficient way to get to the maximum
index in the id lookup table, especially when that value is already provided
by struct ipc_ids.max_idx.

This patch follows up on the optimization introduced via commit 15df03c879836
("sysvipc: make get_maxid O(1) again") and gets rid of the aforementioned
costly loop, replacing it with a simpler check based on the value returned
by ipc_get_maxidx(), which allows for a smooth linear increase in time
complexity for the same custom benchmark:

     2 msecs to read   1024 segs from /proc/sysvipc/shm
     2 msecs to read   2048 segs from /proc/sysvipc/shm
     4 msecs to read   4096 segs from /proc/sysvipc/shm
     9 msecs to read   8192 segs from /proc/sysvipc/shm
    19 msecs to read  16384 segs from /proc/sysvipc/shm
    39 msecs to read  32768 segs from /proc/sysvipc/shm

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 ipc/util.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/ipc/util.c b/ipc/util.c
index 0027e47626b7..d48d8cfa1f3f 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -788,21 +788,13 @@ struct pid_namespace *ipc_seq_pid_ns(struct seq_file *s)
 static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos,
 					      loff_t *new_pos)
 {
-	struct kern_ipc_perm *ipc;
-	int total, id;
-
-	total = 0;
-	for (id = 0; id < pos && total < ids->in_use; id++) {
-		ipc = idr_find(&ids->ipcs_idr, id);
-		if (ipc != NULL)
-			total++;
-	}
+	struct kern_ipc_perm *ipc = NULL;
+	int max_idx = ipc_get_maxidx(ids);
 
-	ipc = NULL;
-	if (total >= ids->in_use)
+	if (max_idx == -1 || pos > max_idx)
 		goto out;
 
-	for (; pos < ipc_mni; pos++) {
+	for (; pos <= max_idx; pos++) {
 		ipc = idr_find(&ids->ipcs_idr, pos);
 		if (ipc != NULL) {
 			rcu_read_lock();
-- 
2.26.3
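The cost difference between the removed bailout and the new one can be sketched outside the kernel. The toy model below is plain Python, not kernel code: a dict stands in for the idr, and the functions return the index found rather than locking an object, but the control flow mirrors the two versions of sysvipc_find_ipc() in the diff above.

```python
def find_old(ids, pos, ipc_mni=32768):
    """Old bailout: count the used ids below pos -- O(pos) work per call."""
    total, idx = 0, 0
    while idx < pos and total < len(ids):
        if idx in ids:        # idr_find()
            total += 1
        idx += 1
    if total >= len(ids):     # every used id sits below pos: nothing left
        return None
    for p in range(pos, ipc_mni):
        if p in ids:          # idr_find()
            return p          # next used index at or above pos
    return None

def find_new(ids, pos):
    """Patched bailout: compare against the precomputed maximum index."""
    max_idx = max(ids) if ids else -1   # ipc_get_maxidx()
    if max_idx == -1 or pos > max_idx:
        return None
    for p in range(pos, max_idx + 1):
        if p in ids:
            return p
    return None
```

Reading a full /proc/sysvipc file calls the routine once per entry with an increasing pos, so for N densely packed entries the counting loop alone amounts to roughly N^2/2 idr_find() calls, which matches the steeply growing times in the first table.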


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc()
  2021-08-09 20:35 [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc() Rafael Aquini
@ 2021-08-10  1:48 ` Waiman Long
  2021-08-10  4:40 ` Davidlohr Bueso
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Waiman Long @ 2021-08-10  1:48 UTC (permalink / raw)
  To: Rafael Aquini, linux-kernel
  Cc: Andrew Morton, Manfred Spraul, Davidlohr Bueso, Waiman Long

On 8/9/21 4:35 PM, Rafael Aquini wrote:
> sysvipc_find_ipc() was left with a costly way to check if the offset
> position fed to it is bigger than the total number of IPC IDs in use.
> So much so that the time it takes to iterate over /proc/sysvipc/* files
> grows exponentially for a custom benchmark that creates "N" SYSV shm
> segments and then times the read of /proc/sysvipc/shm (milliseconds):
>
>      12 msecs to read   1024 segs from /proc/sysvipc/shm
>      18 msecs to read   2048 segs from /proc/sysvipc/shm
>      65 msecs to read   4096 segs from /proc/sysvipc/shm
>     325 msecs to read   8192 segs from /proc/sysvipc/shm
>    1303 msecs to read  16384 segs from /proc/sysvipc/shm
>    5182 msecs to read  32768 segs from /proc/sysvipc/shm
>
> The root problem lies with the loop that counts the total number of ids
> in use in order to check if the "pos" fed to sysvipc_find_ipc() grew bigger
> than "ids->in_use". That is a quite inefficient way to get to the maximum
> index in the id lookup table, especially when that value is already provided
> by struct ipc_ids.max_idx.
>
> This patch follows up on the optimization introduced via commit 15df03c879836
> ("sysvipc: make get_maxid O(1) again") and gets rid of the aforementioned
> costly loop, replacing it with a simpler check based on the value returned
> by ipc_get_maxidx(), which allows for a smooth linear increase in time
> complexity for the same custom benchmark:
>
>       2 msecs to read   1024 segs from /proc/sysvipc/shm
>       2 msecs to read   2048 segs from /proc/sysvipc/shm
>       4 msecs to read   4096 segs from /proc/sysvipc/shm
>       9 msecs to read   8192 segs from /proc/sysvipc/shm
>      19 msecs to read  16384 segs from /proc/sysvipc/shm
>      39 msecs to read  32768 segs from /proc/sysvipc/shm
>
> Signed-off-by: Rafael Aquini <aquini@redhat.com>
> ---
>   ipc/util.c | 16 ++++------------
>   1 file changed, 4 insertions(+), 12 deletions(-)
>
> diff --git a/ipc/util.c b/ipc/util.c
> index 0027e47626b7..d48d8cfa1f3f 100644
> --- a/ipc/util.c
> +++ b/ipc/util.c
> @@ -788,21 +788,13 @@ struct pid_namespace *ipc_seq_pid_ns(struct seq_file *s)
>   static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos,
>   					      loff_t *new_pos)
>   {
> -	struct kern_ipc_perm *ipc;
> -	int total, id;
> -
> -	total = 0;
> -	for (id = 0; id < pos && total < ids->in_use; id++) {
> -		ipc = idr_find(&ids->ipcs_idr, id);
> -		if (ipc != NULL)
> -			total++;
> -	}
> +	struct kern_ipc_perm *ipc = NULL;
> +	int max_idx = ipc_get_maxidx(ids);
>   
> -	ipc = NULL;
> -	if (total >= ids->in_use)
> +	if (max_idx == -1 || pos > max_idx)
>   		goto out;
>   
> -	for (; pos < ipc_mni; pos++) {
> +	for (; pos <= max_idx; pos++) {
>   		ipc = idr_find(&ids->ipcs_idr, pos);
>   		if (ipc != NULL) {
>   			rcu_read_lock();

The "pos > max_idx" check is redundant, as the "pos <= max_idx" loop
condition already covers it. Other than that, the patch looks good to me.

Reviewed-by: Waiman Long <longman@redhat.com>



* Re: [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc()
  2021-08-09 20:35 [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc() Rafael Aquini
  2021-08-10  1:48 ` Waiman Long
@ 2021-08-10  4:40 ` Davidlohr Bueso
  2021-08-17 18:34 ` Manfred Spraul
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Davidlohr Bueso @ 2021-08-10  4:40 UTC (permalink / raw)
  To: Rafael Aquini; +Cc: linux-kernel, Andrew Morton, Manfred Spraul, Waiman Long

On 2021-08-09 13:35, Rafael Aquini wrote:
> sysvipc_find_ipc() was left with a costly way to check if the offset
> position fed to it is bigger than the total number of IPC IDs in use.
> So much so that the time it takes to iterate over /proc/sysvipc/* files
> grows exponentially for a custom benchmark that creates "N" SYSV shm
> segments and then times the read of /proc/sysvipc/shm (milliseconds):
> 
>     12 msecs to read   1024 segs from /proc/sysvipc/shm
>     18 msecs to read   2048 segs from /proc/sysvipc/shm
>     65 msecs to read   4096 segs from /proc/sysvipc/shm
>    325 msecs to read   8192 segs from /proc/sysvipc/shm
>   1303 msecs to read  16384 segs from /proc/sysvipc/shm
>   5182 msecs to read  32768 segs from /proc/sysvipc/shm
> 
> The root problem lies with the loop that counts the total number of ids
> in use in order to check if the "pos" fed to sysvipc_find_ipc() grew bigger
> than "ids->in_use". That is a quite inefficient way to get to the maximum
> index in the id lookup table, especially when that value is already provided
> by struct ipc_ids.max_idx.
> 
> This patch follows up on the optimization introduced via commit 15df03c879836
> ("sysvipc: make get_maxid O(1) again") and gets rid of the aforementioned
> costly loop, replacing it with a simpler check based on the value returned
> by ipc_get_maxidx(), which allows for a smooth linear increase in time
> complexity for the same custom benchmark:
> 
>      2 msecs to read   1024 segs from /proc/sysvipc/shm
>      2 msecs to read   2048 segs from /proc/sysvipc/shm
>      4 msecs to read   4096 segs from /proc/sysvipc/shm
>      9 msecs to read   8192 segs from /proc/sysvipc/shm
>     19 msecs to read  16384 segs from /proc/sysvipc/shm
>     39 msecs to read  32768 segs from /proc/sysvipc/shm
> 
> Signed-off-by: Rafael Aquini <aquini@redhat.com>

Acked-by: Davidlohr Bueso <dbueso@suse.de>


* Re: [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc()
  2021-08-09 20:35 [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc() Rafael Aquini
  2021-08-10  1:48 ` Waiman Long
  2021-08-10  4:40 ` Davidlohr Bueso
@ 2021-08-17 18:34 ` Manfred Spraul
  2021-08-20 19:41 ` Manfred Spraul
  2021-08-28 16:35 ` Manfred Spraul
  4 siblings, 0 replies; 8+ messages in thread
From: Manfred Spraul @ 2021-08-17 18:34 UTC (permalink / raw)
  To: Rafael Aquini, linux-kernel; +Cc: Andrew Morton, Davidlohr Bueso, Waiman Long

Hello Rafael,

I'm still trying to understand the code. It seems it is more or less
unchanged from 2009:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/patch/ipc/util.c?id=7ca7e564e049d8b350ec9d958ff25eaa24226352


On 8/9/21 10:35 PM, Rafael Aquini wrote:
> --- a/ipc/util.c
> +++ b/ipc/util.c
> @@ -788,21 +788,13 @@ struct pid_namespace *ipc_seq_pid_ns(struct seq_file *s)
>   static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos,
>   					      loff_t *new_pos)
>   {
> -	struct kern_ipc_perm *ipc;
> -	int total, id;
> -
> -	total = 0;
> -	for (id = 0; id < pos && total < ids->in_use; id++) {
> -		ipc = idr_find(&ids->ipcs_idr, id);
> -		if (ipc != NULL)
> -			total++;
> -	}
> +	struct kern_ipc_perm *ipc = NULL;
> +	int max_idx = ipc_get_maxidx(ids);
>   
> -	ipc = NULL;
> -	if (total >= ids->in_use)
> +	if (max_idx == -1 || pos > max_idx)
>   		goto out;
>   
> -	for (; pos < ipc_mni; pos++) {
> +	for (; pos <= max_idx; pos++) {
>   		ipc = idr_find(&ids->ipcs_idr, pos);
>   		if (ipc != NULL) {
>   			rcu_read_lock();

The change looks correct to me. But I'm still not sure that I really
understand what the current code does:

- first, loop over the index values in the idr, starting from 0, counting
the entries that are found.

- if all entries were found before reaching index=pos: fail

- then search up to ipc_mni for the next entry with an index >= pos.

- if something is found: use it. Otherwise fail.

It seems the code tries to avoid looping all the way to ipc_mni after the
last entry was found, and for that it loops every time from 0.


From what I see, the change is correct: you remove the first loop, and
instead of searching until ipc_mni, the search ends at <= max_idx.

I'll try to find some time to test it.


But: what about using idr_get_next() instead of idr_find()?

We want to find the next used index, thus idr_get_next() should be even
better than the for loop, ...
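In rough Python terms (an illustrative sketch, not the kernel implementation; the real IDR walks a radix tree instead of sorting keys), idr_get_next() hands back the entry at the smallest used index at or above the one passed in, together with that index:

```python
def idr_get_next(ids, idx):
    """Return (entry, index) for the smallest used index >= idx,
    or (None, idx) when no such entry exists."""
    used = sorted(i for i in ids if i >= idx)
    if not used:
        return None, idx
    return ids[used[0]], used[0]
```

With a primitive like that, the whole pair of loops in sysvipc_find_ipc() collapses into a single lookup.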


--

     Manfred



* Re: [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc()
  2021-08-09 20:35 [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc() Rafael Aquini
                   ` (2 preceding siblings ...)
  2021-08-17 18:34 ` Manfred Spraul
@ 2021-08-20 19:41 ` Manfred Spraul
  2021-08-25 22:15   ` Rafael Aquini
  2021-08-28 16:35 ` Manfred Spraul
  4 siblings, 1 reply; 8+ messages in thread
From: Manfred Spraul @ 2021-08-20 19:41 UTC (permalink / raw)
  To: Rafael Aquini, linux-kernel
  Cc: Andrew Morton, Davidlohr Bueso, Waiman Long, 1vier1


Hi Rafael,

On 8/9/21 10:35 PM, Rafael Aquini wrote:
> sysvipc_find_ipc() was left with a costly way to check if the offset
> position fed to it is bigger than the total number of IPC IDs in use.
> So much so that the time it takes to iterate over /proc/sysvipc/* files
> grows exponentially for a custom benchmark that creates "N" SYSV shm
> segments and then times the read of /proc/sysvipc/shm (milliseconds):
>
>      12 msecs to read   1024 segs from /proc/sysvipc/shm
>      18 msecs to read   2048 segs from /proc/sysvipc/shm
>      65 msecs to read   4096 segs from /proc/sysvipc/shm
>     325 msecs to read   8192 segs from /proc/sysvipc/shm
>    1303 msecs to read  16384 segs from /proc/sysvipc/shm
>    5182 msecs to read  32768 segs from /proc/sysvipc/shm
>
> The root problem lies with the loop that counts the total number of ids
> in use in order to check if the "pos" fed to sysvipc_find_ipc() grew bigger than
> "ids->in_use". That is a quite inefficient way to get to the maximum index
> in the id lookup table, especially when that value is already provided by
> struct ipc_ids.max_idx.
>
> This patch follows up on the optimization introduced via commit 15df03c879836
> ("sysvipc: make get_maxid O(1) again") and gets rid of the aforementioned
> costly loop, replacing it with a simpler check based on the value returned by
> ipc_get_maxidx(), which allows for a smooth linear increase in time complexity
> for the same custom benchmark:
>
>       2 msecs to read   1024 segs from /proc/sysvipc/shm
>       2 msecs to read   2048 segs from /proc/sysvipc/shm
>       4 msecs to read   4096 segs from /proc/sysvipc/shm
>       9 msecs to read   8192 segs from /proc/sysvipc/shm
>      19 msecs to read  16384 segs from /proc/sysvipc/shm
>      39 msecs to read  32768 segs from /proc/sysvipc/shm

Could you run your test with the attached patch?

The patch switches the code to idr_get_next(), and I see a speedup by a
factor of 400 for this test:

- boot with ipcmni_extend

- create ipc object

- echo 16000000 > /proc/sys/kernel/msg_next_id

- create ipc object

- time cat /proc/sysvipc/msg

with current mainline: 8.65 seconds

with the patch: 0.02 seconds


If there are no gaps, then I would assume there is no speed-up compared
to your patch, but it would be great if you could check

[and check that there is no slow-down]
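The gap case above can be illustrated by counting lookups (a hedged Python sketch; the index values and object names are made up, and the 16000000 gap is scaled down to 100000 to keep it cheap to run):

```python
def probes_scan(ids):
    """Count idr_find() calls made by a max_idx-bounded linear scan while
    iterating over all entries: each seq_file step resumes at pos + 1, so
    every slot from 0 to max_idx is probed exactly once per full read."""
    max_idx = max(ids)
    probes = 0
    pos = 0
    while pos <= max_idx:
        probes += 1            # idr_find(&ids->ipcs_idr, pos)
        pos += 1
    return probes

def probes_get_next(ids):
    """With idr_get_next(): one tree walk per used entry (the final
    EOF probe is ignored here)."""
    return len(ids)

# Two objects with a large id gap, as after raising msg_next_id:
sparse = {0: 'first msg', 100000: 'second msg'}
```

A scan bounded by max_idx still touches every empty slot in the gap once per full read, while idr_get_next() jumps straight between the two used indices, which is consistent with the 8.65 s vs. 0.02 s measurement.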


Thanks,

--

     Manfred


[-- Attachment #2: 0001-PATCH-Improve-sysvipc_find_ipc.patch --]
[-- Type: text/x-patch, Size: 2760 bytes --]

From 4b7975d712db27c3d08731e0ebe4efd684256ca4 Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@colorfullife.com>
Date: Fri, 20 Aug 2021 21:08:12 +0200
Subject: [PATCH] [PATCH] Improve sysvipc_find_ipc()

Initially noticed by Rafael Aquini, see
https://lore.kernel.org/lkml/20210809203554.1562989-1-aquini@redhat.com/

The algorithm used in sysvipc_find_ipc() is highly inefficient.
It actually needs to find the next used index in an idr, and it uses
a for loop to locate that entry.

But: The IDR API contains idr_get_next(), thus switch the code to use
idr_get_next().

In addition: Update a few comments.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
 ipc/util.c | 43 ++++++++++++++++++++-----------------------
 1 file changed, 20 insertions(+), 23 deletions(-)

diff --git a/ipc/util.c b/ipc/util.c
index 0027e47626b7..083fd6dba1a1 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -783,35 +783,32 @@ struct pid_namespace *ipc_seq_pid_ns(struct seq_file *s)
 }
 
 /*
- * This routine locks the ipc structure found at least at position pos.
+ * This routine locks the ipc structure found at least at index pos.
  */
 static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos,
 					      loff_t *new_pos)
 {
+	int tmpidx;
 	struct kern_ipc_perm *ipc;
-	int total, id;
-
-	total = 0;
-	for (id = 0; id < pos && total < ids->in_use; id++) {
-		ipc = idr_find(&ids->ipcs_idr, id);
-		if (ipc != NULL)
-			total++;
-	}
 
-	ipc = NULL;
-	if (total >= ids->in_use)
-		goto out;
+	tmpidx = pos;
 
-	for (; pos < ipc_mni; pos++) {
-		ipc = idr_find(&ids->ipcs_idr, pos);
-		if (ipc != NULL) {
-			rcu_read_lock();
-			ipc_lock_object(ipc);
-			break;
-		}
+	ipc = idr_get_next(&ids->ipcs_idr, &tmpidx);
+	if (ipc != NULL) {
+		rcu_read_lock();
+		ipc_lock_object(ipc);
+		/*
+		 * We found the object with the index tmpidx.
+		 * For next search, start with tmpidx+1
+		 */
+		*new_pos = tmpidx + 1;
+	} else {
+		/*
+		 * EOF. seq_file can't notice that, thus
+		 * move the offset by one.
+		 */
+		*new_pos = pos + 1;
 	}
-out:
-	*new_pos = pos + 1;
 	return ipc;
 }
 
@@ -829,7 +826,7 @@ static void *sysvipc_proc_next(struct seq_file *s, void *it, loff_t *pos)
 }
 
 /*
- * File positions: pos 0 -> header, pos n -> ipc id = n - 1.
+ * File positions: pos 0 -> header, pos n -> ipc idx = n - 1.
  * SeqFile iterator: iterator value locked ipc pointer or SEQ_TOKEN_START.
  */
 static void *sysvipc_proc_start(struct seq_file *s, loff_t *pos)
@@ -854,7 +851,7 @@ static void *sysvipc_proc_start(struct seq_file *s, loff_t *pos)
 	if (*pos == 0)
 		return SEQ_START_TOKEN;
 
-	/* Find the (pos-1)th ipc */
+	/* Find the ipc object with the index >= (pos-1) */
 	return sysvipc_find_ipc(ids, *pos - 1, pos);
 }
 
-- 
2.31.1




* Re: [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc()
  2021-08-20 19:41 ` Manfred Spraul
@ 2021-08-25 22:15   ` Rafael Aquini
  2021-08-27  2:17     ` Rafael Aquini
  0 siblings, 1 reply; 8+ messages in thread
From: Rafael Aquini @ 2021-08-25 22:15 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: linux-kernel, Andrew Morton, Davidlohr Bueso, Waiman Long, 1vier1

On Fri, Aug 20, 2021 at 09:41:32PM +0200, Manfred Spraul wrote:
> Hi Rafael,
> 
> On 8/9/21 10:35 PM, Rafael Aquini wrote:
> > sysvipc_find_ipc() was left with a costly way to check if the offset
> > position fed to it is bigger than the total number of IPC IDs in use.
> > So much so that the time it takes to iterate over /proc/sysvipc/* files
> > grows exponentially for a custom benchmark that creates "N" SYSV shm
> > segments and then times the read of /proc/sysvipc/shm (milliseconds):
> >
> >      12 msecs to read   1024 segs from /proc/sysvipc/shm
> >      18 msecs to read   2048 segs from /proc/sysvipc/shm
> >      65 msecs to read   4096 segs from /proc/sysvipc/shm
> >     325 msecs to read   8192 segs from /proc/sysvipc/shm
> >    1303 msecs to read  16384 segs from /proc/sysvipc/shm
> >    5182 msecs to read  32768 segs from /proc/sysvipc/shm
> >
> > The root problem lies with the loop that counts the total number of ids
> > in use in order to check if the "pos" fed to sysvipc_find_ipc() grew bigger than
> > "ids->in_use". That is a quite inefficient way to get to the maximum index
> > in the id lookup table, especially when that value is already provided by
> > struct ipc_ids.max_idx.
> >
> > This patch follows up on the optimization introduced via commit 15df03c879836
> > ("sysvipc: make get_maxid O(1) again") and gets rid of the aforementioned
> > costly loop, replacing it with a simpler check based on the value returned by
> > ipc_get_maxidx(), which allows for a smooth linear increase in time complexity
> > for the same custom benchmark:
> >
> >       2 msecs to read   1024 segs from /proc/sysvipc/shm
> >       2 msecs to read   2048 segs from /proc/sysvipc/shm
> >       4 msecs to read   4096 segs from /proc/sysvipc/shm
> >       9 msecs to read   8192 segs from /proc/sysvipc/shm
> >      19 msecs to read  16384 segs from /proc/sysvipc/shm
> >      39 msecs to read  32768 segs from /proc/sysvipc/shm
> 
> Could you run your test with the attached patch?
>

Manfred, 

Sorry it took me a while to get back to you here. (coming back from a short
leave). I'll take a look into your approach and report back in a few days.

Cheers,
-- Rafael

> The patch switches the code to idr_get_next(), and I see a speedup by a
> factor of 400 for this test:
> 
> - boot with ipcmni_extend
> 
> - create ipc object
> 
> - echo 16000000 > /proc/sys/kernel/msg_next_id
> 
> - create ipc object
> 
> - time cat /proc/sysvipc/msg
> 
> with current mainline: 8.65 seconds
> 
> with the patch: 0.02 seconds
> 
> 
> If there are no gaps, then I would assume there is no speed-up compared
> to your patch, but it would be great if you could check
> 
> [and check that there is no slow-down]
> 
> 
> Thanks,
> 
> --
> 
>      Manfred
> 

> From 4b7975d712db27c3d08731e0ebe4efd684256ca4 Mon Sep 17 00:00:00 2001
> From: Manfred Spraul <manfred@colorfullife.com>
> Date: Fri, 20 Aug 2021 21:08:12 +0200
> Subject: [PATCH] [PATCH] Improve sysvipc_find_ipc()
> 
> Initially noticed by Rafael Aquini, see
> https://lore.kernel.org/lkml/20210809203554.1562989-1-aquini@redhat.com/
> 
> The algorithm used in sysvipc_find_ipc() is highly inefficient.
> It actually needs to find the next used index in an idr, and it uses
> a for loop to locate that entry.
> 
> But: The IDR API contains idr_get_next(), thus switch the code to use
> idr_get_next().
> 
> In addition: Update a few comments.
> 
> Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
> ---
>  ipc/util.c | 43 ++++++++++++++++++++-----------------------
>  1 file changed, 20 insertions(+), 23 deletions(-)
> 
> diff --git a/ipc/util.c b/ipc/util.c
> index 0027e47626b7..083fd6dba1a1 100644
> --- a/ipc/util.c
> +++ b/ipc/util.c
> @@ -783,35 +783,32 @@ struct pid_namespace *ipc_seq_pid_ns(struct seq_file *s)
>  }
>  
>  /*
> - * This routine locks the ipc structure found at least at position pos.
> + * This routine locks the ipc structure found at least at index pos.
>   */
>  static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos,
>  					      loff_t *new_pos)
>  {
> +	int tmpidx;
>  	struct kern_ipc_perm *ipc;
> -	int total, id;
> -
> -	total = 0;
> -	for (id = 0; id < pos && total < ids->in_use; id++) {
> -		ipc = idr_find(&ids->ipcs_idr, id);
> -		if (ipc != NULL)
> -			total++;
> -	}
>  
> -	ipc = NULL;
> -	if (total >= ids->in_use)
> -		goto out;
> +	tmpidx = pos;
>  
> -	for (; pos < ipc_mni; pos++) {
> -		ipc = idr_find(&ids->ipcs_idr, pos);
> -		if (ipc != NULL) {
> -			rcu_read_lock();
> -			ipc_lock_object(ipc);
> -			break;
> -		}
> +	ipc = idr_get_next(&ids->ipcs_idr, &tmpidx);
> +	if (ipc != NULL) {
> +		rcu_read_lock();
> +		ipc_lock_object(ipc);
> +		/*
> +		 * We found the object with the index tmpidx.
> +		 * For next search, start with tmpidx+1
> +		 */
> +		*new_pos = tmpidx + 1;
> +	} else {
> +		/*
> +		 * EOF. seq_file can't notice that, thus
> +		 * move the offset by one.
> +		 */
> +		*new_pos = pos + 1;
>  	}
> -out:
> -	*new_pos = pos + 1;
>  	return ipc;
>  }
>  
> @@ -829,7 +826,7 @@ static void *sysvipc_proc_next(struct seq_file *s, void *it, loff_t *pos)
>  }
>  
>  /*
> - * File positions: pos 0 -> header, pos n -> ipc id = n - 1.
> + * File positions: pos 0 -> header, pos n -> ipc idx = n - 1.
>   * SeqFile iterator: iterator value locked ipc pointer or SEQ_TOKEN_START.
>   */
>  static void *sysvipc_proc_start(struct seq_file *s, loff_t *pos)
> @@ -854,7 +851,7 @@ static void *sysvipc_proc_start(struct seq_file *s, loff_t *pos)
>  	if (*pos == 0)
>  		return SEQ_START_TOKEN;
>  
> -	/* Find the (pos-1)th ipc */
> +	/* Find the ipc object with the index >= (pos-1) */
>  	return sysvipc_find_ipc(ids, *pos - 1, pos);
>  }
>  
> -- 
> 2.31.1
> 
> 



* Re: [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc()
  2021-08-25 22:15   ` Rafael Aquini
@ 2021-08-27  2:17     ` Rafael Aquini
  0 siblings, 0 replies; 8+ messages in thread
From: Rafael Aquini @ 2021-08-27  2:17 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: linux-kernel, Andrew Morton, Davidlohr Bueso, Waiman Long, 1vier1

On Wed, Aug 25, 2021 at 06:15:06PM -0400, Rafael Aquini wrote:
> On Fri, Aug 20, 2021 at 09:41:32PM +0200, Manfred Spraul wrote:
> > Hi Rafael,
> > 
> > On 8/9/21 10:35 PM, Rafael Aquini wrote:
> > > sysvipc_find_ipc() was left with a costly way to check if the offset
> > > position fed to it is bigger than the total number of IPC IDs in use.
> > > So much so that the time it takes to iterate over /proc/sysvipc/* files
> > > grows exponentially for a custom benchmark that creates "N" SYSV shm
> > > segments and then times the read of /proc/sysvipc/shm (milliseconds):
> > >
> > >      12 msecs to read   1024 segs from /proc/sysvipc/shm
> > >      18 msecs to read   2048 segs from /proc/sysvipc/shm
> > >      65 msecs to read   4096 segs from /proc/sysvipc/shm
> > >     325 msecs to read   8192 segs from /proc/sysvipc/shm
> > >    1303 msecs to read  16384 segs from /proc/sysvipc/shm
> > >    5182 msecs to read  32768 segs from /proc/sysvipc/shm
> > >
> > > The root problem lies with the loop that counts the total number of ids
> > > in use in order to check if the "pos" fed to sysvipc_find_ipc() grew bigger than
> > > "ids->in_use". That is a quite inefficient way to get to the maximum index
> > > in the id lookup table, especially when that value is already provided by
> > > struct ipc_ids.max_idx.
> > >
> > > This patch follows up on the optimization introduced via commit 15df03c879836
> > > ("sysvipc: make get_maxid O(1) again") and gets rid of the aforementioned
> > > costly loop, replacing it with a simpler check based on the value returned by
> > > ipc_get_maxidx(), which allows for a smooth linear increase in time complexity
> > > for the same custom benchmark:
> > >
> > >       2 msecs to read   1024 segs from /proc/sysvipc/shm
> > >       2 msecs to read   2048 segs from /proc/sysvipc/shm
> > >       4 msecs to read   4096 segs from /proc/sysvipc/shm
> > >       9 msecs to read   8192 segs from /proc/sysvipc/shm
> > >      19 msecs to read  16384 segs from /proc/sysvipc/shm
> > >      39 msecs to read  32768 segs from /proc/sysvipc/shm
> > 
> > Could you run your test with the attached patch?
> >
> 
> Manfred, 
> 
> Sorry it took me a while to get back to you here. (coming back from a short
> leave). I'll take a look into your approach and report back in a few days.
>

Manfred,

as promised I re-ran the tests, adjusting the shm workload to leave a gap in
the ids between the runs (effectively deleting all previously created segments
before recreating the batch for the next turn), and the outcome between the two
patches is virtually the same:

* your patch:

     1 msecs to read   1024 segs from /proc/sysvipc/shm
     2 msecs to read   2048 segs from /proc/sysvipc/shm
     4 msecs to read   4096 segs from /proc/sysvipc/shm
     9 msecs to read   8192 segs from /proc/sysvipc/shm
    24 msecs to read  16384 segs from /proc/sysvipc/shm
    44 msecs to read  32768 segs from /proc/sysvipc/shm



* my patch:

     1 msecs to read   1024 segs from /proc/sysvipc/shm
     2 msecs to read   2048 segs from /proc/sysvipc/shm
     4 msecs to read   4096 segs from /proc/sysvipc/shm
     9 msecs to read   8192 segs from /proc/sysvipc/shm
    22 msecs to read  16384 segs from /proc/sysvipc/shm
    45 msecs to read  32768 segs from /proc/sysvipc/shm



and even perf stat numbers are virtually the same for reading
32768 segments from /proc/sysvipc/shm:

* your patch

 Performance counter stats for 'cat /proc/sysvipc/shm' (10 runs):

             45.03 msec task-clock                #    0.990 CPUs utilized            ( +-  0.27% )
                 0      context-switches          #    0.004 K/sec                    ( +-100.00% )
                 0      cpu-migrations            #    0.004 K/sec                    ( +-100.00% )
                70      page-faults               #    0.002 M/sec                    ( +-  0.61% )
       149,631,684      cycles                    #    3.323 GHz                      ( +-  0.25% )  (82.23%)
         2,148,928      stalled-cycles-frontend   #    1.44% frontend cycles idle     ( +-  2.21% )  (82.24%)
        67,488,510      stalled-cycles-backend    #   45.10% backend cycles idle      ( +-  0.73% )  (83.32%)
       249,186,578      instructions              #    1.67  insn per cycle
                                                  #    0.27  stalled cycles per insn  ( +-  0.03% )  (84.31%)
        62,268,386      branches                  # 1382.790 M/sec                    ( +-  0.02% )  (84.45%)
            83,170      branch-misses             #    0.13% of all branches          ( +-  1.57% )  (83.44%)

          0.045485 +- 0.000142 seconds time elapsed  ( +-  0.31% )



* my patch:

 Performance counter stats for 'cat /proc/sysvipc/shm' (10 runs):

             45.22 msec task-clock                #    0.990 CPUs utilized            ( +-  0.24% )
                 0      context-switches          #    0.002 K/sec                    ( +-100.00% )
                 0      cpu-migrations            #    0.000 K/sec
                70      page-faults               #    0.002 M/sec                    ( +-  0.92% )
       150,273,699      cycles                    #    3.323 GHz                      ( +-  0.25% )  (82.31%)
        12,288,154      stalled-cycles-frontend   #    8.18% frontend cycles idle     ( +-  0.95% )  (82.31%)
        61,198,482      stalled-cycles-backend    #   40.72% backend cycles idle      ( +-  0.91% )  (83.14%)
       245,485,497      instructions              #    1.63  insn per cycle
                                                  #    0.25  stalled cycles per insn  ( +-  0.03% )  (84.41%)
        61,736,363      branches                  # 1365.267 M/sec                    ( +-  0.02% )  (84.52%)
            82,381      branch-misses             #    0.13% of all branches          ( +-  1.17% )  (83.31%)

          0.045671 +- 0.000125 seconds time elapsed  ( +-  0.27% )


I'll leave it up to you, as clearly both approaches do pretty well in reducing
the noted overhead imposed by the old approach.

Since my patch is already in linux-next, may I ask you to ack it and follow it
up there if you decide to go with your approach? The diff below applies
cleanly on top of the linux-next head (it's the one I used in the tests).

Cheers!
Rafael

--

diff --git a/ipc/util.c b/ipc/util.c
index d48d8cfa1f3f..261540154782 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -788,22 +788,26 @@ struct pid_namespace *ipc_seq_pid_ns(struct seq_file *s)
 static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos,
 					      loff_t *new_pos)
 {
-	struct kern_ipc_perm *ipc = NULL;
-	int max_idx = ipc_get_maxidx(ids);
+	struct kern_ipc_perm *ipc;
+	int tmpidx = pos;
 
-	if (max_idx == -1 || pos > max_idx)
-		goto out;
-
-	for (; pos <= max_idx; pos++) {
-		ipc = idr_find(&ids->ipcs_idr, pos);
-		if (ipc != NULL) {
-			rcu_read_lock();
-			ipc_lock_object(ipc);
-			break;
-		}
+	ipc = idr_get_next(&ids->ipcs_idr, &tmpidx);
+	if (ipc != NULL) {
+		rcu_read_lock();
+		ipc_lock_object(ipc);
+		/*
+		 * We found the object with the index tmpidx.
+		 * For next search, start with tmpidx+1
+		 */
+		*new_pos = tmpidx + 1;
+	} else {
+		/*
+		 * EOF. seq_file can't notice that, thus
+		 * move the offset by one.
+		 */
+		*new_pos = pos + 1;
 	}
-out:
-	*new_pos = pos + 1;
+
 	return ipc;
 }
 



* Re: [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc()
  2021-08-09 20:35 [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc() Rafael Aquini
                   ` (3 preceding siblings ...)
  2021-08-20 19:41 ` Manfred Spraul
@ 2021-08-28 16:35 ` Manfred Spraul
  4 siblings, 0 replies; 8+ messages in thread
From: Manfred Spraul @ 2021-08-28 16:35 UTC (permalink / raw)
  To: Rafael Aquini, linux-kernel; +Cc: Andrew Morton, Davidlohr Bueso, Waiman Long

Hi Rafael,

On 8/9/21 10:35 PM, Rafael Aquini wrote:
> sysvipc_find_ipc() was left with a costly way to check if the offset
> position fed to it is bigger than the total number of IPC IDs in use.
> So much so that the time it takes to iterate over /proc/sysvipc/* files
> grows exponentially for a custom benchmark that creates "N" SYSV shm
> segments and then times the read of /proc/sysvipc/shm (milliseconds):
>
>      12 msecs to read   1024 segs from /proc/sysvipc/shm
>      18 msecs to read   2048 segs from /proc/sysvipc/shm
>      65 msecs to read   4096 segs from /proc/sysvipc/shm
>     325 msecs to read   8192 segs from /proc/sysvipc/shm
>    1303 msecs to read  16384 segs from /proc/sysvipc/shm
>    5182 msecs to read  32768 segs from /proc/sysvipc/shm
>
> The root problem lies with the loop that counts the total number of ids
> in use in order to check if the "pos" fed to sysvipc_find_ipc() grew bigger than
> "ids->in_use". That is a quite inefficient way to get to the maximum index
> in the id lookup table, especially when that value is already provided by
> struct ipc_ids.max_idx.
>
> This patch follows up on the optimization introduced via commit 15df03c879836
> ("sysvipc: make get_maxid O(1) again") and gets rid of the aforementioned
> costly loop, replacing it with a simpler check based on the value returned by
> ipc_get_maxidx(), which allows for a smooth linear increase in time complexity
> for the same custom benchmark:
>
>       2 msecs to read   1024 segs from /proc/sysvipc/shm
>       2 msecs to read   2048 segs from /proc/sysvipc/shm
>       4 msecs to read   4096 segs from /proc/sysvipc/shm
>       9 msecs to read   8192 segs from /proc/sysvipc/shm
>      19 msecs to read  16384 segs from /proc/sysvipc/shm
>      39 msecs to read  32768 segs from /proc/sysvipc/shm
>
> Signed-off-by: Rafael Aquini <aquini@redhat.com>

I've tested the patch, no issues or objections.
Thus: let's merge your patch, and then switch to idr_get_next() later.

Acked-by: Manfred Spraul <manfred@colorfullife.com>

--

     Manfred



end of thread, other threads:[~2021-08-28 16:36 UTC | newest]

Thread overview: 8+ messages
2021-08-09 20:35 [PATCH] ipc: replace costly bailout check in sysvipc_find_ipc() Rafael Aquini
2021-08-10  1:48 ` Waiman Long
2021-08-10  4:40 ` Davidlohr Bueso
2021-08-17 18:34 ` Manfred Spraul
2021-08-20 19:41 ` Manfred Spraul
2021-08-25 22:15   ` Rafael Aquini
2021-08-27  2:17     ` Rafael Aquini
2021-08-28 16:35 ` Manfred Spraul
