LKML Archive on lore.kernel.orghelp / color / mirror / Atom feed

*[PATCH] lib/kstrtox.c break if overflow is detected[not found] <aksgarg1989@gmail.com>@ 2015-01-21 19:26 ` Anshul Garg2015-01-21 19:39 ` Levente Kurusa 2015-01-22 13:58 ` [PATCH] lib/kstrtox.c Stop parsing integer on overflow Anshul Garg ` (6 subsequent siblings) 7 siblings, 1 reply; 41+ messages in thread From: Anshul Garg @ 2015-01-21 19:26 UTC (permalink / raw) To: akpm, levex, felipe.contreras, linux-kernel;+Cc:aksgarg1989, anshul.g From: Anshul Garg <aksgarg1989@gmail.com> 1. While converting string representation to integer break the loop if overflow is detected. 2. Clean kstrtoll function Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> --- lib/kstrtox.c | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/lib/kstrtox.c b/lib/kstrtox.c index ec8da78..8cbe5ca 100644 --- a/lib/kstrtox.c +++ b/lib/kstrtox.c @@ -70,8 +70,10 @@ unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long * it in the max base we support (16) */ if (unlikely(res & (~0ull << 60))) { - if (res > div_u64(ULLONG_MAX - val, base)) + if (res > div_u64(ULLONG_MAX - val, base)) { overflow = 1; + break; + } } res = res * base + val; rv++; @@ -146,23 +148,19 @@ EXPORT_SYMBOL(kstrtoull); int kstrtoll(const char *s, unsigned int base, long long *res) { unsigned long long tmp; - int rv; + int rv, sign = 1; if (s[0] == '-') { - rv = _kstrtoull(s + 1, base, &tmp); - if (rv < 0) - return rv; - if ((long long)(-tmp) >= 0) - return -ERANGE; - *res = -tmp; - } else { - rv = kstrtoull(s, base, &tmp); - if (rv < 0) - return rv; - if ((long long)tmp < 0) - return -ERANGE; - *res = tmp; + sign = -1; + s++; } + + rv = kstrtoull(s, base, &tmp); + if (rv < 0) + return rv; + if ((long long)tmp < 0) + return -ERANGE; + *res = sign * tmp; return 0; } EXPORT_SYMBOL(kstrtoll); -- 1.7.9.5 --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/kstrtox.c break if overflow is detected2015-01-21 19:26 ` [PATCH] lib/kstrtox.c break if overflow is detected Anshul Garg@ 2015-01-21 19:39 ` Levente Kurusa2015-01-22 13:38 ` Anshul Garg 0 siblings, 1 reply; 41+ messages in thread From: Levente Kurusa @ 2015-01-21 19:39 UTC (permalink / raw) To: Anshul Garg;+Cc:akpm, felipe.contreras, linux-kernel, anshul.g [-- Attachment #1: Type: text/plain, Size: 2030 bytes --] Hi, On Wed, Jan 21, 2015 at 11:26:26AM -0800, Anshul Garg wrote: > From: Anshul Garg <aksgarg1989@gmail.com> > > 1. While converting string representation to integer > break the loop if overflow is detected. > 2. Clean kstrtoll function > > Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> > --- > lib/kstrtox.c | 28 +++++++++++++--------------- > 1 file changed, 13 insertions(+), 15 deletions(-) > > diff --git a/lib/kstrtox.c b/lib/kstrtox.c > index ec8da78..8cbe5ca 100644 > --- a/lib/kstrtox.c > +++ b/lib/kstrtox.c > @@ -70,8 +70,10 @@ unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long > * it in the max base we support (16) > */ > if (unlikely(res & (~0ull << 60))) { > - if (res > div_u64(ULLONG_MAX - val, base)) > + if (res > div_u64(ULLONG_MAX - val, base)) { > overflow = 1; > + break; > + } > } > res = res * base + val; > rv++; > @@ -146,23 +148,19 @@ EXPORT_SYMBOL(kstrtoull); > int kstrtoll(const char *s, unsigned int base, long long *res) > { > unsigned long long tmp; > - int rv; > + int rv, sign = 1; > > if (s[0] == '-') { > - rv = _kstrtoull(s + 1, base, &tmp); > - if (rv < 0) > - return rv; > - if ((long long)(-tmp) >= 0) > - return -ERANGE; > - *res = -tmp; > - } else { > - rv = kstrtoull(s, base, &tmp); > - if (rv < 0) > - return rv; > - if ((long long)tmp < 0) > - return -ERANGE; > - *res = tmp; > + sign = -1; > + s++; > } > + > + rv = kstrtoull(s, base, &tmp); > + if (rv < 0) > + return rv; > + if ((long long)tmp < 0) > + return -ERANGE; > + *res = sign * tmp; > return 0; > } > EXPORT_SYMBOL(kstrtoll); Looks correct to me, so: Reviewed-by: Levente Kurusa <levex@linux.com> But I believe the two hunks are completely unrelated to eachother and hence should split into a patch series. Could you please do that and resend? Or if Andrew is OK with picking it up as is, then it's fine. Thanks, Levente. [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 490 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/kstrtox.c break if overflow is detected2015-01-21 19:39 ` Levente Kurusa@ 2015-01-22 13:38 ` Anshul Garg0 siblings, 0 replies; 41+ messages in thread From: Anshul Garg @ 2015-01-22 13:38 UTC (permalink / raw) To: Levente Kurusa;+Cc:akpm, Felipe Contreras, linux-kernel, anshul.g Dear Mr. Levente , Thanks for the reply. I will segregate this patch in two different patches. I will send these patches again. Thanks Anshul Garg On Thu, Jan 22, 2015 at 1:09 AM, Levente Kurusa <levex@linux.com> wrote: > Hi, > > On Wed, Jan 21, 2015 at 11:26:26AM -0800, Anshul Garg wrote: >> From: Anshul Garg <aksgarg1989@gmail.com> >> >> 1. While converting string representation to integer >> break the loop if overflow is detected. >> 2. Clean kstrtoll function >> >> Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> >> --- >> lib/kstrtox.c | 28 +++++++++++++--------------- >> 1 file changed, 13 insertions(+), 15 deletions(-) >> >> diff --git a/lib/kstrtox.c b/lib/kstrtox.c >> index ec8da78..8cbe5ca 100644 >> --- a/lib/kstrtox.c >> +++ b/lib/kstrtox.c >> @@ -70,8 +70,10 @@ unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long >> * it in the max base we support (16) >> */ >> if (unlikely(res & (~0ull << 60))) { >> - if (res > div_u64(ULLONG_MAX - val, base)) >> + if (res > div_u64(ULLONG_MAX - val, base)) { >> overflow = 1; >> + break; >> + } >> } >> res = res * base + val; >> rv++; >> @@ -146,23 +148,19 @@ EXPORT_SYMBOL(kstrtoull); >> int kstrtoll(const char *s, unsigned int base, long long *res) >> { >> unsigned long long tmp; >> - int rv; >> + int rv, sign = 1; >> >> if (s[0] == '-') { >> - rv = _kstrtoull(s + 1, base, &tmp); >> - if (rv < 0) >> - return rv; >> - if ((long long)(-tmp) >= 0) >> - return -ERANGE; >> - *res = -tmp; >> - } else { >> - rv = kstrtoull(s, base, &tmp); >> - if (rv < 0) >> - return rv; >> - if ((long long)tmp < 0) >> - return -ERANGE; >> - *res = tmp; >> + sign = -1; >> + s++; >> } >> + >> + rv = kstrtoull(s, base, &tmp); >> + if (rv < 0) >> + return rv; >> + if ((long long)tmp < 0) >> + return -ERANGE; >> + *res = sign * tmp; >> return 0; >> } >> EXPORT_SYMBOL(kstrtoll); > > Looks correct to me, so: > > Reviewed-by: Levente Kurusa <levex@linux.com> > > But I believe the two hunks are completely unrelated to > eachother and hence should split into a patch series. > > Could you please do that and resend? Or if Andrew is OK > with picking it up as is, then it's fine. > > Thanks, > Levente. ^ permalink raw reply [flat|nested] 41+ messages in thread

*[PATCH] lib/kstrtox.c Stop parsing integer on overflow[not found] <aksgarg1989@gmail.com> 2015-01-21 19:26 ` [PATCH] lib/kstrtox.c break if overflow is detected Anshul Garg@ 2015-01-22 13:58 ` Anshul Garg2015-01-29 13:35 ` [PATCH] lib/int_sqrt.c: Optimize square root function Anshul Garg ` (5 subsequent siblings) 7 siblings, 0 replies; 41+ messages in thread From: Anshul Garg @ 2015-01-22 13:58 UTC (permalink / raw) To: akpm, levex, felipe.contreras, linux-kernel;+Cc:aksgarg1989, anshul.g From: Anshul Garg <aksgarg1989@gmail.com> While converting string representation to integer break the loop if overflow is detected. Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> --- lib/kstrtox.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/lib/kstrtox.c b/lib/kstrtox.c index ec8da78..6f30209 100644 --- a/lib/kstrtox.c +++ b/lib/kstrtox.c @@ -70,8 +70,10 @@ unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long * it in the max base we support (16) */ if (unlikely(res & (~0ull << 60))) { - if (res > div_u64(ULLONG_MAX - val, base)) + if (res > div_u64(ULLONG_MAX - val, base)) { overflow = 1; + break; + } } res = res * base + val; rv++; -- 1.7.9.5 --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com ^ permalink raw reply [flat|nested] 41+ messages in thread

*[PATCH] lib/int_sqrt.c: Optimize square root function[not found] <aksgarg1989@gmail.com> 2015-01-21 19:26 ` [PATCH] lib/kstrtox.c break if overflow is detected Anshul Garg 2015-01-22 13:58 ` [PATCH] lib/kstrtox.c Stop parsing integer on overflow Anshul Garg@ 2015-01-29 13:35 ` Anshul Garg2015-01-31 17:31 ` [RESEND PATCH] " Anshul Garg ` (4 subsequent siblings) 7 siblings, 0 replies; 41+ messages in thread From: Anshul Garg @ 2015-01-29 13:35 UTC (permalink / raw) To: linux-kernel;+Cc:aksgarg1989, anshul.g From: Anshul Garg <aksgarg1989@gmail.com> Unnecessary instructions are executing even though m is greater than x so added logic to make m less than equal to x before performing these operations. Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> --- lib/int_sqrt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c index 1ef4cc3..64ae722 100644 --- a/lib/int_sqrt.c +++ b/lib/int_sqrt.c @@ -22,6 +22,9 @@ unsigned long int_sqrt(unsigned long x) return x; m = 1UL << (BITS_PER_LONG - 2); + + while (m > x) + m >>= 2; while (m != 0) { b = y + m; y >>= 1; -- 1.7.9.5 --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com ^ permalink raw reply [flat|nested] 41+ messages in thread

*[RESEND PATCH] lib/int_sqrt.c: Optimize square root function[not found] <aksgarg1989@gmail.com> ` (2 preceding siblings ...) 2015-01-29 13:35 ` [PATCH] lib/int_sqrt.c: Optimize square root function Anshul Garg@ 2015-01-31 17:31 ` Anshul Garg2015-02-02 17:12 ` [PATCH] " Anshul Garg ` (3 subsequent siblings) 7 siblings, 0 replies; 41+ messages in thread From: Anshul Garg @ 2015-01-31 17:31 UTC (permalink / raw) To: linux-kernel;+Cc:aksgarg1989, anshul.g From: Anshul Garg <aksgarg1989@gmail.com> Unnecessary instructions are executing even though m is greater than x so added logic to make m less than equal to x before performing these operations. Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> --- lib/int_sqrt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c index 1ef4cc3..64ae722 100644 --- a/lib/int_sqrt.c +++ b/lib/int_sqrt.c @@ -22,6 +22,9 @@ unsigned long int_sqrt(unsigned long x) return x; m = 1UL << (BITS_PER_LONG - 2); + + while (m > x) + m >>= 2; while (m != 0) { b = y + m; y >>= 1; -- 1.7.9.5 --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com ^ permalink raw reply [flat|nested] 41+ messages in thread

*[PATCH] lib/int_sqrt.c: Optimize square root function[not found] <aksgarg1989@gmail.com> ` (3 preceding siblings ...) 2015-01-31 17:31 ` [RESEND PATCH] " Anshul Garg@ 2015-02-02 17:12 ` Anshul Garg2015-02-02 19:00 ` Linus Torvalds 2015-02-02 19:10 ` Joe Perches 2015-02-16 18:48 ` [PATCH] lib/kstrtox.c Stop parsing integer on overflow Anshul Garg ` (2 subsequent siblings) 7 siblings, 2 replies; 41+ messages in thread From: Anshul Garg @ 2015-02-02 17:12 UTC (permalink / raw) To: linux-kernel;+Cc:aksgarg1989, anshul.g, torvalds From: Anshul Garg <aksgarg1989@gmail.com> Unnecessary instructions are executing even though m is greater than x so added logic to make m less than equal to x before performing these operations. Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> --- lib/int_sqrt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c index 1ef4cc3..64ae722 100644 --- a/lib/int_sqrt.c +++ b/lib/int_sqrt.c @@ -22,6 +22,9 @@ unsigned long int_sqrt(unsigned long x) return x; m = 1UL << (BITS_PER_LONG - 2); + + while (m > x) + m >>= 2; while (m != 0) { b = y + m; y >>= 1; -- 1.7.9.5 --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-02 17:12 ` [PATCH] " Anshul Garg@ 2015-02-02 19:00 ` Linus Torvalds2015-02-02 19:13 ` Linus Torvalds 2015-02-03 4:47 ` Davidlohr Bueso 2015-02-02 19:10 ` Joe Perches 1 sibling, 2 replies; 41+ messages in thread From: Linus Torvalds @ 2015-02-02 19:00 UTC (permalink / raw) To: Anshul Garg, Davidlohr Bueso;+Cc:Linux Kernel Mailing List, anshul.g Hmm. I don't disagree, but would like some more feedback. Davidlohr - you were the person to touch this function last (commit 30493cc9dddb: "lib/int_sqrt.c: optimize square root algorithm"), and you did so for performance reasons. And in fact, when you did that, you removed that initial loop: - one = 1UL << (BITS_PER_LONG - 2); - while (one > op) - one >>= 2; but I'm not sure that was actually all that conscious, I think the real optimization was the changes inside the loop to make the final real loop faster and simpler. Also, you had performance numbers, so presumably a test harness for it all. It probably depends a lot on the actual distribution of argument values, of course, but it would be good to accompany the patch with actual real numbers like lasty time. (I'm also not entirely sure what uses int_sqrt() that ends up being so performance-critical, so it would be good to document that too, since that probably also matters for the "what's the normal argument range" question..) Linus On Mon, Feb 2, 2015 at 9:12 AM, Anshul Garg <aksgarg1989@gmail.com> wrote: > From: Anshul Garg <aksgarg1989@gmail.com> > > Unnecessary instructions are executing even though m is > greater than x so added logic to make m less than equal to > x before performing these operations. > > Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> > --- > lib/int_sqrt.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c > index 1ef4cc3..64ae722 100644 > --- a/lib/int_sqrt.c > +++ b/lib/int_sqrt.c > @@ -22,6 +22,9 @@ unsigned long int_sqrt(unsigned long x) > return x; > > m = 1UL << (BITS_PER_LONG - 2); > + > + while (m > x) > + m >>= 2; > while (m != 0) { > b = y + m; > y >>= 1; > -- > 1.7.9.5 > > > --- > This email has been checked for viruses by Avast antivirus software. > http://www.avast.com > ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-02 17:12 ` [PATCH] " Anshul Garg 2015-02-02 19:00 ` Linus Torvalds@ 2015-02-02 19:10 ` Joe Perches2015-02-02 19:39 ` Linus Torvalds 1 sibling, 1 reply; 41+ messages in thread From: Joe Perches @ 2015-02-02 19:10 UTC (permalink / raw) To: Anshul Garg;+Cc:linux-kernel, anshul.g, torvalds On Mon, 2015-02-02 at 09:12 -0800, Anshul Garg wrote: > From: Anshul Garg <aksgarg1989@gmail.com> > > Unnecessary instructions are executing even though m is > greater than x so added logic to make m less than equal to > x before performing these operations. > > Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> > --- > lib/int_sqrt.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c [] > @@ -22,6 +22,9 @@ unsigned long int_sqrt(unsigned long x) > return x; > > m = 1UL << (BITS_PER_LONG - 2); > + > + while (m > x) > + m >>= 2; Perhaps removing the while and using fls(x) would be better. Perhaps something like: m = 1UL << min(sizeof(unsigned long) == sizeof(u64) ? fls64(x) : fls(x), BITS_PER_LONG - 2); > while (m != 0) { > b = y + m; > y >>= 1; ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-02 19:00 ` Linus Torvalds@ 2015-02-02 19:13 ` Linus Torvalds2015-02-03 4:41 ` Davidlohr Bueso 2017-07-20 11:24 ` Peter Zijlstra 2015-02-03 4:47 ` Davidlohr Bueso 1 sibling, 2 replies; 41+ messages in thread From: Linus Torvalds @ 2015-02-02 19:13 UTC (permalink / raw) To: Anshul Garg, Davidlohr Bueso;+Cc:Linux Kernel Mailing List, anshul.g On Mon, Feb 2, 2015 at 11:00 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > (I'm also not entirely sure what uses int_sqrt() that ends up being so > performance-critical, so it would be good to document that too, since > that probably also matters for the "what's the normal argument range" > question..) ... it's also not entirely clear that we need a whole new loop. We might just instead start off with a better guess for 'm' using some calculation that might be doable with a single conditional move instruction instead of a loop. Because I suspect that the inevitable branch misprediction of a new loop is likely as expensive as a few iterations through the core one. IOW, instead of m = 1UL << (BITS_PER_LONG - 2); perhaps something like m = 1UL << (BITS_PER_LONG/2- 2); if (m < x) m <<= BITS_PER_LONG/2; (assuming gcc can change that code into a "cmov") might cut down the "lots of empty loops" case in half for small values of 'x'. There's probably some other better cheap initial guess value estimator. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-02 19:10 ` Joe Perches@ 2015-02-02 19:39 ` Linus Torvalds0 siblings, 0 replies; 41+ messages in thread From: Linus Torvalds @ 2015-02-02 19:39 UTC (permalink / raw) To: Joe Perches;+Cc:Anshul Garg, Linux Kernel Mailing List, anshul.g On Mon, Feb 2, 2015 at 11:10 AM, Joe Perches <joe@perches.com> wrote: > > Perhaps removing the while and using fls(x) would be better. No. fls is often slow, and you then do have to also round it down to an even bit number etc, so it gets somewhat complex too. We *probably* have some argument range that we care more about, which is why I'd like to know what the profile is that triggered this optimization, and what the common argument range is. For example, one user is the cpu frequency governor, which seems to try to predict the length of the next sleep. The values can probably be anything (it has protections for overflowing into "long long"), but presumably the onyl time we care about performance is when it's called very very often, in which case I'm going out on a limb and saying that the intervals will be short, and the standard deviation of those intervals will also be small. So *maybe* that function really only cares about performance when we have small arguments. I dunno. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-02 19:13 ` Linus Torvalds@ 2015-02-03 4:41 ` Davidlohr Bueso2015-02-03 4:57 ` Davidlohr Bueso 2015-02-03 15:54 ` Anshul Garg 2017-07-20 11:24 ` Peter Zijlstra 1 sibling, 2 replies; 41+ messages in thread From: Davidlohr Bueso @ 2015-02-03 4:41 UTC (permalink / raw) To: Linus Torvalds;+Cc:Anshul Garg, Linux Kernel Mailing List, anshul.g On Mon, 2015-02-02 at 11:13 -0800, Linus Torvalds wrote: > IOW, instead of > > m = 1UL << (BITS_PER_LONG - 2); > > perhaps something like > > m = 1UL << (BITS_PER_LONG/2- 2); > if (m < x) > m <<= BITS_PER_LONG/2; > > (assuming gcc can change that code into a "cmov") might cut down the > "lots of empty loops" case in half for small values of 'x'. Makes a lot of sense. > There's probably some other better cheap initial guess value estimator. Just to get a feeling for the normal arg range in the non-driver parts that use this thing: fs/ceph/super.h: congestion_kb = (16*int_sqrt(totalram_pages)) << (PAGE_SHIFT-10); fs/nfs/write.c: nfs_congestion_kb = (16*int_sqrt(totalram_pages)) << (PAGE_SHIFT-10); fs/nfsd/nfscache.c: limit = (16 * int_sqrt(low_pages)) << (PAGE_SHIFT-10); kernel/rcu/tree_plugin.h: ls = int_sqrt(nr_cpu_ids); mm/memcontrol.c: inactive_ratio = int_sqrt(10 * gb); mm/page_alloc.c: ratio = int_sqrt(10 * gb); mm/page_alloc.c: new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16); So mostly values that scale according to mem or cpu. ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-02 19:00 ` Linus Torvalds 2015-02-02 19:13 ` Linus Torvalds@ 2015-02-03 4:47 ` Davidlohr Bueso2015-02-03 15:42 ` Anshul Garg 1 sibling, 1 reply; 41+ messages in thread From: Davidlohr Bueso @ 2015-02-03 4:47 UTC (permalink / raw) To: Linus Torvalds;+Cc:Anshul Garg, Linux Kernel Mailing List, anshul.g On Mon, 2015-02-02 at 11:00 -0800, Linus Torvalds wrote: > Hmm. I don't disagree, but would like some more feedback. > > Davidlohr - you were the person to touch this function last (commit > 30493cc9dddb: "lib/int_sqrt.c: optimize square root algorithm"), and > you did so for performance reasons. And in fact, when you did that, > you removed that initial loop: > > - one = 1UL << (BITS_PER_LONG - 2); > - while (one > op) > - one >>= 2; > > but I'm not sure that was actually all that conscious, I think the > real optimization was the changes inside the loop to make the final > real loop faster and simpler. I missed that. And yes, the real optimization should be in the loop. > > Also, you had performance numbers, so presumably a test harness for it > all. It probably depends a lot on the actual distribution of argument > values, of course, but it would be good to accompany the patch with > actual real numbers like lasty time. Aha. In my case I recall I ran a usersapce program using each function from 1 to a million, and throwing perf at it for 10 times. > (I'm also not entirely sure what uses int_sqrt() that ends up being so > performance-critical, so it would be good to document that too, since > that probably also matters for the "what's the normal argument range" > question..) It's not a big deal afaik. Thanks, Davidlohr ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-03 4:41 ` Davidlohr Bueso@ 2015-02-03 4:57 ` Davidlohr Bueso2015-02-03 15:54 ` Anshul Garg 1 sibling, 0 replies; 41+ messages in thread From: Davidlohr Bueso @ 2015-02-03 4:57 UTC (permalink / raw) To: Linus Torvalds;+Cc:Anshul Garg, Linux Kernel Mailing List, anshul.g On Mon, 2015-02-02 at 20:41 -0800, Davidlohr Bueso wrote: > On Mon, 2015-02-02 at 11:13 -0800, Linus Torvalds wrote: > > IOW, instead of > > > > m = 1UL << (BITS_PER_LONG - 2); > > > > perhaps something like > > > > m = 1UL << (BITS_PER_LONG/2- 2); > > if (m < x) > > m <<= BITS_PER_LONG/2; Assuming small values mostly, we could also try more aggressive estimators for BITS_PER_LONG == 64. But then again, it probably doesn't matter. ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-03 4:47 ` Davidlohr Bueso@ 2015-02-03 15:42 ` Anshul Garg2015-02-05 18:20 ` Linus Torvalds 0 siblings, 1 reply; 41+ messages in thread From: Anshul Garg @ 2015-02-03 15:42 UTC (permalink / raw) To: Davidlohr Bueso;+Cc:Linus Torvalds, Linux Kernel Mailing List, anshul.g On Tue, Feb 3, 2015 at 10:17 AM, Davidlohr Bueso <dave@stgolabs.net> wrote: > On Mon, 2015-02-02 at 11:00 -0800, Linus Torvalds wrote: >> Hmm. I don't disagree, but would like some more feedback. >> >> Davidlohr - you were the person to touch this function last (commit >> 30493cc9dddb: "lib/int_sqrt.c: optimize square root algorithm"), and >> you did so for performance reasons. And in fact, when you did that, >> you removed that initial loop: >> >> - one = 1UL << (BITS_PER_LONG - 2); >> - while (one > op) >> - one >>= 2; >> >> but I'm not sure that was actually all that conscious, I think the >> real optimization was the changes inside the loop to make the final >> real loop faster and simpler. > > I missed that. And yes, the real optimization should be in the loop. > >> >> Also, you had performance numbers, so presumably a test harness for it >> all. It probably depends a lot on the actual distribution of argument >> values, of course, but it would be good to accompany the patch with >> actual real numbers like lasty time. > > Aha. In my case I recall I ran a usersapce program using each function > from 1 to a million, and throwing perf at it for 10 times. > I have done profiling of int_sqrt function using perf tool for 10 times. For this purpose i have created a userspace program which uses sqrt function from 1 to a million. int_sqrt_old -> current algorithm version int_sqrt_new -> with proposed change these results are for BITS_PER_LONG=64. Performance counter stats for './int_sqrt_old' (10 runs): 460.944061 task-clock (msec) # 0.969 CPUs utilized ( +- 1.72% ) 64 context-switches # 0.139 K/sec ( +- 2.27% ) 0 cpu-migrations # 0.000 K/sec 132 page-faults # 0.286 K/sec <not supported> cycles <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend <not supported> instructions <not supported> branches <not supported> branch-misses 0.475795341 seconds time elapsed( +- 3.20% ) Performance counter stats for './int_sqrt_new' (10 runs): 401.782119 task-clock (msec) # 0.974 CPUs utilized( +- 1.55% ) 57 context-switches # 0.141 K/sec( +- 1.92% ) 0 cpu-migrations # 0.000 K/sec 132 page-faults # 0.329 K/sec <not supported> cycles <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend <not supported> instructions <not supported> branches <not supported> branch-misses 0.412593296 seconds time elapsed( +- 2.03% ) As per profiling definitely there is improvement in algorithm timing. >> (I'm also not entirely sure what uses int_sqrt() that ends up being so >> performance-critical, so it would be good to document that too, since >> that probably also matters for the "what's the normal argument range" >> question..) > > It's not a big deal afaik. > > Thanks, > Davidlohr > ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-03 4:41 ` Davidlohr Bueso 2015-02-03 4:57 ` Davidlohr Bueso@ 2015-02-03 15:54 ` Anshul Garg1 sibling, 0 replies; 41+ messages in thread From: Anshul Garg @ 2015-02-03 15:54 UTC (permalink / raw) To: Davidlohr Bueso;+Cc:Linus Torvalds, Linux Kernel Mailing List, anshul.g On Tue, Feb 3, 2015 at 10:11 AM, Davidlohr Bueso <dave@stgolabs.net> wrote: > On Mon, 2015-02-02 at 11:13 -0800, Linus Torvalds wrote: >> IOW, instead of >> >> m = 1UL << (BITS_PER_LONG - 2); >> >> perhaps something like >> >> m = 1UL << (BITS_PER_LONG/2- 2); >> if (m < x) >> m <<= BITS_PER_LONG/2; On doing this change also extra instruction(shown below) will execute if x is small(say 10000). b = y + m; y >>= 1; if (x >= b) { Although i understand your point that we can set value of m according to range of x. But in future also range of x can change if apart from memory some other module starts using this function more extensively. So in worst case it will be almost same as current implementation. So i think its better to scale m according to x by doing while(m > x) m>>=2; rather than according to range of x. Thanks Anshul Garg >> >> (assuming gcc can change that code into a "cmov") might cut down the >> "lots of empty loops" case in half for small values of 'x'. > > Makes a lot of sense. > >> There's probably some other better cheap initial guess value estimator. > > Just to get a feeling for the normal arg range in the non-driver parts > that use this thing: > > fs/ceph/super.h: congestion_kb = (16*int_sqrt(totalram_pages)) << (PAGE_SHIFT-10); > fs/nfs/write.c: nfs_congestion_kb = (16*int_sqrt(totalram_pages)) << (PAGE_SHIFT-10); > fs/nfsd/nfscache.c: limit = (16 * int_sqrt(low_pages)) << (PAGE_SHIFT-10); > kernel/rcu/tree_plugin.h: ls = int_sqrt(nr_cpu_ids); > mm/memcontrol.c: inactive_ratio = int_sqrt(10 * gb); > mm/page_alloc.c: ratio = int_sqrt(10 * gb); > mm/page_alloc.c: new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16); > > So mostly values that scale according to mem or cpu. > > > ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-03 15:42 ` Anshul Garg@ 2015-02-05 18:20 ` Linus Torvalds2015-02-05 18:33 ` Linus Torvalds 0 siblings, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2015-02-05 18:20 UTC (permalink / raw) To: Anshul Garg;+Cc:Davidlohr Bueso, Linux Kernel Mailing List, anshul.g [-- Attachment #1: Type: text/plain, Size: 1328 bytes --] On Tue, Feb 3, 2015 at 7:42 AM, Anshul Garg <aksgarg1989@gmail.com> wrote: > > I have done profiling of int_sqrt function using perf tool for 10 times. > For this purpose i have created a userspace program which uses sqrt function > from 1 to a million. Hmm. I did that too, and it doesn't improve things for me. In fact, it makes it slower. [torvalds@i7 ~]$ gcc -Wall -O2 -DREDUCE int_sqrt.c ; time ./a.out real 0m2.098s user 0m2.095s sys 0m0.000s [torvalds@i7 ~]$ gcc -Wall -O2 int_sqrt.c ; time ./a.out real 0m1.886s user 0m1.883s sys 0m0.000s and the profile shows that 35% of the time is spent on that branch back of the initial reduction loop. In contrast, my suggested "reduce just once" does seem to improve things: [torvalds@i7 ~]$ gcc -Wall -O2 -DONCE int_sqrt.c ; time ./a.out real 0m1.436s user 0m1.434s sys 0m0.000s but it's kind of hacky. NOTE! This probably depends a lot on microarchitecture details, including very much branch predictor etc. And I didn't actually check that it gives the right result, but I do think that this optimization needs to be looked at more if we want to do it. I was running this on an i7-4770S, fwiw. Attached is the stupid test-program I used to do the above. Maybe I did something wrong. Linus [-- Attachment #2: int_sqrt.c --] [-- Type: text/x-csrc, Size: 951 bytes --] /* * Copyright (C) 2013 Davidlohr Bueso <davidlohr.bueso@hp.com> * * Based on the shift-and-subtract algorithm for computing integer * square root from Guy L. Steele. */ #define BITS_PER_LONG (8*sizeof(long)) /** * int_sqrt - rough approximation to sqrt * @x: integer of which to calculate the sqrt * * A very rough approximation to the sqrt() function. */ unsigned long __attribute__((noinline)) int_sqrt(unsigned long x) { unsigned long b, m, y = 0; if (x <= 1) return x; m = 1UL << (BITS_PER_LONG - 2); #ifdef REDUCE while (m > x) m >>= 2; #elif defined(ONCE) { unsigned long n = m >> (BITS_PER_LONG/2); m = (n >= x) ? n : m; } #endif while (m != 0) { b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } return y; } int main(int argc, char **argv) { unsigned long i; for (i = 0; i < 100000000; i++) { unsigned long a = int_sqrt(i); asm volatile("": : "r" (a)); } return 0; } ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-05 18:20 ` Linus Torvalds@ 2015-02-05 18:33 ` Linus Torvalds2015-02-05 18:43 ` Anshul Garg 0 siblings, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2015-02-05 18:33 UTC (permalink / raw) To: Anshul Garg;+Cc:Davidlohr Bueso, Linux Kernel Mailing List, anshul.g On Thu, Feb 5, 2015 at 10:20 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Hmm. I did that too [..] Side note: one difference in our results (apart from possibly just CPU uarch details) is that my loop goes to 100M to make it easier to just time it. Which means that my load essentially had about three more iterations over nonzero data. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread

*2015-02-05 18:33 ` Linus TorvaldsRe: [PATCH] lib/int_sqrt.c: Optimize square root function@ 2015-02-05 18:43 ` Anshul Garg2015-02-05 19:37 ` Linus Torvalds 0 siblings, 1 reply; 41+ messages in thread From: Anshul Garg @ 2015-02-05 18:43 UTC (permalink / raw) To: Linus Torvalds;+Cc:Davidlohr Bueso, Linux Kernel Mailing List, anshul.g [-- Attachment #1: Type: text/plain, Size: 1134 bytes --] On Thu, Feb 5, 2015 at 10:33 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, Feb 5, 2015 at 10:20 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> Hmm. I did that too [..] > > Side note: one difference in our results (apart from possibly just CPU > uarch details) is that my loop goes to 100M to make it easier to just > time it. Which means that my load essentially had about three more > iterations over nonzero data. > > Linus I have also done the same testing on 100 million numbers. Attaching source codes. Below is the result :: int_sqrt_old - current int_sqrt_new - With proposed change anshul@ubuntu:~/kernel_latest/sqrt$ time ./int_sqrt_old real 0m41.895s user 0m36.490s sys 0m0.365s anshul@ubuntu:~/kernel_latest/sqrt$ time ./int_sqrt_new real 0m39.491s user 0m36.495s sys 0m0.338s I have run this test on Intel(R) Core(TM) i3-4000M CPU @ 2.40GHz VMWare Machine. Please check if i am doing anything wrong. NOTE :: I have not used gcc optimizations while compilation. With O2 level optimization proposed solution is taking more time. [-- Attachment #2: int_sqrt_old.c --] [-- Type: text/x-csrc, Size: 721 bytes --] /* * Copyright (C) 2013 Davidlohr Bueso <davidlohr.bueso@hp.com> * * Based on the shift-and-subtract algorithm for computing integer * square root from Guy L. Steele. */ #include <stdio.h> /** * int_sqrt - rough approximation to sqrt * @x: integer of which to calculate the sqrt * * A very rough approximation to the sqrt() function. */ #define BITS_PER_LONG (8*sizeof(long)) unsigned long int_sqrt(unsigned long x) { unsigned long b, m, y = 0; if (x <= 1) return x; m = 1UL << (BITS_PER_LONG - 2); while (m != 0) { b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } return y; } int main() { unsigned long n = 1; for(;n<=100000000;++n) int_sqrt(n); return 0; } [-- Attachment #3: int_sqrt_new.c --] [-- Type: text/x-csrc, Size: 750 bytes --] /* * Copyright (C) 2013 Davidlohr Bueso <davidlohr.bueso@hp.com> * * Based on the shift-and-subtract algorithm for computing integer * square root from Guy L. Steele. */ #include <stdio.h> #define BITS_PER_LONG (8*sizeof(long)) /** * int_sqrt - rough approximation to sqrt * @x: integer of which to calculate the sqrt * * A very rough approximation to the sqrt() function. */ unsigned long int_sqrt(unsigned long x) { unsigned long b, m, y = 0; if (x <= 1) return x; m = 1UL << (BITS_PER_LONG - 2); while(m > x ) m >>= 2; while (m != 0) { b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } return y; } int main() { unsigned long n = 1; for(;n<=100000000;++n) int_sqrt(n); return 0; } ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-05 18:43 ` Anshul Garg@ 2015-02-05 19:37 ` Linus Torvalds2015-02-08 15:39 ` Anshul Garg 0 siblings, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2015-02-05 19:37 UTC (permalink / raw) To: Anshul Garg;+Cc:Davidlohr Bueso, Linux Kernel Mailing List, anshul.g On Thu, Feb 5, 2015 at 10:43 AM, Anshul Garg <aksgarg1989@gmail.com> wrote: > > NOTE :: > I have not used gcc optimizations while compilation. > With O2 level optimization proposed solution is taking more time. The thing is, the kernel is compiled with -O2, so that's what matters. Also, for very tight loops like this, the major costs tend to be very subtle microarchitectural details, particularly branch prediction. Which in turn end up sometimes depending on just exactly where the branches were placed, and even whether two conditional branches were in the same 8-byte aligned region etc things (because the branch prediction might be done ignoring the low bits of the EIP etc). So not only does the exact microarchitecture matter, things that don't *seem* like they should matter can change behavior a lot. My point is really that the performance numbers are very ambiguous. The patch may well help in some situations, but hurt in others. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-05 19:37 ` Linus Torvalds@ 2015-02-08 15:39 ` Anshul Garg0 siblings, 0 replies; 41+ messages in thread From: Anshul Garg @ 2015-02-08 15:39 UTC (permalink / raw) To: Linus Torvalds;+Cc:Davidlohr Bueso, Linux Kernel Mailing List, anshul.g Dear Mr. linus, Thanks for quick replies. Yes performance numbers are not conclusive enough. So its better to discard this patch as of now. I will try to explore more in this area. Thanks & regards Anshul Garg On Fri, Feb 6, 2015 at 1:07 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, Feb 5, 2015 at 10:43 AM, Anshul Garg <aksgarg1989@gmail.com> wrote: >> >> NOTE :: >> I have not used gcc optimizations while compilation. >> With O2 level optimization proposed solution is taking more time. > > The thing is, the kernel is compiled with -O2, so that's what matters. > > Also, for very tight loops like this, the major costs tend to be very > subtle microarchitectural details, particularly branch prediction. > Which in turn end up sometimes depending on just exactly where the > branches were placed, and even whether two conditional branches were > in the same 8-byte aligned region etc things (because the branch > prediction might be done ignoring the low bits of the EIP etc). So not > only does the exact microarchitecture matter, things that don't *seem* > like they should matter can change behavior a lot. > > My point is really that the performance numbers are very ambiguous. > The patch may well help in some situations, but hurt in others. > > Linus ^ permalink raw reply [flat|nested] 41+ messages in thread

*[PATCH] lib/kstrtox.c Stop parsing integer on overflow[not found] <aksgarg1989@gmail.com> ` (4 preceding siblings ...) 2015-02-02 17:12 ` [PATCH] " Anshul Garg@ 2015-02-16 18:48 ` Anshul Garg2015-02-17 23:20 ` Richard Weinberger 2015-02-16 18:49 ` [PATCH] lib/vsprintf.c:Avoid extra operation in dentry_name Anshul Garg 2015-07-01 19:29 ` [PATCH] Sched/rt:Fix memory leak in alloc_rt_sched_group Anshul Garg 7 siblings, 1 reply; 41+ messages in thread From: Anshul Garg @ 2015-02-16 18:48 UTC (permalink / raw) To: linux-kernel;+Cc:aksgarg1989, anshul.g, torvalds From: Anshul Garg <aksgarg1989@gmail.com> While converting string representation to integer break the loop if overflow is detected. Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> --- lib/kstrtox.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/lib/kstrtox.c b/lib/kstrtox.c index ec8da78..6f30209 100644 --- a/lib/kstrtox.c +++ b/lib/kstrtox.c @@ -70,8 +70,10 @@ unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long * it in the max base we support (16) */ if (unlikely(res & (~0ull << 60))) { - if (res > div_u64(ULLONG_MAX - val, base)) + if (res > div_u64(ULLONG_MAX - val, base)) { overflow = 1; + break; + } } res = res * base + val; rv++; -- 1.7.9.5 --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com ^ permalink raw reply [flat|nested] 41+ messages in thread

*[PATCH] lib/vsprintf.c:Avoid extra operation in dentry_name[not found] <aksgarg1989@gmail.com> ` (5 preceding siblings ...) 2015-02-16 18:48 ` [PATCH] lib/kstrtox.c Stop parsing integer on overflow Anshul Garg@ 2015-02-16 18:49 ` Anshul Garg2015-02-16 19:14 ` Richard Weinberger 2015-07-01 19:29 ` [PATCH] Sched/rt:Fix memory leak in alloc_rt_sched_group Anshul Garg 7 siblings, 1 reply; 41+ messages in thread From: Anshul Garg @ 2015-02-16 18:49 UTC (permalink / raw) To: linux-kernel;+Cc:aksgarg1989, anshul.g, torvalds From: Anshul Garg <aksgarg1989@gmail.com> Remove unnecessary increment and decrement operation in dentry_name function as after increment operation loop is breaked and then decrement operation is performed. So remove increment and decrement operation from the function. Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> --- lib/vsprintf.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/lib/vsprintf.c b/lib/vsprintf.c index ec337f6..2a38105 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -576,11 +576,10 @@ char *dentry_name(char *buf, char *end, const struct dentry *d, struct printf_sp if (p == d) { if (i) array[i] = ""; - i++; break; } } - s = array[--i]; + s = array[i]; for (n = 0; n != spec.precision; n++, buf++) { char c = *s++; if (!c) { -- 1.7.9.5 --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/vsprintf.c:Avoid extra operation in dentry_name2015-02-16 18:49 ` [PATCH] lib/vsprintf.c:Avoid extra operation in dentry_name Anshul Garg@ 2015-02-16 19:14 ` Richard Weinberger2015-02-17 17:11 ` Anshul Garg 0 siblings, 1 reply; 41+ messages in thread From: Richard Weinberger @ 2015-02-16 19:14 UTC (permalink / raw) To: Anshul Garg;+Cc:LKML, anshul.g, Linus Torvalds On Mon, Feb 16, 2015 at 7:49 PM, Anshul Garg <aksgarg1989@gmail.com> wrote: > From: Anshul Garg <aksgarg1989@gmail.com> > > Remove unnecessary increment and decrement operation > in dentry_name function as after increment operation > loop is breaked and then decrement operation is > performed. So remove increment and decrement operation > from the function. > > Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> > --- > lib/vsprintf.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/lib/vsprintf.c b/lib/vsprintf.c > index ec337f6..2a38105 100644 > --- a/lib/vsprintf.c > +++ b/lib/vsprintf.c > @@ -576,11 +576,10 @@ char *dentry_name(char *buf, char *end, const struct dentry *d, struct printf_sp > if (p == d) { > if (i) > array[i] = ""; > - i++; > break; > } > } > - s = array[--i]; > + s = array[i]; > for (n = 0; n != spec.precision; n++, buf++) { > char c = *s++; > if (!c) { What if the if (d == d) branch is not taken? Does the code then really behave exactly as before? -- Thanks, //richard ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/vsprintf.c:Avoid extra operation in dentry_name2015-02-16 19:14 ` Richard Weinberger@ 2015-02-17 17:11 ` Anshul Garg0 siblings, 0 replies; 41+ messages in thread From: Anshul Garg @ 2015-02-17 17:11 UTC (permalink / raw) To: Richard Weinberger;+Cc:LKML, anshul.g, Linus Torvalds Yes suggested code modification will break if (p==d) branch is not taken. Thanks for pointing out this point. So its better to keep code as it is. On Tue, Feb 17, 2015 at 12:44 AM, Richard Weinberger <richard.weinberger@gmail.com> wrote: > On Mon, Feb 16, 2015 at 7:49 PM, Anshul Garg <aksgarg1989@gmail.com> wrote: >> From: Anshul Garg <aksgarg1989@gmail.com> >> >> Remove unnecessary increment and decrement operation >> in dentry_name function as after increment operation >> loop is breaked and then decrement operation is >> performed. So remove increment and decrement operation >> from the function. >> >> Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> >> --- >> lib/vsprintf.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/lib/vsprintf.c b/lib/vsprintf.c >> index ec337f6..2a38105 100644 >> --- a/lib/vsprintf.c >> +++ b/lib/vsprintf.c >> @@ -576,11 +576,10 @@ char *dentry_name(char *buf, char *end, const struct dentry *d, struct printf_sp >> if (p == d) { >> if (i) >> array[i] = ""; >> - i++; >> break; >> } >> } >> - s = array[--i]; >> + s = array[i]; >> for (n = 0; n != spec.precision; n++, buf++) { >> char c = *s++; >> if (!c) { > > What if the if (d == d) branch is not taken? > Does the code then really behave exactly as before? > > -- > Thanks, > //richard ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/kstrtox.c Stop parsing integer on overflow2015-02-16 18:48 ` [PATCH] lib/kstrtox.c Stop parsing integer on overflow Anshul Garg@ 2015-02-17 23:20 ` Richard Weinberger0 siblings, 0 replies; 41+ messages in thread From: Richard Weinberger @ 2015-02-17 23:20 UTC (permalink / raw) To: Anshul Garg;+Cc:LKML, anshul.g, Linus Torvalds On Mon, Feb 16, 2015 at 7:48 PM, Anshul Garg <aksgarg1989@gmail.com> wrote: > From: Anshul Garg <aksgarg1989@gmail.com> > > While converting string representation to integer > break the loop if overflow is detected. > > Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> > --- > lib/kstrtox.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/lib/kstrtox.c b/lib/kstrtox.c > index ec8da78..6f30209 100644 > --- a/lib/kstrtox.c > +++ b/lib/kstrtox.c > @@ -70,8 +70,10 @@ unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long > * it in the max base we support (16) > */ > if (unlikely(res & (~0ull << 60))) { > - if (res > div_u64(ULLONG_MAX - val, base)) > + if (res > div_u64(ULLONG_MAX - val, base)) { > overflow = 1; > + break; This changes the behaviour as described at the function description. Did you double check that all users of that function can deal with that? > + } > } > res = res * base + val; > rv++; > -- > 1.7.9.5 > > > --- > This email has been checked for viruses by Avast antivirus software. > http://www.avast.com > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Thanks, //richard ^ permalink raw reply [flat|nested] 41+ messages in thread

*[PATCH] Sched/rt:Fix memory leak in alloc_rt_sched_group[not found] <aksgarg1989@gmail.com> ` (6 preceding siblings ...) 2015-02-16 18:49 ` [PATCH] lib/vsprintf.c:Avoid extra operation in dentry_name Anshul Garg@ 2015-07-01 19:29 ` Anshul Garg2015-07-06 15:35 ` [tip:sched/core] sched/rt: Fix memory leak in alloc_rt_sched_group() tip-bot for Anshul Garg 7 siblings, 1 reply; 41+ messages in thread From: Anshul Garg @ 2015-07-01 19:29 UTC (permalink / raw) To: linux-kernel, mingo, peterz;+Cc:aksgarg1989 From: Anshul Garg <aksgarg1989@gmail.com> Added code to free allocated memory by function alloc_rt_sched_group in case kzalloc api fails. Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> --- kernel/sched/rt.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index f4d4b07..47213e9 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -177,7 +177,7 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) goto err; tg->rt_se = kzalloc(sizeof(rt_se) * nr_cpu_ids, GFP_KERNEL); if (!tg->rt_se) - goto err; + goto err_free_rt_rq; init_rt_bandwidth(&tg->rt_bandwidth, ktime_to_ns(def_rt_bandwidth.rt_period), 0); @@ -186,7 +186,7 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) rt_rq = kzalloc_node(sizeof(struct rt_rq), GFP_KERNEL, cpu_to_node(i)); if (!rt_rq) - goto err; + goto err_free_rt_se; rt_se = kzalloc_node(sizeof(struct sched_rt_entity), GFP_KERNEL, cpu_to_node(i)); @@ -202,6 +202,10 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) err_free_rq: kfree(rt_rq); +err_free_rt_se: + kfree(tg->rt_se); +err_free_rt_rq: + kfree(tg->rt_rq); err: return 0; } -- 1.7.9.5 --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus ^ permalink raw reply [flat|nested] 41+ messages in thread

*[tip:sched/core] sched/rt: Fix memory leak in alloc_rt_sched_group()2015-07-01 19:29 ` [PATCH] Sched/rt:Fix memory leak in alloc_rt_sched_group Anshul Garg@ 2015-07-06 15:35 ` tip-bot for Anshul Garg2015-07-06 19:43 ` Anshul Garg 0 siblings, 1 reply; 41+ messages in thread From: tip-bot for Anshul Garg @ 2015-07-06 15:35 UTC (permalink / raw) To: linux-tip-commits Cc: torvalds, mingo, aksgarg1989, tglx, linux-kernel, hpa, peterz Commit-ID: ecdd6804b7c9e15fe8fc836ba0233d9912834e8b Gitweb: http://git.kernel.org/tip/ecdd6804b7c9e15fe8fc836ba0233d9912834e8b Author: Anshul Garg <aksgarg1989@gmail.com> AuthorDate: Wed, 1 Jul 2015 12:29:26 -0700 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Mon, 6 Jul 2015 14:17:14 +0200 sched/rt: Fix memory leak in alloc_rt_sched_group() Added code to free allocated memory by function alloc_rt_sched_group() in case the kzalloc() API fails. Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1435778966-36415-1-git-send-email-aksgarg1989@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/rt.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 0d193a24..cfa907d 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -193,7 +193,7 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) goto err; tg->rt_se = kzalloc(sizeof(rt_se) * nr_cpu_ids, GFP_KERNEL); if (!tg->rt_se) - goto err; + goto err_free_rt_rq; init_rt_bandwidth(&tg->rt_bandwidth, ktime_to_ns(def_rt_bandwidth.rt_period), 0); @@ -202,7 +202,7 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) rt_rq = kzalloc_node(sizeof(struct rt_rq), GFP_KERNEL, cpu_to_node(i)); if (!rt_rq) - goto err; + goto err_free_rt_se; rt_se = kzalloc_node(sizeof(struct sched_rt_entity), GFP_KERNEL, cpu_to_node(i)); @@ -218,6 +218,10 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) err_free_rq: kfree(rt_rq); +err_free_rt_se: + kfree(tg->rt_se); +err_free_rt_rq: + kfree(tg->rt_rq); err: return 0; } ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [tip:sched/core] sched/rt: Fix memory leak in alloc_rt_sched_group()2015-07-06 15:35 ` [tip:sched/core] sched/rt: Fix memory leak in alloc_rt_sched_group() tip-bot for Anshul Garg@ 2015-07-06 19:43 ` Anshul Garg2015-07-07 6:45 ` Ingo Molnar 0 siblings, 1 reply; 41+ messages in thread From: Anshul Garg @ 2015-07-06 19:43 UTC (permalink / raw) To: peterz, hpa, Linus Torvalds, Linux Kernel Mailing List, tglx, mingo, anshul garg Cc: linux-tip-commits Dear linux community, I think this patch" sched/rt: Fix memory leak in alloc_rt_sched_group()" with Commit-ID: ecdd6804b7c9e15fe8fc836ba0233d9912834e8b is not correct. As it will create panic in case of kzalloc failure because of doingkfree twice. As if alloc_rt_sched_group fails it will return zero then core.c will call free_sched_group in core.c which subsequently calls free_rt_sched_group . In free_rt_sched_group memory is getting freed properly. so it seems patch sent by me is not correct. Please help to review patch once again as it might break some things. Sorry for inconvenience. Thanks Anshul Garg On Mon, Jul 6, 2015 at 9:05 PM, tip-bot for Anshul Garg <tipbot@zytor.com> wrote: > Commit-ID: ecdd6804b7c9e15fe8fc836ba0233d9912834e8b > Gitweb: http://git.kernel.org/tip/ecdd6804b7c9e15fe8fc836ba0233d9912834e8b > Author: Anshul Garg <aksgarg1989@gmail.com> > AuthorDate: Wed, 1 Jul 2015 12:29:26 -0700 > Committer: Ingo Molnar <mingo@kernel.org> > CommitDate: Mon, 6 Jul 2015 14:17:14 +0200 > > sched/rt: Fix memory leak in alloc_rt_sched_group() > > Added code to free allocated memory by function > alloc_rt_sched_group() in case the kzalloc() API fails. > > Signed-off-by: Anshul Garg <aksgarg1989@gmail.com> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Link: http://lkml.kernel.org/r/1435778966-36415-1-git-send-email-aksgarg1989@gmail.com > Signed-off-by: Ingo Molnar <mingo@kernel.org> > --- > kernel/sched/rt.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c > index 0d193a24..cfa907d 100644 > --- a/kernel/sched/rt.c > +++ b/kernel/sched/rt.c > @@ -193,7 +193,7 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) > goto err; > tg->rt_se = kzalloc(sizeof(rt_se) * nr_cpu_ids, GFP_KERNEL); > if (!tg->rt_se) > - goto err; > + goto err_free_rt_rq; > > init_rt_bandwidth(&tg->rt_bandwidth, > ktime_to_ns(def_rt_bandwidth.rt_period), 0); > @@ -202,7 +202,7 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) > rt_rq = kzalloc_node(sizeof(struct rt_rq), > GFP_KERNEL, cpu_to_node(i)); > if (!rt_rq) > - goto err; > + goto err_free_rt_se; > > rt_se = kzalloc_node(sizeof(struct sched_rt_entity), > GFP_KERNEL, cpu_to_node(i)); > @@ -218,6 +218,10 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) > > err_free_rq: > kfree(rt_rq); > +err_free_rt_se: > + kfree(tg->rt_se); > +err_free_rt_rq: > + kfree(tg->rt_rq); > err: > return 0; > } ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [tip:sched/core] sched/rt: Fix memory leak in alloc_rt_sched_group()2015-07-06 19:43 ` Anshul Garg@ 2015-07-07 6:45 ` Ingo Molnar0 siblings, 0 replies; 41+ messages in thread From: Ingo Molnar @ 2015-07-07 6:45 UTC (permalink / raw) To: Anshul Garg Cc: peterz, hpa, Linus Torvalds, Linux Kernel Mailing List, tglx, linux-tip-commits * Anshul Garg <aksgarg1989@gmail.com> wrote: > Dear linux community, > > I think this patch" sched/rt: Fix memory leak in > alloc_rt_sched_group()" with Commit-ID: > ecdd6804b7c9e15fe8fc836ba0233d9912834e8b is not correct. > > As it will create panic in case of kzalloc failure because of doingkfree twice. > > As if alloc_rt_sched_group fails it will return zero then core.c will > call free_sched_group in core.c > which subsequently calls free_rt_sched_group . > > In free_rt_sched_group memory is getting freed properly. > > so it seems patch sent by me is not correct. > > Please help to review patch once again as it might break some things. > > Sorry for inconvenience. > > Thanks > Anshul Garg Ok, I've removed this patch. Thanks! Ingo ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2015-02-02 19:13 ` Linus Torvalds 2015-02-03 4:41 ` Davidlohr Bueso@ 2017-07-20 11:24 ` Peter Zijlstra2017-07-20 11:52 ` Joe Perches ` (2 more replies) 1 sibling, 3 replies; 41+ messages in thread From: Peter Zijlstra @ 2017-07-20 11:24 UTC (permalink / raw) To: Linus Torvalds Cc: Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, joe On Mon, Feb 02, 2015 at 11:13:44AM -0800, Linus Torvalds wrote: > On Mon, Feb 2, 2015 at 11:00 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > (I'm also not entirely sure what uses int_sqrt() that ends up being so > > performance-critical, so it would be good to document that too, since > > that probably also matters for the "what's the normal argument range" > > question..) > > ... it's also not entirely clear that we need a whole new loop. We > might just instead start off with a better guess for 'm' using some > calculation that might be doable with a single conditional move > instruction instead of a loop. Because I suspect that the inevitable > branch misprediction of a new loop is likely as expensive as a few > iterations through the core one. > > IOW, instead of > > m = 1UL << (BITS_PER_LONG - 2); > > perhaps something like > > m = 1UL << (BITS_PER_LONG/2- 2); > if (m < x) > m <<= BITS_PER_LONG/2; > > (assuming gcc can change that code into a "cmov") might cut down the > "lots of empty loops" case in half for small values of 'x'. > > There's probably some other better cheap initial guess value estimator. So I had a wee look at int_sqrt(), and yes Davidlohr's patch makes it worse for my machine (Xeon E3v5 aka. fancy Skylake). As in _MUCH_ worse.. ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 -DNEW=1 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt Performance counter stats for './sqrt' (10 runs): 828,233,295 cycles:u ( +- 0.04% ) 2,480,095,771 instructions:u # 2.99 insn per cycle ( +- 0.00% ) 0.220202758 seconds time elapsed ( +- 0.18% ) ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt Performance counter stats for './sqrt' (10 runs): 603,809,551 cycles:u ( +- 1.30% ) 1,677,726,591 instructions:u # 2.78 insn per cycle ( +- 0.00% ) 0.160801044 seconds time elapsed That is, we went from ~60 cycles to ~82 cycles per invocation. Now, the patches proposed in this thread do indeed improve matters but seem to not have merged up until this day: ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 -DNEW=1 -DANSHUL=1 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt Performance counter stats for './sqrt' (10 runs): 598,827,196 cycles:u ( +- 0.18% ) 1,697,726,381 instructions:u # 2.84 insn per cycle ( +- 0.00% ) 0.159880738 seconds time elapsed ( +- 0.36% ) ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 -DNEW=1 -DLINUS=1 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt Performance counter stats for './sqrt' (10 runs): 601,463,184 cycles:u ( +- 0.07% ) 1,380,095,976 instructions:u # 2.29 insn per cycle ( +- 0.00% ) 0.160788095 seconds time elapsed ( +- 0.55% ) Which basically bring us back to the old performance. So I would strongly suggest we merge one. Now, the thing is, Intel CPUs are actually fairly good at fls(), so I gave Joe's suggestions a spin too.. ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 -DNEW=1 -DFLS=1 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt Performance counter stats for './sqrt' (10 runs): 390,420,004 cycles:u ( +- 0.07% ) 1,141,802,493 instructions:u # 2.92 insn per cycle ( +- 0.00% ) 0.105480000 seconds time elapsed ( +- 0.64% ) ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 -DNEW=1 -DFLS=1 -DANSHUL=1 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt Performance counter stats for './sqrt' (10 runs): 328,415,775 cycles:u ( +- 0.15% ) 1,138,579,704 instructions:u # 3.47 insn per cycle ( +- 0.00% ) 0.088703205 seconds time elapsed Which gets us a real improvement... Now even with the software fls() fallback from include/asm-generic/bitops/fls.h there is a significant improvement: ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 -DNEW=1 -DFLS=1 -DANSHUL=1 -DSOFTFLS=1 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt Performance counter stats for './sqrt' (10 runs): 375,633,931 cycles:u ( +- 0.13% ) 1,289,700,013 instructions:u # 3.43 insn per cycle ( +- 0.00% ) 0.100596211 seconds time elapsed Also, I ran with -DVALIDATE=1 and while the comment says its a 'rough' estimate of the sqrt() I did not in fact find any value for which we deviate < INT_MAX. ----------------------[ sqrt.c ]--------------------------- #include <math.h> #include <stdio.h> #define BITS_PER_LONG (sizeof(long) * 8) #ifndef LOOPS #define LOOPS 1000000 #endif #ifdef SOFTFLS static __always_inline unsigned long fls(unsigned long x) { int r = 32; if (!x) return 0; if (!(x & 0xffff0000u)) { x <<= 16; r -= 16; } if (!(x & 0xff000000u)) { x <<= 8; r -= 8; } if (!(x & 0xf0000000u)) { x <<= 4; r -= 4; } if (!(x & 0xc0000000u)) { x <<= 2; r -= 2; } if (!(x & 0x80000000u)) { x <<= 1; r -= 1; } return r; } #else static __always_inline unsigned long fls(unsigned long word) { asm("rep; bsr %1,%0" : "=r" (word) : "rm" (word)); return BITS_PER_LONG - 1 - word; } #endif #ifdef NEW unsigned long __attribute__((noinline)) int_sqrt(unsigned long x) { unsigned long b, m, y = 0; if (x <= 1) return x; #ifdef LINUS m = 1UL << (BITS_PER_LONG/2 - 2); if (m < x) m <<= BITS_PER_LONG/2; #else #ifdef FLS m = 1UL << ((fls(x) + 1) & ~1UL); #else m = 1UL << (BITS_PER_LONG - 2); #endif #ifdef ANSHUL while (m > x) m >>= 2; #endif #endif while (m != 0) { b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } return y; } #else unsigned long __attribute__((noinline)) int_sqrt(unsigned long x) { unsigned long op, res, one; op = x; res = 0; one = 1UL << (BITS_PER_LONG - 2); while (one > op) one >>= 2; while (one != 0) { if (op >= res + one) { op = op - (res + one); res = res + 2 * one; } res /= 2; one /= 4; } return res; } #endif void main(void) { unsigned long i; for (i=0; i<LOOPS; i++) { unsigned long a = int_sqrt(i); asm volatile("" : : "r" (a)); #ifdef VALIDATE { unsigned long b = floor(sqrt(i)); if (a != b) printf("%ld %ld %ld\n", i, a, b); } #endif } } -------------- The only caveat seems to be that our fls() is only defined for 'int' not long :/ --- lib/int_sqrt.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c index 1ef4cc344977..87c3aa360441 100644 --- a/lib/int_sqrt.c +++ b/lib/int_sqrt.c @@ -7,12 +7,11 @@ #include <linux/kernel.h> #include <linux/export.h> +#include <linux/bitops.h> /** - * int_sqrt - rough approximation to sqrt + * int_sqrt - approximation to sqrt * @x: integer of which to calculate the sqrt - * - * A very rough approximation to the sqrt() function. */ unsigned long int_sqrt(unsigned long x) { @@ -21,7 +20,11 @@ unsigned long int_sqrt(unsigned long x) if (x <= 1) return x; - m = 1UL << (BITS_PER_LONG - 2); + m = 1UL << ((fls(x) + 1) & ~1UL); + + while (m > x) + m >>= 2; + while (m != 0) { b = y + m; y >>= 1; ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-20 11:24 ` Peter Zijlstra@ 2017-07-20 11:52 ` Joe Perches2017-07-20 14:13 ` Peter Zijlstra 2017-07-20 15:27 ` Peter Zijlstra 2017-07-20 18:31 ` Linus Torvalds 2 siblings, 1 reply; 41+ messages in thread From: Joe Perches @ 2017-07-20 11:52 UTC (permalink / raw) To: Peter Zijlstra, Linus Torvalds Cc: Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner On Thu, 2017-07-20 at 13:24 +0200, Peter Zijlstra wrote: [] > + m = 1UL << ((fls(x) + 1) & ~1UL); maybe #if BITS_PER_LONG == 64 m = 1UL << ((fls64(x) + 1) & ~1UL); #else m = 1UL << ((fls(x) + 1) & ~1UL); #endif ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-20 11:52 ` Joe Perches@ 2017-07-20 14:13 ` Peter Zijlstra0 siblings, 0 replies; 41+ messages in thread From: Peter Zijlstra @ 2017-07-20 14:13 UTC (permalink / raw) To: Joe Perches Cc: Linus Torvalds, Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner On Thu, Jul 20, 2017 at 04:52:49AM -0700, Joe Perches wrote: > On Thu, 2017-07-20 at 13:24 +0200, Peter Zijlstra wrote: > [] > > + m = 1UL << ((fls(x) + 1) & ~1UL); > > maybe > > #if BITS_PER_LONG == 64 > m = 1UL << ((fls64(x) + 1) & ~1UL); > #else > m = 1UL << ((fls(x) + 1) & ~1UL); > #endif We do seem to have __fls() which is defined on long. ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-20 11:24 ` Peter Zijlstra 2017-07-20 11:52 ` Joe Perches@ 2017-07-20 15:27 ` Peter Zijlstra2017-07-20 18:31 ` Linus Torvalds 2 siblings, 0 replies; 41+ messages in thread From: Peter Zijlstra @ 2017-07-20 15:27 UTC (permalink / raw) To: Linus Torvalds Cc: Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, joe On Thu, Jul 20, 2017 at 01:24:49PM +0200, Peter Zijlstra wrote: > ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 -DNEW=1 -DFLS=1 -DANSHUL=1 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt > > Performance counter stats for './sqrt' (10 runs): > > 328,415,775 cycles:u ( +- 0.15% ) > 1,138,579,704 instructions:u # 3.47 insn per cycle ( +- 0.00% ) > > 0.088703205 seconds time elapsed > static __always_inline unsigned long fls(unsigned long word) > { > asm("rep; bsr %1,%0" > : "=r" (word) > : "rm" (word)); > return BITS_PER_LONG - 1 - word; > } That is actually "lzcnt", if I used the regular fls implementation: static __always_inline unsigned long __fls(unsigned long word) { asm("bsr %1,%0" : "=r" (word) : "rm" (word)); return word; } It ends up slightly more expensive: ~/tmp$ gcc -o sqrt sqrt.c -lm -O2 -DLOOPS=10000000 -DNEW=1 -DFLS=1 -DANSHUL=1 ; perf stat --repeat 10 -e cycles:u -e instructions:u ./sqrt Performance counter stats for './sqrt' (10 runs): 384,842,215 cycles:u ( +- 0.08% ) 1,118,579,712 instructions:u # 2.91 insn per cycle ( +- 0.00% ) 0.103018001 seconds time elapsed Still loads cheaper than pretty much any other combination. ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-20 11:24 ` Peter Zijlstra 2017-07-20 11:52 ` Joe Perches 2017-07-20 15:27 ` Peter Zijlstra@ 2017-07-20 18:31 ` Linus Torvalds2017-07-20 22:34 ` Peter Zijlstra 2 siblings, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2017-07-20 18:31 UTC (permalink / raw) To: Peter Zijlstra Cc: Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, Joe Perches How did this two-year old thread get resurrected? Anyway, it got resurrected without even answering one core question: On Thu, Jul 20, 2017 at 4:24 AM, Peter Zijlstra <peterz@infradead.org> wrote: > On Mon, Feb 02, 2015 at 11:13:44AM -0800, Linus Torvalds wrote: >>>> On Mon, Feb 2, 2015 at 11:00 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: >> > >> > (I'm also not entirely sure what uses int_sqrt() that ends up being so >> > performance-critical, so it would be good to document that too, since >> > that probably also matters for the "what's the normal argument range" >> > question..) This is still the case. Which of the (very few) users really _care_? And what are the normal values for that? For example, the 802.11 minstrel code does a "MINSTREL_TRUNC()" on a "unsigned int" value that is in millions. It's not even "unsigned long", so we know it's not many thousands of millions, and MINSTREL_TRUNC shifts it down by 12 bits. So we know we have at most a 20-bit argument. The one case that uses actual unsigned long seems to be "slow_is_prime_number()", and honestly, the sqrt() is the *least* of our problems there. There's a few drivers and filesystems that use it. I do not believe performance matters in those cases. Even if you do a "int_sqrt()" per nertwork packet on some high-performance network (and none of them look anything like that). And there's a couple of VM users. They don't look particularly critical either. So why do you care? Because honestly, calling int_sqrt() once in a blue moon with caches cold and no branch prediction information tends to have very different performance characteristics from calling it in a loop with very predictable input. So I think your "benchmark" is just garbage, in that it's testing something entirely different than the actual load. Also, since this is a generic library routine, no way can we depend on fls being fast. But we could certainly improve on the initial value a lot. It's just that we should probably strive to improve on it without adding extra branch misprediction or I$ misses - both things that your benchmark isn't actually testing at all, since it does the exact opposite of that by basically preloading both. And the *most* important question is that first one: "Why does this matter, and what is the range it matters for?" Linus ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-20 18:31 ` Linus Torvalds@ 2017-07-20 22:34 ` Peter Zijlstra2017-07-20 23:24 ` Linus Torvalds 0 siblings, 1 reply; 41+ messages in thread From: Peter Zijlstra @ 2017-07-20 22:34 UTC (permalink / raw) To: Linus Torvalds Cc: Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, Joe Perches On Thu, Jul 20, 2017 at 11:31:36AM -0700, Linus Torvalds wrote: > How did this two-year old thread get resurrected? I was looking for the original thread doing that 'optimization' Davidlohr did but found this first. > And the *most* important question is that first one: > > "Why does this matter, and what is the range it matters for?" I was looking to do some work on the idle estimator. Parts of that keep online avg and variance for normal distributions. I wanted to bias the avg downwards, the way to do that is to subtract a scaled stdev from it. Computing the stdev from a variance requires the sqrt. Thomas rightly asked how expensive our sqrt is, I found Davidlohr's commit message and got confused by the numbers, so I reran them and found the optimization did the reverse, it made things worse. By then I was prodding at it... 'fun' problem :-) In any case, I suppose the range of values would be in the order of TICK_NSEC, so the variance would be a number of orders below that. So we're looking at fairly small numbers <1e5. > Also, since this is a generic library routine, no way can we depend on > fls being fast. Which is why I also tested the software fls, but you're right in that the microbench primes the branch predictor. Still, the software fls is 6 branches, whereas the 'missing' loop: while (m > x) m >>= 2; would need up to 30 or so cycles worst case. So even in that respect it makes sense its a 'win', esp. so for small numbers. ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-20 22:34 ` Peter Zijlstra@ 2017-07-20 23:24 ` Linus Torvalds2017-07-21 11:40 ` Peter Zijlstra 0 siblings, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2017-07-20 23:24 UTC (permalink / raw) To: Peter Zijlstra Cc: Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, Joe Perches [-- Attachment #1: Type: text/plain, Size: 1777 bytes --] On Thu, Jul 20, 2017 at 3:34 PM, Peter Zijlstra <peterz@infradead.org> wrote: >> >> "Why does this matter, and what is the range it matters for?" > > I was looking to do some work on the idle estimator. Parts of that keep > online avg and variance for normal distributions. I wanted to bias the > avg downwards, the way to do that is to subtract a scaled stdev from it. > Computing the stdev from a variance requires the sqrt. > > In any case, I suppose the range of values would be in the order of > TICK_NSEC, so the variance would be a number of orders below that. So > we're looking at fairly small numbers <1e5. Ok. So that already cuts down the problem space a lot. And yes, the current code is very suboptimal for exactly the small number case. It's also typiocally the case that right-shifting is more expensive than left-shifting, so the fact that we left-shift twice in that loop for all the big values is extra expensive. So I would actually suggest a different approach: - start with a small 'm' - and grow it up. The "small m" means that there is a small constant, which is good. And growing it up is a single trivial left shift by two, which can also be done just two adds or as a single lea, so it works fine even on architectures with slow shifters etc. So mayube something like the attached? NOTE NOTE NOTE! This may be pure garbage. It's untested. But the code generation looks ok even if gcc is being way too clever and turns that first loop into a counted loop instead. Whatever, maybe it's the right thing to do. But note that I might have broken the sqrt for some case, so you need to double-check my logic there. The *intention* is that it's the exact same thing as it used to do, just initializing 'm' differently. Linus [-- Attachment #2: patch.diff --] [-- Type: text/plain, Size: 745 bytes --] lib/int_sqrt.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c index 1ef4cc344977..da3b3dabad8e 100644 --- a/lib/int_sqrt.c +++ b/lib/int_sqrt.c @@ -16,14 +16,22 @@ */ unsigned long int_sqrt(unsigned long x) { - unsigned long b, m, y = 0; + unsigned long m, y; if (x <= 1) return x; - m = 1UL << (BITS_PER_LONG - 2); - while (m != 0) { - b = y + m; + m = 64; + do { + unsigned long new_m = m << 2; + if (!new_m) + break; + m = new_m; + } while (m < x); + + y = 0; + do { + unsigned long b = y + m; y >>= 1; if (x >= b) { @@ -31,7 +39,7 @@ unsigned long int_sqrt(unsigned long x) y += m; } m >>= 2; - } + } while (m); return y; } ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-20 23:24 ` Linus Torvalds@ 2017-07-21 11:40 ` Peter Zijlstra2017-07-21 12:15 ` Joe Perches 0 siblings, 1 reply; 41+ messages in thread From: Peter Zijlstra @ 2017-07-21 11:40 UTC (permalink / raw) To: Linus Torvalds Cc: Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, Joe Perches, Ingo Molnar, Will Deacon [-- Attachment #1: Type: text/plain, Size: 4744 bytes --] On Thu, Jul 20, 2017 at 04:24:32PM -0700, Linus Torvalds wrote: > So mayube something like the attached? > > NOTE NOTE NOTE! This may be pure garbage. It's untested. But the code > generation looks ok even if gcc is being way too clever and turns that > first loop into a counted loop instead. Whatever, maybe it's the right > thing to do. It passes a -DVALIDATE=1 run (up to a billion or so), so it seems to work. I've made the test thing a bit more complicated in order to address your concerns for primed branch predictors (and once the branch predictors get wiped, the predictable input values don't matter anymore either I think, although I could easily LFSR generate them as well of course). Find it attached in a tarball. The results, from running ./test.sh (left SKL, right SNB): (values <100k) Cycles: EVENT=0 -DLINUS=1 event: 29.533080 +- 0.046261 36.800100 +- 0.046067 EVENT=0 -DLINUS=1 -DWIPE_BTB=1 event: 143.513440 +- 0.152651 153.054640 +- 0.099403 EVENT=0 -DNEW=1 -DANSHUL=1 event: 41.457300 +- 0.048409 55.046100 +- 0.044968 EVENT=0 -DNEW=1 -DANSHUL=1 -DWIPE_BTB=1 event: 128.351400 +- 0.366873 161.183690 +- 0.132301 EVENT=0 -DNEW=1 -DANSHUL=1 -DFLS=1 event: 21.433170 +- 0.043859 27.672700 +- 0.063982 EVENT=0 -DNEW=1 -DANSHUL=1 -DFLS=1 -DWIPE_BTB=1 event: 79.850210 +- 0.133646 112.422470 +- 0.101465 EVENT=0 -DNEW=1 -DANSHUL=1 -DFLS=1 -DSOFTFLS=1 event: 31.771680 +- 0.037655 37.515160 +- 0.080305 EVENT=0 -DNEW=1 -DANSHUL=1 -DFLS=1 -DSOFTFLS=1 -DWIPE_BTB=1 event: 129.673440 +- 0.430308 125.514390 +- 0.117590 Branches: EVENT=4 -DLINUS=1 event: 31.507740 +- 0.010470 31.507710 +- 0.010470 EVENT=4 -DLINUS=1 -DWIPE_BTB=1 event: 31.507720 +- 0.010470 31.507760 +- 0.010470 EVENT=4 -DNEW=1 -DANSHUL=1 event: 45.125510 +- 0.002696 45.125510 +- 0.002696 EVENT=4 -DNEW=1 -DANSHUL=1 -DWIPE_BTB=1 event: 45.125490 +- 0.002696 45.125540 +- 0.002697 EVENT=4 -DNEW=1 -DANSHUL=1 -DFLS=1 event: 21.252320 +- 0.005266 21.252330 +- 0.005266 EVENT=4 -DNEW=1 -DANSHUL=1 -DFLS=1 -DWIPE_BTB=1 event: 21.252320 +- 0.005266 21.252340 +- 0.005266 EVENT=4 -DNEW=1 -DANSHUL=1 -DFLS=1 -DSOFTFLS=1 event: 25.727940 +- 0.005975 25.727930 +- 0.005975 EVENT=4 -DNEW=1 -DANSHUL=1 -DFLS=1 -DSOFTFLS=1 -DWIPE_BTB=1 event: 25.727940 +- 0.005975 25.727930 +- 0.005975 Branch-Misses: EVENT=5 -DLINUS=1 event: 0.022670 +- 0.000789 0.020920 +- 0.000812 EVENT=5 -DLINUS=1 -DWIPE_BTB=1 event: 5.481750 +- 0.006930 5.955130 +- 0.005228 EVENT=5 -DNEW=1 -DANSHUL=1 event: 0.025990 +- 0.000798 0.017170 +- 0.000690 EVENT=5 -DNEW=1 -DANSHUL=1 -DWIPE_BTB=1 event: 3.343600 +- 0.004249 5.723570 +- 0.004876 EVENT=5 -DNEW=1 -DANSHUL=1 -DFLS=1 event: 0.019040 +- 0.000731 0.022570 +- 0.000829 EVENT=5 -DNEW=1 -DANSHUL=1 -DFLS=1 -DWIPE_BTB=1 event: 2.156350 +- 0.004643 4.297650 +- 0.004910 EVENT=5 -DNEW=1 -DANSHUL=1 -DFLS=1 -DSOFTFLS=1 event: 0.019800 +- 0.000747 0.016780 +- 0.000688 EVENT=5 -DNEW=1 -DANSHUL=1 -DFLS=1 -DSOFTFLS=1 -DWIPE_BTB=1 event: 4.481360 +- 0.005505 4.518300 +- 0.004877 Which still doesn't explain your hatred of fls(), as even with cold branches and a software fls, its faster than your latest proposal (as can be explained by the lower total number of branches for the software fls variant compared to your proposal). And when we start to look at larger values, the fls() one pulls even further ahead. So I'll stick with my proposal... I can write up a proper Changelog and maybe add my test to tools/testing/sqrt/ if you think its worth it. --- lib/int_sqrt.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c index 1ef4cc344977..611933760225 100644 --- a/lib/int_sqrt.c +++ b/lib/int_sqrt.c @@ -7,12 +7,13 @@ #include <linux/kernel.h> #include <linux/export.h> +#include <linux/bitops.h> /** - * int_sqrt - rough approximation to sqrt + * int_sqrt - Computes the integer square root * @x: integer of which to calculate the sqrt * - * A very rough approximation to the sqrt() function. + * Computes: floor(sqrt(@x)) */ unsigned long int_sqrt(unsigned long x) { @@ -21,7 +22,11 @@ unsigned long int_sqrt(unsigned long x) if (x <= 1) return x; - m = 1UL << (BITS_PER_LONG - 2); + m = 1UL << (__fls(x) & ~1U); + + while (m > x) + m >>= 2; + while (m != 0) { b = y + m; y >>= 1; [-- Attachment #2: sqrt.tar --] [-- Type: application/x-tar, Size: 20480 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-21 11:40 ` Peter Zijlstra@ 2017-07-21 12:15 ` Joe Perches2017-07-21 13:26 ` Peter Zijlstra 0 siblings, 1 reply; 41+ messages in thread From: Joe Perches @ 2017-07-21 12:15 UTC (permalink / raw) To: Peter Zijlstra, Linus Torvalds Cc: Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, Ingo Molnar, Will Deacon On Fri, 2017-07-21 at 13:40 +0200, Peter Zijlstra wrote: > @@ -21,7 +22,11 @@ unsigned long int_sqrt(unsigned long x) > if (x <= 1) > return x; > > - m = 1UL << (BITS_PER_LONG - 2); > + m = 1UL << (__fls(x) & ~1U); > + > + while (m > x) > + m >>= 2; while (m > x) ? Belt and suspenders if __fls is broken? ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-21 12:15 ` Joe Perches@ 2017-07-21 13:26 ` Peter Zijlstra2017-07-21 13:33 ` Peter Zijlstra 0 siblings, 1 reply; 41+ messages in thread From: Peter Zijlstra @ 2017-07-21 13:26 UTC (permalink / raw) To: Joe Perches Cc: Linus Torvalds, Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, Ingo Molnar, Will Deacon On Fri, Jul 21, 2017 at 05:15:10AM -0700, Joe Perches wrote: > On Fri, 2017-07-21 at 13:40 +0200, Peter Zijlstra wrote: > > @@ -21,7 +22,11 @@ unsigned long int_sqrt(unsigned long x) > > if (x <= 1) > > return x; > > > > - m = 1UL << (BITS_PER_LONG - 2); > > + m = 1UL << (__fls(x) & ~1U); > > + > > + while (m > x) > > + m >>= 2; > > while (m > x) ? > > Belt and suspenders if __fls is broken? Hmm... you're right, that should not happen. It is a remnant from when I rounded up, like: m = 1UL << ((__fls(x) + 1) & ~1UL); Because I worried about the case where m == x, which is not included in the loop above (but works when you look at the actual computation loop and passes VALIDATE=1). But check this... I cannot explain :/ When I remove that loop, we, as fully expected, loose 1 branch, but the cycle count for the branch-cold case shoots up. Must be something GCC does. EVENT=0 -DNEW=1 -DFLS=1 event: 19.626050 +- 0.038995 EVENT=0 -DNEW=1 -DFLS=1 -DWIPE_BTB=1 event: 109.610670 +- 0.425667 EVENT=0 -DNEW=1 -DFLS=1 -DANSHUL=1 event: 21.445680 +- 0.043782 EVENT=0 -DNEW=1 -DFLS=1 -DANSHUL=1 -DWIPE_BTB=1 event: 83.590420 +- 0.142126 EVENT=4 -DNEW=1 -DFLS=1 event: 20.252330 +- 0.005265 EVENT=4 -DNEW=1 -DFLS=1 -DWIPE_BTB=1 event: 20.252340 +- 0.005265 EVENT=4 -DNEW=1 -DFLS=1 -DANSHUL=1 event: 21.252300 +- 0.005266 EVENT=4 -DNEW=1 -DFLS=1 -DANSHUL=1 -DWIPE_BTB=1 event: 21.252300 +- 0.005266 EVENT=5 -DNEW=1 -DFLS=1 event: 0.019370 +- 0.000732 EVENT=5 -DNEW=1 -DFLS=1 -DWIPE_BTB=1 event: 3.665240 +- 0.005309 EVENT=5 -DNEW=1 -DFLS=1 -DANSHUL=1 event: 0.020150 +- 0.000755 EVENT=5 -DNEW=1 -DFLS=1 -DANSHUL=1 -DWIPE_BTB=1 event: 2.225330 +- 0.004875 Let me dig out another GCC version current: gcc (Debian 6.3.0-18) 6.3.0 20170516 ^ permalink raw reply [flat|nested] 41+ messages in thread

*Re: [PATCH] lib/int_sqrt.c: Optimize square root function2017-07-21 13:26 ` Peter Zijlstra@ 2017-07-21 13:33 ` Peter Zijlstra0 siblings, 0 replies; 41+ messages in thread From: Peter Zijlstra @ 2017-07-21 13:33 UTC (permalink / raw) To: Joe Perches Cc: Linus Torvalds, Anshul Garg, Davidlohr Bueso, Linux Kernel Mailing List, anshul.g, Thomas Gleixner, Ingo Molnar, Will Deacon On Fri, Jul 21, 2017 at 03:26:21PM +0200, Peter Zijlstra wrote: > > EVENT=0 -DNEW=1 -DFLS=1 > event: 19.626050 +- 0.038995 > EVENT=0 -DNEW=1 -DFLS=1 -DWIPE_BTB=1 > event: 109.610670 +- 0.425667 > > EVENT=0 -DNEW=1 -DFLS=1 -DANSHUL=1 > event: 21.445680 +- 0.043782 > EVENT=0 -DNEW=1 -DFLS=1 -DANSHUL=1 -DWIPE_BTB=1 > event: 83.590420 +- 0.142126 > > Let me dig out another GCC version current: > > gcc (Debian 6.3.0-18) 6.3.0 20170516 gcc-7 (Debian 7.1.0-9) 7.1.0 EVENT=0 -DNEW=1 -DFLS=1 event: 24.179400 +- 0.031344 EVENT=0 -DNEW=1 -DFLS=1 -DWIPE_BTB=1 event: 137.892390 +- 0.307314 EVENT=0 -DNEW=1 -DFLS=1 -DANSHUL=1 event: 22.740300 +- 0.051317 EVENT=0 -DNEW=1 -DFLS=1 -DANSHUL=1 -DWIPE_BTB=1 event: 136.980640 +- 0.223410 GCC regressed it seems... *sigh* ^ permalink raw reply [flat|nested] 41+ messages in thread

end of thread, other threads:[~2017-07-21 13:33 UTC | newest]Thread overview:41+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <aksgarg1989@gmail.com> 2015-01-21 19:26 ` [PATCH] lib/kstrtox.c break if overflow is detected Anshul Garg 2015-01-21 19:39 ` Levente Kurusa 2015-01-22 13:38 ` Anshul Garg 2015-01-22 13:58 ` [PATCH] lib/kstrtox.c Stop parsing integer on overflow Anshul Garg 2015-01-29 13:35 ` [PATCH] lib/int_sqrt.c: Optimize square root function Anshul Garg 2015-01-31 17:31 ` [RESEND PATCH] " Anshul Garg 2015-02-02 17:12 ` [PATCH] " Anshul Garg 2015-02-02 19:00 ` Linus Torvalds 2015-02-02 19:13 ` Linus Torvalds 2015-02-03 4:41 ` Davidlohr Bueso 2015-02-03 4:57 ` Davidlohr Bueso 2015-02-03 15:54 ` Anshul Garg 2017-07-20 11:24 ` Peter Zijlstra 2017-07-20 11:52 ` Joe Perches 2017-07-20 14:13 ` Peter Zijlstra 2017-07-20 15:27 ` Peter Zijlstra 2017-07-20 18:31 ` Linus Torvalds 2017-07-20 22:34 ` Peter Zijlstra 2017-07-20 23:24 ` Linus Torvalds 2017-07-21 11:40 ` Peter Zijlstra 2017-07-21 12:15 ` Joe Perches 2017-07-21 13:26 ` Peter Zijlstra 2017-07-21 13:33 ` Peter Zijlstra 2015-02-03 4:47 ` Davidlohr Bueso 2015-02-03 15:42 ` Anshul Garg 2015-02-05 18:20 ` Linus Torvalds 2015-02-05 18:33 ` Linus Torvalds 2015-02-05 18:43 ` Anshul Garg 2015-02-05 19:37 ` Linus Torvalds 2015-02-08 15:39 ` Anshul Garg 2015-02-02 19:10 ` Joe Perches 2015-02-02 19:39 ` Linus Torvalds 2015-02-16 18:48 ` [PATCH] lib/kstrtox.c Stop parsing integer on overflow Anshul Garg 2015-02-17 23:20 ` Richard Weinberger 2015-02-16 18:49 ` [PATCH] lib/vsprintf.c:Avoid extra operation in dentry_name Anshul Garg 2015-02-16 19:14 ` Richard Weinberger 2015-02-17 17:11 ` Anshul Garg 2015-07-01 19:29 ` [PATCH] Sched/rt:Fix memory leak in alloc_rt_sched_group Anshul Garg 2015-07-06 15:35 ` [tip:sched/core] sched/rt: Fix memory leak in alloc_rt_sched_group() tip-bot for Anshul Garg 2015-07-06 19:43 ` Anshul Garg 2015-07-07 6:45 ` Ingo Molnar

This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).