From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965413AbbDVQDB (ORCPT ); Wed, 22 Apr 2015 12:03:01 -0400
Received: from mail-ig0-f172.google.com ([209.85.213.172]:35017 "EHLO mail-ig0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755730AbbDVQDA (ORCPT ); Wed, 22 Apr 2015 12:03:00 -0400
Message-ID: <1429718577.18561.103.camel@edumazet-glaptop2.roam.corp.google.com>
Subject: Re: [PATCH 1/2] timer: Avoid waking up an idle-core by migrate running timer
From: Eric Dumazet
To: Peter Zijlstra
Cc: Thomas Gleixner , viresh kumar , Ingo Molnar , linaro-kernel@lists.linaro.org, linux-kernel@vger.kernel.org, Steven Miao , shashim@codeaurora.org
Date: Wed, 22 Apr 2015 09:02:57 -0700
In-Reply-To: <20150422152940.GC3007@worktop.Skamania.guest>
References: <80182e47a7103608d2ddab7f62c0c3dffc99fdcc.1427782893.git.viresh.kumar@linaro.org> <5530C086.2020700@linaro.org> <1429653295.18561.16.camel@edumazet-glaptop2.roam.corp.google.com> <20150422152940.GC3007@worktop.Skamania.guest>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.10.4-0ubuntu2
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2015-04-22 at 17:29 +0200, Peter Zijlstra wrote:

> Hmm, that sounds unfortunate, this would wreck life for the power aware
> laptop/tablet etc.. people.
>
> There is already a sysctl to manage this, is that not enough to mitigate
> this problem on the server side of things?

The thing is: 99% of networking timers never fire.

But when they _do_ fire, and the host is under attack, they all fire on an
unrelated cpu, and that cpu cannot keep up. The added latencies trigger
monitoring alerts.

Check commit 4a8e320c929991c9480 ("net: sched: use pinned timers") for a
specific example of the problems that can be raised.
When we set a timer to fire in 10 seconds, knowing the _current_ idle
state of the cpus is of no help.

Add to this that softirq processing is not considered as making the
current cpu non-idle.

Networking tried hard to use cpu affinities (and all the techniques
described in Documentation/networking/scaling.txt), but
/proc/sys/kernel/timer_migration adds a fair overhead in many workloads.

get_nohz_timer_target() has to touch 3 cache lines per cpu...

It's in the top 10 in "perf top" profiles on servers with 72 threads.

This /proc/sys/kernel/timer_migration should have been instead:

/proc/sys/kernel/timer_on_a_single_cpu_for_laptop_sake
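
For anyone who wants to check which mode a given box is running in, here is a minimal sketch (Python, not part of the patch discussion) that reads the sysctl mentioned above. The function name read_timer_migration() and its path parameter are illustrative assumptions, and the file may be absent depending on how the kernel was configured, in which case the sketch returns None.

```python
from pathlib import Path
from typing import Optional

# Location of the sysctl discussed in this thread.
TIMER_MIGRATION_PATH = "/proc/sys/kernel/timer_migration"


def read_timer_migration(path: str = TIMER_MIGRATION_PATH) -> Optional[int]:
    """Return the sysctl value as an int, or None if it cannot be read.

    Hypothetical helper: the 'path' argument exists only so the function
    can be pointed at a different file for testing.
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        # File missing or unreadable on this kernel.
        return None
    try:
        return int(text)
    except ValueError:
        # Unexpected content; report it as unavailable rather than guess.
        return None


if __name__ == "__main__":
    value = read_timer_migration()
    if value is None:
        print("timer_migration: not available on this kernel")
    elif value == 0:
        print("timer_migration=0: timer migration is disabled")
    else:
        print(f"timer_migration={value}: unpinned timers may be migrated")
```

Writing 0 to the same file (as root) is the existing knob Peter refers to; the sketch only reads it and changes nothing.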