From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756851AbYBECy7 (ORCPT ); Mon, 4 Feb 2008 21:54:59 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1753419AbYBECyu (ORCPT ); Mon, 4 Feb 2008 21:54:50 -0500
Received: from gateway-1237.mvista.com ([63.81.120.158]:4652
	"EHLO dwalker1.mvista.com" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S1753261AbYBECyt (ORCPT );
	Mon, 4 Feb 2008 21:54:49 -0500
Date: Mon, 4 Feb 2008 18:51:44 -0800
From: Daniel Walker
To: Max Krasnyanskiy
Cc: Ingo Molnar, LKML, linux-rt-users@vger.kernel.org
Subject: Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
Message-ID: <20080205025144.GA31774@dwalker1.mvista.com>
References: <47A7A131.8040800@qualcomm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <47A7A131.8040800@qualcomm.com>
User-Agent: Mutt/1.5.17 (2007-11-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
> This is just an FYI. As part of the "Isolated CPU extensions" thread
> Daniel suggested that I check out the latest RT kernels. So I did, or at
> least tried to, and immediately spotted a couple of issues.
>
> The machine I'm running it on is:
>     HP xw9300, Dual Opteron, NUMA
>
> It looks like with the -rt kernel IRQ affinity masks are ignored on that
> system. i.e. I write 1 to, let's say, /proc/irq/23/smp_affinity but the
> interrupts keep coming to CPU1. Vanilla 2.6.24 does not have that issue.

I tried this, and it works according to /proc/interrupts. Are you looking
at the interrupt thread's affinity?

> Also the first thing I tried was to bring CPU1 off-line. That's the
> fastest way to get irqs, soft-irqs, timers, etc. off a CPU. But the box
> hung completely. It also managed to mess up my ext3 filesystem to the
> point where it required a manual fsck (have not seen that for a couple
> of years now). I tried the same thing (i.e. echo 0 >
> /sys/devices/cpu/cpu1/online) from the console. It hung again with a
> message that looked something like:
>     CPU1 is now off-line
>     Thread IRQ-23 is on CPU1 ...

I got the following when I tried it:

BUG: sleeping function called from invalid context bash(5126) at kernel/rtmutex.c:638
in_atomic():1 [00000001], irqs_disabled():1
Pid: 5126, comm: bash Not tainted 2.6.24-rt1 #1
 [] show_trace_log_lvl+0x1d/0x3a
 [] show_trace+0x12/0x14
 [] dump_stack+0x6c/0x72
 [] __might_sleep+0xe8/0xef
 [] __rt_spin_lock+0x24/0x59
 [] rt_spin_lock+0x8/0xa
 [] kfree+0x2c/0x8d
 [] rq_attach_root+0x67/0xba
 [] cpu_attach_domain+0x2b6/0x2f7
 [] detach_destroy_domains+0x23/0x37
 [] update_sched_domains+0x2d/0x40
 [] notifier_call_chain+0x2b/0x55
 [] __raw_notifier_call_chain+0x19/0x1e
 [] _cpu_down+0x84/0x24c
 [] cpu_down+0x28/0x3a
 [] store_online+0x27/0x5a
 [] sysdev_store+0x20/0x25
 [] sysfs_write_file+0xad/0xde
 [] vfs_write+0x82/0xb8
 [] sys_write+0x3d/0x61
 [] sysenter_past_esp+0x5f/0x85
 =======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. []  .... __spin_lock_irqsave+0x14/0x3b
.....[]  ..   ( <= rq_attach_root+0x12/0xba)

Which is clearly a problem.

(I added linux-rt-users to the CC)

Daniel
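
For anyone who wants to repeat the affinity check discussed above, here is a
minimal Python sketch, not part of the original exchange. It decodes an
smp_affinity hex mask into CPU numbers and compares two snapshots of
/proc/interrupts to see which CPUs actually received an IRQ in between. The
function names and the sample snapshots are hypothetical; the sample numbers
are made up for illustration and are not taken from either machine in this
thread.

```python
def affinity_mask_to_cpus(mask_hex):
    """Decode an smp_affinity mask such as "1" or "00000000,00000003".

    The kernel prints the mask as comma-separated 32-bit hex groups, most
    significant group first, so dropping the commas yields one hex number.
    """
    mask = int(mask_hex.strip().replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


def irq_counts(interrupts_text, irq):
    """Return the per-CPU counts for one IRQ from /proc/interrupts text."""
    lines = interrupts_text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() == str(irq):
            return [int(field) for field in rest.split()[:ncpus]]
    raise KeyError(irq)


def cpus_with_new_interrupts(before, after):
    """List the CPUs whose count grew between two snapshots."""
    return [cpu for cpu, (b, a) in enumerate(zip(before, after)) if a > b]


# Made-up snapshots in the /proc/interrupts layout (illustrative only).
SAMPLE_BEFORE = """\
           CPU0       CPU1
 23:       1000       2000   IO-APIC-fasteoi   eth0
"""
SAMPLE_AFTER = """\
           CPU0       CPU1
 23:       1450       2000   IO-APIC-fasteoi   eth0
"""

allowed = affinity_mask_to_cpus("1")
active = cpus_with_new_interrupts(irq_counts(SAMPLE_BEFORE, 23),
                                  irq_counts(SAMPLE_AFTER, 23))
# The mask is being honoured if every active CPU is also an allowed CPU.
mask_honoured = set(active) <= set(allowed)
```

On a live system the two snapshots would be read from /proc/interrupts a few
seconds apart and the mask from /proc/irq/23/smp_affinity. Note that on -rt
the handler runs in an IRQ thread (the "IRQ-23" thread in the message above),
whose CPU affinity is a separate thing from the hardware IRQ mask; this
sketch only looks at the hardware counts.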