LKML Archive on lore.kernel.org
* CPU hotplug and IRQ affinity with 2.6.24-rt1
@ 2008-02-04 23:35 Max Krasnyanskiy
  2008-02-05  2:51 ` Daniel Walker
  0 siblings, 1 reply; 11+ messages in thread
From: Max Krasnyanskiy @ 2008-02-04 23:35 UTC (permalink / raw)
  To: LKML; +Cc: Ingo Molnar, Daniel Walker

This is just an FYI. As part of the "Isolated CPU extensions" thread, Daniel suggested that I
check out the latest RT kernels. So I did, or at least tried to, and immediately spotted a
couple of issues.

The machine I'm running it on is:
	HP xw9300, Dual Opteron, NUMA

It looks like with the -rt kernel, IRQ affinity masks are ignored on that system, i.e. I write 1
to, let's say, /proc/irq/23/smp_affinity, but the interrupts keep coming to CPU1.
Vanilla 2.6.24 does not have that issue.
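
To be concrete, what I'm testing boils down to something like the minimal C sketch below
(illustrative only; IRQ 23 is just the example from this box, and the same check can of
course be done from the shell with echo and grep):

/* Sketch: pin IRQ 23 to CPU0, then print its line from /proc/interrupts. */
#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/irq/23/smp_affinity", "w");
	char line[512];

	if (!f) {
		perror("smp_affinity");
		return 1;
	}
	fprintf(f, "1\n");	/* hex CPU bitmask: CPU0 only */
	fclose(f);

	f = fopen("/proc/interrupts", "r");
	if (!f) {
		perror("/proc/interrupts");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		if (strstr(line, " 23:"))	/* counts should stop growing on CPU1 */
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}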

Also, the first thing I tried was to bring CPU1 off-line. That's the fastest way to get irqs,
soft-irqs, timers, etc. off a CPU. But the box hung completely. It also managed to mess up my
ext3 filesystem to the point where it required a manual fsck (I have not seen that for a couple
of years now).
I tried the same thing (i.e. echo 0 > /sys/devices/cpu/cpu1/online) from the console. It hung
again with a message that looked something like:
	CPU1 is now off-line
	Thread IRQ-23 is on CPU1 ...

IRQ 23 is NVidia SATA. So I guess it has something to do with the borked affinity handling.
Vanilla 2.6.24 handles this just fine.

Anyway, like I said, it's just an FYI, not an urgent issue.

Max



* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-04 23:35 CPU hotplug and IRQ affinity with 2.6.24-rt1 Max Krasnyanskiy
@ 2008-02-05  2:51 ` Daniel Walker
  2008-02-05  3:27   ` Gregory Haskins
                     ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Daniel Walker @ 2008-02-05  2:51 UTC (permalink / raw)
  To: Max Krasnyanskiy; +Cc: Ingo Molnar, LKML, linux-rt-users

On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
> This is just an FYI. As part of the "Isolated CPU extensions" thread, Daniel suggested that I
> check out the latest RT kernels. So I did, or at least tried to, and immediately spotted a
> couple of issues.
>
> The machine I'm running it on is:
> 	HP xw9300, Dual Opteron, NUMA
>
> It looks like with the -rt kernel, IRQ affinity masks are ignored on that
> system, i.e. I write 1 to, let's say, /proc/irq/23/smp_affinity, but the
> interrupts keep coming to CPU1. Vanilla 2.6.24 does not have that issue.

I tried this, and it works according to /proc/interrupts. Are you
looking at the interrupt thread's affinity?

> Also, the first thing I tried was to bring CPU1 off-line. That's the fastest
> way to get irqs, soft-irqs, timers, etc. off a CPU. But the box hung
> completely. It also managed to mess up my ext3 filesystem to the point
> where it required a manual fsck (I have not seen that for a couple of
> years now). I tried the same thing (i.e. echo 0 >
> /sys/devices/cpu/cpu1/online) from the console. It hung again with a
> message that looked something like:
> 	CPU1 is now off-line
> 	Thread IRQ-23 is on CPU1 ...

I got the following when I tried it:

BUG: sleeping function called from invalid context bash(5126) at
kernel/rtmutex.c:638
in_atomic():1 [00000001], irqs_disabled():1
Pid: 5126, comm: bash Not tainted 2.6.24-rt1 #1
 [<c010506b>] show_trace_log_lvl+0x1d/0x3a
 [<c01059cd>] show_trace+0x12/0x14
 [<c0106151>] dump_stack+0x6c/0x72
 [<c011d153>] __might_sleep+0xe8/0xef
 [<c03b2326>] __rt_spin_lock+0x24/0x59
 [<c03b2363>] rt_spin_lock+0x8/0xa
 [<c0165b2f>] kfree+0x2c/0x8d
 [<c011eacb>] rq_attach_root+0x67/0xba
 [<c01209ae>] cpu_attach_domain+0x2b6/0x2f7
 [<c0120a12>] detach_destroy_domains+0x23/0x37
 [<c0121368>] update_sched_domains+0x2d/0x40
 [<c013b482>] notifier_call_chain+0x2b/0x55
 [<c013b4d9>] __raw_notifier_call_chain+0x19/0x1e
 [<c01420d3>] _cpu_down+0x84/0x24c
 [<c01422c3>] cpu_down+0x28/0x3a
 [<c029f59e>] store_online+0x27/0x5a
 [<c029c9dc>] sysdev_store+0x20/0x25
 [<c019a695>] sysfs_write_file+0xad/0xde
 [<c0169929>] vfs_write+0x82/0xb8
 [<c0169e2a>] sys_write+0x3d/0x61
 [<c0104072>] sysenter_past_esp+0x5f/0x85
 =======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c03b25e2>] .... __spin_lock_irqsave+0x14/0x3b
.....[<c011ea76>] ..   ( <= rq_attach_root+0x12/0xba)

Which is clearly a problem.

(I added linux-rt-users to the CC)

Daniel


* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05  2:51 ` Daniel Walker
@ 2008-02-05  3:27   ` Gregory Haskins
  2008-02-05  4:21   ` Max Krasnyansky
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Gregory Haskins @ 2008-02-05  3:27 UTC (permalink / raw)
  To: Daniel Walker, Max Krasnyanskiy; +Cc: Ingo Molnar, LKML, linux-rt-users

Hi Daniel,

  See inline...

>>> On Mon, Feb 4, 2008 at  9:51 PM, in message
<20080205025144.GA31774@dwalker1.mvista.com>, Daniel Walker
<dwalker@dwalker1.mvista.com> wrote: 
> On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
>> This is just an FYI. As part of the "Isolated CPU extensions" thread, Daniel
>> suggested that I check out the latest RT kernels. So I did, or at least tried
>> to, and immediately spotted a couple of issues.
>>
>> The machine I'm running it on is:
>> 	HP xw9300, Dual Opteron, NUMA
>>
>> It looks like with the -rt kernel, IRQ affinity masks are ignored on that
>> system, i.e. I write 1 to, let's say, /proc/irq/23/smp_affinity, but the
>> interrupts keep coming to CPU1. Vanilla 2.6.24 does not have that issue.
> 
> I tried this, and it works according to /proc/interrupts. Are you
> looking at the interrupt thread's affinity?
> 
>> Also, the first thing I tried was to bring CPU1 off-line. That's the fastest
>> way to get irqs, soft-irqs, timers, etc. off a CPU. But the box hung
>> completely. It also managed to mess up my ext3 filesystem to the point
>> where it required a manual fsck (I have not seen that for a couple of
>> years now). I tried the same thing (i.e. echo 0 >
>> /sys/devices/cpu/cpu1/online) from the console. It hung again with a
>> message that looked something like:
>> 	CPU1 is now off-line
>> 	Thread IRQ-23 is on CPU1 ...
> 
> I get the following when I tried it,
> 
> BUG: sleeping function called from invalid context bash(5126) at
> kernel/rtmutex.c:638
> in_atomic():1 [00000001], irqs_disabled():1
> Pid: 5126, comm: bash Not tainted 2.6.24-rt1 #1
>  [<c010506b>] show_trace_log_lvl+0x1d/0x3a
>  [<c01059cd>] show_trace+0x12/0x14
>  [<c0106151>] dump_stack+0x6c/0x72
>  [<c011d153>] __might_sleep+0xe8/0xef
>  [<c03b2326>] __rt_spin_lock+0x24/0x59
>  [<c03b2363>] rt_spin_lock+0x8/0xa
>  [<c0165b2f>] kfree+0x2c/0x8d

Doh!  This is my bug.  I'll have to come up with a good way to free that memory in atomic context, or do this another way.  Stay tuned.

>  [<c011eacb>] rq_attach_root+0x67/0xba
>  [<c01209ae>] cpu_attach_domain+0x2b6/0x2f7
>  [<c0120a12>] detach_destroy_domains+0x23/0x37
>  [<c0121368>] update_sched_domains+0x2d/0x40
>  [<c013b482>] notifier_call_chain+0x2b/0x55
>  [<c013b4d9>] __raw_notifier_call_chain+0x19/0x1e
>  [<c01420d3>] _cpu_down+0x84/0x24c
>  [<c01422c3>] cpu_down+0x28/0x3a
>  [<c029f59e>] store_online+0x27/0x5a
>  [<c029c9dc>] sysdev_store+0x20/0x25
>  [<c019a695>] sysfs_write_file+0xad/0xde
>  [<c0169929>] vfs_write+0x82/0xb8
>  [<c0169e2a>] sys_write+0x3d/0x61
>  [<c0104072>] sysenter_past_esp+0x5f/0x85
>  =======================
> ---------------------------
> | preempt count: 00000001 ]
> | 1-level deep critical section nesting:
> ----------------------------------------
> .. [<c03b25e2>] .... __spin_lock_irqsave+0x14/0x3b
> .....[<c011ea76>] ..   ( <= rq_attach_root+0x12/0xba)
> 
> Which is clearly a problem .. 
> 
> (I added linux-rt-users to the CC)
> 
> Daniel




* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05  2:51 ` Daniel Walker
  2008-02-05  3:27   ` Gregory Haskins
@ 2008-02-05  4:21   ` Max Krasnyansky
  2008-02-05  5:02   ` Gregory Haskins
  2008-02-05 14:00   ` Gregory Haskins
  3 siblings, 0 replies; 11+ messages in thread
From: Max Krasnyansky @ 2008-02-05  4:21 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, LKML, linux-rt-users



Daniel Walker wrote:
> On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
>> This is just an FYI. As part of the "Isolated CPU extensions" thread, Daniel suggested that I
>> check out the latest RT kernels. So I did, or at least tried to, and immediately spotted a
>> couple of issues.
>>
>> The machine I'm running it on is:
>> 	HP xw9300, Dual Opteron, NUMA
>>
>> It looks like with the -rt kernel, IRQ affinity masks are ignored on that
>> system, i.e. I write 1 to, let's say, /proc/irq/23/smp_affinity, but the
>> interrupts keep coming to CPU1. Vanilla 2.6.24 does not have that issue.
> 
> I tried this, and it works according to /proc/interrupts. Are you
> looking at the interrupt thread's affinity?
Nope, I'm looking at /proc/interrupts, i.e. the interrupt count keeps incrementing for CPU1 even
though the affinity mask is set to 1.

The IRQ thread affinity was, btw, set to 3, which is probably wrong.
To clarify, by default after reboot:
	- IRQ affinity is set to 3, IRQ thread affinity is set to 3
	- The user writes 1 into /proc/irq/N/smp_affinity
	- IRQ affinity is now set to 1, IRQ thread affinity is still set to 3

It'd still work, I guess, but it does not seem right. Ideally the IRQ thread affinity should have changed as well.
We could of course just have some user-space tool that adjusts both.
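
Such a tool wouldn't need to be fancy. Here is a rough, untested sketch of the idea; to keep
it short it takes the IRQ number, the PID of the matching IRQ thread, and a CPU index on the
command line instead of discovering the thread by itself:

/* Sketch: set both the hardware IRQ affinity and the IRQ thread affinity. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
	char path[64];
	cpu_set_t set;
	FILE *f;

	if (argc != 4) {
		fprintf(stderr, "usage: %s <irq> <irq-thread-pid> <cpu>\n", argv[0]);
		return 1;
	}

	int irq = atoi(argv[1]);
	pid_t pid = atoi(argv[2]);
	int cpu = atoi(argv[3]);

	/* 1) hardware IRQ affinity via /proc */
	snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		return 1;
	}
	fprintf(f, "%x\n", 1 << cpu);	/* hex CPU bitmask */
	fclose(f);

	/* 2) IRQ thread affinity via the scheduler */
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(pid, sizeof(set), &set)) {
		perror("sched_setaffinity");
		return 1;
	}
	return 0;
}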

Looks like Greg already replied to the CPU hotplug issue. For me it did not oops; it just got
stuck, probably because it could not move an IRQ due to the broken IRQ affinity logic.

Max


* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05  2:51 ` Daniel Walker
  2008-02-05  3:27   ` Gregory Haskins
  2008-02-05  4:21   ` Max Krasnyansky
@ 2008-02-05  5:02   ` Gregory Haskins
  2008-02-05 16:59     ` Daniel Walker
  2008-02-05 14:00   ` Gregory Haskins
  3 siblings, 1 reply; 11+ messages in thread
From: Gregory Haskins @ 2008-02-05  5:02 UTC (permalink / raw)
  To: Daniel Walker, Max Krasnyanskiy; +Cc: Ingo Molnar, LKML, linux-rt-users

>>> On Mon, Feb 4, 2008 at  9:51 PM, in message
<20080205025144.GA31774@dwalker1.mvista.com>, Daniel Walker
<dwalker@dwalker1.mvista.com> wrote: 
> I get the following when I tried it,
> 
> BUG: sleeping function called from invalid context bash(5126) at
> kernel/rtmutex.c:638
> in_atomic():1 [00000001], irqs_disabled():1

Hi Daniel,
  Can you try this patch and let me know if it fixes your problem?

-----------------------

use rcu for root-domain kfree

Signed-off-by: Gregory Haskins <ghaskins@novell.com>

diff --git a/kernel/sched.c b/kernel/sched.c
index e6ad493..77e86c1 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -339,6 +339,7 @@ struct root_domain {
        atomic_t refcount;
        cpumask_t span;
        cpumask_t online;
+       struct rcu_head rcu;

        /*
         * The "RT overload" flag: it gets set if a CPU has more than
@@ -6222,6 +6223,12 @@ sd_parent_degenerate(struct sched_domain *sd, struct sched_domain *parent)
        return 1;
 }

+/* rcu callback to free a root-domain */
+static void rq_free_root(struct rcu_head *rcu)
+{
+       kfree(container_of(rcu, struct root_domain, rcu));
+}
+
 static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 {
        unsigned long flags;
@@ -6241,7 +6248,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
                cpu_clear(rq->cpu, old_rd->online);

                if (atomic_dec_and_test(&old_rd->refcount))
-                       kfree(old_rd);
+                       call_rcu(&old_rd->rcu, rq_free_root);
        }

        atomic_inc(&rd->refcount);



* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05  2:51 ` Daniel Walker
                     ` (2 preceding siblings ...)
  2008-02-05  5:02   ` Gregory Haskins
@ 2008-02-05 14:00   ` Gregory Haskins
  3 siblings, 0 replies; 11+ messages in thread
From: Gregory Haskins @ 2008-02-05 14:00 UTC (permalink / raw)
  To: dwalker, Max Krasnyanskiy; +Cc: Ingo Molnar, LKML, linux-rt-users

>>> On Mon, Feb 4, 2008 at  9:51 PM, in message
<20080205025144.GA31774@dwalker1.mvista.com>, Daniel Walker
<dwalker@dwalker1.mvista.com> wrote: 
> On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:

[snip]

>>
>> Also, the first thing I tried was to bring CPU1 off-line. That's the fastest
>> way to get irqs, soft-irqs, timers, etc. off a CPU. But the box hung
>> completely.

After applying my earlier submitted patch, I was able to reproduce the hang you mentioned.  I poked around with sysrq and it looked like a deadlock on an rt_mutex, so I turned on lockdep and it found:


=======================================================
[ INFO: possible circular locking dependency detected ]
[ 2.6.24-rt1-rt #3
-------------------------------------------------------
bash/4604 is trying to acquire lock:
 (events){--..}, at: [<ffffffff802537b6>] cleanup_workqueue_thread+0x16/0x80

but task is already holding lock:
 (workqueue_mutex){--..}, at: [<ffffffff80254615>] workqueue_cpu_callback+0xe5/0x140

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #5 (workqueue_mutex){--..}:
       [<ffffffff80266752>] __lock_acquire+0xf82/0x1090
       [<ffffffff802668b7>] lock_acquire+0x57/0x80
       [<ffffffff80254615>] workqueue_cpu_callback+0xe5/0x140
       [<ffffffff80486818>] _mutex_lock+0x28/0x40
       [<ffffffff80254615>] workqueue_cpu_callback+0xe5/0x140
       [<ffffffff8048a575>] notifier_call_chain+0x45/0x90
       [<ffffffff8025d079>] __raw_notifier_call_chain+0x9/0x10
       [<ffffffff8025d091>] raw_notifier_call_chain+0x11/0x20
       [<ffffffff8026d157>] _cpu_down+0x97/0x2d0
       [<ffffffff8026d3b5>] cpu_down+0x25/0x60
       [<ffffffff8026d3c8>] cpu_down+0x38/0x60
       [<ffffffff803d6719>] store_online+0x49/0xa0
       [<ffffffff803d2774>] sysdev_store+0x24/0x30
       [<ffffffff8031279f>] sysfs_write_file+0xcf/0x140
       [<ffffffff802c0005>] vfs_write+0xe5/0x1a0
       [<ffffffff802c0733>] sys_write+0x53/0x90
       [<ffffffff8020c4fe>] system_call+0x7e/0x83
       [<ffffffffffffffff>] 0xffffffffffffffff

-> #4 (cache_chain_mutex){--..}:
       [<ffffffff80266752>] __lock_acquire+0xf82/0x1090
       [<ffffffff802668b7>] lock_acquire+0x57/0x80
       [<ffffffff802bb7fa>] kmem_cache_create+0x6a/0x480
       [<ffffffff80486818>] _mutex_lock+0x28/0x40
       [<ffffffff802bb7fa>] kmem_cache_create+0x6a/0x480
       [<ffffffff802872a6>] __rcu_read_unlock+0x96/0xb0
       [<ffffffff8046b824>] fib_hash_init+0xa4/0xe0
       [<ffffffff80467ee5>] fib_new_table+0x35/0x70
       [<ffffffff80467fb1>] fib_magic+0x91/0x100
       [<ffffffff80468093>] fib_add_ifaddr+0x73/0x170
       [<ffffffff8046829b>] fib_inetaddr_event+0x4b/0x260
       [<ffffffff8048a575>] notifier_call_chain+0x45/0x90
       [<ffffffff8025d2ce>] __blocking_notifier_call_chain+0x5e/0x90
       [<ffffffff8025d311>] blocking_notifier_call_chain+0x11/0x20
       [<ffffffff8045f714>] __inet_insert_ifa+0xd4/0x170
       [<ffffffff8045f7bd>] inet_insert_ifa+0xd/0x10
       [<ffffffff8046083a>] inetdev_event+0x45a/0x510
       [<ffffffff8041ee4d>] fib_rules_event+0x6d/0x160
       [<ffffffff8048a575>] notifier_call_chain+0x45/0x90
       [<ffffffff8025d079>] __raw_notifier_call_chain+0x9/0x10
       [<ffffffff8025d091>] raw_notifier_call_chain+0x11/0x20
       [<ffffffff8040f466>] call_netdevice_notifiers+0x16/0x20
       [<ffffffff80410f6d>] dev_open+0x8d/0xa0
       [<ffffffff8040f5e9>] dev_change_flags+0x99/0x1b0
       [<ffffffff80460ffd>] devinet_ioctl+0x5ad/0x760
       [<ffffffff80410d6a>] dev_ioctl+0x4ba/0x590
       [<ffffffff8026523d>] trace_hardirqs_on+0xd/0x10
       [<ffffffff8046162d>] inet_ioctl+0x5d/0x80
       [<ffffffff80400f21>] sock_ioctl+0xd1/0x260
       [<ffffffff802ce154>] do_ioctl+0x34/0xa0
       [<ffffffff802ce239>] vfs_ioctl+0x79/0x2f0
       [<ffffffff80485f30>] trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff802ce532>] sys_ioctl+0x82/0xa0
       [<ffffffff8020c4fe>] system_call+0x7e/0x83
       [<ffffffffffffffff>] 0xffffffffffffffff

-> #3 ((inetaddr_chain).rwsem){..--}:
       [<ffffffff80266752>] __lock_acquire+0xf82/0x1090
       [<ffffffff802668b7>] lock_acquire+0x57/0x80
       [<ffffffff8026ca9b>] rt_down_read+0xb/0x10
       [<ffffffff8026ca29>] __rt_down_read+0x29/0x80
       [<ffffffff8026ca9b>] rt_down_read+0xb/0x10
       [<ffffffff8025d2b8>] __blocking_notifier_call_chain+0x48/0x90
       [<ffffffff8025d311>] blocking_notifier_call_chain+0x11/0x20
       [<ffffffff8045f714>] __inet_insert_ifa+0xd4/0x170
       [<ffffffff8045f7bd>] inet_insert_ifa+0xd/0x10
       [<ffffffff8046083a>] inetdev_event+0x45a/0x510
       [<ffffffff8041ee4d>] fib_rules_event+0x6d/0x160
       [<ffffffff8048a575>] notifier_call_chain+0x45/0x90
       [<ffffffff8025d079>] __raw_notifier_call_chain+0x9/0x10
       [<ffffffff8025d091>] raw_notifier_call_chain+0x11/0x20
       [<ffffffff8040f466>] call_netdevice_notifiers+0x16/0x20
       [<ffffffff80410f6d>] dev_open+0x8d/0xa0
       [<ffffffff8040f5e9>] dev_change_flags+0x99/0x1b0
       [<ffffffff80460ffd>] devinet_ioctl+0x5ad/0x760
       [<ffffffff80410d6a>] dev_ioctl+0x4ba/0x590
       [<ffffffff8026523d>] trace_hardirqs_on+0xd/0x10
       [<ffffffff8046162d>] inet_ioctl+0x5d/0x80
       [<ffffffff80400f21>] sock_ioctl+0xd1/0x260
       [<ffffffff802ce154>] do_ioctl+0x34/0xa0
       [<ffffffff802ce239>] vfs_ioctl+0x79/0x2f0
       [<ffffffff80485f30>] trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff802ce532>] sys_ioctl+0x82/0xa0
       [<ffffffff8020c4fe>] system_call+0x7e/0x83
       [<ffffffffffffffff>] 0xffffffffffffffff

-> #2 (rtnl_mutex){--..}:
       [<ffffffff80266752>] __lock_acquire+0xf82/0x1090
       [<ffffffff802668b7>] lock_acquire+0x57/0x80
       [<ffffffff8041a450>] rtnl_lock+0x10/0x20
       [<ffffffff80486818>] _mutex_lock+0x28/0x40
       [<ffffffff8041a450>] rtnl_lock+0x10/0x20
       [<ffffffff8041b839>] linkwatch_event+0x9/0x40
       [<ffffffff80253481>] run_workqueue+0x221/0x2f0
       [<ffffffff8041b830>] linkwatch_event+0x0/0x40
       [<ffffffff802544c3>] worker_thread+0xd3/0x140
       [<ffffffff80257f40>] autoremove_wake_function+0x0/0x40
       [<ffffffff802543f0>] worker_thread+0x0/0x140
       [<ffffffff80257b4d>] kthread+0x4d/0x80
       [<ffffffff8020d468>] child_rip+0xa/0x12
       [<ffffffff8020cb53>] restore_args+0x0/0x30
       [<ffffffff80257b00>] kthread+0x0/0x80
       [<ffffffff8020d45e>] child_rip+0x0/0x12
       [<ffffffffffffffff>] 0xffffffffffffffff

-> #1 ((linkwatch_work).work){--..}:
       [<ffffffff80266752>] __lock_acquire+0xf82/0x1090
       [<ffffffff802668b7>] lock_acquire+0x57/0x80
       [<ffffffff8025342a>] run_workqueue+0x1ca/0x2f0
       [<ffffffff8025347a>] run_workqueue+0x21a/0x2f0
       [<ffffffff8041b830>] linkwatch_event+0x0/0x40
       [<ffffffff802544c3>] worker_thread+0xd3/0x140
       [<ffffffff80257f40>] autoremove_wake_function+0x0/0x40
       [<ffffffff802543f0>] worker_thread+0x0/0x140
       [<ffffffff80257b4d>] kthread+0x4d/0x80
       [<ffffffff8020d468>] child_rip+0xa/0x12
       [<ffffffff8020cb53>] restore_args+0x0/0x30
       [<ffffffff80257b00>] kthread+0x0/0x80
       [<ffffffff8020d45e>] child_rip+0x0/0x12
       [<ffffffffffffffff>] 0xffffffffffffffff

-> #0 (events){--..}:
       [<ffffffff802636e9>] print_circular_bug_entry+0x49/0x60
       [<ffffffff80266550>] __lock_acquire+0xd80/0x1090
       [<ffffffff802668b7>] lock_acquire+0x57/0x80
       [<ffffffff802537b6>] cleanup_workqueue_thread+0x16/0x80
       [<ffffffff802537d9>] cleanup_workqueue_thread+0x39/0x80
       [<ffffffff802545bd>] workqueue_cpu_callback+0x8d/0x140
       [<ffffffff8048a575>] notifier_call_chain+0x45/0x90
       [<ffffffff8025d079>] __raw_notifier_call_chain+0x9/0x10
       [<ffffffff8025d091>] raw_notifier_call_chain+0x11/0x20
       [<ffffffff8026d2ab>] _cpu_down+0x1eb/0x2d0
       [<ffffffff8026d3b5>] cpu_down+0x25/0x60
       [<ffffffff8026d3c8>] cpu_down+0x38/0x60
       [<ffffffff803d6719>] store_online+0x49/0xa0
       [<ffffffff803d2774>] sysdev_store+0x24/0x30
       [<ffffffff8031279f>] sysfs_write_file+0xcf/0x140
       [<ffffffff802c0005>] vfs_write+0xe5/0x1a0
       [<ffffffff802c0733>] sys_write+0x53/0x90
       [<ffffffff8020c4fe>] system_call+0x7e/0x83
       [<ffffffffffffffff>] 0xffffffffffffffff

other info that might help us debug this:

5 locks held by bash/4604:
 #0:  (&buffer->mutex){--..}, at: [<ffffffff80312711>] sysfs_write_file+0x41/0x140
 #1:  (cpu_add_remove_lock){--..}, at: [<ffffffff8026d3b5>] cpu_down+0x25/0x60
 #2:  (sched_hotcpu_mutex){--..}, at: [<ffffffff80239c31>] migration_call+0x2b1/0x540
 #3:  (cache_chain_mutex){--..}, at: [<ffffffff802bb541>] cpuup_callback+0x211/0x400
 #4:  (workqueue_mutex){--..}, at: [<ffffffff80254615>] workqueue_cpu_callback+0xe5/0x140

stack backtrace:
Pid: 4604, comm: bash Not tainted 2.6.24-rt1-rt #3

Call Trace:
 [<ffffffff80264094>] print_circular_bug_tail+0x84/0x90
 [<ffffffff802636e9>] print_circular_bug_entry+0x49/0x60
 [<ffffffff80266550>] __lock_acquire+0xd80/0x1090
 [<ffffffff802668b7>] lock_acquire+0x57/0x80
 [<ffffffff802537b6>] cleanup_workqueue_thread+0x16/0x80
 [<ffffffff802537d9>] cleanup_workqueue_thread+0x39/0x80
 [<ffffffff802545bd>] workqueue_cpu_callback+0x8d/0x140
 [<ffffffff8048a575>] notifier_call_chain+0x45/0x90
 [<ffffffff8025d079>] __raw_notifier_call_chain+0x9/0x10
 [<ffffffff8025d091>] raw_notifier_call_chain+0x11/0x20
 [<ffffffff8026d2ab>] _cpu_down+0x1eb/0x2d0
 [<ffffffff8026d3b5>] cpu_down+0x25/0x60
 [<ffffffff8026d3c8>] cpu_down+0x38/0x60
 [<ffffffff803d6719>] store_online+0x49/0xa0
 [<ffffffff803d2774>] sysdev_store+0x24/0x30
 [<ffffffff8031279f>] sysfs_write_file+0xcf/0x140
 [<ffffffff802c0005>] vfs_write+0xe5/0x1a0
 [<ffffffff802c0733>] sys_write+0x53/0x90
 [<ffffffff8020c4fe>] system_call+0x7e/0x83

INFO: lockdep is turned off.
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------





* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05  5:02   ` Gregory Haskins
@ 2008-02-05 16:59     ` Daniel Walker
  2008-02-05 17:13       ` Gregory Haskins
  2008-02-05 18:25       ` Gregory Haskins
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Walker @ 2008-02-05 16:59 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: Ingo Molnar, LKML, linux-rt-users, Max Krasnyanskiy

On Mon, Feb 04, 2008 at 10:02:12PM -0700, Gregory Haskins wrote:
> >>> On Mon, Feb 4, 2008 at  9:51 PM, in message
> <20080205025144.GA31774@dwalker1.mvista.com>, Daniel Walker
> <dwalker@dwalker1.mvista.com> wrote: 
> > I get the following when I tried it,
> > 
> > BUG: sleeping function called from invalid context bash(5126) at
> > kernel/rtmutex.c:638
> > in_atomic():1 [00000001], irqs_disabled():1
> 
> Hi Daniel,
>   Can you try this patch and let me know if it fixes your problem?
> 
> -----------------------
> 
> use rcu for root-domain kfree
> 
> Signed-off-by: Gregory Haskins <ghaskins@novell.com>
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index e6ad493..77e86c1 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -339,6 +339,7 @@ struct root_domain {
>         atomic_t refcount;
>         cpumask_t span;
>         cpumask_t online;
> +       struct rcu_head rcu;
> 
>         /*
>          * The "RT overload" flag: it gets set if a CPU has more than
> @@ -6222,6 +6223,12 @@ sd_parent_degenerate(struct sched_domain *sd, struct sched_domain *parent)
>         return 1;
>  }
> 
> +/* rcu callback to free a root-domain */
> +static void rq_free_root(struct rcu_head *rcu)
> +{
> +       kfree(container_of(rcu, struct root_domain, rcu));
> +}
> +

I looked at the code a bit, and I'm not sure you need this complexity.
Once you have replaced the old_rd, there is no reason it needs the
protection of the run queue spinlock, so you could just move the kfree
down below the spin_unlock_irqrestore().

Daniel


* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05 16:59     ` Daniel Walker
@ 2008-02-05 17:13       ` Gregory Haskins
  2008-02-05 18:25       ` Gregory Haskins
  1 sibling, 0 replies; 11+ messages in thread
From: Gregory Haskins @ 2008-02-05 17:13 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, Max Krasnyanskiy, LKML, linux-rt-users

>>> On Tue, Feb 5, 2008 at 11:59 AM, in message
<20080205165936.GA18613@dwalker1.mvista.com>, Daniel Walker
<dwalker@dwalker1.mvista.com> wrote: 
> On Mon, Feb 04, 2008 at 10:02:12PM -0700, Gregory Haskins wrote:
>> >>> On Mon, Feb 4, 2008 at  9:51 PM, in message
>> <20080205025144.GA31774@dwalker1.mvista.com>, Daniel Walker
>> <dwalker@dwalker1.mvista.com> wrote: 
>> > I get the following when I tried it,
>> > 
>> > BUG: sleeping function called from invalid context bash(5126) at
>> > kernel/rtmutex.c:638
>> > in_atomic():1 [00000001], irqs_disabled():1
>> 
>> Hi Daniel,
>>   Can you try this patch and let me know if it fixes your problem?
>> 
>> -----------------------
>> 
>> use rcu for root-domain kfree
>> 
>> Signed-off-by: Gregory Haskins <ghaskins@novell.com>
>> 
>> diff --git a/kernel/sched.c b/kernel/sched.c
>> index e6ad493..77e86c1 100644
>> --- a/kernel/sched.c
>> +++ b/kernel/sched.c
>> @@ -339,6 +339,7 @@ struct root_domain {
>>         atomic_t refcount;
>>         cpumask_t span;
>>         cpumask_t online;
>> +       struct rcu_head rcu;
>> 
>>         /*
>>          * The "RT overload" flag: it gets set if a CPU has more than
>> @@ -6222,6 +6223,12 @@ sd_parent_degenerate(struct sched_domain *sd, struct sched_domain *parent)
>>         return 1;
>>  }
>> 
>> +/* rcu callback to free a root-domain */
>> +static void rq_free_root(struct rcu_head *rcu)
>> +{
>> +       kfree(container_of(rcu, struct root_domain, rcu));
>> +}
>> +
> 
> I looked at the code a bit, and I'm not sure you need this complexity.
> Once you have replaced the old_rd, there is no reason it needs the
> protection of the run queue spinlock, so you could just move the kfree
> down below the spin_unlock_irqrestore().

Indeed.  When I looked at the stack last night, I thought the in_atomic was coming from further up in the trace.  I see the issue now; thanks, Daniel.  (Anyone have a spare brown bag?)

-Greg

> 
> Daniel




* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05 16:59     ` Daniel Walker
  2008-02-05 17:13       ` Gregory Haskins
@ 2008-02-05 18:25       ` Gregory Haskins
  2008-02-05 21:58         ` Daniel Walker
  1 sibling, 1 reply; 11+ messages in thread
From: Gregory Haskins @ 2008-02-05 18:25 UTC (permalink / raw)
  To: dwalker; +Cc: Ingo Molnar, Max Krasnyanskiy, LKML, linux-rt-users

>>> On Tue, Feb 5, 2008 at 11:59 AM, in message
<20080205165936.GA18613@dwalker1.mvista.com>, Daniel Walker
<dwalker@dwalker1.mvista.com> wrote: 

> 
> I looked at the code a bit, and I'm not sure you need this complexity.
> Once you have replaced the old_rd, there is no reason it needs the
> protection of the run queue spinlock, so you could just move the kfree
> down below the spin_unlock_irqrestore().

Here is a new version to address your observation:
-----------------------

we cannot kfree while in_atomic()

Signed-off-by: Gregory Haskins <ghaskins@novell.com>

diff --git a/kernel/sched.c b/kernel/sched.c
index e6ad493..0978912 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6226,6 +6226,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 {
        unsigned long flags;
        const struct sched_class *class;
+       struct root_domain *reap = NULL;

        spin_lock_irqsave(&rq->lock, flags);

@@ -6241,7 +6242,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
                cpu_clear(rq->cpu, old_rd->online);

                if (atomic_dec_and_test(&old_rd->refcount))
-                       kfree(old_rd);
+                       reap = old_rd;
        }

        atomic_inc(&rd->refcount);
@@ -6257,6 +6258,10 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
        }

        spin_unlock_irqrestore(&rq->lock, flags);
+
+       /* Don't try to free the memory while in-atomic() */
+       if (unlikely(reap))
+               kfree(reap);
 }




> 
> Daniel




* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05 18:25       ` Gregory Haskins
@ 2008-02-05 21:58         ` Daniel Walker
  2008-02-05 22:03           ` Gregory Haskins
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Walker @ 2008-02-05 21:58 UTC (permalink / raw)
  To: Gregory Haskins; +Cc: Ingo Molnar, Max Krasnyanskiy, LKML, linux-rt-users

On Tue, Feb 05, 2008 at 11:25:18AM -0700, Gregory Haskins wrote:
> @@ -6241,7 +6242,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
>                 cpu_clear(rq->cpu, old_rd->online);
> 
>                 if (atomic_dec_and_test(&old_rd->refcount))
> -                       kfree(old_rd);
> +                       reap = old_rd;

Unrelated to the in_atomic issue, I was wondering: if this if statement
isn't true, can the old_rd memory get leaked, or is it cleaned up
someplace else?

Daniel


* Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
  2008-02-05 21:58         ` Daniel Walker
@ 2008-02-05 22:03           ` Gregory Haskins
  0 siblings, 0 replies; 11+ messages in thread
From: Gregory Haskins @ 2008-02-05 22:03 UTC (permalink / raw)
  To: dwalker; +Cc: Ingo Molnar, Max Krasnyanskiy, LKML, linux-rt-users

>>> On Tue, Feb 5, 2008 at  4:58 PM, in message
<20080205215805.GD18613@dwalker1.mvista.com>, Daniel Walker
<dwalker@dwalker1.mvista.com> wrote: 
> On Tue, Feb 05, 2008 at 11:25:18AM -0700, Gregory Haskins wrote:
>> @@ -6241,7 +6242,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
>>                 cpu_clear(rq->cpu, old_rd->online);
>> 
>>                 if (atomic_dec_and_test(&old_rd->refcount))
>> -                       kfree(old_rd);
>> +                       reap = old_rd;
> 
> Unrelated to the in_atomic issue, I was wondering: if this if statement
> isn't true, can the old_rd memory get leaked, or is it cleaned up
> someplace else?

Each RQ always holds a reference to exactly one root-domain, and that lifetime is what rd->refcount tracks.  When the last RQ drops its reference to a particular instance, we free the structure.  So this is the only place where we clean up, but it should also be the only place we need to (unless I am misunderstanding you?)

Note that there is one exception: the default root-domain is never freed, which is why we initialize it with a refcount of 1.  So it is theoretically possible to have this particular root-domain dangling with no RQs associated with it, but that is by design.
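
Stripped of the scheduler details, the lifetime rule is plain reference counting with one
pinned instance. A user-space sketch, purely for illustration (the helper names here are made
up; this is not the kernel code):

/* Sketch of the root-domain lifetime rules described above. */
#include <stdio.h>
#include <stdlib.h>

struct root_domain {
	int refcount;
};

/* The default instance starts at refcount 1, so it can never be freed. */
static struct root_domain def_root_domain = { .refcount = 1 };

static void get_rd(struct root_domain *rd)
{
	rd->refcount++;
}

static void put_rd(struct root_domain *rd)
{
	if (--rd->refcount == 0)
		free(rd);	/* last RQ dropped its reference */
}

int main(void)
{
	struct root_domain *rd = calloc(1, sizeof(*rd));

	/* Two "runqueues" attach to the new domain... */
	get_rd(rd);
	get_rd(rd);

	/* ...and later detach; the second put frees the structure. */
	put_rd(rd);
	put_rd(rd);

	/* The default domain may end up with no users, but is never freed. */
	printf("default refcount: %d\n", def_root_domain.refcount);
	return 0;
}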

Regards,
-Greg



Thread overview: 11+ messages
2008-02-04 23:35 CPU hotplug and IRQ affinity with 2.6.24-rt1 Max Krasnyanskiy
2008-02-05  2:51 ` Daniel Walker
2008-02-05  3:27   ` Gregory Haskins
2008-02-05  4:21   ` Max Krasnyansky
2008-02-05  5:02   ` Gregory Haskins
2008-02-05 16:59     ` Daniel Walker
2008-02-05 17:13       ` Gregory Haskins
2008-02-05 18:25       ` Gregory Haskins
2008-02-05 21:58         ` Daniel Walker
2008-02-05 22:03           ` Gregory Haskins
2008-02-05 14:00   ` Gregory Haskins
