LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* ptrace warning due to "cgroup: get rid of cgroup_freezer_frozen_exit()"
@ 2019-05-07 20:04 Qian Cai
  2019-05-07 21:37 ` Roman Gushchin
  0 siblings, 1 reply; 5+ messages in thread
From: Qian Cai @ 2019-05-07 20:04 UTC (permalink / raw)
  To: guro; +Cc: oleg, tj, lizefan, hannes, cgroups, linux-kernel

LTP ptrace01 test case triggers a warning below. Looks at ptrace_stop() calls
cgroup_enter_frozen() there in the cgroup v2 freezer.

[ 8373.336330] WARNING: CPU: 56 PID: 67026 at kernel/cgroup/cgroup.c:6008
cgroup_exit+0x2a9/0x2f0
[ 8373.345001] Modules linked in: brd ext4 crc16 mbcache jbd2 overlay loop
nls_iso8859_1 nls_cp437 vfat fat kvm_amd kvm ses enclosure dax_pmem irqbypass
dax_pmem_core efivars ip_tables x_tables xfs sd_mod smartpqi scsi_transport_sas
tg3 mlx5_core libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
efivarfs [last unloaded: dummy_del_mod]
[ 8373.375561] CPU: 56 PID: 67026 Comm: ptrace01 Tainted:
G           O      5.1.0-next-20190507+ #25
[ 8373.384579] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10,
BIOS A40 01/25/2019
[ 8373.393164] RIP: 0010:cgroup_exit+0x2a9/0x2f0
[ 8373.397556] Code: 0d ff ff ff 4c 89 f7 e8 75 4b 1b 00 4c 8b ab 20 0f 00 00 49
8d 7d 50 e8 65 4b 1b 00 49 8b 7d 50 e8 4c 56 00 00 e9 db fe ff ff <0f> 0b e9 3a
fe ff ff 48 01 f1 0f 82 3b ff ff ff 48 c7 c7 40 83 5b
[ 8373.416443] RSP: 0018:ffff888bdc9ef9b8 EFLAGS: 00010002
[ 8373.421709] RAX: 0000000000000000 RBX: ffff888e5cfcc040 RCX: ffffffffab3a8e7d
[ 8373.428897] RDX: 1ffff111cb9f9875 RSI: dffffc0000000000 RDI: ffff888e5cfcc3a8
[ 8373.436080] RBP: ffff888bdc9efa50 R08: ffffed117b93df25 R09: ffffed117b93df24
[ 8373.443266] R10: ffffed117b93df24 R11: 0000000000000003 R12: ffff888bdc9efa28
[ 8373.450451] R13: ffff888f4c2346c8 R14: ffff888e5cfccf60 R15: ffff888e5cfccf68
[ 8373.457637] FS:  00007ff1e2e3d5c0(0000) GS:ffff88902f800000(0000)
knlGS:0000000000000000
[ 8373.465781] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8373.471569] CR2: 00007ff1e286fe8a CR3: 000000092c412000 CR4: 00000000001406a0
[ 8373.478750] Call Trace:
[ 8373.481219]  ? cgroup_post_fork+0x350/0x350
[ 8373.485435]  ? fpu__drop+0x5e/0x230
[ 8373.488951]  ? exit_thread+0x10c/0x160
[ 8373.492736]  do_exit+0x5cb/0x1740
[ 8373.496083]  ? check_chain_key+0x142/0x200
[ 8373.500210]  ? mm_update_next_owner+0x360/0x360
[ 8373.504775]  ? map_id_up+0x14c/0x1f0
[ 8373.508380]  ? check_chain_key+0x142/0x200
[ 8373.512512]  ? get_signal+0x5f1/0xde0
[ 8373.516206]  ? lock_downgrade+0x300/0x300
[ 8373.520246]  ? lock_downgrade+0x300/0x300
[ 8373.524287]  do_group_exit+0x78/0x160
[ 8373.527978]  get_signal+0x1e8/0xde0
[ 8373.531498]  do_signal+0x9c/0x9d0
[ 8373.534841]  ? check_chain_key+0x142/0x200
[ 8373.538970]  ? setup_sigcontext+0x280/0x280
[ 8373.543185]  ? lock_downgrade+0x300/0x300
[ 8373.547228]  ? kill_pid_info+0x2e/0xd0
[ 8373.551006]  ? kill_pid_info+0xa4/0xd0
[ 8373.554788]  ? __x64_sys_kill+0x262/0x350
[ 8373.558830]  exit_to_usermode_loop+0x9d/0xc0
[ 8373.563131]  do_syscall_64+0x470/0x5d8
[ 8373.566910]  ? syscall_return_slowpath+0xf0/0xf0
[ 8373.571565]  ? __do_page_fault+0x44d/0x5b0
[ 8373.575698]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 8373.580789] RIP: 0033:0x7ff1e2893c3b
[ 8373.584402] Code: Bad RIP value.
[ 8373.587653] RSP: 002b:00007ffd8e5efe78 EFLAGS: 00000206 ORIG_RAX:
000000000000003e
[ 8373.595276] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007ff1e2893c3b
[ 8373.602461] RDX: 0000000000000000 RSI: 000000000000000c RDI: 00000000000105d2
[ 8373.609642] RBP: 0000000000000000 R08: 00000000ffffffff R09: 00007ff1e2e3d5c0
[ 8373.616824] R10: fffffffffffff768 R11: 0000000000000206 R12: 00007ffd8e5efe98
[ 8373.624005] R13: 00007ffd8e5f00c0 R14: 0000000000000000 R15: 0000000000000000
[ 8373.631190] ---[ end trace a7169f3366f1d100 ]---

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ptrace warning due to "cgroup: get rid of cgroup_freezer_frozen_exit()"
  2019-05-07 20:04 ptrace warning due to "cgroup: get rid of cgroup_freezer_frozen_exit()" Qian Cai
@ 2019-05-07 21:37 ` Roman Gushchin
  2019-05-08 13:10   ` Qian Cai
  2019-05-08 15:25   ` Oleg Nesterov
  0 siblings, 2 replies; 5+ messages in thread
From: Roman Gushchin @ 2019-05-07 21:37 UTC (permalink / raw)
  To: Qian Cai; +Cc: oleg, tj, lizefan, hannes, cgroups, linux-kernel

On Tue, May 07, 2019 at 04:04:22PM -0400, Qian Cai wrote:
> LTP ptrace01 test case triggers a warning below. Looks at ptrace_stop() calls
> cgroup_enter_frozen() there in the cgroup v2 freezer.
> 
> [ 8373.336330] WARNING: CPU: 56 PID: 67026 at kernel/cgroup/cgroup.c:6008
> cgroup_exit+0x2a9/0x2f0
> [ 8373.345001] Modules linked in: brd ext4 crc16 mbcache jbd2 overlay loop
> nls_iso8859_1 nls_cp437 vfat fat kvm_amd kvm ses enclosure dax_pmem irqbypass
> dax_pmem_core efivars ip_tables x_tables xfs sd_mod smartpqi scsi_transport_sas
> tg3 mlx5_core libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
> efivarfs [last unloaded: dummy_del_mod]
> [ 8373.375561] CPU: 56 PID: 67026 Comm: ptrace01 Tainted:
> G           O      5.1.0-next-20190507+ #25
> [ 8373.384579] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10,
> BIOS A40 01/25/2019
> [ 8373.393164] RIP: 0010:cgroup_exit+0x2a9/0x2f0
> [ 8373.397556] Code: 0d ff ff ff 4c 89 f7 e8 75 4b 1b 00 4c 8b ab 20 0f 00 00 49
> 8d 7d 50 e8 65 4b 1b 00 49 8b 7d 50 e8 4c 56 00 00 e9 db fe ff ff <0f> 0b e9 3a
> fe ff ff 48 01 f1 0f 82 3b ff ff ff 48 c7 c7 40 83 5b
> [ 8373.416443] RSP: 0018:ffff888bdc9ef9b8 EFLAGS: 00010002
> [ 8373.421709] RAX: 0000000000000000 RBX: ffff888e5cfcc040 RCX: ffffffffab3a8e7d
> [ 8373.428897] RDX: 1ffff111cb9f9875 RSI: dffffc0000000000 RDI: ffff888e5cfcc3a8
> [ 8373.436080] RBP: ffff888bdc9efa50 R08: ffffed117b93df25 R09: ffffed117b93df24
> [ 8373.443266] R10: ffffed117b93df24 R11: 0000000000000003 R12: ffff888bdc9efa28
> [ 8373.450451] R13: ffff888f4c2346c8 R14: ffff888e5cfccf60 R15: ffff888e5cfccf68
> [ 8373.457637] FS:  00007ff1e2e3d5c0(0000) GS:ffff88902f800000(0000)
> knlGS:0000000000000000
> [ 8373.465781] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8373.471569] CR2: 00007ff1e286fe8a CR3: 000000092c412000 CR4: 00000000001406a0
> [ 8373.478750] Call Trace:
> [ 8373.481219]  ? cgroup_post_fork+0x350/0x350
> [ 8373.485435]  ? fpu__drop+0x5e/0x230
> [ 8373.488951]  ? exit_thread+0x10c/0x160
> [ 8373.492736]  do_exit+0x5cb/0x1740
> [ 8373.496083]  ? check_chain_key+0x142/0x200
> [ 8373.500210]  ? mm_update_next_owner+0x360/0x360
> [ 8373.504775]  ? map_id_up+0x14c/0x1f0
> [ 8373.508380]  ? check_chain_key+0x142/0x200
> [ 8373.512512]  ? get_signal+0x5f1/0xde0
> [ 8373.516206]  ? lock_downgrade+0x300/0x300
> [ 8373.520246]  ? lock_downgrade+0x300/0x300
> [ 8373.524287]  do_group_exit+0x78/0x160
> [ 8373.527978]  get_signal+0x1e8/0xde0
> [ 8373.531498]  do_signal+0x9c/0x9d0
> [ 8373.534841]  ? check_chain_key+0x142/0x200
> [ 8373.538970]  ? setup_sigcontext+0x280/0x280
> [ 8373.543185]  ? lock_downgrade+0x300/0x300
> [ 8373.547228]  ? kill_pid_info+0x2e/0xd0
> [ 8373.551006]  ? kill_pid_info+0xa4/0xd0
> [ 8373.554788]  ? __x64_sys_kill+0x262/0x350
> [ 8373.558830]  exit_to_usermode_loop+0x9d/0xc0
> [ 8373.563131]  do_syscall_64+0x470/0x5d8
> [ 8373.566910]  ? syscall_return_slowpath+0xf0/0xf0
> [ 8373.571565]  ? __do_page_fault+0x44d/0x5b0
> [ 8373.575698]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 8373.580789] RIP: 0033:0x7ff1e2893c3b
> [ 8373.584402] Code: Bad RIP value.
> [ 8373.587653] RSP: 002b:00007ffd8e5efe78 EFLAGS: 00000206 ORIG_RAX:
> 000000000000003e
> [ 8373.595276] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007ff1e2893c3b
> [ 8373.602461] RDX: 0000000000000000 RSI: 000000000000000c RDI: 00000000000105d2
> [ 8373.609642] RBP: 0000000000000000 R08: 00000000ffffffff R09: 00007ff1e2e3d5c0
> [ 8373.616824] R10: fffffffffffff768 R11: 0000000000000206 R12: 00007ffd8e5efe98
> [ 8373.624005] R13: 00007ffd8e5f00c0 R14: 0000000000000000 R15: 0000000000000000
> [ 8373.631190] ---[ end trace a7169f3366f1d100 ]---

Hello, Qian!

Thank you for the report!

Can you, please, try if the following patch fixes the problem?

Thank you!

Roman

--

diff --git a/kernel/signal.c b/kernel/signal.c
index 16b72f4f14df..bf2f268f1386 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2484,9 +2484,6 @@ bool get_signal(struct ksignal *ksig)
                sigdelset(&current->pending.signal, SIGKILL);
                recalc_sigpending();
                current->jobctl &= ~JOBCTL_TRAP_FREEZE;
-               spin_unlock_irq(&sighand->siglock);
-               if (unlikely(cgroup_task_frozen(current)))
-                       cgroup_leave_frozen(true);
                goto fatal;
        }
 
@@ -2608,8 +2605,10 @@ bool get_signal(struct ksignal *ksig)
                        continue;
                }
 
-               spin_unlock_irq(&sighand->siglock);
        fatal:
+               spin_unlock_irq(&sighand->siglock);
+               if (unlikely(cgroup_task_frozen(current)))
+                       cgroup_leave_frozen(true);
 
                /*
                 * Anything else is fatal, maybe with a core dump.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ptrace warning due to "cgroup: get rid of cgroup_freezer_frozen_exit()"
  2019-05-07 21:37 ` Roman Gushchin
@ 2019-05-08 13:10   ` Qian Cai
  2019-05-08 15:25   ` Oleg Nesterov
  1 sibling, 0 replies; 5+ messages in thread
From: Qian Cai @ 2019-05-08 13:10 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: oleg, tj, lizefan, hannes, cgroups, linux-kernel

On Tue, 2019-05-07 at 21:37 +0000, Roman Gushchin wrote:
> Can you, please, try if the following patch fixes the problem?

It works great.

> --
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 16b72f4f14df..bf2f268f1386 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2484,9 +2484,6 @@ bool get_signal(struct ksignal *ksig)
>                 sigdelset(&current->pending.signal, SIGKILL);
>                 recalc_sigpending();
>                 current->jobctl &= ~JOBCTL_TRAP_FREEZE;
> -               spin_unlock_irq(&sighand->siglock);
> -               if (unlikely(cgroup_task_frozen(current)))
> -                       cgroup_leave_frozen(true);
>                 goto fatal;
>         }
>  
> @@ -2608,8 +2605,10 @@ bool get_signal(struct ksignal *ksig)
>                         continue;
>                 }
>  
> -               spin_unlock_irq(&sighand->siglock);
>         fatal:
> +               spin_unlock_irq(&sighand->siglock);
> +               if (unlikely(cgroup_task_frozen(current)))
> +                       cgroup_leave_frozen(true);
>  
>                 /*
>                  * Anything else is fatal, maybe with a core dump.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ptrace warning due to "cgroup: get rid of cgroup_freezer_frozen_exit()"
  2019-05-07 21:37 ` Roman Gushchin
  2019-05-08 13:10   ` Qian Cai
@ 2019-05-08 15:25   ` Oleg Nesterov
  2019-05-08 18:03     ` Roman Gushchin
  1 sibling, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2019-05-08 15:25 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Qian Cai, tj, lizefan, hannes, cgroups, linux-kernel

On 05/07, Roman Gushchin wrote:
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2484,9 +2484,6 @@ bool get_signal(struct ksignal *ksig)
>                 sigdelset(&current->pending.signal, SIGKILL);
>                 recalc_sigpending();
>                 current->jobctl &= ~JOBCTL_TRAP_FREEZE;

just noticed... perhaps it makes more sense to clear JOBCTL_TRAP_FREEZE
before recalc_sigpending(). Or simply not clear it at all, see below.

> -               spin_unlock_irq(&sighand->siglock);
> -               if (unlikely(cgroup_task_frozen(current)))
> -                       cgroup_leave_frozen(true);
>                 goto fatal;
>         }
>  
> @@ -2608,8 +2605,10 @@ bool get_signal(struct ksignal *ksig)
>                         continue;
>                 }
>  
> -               spin_unlock_irq(&sighand->siglock);
>         fatal:
> +               spin_unlock_irq(&sighand->siglock);
> +               if (unlikely(cgroup_task_frozen(current)))
> +                       cgroup_leave_frozen(true);

Yes, ptrace_signal() can return a fatal signal... and in this case we do not
clear JOBCTL_TRAP_FREEZE. This doesn't look consistent with the code above.



I can only repeat that somehow we need to cleanup/improve the whole logic.

Say, a traced task reports syscall-enter. ptrace_stop() does enter_frozen().
The cgroup can become CGRP_FROZEN after that. Now the debugger does PTRACE_CONT,
the frozen task actually starts the syscall. Obviously not good.

Heh, and if this syscall is sys_exit or sys_exit_group we can hit the same
warning.

Oleg.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ptrace warning due to "cgroup: get rid of cgroup_freezer_frozen_exit()"
  2019-05-08 15:25   ` Oleg Nesterov
@ 2019-05-08 18:03     ` Roman Gushchin
  0 siblings, 0 replies; 5+ messages in thread
From: Roman Gushchin @ 2019-05-08 18:03 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Qian Cai, tj, lizefan, hannes, cgroups, linux-kernel

Hello, Oleg!

On Wed, May 08, 2019 at 05:25:36PM +0200, Oleg Nesterov wrote:
> On 05/07, Roman Gushchin wrote:
> >
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -2484,9 +2484,6 @@ bool get_signal(struct ksignal *ksig)
> >                 sigdelset(&current->pending.signal, SIGKILL);
> >                 recalc_sigpending();
> >                 current->jobctl &= ~JOBCTL_TRAP_FREEZE;
> 
> just noticed... perhaps it makes more sense to clear JOBCTL_TRAP_FREEZE
> before recalc_sigpending(). Or simply not clear it at all, see below.

Agree, we don't need to clear it, because there is no way back for the task
at this moment, only exiting. I'm dropping this line.

> 
> > -               spin_unlock_irq(&sighand->siglock);
> > -               if (unlikely(cgroup_task_frozen(current)))
> > -                       cgroup_leave_frozen(true);
> >                 goto fatal;
> >         }
> >  
> > @@ -2608,8 +2605,10 @@ bool get_signal(struct ksignal *ksig)
> >                         continue;
> >                 }
> >  
> > -               spin_unlock_irq(&sighand->siglock);
> >         fatal:
> > +               spin_unlock_irq(&sighand->siglock);
> > +               if (unlikely(cgroup_task_frozen(current)))
> > +                       cgroup_leave_frozen(true);
> 
> Yes, ptrace_signal() can return a fatal signal... and in this case we do not
> clear JOBCTL_TRAP_FREEZE. This doesn't look consistent with the code above.

The same here.

> 
> 
> 
> I can only repeat that somehow we need to cleanup/improve the whole logic.
> 
> Say, a traced task reports syscall-enter. ptrace_stop() does enter_frozen().
> The cgroup can become CGRP_FROZEN after that. Now the debugger does PTRACE_CONT,
> the frozen task actually starts the syscall. Obviously not good.
> 
> Heh, and if this syscall is sys_exit or sys_exit_group we can hit the same
> warning.

Ok, I totally agree, but let's fix this noisy WARN_ON() for now,
to have some time to think about the rest.

I believe this will fix it for now:
https://github.com/rgushchin/linux/commit/3f65507e407c81d3cdf4a9145f36a8d0f755ca04

Please, let me know if you have any concerns here, otherwise I'll ask Tejun to
pull it asap.

Thank you for looking into it!

Roman

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-05-08 18:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-07 20:04 ptrace warning due to "cgroup: get rid of cgroup_freezer_frozen_exit()" Qian Cai
2019-05-07 21:37 ` Roman Gushchin
2019-05-08 13:10   ` Qian Cai
2019-05-08 15:25   ` Oleg Nesterov
2019-05-08 18:03     ` Roman Gushchin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).