On 2020-09-14 19:02, ppvk@codeaurora.org wrote:
> On 2020-09-14 13:41, Miklos Szeredi wrote:
>> On Thu, Sep 10, 2020 at 5:42 PM wrote:
>>>
>>> On 2020-09-08 16:55, Miklos Szeredi wrote:
>>> > On Tue, Sep 8, 2020 at 10:17 AM Pradeep P V K
>>> > wrote:
>>> >>
>>> >> From: Pradeep P V K
>>> >>
>>> >> There is a potential race between fuse_abort_conn() and
>>> >> fuse_copy_page() as shown below, due to which VM_BUG_ON_PAGE
>>> >> crash is observed for accessing a free page.
>>> >>
>>> >> context#1:                  context#2:
>>> >> fuse_dev_do_read()          fuse_abort_conn()
>>> >> ->fuse_copy_args()          ->end_requests()
>>> >
>>> > This shouldn't happen due to FR_LOCKED logic. Are you seeing this on
>>> > an upstream kernel? Which version?
>>> >
>>> > Thanks,
>>> > Miklos
>>>
>>> This happens just after unlock_request() in fuse_ref_page().
>>> unlock_request() clears the FR_LOCKED bit.
>>> As there is no protection between context#1 & context#2 during
>>> unlock_request(), there is a chance that it could happen.
>>
>> Ah, indeed, I missed that one.
>>
>> Similar issue in fuse_try_move_page(), which dereferences oldpage
>> after unlock_request().
>>
>> Fix for both is to grab a reference to the page from ap->pages[] array
>> *before* calling unlock_request().
>>
>> Attached untested patch. Could you please verify that it fixes the
>> bug?
>>
> Thanks for the patch. It is a one-time issue and a bit hard to
> reproduce, but we will still verify the above proposed patch and
> update the test results here.
>
No issue was seen during a 24+ hour stability run with your proposed
patch. The run covered reads/writes on fuse paths, reboots, and other
concurrent activity.

> Minor comments on the commit text of the proposed patch: this issue
> was originally reported by me, and the kernel test robot
> identified compilation errors in the patch that I submitted.
> The confusion might be due to an improper commit text note on
> "changes since v1".
>
>> Thanks,
>> Miklos
>
> Thanks and Regards,
> Pradeep