Details
Bug
Resolution: Fixed
Minor
Lustre 2.1.0
None
3
5068
Description
An LBUG is easy to trigger by calling open() with O_CREAT but without specifying a mode:
fd = open(filename, O_CREAT | O_RDWR);
Apr 4 06:12:01 prime kernel: LustreError: 6697:0:(client.c:2083:__ptlrpc_free_req()) ASSERTION(!request->rq_replay) failed: req ffff81021d734800
Apr 4 06:12:01 prime kernel: LustreError: 6697:0:(client.c:2083:__ptlrpc_free_req()) LBUG
Apr 4 06:12:01 prime kernel: Pid: 6697, comm: foo
Apr 4 06:12:01 prime kernel:
Apr 4 06:12:01 prime kernel: Call Trace:
Apr 4 06:12:01 prime kernel: [<ffffffff887cf5f1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
Apr 4 06:12:01 prime kernel: [<ffffffff887cfb2a>] lbug_with_loc+0x7a/0xd0 [libcfs]
Apr 4 06:12:01 prime kernel: [<ffffffff8899fbdc>] __ptlrpc_req_finished+0x44c/0x850 [ptlrpc]
Apr 4 06:12:01 prime kernel: [<ffffffff88e637aa>] ll_intent_release+0x11a/0x190 [lustre]
Apr 4 06:12:01 prime kernel: [<ffffffff88eaa761>] ll_lookup_nd+0x291/0x400 [lustre]
Apr 4 06:12:01 prime kernel: [<ffffffff800228d9>] d_alloc+0x174/0x1a9
Apr 4 06:12:01 prime kernel: [<ffffffff80036d71>] __lookup_hash+0x10b/0x12f
Apr 4 06:12:01 prime kernel: [<ffffffff8001afef>] open_namei+0xf2/0x6d5
Apr 4 06:12:01 prime kernel: [<ffffffff80066b88>] do_page_fault+0x4fe/0x874
Apr 4 06:12:01 prime kernel: [<ffffffff800274e7>] do_filp_open+0x1c/0x38
Apr 4 06:12:01 prime kernel: [<ffffffff80019e1e>] do_sys_open+0x44/0xbe
Apr 4 06:12:01 prime kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
After reading the code, I found the cause: the VFS does not check or validate nameidata::intent::open::create_mode before calling into the llite lookup, so llite receives an essentially random lookup_intent::it_create_mode. Because llite uses the high bits of it_create_mode to store M_CHECK_STALE, it can see an unexpected M_CHECK_STALE flag in that garbage value, which breaks the subsequent logic.