[LU-4971] sanity-scrub test_2: ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed Created: 28/Apr/14 Updated: 08/Feb/17 Resolved: 10/Jul/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0, Lustre 2.5.4 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Maloo | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 13763 |
| Description |
|
This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>
This issue relates to the following test suite run:
The sub-test test_2 failed with the following error:
Info required for matching: sanity-scrub 2 |
| Comments |
| Comment by nasf (Inactive) [ 29/Apr/14 ] |
|
Some log on the client side:
23:56:52:LustreError: 23554:0:(ldlm_lock.c:671:ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed: Inappropriate inode lock bits during conversion 3
23:56:52:LustreError: 23554:0:(ldlm_lock.c:671:ldlm_lock2desc()) LBUG
23:56:52:Pid: 23554, comm: diff
23:56:52:
23:56:52:Call Trace:
23:56:52: [<ffffffffa03d0895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
23:56:52: [<ffffffffa03d0e97>] lbug_with_loc+0x47/0xb0 [libcfs]
23:56:52: [<ffffffffa08dc239>] ldlm_lock2desc+0x179/0x180 [ptlrpc]
23:56:52: [<ffffffffa08eed90>] ldlm_cli_enqueue+0x1f0/0x790 [ptlrpc]
23:56:52: [<ffffffffa0914c0a>] ? ptlrpc_request_set_replen+0x3a/0x60 [ptlrpc]
23:56:52: [<ffffffffa08f3bd0>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa0a3d88e>] mdc_enqueue+0x2be/0x1b00 [mdc]
23:56:52: [<ffffffffa08e25e5>] ? ldlm_lock_match+0x215/0x8b0 [ptlrpc]
23:56:52: [<ffffffffa0a3f2ce>] mdc_intent_lock+0x1fe/0x63f [mdc]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa08f3bd0>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc]
23:56:52: [<ffffffffa053ba28>] ? lprocfs_counter_add+0x1a8/0x1c0 [obdclass]
23:56:52: [<ffffffffa082eae1>] lmv_intent_remote+0x481/0xa90 [lmv]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa082fd49>] lmv_intent_lookup+0xc59/0xd00 [lmv]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa03d027b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
23:56:52: [<ffffffffa0830ada>] lmv_intent_lock+0x32a/0x380 [lmv]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa1325ed8>] ? ll_prep_md_op_data+0x1a8/0x490 [lustre]
23:56:52: [<ffffffffa1344d79>] ll_lookup_it+0x299/0xb50 [lustre]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa134589f>] ll_lookup_nd+0x26f/0x3e0 [lustre]
23:56:52: [<ffffffff811a394e>] ? d_alloc+0x13e/0x1b0
23:56:52: [<ffffffff81198a25>] do_lookup+0x1a5/0x230
23:56:52: [<ffffffff81198db0>] __link_path_walk+0x200/0xff0
23:56:52: [<ffffffff8114a647>] ? handle_pte_fault+0xf7/0xb00
23:56:52: [<ffffffff81199e5a>] path_walk+0x6a/0xe0
23:56:52: [<ffffffff8119a06b>] filename_lookup+0x6b/0xc0
23:56:52: [<ffffffff81196996>] ? final_putname+0x26/0x50
23:56:52: [<ffffffff8119b197>] user_path_at+0x57/0xa0
23:56:52: [<ffffffff8104a98c>] ? __do_page_fault+0x1ec/0x480
23:56:52: [<ffffffff812826f5>] ? _atomic_dec_and_lock+0x55/0x80
23:56:52: [<ffffffff811aaa10>] ? mntput_no_expire+0x30/0x110
23:56:52: [<ffffffff8118e7a4>] ? cp_new_stat+0xe4/0x100
23:56:52: [<ffffffff8118e9e0>] vfs_fstatat+0x50/0xa0
23:56:52: [<ffffffff8118eb5b>] vfs_stat+0x1b/0x20
23:56:52: [<ffffffff8118eb84>] sys_newstat+0x24/0x50
23:56:52: [<ffffffff810e2057>] ? audit_syscall_entry+0x1d7/0x200
23:56:52: [<ffffffff810e1e4e>] ? __audit_syscall_exit+0x25e/0x290
23:56:52: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b |
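For reference, the LBUG fires in the old inodebits interoperability path of ldlm_lock2desc(): when the export's connect flags claim the peer does not support OBD_CONNECT_IBITS, the IBITS lock is downgraded to a plain lock, and the code insists the bits are exactly LOOKUP | UPDATE | LAYOUT. Below is a simplified sketch of that branch, paraphrased rather than the verbatim source; `exp` stands for the export the lock is tied to.

/* Simplified sketch of the interop branch in ldlm_lock2desc()
 * (paraphrased from the 2.5/2.6 code, not the verbatim source). */
if (lock->l_resource->lr_type == LDLM_IBITS &&
    !(exp_connect_flags(exp) & OBD_CONNECT_IBITS)) {
        /* This is the LASSERTF that produced the LBUG above: the
         * downgrade only knows how to represent exactly these bits. */
        LASSERTF(lock->l_policy_data.l_inodebits.bits ==
                 (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE |
                  MDS_INODELOCK_LAYOUT),
                 "Inappropriate inode lock bits during conversion %llu\n",
                 lock->l_policy_data.l_inodebits.bits);
        /* ... downgrade the lock descriptor to LDLM_PLAIN ... */
} else {
        /* Normal path: the IBITS policy is copied into the wire
         * descriptor unchanged. */
}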
| Comment by Oleg Drokin [ 29/Apr/14 ] |
|
So it appears that the client-side exp_connect_flags is somehow corrupted, because we have had no servers without inode bits support for many years now; not since 1.4.x, I suspect. |
| Comment by Andreas Dilger [ 30/Apr/14 ] |
|
It is strange that the LASSERT() is on the client, but LFSCK is running on the server (only some parts of the test scripts are setting up the environment on the client). Oleg notes that this problem indicates that the client is trying to use an IBITS lock (as it should) but it thinks the server does not support this (due to missing connect feature flags). That might indicate some problem or corruption with the connection state. |
| Comment by nasf (Inactive) [ 30/Apr/14 ] |
|
Yes, I also suspect that it is not related to OI scrub; instead, it might be caused by some connection race. I remember that we hit similar connection flag issues around client_is_remote() before at the LLNL site, but I cannot remember the details... |
| Comment by Hongchao Zhang [ 27/Jun/14 ] |
|
https://testing.hpdd.intel.com/test_sets/099fd572-fd95-11e3-b84d-5254006e85c2 |
| Comment by James Nunez (Inactive) [ 30/Jun/14 ] |
|
sanity-scrub test 4 is displaying this error and crashing the client node. I'm not sure why so few logs were produced, but here are the logs from the client node: https://testing.hpdd.intel.com/test_sessions/77df2c0e-0092-11e4-a42b-5254006e85c2 |
| Comment by James Nunez (Inactive) [ 02/Jul/14 ] |
|
I've run sanity-scrub four times on the OpenSFS cluster, and it fails with this error each time, in test 2, 4, or 5. I ran with one, two, and four clients; two MDSs, each with two MDTs; and four OSSs, each with two OSTs; on Lustre 2.5.60 build #2535. |
| Comment by nasf (Inactive) [ 03/Jul/14 ] |
|
A debug patch to check the export status when the assertion triggers: |
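(The Gerrit link for the debug patch was not captured above.) Purely as an illustration, a change of this shape would extend the assertion message to also dump the export pointer and its connect flags, which matches the "bits 0x3, exp ..., flags 0x0" output in the next comment; this is a hypothetical sketch, not the actual patch.

/* Hypothetical sketch of the debug change (not the actual patch):
 * report the export and its connect flags along with the bits. */
LASSERTF(lock->l_policy_data.l_inodebits.bits ==
         (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE |
          MDS_INODELOCK_LAYOUT),
         "Inappropriate inode lock bits during conversion, "
         "bits %#llx, exp %p, flags %#llx\n",
         lock->l_policy_data.l_inodebits.bits,
         exp, exp_connect_flags(exp));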
| Comment by James Nunez (Inactive) [ 03/Jul/14 ] |
|
I've run sanity-scrub twice with the debug patch and only hit this error in one of the runs. Here is the output from the debug message for a single-client run that failed in sanity-scrub test 2:
...
restore data
restore EA
/usr/lib64/lustre/tests
remove recovery logs
removed `/lustre/brpt/CATALOGS'
starting MDTs without disabling OI scrub
Starting client: c17: -o user_xattr,flock mds01@o2ib:/scratch /lustre/scratch
Message from syslogd@c17 at Jul 3 11:36:24 ...
kernel:LustreError: 22841:0:(ldlm_lock.c:672:ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed: Inappropriate inode lock bits during conversion, bits 0x3, exp ffff880827d6dc00, flags 0x0
Message from syslogd@c17 at Jul 3 11:36:24 ...
kernel:LustreError: 22841:0:(ldlm_lock.c:672:ldlm_lock2desc()) LBUG |
| Comment by Andreas Dilger [ 04/Jul/14 ] |
|
Hmm, it has bits 0x3 set, so it is missing the LAYOUT flag. That would be possible if the file is being migrated, or maybe if it is a remote MDT object? I suspect that the LASSERT is wrong here, but I'm not positive yet as I haven't looked closely at the surrounding code. |
| Comment by nasf (Inactive) [ 04/Jul/14 ] |
|
The bits may be right: when the old code cannot find OBD_CONNECT_IBITS in the connect flags, the LAYOUT ibit is dropped automatically. But here we see the connect flags as "0x0", which is impossible for any established connection. So there may be data corruption. |
| Comment by Andreas Dilger [ 04/Jul/14 ] |
|
Since this only happens during sanity-scrub, I suspect some kind of memory corruption or use-after-free problem in the code. It might be useful to run sanity-scrub on a kernel with CONFIG_DEBUG_SLAB and other memory debugging features (CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_OBJECTS, CONFIG_DEBUG_KMEMLEAK, CONFIG_DEBUG_MUTEXES, CONFIG_DEBUG_KOBJECT) enabled to see if this triggers a problem. |
| Comment by nasf (Inactive) [ 05/Jul/14 ] |
|
I have run the related tests locally hundreds of times but cannot reproduce the failure myself, so I went through all the related code. Finally, I found that it may not be data corruption after all; it looks more like a race between the asynchronous connection and the synchronous lock enqueue. Consider the following scenario:

1) There are two MDTs in the system. When the client mounts, it triggers an MDS_CONNECT request to each MDT, but the connect requests are handled asynchronously. The mount proceeds as soon as the connection between the client and MDT0 is established; it does not wait for the connection between the client and MDT1.

2) After the client mount returns, the application runs "stat $MOUNT/a/b". The directory "a" resides on MDT0 and "b" resides on MDT1, so the client sends the lock enqueue RPC to MDT0 first.

3) The RPC handler on MDT0 finds that the target object "b" resides on the remote MDT1, so it replies telling the client to talk to MDT1.

4) The client gets MDT0's reply and calls lmv_intent_remote() to forward the request to MDT1 via MDC1, but at that point the connection between the client and MDT1 has not been established yet.

5) For most other RPCs this is not a problem, because the client-side RPC sender (in ptlrpc) handles all kinds of connection issues. But ldlm_enqueue is different: before handing the ldlm_enqueue RPC to the sender (to ptlrpc), the caller checks the export connect flags, and those flags are not set until the connection between the client and the MDT has been established. At that point the caller of the ldlm_enqueue request reads "0" from the export connect flags and triggers the ASSERTION described in this ticket.

So it is not a bug specific to OI scrub, but a more general issue in the Lustre framework. Because sanity-scrub accesses cross-MDT objects immediately after the client mount (without any wait), it reproduces the failure more easily.

As for the fix, I do not want to change the general connection/ldlm framework, because I am afraid of introducing potential bugs. Instead, the patch makes the ldlm_enqueue request wait for the connection to be established in this race case; see the sketch after this comment. |
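Purely as an illustration of that approach (hypothetical helper name and waiting logic, not the actual Gerrit change), the idea is for ldlm_cli_enqueue() to wait until the import behind the export has finished connecting before it trusts exp_connect_flags():

/* Illustration only: wait for the export's import to reach FULL state
 * before reading exp_connect_flags() in ldlm_cli_enqueue().  The real
 * patch may differ; a production version would sleep on a wait queue
 * with a timeout instead of polling like this. */
static int ldlm_wait_export_connected(struct obd_export *exp)
{
        struct obd_import *imp = class_exp2cliimp(exp);

        if (imp == NULL)
                return -ENODEV;

        /* exp_connect_flags() is only meaningful once the connect
         * reply has been processed and the import is FULL. */
        while (imp->imp_state != LUSTRE_IMP_FULL) {
                if (imp->imp_deactive)
                        return -ENODEV;
                schedule_timeout_uninterruptible(cfs_time_seconds(1) / 10);
        }

        return 0;
}

With such a guard in place, the enqueue path only reads the connect flags after the MDT1 connection from step 4 has actually been established.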
| Comment by nasf (Inactive) [ 05/Jul/14 ] |
|
James, would you please verify the new patch? Thanks! |
| Comment by Di Wang [ 05/Jul/14 ] |
|
Hmm, I think this is a problem for a single-MDT filesystem too, i.e. ldlm_cli_enqueue should not use the export connect flags before it makes sure the connection is established? The code in question dates back to:

commit 113303973ec9f8484eb2355a1a6ef3c4c7fd6a56
Author: nathan <nathan>
Date: Sat Feb 10 06:33:41 2007 +0000
land b1_5 onto HEAD
I thought that code uses the export connect flags in the wrong way, so we could probably just remove it to save some trouble here, since I do not think we plan to let a 2.6 Lustre client connect to a server without inodebits support (1.4?) |
| Comment by nasf (Inactive) [ 05/Jul/14 ] |
|
No, for the single-MDT case there is no such race condition, because the mount process first calls md_getstatus() on the ROOT object, which is NOT an ldlm_enqueue RPC. So by the time the mount returns to user space, the connection between the client and MDT0 is guaranteed to have been established successfully. |
| Comment by Di Wang [ 07/Jul/14 ] |
|
Ah, I see. Hmm, I still think removing the incompatible-inodebits handling is a better choice, or we could do something like this:

[root@testnode lustre-release]# git diff lustre/ldlm/
diff --git a/lustre/ldlm/ldlm_request.c b/lustre/ldlm/ldlm_request.c
index 76f329d..2c0d05d 100644
--- a/lustre/ldlm/ldlm_request.c
+++ b/lustre/ldlm/ldlm_request.c
@@ -909,7 +909,7 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp,
OBD_CONNECT_IBITS))
lock->l_policy_data.l_inodebits.bits =
MDS_INODELOCK_LOOKUP |
- MDS_INODELOCK_UPDATE;
+ MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT;
else
lock->l_policy_data = *policy;
}
|
| Comment by nasf (Inactive) [ 07/Jul/14 ] |
|
Wangdi, I have verified the race condition locally. But I am not sure whether removing the incompatible-bits handling is a suitable solution, because the key issue is that we are checking uninitialized export flags. |
| Comment by Di Wang [ 07/Jul/14 ] |
|
Yes, but that check is only there to see whether the server supports inodebits locks. Since a 2.6 client will only ever connect to servers with inodebits enabled, why do we need this check? TBH, I really do not like adding "connection wait logic" above ptlrpc, which might bring us trouble or worse. |
| Comment by nasf (Inactive) [ 07/Jul/14 ] |
|
Even if we drop the incompatible-ibits handling, we still need to prevent users of the export from using it before it is initialised; otherwise there may be other potential bugs. |
| Comment by James Nunez (Inactive) [ 07/Jul/14 ] |
|
I've tested the patch at http://review.whamcloud.com/#/c/10958/ (version 2) and, after running sanity-scrub three times, I have not seen this assertion. The patch fixes the crash/error for the configuration I've been testing with. |
| Comment by nasf (Inactive) [ 07/Jul/14 ] |
|
Thanks James! |
| Comment by Di Wang [ 07/Jul/14 ] |
|
"Even though we drop the incompatible ibits, we still needs to prevent the user to use the non-initialised export, otherwise, there may be other potential bugs." That is something should never happen, IMHO. |
| Comment by nasf (Inactive) [ 08/Jul/14 ] |
|
Here is the patch to drop the redundant ibits lock interoperability check: |
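(The patch link itself was not captured above.) Conceptually, dropping the interoperability check means the client stops consulting OBD_CONNECT_IBITS when building IBITS locks. A rough sketch of the end state, not the landed patch verbatim:

/* Rough sketch of the cleanup, not the landed patch verbatim. */

/* In ldlm_cli_enqueue(): always take the caller's IBITS policy; the
 * OBD_CONNECT_IBITS special case (which read possibly uninitialised
 * export connect flags) is gone. */
if (policy != NULL)
        lock->l_policy_data = *policy;

/* In ldlm_lock2desc(): the LDLM_IBITS -> LDLM_PLAIN downgrade branch,
 * including the LASSERTF from this ticket, is removed entirely. */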
| Comment by nasf (Inactive) [ 10/Jul/14 ] |
|
The patch has been landed to master. |
| Comment by Jian Yu [ 04/Dec/14 ] |
|
While verifying patch http://review.whamcloud.com/12606 on the Lustre b2_5 branch, the same failure occurred:
LustreError: 19407:0:(ldlm_lock.c:669:ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed: Inappropriate inode lock bits during conversion 3
LustreError: 19407:0:(ldlm_lock.c:669:ldlm_lock2desc()) LBUG
Pid: 19407, comm: stat
Call Trace:
[<ffffffffa03d0895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa03d0e97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa067a239>] ldlm_lock2desc+0x179/0x180 [ptlrpc]
[<ffffffffa068cb90>] ldlm_cli_enqueue+0x1f0/0x790 [ptlrpc]
[<ffffffffa06b29ea>] ? ptlrpc_request_set_replen+0x3a/0x60 [ptlrpc]
[<ffffffffa06919d0>] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
[<ffffffffa08ded8e>] mdc_enqueue+0x2be/0x1a10 [mdc]
[<ffffffffa01bc294>] ? fld_client_rpc+0x864/0xed0 [fld]
[<ffffffffa08e06dd>] mdc_intent_lock+0x1fd/0x64a [mdc]
[<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
[<ffffffffa06919d0>] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[<ffffffffa00e58b8>] ? lprocfs_counter_add+0x1a8/0x1d6 [lvfs]
[<ffffffffa08a846e>] lmv_intent_remote+0x47e/0xa80 [lmv]
[<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
[<ffffffffa08a9137>] lmv_intent_lookup+0x6c7/0x700 [lmv]
[<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
[<ffffffffa08a9d6a>] lmv_intent_lock+0x32a/0x380 [lmv]
[<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
[<ffffffffa0a81d0e>] ? ll_i2gids+0x2e/0xd0 [lustre]
[<ffffffffa0a68c0d>] ? ll_prep_md_op_data+0x10d/0x3b0 [lustre]
[<ffffffffa0a8491f>] ll_lookup_it+0x33f/0xb00 [lustre]
[<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
[<ffffffffa0a50c51>] ? __ll_inode_revalidate_it+0x1e1/0xc30 [lustre]
[<ffffffffa0a8535f>] ll_lookup_nd+0x27f/0x3f0 [lustre]
[<ffffffff811a42fe>] ? d_alloc+0x13e/0x1b0
[<ffffffff81198a35>] do_lookup+0x1a5/0x230
[<ffffffff81199100>] __link_path_walk+0x200/0x1000
[<ffffffff8114a3d7>] ? handle_pte_fault+0xf7/0xb00
[<ffffffff8119a1ba>] path_walk+0x6a/0xe0
[<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
[<ffffffff8119b4f7>] user_path_at+0x57/0xa0
[<ffffffff8104a98c>] ? __do_page_fault+0x1ec/0x480
[<ffffffff8118e990>] vfs_fstatat+0x50/0xa0
[<ffffffff811515a5>] ? do_mmap_pgoff+0x335/0x380
[<ffffffff8118ea4e>] vfs_lstat+0x1e/0x20
[<ffffffff8118ea74>] sys_newlstat+0x24/0x50
[<ffffffff810e1e07>] ? audit_syscall_entry+0x1d7/0x200
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Kernel panic - not syncing: LBUG
Maloo report: https://testing.hpdd.intel.com/test_sets/f4332d04-7af0-11e4-956d-5254006e85c2
Hi Nasf, could you please take a look and check whether this is a regression introduced by the patch http://review.whamcloud.com/12606 or not? If not, then this is an issue on the Lustre b2_5 branch and I'll back-port your patch to fix it. Thank you! |
| Comment by nasf (Inactive) [ 04/Dec/14 ] |
|
Yujian, I do not think http://review.whamcloud.com/#/c/12606/ caused the trouble. You will need to back-port the patch(es) to fix it. |
| Comment by Jian Yu [ 05/Dec/14 ] |
|
Thank you, Nasf. Will do. |
| Comment by Jian Yu [ 05/Dec/14 ] |
|
One more instance on Lustre b2_5 branch: |
| Comment by Gerrit Updater [ 05/Dec/14 ] |
|
Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/12960 |