[LU-4971] sanity-scrub test_2: ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed Created: 28/Apr/14  Updated: 08/Feb/17  Resolved: 10/Jul/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.5.4
Fix Version/s: Lustre 2.6.0

Type: Bug Priority: Critical
Reporter: Maloo Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-5273 Test failure on test suite sanity-scr... Closed
is duplicated by LU-5272 Test failure on test suite sanity-scr... Resolved
Related
is related to LU-7975 "(lod_object.c:700:lod_ah_init()) ASS... Resolved
Severity: 3
Rank (Obsolete): 13763

 Description   

This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

This issue relates to the following test suite run:
http://maloo.whamcloud.com/test_sets/598caeaa-cd2c-11e3-b548-52540035b04c
https://maloo.whamcloud.com/test_sets/fc94ece0-a552-11e3-9fee-52540035b04c

The sub-test test_2 failed with the following error:

test failed to respond and timed out

Info required for matching: sanity-scrub 2



 Comments   
Comment by nasf (Inactive) [ 29/Apr/14 ]

Some logs from the client side:

23:56:52:LustreError: 23554:0:(ldlm_lock.c:671:ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed: Inappropriate inode lock bits during conversion 3
23:56:52:LustreError: 23554:0:(ldlm_lock.c:671:ldlm_lock2desc()) LBUG
23:56:52:Pid: 23554, comm: diff
23:56:52:
23:56:52:Call Trace:
23:56:52: [<ffffffffa03d0895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
23:56:52: [<ffffffffa03d0e97>] lbug_with_loc+0x47/0xb0 [libcfs]
23:56:52: [<ffffffffa08dc239>] ldlm_lock2desc+0x179/0x180 [ptlrpc]
23:56:52: [<ffffffffa08eed90>] ldlm_cli_enqueue+0x1f0/0x790 [ptlrpc]
23:56:52: [<ffffffffa0914c0a>] ? ptlrpc_request_set_replen+0x3a/0x60 [ptlrpc]
23:56:52: [<ffffffffa08f3bd0>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa0a3d88e>] mdc_enqueue+0x2be/0x1b00 [mdc]
23:56:52: [<ffffffffa08e25e5>] ? ldlm_lock_match+0x215/0x8b0 [ptlrpc]
23:56:52: [<ffffffffa0a3f2ce>] mdc_intent_lock+0x1fe/0x63f [mdc]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa08f3bd0>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc]
23:56:52: [<ffffffffa053ba28>] ? lprocfs_counter_add+0x1a8/0x1c0 [obdclass]
23:56:52: [<ffffffffa082eae1>] lmv_intent_remote+0x481/0xa90 [lmv]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa082fd49>] lmv_intent_lookup+0xc59/0xd00 [lmv]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa03d027b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
23:56:52: [<ffffffffa0830ada>] lmv_intent_lock+0x32a/0x380 [lmv]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa1325ed8>] ? ll_prep_md_op_data+0x1a8/0x490 [lustre]
23:56:52: [<ffffffffa1344d79>] ll_lookup_it+0x299/0xb50 [lustre]
23:56:52: [<ffffffffa13434d0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
23:56:52: [<ffffffffa134589f>] ll_lookup_nd+0x26f/0x3e0 [lustre]
23:56:52: [<ffffffff811a394e>] ? d_alloc+0x13e/0x1b0
23:56:52: [<ffffffff81198a25>] do_lookup+0x1a5/0x230
23:56:52: [<ffffffff81198db0>] __link_path_walk+0x200/0xff0
23:56:52: [<ffffffff8114a647>] ? handle_pte_fault+0xf7/0xb00
23:56:52: [<ffffffff81199e5a>] path_walk+0x6a/0xe0
23:56:52: [<ffffffff8119a06b>] filename_lookup+0x6b/0xc0
23:56:52: [<ffffffff81196996>] ? final_putname+0x26/0x50
23:56:52: [<ffffffff8119b197>] user_path_at+0x57/0xa0
23:56:52: [<ffffffff8104a98c>] ? __do_page_fault+0x1ec/0x480
23:56:52: [<ffffffff812826f5>] ? _atomic_dec_and_lock+0x55/0x80
23:56:52: [<ffffffff811aaa10>] ? mntput_no_expire+0x30/0x110
23:56:52: [<ffffffff8118e7a4>] ? cp_new_stat+0xe4/0x100
23:56:52: [<ffffffff8118e9e0>] vfs_fstatat+0x50/0xa0
23:56:52: [<ffffffff8118eb5b>] vfs_stat+0x1b/0x20
23:56:52: [<ffffffff8118eb84>] sys_newstat+0x24/0x50
23:56:52: [<ffffffff810e2057>] ? audit_syscall_entry+0x1d7/0x200
23:56:52: [<ffffffff810e1e4e>] ? __audit_syscall_exit+0x25e/0x290
23:56:52: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Comment by Oleg Drokin [ 29/Apr/14 ]

So it appears that the client-side exp_connect_flags is somehow corrupted, because we have had no servers that do not support inode bits for many years now. Not since 1.4.x, I suspect.

Comment by Andreas Dilger [ 30/Apr/14 ]

It is strange that the LASSERT() is on the client, but LFSCK is running on the server (only some parts of the test scripts are setting up the environment on the client).

Oleg notes that this problem indicates that the client is trying to use an IBITS lock (as it should) but it thinks the server does not support this (due to missing connect feature flags). That might indicate some problem or corruption with the connection state.
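For reference, the code path being discussed is the old inodebits interoperability handling in the ldlm client. Below is a condensed sketch, pieced together from the LASSERT in the log above and the ldlm_cli_enqueue() fragment quoted later in this ticket, so treat it as an illustration rather than the verbatim source:

/* Condensed sketch of the INODEBITS interop path; not the verbatim source. */

/* ldlm_cli_enqueue(): if the export does not advertise inodebits support,
 * fall back to the legacy bit set instead of the caller's policy. */
if (einfo->ei_type == LDLM_IBITS &&
    !(exp_connect_flags(exp) & OBD_CONNECT_IBITS))
        lock->l_policy_data.l_inodebits.bits =
                MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE;
else
        lock->l_policy_data = *policy;

/* ldlm_lock2desc(): on the same "no inodebits" path the lock is converted
 * to a plain lock, guarded by the LASSERT seen in the log above. */
if (lock->l_resource->lr_type == LDLM_IBITS &&
    !(exp_connect_flags(exp) & OBD_CONNECT_IBITS))
        LASSERTF(lock->l_policy_data.l_inodebits.bits ==
                 (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE |
                  MDS_INODELOCK_LAYOUT),
                 "Inappropriate inode lock bits during conversion %llu\n",
                 (unsigned long long)lock->l_policy_data.l_inodebits.bits);

If the export's connect flags read as zero (not yet filled in, or corrupted), even a modern client/server pair is pushed down this legacy branch, and the 0x3 bits set by the enqueue path can never satisfy the LOOKUP | UPDATE | LAYOUT check in the descriptor path.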

Comment by nasf (Inactive) [ 30/Apr/14 ]

Yes, I also suspect that it is not related to OI scrub; instead, it might be caused by some connection race. I remember that we saw similar connect-flags issues involving client_is_remote() before at the LLNL site, but I cannot remember the details...

Comment by Hongchao Zhang [ 27/Jun/14 ]

https://testing.hpdd.intel.com/test_sets/099fd572-fd95-11e3-b84d-5254006e85c2

Comment by James Nunez (Inactive) [ 30/Jun/14 ]

sanity-scrub test 4 is displaying this error and crashing the client node.

I'm not sure why so few logs were produced, but here are the logs from the client node: https://testing.hpdd.intel.com/test_sessions/77df2c0e-0092-11e4-a42b-5254006e85c2

Comment by James Nunez (Inactive) [ 02/Jul/14 ]

I've run sanity-scrub four times on the OpenSFS cluster, and it fails with this error each time, in test 2, 4, or 5.

I've run with one, two, and four clients; two MDSs, each with two MDTs; and four OSSs, each with two OSTs, using Lustre 2.5.60 build #2535.

Comment by nasf (Inactive) [ 03/Jul/14 ]

A debug patch to check the export status when the assertion triggers:

http://review.whamcloud.com/#/c/10958/

Comment by James Nunez (Inactive) [ 03/Jul/14 ]

I've run sanity-scrub twice with the debug patch and only hit this error in one of the runs. Here is the output from the debug message for a single-client run that failed in sanity-scrub test 2:

...
restore data
restore EA
/usr/lib64/lustre/tests
remove recovery logs
removed `/lustre/brpt/CATALOGS'
starting MDTs without disabling OI scrub
Starting client: c17: -o user_xattr,flock mds01@o2ib:/scratch /lustre/scratch

Message from syslogd@c17 at Jul  3 11:36:24 ...
 kernel:LustreError: 22841:0:(ldlm_lock.c:672:ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed: Inappropriate inode lock bits during conversion, bits 0x3, exp ffff880827d6dc00, flags 0x0

Message from syslogd@c17 at Jul  3 11:36:24 ...
 kernel:LustreError: 22841:0:(ldlm_lock.c:672:ldlm_lock2desc()) LBUG
Comment by Andreas Dilger [ 04/Jul/14 ]

Hmm, it has bits 0x3 set, so it is missing the LAYOUT flag. That would be possible if the file is being migrated, or maybe if it is a remote MDT object? I suspect that the LASSERT is wrong here, but I'm not positive yet as I haven't looked closely at the surrounding code.

Comment by nasf (Inactive) [ 04/Jul/14 ]

The bits may actually be right: since the code earlier in the path cannot find OBD_CONNECT_IBITS in the connect flags, the LAYOUT ibit is dropped automatically.

Here the connect flags are reported as "0x0", which is impossible for any established connection, so there is probably data corruption.

Comment by Andreas Dilger [ 04/Jul/14 ]

Since this only happens during sanity-scrub, I expect there is some kind of memory corruption or use-after-free problem in the code. It might be useful to run sanity-scrub on a kernel with CONFIG_DEBUG_SLAB and other memory-debugging features (CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_OBJECTS, CONFIG_DEBUG_KMEMLEAK, CONFIG_DEBUG_MUTEXES, CONFIG_DEBUG_KOBJECT) enabled to see whether that triggers a problem.

Comment by nasf (Inactive) [ 05/Jul/14 ]

I have run the related tests locally hundreds of times, but cannot reproduce the failure myself. So I have gone through all the related code. Finally, I found that it may not be related to data corruption; it looks more like a race condition between the asynchronous connection and the synchronous lock_enqueue operation. Consider the following scenario:

1) There are two MDTs in the system. When the client mounts, it triggers an MDS_CONNECT request to each MDT, but the connect requests are handled asynchronously. The mount processing goes ahead once the connection between the client and MDT0 is established; it does not wait for the connection between the client and MDT1 to be established.

2) After the client mount returns, the application tries to "stat $MOUNT/a/b". The directory "a" resides on MDT0 and "b" resides on MDT1, so the client first sends the lock_enqueue RPC to MDT0.

3) The RPC handler on MDT0 finds that the target object "b" resides on the remote MDT1, so it replies to the client, telling it to talk to MDT1.

4) The client gets MDT0's reply and calls lmv_intent_remote() to forward the request to MDT1 via MDC1, but at that time the connection between the client and MDT1 has not been established yet.

5) For most other RPCs the above case is not a problem, because the client-side RPC sender (in ptlrpc) handles all kinds of connection issues. But ldlm_enqueue is somewhat different: before handing the ldlm_enqueue RPC over to the sender (to ptlrpc), the caller checks the export connect flags, and those are not set until the connection between the client and the MDT has been established. So at that point the caller of the ldlm_enqueue request reads "0" from the export connect flags and then triggers the ASSERTION described in this ticket.

So this is not a bug specific to OI scrub, but a more general issue in the Lustre framework. Because sanity-scrub accesses cross-MDT objects immediately after the client mount (without any wait), it is simply easier to reproduce the failure there.

As for the fix, I do not want to change the general connection/ldlm framework, because I am afraid of introducing potential bugs. Instead, the patch makes the ldlm_enqueue request wait for the connection to be established in such a race case, as sketched below.
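A rough sketch of that idea follows; the helper names are hypothetical and only show the shape of the change, the actual patch is http://review.whamcloud.com/10958:

/* Hedged sketch only: the helpers below are hypothetical, not the real
 * patch.  The point is to avoid reading exp_connect_flags(exp) in
 * ldlm_cli_enqueue() before the connect reply from this target has been
 * processed. */
if (einfo->ei_type == LDLM_IBITS &&
    !obd_connect_flags_are_valid(exp)) {        /* hypothetical check */
        rc = obd_wait_for_connect(exp);         /* hypothetical: block until
                                                 * the import is FULL or the
                                                 * connect attempt fails */
        if (rc != 0)
                RETURN(rc);
}
/* Only now is it safe to look at exp_connect_flags(exp). */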

Comment by nasf (Inactive) [ 05/Jul/14 ]

James, would you please verify the new patch? Thanks!
http://review.whamcloud.com/#/c/10958/

Comment by Di Wang [ 05/Jul/14 ]

Hmm, I think this is a problem for a single-MDT filesystem too, i.e. ldlm_cli_enqueue should not use the export connect flags before it has made sure the connection is established?
This inodebits connect flag was added even before 1.6, I guess? At least this inodebits interop code was added in 2007:

commit 113303973ec9f8484eb2355a1a6ef3c4c7fd6a56
Author: nathan <nathan>
Date:   Sat Feb 10 06:33:41 2007 +0000

    land b1_5 onto HEAD

I think that code uses the export connect flags in the wrong way, so we could probably just remove it to save some trouble here, since I do not think we plan to let a 2.6 Lustre client connect to a server without inodebits support (1.4?).

Comment by nasf (Inactive) [ 05/Jul/14 ]

No, in the single-MDT case there is no such race condition, because the mount process first calls md_getstatus() on the ROOT object, which is NOT an ldlm_enqueue RPC. So by the time the mount returns to user space, it is guaranteed that the connection between the client and MDT0 has been established successfully.

Comment by Di Wang [ 07/Jul/14 ]

Ah, I see. Hmm, I still think removing the incompatible inodebits handling is a better choice, or could we do something like this:

[root@testnode lustre-release]# git diff lustre/ldlm/
diff --git a/lustre/ldlm/ldlm_request.c b/lustre/ldlm/ldlm_request.c
index 76f329d..2c0d05d 100644
--- a/lustre/ldlm/ldlm_request.c
+++ b/lustre/ldlm/ldlm_request.c
@@ -909,7 +909,7 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp,
                              OBD_CONNECT_IBITS))
                                 lock->l_policy_data.l_inodebits.bits =
                                         MDS_INODELOCK_LOOKUP |
-                                        MDS_INODELOCK_UPDATE;
+                                        MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT;
                         else
                                 lock->l_policy_data = *policy;
                 }
Comment by nasf (Inactive) [ 07/Jul/14 ]

Wangdi, I have verified the race condition locally, but I am not sure whether removing the incompatible-bits handling is a suitable solution or not, because the key issue is that we are checking uninitialized export flags.

Comment by Di Wang [ 07/Jul/14 ]

"Because the key issue is that we are checking uninitialized export flags."

Yes, but that check is only for determining whether the server supports inodebits locks. Since a 2.6 client will only ever be connected to inodebits-enabled servers, why do we need this check at all? TBH, I really do not like adding "connection wait logic" above ptlrpc, which might bring us trouble or worse.

Comment by nasf (Inactive) [ 07/Jul/14 ]

Even if we drop the incompatible ibits check, we still need to prevent users from using a non-initialised export; otherwise there may be other potential bugs.

Comment by James Nunez (Inactive) [ 07/Jul/14 ]

I've tested the patch at http://review.whamcloud.com/#/c/10958/ (version 2) and, after running sanity-scrub three times, I have not seen this assertion.

The patch fixes the crash/error for the configuration I've been testing with.

Comment by nasf (Inactive) [ 07/Jul/14 ]

Thanks James!

Comment by Di Wang [ 07/Jul/14 ]

"Even though we drop the incompatible ibits, we still needs to prevent the user to use the non-initialised export, otherwise, there may be other potential bugs."

That is something that should never happen, IMHO.

Comment by nasf (Inactive) [ 08/Jul/14 ]

Here is the patch to drop the redundant ibits lock interoperability check:

http://review.whamcloud.com/11004
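In effect the cleanup removes the OBD_CONNECT_IBITS special case shown in Wangdi's diff above, so the enqueue path no longer consults the connect flags at all. Roughly, as a sketch of the resulting shape rather than the verbatim patch:

/* Sketch of ldlm_cli_enqueue() with the INODEBITS interop check dropped;
 * see http://review.whamcloud.com/11004 for the real change. */
if (policy != NULL)
        lock->l_policy_data = *policy;   /* always honour the caller's bits;
                                          * no OBD_CONNECT_IBITS test */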

Comment by nasf (Inactive) [ 10/Jul/14 ]

The patch has been landed to master.

Comment by Jian Yu [ 04/Dec/14 ]

While verifying patch http://review.whamcloud.com/12606 on the Lustre b2_5 branch, the same failure occurred:

LustreError: 19407:0:(ldlm_lock.c:669:ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed: Inappropriate inode lock bits during conversion 3
LustreError: 19407:0:(ldlm_lock.c:669:ldlm_lock2desc()) LBUG
Pid: 19407, comm: stat

Call Trace:
 [<ffffffffa03d0895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] 
 [<ffffffffa03d0e97>] lbug_with_loc+0x47/0xb0 [libcfs] 
 [<ffffffffa067a239>] ldlm_lock2desc+0x179/0x180 [ptlrpc] 
 [<ffffffffa068cb90>] ldlm_cli_enqueue+0x1f0/0x790 [ptlrpc] 
 [<ffffffffa06b29ea>] ? ptlrpc_request_set_replen+0x3a/0x60 [ptlrpc] 
 [<ffffffffa06919d0>] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] 
 [<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre] 
 [<ffffffffa08ded8e>] mdc_enqueue+0x2be/0x1a10 [mdc]
 [<ffffffffa01bc294>] ? fld_client_rpc+0x864/0xed0 [fld]
 [<ffffffffa08e06dd>] mdc_intent_lock+0x1fd/0x64a [mdc]
 [<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre] 
 [<ffffffffa06919d0>] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] 
 [<ffffffffa00e58b8>] ? lprocfs_counter_add+0x1a8/0x1d6 [lvfs]
 [<ffffffffa08a846e>] lmv_intent_remote+0x47e/0xa80 [lmv]
 [<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre] 
 [<ffffffffa08a9137>] lmv_intent_lookup+0x6c7/0x700 [lmv]
 [<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre] 
 [<ffffffffa08a9d6a>] lmv_intent_lock+0x32a/0x380 [lmv]
 [<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre] 
 [<ffffffffa0a81d0e>] ? ll_i2gids+0x2e/0xd0 [lustre] 
 [<ffffffffa0a68c0d>] ? ll_prep_md_op_data+0x10d/0x3b0 [lustre] 
 [<ffffffffa0a8491f>] ll_lookup_it+0x33f/0xb00 [lustre] 
 [<ffffffffa0a825c0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre] 
 [<ffffffffa0a50c51>] ? __ll_inode_revalidate_it+0x1e1/0xc30 [lustre] 
 [<ffffffffa0a8535f>] ll_lookup_nd+0x27f/0x3f0 [lustre] 
 [<ffffffff811a42fe>] ? d_alloc+0x13e/0x1b0
 [<ffffffff81198a35>] do_lookup+0x1a5/0x230
 [<ffffffff81199100>] __link_path_walk+0x200/0x1000
 [<ffffffff8114a3d7>] ? handle_pte_fault+0xf7/0xb00
 [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
 [<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
 [<ffffffff8119b4f7>] user_path_at+0x57/0xa0
 [<ffffffff8104a98c>] ? __do_page_fault+0x1ec/0x480
 [<ffffffff8118e990>] vfs_fstatat+0x50/0xa0
 [<ffffffff811515a5>] ? do_mmap_pgoff+0x335/0x380
 [<ffffffff8118ea4e>] vfs_lstat+0x1e/0x20
 [<ffffffff8118ea74>] sys_newlstat+0x24/0x50
 [<ffffffff810e1e07>] ? audit_syscall_entry+0x1d7/0x200
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Kernel panic - not syncing: LBUG

Maloo report: https://testing.hpdd.intel.com/test_sets/f4332d04-7af0-11e4-956d-5254006e85c2

Hi Nasf, could you please take a look at whether this is a regression introduced by the patch http://review.whamcloud.com/12606 or not? If not, then this is an issue on the Lustre b2_5 branch and I'll back-port your patch to fix it. Thank you!

Comment by nasf (Inactive) [ 04/Dec/14 ]

Yujian,

I do not think http://review.whamcloud.com/#/c/12606/ caused the trouble. You need to back-port the patch(es) to fix it.

Comment by Jian Yu [ 05/Dec/14 ]

Thank you, Nasf. Will do.

Comment by Jian Yu [ 05/Dec/14 ]

One more instance on the Lustre b2_5 branch:
https://testing.hpdd.intel.com/test_sets/c800bb44-5e7f-11e4-9843-5254006e85c2

Comment by Gerrit Updater [ 05/Dec/14 ]

Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/12960
Subject: LU-4971 ldlm: drop redundant ibits lock interoperability check
Project: fs/lustre-release
Branch: b2_5
Current Patch Set: 1
Commit: 2b44f94c2bfbefd9b0f85a52218979b851d7af58
