[LU-12343] osc_lock.c:687:osc_ldlm_weigh_ast()) LBUG Created: 27/May/19 Updated: 04/Sep/19 Resolved: 21/Aug/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Shuichi Ihara | Assignee: | Patrick Farrell (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | ORNL | ||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
During an mdtest run from 10 clients against a directory configured with DNE2/DoM, a few clients crashed at the same time with the LBUG below.

[403683.467928] LustreError: 11-0: cache1-MDT0000-mdc-ffff89de6e235000: operation mds_connect to node 10.0.10.175@o2ib10 failed: rc = -16
[403709.580276] Lustre: Mounted cache1-client
[403807.677757] LustreError: 192778:0:(osc_lock.c:687:osc_ldlm_weigh_ast()) ASSERTION( dlmlock->l_resource->lr_type == LDLM_EXTENT || ldlm_has_dom(dlmlock) ) failed:
[403807.681557] LustreError: 192778:0:(osc_lock.c:687:osc_ldlm_weigh_ast()) LBUG
[403807.683405] Pid: 192778, comm: mdtest 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 2019
[403807.683407] Call Trace:
[403807.683419] [<ffffffffc08867cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[403807.683446] [<ffffffffc088687c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[403807.683452] [<ffffffffc0d4b947>] osc_ldlm_weigh_ast+0x377/0x3a0 [osc]
[403807.683474] [<ffffffffc0d9aa91>] mdc_cancel_weight+0xe1/0x130 [mdc]
[403807.683484] [<ffffffffc0bb10d1>] ldlm_cancel_no_wait_policy+0x51/0x80 [ptlrpc]
[403807.683527] [<ffffffffc0bb1118>] ldlm_cancel_aged_no_wait_policy+0x18/0x70 [ptlrpc]
[403807.683546] [<ffffffffc0bb25ba>] ldlm_prepare_lru_list+0x1fa/0x4c0 [ptlrpc]
[403807.683565] [<ffffffffc0bb5f1a>] ldlm_cancel_lru_local+0x1a/0x30 [ptlrpc]
[403807.683584] [<ffffffffc0bb614e>] ldlm_prep_elc_req+0x21e/0x490 [ptlrpc]
[403807.683605] [<ffffffffc0bb63e8>] ldlm_prep_enqueue_req+0x28/0x30 [ptlrpc]
[403807.683624] [<ffffffffc0da9f55>] mdc_intent_getattr_pack.isra.15+0xa5/0x3a0 [mdc]
[403807.683632] [<ffffffffc0dace12>] mdc_enqueue_base+0x532/0x1500 [mdc]
[403807.683640] [<ffffffffc0dae545>] mdc_intent_lock+0x135/0x560 [mdc]
[403807.683649] [<ffffffffc0deb962>] lmv_intent_lock+0x472/0xaf0 [lmv]
[403807.683659] [<ffffffffc0e4760a>] ll_lookup_it+0x3aa/0x1910 [lustre]
[403807.683685] [<ffffffffc0e49fbb>] ll_lookup_nd+0xbb/0x190 [lustre]
[403807.683696] [<ffffffffa9a4c5b3>] lookup_real+0x23/0x60
[403807.683702] [<ffffffffa9a4cfd2>] __lookup_hash+0x42/0x60
[403807.683705] [<ffffffffa9a538bc>] do_unlinkat+0x14c/0x2d0
[403807.683710] [<ffffffffa9a549d6>] SyS_unlink+0x16/0x20
[403807.683715] [<ffffffffa9f75ddb>] system_call_fastpath+0x22/0x27
[403807.683722] [<ffffffffffffffff>] 0xffffffffffffffff
[403807.683756] Kernel panic - not syncing: LBUG
[403807.685577] CPU: 7 PID: 192778 Comm: mdtest Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.10.1.el7.x86_64 #1
[403807.689195] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS SE5C610.86B.01.01.0027.071020182329 07/10/2018
[403807.691049] Call Trace:
[403807.692858] [<ffffffffa9f62e41>] dump_stack+0x19/0x1b
[403807.694647] [<ffffffffa9f5c550>] panic+0xe8/0x21f
[403807.696419] [<ffffffffc08868cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[403807.698183] [<ffffffffc0d4b947>] osc_ldlm_weigh_ast+0x377/0x3a0 [osc]
[403807.699954] [<ffffffffc0d9aa91>] mdc_cancel_weight+0xe1/0x130 [mdc]
[403807.701740] [<ffffffffc0bb10d1>] ldlm_cancel_no_wait_policy+0x51/0x80 [ptlrpc]
[403807.703481] [<ffffffffc0bb1118>] ldlm_cancel_aged_no_wait_policy+0x18/0x70 [ptlrpc]
[403807.706888] [<ffffffffc0bb25ba>] ldlm_prepare_lru_list+0x1fa/0x4c0 [ptlrpc]
[403807.710249] [<ffffffffc0bb5f1a>] ldlm_cancel_lru_local+0x1a/0x30 [ptlrpc]
[403807.711919] [<ffffffffc0bb614e>] ldlm_prep_elc_req+0x21e/0x490 [ptlrpc]
[403807.713536] [<ffffffffc0bb63e8>] ldlm_prep_enqueue_req+0x28/0x30 [ptlrpc]
[403807.715110] [<ffffffffc0da9f55>] mdc_intent_getattr_pack.isra.15+0xa5/0x3a0 [mdc]
[403807.716683] [<ffffffffc0dace12>] mdc_enqueue_base+0x532/0x1500 [mdc]
[403807.721334] [<ffffffffc0dae545>] mdc_intent_lock+0x135/0x560 [mdc]
[403807.727177] [<ffffffffc0deb962>] lmv_intent_lock+0x472/0xaf0 [lmv]
[403807.735265] [<ffffffffc0e4760a>] ll_lookup_it+0x3aa/0x1910 [lustre]
[403807.744991] [<ffffffffc0e49fbb>] ll_lookup_nd+0xbb/0x190 [lustre]
[403807.746080] [<ffffffffa9a4c5b3>] lookup_real+0x23/0x60
[403807.747136] [<ffffffffa9a4cfd2>] __lookup_hash+0x42/0x60
[403807.748161] [<ffffffffa9a538bc>] do_unlinkat+0x14c/0x2d0
[403807.750112] [<ffffffffa9a549d6>] SyS_unlink+0x16/0x20
[403807.751057] [<ffffffffa9f75ddb>] system_call_fastpath+0x22/0x27 |
| Comments |
| Comment by Gerrit Updater [ 27/May/19 ] |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34966 |
| Comment by Patrick Farrell (Inactive) [ 27/May/19 ] |
|
I decided to take a quick stab at fixing this one... We'll see what Mike thinks. |
| Comment by James A Simmons [ 17/Jul/19 ] |
|
We just hit this on our 2.12 production system. |
| Comment by Gerrit Updater [ 21/Aug/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34966/ |
| Comment by Peter Jones [ 21/Aug/19 ] |
|
Landed for 2.13 |
| Comment by Gerrit Updater [ 22/Aug/19 ] |
|
James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/35858 |
| Comment by Gerrit Updater [ 04/Sep/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35858/ |