[LU-3327] kernel:LustreError: 4543:0:(mdc_locks.c:143:mdc_set_lock_data()) LBUG [reproducible in my environment] Created: 13/May/13  Updated: 14/May/13  Resolved: 13/May/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Scott Sampson Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None
Environment:

both client and server rhel6.4, Lustre: Build Version: v2_3_65-1dkms-1-PRISTINE-2.6.32-358.el6.x86_64
zfs backend


Issue Links:
Related
is related to LU-3318 mdc_set_lock_data() ASSERTION( old_in... Resolved
Severity: 3
Rank (Obsolete): 8249

 Description   

May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:184:lov_init_sub()) header@ffff88083c406ac0[0x0, 6, [0x100010000:0x6775:0x0] hash]

{ May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:184:lov_init_sub()) ....lovsub@ffff88083c406b58[0] May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:184:lov_init_sub()) ....osc@ffff88083d3fb728id: 0x0:26485 idx: 1 gen: 0 kms_valid: 1 kms 10752 rc: 0 force_sync: 0 min_xid: 0 size: 10752 mtime: 1368424271 atime: 0 ctime: 1368424271 blocks: 1 May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:184:lov_init_sub()) }

header@ffff88083c406ac0
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:184:lov_init_sub()) stripe 0 is already owned.
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) header@ffff88083c40ebc8[0x0, 5, [0x200000404:0x154:0x0] hash]{
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) ....vvp@ffff88083c40ec60(+ 0 0) inode: ffff88083c420bf8 144115205322834260/33554436 100640 1 1 ffff88083c40ec60 [0x200000404:0x154:0x0]
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) ....lov@ffff88083c405ac0stripes: 1, valid, lsm

{ffff88084163e3c0 0x0BD10BD0 1 1 0}

:
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) header@ffff88083c406ac0[0x0, 6, [0x100010000:0x6775:0x0] hash]

{ May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) ....lovsub@ffff88083c406b58[0] May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) ....osc@ffff88083d3fb728id: 0x0:26485 idx: 1 gen: 0 kms_valid: 1 kms 10752 rc: 0 force_sync: 0 min_xid: 0 size: 10752 mtime: 1368424271 atime: 0 ctime: 1368424271 blocks: 1 May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) }

header@ffff88083c406ac0
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub())
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) } header@ffff88083c40ebc8
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:185:lov_init_sub()) owned.
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:186:lov_init_sub()) header@ffff880841756dd8[0x0, 1, [0x200000404:0x154:0x0]]
May 12 22:51:30 l2 kernel: LustreError: 4542:0:(lov_object.c:186:lov_init_sub()) try to own.
May 12 22:51:39 l2 kernel: LustreError: 4543:0:(mdc_locks.c:143:mdc_set_lock_data()) ASSERTION( old_inode->i_state & I_FREEING ) failed: Found existing inode ffff88084365e178/144115205255786850/33554436 state 0 in lock: setting data to ffff8808689361b8/144115205255786850/33554436
May 12 22:51:39 l2 kernel: LustreError: 4543:0:(mdc_locks.c:143:mdc_set_lock_data()) LBUG

Message from syslogd@l2 at May 12 22:51:39 ...
kernel:LustreError: 4543:0:(mdc_locks.c:143:mdc_set_lock_data()) ASSERTION( old_inode->i_state & I_FREEING ) failed: Found existing inode ffff88084365e178/144115205255786850/33554436 state 0 in lock: setting data to ffff8808689361b8/144115205255786850/33554436

Message from syslogd@l2 at May 12 22:51:39 ...
kernel:LustreError: 4543:0:(mdc_locks.c:143:mdc_set_lock_data()) LBUG
May 12 22:51:39 l2 kernel: Pid: 4543, comm: rm
May 12 22:51:39 l2 kernel:
May 12 22:51:39 l2 kernel: Call Trace:
May 12 22:51:39 l2 kernel: [<ffffffffa0516895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
May 12 22:51:39 l2 kernel: [<ffffffffa0516e97>] lbug_with_loc+0x47/0xb0 [libcfs]
May 12 22:51:39 l2 kernel: [<ffffffffa0a2329d>] mdc_set_lock_data+0x1fd/0x230 [mdc]
May 12 22:51:39 l2 kernel: [<ffffffffa0c88163>] lmv_set_lock_data+0x103/0x480 [lmv]
May 12 22:51:39 l2 kernel: [<ffffffffa0bb359e>] ll_lookup_it_finish+0x84e/0xe80 [lustre]
May 12 22:51:39 l2 kernel: [<ffffffffa0ca1eda>] ? lmv_intent_lock+0x31a/0x370 [lmv]
May 12 22:51:39 l2 kernel: [<ffffffffa0baf920>] ? ll_md_blocking_ast+0x0/0x750 [lustre]
May 12 22:51:39 l2 kernel: [<ffffffffa0bae9ee>] ? ll_i2gids+0x2e/0xd0 [lustre]
May 12 22:51:39 l2 kernel: [<ffffffffa0bb3fc7>] ll_lookup_it+0x3f7/0xbf0 [lustre]
May 12 22:51:39 l2 kernel: [<ffffffffa0baf920>] ? ll_md_blocking_ast+0x0/0x750 [lustre]

The log shows thousands of lines with 22:51:30 timestamp but only the last few lines are shown here. This timestamp was 9 seconds before the LBUG event and corresponded to my deleting and moving a few hundred small files. I've reproduced this about a dozen times using slightly different steps each time. I'm a new lustre user just doing testing at this point.

This issue seems similar to LU-3318.



 Comments   
Comment by Jodi Levi (Inactive) [ 13/May/13 ]

Duplicate of LU-3318

Please provide how this can be reproduced in LU-3318.
Thank you!

Comment by Peter Jones [ 14/May/13 ]

Scott

It would be great if you could chime in on LU-3318 if you have something different to add

Thanks

Peter

Generated at Sat Feb 10 01:32:58 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.