[LU-4445] lu_object.c:1199:lu_device_fini()) ASSERTION( t->ldt_device_nr > 0 ) Created: 07/Jan/14  Updated: 19/Dec/14  Resolved: 28/Feb/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.6.0

Type: Bug Priority: Critical
Reporter: nasf (Inactive) Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

I hit the following ASSERT() several times on the client side during LFSCK performance testing:

LustreError: 16887:0:(lu_object.c:1199:lu_device_fini()) ASSERTION( t->ldt_device_nr > 0 ) failed: 
LustreError: 16887:0:(lu_object.c:1199:lu_device_fini()) LBUG
Pid: 16887, comm: umount

Call Trace:
 [<ffffffffa03fc895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa03fce97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa0589434>] lu_device_fini+0x84/0xc0 [obdclass]
 [<ffffffffa0b8e778>] lov_device_free+0x38/0x250 [lov]
 [<ffffffffa058d1fe>] lu_stack_fini+0x7e/0xc0 [obdclass]
 [<ffffffffa0593b8e>] cl_stack_fini+0xe/0x10 [obdclass]
 [<ffffffffa117ce8d>] cl_sb_fini+0x6d/0x190 [lustre]
 [<ffffffffa1140364>] client_common_put_super+0x54/0x9f0 [lustre]
 [<ffffffffa1140e16>] ll_put_super+0x116/0x500 [lustre]
 [<ffffffffa116e6e3>] ? ll_destroy_inode+0xc3/0x100 [lustre]
 [<ffffffff8119d03f>] ? destroy_inode+0x2f/0x60
 [<ffffffff8119d50c>] ? dispose_list+0xfc/0x120
 [<ffffffff8119d906>] ? invalidate_inodes+0xf6/0x190
 [<ffffffff8118366b>] generic_shutdown_super+0x5b/0xe0
 [<ffffffff81183756>] kill_anon_super+0x16/0x60
 [<ffffffffa057f2fa>] lustre_kill_super+0x4a/0x60 [obdclass]
 [<ffffffff81183ef7>] deactivate_super+0x57/0x80
 [<ffffffff811a21ef>] mntput_no_expire+0xbf/0x110
 [<ffffffff811a2c5b>] sys_umount+0x7b/0x3a0
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Kernel panic - not syncing: LBUG

During the test, I mounted multiple logical clients on the same node, ran creates in parallel, and then unmounted all of the clients once the creates had finished. The LFSCK patches do not touch client-side code, so I do not know whether they caused the issue, but it is blocking the LFSCK performance test.

Jay, would you please take a look when you have time? Thanks!


Issue Links:
Related
is related to LU-6053 Review #12531 (LU-4604) failed to com... Resolved
Severity: 3
Rank (Obsolete): 12191

 Comments   
Comment by nasf (Inactive) [ 07/Jan/14 ]

I think we lack protection on lu_device_type::ldt_device_nr against concurrent updates. I will work on that.
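
To illustrate the kind of protection meant here, the following is a minimal sketch in userspace C11, not the actual patch. It assumes the counter is updated once when a device is initialized and once when it is finalized, and it replaces a plain integer with an atomic one so that mounts and umounts running at the same time cannot lose an update. The names demo_device_type, demo_device_init, demo_device_fini and demo_device_count are hypothetical; only ldt_device_nr comes from the assertion above, and the real fix may use a different mechanism (for example a lock).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct lu_device_type; only the counter is modeled. */
struct demo_device_type {
    atomic_int ldt_device_nr; /* number of live devices of this type */
};

/* Called when a device of this type is set up (for example at mount). */
static void demo_device_init(struct demo_device_type *t)
{
    /* Atomic read-modify-write: concurrent callers cannot lose an increment. */
    atomic_fetch_add(&t->ldt_device_nr, 1);
}

/* Called when a device of this type is torn down (for example at umount). */
static void demo_device_fini(struct demo_device_type *t)
{
    /* atomic_fetch_sub returns the value held before the decrement. */
    int prev = atomic_fetch_sub(&t->ldt_device_nr, 1);

    /* Same invariant as the failing assertion: a device must have been counted. */
    assert(prev > 0);
}

/* Read the current number of live devices. */
static int demo_device_count(struct demo_device_type *t)
{
    return atomic_load(&t->ldt_device_nr);
}
```

With a plain int and unsynchronized ++ and --, two umount threads on the same node could both read the same value and one update would be lost, which is one way the counter could reach zero too early and trip the ASSERTION in lu_device_fini().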

Comment by nasf (Inactive) [ 07/Jan/14 ]

Here is the patch:

http://review.whamcloud.com/#/c/8750/

Comment by nasf (Inactive) [ 28/Feb/14 ]

http://review.whamcloud.com/#/c/8750/ has been merged into 8694 and has landed on master.
