[LU-3649] lu_device_fini()) ASSERTION( cfs_atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1 in lfsck_deregister Created: 26/Jul/13 Updated: 29/Aug/13 Resolved: 26/Aug/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0 |
| Fix Version/s: | Lustre 2.4.1, Lustre 2.5.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Oleg Drokin | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9390 |
| Description |
|
After the recent lfsck landing, sanity test 71 now crashes like this:
<0>[82524.023928] LustreError: 8247:0:(lu_object.c:1198:lu_device_fini()) ASSERTION( cfs_atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
<0>[82524.024918] LustreError: 8247:0:(lu_object.c:1198:lu_device_fini()) LBUG
<4>[82524.025406] Pid: 8247, comm: umount
<4>[82524.025859]
<4>[82524.025860] Call Trace:
<4>[82524.026587] [<ffffffffa04b68a5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4>[82524.027099] [<ffffffffa04b6ea7>] lbug_with_loc+0x47/0xb0 [libcfs]
<4>[82524.027621] [<ffffffffa05d66b8>] lu_device_fini+0xb8/0xc0 [obdclass]
<4>[82524.028162] [<ffffffffa05bb797>] ls_device_put+0x87/0x1d0 [obdclass]
<4>[82524.028627] [<ffffffffa05bba03>] local_oid_storage_fini+0x123/0x1d0 [obdclass]
<4>[82524.029370] [<ffffffffa0af3267>] lfsck_instance_cleanup+0x137/0x360 [lfsck]
<4>[82524.029917] [<ffffffffa0af5622>] lfsck_degister+0xa2/0xd0 [lfsck]
<4>[82524.030408] [<ffffffffa0d6f8ff>] ofd_device_fini+0x4f/0x240 [ofd]
<4>[82524.030925] [<ffffffffa05c8507>] class_cleanup+0x577/0xda0 [obdclass]
<4>[82524.031434] [<ffffffffa059fabc>] ? class_name2dev+0x7c/0xe0 [obdclass]
<4>[82524.031966] [<ffffffffa05c9dec>] class_process_config+0x10bc/0x1c80 [obdclass]
<4>[82524.032937] [<ffffffffa05c383c>] ? lustre_cfg_new+0x16c/0x6e0 [obdclass]
<4>[82524.033452] [<ffffffffa05c39a3>] ? lustre_cfg_new+0x2d3/0x6e0 [obdclass]
<4>[82524.034018] [<ffffffffa05cab29>] class_manual_cleanup+0x179/0x6e0 [obdclass]
<4>[82524.034549] [<ffffffffa059fabc>] ? class_name2dev+0x7c/0xe0 [obdclass]
<4>[82524.035084] [<ffffffffa0604b84>] server_put_super+0x5c4/0xed0 [obdclass]
<4>[82524.035580] [<ffffffff81183a4b>] generic_shutdown_super+0x5b/0xe0
<4>[82524.036065] [<ffffffff81183b36>] kill_anon_super+0x16/0x60
<4>[82524.036552] [<ffffffffa05cc976>] lustre_kill_super+0x36/0x60 [obdclass]
<4>[82524.037052] [<ffffffff811842d7>] deactivate_super+0x57/0x80
<4>[82524.037536] [<ffffffff811a237f>] mntput_no_expire+0xbf/0x110
<4>[82524.038026] [<ffffffff811a2dfb>] sys_umount+0x7b/0x3a0
<4>[82524.038481] [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
<4>[82524.039756]
<0>[82524.041349] Kernel panic - not syncing: LBUG
Crashdump and modules are in /exports/crashdumps/192.168.10.219-2013-07-26-15\:21\:02/
Tag in my source branch: master-20130726 |
| Comments |
| Comment by Oleg Drokin [ 28/Jul/13 ] |
|
I seem to be hitting this very frequently now on umount in all sorts of test runs. |
| Comment by nasf (Inactive) [ 28/Jul/13 ] |
|
This is the patch: |
| Comment by Oleg Drokin [ 29/Jul/13 ] |
|
patch landed |
| Comment by nasf (Inactive) [ 31/Jul/13 ] |
|
The patch is not enough; we need another fix for that: |
| Comment by nasf (Inactive) [ 21/Aug/13 ] |
|
|
| Comment by nasf (Inactive) [ 23/Aug/13 ] |
|
The patch for b2_4: |
| Comment by Peter Jones [ 26/Aug/13 ] |
|
Landed for 2.5 |