[LU-2119] lu_device_fini(): ASSERTION( cfs_atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1 Created: 09/Oct/12 Updated: 11/Oct/12 Resolved: 11/Oct/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Li Wei (Inactive) | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | zfs | ||
| Severity: | 3 |
| Rank (Obsolete): | 5120 |
| Description |
|
See also https://maloo.whamcloud.com/test_sets/d45314ce-11ff-11e2-a663-52540035b04c
From the OSS console log:
01:55:36:LustreError: 15560:0:(obd_mount.c:2107:server_put_super()) lustre-OST0000: Fail to disconnect osp-on-ost!
01:55:36:LustreError: 15560:0:(obd_mount.c:2137:server_put_super()) no obd lustre-OST0000
01:55:36:LustreError: 15560:0:(obd_mount.c:1415:lustre_stop_osp()) Can not find osp-on-ost lustre-MDT0000-osp-OST0000
01:55:36:LustreError: 15560:0:(obd_mount.c:2152:server_put_super()) lustre-OST0000: Fail to stop osp-on-ost!
01:55:37:LustreError: 15159:0:(lu_object.c:1114:lu_device_fini()) ASSERTION( cfs_atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
01:55:37:LustreError: 15159:0:(lu_object.c:1114:lu_device_fini()) LBUG
01:55:37:Pid: 15159, comm: obd_zombid
01:55:37:
01:55:37:Call Trace:
01:55:37: [<ffffffffa0c9a905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
01:55:37: [<ffffffffa0c9af17>] lbug_with_loc+0x47/0xb0 [libcfs]
01:55:37: [<ffffffffa06d96fc>] lu_device_fini+0xcc/0xd0 [obdclass]
01:55:37: [<ffffffffa06dfefe>] dt_device_fini+0xe/0x10 [obdclass]
01:55:37: [<ffffffffa0f44a43>] osd_device_free+0x173/0x2d0 [osd_zfs]
01:55:37: [<ffffffffa06b769d>] class_decref+0x46d/0x590 [obdclass]
01:55:37: [<ffffffffa0693ce9>] obd_zombie_impexp_cull+0x309/0x610 [obdclass]
01:55:37: [<ffffffffa06940b5>] obd_zombie_impexp_thread+0xc5/0x1c0 [obdclass]
01:55:37: [<ffffffff81060250>] ? default_wake_function+0x0/0x20
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffff8100c14a>] child_rip+0xa/0x20
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffff8100c140>] ? child_rip+0x0/0x20
01:55:37:
01:55:37:Kernel panic - not syncing: LBUG
01:55:37:Pid: 15159, comm: obd_zombid Tainted: P --------------- 2.6.32-279.5.1.el6_lustre.ga194b25.x86_64 #1
01:55:37:Call Trace:
01:55:37: [<ffffffff814fd58a>] ? panic+0xa0/0x168
01:55:37: [<ffffffffa0c9af6b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
01:55:37: [<ffffffffa06d96fc>] ? lu_device_fini+0xcc/0xd0 [obdclass]
01:55:37: [<ffffffffa06dfefe>] ? dt_device_fini+0xe/0x10 [obdclass]
01:55:37: [<ffffffffa0f44a43>] ? osd_device_free+0x173/0x2d0 [osd_zfs]
01:55:37: [<ffffffffa06b769d>] ? class_decref+0x46d/0x590 [obdclass]
01:55:37: [<ffffffffa0693ce9>] ? obd_zombie_impexp_cull+0x309/0x610 [obdclass]
01:55:37: [<ffffffffa06940b5>] ? obd_zombie_impexp_thread+0xc5/0x1c0 [obdclass]
01:55:37: [<ffffffff81060250>] ? default_wake_function+0x0/0x20
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffff8100c140>] ? child_rip+0x0/0x20 |
| Comments |
| Comment by Li Wei (Inactive) [ 09/Oct/12 ] |
|
https://maloo.whamcloud.com/test_sets/2b0033cc-1202-11e2-a663-52540035b04c |
| Comment by Li Wei (Inactive) [ 09/Oct/12 ] |
|
https://maloo.whamcloud.com/test_sets/22c55996-1205-11e2-a663-52540035b04c |
| Comment by Johann Lombardi (Inactive) [ 09/Oct/12 ] |
|
Niu, could you please have a look? |
| Comment by Niu Yawei (Inactive) [ 09/Oct/12 ] |
|
We now allocate the qsd (which holds a refcount on the osd) in osd_device_alloc(), but osd_device_fini() (zfs only) does not call osd_shutdown() to clean up the qsd. That causes trouble in the abnormal shutdown path, for instance when OFD/MDT was never started, so osd_process_config()->osd_shutdown() is never called and the qsd's reference is never dropped. I'll post a patch soon. |
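The imbalance described in this comment can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, none of it is the real Lustre code or the patch that landed, and the `shutdown_in_fini` flag exists only to compare the two behaviours side by side.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real Lustre structures. */
struct toy_device {
	int ref;                 /* plays the role of ld_ref */
};

struct toy_qsd {
	struct toy_device *dev;  /* device pinned by this quota slave */
};

struct toy_osd {
	struct toy_device dev;
	struct toy_qsd *qsd;
};

/* Creating the qsd takes one reference on the device. */
static struct toy_qsd *toy_qsd_init(struct toy_device *dev)
{
	struct toy_qsd *qsd = malloc(sizeof(*qsd));

	if (qsd == NULL)
		return NULL;
	qsd->dev = dev;
	dev->ref++;
	return qsd;
}

/* Destroying the qsd drops that reference again. */
static void toy_qsd_fini(struct toy_qsd *qsd)
{
	assert(qsd->dev->ref > 0);
	qsd->dev->ref--;
	free(qsd);
}

/* Models the alloc path: the qsd (and its reference) exists from here on. */
static struct toy_osd *toy_osd_alloc(void)
{
	struct toy_osd *osd = malloc(sizeof(*osd));

	if (osd == NULL)
		return NULL;
	osd->dev.ref = 0;
	osd->qsd = toy_qsd_init(&osd->dev);
	if (osd->qsd == NULL) {
		free(osd);
		return NULL;
	}
	return osd;
}

/* Idempotent shutdown: safe to call whether or not it already ran. */
static void toy_osd_shutdown(struct toy_osd *osd)
{
	if (osd->qsd != NULL) {
		toy_qsd_fini(osd->qsd);
		osd->qsd = NULL;
	}
}

/*
 * Models the teardown path. Returns the refcount seen at the point where
 * the real code asserts that it is zero.
 */
static int toy_osd_free(struct toy_osd *osd, int shutdown_in_fini)
{
	int leftover;

	if (shutdown_in_fini)
		toy_osd_shutdown(osd);
	leftover = osd->dev.ref;
	toy_osd_shutdown(osd);   /* demo only: avoid leaking the toy qsd */
	free(osd);
	return leftover;
}
```

In this model, `toy_osd_free(toy_osd_alloc(), 0)` returns 1, which mirrors the "Refcount is 1" assertion when nothing ran the shutdown step, while `toy_osd_free(toy_osd_alloc(), 1)` returns 0. Making the shutdown step idempotent means the normal path, where it has already run, is unaffected.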
| Comment by Niu Yawei (Inactive) [ 09/Oct/12 ] |
| Comment by Niu Yawei (Inactive) [ 11/Oct/12 ] |
|
Patch landed. |