[LU-2119] lu_device_fini(): ASSERTION( cfs_atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
Created: 09/Oct/12  Updated: 11/Oct/12  Resolved: 11/Oct/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Major
Reporter: Li Wei (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: zfs

Severity: 3
Rank (Obsolete): 5120

 Description   

See also LU-2070.

https://maloo.whamcloud.com/test_sets/d45314ce-11ff-11e2-a663-52540035b04c

From the OSS console log:

01:55:36:LustreError: 15560:0:(obd_mount.c:2107:server_put_super()) lustre-OST0000: Fail to disconnect osp-on-ost!
01:55:36:LustreError: 15560:0:(obd_mount.c:2137:server_put_super()) no obd lustre-OST0000
01:55:36:LustreError: 15560:0:(obd_mount.c:1415:lustre_stop_osp()) Can not find osp-on-ost lustre-MDT0000-osp-OST0000
01:55:36:LustreError: 15560:0:(obd_mount.c:2152:server_put_super()) lustre-OST0000: Fail to stop osp-on-ost!
01:55:37:LustreError: 15159:0:(lu_object.c:1114:lu_device_fini()) ASSERTION( cfs_atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
01:55:37:LustreError: 15159:0:(lu_object.c:1114:lu_device_fini()) LBUG
01:55:37:Pid: 15159, comm: obd_zombid
01:55:37:
01:55:37:Call Trace:
01:55:37: [<ffffffffa0c9a905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
01:55:37: [<ffffffffa0c9af17>] lbug_with_loc+0x47/0xb0 [libcfs]
01:55:37: [<ffffffffa06d96fc>] lu_device_fini+0xcc/0xd0 [obdclass]
01:55:37: [<ffffffffa06dfefe>] dt_device_fini+0xe/0x10 [obdclass]
01:55:37: [<ffffffffa0f44a43>] osd_device_free+0x173/0x2d0 [osd_zfs]
01:55:37: [<ffffffffa06b769d>] class_decref+0x46d/0x590 [obdclass]
01:55:37: [<ffffffffa0693ce9>] obd_zombie_impexp_cull+0x309/0x610 [obdclass]
01:55:37: [<ffffffffa06940b5>] obd_zombie_impexp_thread+0xc5/0x1c0 [obdclass]
01:55:37: [<ffffffff81060250>] ? default_wake_function+0x0/0x20
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffff8100c14a>] child_rip+0xa/0x20
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffff8100c140>] ? child_rip+0x0/0x20
01:55:37:
01:55:37:Kernel panic - not syncing: LBUG
01:55:37:Pid: 15159, comm: obd_zombid Tainted: P           ---------------    2.6.32-279.5.1.el6_lustre.ga194b25.x86_64 #1
01:55:37:Call Trace:
01:55:37: [<ffffffff814fd58a>] ? panic+0xa0/0x168
01:55:37: [<ffffffffa0c9af6b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
01:55:37: [<ffffffffa06d96fc>] ? lu_device_fini+0xcc/0xd0 [obdclass]
01:55:37: [<ffffffffa06dfefe>] ? dt_device_fini+0xe/0x10 [obdclass]
01:55:37: [<ffffffffa0f44a43>] ? osd_device_free+0x173/0x2d0 [osd_zfs]
01:55:37: [<ffffffffa06b769d>] ? class_decref+0x46d/0x590 [obdclass]
01:55:37: [<ffffffffa0693ce9>] ? obd_zombie_impexp_cull+0x309/0x610 [obdclass]
01:55:37: [<ffffffffa06940b5>] ? obd_zombie_impexp_thread+0xc5/0x1c0 [obdclass]
01:55:37: [<ffffffff81060250>] ? default_wake_function+0x0/0x20
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffffa0693ff0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
01:55:37: [<ffffffff8100c140>] ? child_rip+0x0/0x20


 Comments   
Comment by Li Wei (Inactive) [ 09/Oct/12 ]

https://maloo.whamcloud.com/test_sets/2b0033cc-1202-11e2-a663-52540035b04c

Comment by Li Wei (Inactive) [ 09/Oct/12 ]

https://maloo.whamcloud.com/test_sets/22c55996-1205-11e2-a663-52540035b04c

Comment by Johann Lombardi (Inactive) [ 09/Oct/12 ]

Niu, could you please have a look?

Comment by Niu Yawei (Inactive) [ 09/Oct/12 ]

We now allocate the qsd (which holds a refcount on the osd) in osd_device_alloc(), but osd_device_fini() (ZFS only) does not call osd_shutdown() to clean up the qsd. That causes trouble on the abnormal shutdown path: for instance, if OFD/MDT was never started, osd_process_config()->osd_shutdown() is never called, so the qsd's reference on the osd is still held when lu_device_fini() checks ld_ref.

I'll post a patch soon.

Comment by Niu Yawei (Inactive) [ 09/Oct/12 ]

http://review.whamcloud.com/4239

Comment by Niu Yawei (Inactive) [ 11/Oct/12 ]

Patch landed.
