[LU-2134] mmp.sh test_1: osd_start()) ASSERTION( obd->obd_lu_dev ) failed Created: 09/Oct/12  Updated: 29/May/17  Resolved: 29/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None
Environment:

single node test system (client, MDS, OSS on dual-core x86_64 node), 2.3.53-10-g94b6f09 + patch from http://review.whamcloud.com/3715


Issue Links:
Related
is related to LU-1558 mmp.sh should be able to run so long ... Resolved
Severity: 3
Rank (Obsolete): 5136

 Description   

I hit a crash when starting up mmp.sh by itself. I had run a previous replay-dual.sh test, then unmounted it, removed all the Lustre modules, and then ran "sh mmp.sh", which appears to have tried to use the existing filesystems without reformatting.
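
For reference, the sequence amounts to roughly the following sketch (the test directory path and the lustre_rmmod module-unload helper are assumed from a standard Lustre test-framework install, not recorded in this report):

cd /usr/lib64/lustre/tests        # assumed test script location

sh replay-dual.sh                 # earlier test run; formats and uses the targets
umount /mnt/mds1                  # unmount the targets afterwards (OST likewise)
lustre_rmmod                      # unload all Lustre/LNet modules

sh mmp.sh                         # reuses the existing filesystems without reformatting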

Test output looks like:

Failover is not used on MDS, enable MMP manually
tune2fs 1.42.5.wc3 (15-Sep-2012)
Multiple mount protection has been enabled with update interval 5s.
dumpe2fs 1.42.5.wc3 (15-Sep-2012)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype mmp flex_bg dirdata sparse_super large_file huge_file uninit_bg dir_nlink quota
Failover is not used on OSS, enable MMP manually
tune2fs 1.42.5.wc3 (15-Sep-2012)
Multiple mount protection has been enabled with update interval 5s.
dumpe2fs 1.42.5.wc3 (15-Sep-2012)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype extent mmp flex_bg sparse_super large_file huge_file uninit_bg dir_nlink quota
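
The output above is the test enabling MMP by hand with tune2fs and verifying the feature with dumpe2fs. A minimal equivalent sketch, assuming the MDT device path shown later in the log and the 5s update interval reported above:

tune2fs -O mmp -E mmp_update_interval=5 /dev/vg_sookie/lvmdt1   # enable MMP feature
dumpe2fs -h /dev/vg_sookie/lvmdt1 | grep -i 'filesystem features'  # confirm "mmp" is listed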

== mmp test 1: two mounts at the same time =========================================================== 17:29:44 (1349825384)
Mounting /dev/vg_sookie/lvmdt1 on mds1...
Mounting /dev/vg_sookie/lvmdt1 on mds1...
Starting mds1:   /dev/vg_sookie/lvmdt1 /mnt/mds1
Starting mds1:   /dev/vg_sookie/lvmdt1 /mnt/mds1
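
Test 1 races two mounts of the same MDT device. A rough sketch of what the two simultaneous mount attempts amount to (the exact mount invocation and backgrounding are an assumption, not taken verbatim from mmp.sh):

mount -t lustre /dev/vg_sookie/lvmdt1 /mnt/mds1 &   # first mount attempt
mount -t lustre /dev/vg_sookie/lvmdt1 /mnt/mds1 &   # concurrent second attempt
wait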

Console looks like:

Lustre: DEBUG MARKER: Failover is not used on MDS, enable MMP manually
Lustre: DEBUG MARKER: Failover is not used on OSS, enable MMP manually
Lustre: DEBUG MARKER: == mmp test 1: two mounts at the same time =========================================================== 17:29:44 (1349825384)
Lustre: DEBUG MARKER: Mounting /dev/vg_sookie/lvmdt1 on mds1...
Lustre: DEBUG MARKER: Mounting /dev/vg_sookie/lvmdt1 on mds1...
LustreError: 28896:0:(obd_class.h:995:obd_connect()) Device 0 not setup
LustreError: 28896:0:(obd_config.c:619:class_cleanup()) Device 0 not setup
LustreError: 28896:0:(obd_mount.c:2332:osd_start()) ASSERTION( obd->obd_lu_dev ) failed:
LustreError: 28896:0:(obd_mount.c:2332:osd_start()) LBUG
Pid: 28896, comm: mount.lustre

Call Trace:
 [<ffffffffa11ed905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa11edf17>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa135ac00>] lustre_fill_super+0x1950/0x1af0 [obdclass]
 [<ffffffff8116a27c>] ? pcpu_alloc+0x3ac/0xa50
 [<ffffffff8127a01a>] ? strlcpy+0x4a/0x60
 [<ffffffff8117cd00>] ? set_anon_super+0x0/0x100
 [<ffffffffa13592b0>] ? lustre_fill_super+0x0/0x1af0 [obdclass]
 [<ffffffff8117e16f>] get_sb_nodev+0x5f/0xa0
 [<ffffffffa1344945>] lustre_get_sb+0x25/0x30 [obdclass]
 [<ffffffff8117ddcb>] vfs_kern_mount+0x7b/0x1b0
 [<ffffffff8117df72>] do_kern_mount+0x52/0x130
 [<ffffffff8119c652>] do_mount+0x2d2/0x8d0
 [<ffffffff8119cce0>] sys_mount+0x90/0xe0
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on. Opts: 
LustreError: dumping log to /tmp/lustre-log.1349825385.28896
LustreError: 28929:0:(obd_mount.c:2320:osd_start()) ASSERTION( obd ) failed: 
LustreError: 28929:0:(obd_mount.c:2320:osd_start()) LBUG
Pid: 28929, comm: mount.lustre

Call Trace:
 [<ffffffffa11ed905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa11edf17>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa135a754>] lustre_fill_super+0x14a4/0x1af0 [obdclass]
 [<ffffffff8116a27c>] ? pcpu_alloc+0x3ac/0xa50
 [<ffffffff8127a01a>] ? strlcpy+0x4a/0x60
 [<ffffffff8117cd00>] ? set_anon_super+0x0/0x100
 [<ffffffffa13592b0>] ? lustre_fill_super+0x0/0x1af0 [obdclass]
 [<ffffffff8117e16f>] get_sb_nodev+0x5f/0xa0
 [<ffffffffa1344945>] lustre_get_sb+0x25/0x30 [obdclass]
 [<ffffffff8117ddcb>] vfs_kern_mount+0x7b/0x1b0
 [<ffffffff8117df72>] do_kern_mount+0x52/0x130
 [<ffffffff8119c652>] do_mount+0x2d2/0x8d0
 [<ffffffff8119cce0>] sys_mount+0x90/0xe0
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b


 Comments   
Comment by Andreas Dilger [ 12/Oct/12 ]

I hit this bug again with the same process - running "mmp.sh" on an existing filesystem (from a clean start, no modules loaded) while trying to test the http://review.whamcloud.com/3715 patch. Running "mmp.sh" without the patch (after a reboot) didn't hit any problems. I'm not sure yet what the difference is between the two tests.

Comment by Bruno Faccini (Inactive) [ 24/Jul/14 ]

Andreas, do you agree that we can mark this ticket as a duplicate of LU-5299?
