Details
-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
None
-
None
-
3
-
12020
Description
My local test runs hit this bug almost every time in test 61d of replay-single.sh:
Dec 14 13:09:17 nodez kernel: Lustre: DEBUG MARKER: == replay-single test 61d: error in llog_setup should cleanup the llog context correctly == 13:09:16 (1387012156)
Dec 14 13:09:17 nodez kernel: Lustre: Failing over lustre-MDT0000
Dec 14 13:09:17 nodez kernel: Lustre: server umount lustre-MDT0000 complete
Dec 14 13:09:17 nodez kernel: LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
Dec 14 13:09:17 nodez kernel: Lustre: *** cfs_fail_loc=605, val=0***
Dec 14 13:09:17 nodez kernel: LustreError: 8279:0:(llog_obd.c:207:llog_setup()) MGS: ctxt 0 lop_setup=ffffffffa0e26d90 failed: rc = -95
Dec 14 13:09:17 nodez kernel: LustreError: 8279:0:(obd_config.c:572:class_setup()) setup MGS failed (-95)
Dec 14 13:09:17 nodez kernel: LustreError: 8279:0:(obd_mount.c:199:lustre_start_simple()) MGS setup error -95
Dec 14 13:09:17 nodez kernel: LustreError: 8279:0:(obd_mount_server.c:134:server_deregister_mount()) MGS not registered
Dec 14 13:09:17 nodez kernel: LustreError: 15e-a: Failed to start MGS 'MGS' (-95). Is the 'mgs' module loaded?
Dec 14 13:09:17 nodez kernel: LustreError: 8279:0:(obd_mount_server.c:844:lustre_disconnect_lwp()) lustre-MDT0000-lwp-MDT0000: Can't end config log lustre-client.
Dec 14 13:09:17 nodez kernel: LustreError: 8279:0:(obd_mount_server.c:1419:server_put_super()) lustre-MDT0000: failed to disconnect lwp. (rc=-2)
Dec 14 13:09:17 nodez kernel: LustreError: 8279:0:(obd_mount_server.c:1449:server_put_super()) no obd lustre-MDT0000
Dec 14 13:09:17 nodez kernel: LustreError: 8279:0:(obd_mount_server.c:134:server_deregister_mount()) lustre-MDT0000 not registered
Dec 14 13:09:18 nodez kernel: general protection fault: 0000 [#1] SMP
Dec 14 13:09:18 nodez kernel: last sysfs file: /sys/devices/system/cpu/possible
Dec 14 13:09:18 nodez kernel: CPU 1
Dec 14 13:09:18 nodez kernel: Modules linked in: lustre ofd osp lod ost mdt mdd mgs osd_ldiskfs ldiskfs lquota lfsck obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet libcfs zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl vboxsf vboxguest [last unloaded: libcfs]
Dec 14 13:09:18 nodez kernel:
Dec 14 13:09:18 nodez kernel: Pid: 8279, comm: mount.lustre Tainted: P --------------- T 2.6.32 #0 innotek GmbH VirtualBox/VirtualBox
Dec 14 13:09:18 nodez kernel: RIP: 0010:[<ffffffffa0e46f03>] [<ffffffffa0e46f03>] lprocfs_remove_nolock+0x33/0x100 [obdclass]
Dec 14 13:09:18 nodez kernel: RSP: 0018:ffff88003d34d928 EFLAGS: 00010202
Dec 14 13:09:18 nodez kernel: RAX: ffffffffa0ec08e0 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
Dec 14 13:09:18 nodez kernel: RDX: 0000000000000000 RSI: 0000000000000030 RDI: ffff8800327b73c0
Dec 14 13:09:18 nodez kernel: RBP: 6b6b6b6b6b6b6b6b R08: 0000000000000158 R09: 0000000000000000
Dec 14 13:09:18 nodez kernel: R10: ffff880033c82a98 R11: ffff880033c829c0 R12: ffff8800327b74c8
Dec 14 13:09:18 nodez kernel: R13: 6b6b6b6b6b6b6b6b R14: 0000000000000002 R15: ffff88003c5e7aa0
Dec 14 13:09:18 nodez kernel: FS: 00007fadb4325700(0000) GS:ffff880001e80000(0000) knlGS:0000000000000000
Dec 14 13:09:18 nodez kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 14 13:09:18 nodez kernel: CR2: 00007f7e81b12ea0 CR3: 000000002b85f000 CR4: 00000000000006e0
Dec 14 13:09:18 nodez kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 14 13:09:18 nodez kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 14 13:09:18 nodez kernel: Process mount.lustre (pid: 8279, threadinfo ffff88003d34c000, task ffff88003e7547f0)
Dec 14 13:09:18 nodez kernel: Stack:
Dec 14 13:09:18 nodez kernel: ffff88003d6a2ed8 ffff880036490b78 ffff88003d6a2f80 ffff8800327b73c0
Dec 14 13:09:18 nodez kernel: <d> ffff88003d34d9d8 ffff8800327b74c8 0000000000000008 ffffffffa0e474a8
Dec 14 13:09:18 nodez kernel: <d> ffff8800327b7330 ffffffffa0660952 ffff88003d620000 ffff88003d34d9d8
Dec 14 13:09:18 nodez kernel: Call Trace:
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e474a8>] ? lprocfs_remove+0x18/0x30 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0660952>] ? qsd_fini+0x72/0x440 [lquota]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0742152>] ? osd_shutdown+0x32/0xe0 [osd_ldiskfs]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0742549>] ? osd_device_fini+0x119/0x180 [osd_ldiskfs]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e56784>] ? class_cleanup+0x804/0xd90 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e35ae0>] ? class_name2dev+0x70/0xd0 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e5b645>] ? class_process_config+0x1d45/0x2e50 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e5ca0a>] ? class_manual_cleanup+0x2ba/0xd60 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffff810e6f44>] ? cache_alloc_debugcheck_after+0x123/0x192
Dec 14 13:09:18 nodez kernel: [<ffffffff810e88bc>] ? __kmalloc+0x123/0x18e
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e5cc8d>] ? class_manual_cleanup+0x53d/0xd60 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa074a6c4>] ? osd_obd_disconnect+0x164/0x1d0 [osd_ldiskfs]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e6243d>] ? lustre_put_lsi+0x19d/0xe90 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e641d8>] ? lustre_common_put_super+0x5b8/0xbe0 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e95802>] ? server_put_super+0x172/0x2190 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e97f8d>] ? server_fill_super+0x76d/0x15c0 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e673c0>] ? lustre_fill_super+0x0/0x520 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e67598>] ? lustre_fill_super+0x1d8/0x520 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e673c0>] ? lustre_fill_super+0x0/0x520 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e673c0>] ? lustre_fill_super+0x0/0x520 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffff810f863f>] ? get_sb_nodev+0x4e/0x84
Dec 14 13:09:18 nodez kernel: [<ffffffffa0e5f52c>] ? lustre_get_sb+0x1c/0x30 [obdclass]
Dec 14 13:09:18 nodez kernel: [<ffffffff810f838d>] ? vfs_kern_mount+0x96/0x15b
Dec 14 13:09:18 nodez kernel: [<ffffffff810f84b3>] ? do_kern_mount+0x49/0xe7
Dec 14 13:09:18 nodez kernel: [<ffffffff8110dcd5>] ? do_mount+0x7a1/0x824
Dec 14 13:09:18 nodez kernel: [<ffffffff8110dde0>] ? sys_mount+0x88/0xc4
Dec 14 13:09:18 nodez kernel: [<ffffffff81008a42>] ? system_call_fastpath+0x16/0x1b
Dec 14 13:09:18 nodez kernel: Code: ec 18 48 8b 1f 48 c7 07 00 00 00 00 48 85 db 74 4c 48 81 fb 00 f0 ff ff 77 43 4c 8b 6b 48 4d 85 ed 75 08 e9 90 00 00 00 48 89 eb <48> 8b 6b 50 48 85 ed 75 f4 4c 8b 63 08 48 8b 6b 48 4c 89 e7 e8
Dec 14 13:09:18 nodez kernel: RIP [<ffffffffa0e46f03>] lprocfs_remove_nolock+0x33/0x100 [obdclass]
Dec 14 13:09:18 nodez kernel: RSP <ffff88003d34d928>
Dec 14 13:09:18 nodez kernel: ---[ end trace 5f7830ce85deef31 ]---
Dec 14 13:09:18 nodez kernel: Kernel panic - not syncing: Fatal exception
I've found that osd_device_fini() cleans things up in the wrong order: it should clean up procfs after osd_shutdown(), not before, because quota uses the osd procfs entries as well.
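The ordering problem can be illustrated with a small userspace sketch. This is not the Lustre code and not the patch from the linked ticket: every `toy_*` name and struct below is hypothetical, and the model assumes only what the report states, namely that the quota entry lives under the OSD procfs directory and that removing the directory frees everything beneath it. Under that assumption, removing the directory first leaves quota holding a pointer to freed memory, which would be consistent with the 0x6b6b6b6b6b6b6b6b slab free-poison values in RBX, RBP and R13 above.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a procfs entry; not the kernel's proc_dir_entry. */
struct toy_proc_entry {
    const char *name;
    struct toy_proc_entry *parent;
    struct toy_proc_entry *child;   /* one child is enough for this sketch */
};

/* Toy stand-in for the OSD device; field names are hypothetical. */
struct toy_osd_device {
    struct toy_proc_entry *osd_proc;  /* the OSD's own procfs directory */
    struct toy_proc_entry *qsd_proc;  /* quota entry created under osd_proc */
};

static struct toy_proc_entry *toy_proc_add(struct toy_proc_entry *parent,
                                           const char *name)
{
    struct toy_proc_entry *e = calloc(1, sizeof(*e));

    if (e == NULL)
        return NULL;
    e->name = name;
    e->parent = parent;
    if (parent != NULL)
        parent->child = e;
    return e;
}

/* Remove an entry together with its subtree and clear the caller's pointer. */
static void toy_proc_remove(struct toy_proc_entry **entryp)
{
    struct toy_proc_entry *e = *entryp;

    *entryp = NULL;
    if (e == NULL)
        return;
    if (e->parent != NULL)
        e->parent->child = NULL;    /* unlink from the parent directory */
    toy_proc_remove(&e->child);     /* the subtree goes away with the entry */
    free(e);
}

static int toy_osd_device_init(struct toy_osd_device *osd)
{
    osd->osd_proc = toy_proc_add(NULL, "osd");
    if (osd->osd_proc == NULL)
        return -1;
    osd->qsd_proc = toy_proc_add(osd->osd_proc, "quota");
    if (osd->qsd_proc == NULL) {
        toy_proc_remove(&osd->osd_proc);
        return -1;
    }
    return 0;
}

/* Quota teardown: drops the entry that quota itself registered. */
static void toy_qsd_fini(struct toy_osd_device *osd)
{
    toy_proc_remove(&osd->qsd_proc);
}

static void toy_osd_shutdown(struct toy_osd_device *osd)
{
    toy_qsd_fini(osd);
}

static void toy_osd_procfs_fini(struct toy_osd_device *osd)
{
    toy_proc_remove(&osd->osd_proc);
}

/*
 * Teardown in the order the report asks for. Swapping the two calls would
 * free the quota entry as part of the directory subtree while
 * osd->qsd_proc still points at it, so toy_qsd_fini() would then touch
 * freed memory.
 */
static void toy_osd_device_fini(struct toy_osd_device *osd)
{
    toy_osd_shutdown(osd);      /* quota removes its entry, parent still alive */
    toy_osd_procfs_fini(osd);   /* only now remove the OSD directory */
}
```

Calling `toy_osd_device_init()` followed by `toy_osd_device_fini()` on a zeroed `struct toy_osd_device` leaves both pointers NULL and frees each entry exactly once.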
Issue Links
- duplicates LU-3857 panic in lprocfs_remove_nolock+0x3b/0x100 (Resolved)
The patch http://review.whamcloud.com/8506 from LU-3857 has landed on master.