Type: Bug
Resolution: Fixed
Priority: Blocker
Fix Version/s: Lustre 2.4.0
Affects Version/s: Lustre 2.4.0
Labels: None
Environment: Single-node test configuration (dual-core x86_64, 1 MDT, 3 OST)
Severity: 3
Rank (Obsolete): 5271

I recently hit this problem while running sanity-scrub.sh:

LustreError: 140-5: Server testfs-MDT0000 requested index 0, but that index is already in use. Use --writeconf to force
mgs_write_log_target()) Can't get index (-98)
mgs_handle_target_reg()) Failed to write testfs-MDT0000 log (-98)
server_register_target()) Cannot talk to the MGS: -98, not fatal
LustreError: 32638:0:(osd_scrub.c:1122:osd_scrub_cleanup()) ASSERTION( dev->od_otable_it == ((void *)0) ) failed

Pid: 32638, comm: umount
Call Trace:
[<ffffffffa08fb905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa08fbf17>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0fc096f>] osd_scrub_cleanup+0xdf/0xe0 [osd_ldiskfs]
[<ffffffffa0f9d323>] osd_shutdown+0x33/0x110 [osd_ldiskfs]
[<ffffffffa0fa9ff5>] osd_process_config+0x165/0x1b0 [osd_ldiskfs]
[<ffffffffa0d97611>] lod_process_config+0x451/0xa70 [lod]
[<ffffffffa0ed9ac0>] mdd_process_config+0x210/0x7e0 [mdd]
[<ffffffffa1027272>] mdt_stack_fini+0x172/0xbf0 [mdt]
[<ffffffffa1027fb7>] mdt_device_fini+0x2c7/0x510 [mdt]
[<ffffffffa0a8d4c7>] class_cleanup+0x577/0xdc0 [obdclass]
[<ffffffffa0a8edb5>] class_process_config+0x10a5/0x1ca0 [obdclass]
[<ffffffffa0a8fb29>] class_manual_cleanup+0x179/0x6f0 [obdclass]
[<ffffffffa0a9d0ac>] server_put_super+0x61c/0x1300 [obdclass]
[<ffffffff8117d34b>] generic_shutdown_super+0x5b/0xe0
[<ffffffff8117d436>] kill_anon_super+0x16/0x60
[<ffffffffa0a919a6>] lustre_kill_super+0x36/0x60 [obdclass]
[<ffffffff8117e4b0>] deactivate_super+0x70/0x90
[<ffffffff8119a4ff>] mntput_no_expire+0xbf/0x110
[<ffffffff8119af9b>] sys_umount+0x7b/0x3a0
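
For reading the log above: the kernel reports errors as negative errno values, and on Linux 98 is EADDRINUSE, which matches the "index is already in use" message. A minimal sketch for decoding such codes follows; `describe_rc` is a hypothetical helper, not part of Lustre, and the message text it returns depends on the local C library.

```python
import errno
import os


def describe_rc(rc):
    """Return (symbolic name, message) for a kernel-style return code.

    Hypothetical helper for reading logs; accepts the negative form
    used in kernel messages (e.g. -98) as well as the positive errno.
    """
    code = -rc if rc < 0 else rc
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # -98 is the value reported by mgs_write_log_target() in the log above.
    print(describe_rc(-98))
```

Errno numbering is platform-specific, so the lookup should be run on a Linux host for it to agree with the log.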

Alex's patch at http://review.whamcloud.com/4217, which is still to be landed, was created for LU-2033. Since that bug was closed and actually relates to a separate issue, I'd rather file a new bug than re-open it. The patch works around the duplicate index==0 issue by resetting the filesystem label after formatting (to clear the "VIRGIN" flag), though I would prefer that the MDT itself detect that it had been restored from backup and reset the label internally. The proposed solution at least also works for older versions of Lustre, so a single restore procedure can be documented, and I'm not dead-set against this part of the patch.

The osd_scrub_cleanup() assertion is also addressed by Alex's patch, but Fan Yong rightly objected to that fix: it still implies that the scrub thread is running when the MDT is being stopped, so some other cleanup/serialization is needed.
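
The kind of serialization in question can be sketched in user space: stop the worker and wait for it to exit before checking that the iterator has been released. This is only an analogy under stated assumptions, not the Lustre code or the fix that landed. `ScrubDevice`, `start_scrub`, `stop_scrub`, and `shutdown` are hypothetical names, and `otable_it` merely stands in for `dev->od_otable_it`.

```python
import threading
import time


class ScrubDevice:
    """Hypothetical stand-in for the OSD device; not Lustre code."""

    def __init__(self):
        self.otable_it = None            # stands in for dev->od_otable_it
        self._stop = threading.Event()
        self._thread = None

    def _scrub_main(self):
        self.otable_it = object()        # worker holds the iterator while running
        try:
            while not self._stop.is_set():
                time.sleep(0.01)         # pretend to scan one batch of inodes
        finally:
            self.otable_it = None        # released only when the worker exits

    def start_scrub(self):
        self._stop.clear()
        self._thread = threading.Thread(target=self._scrub_main)
        self._thread.start()

    def stop_scrub(self):
        """Ask the worker to stop and wait until it has really exited."""
        if self._thread is not None:
            self._stop.set()
            self._thread.join()
            self._thread = None

    def shutdown(self):
        # Serialize: stop the worker first, then check the invariant
        # that the assertion in the report guards.
        self.stop_scrub()
        assert self.otable_it is None, "iterator still held at cleanup"


if __name__ == "__main__":
    dev = ScrubDevice()
    dev.start_scrub()
    time.sleep(0.05)
    dev.shutdown()                       # passes because the worker was joined first
```

Without the `stop_scrub()` call in `shutdown()`, the check could run while the worker still holds the iterator, which is the situation the assertion in the report catches.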

Assignee: nasf (Inactive)

Reporter: Andreas Dilger

Votes: 0

Watchers: 3

Created: 20/Oct/12 6:03 PM

Updated: 19/Apr/13 8:38 PM

Resolved: 29/Oct/12 7:15 AM
