Details
- Type: Bug
- Resolution: Fixed
- Priority: Minor
- 3
- Orion
- 2952
Description
Bug hit on the orion_quota branch, which had just been rebased on orion. There is really nothing on the orion_quota branch that could cause this:
14:42:59:Lustre: DEBUG MARKER: == replay-single test 22b: check orphan code race in test 22 == 14:42:59 (1332452579)
14:43:01:Turning device dm-0 (0xfd00000) read-only
14:43:01:Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
14:43:02:Removing read-only on unknown block (0xfd00000)
14:43:19:LDISKFS-fs (dm-0): warning: maximal mount count reached, running e2fsck is recommended
14:43:20:LDISKFS-fs (dm-0): recovery complete
14:43:20:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=off. Opts:
14:44:20:Lustre: 7516:0:(ldlm_lib.c:1644:target_recovery_overseer()) recovery is timed out, evict stale exports
14:44:21:LustreError: 7516:0:(genops.c:1302:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client 0a4ff024-85d5-c11f-ade5-2ac2049b2a25@<unknown>
14:44:21:Lustre: lustre-MDT0000: Recovery over after 1:00, of 3 clients 2 recovered and 1 was evicted.
14:44:21:Lustre: Skipped 9 previous similar messages
14:44:21:LustreError: 7516:0:(libcfs_fail.h:141:cfs_race()) cfs_race id 148 sleeping
14:44:21:LustreError: 7479:0:(libcfs_fail.h:146:cfs_race()) cfs_fail_race id 148 waking
14:44:21:LustreError: 7516:0:(libcfs_fail.h:144:cfs_race()) cfs_fail_race id 148 awake, rc=0
14:44:21:Lustre: 7516:0:(mdd_orphans.c:283:orph_key_test_and_del()) Found orphan [0x200002341:0x7:0x0]! Delete it
14:44:21:LustreError: 7516:0:(mdd_orphans.c:227:orph_index_delete()) ASSERTION(obj->mod_flags & ORPHAN_OBJ) failed
14:44:21:LustreError: 7516:0:(mdd_orphans.c:227:orph_index_delete()) LBUG
14:44:21:Pid: 7516, comm: tgt_recov
14:44:21:
14:44:21:Call Trace:
14:44:21: [<ffffffffa043a835>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
14:44:21: [<ffffffffa043ad67>] lbug_with_loc+0x47/0xb0 [libcfs]
14:44:22: [<ffffffffa044441d>] libcfs_assertion_failed+0x2d/0x30 [libcfs]
14:44:22: [<ffffffffa0984f48>] orph_index_delete+0x718/0x990 [mdd]
14:44:22: [<ffffffffa09858a7>] __mdd_orphan_cleanup+0x6e7/0xa50 [mdd]
14:44:22: [<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40
14:44:22: [<ffffffffa0993043>] mdd_recovery_complete+0x73/0xf0 [mdd]
14:44:22: [<ffffffffa0a32a7e>] mdt_postrecov+0x3e/0xb0 [mdt]
14:44:22: [<ffffffffa055d0be>] ? lu_env_init+0x1e/0x30 [obdclass]
14:44:22: [<ffffffffa0a34480>] mdt_obd_postrecov+0x80/0xa0 [mdt]
14:44:22: [<ffffffffa0669950>] ? ldlm_reprocess_res+0x0/0x20 [ptlrpc]
14:44:22: [<ffffffffa0672c4b>] target_recovery_thread+0x8fb/0xcf0 [ptlrpc]
14:44:22: [<ffffffff8106cc0f>] ? release_task+0x36f/0x4e0
14:44:22: [<ffffffff81096294>] ? switch_task_namespaces+0x24/0x60
14:44:22: [<ffffffff8106eac7>] ? do_exit+0x5a7/0x860
14:44:22: [<ffffffffa0672350>] ? target_recovery_thread+0x0/0xcf0 [ptlrpc]
14:44:22: [<ffffffff8100c14a>] child_rip+0xa/0x20
14:44:22: [<ffffffffa0672350>] ? target_recovery_thread+0x0/0xcf0 [ptlrpc]
14:44:22: [<ffffffffa0672350>] ? target_recovery_thread+0x0/0xcf0 [ptlrpc]
14:44:22: [<ffffffff8100c140>] ? child_rip+0x0/0x20
https://maloo.whamcloud.com/test_sets/f3a3d4dc-74b4-11e1-bfc6-5254004bbbd3
Issue Links
- is related to: LU-4011 problems with upstream lustre client code (Closed)