  1. Lustre
  2. LU-2355

orph_index_delete()) ASSERTION(obj->mod_flags & ORPHAN_OBJ)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • None
    • None
    • None
    • 3
    • Orion
    • 2952

    Description

      This bug was hit on the orion_quota branch, which has just been rebased on orion. Nothing specific to the orion_quota branch appears able to cause it:

      14:42:59:Lustre: DEBUG MARKER: == replay-single test 22b: check orphan code race in test 22 == 14:42:59 (1332452579)
      14:43:01:Turning device dm-0 (0xfd00000) read-only
      14:43:01:Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
      14:43:02:Removing read-only on unknown block (0xfd00000)
      14:43:19:LDISKFS-fs (dm-0): warning: maximal mount count reached, running e2fsck is recommended
      14:43:20:LDISKFS-fs (dm-0): recovery complete
      14:43:20:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=off. Opts: 
      14:44:20:Lustre: 7516:0:(ldlm_lib.c:1644:target_recovery_overseer()) recovery is timed out, evict stale exports
      14:44:21:LustreError: 7516:0:(genops.c:1302:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client 0a4ff024-85d5-c11f-ade5-2ac2049b2a25@<unknown>
      14:44:21:Lustre: lustre-MDT0000: Recovery over after 1:00, of 3 clients 2 recovered and 1 was evicted.
      14:44:21:Lustre: Skipped 9 previous similar messages
      14:44:21:LustreError: 7516:0:(libcfs_fail.h:141:cfs_race()) cfs_race id 148 sleeping
      14:44:21:LustreError: 7479:0:(libcfs_fail.h:146:cfs_race()) cfs_fail_race id 148 waking
      14:44:21:LustreError: 7516:0:(libcfs_fail.h:144:cfs_race()) cfs_fail_race id 148 awake, rc=0
      14:44:21:Lustre: 7516:0:(mdd_orphans.c:283:orph_key_test_and_del()) Found orphan [0x200002341:0x7:0x0]! Delete it
      14:44:21:LustreError: 7516:0:(mdd_orphans.c:227:orph_index_delete()) ASSERTION(obj->mod_flags & ORPHAN_OBJ) failed
      14:44:21:LustreError: 7516:0:(mdd_orphans.c:227:orph_index_delete()) LBUG
      14:44:21:Pid: 7516, comm: tgt_recov
      14:44:21:
      14:44:21:Call Trace:
      14:44:21: [<ffffffffa043a835>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      14:44:21: [<ffffffffa043ad67>] lbug_with_loc+0x47/0xb0 [libcfs]
      14:44:22: [<ffffffffa044441d>] libcfs_assertion_failed+0x2d/0x30 [libcfs]
      14:44:22: [<ffffffffa0984f48>] orph_index_delete+0x718/0x990 [mdd]
      14:44:22: [<ffffffffa09858a7>] __mdd_orphan_cleanup+0x6e7/0xa50 [mdd]
      14:44:22: [<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40
      14:44:22: [<ffffffffa0993043>] mdd_recovery_complete+0x73/0xf0 [mdd]
      14:44:22: [<ffffffffa0a32a7e>] mdt_postrecov+0x3e/0xb0 [mdt]
      14:44:22: [<ffffffffa055d0be>] ? lu_env_init+0x1e/0x30 [obdclass]
      14:44:22: [<ffffffffa0a34480>] mdt_obd_postrecov+0x80/0xa0 [mdt]
      14:44:22: [<ffffffffa0669950>] ? ldlm_reprocess_res+0x0/0x20 [ptlrpc]
      14:44:22: [<ffffffffa0672c4b>] target_recovery_thread+0x8fb/0xcf0 [ptlrpc]
      14:44:22: [<ffffffff8106cc0f>] ? release_task+0x36f/0x4e0
      14:44:22: [<ffffffff81096294>] ? switch_task_namespaces+0x24/0x60
      14:44:22: [<ffffffff8106eac7>] ? do_exit+0x5a7/0x860
      14:44:22: [<ffffffffa0672350>] ? target_recovery_thread+0x0/0xcf0 [ptlrpc]
      14:44:22: [<ffffffff8100c14a>] child_rip+0xa/0x20
      14:44:22: [<ffffffffa0672350>] ? target_recovery_thread+0x0/0xcf0 [ptlrpc]
      14:44:22: [<ffffffffa0672350>] ? target_recovery_thread+0x0/0xcf0 [ptlrpc]
      14:44:22: [<ffffffff8100c140>] ? child_rip+0x0/0x20
      

      https://maloo.whamcloud.com/test_sets/f3a3d4dc-74b4-11e1-bfc6-5254004bbbd3
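      When triaging a console log like the one above, it can help to pull out the failed assertion and the reliable call-trace frames. In a kernel stack dump, entries marked with "?" are addresses found on the stack that were not confirmed as real callers. The following is a minimal sketch, not part of Lustre or its test tooling: the helper name parse_lbug and the two regular expressions are assumptions based only on the line formats visible in this log.

```python
import re

# A few lines copied verbatim from the console log in this ticket.
LOG = """\
14:44:21:LustreError: 7516:0:(mdd_orphans.c:227:orph_index_delete()) ASSERTION(obj->mod_flags & ORPHAN_OBJ) failed
14:44:22: [<ffffffffa0984f48>] orph_index_delete+0x718/0x990 [mdd]
14:44:22: [<ffffffffa09858a7>] __mdd_orphan_cleanup+0x6e7/0xa50 [mdd]
14:44:22: [<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40
14:44:22: [<ffffffffa0993043>] mdd_recovery_complete+0x73/0xf0 [mdd]
"""

# Hypothetical patterns, inferred from the log format shown above.
ASSERT_RE = re.compile(
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\((?P<expr>.*)\) failed"
)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unsure>\? )?(?P<func>\w+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def parse_lbug(log):
    """Return (assertion, frames) extracted from a console log.

    assertion is (file, line, function, expression) or None.
    frames lists (function, module) for entries not marked with '?'.
    """
    assertion = None
    frames = []
    for line in log.splitlines():
        m = ASSERT_RE.search(line)
        if m:
            assertion = (m["file"], int(m["line"]), m["func"], m["expr"])
            continue
        m = FRAME_RE.search(line)
        if m and not m["unsure"]:
            frames.append((m["func"], m["module"]))
    return assertion, frames


if __name__ == "__main__":
    assertion, frames = parse_lbug(LOG)
    print(assertion)
    for func, module in frames:
        print(func, module)
```

      Applied to the full trace, this leaves the chain from target_recovery_thread through mdt_obd_postrecov, mdt_postrecov, mdd_recovery_complete and __mdd_orphan_cleanup down to orph_index_delete, which is where the assertion fired.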

            People

              liwei Li Wei (Inactive)
              johann Johann Lombardi (Inactive)