LU-5572: replay-single test_73b: import is not in FULL state

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor

    Description

      This issue was created by maloo for Amir Shehata <amir.shehata@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/b5ba7f1c-2fcf-11e4-9f89-5254006e85c2.

      shadow-13vm6: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
      shadow-13vm5:  rpc : @@@@@@ FAIL: can't put import for mdc.lustre-MDT0000-mdc-*.mds_server_uuid into FULL state after 662 sec, have CONNECTING 
       replay-single test_73b: @@@@@@ FAIL: import is not in FULL state 
      

      The following test runs hit the exact same problem:
      https://testing.hpdd.intel.com/test_sets/56c66138-2af7-11e4-ba37-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/7949beac-24ef-11e4-8458-5254006e85c2
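
      For context on the failure mode: during recovery a client import walks a
      fixed state machine (DISCONN, CONNECTING, the replay phases, RECOVER,
      FULL), and the test harness polls the import until it reports FULL. In
      the run above it never advanced past CONNECTING. Below is a minimal
      standalone C sketch of that progression; it is not Lustre source, the
      enum and helper are illustrative simplifications, and only the state
      names match what the harness prints.

      /* Illustration only, not Lustre source. */
      #include <stdio.h>

      enum imp_state {
          IMP_DISCON,       /* connection to the target was lost */
          IMP_CONNECTING,   /* reconnect in flight; the failed run stalled here */
          IMP_REPLAY,       /* resending uncommitted requests and locks */
          IMP_RECOVER,      /* final recovery handshake */
          IMP_FULL,         /* import usable again; the state the test waits for */
      };

      static const char *imp_state_name(enum imp_state s)
      {
          static const char *names[] = {
              "DISCONN", "CONNECTING", "REPLAY", "RECOVER", "FULL",
          };
          return names[s];
      }

      int main(void)
      {
          /* A healthy recovery walks the whole chain; in the report above
           * the import was stuck in CONNECTING because the MDS had crashed
           * (see the LBUG in the comments below). */
          for (int s = IMP_DISCON; s <= IMP_FULL; s++)
              printf("%s%s", imp_state_name(s), s == IMP_FULL ? "\n" : " -> ");
          return 0;
      }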

Activity
            Oleg Drokin added a comment -

            As such, I think this is a dup of LU-5287.

            Oleg Drokin added a comment -

            MDS1 crashed with this assertion (visible in console log):

            07:18:13:Lustre: lustre-MDT0000: Client lustre-MDT0001-mdtlov_UUID (at 10.1.4.149@tcp) reconnecting, waiting for 5 clients in recovery for 0:22
            07:18:13:LustreError: 13004:0:(ldlm_lib.c:2253:target_queue_recovery_request()) ASSERTION( req->rq_export->exp_lock_replay_needed ) failed: 
            07:18:13:LustreError: 13004:0:(ldlm_lib.c:2253:target_queue_recovery_request()) LBUG
            07:18:13:Pid: 13004, comm: mdt00_001
            07:18:13:
            07:18:13:Call Trace:
            07:18:13: [<ffffffffa0483895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            07:18:13: [<ffffffffa0483e97>] lbug_with_loc+0x47/0xb0 [libcfs]
            07:18:13: [<ffffffffa07edcd5>] target_queue_recovery_request+0xb35/0xc40 [ptlrpc]
            07:18:13: [<ffffffffa0886d7f>] tgt_handle_recovery+0x38f/0x520 [ptlrpc]
            07:18:14: [<ffffffffa088cd05>] tgt_request_handle+0x1a5/0xb10 [ptlrpc]
            07:18:14: [<ffffffffa083c294>] ptlrpc_main+0xe64/0x1990 [ptlrpc]
            07:18:14: [<ffffffffa083b430>] ? ptlrpc_main+0x0/0x1990 [ptlrpc]
            07:18:14: [<ffffffff8109abf6>] kthread+0x96/0xa0
            07:18:14: [<ffffffff8100c20a>] child_rip+0xa/0x20
            07:18:14: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
            07:18:14: [<ffffffff8100c200>] ? child_rip+0x0/0x20
            07:18:14:
            07:18:14:Kernel panic - not syncing: LBUG
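
            To make the crash mechanics concrete, here is a standalone model of
            the invariant that fired; it is not the real ldlm_lib.c code, and
            only the LASSERT expression is verbatim from the log. A lock-replay
            request is only legal while the server still has the export flagged
            with exp_lock_replay_needed; a replay arriving after that flag is
            cleared trips LBUG, which panics the node (modeled here with
            abort()).

            /* Standalone model, not Lustre source. */
            #include <stdio.h>
            #include <stdlib.h>

            /* Userspace stand-in for the libcfs LASSERT/LBUG pair: the real
             * macro dumps a stack trace and panics the kernel. */
            #define LASSERT(cond)                                           \
                do {                                                        \
                    if (!(cond)) {                                          \
                        fprintf(stderr, "ASSERTION( %s ) failed\n", #cond); \
                        abort();                                            \
                    }                                                       \
                } while (0)

            struct obd_export {
                /* set while the server still expects lock replay from this
                 * client; cleared once its lock replay completes */
                int exp_lock_replay_needed;
            };

            struct ptlrpc_request {
                struct obd_export *rq_export;
            };

            static void target_queue_recovery_request(struct ptlrpc_request *req)
            {
                /* the check that fired on the MDS: a lock-replay request
                 * arrived for an export whose flag was already clear */
                LASSERT(req->rq_export->exp_lock_replay_needed);
            }

            int main(void)
            {
                struct obd_export exp = { .exp_lock_replay_needed = 0 };
                struct ptlrpc_request req = { .rq_export = &exp };

                target_queue_recovery_request(&req); /* aborts, like the LBUG */
                return 0;
            }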
            

            People

              Assignee: WC Triage
              Reporter: Maloo
