[LU-5449] Test failure on test suite sanity-scrub, subtest test_8

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • None
    • None
    • None
    • 3
    • 15168

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/5705a464-1bfd-11e4-8763-5254006e85c2.

      The sub-test test_8 failed with the following error:

      test failed to respond and timed out

      Info required for matching: sanity-scrub 8

          Activity

            adilger Andreas Dilger added a comment - Haven't seen this issue in years.

            utopiabound Nathaniel Clark added a comment - This also happens on sanity-scrub/test_7 (review-dne-part-2 on master): https://testing.hpdd.intel.com/test_sets/ee4b9602-3b93-11e4-a52e-5254006e85c2
            green Oleg Drokin added a comment -

            Note there's a crash in the MDT1 logs:

            05:26:14:Lustre: 28508:0:(service.c:1509:ptlrpc_at_check_timed()) Skipped 1 previous similar message
            05:26:14:Lustre: lustre-MDT0000: Client lustre-MDT0000-lwp-MDT0001_UUID (at 10.2.5.88@tcp) reconnecting, waiting for 4 clients in recovery for 0:42
            05:26:14:Lustre: Skipped 2 previous similar messages
            05:26:14:LustreError: 28536:0:(ldlm_lib.c:1689:check_for_clients()) ASSERTION( clnts <= obd->obd_max_recoverable_clients ) failed: 
            05:26:14:LustreError: 28536:0:(ldlm_lib.c:1689:check_for_clients()) LBUG
            05:26:14:Pid: 28536, comm: tgt_recov
            05:26:14:
            05:26:14:Call Trace:
            05:26:14: [<ffffffffa07f2920>] ? check_for_clients+0x0/0x70 [ptlrpc]
            05:26:14: [<ffffffffa048e895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            05:26:14: [<ffffffffa048ee97>] lbug_with_loc+0x47/0xb0 [libcfs]
            05:26:14: [<ffffffffa07f298c>] check_for_clients+0x6c/0x70 [ptlrpc]
            05:26:14: [<ffffffffa07f3ee3>] target_recovery_overseer+0xb3/0x230 [ptlrpc]
            05:26:14: [<ffffffffa07f2550>] ? exp_connect_healthy+0x0/0x20 [ptlrpc]
            05:26:14: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
            05:26:14: [<ffffffffa07fa810>] ? target_recovery_thread+0x0/0x19c0 [ptlrpc]
            05:26:14: [<ffffffffa07fadf4>] target_recovery_thread+0x5e4/0x19c0 [ptlrpc]
            05:26:14: [<ffffffff81061d12>] ? default_wake_function+0x12/0x20
            05:26:14: [<ffffffffa07fa810>] ? target_recovery_thread+0x0/0x19c0 [ptlrpc]
            05:26:14: [<ffffffff8109abf6>] kthread+0x96/0xa0
            05:26:14: [<ffffffff8100c20a>] child_rip+0xa/0x20
            05:26:14: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
            05:26:14: [<ffffffff8100c200>] ? child_rip+0x0/0x20
            05:26:14:
            05:26:14:Kernel panic - not syncing: LBUG
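
For triage, the failed assertion and the reliable frames of a trace like the one above can be pulled out mechanically. The following is a minimal sketch, assuming the console log format shown in this comment; `summarize_lbug`, `ASSERT_RE`, `FRAME_RE`, and `SAMPLE` are hypothetical names for illustration and are not part of Lustre or Maloo. Frames marked with `?` are ones the kernel unwinder is unsure about, so the sketch skips them.

```python
import re

# Matches one stack frame line such as
# "05:26:14: [<ffffffffa07f298c>] check_for_clients+0x6c/0x70 [ptlrpc]".
# A "?" before the symbol marks a frame the unwinder is unsure about.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# Matches the expression inside "ASSERTION( ... ) failed".
ASSERT_RE = re.compile(r"ASSERTION\(\s*(?P<expr>.+?)\s*\)\s+failed")


def summarize_lbug(log_text):
    """Return (assertion, frames) for a console log excerpt.

    assertion is the first failed assertion expression, or None.
    frames is a list of (function, module) pairs for frames that are
    not marked "?"; module is None for core kernel symbols.
    """
    assertion = None
    frames = []
    for line in log_text.splitlines():
        match = ASSERT_RE.search(line)
        if match and assertion is None:
            assertion = match.group("expr")
            continue
        match = FRAME_RE.search(line)
        if match and not match.group("unsure"):
            frames.append((match.group("func"), match.group("module")))
    return assertion, frames


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
05:26:14:LustreError: 28536:0:(ldlm_lib.c:1689:check_for_clients()) ASSERTION( clnts <= obd->obd_max_recoverable_clients ) failed: 
05:26:14: [<ffffffffa07f2920>] ? check_for_clients+0x0/0x70 [ptlrpc]
05:26:14: [<ffffffffa048ee97>] lbug_with_loc+0x47/0xb0 [libcfs]
05:26:14: [<ffffffffa07f298c>] check_for_clients+0x6c/0x70 [ptlrpc]
05:26:14: [<ffffffff8109abf6>] kthread+0x96/0xa0
"""

if __name__ == "__main__":
    failed, trace = summarize_lbug(SAMPLE)
    print("assertion:", failed)
    for func, module in trace:
        print("  ", func, "[%s]" % module if module else "")
```

A summary of this kind (assertion text plus the first non-libcfs frame) could serve as a rough signature for matching this crash against other test runs, though that use is only a suggestion here.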
            

            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 6
