Lustre / LU-4458

Interop 2.5.0<->2.6 failure on test suite recovery-small test_9

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.7.0, Lustre 2.5.4
    • Affects Version/s: Lustre 2.6.0, Lustre 2.7.0
    • Environment: server: lustre-master build # 1823 RHEL6 ldiskfs
      client: 2.5.0
    • 3
    • 12221

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/3ba4558e-77f3-11e3-a6a3-52540035b04c.

      The sub-test test_9 failed with the following error:

      test failed to respond and timed out

      A process stuck in uninterruptible sleep (D state) was found on the OST:

      22:51:46:Lustre: DEBUG MARKER: == recovery-small test 9: pause bulk on OST (bug 1420) == 22:48:54 (1389077334)
      22:51:47:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x214
      22:51:47:LustreError: 2046:0:(fail.c:133:__cfs_fail_timeout_set()) cfs_fail_timeout id 214 sleeping for 20000000ms
      22:51:47:INFO: task ll_ost_io00_002:2046 blocked for more than 120 seconds.
      22:51:47:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      22:51:47:ll_ost_io00_0 D 0000000000000001     0  2046      2 0x00000080
      22:51:47: ffff8802f8759a60 0000000000000046 ffff8802f8759ac0 0000000016734040
      22:57:44: ffffffffa0566ab0 ffff880316255389 0000004e359ea090 ffffffffa053c044
      22:57:44: ffff8803167345f8 ffff8802f8759fd8 000000000000fb88 ffff8803167345f8
      22:57:44:Call Trace:
      22:57:44: [<ffffffff8150f3f2>] schedule_timeout+0x192/0x2e0
      22:57:44: [<ffffffff810811e0>] ? process_timeout+0x0/0x10
      22:57:45: [<ffffffffa0520d0f>] __cfs_fail_timeout_set+0xcf/0x150 [libcfs]
      22:57:45: [<ffffffffa0eaaec9>] cfs_fail_timeout_set.clone.2+0x29/0x30 [ptlrpc]
      22:57:45: [<ffffffffa0eae94b>] tgt_brw_write+0x34b/0x1550 [ptlrpc]
      22:57:45: [<ffffffffa0525921>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      22:57:45: [<ffffffffa0eb0fea>] tgt_handle_request0+0x2ea/0x1490 [ptlrpc]
      22:57:45: [<ffffffffa0525921>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      22:57:45: [<ffffffffa0eb25ca>] tgt_request_handle+0x43a/0x980 [ptlrpc]
      22:57:45: [<ffffffffa0e65725>] ptlrpc_main+0xd25/0x1970 [ptlrpc]
      22:57:45: [<ffffffffa0e64a00>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
      22:57:46: [<ffffffff81096a36>] kthread+0x96/0xa0
      22:57:46: [<ffffffff8100c0ca>] child_rip+0xa/0x20
      22:57:46: [<ffffffff810969a0>] ? kthread+0x0/0xa0
      22:57:46: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      22:57:46:INFO: task ll_ost_io00_002:2046 blocked for more than 120 seconds.
      22:57:46:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      22:57:46:ll_ost_io00_0 D 0000000000000001     0  2046      2 0x00000080
      22:57:46: ffff8802f8759a60 0000000000000046 ffff8802f8759ac0 0000000016734040
      22:57:46: ffffffffa0566ab0 ffff880316255389 0000004e359ea090 ffffffffa053c044
      22:57:46: ffff8803167345f8 ffff8802f8759fd8 000000000000fb88 ffff8803167345f8
      22:57:46:Call Trace:
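
      The log above reports a fail-injection sleep of 20000000 ms, which is 20000 s (roughly 5.6 hours) and far beyond the 120-second hung-task threshold. That would be consistent with the OST I/O thread staying blocked until the test timed out. A minimal sketch for pulling these numbers out of a console log follows; the regular expressions and the `summarize` helper are illustrative assumptions and are not part of Lustre or its test framework.

```python
import re

# Sample lines copied from the console log in this ticket.
LOG = """\
22:51:47:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x214
22:51:47:LustreError: 2046:0:(fail.c:133:__cfs_fail_timeout_set()) cfs_fail_timeout id 214 sleeping for 20000000ms
22:51:47:INFO: task ll_ost_io00_002:2046 blocked for more than 120 seconds.
"""

# Hypothetical patterns written against the message formats seen above.
SLEEP_RE = re.compile(r"cfs_fail_timeout id (\w+) sleeping for (\d+)ms")
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def summarize(log):
    """Return (fail_id, sleep_seconds, hung_tasks) parsed from console log text."""
    fail_id, sleep_s, hung = None, None, []
    for line in log.splitlines():
        m = SLEEP_RE.search(line)
        if m:
            # The log prints the duration in milliseconds; convert to seconds.
            fail_id, sleep_s = m.group(1), int(m.group(2)) / 1000.0
        m = HUNG_RE.search(line)
        if m:
            # (task name, pid, hung-task threshold in seconds)
            hung.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return fail_id, sleep_s, hung


fail_id, sleep_s, hung = summarize(LOG)
print("fail id:", fail_id)
print("injected sleep: %.0f s (%.2f h)" % (sleep_s, sleep_s / 3600))
for name, pid, threshold in hung:
    print("hung task %s (pid %d), threshold %d s" % (name, pid, threshold))
```

      Comparing the injected sleep with the hung-task threshold in this way makes it easy to spot a delay that is orders of magnitude longer than a test is likely to tolerate.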
      

            People

              Assignee: tappro Mikhail Pershin
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 5
