Lustre / LU-10670

sanity-flr test 43 timeout

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: Lustre 2.11.0
    • Labels: None
    • Severity: 3

    Description

      https://testing.hpdd.intel.com/test_sets/713fb70e-119d-11e8-a6ad-52540065bddc

      sanity-flr test 43 hangs until the suite times out, and it does so very often:

      Error: 'Timeout occurred after 227 mins, last suite running was sanity-flr, restarting cluster to continue tests' 
      Failure Rate: 41.18% of most recent 17 runs, 22 skipped (all branches)
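
The reported rate can be turned back into an absolute count. This is a minimal sketch, assuming the percentage is failures divided by the 17 counted runs, rounded to two decimals, with the 22 skipped runs excluded; that interpretation is an assumption, not something the report states.

```python
# Assumption: rate = failures / counted runs, rounded to two decimals;
# skipped runs are not part of the denominator.
runs = 17
reported_rate = 41.18

# Find every failure count that reproduces the reported percentage.
failures = [k for k in range(runs + 1)
            if round(100 * k / runs, 2) == reported_rate]
print(failures)
```

Under that assumption only one count matches, so the figure corresponds to 7 failed runs out of 17.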
      

      On a client, dd is blocked in the truncate-on-open path, waiting for completion in osc_io_setattr_end():

      [10077.749514] Lustre: DEBUG MARKER: == sanity-flr test 43: mirror pick on write ========================================================== 12:14:55 (1518610495)
      [10320.098013] INFO: task dd:23892 blocked for more than 120 seconds.
      [10320.114074] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [10320.116709] dd              D ffff88007b96dee0     0 23892  23675 0x00000080
      [10320.119330] Call Trace:
      [10320.125475]  [<ffffffff810c6632>] ? default_wake_function+0x12/0x20
      [10320.150782]  [<ffffffff810bc2d8>] ? __wake_up_common+0x58/0x90
      [10320.154162]  [<ffffffff816ab8a9>] schedule+0x29/0x70
      [10320.170306]  [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
      [10320.176336]  [<ffffffffc09f5e88>] ? ptlrpc_set_add_new_req+0xd8/0x150 [ptlrpc]
      [10320.178829]  [<ffffffffc0bd50c0>] ? osc_io_ladvise_end+0x50/0x50 [osc]
      [10320.181237]  [<ffffffffc0a25ffb>] ? ptlrpcd_add_req+0x22b/0x300 [ptlrpc]
      [10320.183701]  [<ffffffffc09fbe99>] ? ptlrpc_request_bufs_pack+0x1d9/0x480 [ptlrpc]
      [10320.186106]  [<ffffffff816abc5d>] wait_for_completion+0xfd/0x140
      [10320.188437]  [<ffffffff810c6620>] ? wake_up_state+0x20/0x20
      [10320.190651]  [<ffffffffc0bd5284>] osc_io_setattr_end+0xc4/0x180 [osc]
      [10320.192955]  [<ffffffffc0bd63d0>] ? osc_io_setattr_start+0x260/0x700 [osc]
      [10320.195231]  [<ffffffffc0c28490>] ? lov_io_iter_fini_wrapper+0x50/0x50 [lov]
      [10320.197659]  [<ffffffffc0832e8d>] cl_io_end+0x5d/0x150 [obdclass]
      [10320.199802]  [<ffffffffc0c2856b>] lov_io_end_wrapper+0xdb/0xe0 [lov]
      [10320.202033]  [<ffffffffc0c28bc5>] lov_io_call.isra.5+0x85/0x140 [lov]
      [10320.204170]  [<ffffffffc0c28cb6>] lov_io_end+0x36/0xb0 [lov]
      [10320.206291]  [<ffffffffc0832e8d>] cl_io_end+0x5d/0x150 [obdclass]
      [10320.208353]  [<ffffffffc083551f>] cl_io_loop+0x13f/0xc70 [obdclass]
      [10320.210509]  [<ffffffffc0cd1460>] cl_setattr_ost+0x250/0x3c0 [lustre]
      [10320.212550]  [<ffffffffc0cab495>] ll_setattr_raw+0x1165/0x1270 [lustre]
      [10320.214631]  [<ffffffffc0cab60c>] ll_setattr+0x6c/0xd0 [lustre]
      [10320.217542]  [<ffffffff81220fc1>] notify_change+0x2c1/0x420
      [10320.228621]  [<ffffffff812b45b6>] ? security_inode_need_killpriv+0x16/0x20
      [10320.230605]  [<ffffffff81200ad5>] do_truncate+0x75/0xc0
      [10320.232485]  [<ffffffff81211d97>] do_last+0x627/0x12c0
      [10320.234244]  [<ffffffff81212af2>] path_openat+0xc2/0x490
      [10320.236065]  [<ffffffff811af746>] ? do_read_fault.isra.44+0xe6/0x130
      [10320.237871]  [<ffffffff8121508b>] do_filp_open+0x4b/0xb0
      [10320.239642]  [<ffffffff8122233a>] ? __alloc_fd+0x8a/0x130
      [10320.241313]  [<ffffffff81201bc3>] do_sys_open+0xf3/0x1f0
      [10320.243068]  [<ffffffff816b8945>] ? system_call_after_swapgs+0x172/0x214
      [10320.244820]  [<ffffffff81201cde>] SyS_open+0x1e/0x20
      [10320.246469]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
      [10320.248096]  [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
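
In a kernel stack dump like the one above, frames prefixed with "?" are addresses found on the stack that the unwinder could not verify, so the actual call chain is easier to read with them filtered out. The following is a minimal sketch of such a filter; the function name `reliable_frames`, the regular expression, and the "kernel" label for frames without a module are illustrative choices, not part of any Lustre or kernel tooling.

```python
import re

# Matches one frame line of a kernel stack dump, e.g.
# "[10320.154162]  [<ffffffff816ab8a9>] schedule+0x29/0x70"
FRAME_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+"                 # dmesg timestamp
    r"\[<[0-9a-f]+>\]\s+"                 # return address
    r"(?P<unreliable>\?\s+)?"             # "?" marks an unverified frame
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"        # module name, absent for core kernel
)

def reliable_frames(trace):
    """Return (symbol, module) pairs for frames not marked with '?'."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m and not m.group("unreliable"):
            # "kernel" is a label chosen here for frames with no module tag.
            frames.append((m.group("symbol"), m.group("module") or "kernel"))
    return frames

# A few lines copied from the trace in this report.
SAMPLE = """\
[10320.183701]  [<ffffffffc09fbe99>] ? ptlrpc_request_bufs_pack+0x1d9/0x480 [ptlrpc]
[10320.186106]  [<ffffffff816abc5d>] wait_for_completion+0xfd/0x140
[10320.188437]  [<ffffffff810c6620>] ? wake_up_state+0x20/0x20
[10320.190651]  [<ffffffffc0bd5284>] osc_io_setattr_end+0xc4/0x180 [osc]
[10320.192955]  [<ffffffffc0bd63d0>] ? osc_io_setattr_start+0x260/0x700 [osc]
[10320.197659]  [<ffffffffc0832e8d>] cl_io_end+0x5d/0x150 [obdclass]
"""

for symbol, module in reliable_frames(SAMPLE):
    print(f"{symbol} [{module}]")
```

Applied to the full trace, this leaves the chain from SyS_open through do_truncate, ll_setattr and cl_setattr_ost down to osc_io_setattr_end and wait_for_completion.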
      

            People

              Assignee: bobijam Zhenyu Xu
              Reporter: tappro Mikhail Pershin