Lustre / LU-7387

Failover: recovery-random-scale test_fail_client_mds: test failed to respond and timed out

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: Lustre 2.8.0
    • Fix Version/s: None
    • Labels: None
    • Environment: Server/Client: RHEL 7, master, build #3228
    • Severity: 3

      Description

      This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/a6ceebd6-7f0c-11e5-b444-5254006e85c2.

      The sub-test test_fail_client_mds failed with the following error:

      test failed to respond and timed out
      

      Client dmesg:

      [ 1560.467561] dd              D ffff88007fd13680     0  3532   2809 0x00000080
      [ 1560.468969]  ffff8800788038c0 0000000000000086 ffff88007a992220 ffff880078803fd8
      [ 1560.470621]  ffff880078803fd8 ffff880078803fd8 ffff88007a992220 ffff88007bdffb68
      [ 1560.472358]  ffff88007bdffb70 7fffffffffffffff ffff88007a992220 ffff880078803c90
      [ 1560.473646] Call Trace:
      [ 1560.473824]  [<ffffffff81609839>] schedule+0x29/0x70
      [ 1560.474200]  [<ffffffff81607789>] schedule_timeout+0x209/0x2d0
      [ 1560.474727]  [<ffffffff812da0f0>] ? radix_tree_gang_lookup+0x90/0xc0
      [ 1560.475187]  [<ffffffff81609d36>] wait_for_completion+0x116/0x170
      [ 1560.475694]  [<ffffffff810a9510>] ? wake_up_state+0x20/0x20
      [ 1560.476113]  [<ffffffffa0bbf1d4>] osc_io_setattr_end+0xc4/0x180 [osc]
      [ 1560.476678]  [<ffffffffa0a05f60>] ? lov_io_iter_fini_wrapper+0x50/0x50 [lov]
      [ 1560.477215]  [<ffffffffa06a694d>] cl_io_end+0x5d/0x150 [obdclass]
      [ 1560.477691]  [<ffffffffa0a0603b>] lov_io_end_wrapper+0xdb/0xe0 [lov]
      [ 1560.478151]  [<ffffffffa0a0656a>] lov_io_call.isra.11+0x8a/0x140 [lov]
      [ 1560.478693]  [<ffffffffa0a06656>] lov_io_end+0x36/0xb0 [lov]
      [ 1560.479115]  [<ffffffffa06a694d>] cl_io_end+0x5d/0x150 [obdclass]
      [ 1560.479595]  [<ffffffffa06a9143>] cl_io_loop+0xb3/0x190 [obdclass]
      [ 1560.480072]  [<ffffffffa0aec7c0>] cl_setattr_ost+0x240/0x3a0 [lustre]
      [ 1560.480578]  [<ffffffffa0ac3b0c>] ll_setattr_raw+0x12ac/0x1330 [lustre]
      [ 1560.481067]  [<ffffffffa0ac3bf3>] ll_setattr+0x63/0xc0 [lustre]
      [ 1560.481580]  [<ffffffff811e3689>] notify_change+0x279/0x3d0
      [ 1560.481959]  [<ffffffff8128a1ae>] ? process_measurement+0x8e/0x250
      [ 1560.482413]  [<ffffffff811c4a83>] do_truncate+0x73/0xc0
      [ 1560.482846]  [<ffffffff811d5a72>] do_last+0x5f2/0x1270
      [ 1560.483224]  [<ffffffff811d67b2>] path_openat+0xc2/0x490
      [ 1560.483659]  [<ffffffff8118934c>] ? mmap_region+0x1bc/0x610
      [ 1560.484065]  [<ffffffff811d7f7b>] do_filp_open+0x4b/0xb0
      [ 1560.484481]  [<ffffffff811e49d7>] ? __alloc_fd+0xa7/0x130
      [ 1560.484847]  [<ffffffff811c5c73>] do_sys_open+0xf3/0x1f0
      [ 1560.485237]  [<ffffffff811c5d8e>] SyS_open+0x1e/0x20
      [ 1560.485644]  [<ffffffff81614389>] system_call_fastpath+0x16/0x1b
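
      The trace above can be reduced to its module frames to see where dd is blocked: the open path reaches do_truncate, which ends in wait_for_completion under osc_io_setattr_end. The following is a minimal sketch, assuming the dmesg frame format shown above (`[<addr>] [?] symbol+offset/length [module]`). The helper name `parse_frames` and the sample subset are illustrative only and are not part of any Lustre tooling.

```python
import re

# Matches frames such as:
#   [ 1560.476113]  [<ffffffffa0bbf1d4>] osc_io_setattr_end+0xc4/0x180 [osc]
# A leading "?" marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(dmesg_text):
    """Return (symbol, module, reliable) tuples; module is None for core-kernel frames."""
    frames = []
    for line in dmesg_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            reliable = m.group("unreliable") is None
            frames.append((m.group("symbol"), m.group("module"), reliable))
    return frames

# A subset of the trace above, copied verbatim.
SAMPLE = """\
[ 1560.475187]  [<ffffffff81609d36>] wait_for_completion+0x116/0x170
[ 1560.475694]  [<ffffffff810a9510>] ? wake_up_state+0x20/0x20
[ 1560.476113]  [<ffffffffa0bbf1d4>] osc_io_setattr_end+0xc4/0x180 [osc]
[ 1560.478151]  [<ffffffffa0a0656a>] lov_io_call.isra.11+0x8a/0x140 [lov]
[ 1560.480072]  [<ffffffffa0aec7c0>] cl_setattr_ost+0x240/0x3a0 [lustre]
[ 1560.482413]  [<ffffffff811c4a83>] do_truncate+0x73/0xc0
"""

frames = parse_frames(SAMPLE)
# Keep only reliable frames that belong to a loadable module.
module_frames = [(sym, mod) for sym, mod, ok in frames if ok and mod]
print(module_frames)
```

      Filtering on the module column separates the Lustre client layers (lustre, lov, osc, obdclass) from the generic VFS frames around them.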
      

      Client run_dd_debug log:

      ++ date '+%F %H:%M:%S'
      + echoerr '2015-10-29 05:16:15: dd run starting'
      + echo '2015-10-29 05:16:15: dd run starting'
      2015-10-29 05:16:15: dd run starting
      + mkdir -p /mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com
      + /usr/bin/lfs setstripe -c -1 /mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com
      + cd /mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com
      + sync
      ++ /usr/bin/lfs df /mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com
      ++ awk '/filesystem summary:/ {print $5}'
      + FREE_SPACE=11300292
      + BLKS=1271282
      + echoerr 'Total free disk space is 11300292, 4k blocks to dd is 1271282'
      + echo 'Total free disk space is 11300292, 4k blocks to dd is 1271282'
      Total free disk space is 11300292, 4k blocks to dd is 1271282
      + load_pid=3519
      + wait 3519
      + dd bs=4k count=1271282 status=noxfer if=/dev/zero of=/mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com/dd-file
      dd: error writing ‘/mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com/dd-file’: Input/output error
      428034+0 records in
      428033+0 records out
      + '[' 1 -eq 0 ']'
      ++ date '+%F %H:%M:%S'
      + echoerr '2015-10-29 05:19:43: dd failed'
      + echo '2015-10-29 05:19:43: dd failed'
      2015-10-29 05:19:43: dd failed
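
      The figures in the log above give a rough sense of how far dd got before the I/O error. The following is a back-of-the-envelope sketch that uses only values printed in the trace. Treating the two timestamps as dd's start and end is an approximation, since the interval also covers mkdir, setstripe, sync, and lfs df.

```python
from datetime import datetime

BLOCK_SIZE = 4096            # dd bs=4k
requested_blocks = 1271282   # dd count=
written_blocks = 428033      # "records out" reported by dd

# Timestamps of "dd run starting" and "dd failed" from the log.
fmt = "%Y-%m-%d %H:%M:%S"
start = datetime.strptime("2015-10-29 05:16:15", fmt)
end = datetime.strptime("2015-10-29 05:19:43", fmt)
elapsed = (end - start).total_seconds()

written_bytes = written_blocks * BLOCK_SIZE
fraction_done = written_blocks / requested_blocks
rate_mib_s = written_bytes / elapsed / 2**20

print(f"elapsed: {elapsed:.0f} s")
print(f"written: {written_bytes / 2**30:.2f} GiB ({fraction_done:.1%} of request)")
print(f"average rate: {rate_mib_s:.1f} MiB/s")
```

      By these numbers, dd wrote about a third of the requested blocks over roughly three and a half minutes before failing.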
      

              People

              • Assignee: WC Triage (wc-triage)
              • Reporter: Maloo (maloo)