[LU-7387] Failover: recovery-random-scale test_fail_client_mds: test failed to respond and timed out Created: 04/Nov/15  Updated: 05/Aug/20  Resolved: 05/Aug/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

Server/Client: RHEL 7, master, Build #3228


Issue Links:
Related
is related to LU-5526 recovery-mds-scale test failover_mds:... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/a6ceebd6-7f0c-11e5-b444-5254006e85c2.

The sub-test test_fail_client_mds failed with the following error:

test failed to respond and timed out

Client dmesg:

[ 1560.467561] dd              D ffff88007fd13680     0  3532   2809 0x00000080
[ 1560.468969]  ffff8800788038c0 0000000000000086 ffff88007a992220 ffff880078803fd8
[ 1560.470621]  ffff880078803fd8 ffff880078803fd8 ffff88007a992220 ffff88007bdffb68
[ 1560.472358]  ffff88007bdffb70 7fffffffffffffff ffff88007a992220 ffff880078803c90
[ 1560.473646] Call Trace:
[ 1560.473824]  [<ffffffff81609839>] schedule+0x29/0x70
[ 1560.474200]  [<ffffffff81607789>] schedule_timeout+0x209/0x2d0
[ 1560.474727]  [<ffffffff812da0f0>] ? radix_tree_gang_lookup+0x90/0xc0
[ 1560.475187]  [<ffffffff81609d36>] wait_for_completion+0x116/0x170
[ 1560.475694]  [<ffffffff810a9510>] ? wake_up_state+0x20/0x20
[ 1560.476113]  [<ffffffffa0bbf1d4>] osc_io_setattr_end+0xc4/0x180 [osc]
[ 1560.476678]  [<ffffffffa0a05f60>] ? lov_io_iter_fini_wrapper+0x50/0x50 [lov]
[ 1560.477215]  [<ffffffffa06a694d>] cl_io_end+0x5d/0x150 [obdclass]
[ 1560.477691]  [<ffffffffa0a0603b>] lov_io_end_wrapper+0xdb/0xe0 [lov]
[ 1560.478151]  [<ffffffffa0a0656a>] lov_io_call.isra.11+0x8a/0x140 [lov]
[ 1560.478693]  [<ffffffffa0a06656>] lov_io_end+0x36/0xb0 [lov]
[ 1560.479115]  [<ffffffffa06a694d>] cl_io_end+0x5d/0x150 [obdclass]
[ 1560.479595]  [<ffffffffa06a9143>] cl_io_loop+0xb3/0x190 [obdclass]
[ 1560.480072]  [<ffffffffa0aec7c0>] cl_setattr_ost+0x240/0x3a0 [lustre]
[ 1560.480578]  [<ffffffffa0ac3b0c>] ll_setattr_raw+0x12ac/0x1330 [lustre]
[ 1560.481067]  [<ffffffffa0ac3bf3>] ll_setattr+0x63/0xc0 [lustre]
[ 1560.481580]  [<ffffffff811e3689>] notify_change+0x279/0x3d0
[ 1560.481959]  [<ffffffff8128a1ae>] ? process_measurement+0x8e/0x250
[ 1560.482413]  [<ffffffff811c4a83>] do_truncate+0x73/0xc0
[ 1560.482846]  [<ffffffff811d5a72>] do_last+0x5f2/0x1270
[ 1560.483224]  [<ffffffff811d67b2>] path_openat+0xc2/0x490
[ 1560.483659]  [<ffffffff8118934c>] ? mmap_region+0x1bc/0x610
[ 1560.484065]  [<ffffffff811d7f7b>] do_filp_open+0x4b/0xb0
[ 1560.484481]  [<ffffffff811e49d7>] ? __alloc_fd+0xa7/0x130
[ 1560.484847]  [<ffffffff811c5c73>] do_sys_open+0xf3/0x1f0
[ 1560.485237]  [<ffffffff811c5d8e>] SyS_open+0x1e/0x20
[ 1560.485644]  [<ffffffff81614389>] system_call_fastpath+0x16/0x1b
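
To see which Lustre layers the blocked dd task passed through, the trace above can be filtered for frames that carry a module annotation. The following is only a sketch: `lustre_frames` is a hypothetical helper name, not part of the Lustre test framework, and it assumes the frame layout shown in this dmesg excerpt. Frames prefixed with `?` are dropped because the kernel marks them as unreliable stack entries.

```shell
# lustre_frames (hypothetical helper): read a dmesg call trace on stdin and
# print "module function" for each reliable frame that belongs to a module.
lustre_frames() {
    # Drop frames marked "?", then keep only lines ending in "[module]".
    grep -v -F '] ? ' |
    sed -n 's/.*] \([A-Za-z0-9_.]*\)+0x[0-9a-f]*\/0x[0-9a-f]* \[\([a-z0-9_]*\)]$/\2 \1/p'
}

# Three frames copied from the trace above, used as sample input.
SAMPLE='[ 1560.476113]  [<ffffffffa0bbf1d4>] osc_io_setattr_end+0xc4/0x180 [osc]
[ 1560.476678]  [<ffffffffa0a05f60>] ? lov_io_iter_fini_wrapper+0x50/0x50 [lov]
[ 1560.477215]  [<ffffffffa06a694d>] cl_io_end+0x5d/0x150 [obdclass]'

FRAMES=$(printf '%s\n' "$SAMPLE" | lustre_frames)
printf '%s\n' "$FRAMES"
```

Run against the full trace, this would list the path from `ll_setattr` in `lustre` down to `osc_io_setattr_end` in `osc`, where the task is waiting in `wait_for_completion`.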

Client run_dd_debug:

++ date '+%F %H:%M:%S'
+ echoerr '2015-10-29 05:16:15: dd run starting'
+ echo '2015-10-29 05:16:15: dd run starting'
2015-10-29 05:16:15: dd run starting
+ mkdir -p /mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com
+ /usr/bin/lfs setstripe -c -1 /mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com
+ cd /mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com
+ sync
++ /usr/bin/lfs df /mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com
++ awk '/filesystem summary:/ {print $5}'
+ FREE_SPACE=11300292
+ BLKS=1271282
+ echoerr 'Total free disk space is 11300292, 4k blocks to dd is 1271282'
+ echo 'Total free disk space is 11300292, 4k blocks to dd is 1271282'
Total free disk space is 11300292, 4k blocks to dd is 1271282
+ load_pid=3519
+ wait 3519
+ dd bs=4k count=1271282 status=noxfer if=/dev/zero of=/mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com/dd-file
dd: error writing ‘/mnt/lustre/d0.dd-shadow-51vm5.shadow.whamcloud.com/dd-file’: Input/output error
428034+0 records in
428033+0 records out
+ '[' 1 -eq 0 ']'
++ date '+%F %H:%M:%S'
+ echoerr '2015-10-29 05:19:43: dd failed'
+ echo '2015-10-29 05:19:43: dd failed'
2015-10-29 05:19:43: dd failed
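
The figures in the trace above can be cross-checked with a little shell arithmetic. This is a sketch under stated assumptions: the formula `FREE_SPACE * 9 / 40 / CLIENT_COUNT` and the value `CLIENT_COUNT=2` are not shown in the trace and were chosen because they reproduce the logged block count, and `FREE_SPACE` is assumed to be in KiB.

```shell
# Values copied from the run_dd trace above.
FREE_SPACE=11300292      # from `lfs df`, assumed to be in KiB
RECORDS_OUT=428033       # full 4k records dd wrote before the I/O error

# Assumption (not shown in the trace): the clients together target 90% of
# free space, expressed in 4 KiB blocks and split evenly between them.
CLIENT_COUNT=2
BLKS=$((FREE_SPACE * 9 / 40 / CLIENT_COUNT))

# How far dd got before it failed.
WRITTEN_KIB=$((RECORDS_OUT * 4))
PERCENT_DONE=$((RECORDS_OUT * 100 / BLKS))

# Seconds between "dd run starting" (05:16:15) and "dd failed" (05:19:43);
# both timestamps fall within the same hour, so only minutes and seconds matter.
ELAPSED=$(( (19 * 60 + 43) - (16 * 60 + 15) ))

echo "blocks requested: $BLKS"
echo "written: $WRITTEN_KIB KiB ($PERCENT_DONE% of request)"
echo "elapsed: $ELAPSED s"
```

With these inputs the computed block count matches the logged `BLKS=1271282`, and dd had written about a third of the requested data (1712132 KiB) in 208 seconds when the write returned an I/O error.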
