Details
-
Bug
-
Resolution: Cannot Reproduce
-
Critical
-
None
-
Lustre 2.7.0, Lustre 2.5.3
-
Lustre Build: https://build.hpdd.intel.com/job/lustre-b2_5/80/
Distro/Arch: RHEL6.5/x86_64
Test Group: failover
FSTYPE=zfs
-
3
-
15384
Description
While running recovery-mds-scale test failover_mds (after the MDS had failed over once), the client load on one of the clients failed as follows:
2014-08-16 20:44:59: dd run starting
+ mkdir -p /mnt/lustre/d0.dd-shadow-41vm5.shadow.whamcloud.com
+ /usr/bin/lfs setstripe -c -1 /mnt/lustre/d0.dd-shadow-41vm5.shadow.whamcloud.com
+ cd /mnt/lustre/d0.dd-shadow-41vm5.shadow.whamcloud.com
+ sync
++ /usr/bin/lfs df /mnt/lustre/d0.dd-shadow-41vm5.shadow.whamcloud.com
++ awk '/filesystem summary:/ {print $5}'
+ FREE_SPACE=14195328
+ BLKS=1596974
+ echoerr 'Total free disk space is 14195328, 4k blocks to dd is 1596974'
+ echo 'Total free disk space is 14195328, 4k blocks to dd is 1596974'
Total free disk space is 14195328, 4k blocks to dd is 1596974
+ load_pid=3715
+ wait 3715
+ dd bs=4k count=1596974 status=noxfer if=/dev/zero of=/mnt/lustre/d0.dd-shadow-41vm5.shadow.whamcloud.com/dd-file
dd: writing `/mnt/lustre/d0.dd-shadow-41vm5.shadow.whamcloud.com/dd-file': No space left on device
1213957+0 records in
1213956+0 records out
+ '[' 1 -eq 0 ']'
++ date '+%F %H:%M:%S'
+ echoerr '2014-08-16 20:52:06: dd failed'
+ echo '2014-08-16 20:52:06: dd failed'
2014-08-16 20:52:06: dd failed
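To put the numbers in the trace above in proportion, the following sketch works out how much data dd asked for and how much it wrote before hitting ENOSPC. All values are copied from the trace. The one assumption is that FREE_SPACE is in 1 KiB units (the default for `lfs df`); the constant and function names are illustrative only and do not come from the test scripts.

```python
# Values copied from the client trace; names are illustrative only.
FREE_SPACE_KIB = 14195328    # assumption: `lfs df` reported 1 KiB units
BLOCK_KIB = 4                # dd bs=4k
BLOCKS_REQUESTED = 1596974   # dd count=
BLOCKS_WRITTEN = 1213956     # "records out" when dd stopped


def summarize():
    """Return requested/written sizes in KiB and as fractions of free space."""
    requested_kib = BLOCKS_REQUESTED * BLOCK_KIB
    written_kib = BLOCKS_WRITTEN * BLOCK_KIB
    return {
        "requested_kib": requested_kib,
        "written_kib": written_kib,
        "requested_fraction": requested_kib / FREE_SPACE_KIB,
        "written_fraction": written_kib / FREE_SPACE_KIB,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Under that assumption, dd requested roughly 45% of the reported free space and failed after writing roughly 34% of it, so the write ran out of space well before reaching the amount the load script had budgeted.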
Console log on the client:
LustreError: 3715:0:(vvp_io.c:1081:vvp_io_commit_write()) Write page 1213956 of inode ffff88007ccedb78 failed -28
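The `-28` in the console message is a negated Linux errno value, matching the "No space left on device" error that dd reported. A small sketch for decoding such return codes follows; the helper name `describe_kernel_rc` is hypothetical, and the message text returned by `os.strerror` depends on the platform.

```python
import errno
import os


def describe_kernel_rc(rc: int) -> str:
    """Map a (possibly negated) errno value to its symbolic name and message."""
    code = abs(rc)  # kernel code conventionally returns -errno
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # -28 is the value from the vvp_io_commit_write() console message.
    print(describe_kernel_rc(-28))
```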
Maloo report: https://testing.hpdd.intel.com/test_sets/acafc288-26a6-11e4-84f2-5254006e85c2
This failure was previously reported in LU-3326. However, Lustre b2_5 build #80 already contained the patch http://review.whamcloud.com/11425, so a further fix is needed to resolve the failure.
Attachments
Issue Links
- is related to
-
LU-6493 sanity test 42b failure on sync
- Open
-
LU-4846 Failover test failure on test suite replay-single test_26: No space left
- Resolved
-
LU-6200 Failover recovery-mds-scale test_failover_ost: test_failover_ost returned 1
- Resolved
-
LU-7309 replay-single test_70b: no space left on device
- Resolved
-
LU-7387 Failover: recovery-random-scale test_fail_client_mds: test failed to respond and timed out
- Resolved
- is related to
-
LU-3326 recovery-mds-scale test_failover_ost: tar: Cannot open: No space left on device
- Resolved