[LU-4348] Failure on test suite sanity test_82: no space left Created: 05/Dec/13  Updated: 19/Feb/15  Resolved: 19/Feb/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Maloo
Assignee: WC Triage
Resolution: Fixed
Votes: 0
Labels: zfs
Environment:

lustre-master build #1791 RHEL6 zfs


Issue Links:
Related
Severity: 3
Rank (Obsolete): 11913

Description

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/20748e4e-5c66-11e3-9d08-52540035b04c.

The sub-test test_82 failed with the following error:

test_82 failed with 61

test log

== sanity test 82: Basic grouplock test ================================= 19:16:14 (1386040574)
dd: writing `/mnt/lustre/f.sanity.82': No space left on device


Comments
Comment by Sarah Liu [ 05/Dec/13 ]

In the latest tag-2.5.52 ZFS testing, we hit many similar "No space left on device" errors. I checked the lustre-initialization_1 log and found that both the MDS size and the OST size are smaller than in the previous tag-2.5.51.

In tag 2.5.52, the lustre-initialization_1 autotest log shows:
https://maloo.whamcloud.com/test_logs/7442b9b4-5c6c-11e3-9d08-52540035b04c/show_text

02:17:39:export MDSSIZE=1939865
02:17:40:export MGSSIZE=1939865
02:17:41:export MDSFSTYPE=zfs
02:17:42:export MGSFSTYPE=zfs
02:17:42:export MGSNID=`h2tcp client-26vm7`
02:17:43:export ost_HOST=client-26vm8
02:17:44:export ost1_HOST=client-26vm8
02:17:44:export OSTDEV1=/dev/lvm-Role_OSS/P1
02:17:44:export ost2_HOST=client-26vm8
02:17:45:export OSTDEV2=/dev/lvm-Role_OSS/P2
02:17:45:export ost3_HOST=client-26vm8
02:17:45:export OSTDEV3=/dev/lvm-Role_OSS/P3
02:17:45:export ost4_HOST=client-26vm8
02:17:46:export OSTDEV4=/dev/lvm-Role_OSS/P4
02:17:46:export ost5_HOST=client-26vm8
02:17:46:export OSTDEV5=/dev/lvm-Role_OSS/P5
02:17:46:export ost6_HOST=client-26vm8
02:17:46:export OSTDEV6=/dev/lvm-Role_OSS/P6
02:17:47:export ost7_HOST=client-26vm8
02:17:48:export OSTDEV7=/dev/lvm-Role_OSS/P7
02:17:48:# some setup for conf-sanity test 24a, 24b, 33a
02:17:48:export fs2mds_DEV=/dev/lvm-Role_MDS/S1
02:17:48:export fs2ost_DEV=/dev/lvm-Role_OSS/S1
02:17:48:export fs3ost_DEV=/dev/lvm-Role_OSS/S2
02:17:49:export RCLIENTS="client-26vm1"
02:17:50:export OSTCOUNT=7
02:17:50:export NETTYPE=tcp
02:17:50:export OSTSIZE=223196

In tag 2.5.51, the autotest log shows:
https://maloo.whamcloud.com/test_logs/707fd738-4bc5-11e3-a7be-52540035b04c/show_text

19:47:47:export MDSSIZE=2097152
19:47:48:export MGSSIZE=2097152
19:47:48:export MDSFSTYPE=zfs
19:47:48:export MGSFSTYPE=zfs
19:47:48:export MGSNID=`h2tcp client-13vm3`
19:47:48:export ost_HOST=client-13vm4
19:47:48:export ost1_HOST=client-13vm4
19:47:48:export OSTDEV1=/dev/lvm-Role_OSS/P1
19:47:49:export ost2_HOST=client-13vm4
19:47:49:export OSTDEV2=/dev/lvm-Role_OSS/P2
19:47:49:export ost3_HOST=client-13vm4
19:47:49:export OSTDEV3=/dev/lvm-Role_OSS/P3
19:47:49:export ost4_HOST=client-13vm4
19:47:49:export OSTDEV4=/dev/lvm-Role_OSS/P4
19:47:49:export ost5_HOST=client-13vm4
19:47:49:export OSTDEV5=/dev/lvm-Role_OSS/P5
19:47:49:export ost6_HOST=client-13vm4
19:47:49:export OSTDEV6=/dev/lvm-Role_OSS/P6
19:47:49:export ost7_HOST=client-13vm4
19:47:49:export OSTDEV7=/dev/lvm-Role_OSS/P7
19:47:49:# some setup for conf-sanity test 24a, 24b, 33a
19:47:49:export fs2mds_DEV=/dev/lvm-Role_MDS/S1
19:47:49:export fs2ost_DEV=/dev/lvm-Role_OSS/S1
19:47:49:export fs3ost_DEV=/dev/lvm-Role_OSS/S2
19:47:49:export RCLIENTS="client-13vm2"
19:47:49:export OSTCOUNT=7
19:47:49:export NETTYPE=tcp
19:47:49:export OSTSIZE=2097152
19:47:49:export OSTFSTYPE=zfs
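
To make the gap between the two configurations concrete, the following is a minimal sketch that subtracts the sizes copied from the two logs above. It assumes MDSSIZE and OSTSIZE are expressed in KB, which the logs themselves do not state; the variable names and the size_drop helper are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sizes copied from the two autotest logs above.
# ASSUMPTION: the values are in KB; the logs do not state the unit.
OSTSIZE_2_5_51=2097152
OSTSIZE_2_5_52=223196
MDSSIZE_2_5_51=2097152
MDSSIZE_2_5_52=1939865

# Hypothetical helper: print how much smaller the new size is,
# in KB and in whole MB (integer division, rounded down).
size_drop() {
    old=$1
    new=$2
    diff_kb=$((old - new))
    echo "${diff_kb} KB ($((diff_kb / 1024)) MB)"
}

echo "OST drop: $(size_drop "$OSTSIZE_2_5_51" "$OSTSIZE_2_5_52")"
echo "MDS drop: $(size_drop "$MDSSIZE_2_5_51" "$MDSSIZE_2_5_52")"
```

Under the KB assumption, the per-OST size in tag 2.5.52 is roughly a tenth of the tag 2.5.51 value, while the MDS size shrinks by a much smaller amount.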
Comment by Sarah Liu [ 09/Dec/13 ]

I also checked the MDT and OST sizes for review-zfs, and they are bigger than those in the config used for tag-2.5.52:

https://maloo.whamcloud.com/test_logs/9173c588-610c-11e3-974e-52540035b04c/show_text

Comment by Andreas Dilger [ 10/Dec/13 ]

I don't think the 70MB should make such a difference in the ZFS test results. Are these tests being run in normal review tests, or are they only run with SLOW=no?

Comment by Sarah Liu [ 11/Dec/13 ]

Most of the tests also run in the normal review tests and pass; only 4 tests are skipped in review tests.

Comment by Andreas Dilger [ 17/Dec/13 ]

Link to TEI tickets that change OSTCOUNT and OSTSIZE for ZFS tests. Hopefully TEI-1032 will resolve this problem. For debugging, Sarah will make a patch to dump space usage in the error handler (lfs df, lfs df -i, lfs find $MOUNT -size +4M, and grant usage of the client OSCs and OSTs).
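
The debugging commands listed above could be collected in a helper along these lines. This is only a sketch, not the patch that was actually written: the function name dump_space_usage is hypothetical, MOUNT defaults to the /mnt/lustre mount point seen in the test log, and the osc.*.cur_grant_bytes parameter name is an assumption that should be verified on the Lustre version under test. OST-side grant usage is left out because it has to be gathered on the OSS nodes, which depends on the test framework.

```shell
#!/bin/sh
# Hypothetical helper: dump space usage when a test fails with ENOSPC.
# MOUNT defaults to the mount point seen in the test log.
MOUNT=${MOUNT:-/mnt/lustre}

dump_space_usage() {
    echo "== block usage =="
    lfs df "$MOUNT"

    echo "== inode usage =="
    lfs df -i "$MOUNT"

    echo "== files larger than 4M =="
    lfs find "$MOUNT" -size +4M

    echo "== client OSC grant =="
    # ASSUMPTION: parameter name; check it on the version under test.
    # Quoted so the shell passes the pattern to lctl unexpanded.
    lctl get_param 'osc.*.cur_grant_bytes'
}
```

A test's error path could then call dump_space_usage before reporting the failure, so that the space state at the time of the error ends up in the test log.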

Comment by Nathaniel Clark [ 19/Feb/15 ]

This seems to be fixed. It hasn't recurred since TEI-1032 was closed (over a year ago).

Generated at Sat Feb 10 01:41:55 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.