Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.6.0, Lustre 2.4.2, Lustre 2.5.1, Lustre 2.7.0, Lustre 2.8.0
- client and server: lustre-master build #1783 RHEL6.4 ldiskfs
- 3
- 11869
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/77c48f3e-59f3-11e3-98fc-52540035b04c.
The sub-test test_69 failed with the following error:
create file after reformat
test log shows:
CMD: client-16vm3 lctl get_param -n osc.lustre-OST0000-osc-MDT0000.prealloc_last_id
 - created 10000 (time 1385749407.25 total 47.32 last 47.32)
 - created 20000 (time 1385749454.94 total 95.01 last 47.69)
 - created 30000 (time 1385749503.05 total 143.12 last 48.11)
 - created 40000 (time 1385749551.55 total 191.62 last 48.50)
open(/mnt/lustre/d0.conf-sanity/d69/f.conf-sanity.69-49787) error: File too large
total: 49787 creates in 240.39 seconds: 207.11 creates/second
stop ost1 service on client-16vm4
CMD: client-16vm4 grep -c /mnt/ost1' ' /proc/mounts
Stopping /mnt/ost1 (opts:-f) on client-16vm4
CMD: client-16vm4 umount -d -f /mnt/ost1
CMD: client-16vm4 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
CMD: client-16vm4 grep -c /mnt/ost1' ' /proc/mounts
CMD: client-16vm4 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
CMD: client-16vm4 mkfs.lustre --mgsnode=client-16vm3@tcp --fsname=lustre --ost --index=0 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat --replace /dev/lvm-Role_OSS/P1
Permanent disk data:
Target: lustre-OST0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x42 (OST update )
Persistent mount opts: errors=remount-ro
Parameters: mgsnode=10.10.4.122@tcp sys.timeout=20
device size = 2048MB
formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P1
target name lustre-OST0000
4k blocks 50000
options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize=4290772992,lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L lustre-OST0000 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize=4290772992,lazy_journal_init -F /dev/lvm-Role_OSS/P1 50000
Writing CONFIGS/mountdata
start ost1 service on client-16vm4
CMD: client-16vm4 mkdir -p /mnt/ost1
CMD: client-16vm4 test -b /dev/lvm-Role_OSS/P1
Starting ost1: /dev/lvm-Role_OSS/P1 /mnt/ost1
CMD: client-16vm4 mkdir -p /mnt/ost1; mount -t lustre /dev/lvm-Role_OSS/P1 /mnt/ost1
CMD: client-16vm4 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/bin:/bin:/sbin:/usr/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4
CMD: client-16vm4 e2label /dev/lvm-Role_OSS/P1 2>/dev/null
Started lustre-OST0000
CMD: client-16vm3 /usr/sbin/lctl get_param -n version
CMD: client-16vm3 /usr/sbin/lctl get_param -n version
CMD: client-16vm3 lctl list_param osc.lustre-OST*-osc > /dev/null 2>&1
CMD: client-16vm3 lctl get_param -n at_min
can't get osc.lustre-OST0000-osc-MDT0000.ost_server_uuid by list_param in 40 secs
Go with osc.lustre-OST0000-osc-MDT0000.ost_server_uuid directly
CMD: client-16vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/bin:/bin:/sbin:/usr/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh wait_import_state FULL osc.lustre-OST0000-osc-MDT0000.ost_server_uuid 40
client-16vm3: osc.lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec
touch: cannot touch `/mnt/lustre/d0.conf-sanity/d69/f.conf-sanity.69-last': File too large
conf-sanity test_69: @@@@@@ FAIL: create file after reformat
Issue Links
- is duplicated by
  - LU-4490 Test failure conf-sanity test_69: createmany gets "File too large" (Resolved)
  - LU-4338 Failure on test suite conf-sanity test_69: create file after reformat (Closed)
  - LU-4339 Failure on test suite conf-sanity test_69: create file after reformat (Closed)
- is related to
  - LU-8158 conf-sanity test_69: create file after reformat (Open)
  - LU-6123 conf-sanity test_72: FAIL: mount client failed (Resolved)
- is related to
  - LU-4204 typo in new conf-sanity subtest (Resolved)
  - LU-5246 Failure on test suite sanity test_220: error: File too large (Resolved)
I also figured out where the -EFBIG (-27) vs -ENOSPC difference was coming from. The -EFBIG comes from lod_alloc_specific(), which returns an error when a specific layout is requested but there are no OST objects available. Normally that makes sense because a specific layout specifies the stripe count, and if that cannot be satisfied then the file may grow too large for the available number of stripes. In this test there is only one stripe, but the directory specifies that it must be on OST0000, so it triggers this condition.
Previously it always returned -EFBIG, but I changed it in http://review.whamcloud.com/12937 "LU-5246 tests: create OST objects on correct MDT" to return -ENOSPC in case no objects could be created at all. That patch only landed on Aug 9, so I suspect the cases of -EFBIG being returned will decline and -ENOSPC will be returned instead (as it should be).
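The error selection described in this comment can be sketched roughly as follows. This is only an illustration, not the actual lod_alloc_specific() code: the function name alloc_specific_rc() and its two parameters are hypothetical. It also assumes that a partial allocation (some objects created, but fewer than the requested stripe count) still returns -EFBIG after the patch, which the comment implies but does not state outright.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper, NOT the real lod_alloc_specific(): chooses the
 * error code to report when a specific layout asked for `requested`
 * stripes but only `allocated` OST objects could be created.
 */
int alloc_specific_rc(unsigned int allocated, unsigned int requested)
{
    assert(requested > 0);

    if (allocated >= requested)
        return 0;        /* the requested layout was fully satisfied */
    if (allocated == 0)
        return -ENOSPC;  /* no objects could be created at all */
    return -EFBIG;       /* assumed: too few stripes, file may outgrow them */
}
```

Under these assumptions, the test_69 case (one stripe pinned to OST0000, no objects available) corresponds to alloc_specific_rc(0, 1), which yields -ENOSPC rather than the older -EFBIG.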