Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Lustre 2.11.0, Lustre 2.12.0, Lustre 2.10.3, Lustre 2.10.4, Lustre 2.10.5, Lustre 2.10.6, Lustre 2.12.1, Lustre 2.12.6
-
None
-
3
Description
ost-pools tests 1n, 11, 15, 16, 19 and 22 all fail trying to create/open or write files with the following error message:
File too large
For example, from the test_log of test_1n:
== ost-pools test 1n: Pool with a 15 char pool name works well ======================================= 10:03:28 (1512554608)
CMD: trevis-8vm4 lctl pool_new lustre.testpool1234567
trevis-8vm4: Pool lustre.testpool1234567 created
CMD: trevis-8vm4 lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1234567 2>/dev/null || echo foo
CMD: trevis-8vm4 lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1234567 2>/dev/null || echo foo
CMD: trevis-8vm1.trevis.hpdd.intel.com lctl get_param -n lov.lustre-*.pools.testpool1234567 2>/dev/null || echo foo
CMD: trevis-8vm1.trevis.hpdd.intel.com lctl get_param -n lov.lustre-*.pools.testpool1234567 2>/dev/null || echo foo
CMD: trevis-8vm4 lctl pool_add lustre.testpool1234567 OST0000
trevis-8vm4: OST lustre-OST0000_UUID added to pool lustre.testpool1234567
CMD: trevis-8vm4 lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1234567 | sort -u | tr '\n' ' '
CMD: trevis-8vm4 lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1234567 | sort -u | tr '\n' ' '
CMD: trevis-8vm1.trevis.hpdd.intel.com lctl get_param -n lov.lustre-*.pools.testpool1234567 | sort -u | tr '\n' ' '
CMD: trevis-8vm1.trevis.hpdd.intel.com lctl get_param -n lov.lustre-*.pools.testpool1234567 | sort -u | tr '\n' ' '
dd: failed to open '/mnt/lustre/d1n.ost-pools/file': File too large
ost-pools test_1n: @@@@@@ FAIL: failed to write to /mnt/lustre/d1n.ost-pools/file: 1
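When triaging many of these test logs, it can help to pull out the failing path and error string automatically. The following is a minimal sketch, not part of the test framework: the function name `find_errors` and the message-to-errno table are illustrative assumptions, and the table assumes the client prints the standard glibc error text.

```python
import re

# Assumption: error text follows the standard glibc strerror() wording.
ERRNO_BY_MESSAGE = {
    "File too large": "EFBIG",
    "No space left on device": "ENOSPC",
}

# Two lines copied from the test_1n test_log quoted in this ticket.
SAMPLE_LOG = """\
dd: failed to open '/mnt/lustre/d1n.ost-pools/file': File too large
ost-pools test_1n: @@@@@@ FAIL: failed to write to /mnt/lustre/d1n.ost-pools/file: 1
"""


def find_errors(log_text):
    """Return (path, errno_name) for each "failed to <op> '<path>': <msg>" line.

    Hypothetical helper for illustration only.
    """
    results = []
    for line in log_text.splitlines():
        # Only tool output with a quoted path is matched; the framework's
        # own "@@@@@@ FAIL" summary line has no quoted path and is skipped.
        match = re.search(r"failed to \w+ '([^']+)': (.+)$", line)
        if not match:
            continue
        path = match.group(1)
        message = match.group(2).strip()
        results.append((path, ERRNO_BY_MESSAGE.get(message, "UNKNOWN")))
    return results


if __name__ == "__main__":
    for path, errno_name in find_errors(SAMPLE_LOG):
        print(f"{errno_name}: {path}")
```

Run against a full test_log, this separates the EFBIG failures tracked here from ENOSPC failures such as those in the related tickets.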
In the dmesg log for the MDS (vm4), we can see a failure:
[18753.542095] Lustre: DEBUG MARKER: == ost-pools test 1n: Pool with a 15 char pool name works well ======================================= 13:37:10 (1512567430)
[18753.714379] Lustre: DEBUG MARKER: lctl pool_new lustre.testpool1234567
[18758.015205] Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1234567 2>/dev/null || echo foo
[18758.331296] Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1234567 2>/dev/null || echo foo
[18760.686719] Lustre: DEBUG MARKER: lctl pool_add lustre.testpool1234567 OST0000
[18766.993199] Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1234567 | sort -u | tr '\n' ' '
[18767.303867] Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1234567 | sort -u | tr '\n' ' '
[18768.515291] LustreError: 3750:0:(lod_qos.c:1350:lod_alloc_specific()) can't lstripe objid [0x200029443:0xdaad:0x0]: have 1 want 7
[18768.704524] Lustre: DEBUG MARKER: /usr/sbin/lctl mark ost-pools test_1n: @@@@@@ FAIL: failed to write to \/mnt\/lustre\/d1n.ost-pools\/file: 1
[18768.896290] Lustre: DEBUG MARKER: ost-pools test_1n: @@@@@@ FAIL: failed to write to /mnt/lustre/d1n.ost-pools/file: 1
[18769.103049] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /home/autotest/autotest/logs/test_logs/2017-12-05/lustre-master-el7-x86_64--full--1_1_1__3676___6c155f47-820d-447d-893f-15b24418827f/ost-pools.test_1n.debug_log.$(hostname -s).1512567446.log; dmesg > /home/autotest/autotest/lo
Similar failures appear for the other tests. Note: the following test suite ran with 7 OSTs and 1 MDS:
https://testing.hpdd.intel.com/test_sets/fdd54642-dae4-11e7-8027-52540065bddc
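The `lod_alloc_specific()` message in the MDS dmesg log reports `have 1 want 7`. One plausible reading, which this ticket does not confirm, is that the allocator could place only 1 stripe while the layout asked for 7. That would match a pool holding the single OST0000 on a file system with 7 OSTs. A minimal sketch for extracting those numbers from dmesg output follows; the helper name `parse_stripe_shortfall` is hypothetical, and the regular expression is based only on the one message quoted here.

```python
import re

# Pattern derived from the single LustreError line quoted in this ticket;
# other Lustre versions may word the message differently.
STRIPE_ERROR = re.compile(
    r"lod_alloc_specific\(\)\) can't lstripe objid (?P<fid>\[[^\]]+\]): "
    r"have (?P<have>\d+) want (?P<want>\d+)"
)

# Line copied from the MDS dmesg log quoted in this ticket.
SAMPLE_DMESG = (
    "[18768.515291] LustreError: 3750:0:(lod_qos.c:1350:lod_alloc_specific()) "
    "can't lstripe objid [0x200029443:0xdaad:0x0]: have 1 want 7"
)


def parse_stripe_shortfall(dmesg_text):
    """Return one dict per allocation error with the FID and both counts.

    Hypothetical helper for illustration only.
    """
    shortfalls = []
    for match in STRIPE_ERROR.finditer(dmesg_text):
        shortfalls.append(
            {
                "fid": match.group("fid"),
                "have": int(match.group("have")),
                "want": int(match.group("want")),
            }
        )
    return shortfalls


if __name__ == "__main__":
    for entry in parse_stripe_shortfall(SAMPLE_DMESG):
        missing = entry["want"] - entry["have"]
        print(f"{entry['fid']}: short by {missing} stripe(s)")
```

Comparing `want` against the number of OSTs in the pool and in the file system is a quick way to check whether the requested stripe count came from the pool or from a wider default.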
These ost-pools tests started failing with the ‘File too large’ error on September 27, 2017 with 2.10.52.113.
Note: So far we are only seeing these failures during 'full' test sessions and not in review-* test sessions.
Logs for some of the other instances of this failure are at:
https://testing.hpdd.intel.com/test_sets/da2df238-db44-11e7-9c63-52540065bddc
https://testing.hpdd.intel.com/test_sets/4fc12420-daa0-11e7-9c63-52540065bddc
https://testing.hpdd.intel.com/test_sets/307880b4-da7c-11e7-9c63-52540065bddc
https://testing.hpdd.intel.com/test_sets/0e1cd21c-da73-11e7-8027-52540065bddc
https://testing.hpdd.intel.com/test_sets/c1f5d0c8-dadb-11e7-9c63-52540065bddc
Issue Links
- is duplicated by
-
LU-9277 ost-pools test_19: createmany /mnt/lustre/d19.ost-pools/dir1/f19.ost-pools failed!
- Resolved
- is related to
-
LU-10353 parallel-scale* tests fail with ‘No space left on device’
- Open
-
LU-10689 parallel-scale-nfsv3 test_connectathon: can't sync bigfile21829: File too large
- Open
-
LU-8264 lfs setstripe without -p pool_name doesn't inherit pool from parent/ROOT directory
- Resolved
-
LU-2113 ENOSPC sometimes incorrectly reported as file too big in lfs setstripe
- Resolved
- is related to
-
LU-10396 ost-pools test_23b: dd did not fail with ENOSPC
- Open