[LU-2253] Test failure on sanity-quota test_2: error: No space left on device Created: 31/Oct/12  Updated: 19/Dec/13  Resolved: 03/Sep/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 1.8.9, Lustre 2.4.2
Fix Version/s: Lustre 2.5.0

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: mn8, yuc2
Environment:

Test was run with SLOW=yes


Severity: 3
Rank (Obsolete): 5382

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/c15b1d7e-22eb-11e2-afb4-52540035b04c.

The sub-test test_2 failed with the following error:

test failed to respond and timed out

The test log shows a "No space left on device" error followed by a user create failure:

mknod(/mnt/lustre/d0.sanity-quota/d2/f.sanity-quota.2-01048326) error: No space left on device
total: 1048326 creates in 1873.45 seconds: 559.57 creates/second
Disk quotas for user quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre       0       0       0       - 1048326       0 1048576       -
lustre-MDT0000_UUID
                      0       -       0       - 1048326       - 1048576       -
lustre-OST0000_UUID
                      0       -       0       -       -       -       -       -
lustre-OST0001_UUID
                      0       -       0       -       -       -       -       -
lustre-OST0002_UUID
                      0       -       0       -       -       -       -       -
lustre-OST0003_UUID
                      0       -       0       -       -       -       -       -
lustre-OST0004_UUID
                      0       -       0       -       -       -       -       -
lustre-OST0005_UUID
                      0       -       0       -       -       -       -       -
lustre-OST0006_UUID
                      0       -       0       -       -       -       -       -
Files for user (quota_usr):
  File: `/mnt/lustre/d0.sanity-quota/d2/f.sanity-quota.2-0977084'
  Size: 0         	Blocks: 0          IO Block: 2097152 regular empty file
Device: 2c54f966h/743766374d	Inode: 144115205373225167  Links: 1
Access: (0444/-r--r--r--)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
Access: 2012-10-30 04:15:05.000000000 -0700
Modify: 2012-10-30 04:15:05.000000000 -0700
Change: 2012-10-30 04:15:05.000000000 -0700
  File: `/mnt/lustre/d0.sanity-quota/d2/f.sanity-quota.2-0164550'
  Size: 0         	Blocks: 0          IO Block: 2097152 regular empty file
Device: 2c54f966h/743766374d	Inode: 144115205272535769  Links: 1
Access: (0444/-r--r--r--)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
Access: 2012-10-30 03:49:44.000000000 -0700
Modify: 2012-10-30 03:49:44.000000000 -0700
Change: 2012-10-30 03:49:44.000000000 -0700
  File: `/mnt/lustre/d0.sanity-quota/d2/f.sanity-quota.2-0461849'
  Size: 0         	Blocks: 0          IO Block: 2097152 regular empty file
Device: 2c54f966h/743766374d	Inode: 144115205306125356  Links: 1
Access: (0444/-r--r--r--)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
Access: 2012-10-30 03:56:08.000000000 -0700
Modify: 2012-10-30 03:56:08.000000000 -0700
Change: 2012-10-30 03:56:08.000000000 -0700
  File: `/mnt/lustre/d0.sanity-quota/d2/f.sanity-quota.2-0602708'
  Size: 0         	Blocks: 0          IO Block: 2097152 regular empty file
Device: 2c54f966h/743766374d	Inode: 144115205322912359  Links: 1
Access: (0444/-r--r--r--)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
Access: 2012-10-30 04:00:27.000000000 -0700
Modify: 2012-10-30 04:00:27.000000000 -0700
Change: 2012-10-30 04:00:27.000000000 -0700
 sanity-quota test_2: @@@@@@ FAIL: user create failure, but expect success 
  Trace dump:


 Comments   
Comment by Sarah Liu [ 03/Nov/12 ]

Hit the same issue on a SLES11 SP2 client:
https://maloo.whamcloud.com/test_sets/30b98780-2541-11e2-9e7c-52540035b04c

mknod(/mnt/lustre/d0.sanity-quota/d2/f.sanity-quota.2-01047886) error: No space left on device
total: 1047886 creates in 1872.26 seconds: 559.69 creates/second
Comment by Jodi Levi (Inactive) [ 06/Nov/12 ]

Niu,
Can you have a look at this one? Do you already have a patch for this?

Comment by Niu Yawei (Inactive) [ 06/Nov/12 ]

Yes, I think it has been fixed by 984f4ce51fd38caaf0bd2b706a130f7f17c51638 (which has already landed). test_2 now checks the number of free inodes and skips the test if there are not enough.
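
As a rough illustration of the kind of check described above (this is not the landed patch; the helper name is hypothetical), the skip logic could look something like the sketch below. The two numbers are borrowed from the log in the description: 1048326 is the number of creates that succeeded before ENOSPC and 1048576 is the inode limit from the quota output. Treating the former as the free inode count is an assumption made only for the example.

```shell
# Hypothetical helper (name is illustrative): returns success when the
# free inode count is below the number of inodes the test needs.
not_enough_inodes() {
	# $1 = free inodes, $2 = inodes required by the test
	[ "$1" -lt "$2" ]
}

# Illustrative values only, borrowed from the log in the description.
free_inodes=1048326
inode_limit=1048576

if not_enough_inodes "$free_inodes" "$inode_limit"; then
	echo "skip: not enough free inodes ($free_inodes < $inode_limit)"
else
	echo "run test_2"
fi
```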

Comment by Niu Yawei (Inactive) [ 06/Nov/12 ]

patch landed. http://review.whamcloud.com/4275 (see LU-2153)

Comment by Sarah Liu [ 26/Nov/12 ]

This failure is not seen in tag-2.3.56 testing

Comment by Jian Yu [ 15/Aug/13 ]

Lustre client build: http://build.whamcloud.com/job/lustre-b1_8/258/ (1.8.9-wc1)
Lustre server build: http://build.whamcloud.com/job/lustre-b2_4/31/

sanity-quota test is blocked by the issue in this ticket:

mknod(/mnt/lustre/d0.sanity-quota/d2/f2-01048289) error: No space left on device
total: 1048289 creates in 1784.48 seconds: 587.45 creates/second

https://maloo.whamcloud.com/test_sets/96c0392a-0489-11e3-90ba-52540035b04c

More instances:
https://maloo.whamcloud.com/test_sets/a59bbd40-ec68-11e2-8011-52540035b04c
https://maloo.whamcloud.com/test_sets/ae15b2c6-0236-11e3-ba9e-52540035b04c
https://maloo.whamcloud.com/test_sets/6bb6a738-c941-11e2-8cff-52540035b04c

Comment by Niu Yawei (Inactive) [ 15/Aug/13 ]

The free inode check in s-q (sanity-quota) on b1_8 is wrong:

        local FREE_INODES=$(lfs_df -i | grep "summary" | awk '{print $5}')

It should be 'print $4', not 'print $5'. We must have forgotten to change the field from 5 to 4 when 'lfs df' was replaced with 'lfs_df'.
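
To make the field shift concrete, here is a small sketch. It assumes that lfs_df rewrites the "filesystem summary:" label into a single token ("filesystem_summary:") and that the summary columns are Inodes, IUsed, IFree, IUse% and mount point. The sample line and its numbers are made up for illustration and are not real output from this run.

```shell
# Hypothetical summary line in the style of plain 'lfs df -i';
# the numbers are invented for this example.
raw_line='filesystem summary:      1000000         250      999750   0% /mnt/lustre'

# Assumption: lfs_df joins the two-word label into one token.
wrapped_line=$(printf '%s\n' "$raw_line" | sed -e 's/filesystem /filesystem_/')

# With the two-word label, the free inode count is field 5.
free_raw=$(printf '%s\n' "$raw_line" | awk '/summary/ {print $5}')

# With the one-word label, every column moves left by one, so it is field 4.
free_wrapped=$(printf '%s\n' "$wrapped_line" | awk '/summary/ {print $4}')

# Field 5 of the wrapped line is the usage percentage, not a count.
wrong=$(printf '%s\n' "$wrapped_line" | awk '/summary/ {print $5}')

echo "raw \$5: $free_raw, wrapped \$4: $free_wrapped, wrapped \$5: $wrong"
```

Under these assumptions, keeping 'print $5' after the switch to lfs_df would hand the check a percentage string instead of the free inode count.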

Comment by Niu Yawei (Inactive) [ 15/Aug/13 ]

patch for b1_8 to fix the typo in test_2 of s-q: http://review.whamcloud.com/7341

Comment by Jian Yu [ 16/Aug/13 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/28/
MDSCOUNT=4

The same failure also occurred under DNE configuration:
https://maloo.whamcloud.com/test_sets/b33e632c-027e-11e3-b384-52540035b04c

Comment by Niu Yawei (Inactive) [ 19/Aug/13 ]

In DNE, we usually want the free inodes on MDT0 rather than on the whole filesystem: http://review.whamcloud.com/7375
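
A sketch of why the distinction matters, assuming an 'lfs df -i' style layout with one line per target and a summary line. This is not the code from the patch above; the sample output, its column layout and all numbers are invented for illustration.

```shell
# Hypothetical output for a filesystem with two MDTs; all numbers are
# made up and the column layout is an assumption.
sample_output='UUID                      Inodes       IUsed       IFree IUse% Mounted on
lustre-MDT0000_UUID       1000000         250      999750    0% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID       1000000         200      999800    0% /mnt/lustre[MDT:1]
filesystem_summary:       2000000         450     1999550    0% /mnt/lustre'

# Free inodes reported for the whole filesystem (summary line, field 4).
fs_free=$(printf '%s\n' "$sample_output" | awk '/summary/ {print $4}')

# Free inodes on MDT0 only (per-target line, field 4).
mdt0_free=$(printf '%s\n' "$sample_output" | awk '/MDT0000_UUID/ {print $4}')

echo "filesystem: $fs_free, MDT0: $mdt0_free"
```

In this made-up example the filesystem-wide figure is roughly twice what MDT0 alone can hold, so a check based on the summary line could pass even when MDT0 does not have enough free inodes for the test.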

Comment by Niu Yawei (Inactive) [ 03/Sep/13 ]

patches landed on master & b1_8.

Comment by Jian Yu [ 19/Dec/13 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/69/ (2.4.2 RC1)
MDSCOUNT=4

The same failure occurred:
https://maloo.whamcloud.com/test_sets/a1a2c1d0-6877-11e3-a9a3-52540035b04c
