[LU-16467] lod_trans_space_check() fails with -28 during file unlink

Details

    • Bug
    • Resolution: Unresolved
    • Major

    Description

      We have a situation with striped directories and full MDTs in which file unlink fails with -ENOSPC.

      # lfs df
      UUID                   1K-blocks        Used   Available Use% Mounted on
      ai200x-MDT0000_UUID    139539628   133404116     3719624  98% /lustre/ai200x/client[MDT:0]
      ai200x-MDT0001_UUID    139539628   123157904    13966032  90% /lustre/ai200x/client[MDT:1]
      ai200x-MDT0002_UUID    139539628   137217096           0 100% /lustre/ai200x/client[MDT:2]
      ai200x-MDT0003_UUID    139539628   124383568    12740368  91% /lustre/ai200x/client[MDT:3]
      # rm -rf 542162
      rm: cannot remove '542162': No space left on device
      [root@ai200x-001 blogbench]# lfs getdirstripe 542162
      lmv_stripe_count: 3 lmv_stripe_offset: 1 lmv_hash_type: fnv_1a_64
      mdtidx		 FID[seq:oid:ver]
           1		 [0x240002b16:0x11cec:0x0]
           2		 [0x2c0001bba:0x11cec:0x0]
           3		 [0x280000c3e:0x86b8:0x0]
      
      [43349.098414] LustreError: 470595:0:(file.c:249:ll_close_inode_openhandle()) ai200x-clilmv-ff47cef76496c800: inode [0x240001b78:0x8a9c:0x0] mdc close failed: rc = -28
      

      The file itself is located on MDT0001, which has space, but the unlink operation goes through OSP, where osp_statfs() reports no available space on MDT0002 and the space check then fails:

      osp_statfs()) ai200x-MDT0002-osp-MDT0001: 34884907 blocks, 598078 free, 0 avail
      ...
      lod_trans_space_check()) ai200x-MDT0002-osp-MDT0001: fail - target state 220: rc = -28
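The counters in the osp_statfs() line can be put into the same units as the lfs df output. The sketch below is only arithmetic on the logged values. It assumes a 4 KiB block size, which is an inference (34884907 blocks times 4 equals the 139539628 1K-blocks shown by lfs df) and is not stated anywhere in the logs. It shows that MDT0002 still has unallocated ("free") blocks even though "avail" is 0.

```python
# Values copied from the osp_statfs() log line for ai200x-MDT0002-osp-MDT0001.
BLOCKS, FREE, AVAIL = 34884907, 598078, 0

# ASSUMPTION: 4 KiB blocks. Inferred only because BLOCKS * 4 matches the
# 139539628 1K-blocks that `lfs df` reports for each MDT.
BLOCK_KIB = 4


def to_kib(nblocks: int) -> int:
    """Convert a block count to KiB under the assumed block size."""
    return nblocks * BLOCK_KIB


total_kib = to_kib(BLOCKS)
free_kib = to_kib(FREE)
avail_kib = to_kib(AVAIL)
free_pct = 100.0 * FREE / BLOCKS

print(f"total={total_kib} KiB free={free_kib} KiB avail={avail_kib} KiB")
print(f"free is {free_pct:.2f}% of the target, avail is 0")
```

Under that assumption roughly 2.3 GiB is still free on MDT0002, so a check based on "free" rather than "avail" would not see this target as empty.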
      
      

      As a result, the file is not removed because MDT0002 is full. This appears to be a consequence of the LU-14179 patch; one possible solution is to skip lod_trans_space_check() for unlink operations.

          Activity

            adilger Andreas Dilger added a comment:

            I think the straightforward solution here is for lod_trans_space_check() to skip the space check for unlink and rmdir operations, or at least to check "free" instead of "avail" space for those operation types.
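The two options discussed in this ticket can be compared with a toy model. This is not the Lustre code: the real lod_trans_space_check() is C, and its inputs and exact conditions are not shown in this ticket. The names Op, TargetStatfs, check_strict, check_skip_removals and check_free_for_removals are hypothetical and exist only for this illustration. The only real data is the MDT0002 statfs line from the description.

```python
import errno
from dataclasses import dataclass
from enum import Enum, auto


class Op(Enum):
    """Hypothetical operation types, for this sketch only."""
    CREATE = auto()
    UNLINK = auto()
    RMDIR = auto()


@dataclass
class TargetStatfs:
    """Hypothetical holder for the counters logged by osp_statfs()."""
    name: str
    blocks: int
    free: int
    avail: int


REMOVALS = {Op.UNLINK, Op.RMDIR}


def check_strict(op, targets):
    """Model of the reported behavior: any target with 0 avail fails."""
    if any(t.avail == 0 for t in targets):
        return -errno.ENOSPC
    return 0


def check_skip_removals(op, targets):
    """Option 1: skip the space check for unlink and rmdir."""
    if op in REMOVALS:
        return 0
    return check_strict(op, targets)


def check_free_for_removals(op, targets):
    """Option 2: for unlink and rmdir, look at 'free' instead of 'avail'."""
    if op in REMOVALS:
        return -errno.ENOSPC if any(t.free == 0 for t in targets) else 0
    return check_strict(op, targets)


# Counters taken from the osp_statfs() log line in the description.
mdt2 = TargetStatfs("ai200x-MDT0002-osp-MDT0001", 34884907, 598078, 0)

for check in (check_strict, check_skip_removals, check_free_for_removals):
    print(check.__name__, check(Op.UNLINK, [mdt2]), check(Op.CREATE, [mdt2]))
```

With these counters, both options let the unlink proceed while a create is still refused. The second option would still refuse removals on a target with no free blocks at all, which is the difference between the two proposals in this model.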

            tappro Mikhail Pershin added a comment:

            I've just added a Lustre debug log collected around the rm operation:

            # rm /lustre/ai200x/client/blogbench/542327/blog-8/article-7.xml

            and the 'blog-8' info, just in case:
            # lfs getdirstripe /lustre/ai200x/client/blogbench/542327/blog-8
            lmv_stripe_count: 4 lmv_stripe_offset: 0 lmv_hash_type: fnv_1a_64
            mdtidx		 FID[seq:oid:ver]
                 0		 [0x200004a50:0x12072:0x0]
                 1		 [0x240002b21:0x12072:0x0]
                 2		 [0x2c0001bc2:0x839d:0x0]
                 3		 [0x280000c45:0x11af:0x0]

            People

              laisiyao Lai Siyao
              tappro Mikhail Pershin