Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.10.3
    • None
    • zfs 0.7.5, OPA, skylake, centos7
    • 3

    Description

      Hiya,

      we have 2 MDS's with 1 MDT on one of them and 2 MDTs on the other. so 3 MDT's in total. each MDT consists of 2 hardware raid1 mirrors with zmirror putting those together into one zfs MDT in one zpool. so 4-way replication.

      latest centos7.4 kernels 3.10.0-693.17.1.el7.x86_64 everywhere. nopti set on lustre servers. 8 OSS's if that matters. multipath on all lustre servers. purely software raidz3 on OSS's.

      we are testing DNE2 with 3-way dir striping, and also with default inheritance to all sub-dirs.
      the below test fails and seems repeatable.

      # lfs setdirstripe -c 3 mdt0-2
      # lfs setdirstripe -D -c 3 mdt0-2
      # chown rhumble mdt0-2
      [rhumble@farnarkle2 ~]$ for f in /dagg/old_stuff/rjh/mdtest/mdt*; do echo === $f === ; time ( cd $f ; for g in {0000..9999}; do mkdir $g; for h in {00..99}; do mkdir $g/$h; done; done ) ; time rm -rf $f/*; done
      ...
      === /dagg/old_stuff/rjh/mdtest/mdt0-2 ===
      
      real    57m21.053s
      user    8m36.378s
      sys     18m25.963s
      rm: cannot remove ‘/dagg/old_stuff/rjh/mdtest/mdt0-2/2556’: Directory not empty
      
      real    72m52.257s
      user    0m4.197s
      sys     7m59.024s
      ...
      
      [rhumble@farnarkle2 ~]$ ls -al /dagg/old_stuff/rjh/mdtest/mdt0-2/2556
      total 894
      drwxrwxr-x 3 rhumble hpcadmin  76800 Feb 16 03:33 .
      drwxr-xr-x 3 rhumble hpcadmin 838656 Feb 16 15:46 ..
      [rhumble@farnarkle2 ~]$ rmdir /dagg/old_stuff/rjh/mdtest/mdt0-2/2556
      rmdir: failed to remove ‘/dagg/old_stuff/rjh/mdtest/mdt0-2/2556’: Directory not empty
      [rhumble@farnarkle2 ~]$ rm -rf /dagg/old_stuff/rjh/mdtest/mdt0-2/2556
      rm: cannot remove ‘/dagg/old_stuff/rjh/mdtest/mdt0-2/2556’: Directory not empty
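
A shorter variant of the test above could be scripted along these lines. This is only a sketch: the function names `make_tree` and `list_leftovers` and the directory counts are illustrative and do not come from the ticket, and it assumes GNU `seq` and `find`. The `lfs setdirstripe` calls are the ones shown above and are left as comments so the helpers can be tried on any filesystem first.

```shell
#!/bin/bash
# Sketch of a smaller reproducer. Function names and counts are hypothetical.

# Create <outer> directories under <root>, each holding <inner> sub-directories.
make_tree() {
    local root=$1 outer=$2 inner=$3 g h
    for g in $(seq -f '%04g' 0 $((outer - 1))); do
        mkdir "$root/$g" || return 1
        for h in $(seq -f '%02g' 0 $((inner - 1))); do
            mkdir "$root/$g/$h" || return 1
        done
    done
}

# Remove everything under <root>, then print any top-level entry that survived.
list_leftovers() {
    local root=$1
    rm -rf "${root:?}"/* 2>/dev/null
    find "$root" -mindepth 1 -maxdepth 1
}

# On the Lustre client, as root, before the test (commands from the ticket):
#   lfs setdirstripe -c 3 mdt0-2
#   lfs setdirstripe -D -c 3 mdt0-2
# Then, as the test user:
#   make_tree mdt0-2 100 100 && list_leftovers mdt0-2
```

Any path printed by `list_leftovers` is a directory that `rm -rf` could not remove, which is the symptom reported here. Whether a smaller tree still triggers the problem is not known.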
      

      there aren't any problems seen with the other 4 dirs tested.
      the other 4 dirs are mdt0, mdt1, mdt2, which have dir striping set to only that mdt and no default (-D) set, plus a directory with 3-way dir striping and no default (-D) set. ie.

      [root@farnarkle1 ~]# lfs getdirstripe /dagg/old_stuff/rjh/mdtest/mdt0
      lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none
      
      [root@farnarkle1 ~]# lfs getdirstripe /dagg/old_stuff/rjh/mdtest/mdt1
      lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none
      
      [root@farnarkle1 ~]# lfs getdirstripe /dagg/old_stuff/rjh/mdtest/mdt2
      lmv_stripe_count: 0 lmv_stripe_offset: 2 lmv_hash_type: none
      
      [root@farnarkle1 ~]# lfs getdirstripe /dagg/old_stuff/rjh/mdtest/mdt0-2
      lmv_stripe_count: 3 lmv_stripe_offset: 0 lmv_hash_type: fnv_1a_64
      mdtidx           FID[seq:oid:ver]
           0           [0x20000b7bd:0x4a1b:0x0]
           1           [0x28001639c:0x4a58:0x0]
           2           [0x680016b6b:0x4a58:0x0]
      
      [root@farnarkle1 ~]# lfs getdirstripe /dagg/old_stuff/rjh/mdtest/mdt0-2-no-inherit
      lmv_stripe_count: 3 lmv_stripe_offset: 0 lmv_hash_type: fnv_1a_64
      mdtidx           FID[seq:oid:ver]
           0           [0x20000bfa7:0xa63a:0x0]
           1           [0x2800182f7:0xa69f:0x0]
           2           [0x680018abd:0xa697:0x0]
      
      [root@farnarkle1 ~]# lfs getdirstripe -D /dagg/old_stuff/rjh/mdtest/mdt0
      lmv_stripe_count: 0 lmv_stripe_offset: -1 lmv_hash_type: none
      
      [root@farnarkle1 ~]# lfs getdirstripe -D /dagg/old_stuff/rjh/mdtest/mdt1
      lmv_stripe_count: 0 lmv_stripe_offset: -1 lmv_hash_type: none
      
      [root@farnarkle1 ~]# lfs getdirstripe -D /dagg/old_stuff/rjh/mdtest/mdt2
      lmv_stripe_count: 0 lmv_stripe_offset: -1 lmv_hash_type: none
      
      [root@farnarkle1 ~]# lfs getdirstripe -D /dagg/old_stuff/rjh/mdtest/mdt0-2
      lmv_stripe_count: 3 lmv_stripe_offset: -1 lmv_hash_type: fnv_1a_64
      
      [root@farnarkle1 ~]# lfs getdirstripe -D /dagg/old_stuff/rjh/mdtest/mdt0-2-no-inherit/
      lmv_stripe_count: 0 lmv_stripe_offset: -1 lmv_hash_type: none
      

      the un-removable directories have only appeared on the 3-way -D directory, so I suspect the bug is to do with DNE2 and the -D inheritance stuff in particular.

      I also re-ran the test with all 3 MDT's on one MDS, and the same thing happened - one directory was un-removable by any means.

      there's nothing in dmesg or syslog.

      cheers,
      robin

      Attachments

        1. 10677-rm58-warble1.log
          39.61 MB
        2. 10677-rm58-warble2.log
          17.12 MB
        3. 10677-warble1.log
          21.52 MB
        4. 10677-warble2.log
          2.83 MB

          Activity

            [LU-10677] can't delete directory
            laisiyao Lai Siyao added a comment -

            I'm working on a patch to improve lfsck for this, and will provide it later.

            pjones Peter Jones added a comment -

            Robin

            There is a Whamcloud ftp site that you could upload large files to. I can give you the details directly (i.e. via email) if you wish to do this.

            Peter

            scadmin SC Admin added a comment -

            Hi,

            sorry. yes, I should have got back to this, but it's been a pretty low priority for us. and high risk as it turns out 'cos the MDS crashed whilst running the lfsck (see LU-11111)

            we did get as far as this though ->

            # lctl get_param -n mdd.dagg-MDT000*.lfsck_namespace | grep striped_shards_skipped
            striped_shards_skipped: 3
            striped_shards_skipped: 4
            striped_shards_skipped: 4
            

            which perhaps is enough to help.

            would you like me to upload the debug_file.txt.gz to somewhere, or grep for something? it's about 300M gzip'd.

            cheers,
            robin

            pjones Peter Jones added a comment -

            scadmin when do you expect to be able to run this test?


            yong.fan nasf (Inactive) added a comment -

            > just so you know, these directories that can't be deleted 'cos they have 'hidden' dirs in them are still there and behave the same after the namespace lfsck in LU-10988. so lfsck missed them somehow.
            > there are at least 9 directories like this at the moment.

            I am afraid that these 9 directories are the ones skipped by the namespace LFSCK. I would suggest enabling LFSCK debug (lctl set_param debug+=lfsck) on the MDTs, then re-running the namespace LFSCK. Please collect the debug logs on the MDTs; they will tell us which directories are skipped and why. Please use a large debug buffer and mask unnecessary debug components to avoid debug buffer overflow.
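
The collection steps suggested above might be scripted roughly as follows on each MDS. This is a sketch under assumptions: the 1024 MB buffer size, the output path, and the helper names `run` and `collect_lfsck_debug` are illustrative, and narrowing the debug mask further is left out. With `DRY_RUN` set, the commands are only printed, which allows a review before anything touches the MDTs.

```shell
#!/bin/bash
# Sketch: collect LFSCK debug logs on an MDS. Helper names, buffer size and
# paths are hypothetical; only the lctl sub-commands are real.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

collect_lfsck_debug() {
    local mdt=$1 out=$2 buf_mb=${3:-1024}
    run lctl set_param debug_mb="$buf_mb"        # enlarge the kernel debug buffer
    run lctl set_param debug+=lfsck              # add the lfsck debug flag
    run lctl clear                               # start from an empty buffer
    run lctl lfsck_start -M "$mdt" -t namespace  # start the namespace LFSCK
    # The scan runs in the background; wait until
    #   lctl get_param -n mdd.$mdt.lfsck_namespace
    # no longer reports a scanning status before dumping the log.
    run lctl dk "$out"                           # dump the debug buffer to a file
}

# Example (prints the commands only):
#   DRY_RUN=1; collect_lfsck_debug dagg-MDT0000 /tmp/lfsck-MDT0000.log
```

Dumping the buffer only once at the end risks losing early messages if the buffer wraps, so a long scan may need periodic dumps instead.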
            scadmin SC Admin added a comment -

            just so you know, these directories that can't be deleted 'cos they have 'hidden' dirs in them are still there and behave the same after the namespace lfsck in LU-10988. so lfsck missed them somehow.

            there are at least 9 directories like this at the moment.

            cheers,
            robin

            yong.fan nasf (Inactive) added a comment - - edited

            Hi Robin,

            As you can see, we already have three patches for LU-10887: one for the LFSCK trouble (https://review.whamcloud.com/31915), another for the object leak (https://review.whamcloud.com/31929), and a third (https://review.whamcloud.com/29228) for showing mount options. Please keep that ticket updated if you need more help.

            scadmin SC Admin added a comment -

            Hi Fan Yong,

            just to let you know I've put this on the backburner and won't be doing any more on this bug until we have resolved most of LU-10887 and have repaired the current fs errors. basically I'm reluctant to make things worse by adding more errors to the fs at the moment.

            it'll be interesting to see if a successful lfsck pass can repair the current 93410/5 59234/5 46886/5 24538/9 'hidden' directories.

            cheers,
            robin

            scadmin SC Admin added a comment -

            ok. I understand now. thanks.
            I'll try to find a shorter reproducer and run the 'lctl debug_daemon' with '-1' whilst I'm running the 'rm -rf' parts.

            cheers,
            robin


            yong.fan nasf (Inactive) added a comment -

            What I need is the Lustre kernel debug logs from the moment the rmdir of a striped directory (such as the "58") fails. It is difficult to know in advance which striped directory will be the one.
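
Because the failing directory cannot be predicted, one option is to keep the debug daemon recording for the whole removal. The sketch below assumes it is run on a node where `lctl` is available and that the debug mask is fully enabled with `debug=-1`; the helper names, file paths and the 4096 MB size limit are illustrative only. With `DRY_RUN` set, the commands are printed rather than executed.

```shell
#!/bin/bash
# Sketch: record Lustre debug logs while removing a directory tree.
# Helper names, paths and sizes are hypothetical.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

capture_rm_debug() {
    local target=$1 raw=$2 size_mb=${3:-4096}
    run lctl set_param debug=-1                    # enable all debug flags
    run lctl debug_daemon start "$raw" "$size_mb"  # stream the log to a file
    run rm -rf "$target"
    local rc=$?                                    # exit status of the rm
    run lctl debug_daemon stop
    run lctl debug_file "$raw" "$raw.txt"          # convert binary log to text
    return $rc
}

# Example (prints the commands only):
#   DRY_RUN=1; capture_rm_debug /dagg/old_stuff/rjh/mdtest/mdt0-2 /tmp/rm-debug.bin
```

A non-zero return from `capture_rm_debug` indicates that `rm -rf` failed, so the converted text log can then be searched for the name of the directory that was left behind.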

            People

              laisiyao Lai Siyao
              scadmin SC Admin
              Votes:
              0
              Watchers:
              10