rarely directories got created with the wrong permission

    Description

      We've had two instances of this, with the latest being:

      [root@ec01 ~]# ls -ld /exafs/home/sihara/mdtest.out/test-dir.0-0/mdtest_tree.488.0/file.mdtest.488.30670
      ---------- 1 sihara users 0 Jul 16 10:10 /exafs/home/sihara/mdtest.out/test-dir.0-0/mdtest_tree.488.0/file.mdtest.488.30670
      
      [root@ec01 ~]# lfs getdirstripe /exafs/home/sihara/mdtest.out/test-dir.0-0/mdtest_tree.488.0
      lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none
      [root@ec01 ~]# lfs getdirstripe -D /exafs/home/sihara/mdtest.out/test-dir.0-0/mdtest_tree.488.0
      lmv_stripe_count: 1 lmv_stripe_offset: -1 lmv_hash_type: none lmv_max_inherit: -1 lmv_max_inherit_rr: 5
      [root@ec01 ~]# lfs getstripe -m /exafs/home/sihara/mdtest.out/test-dir.0-0/mdtest_tree.488.0/file.mdtest.488.30670 
      1 

      The user/group ownership is fine, but the permissions are supposed to be 755.
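To check whether other entries in the same tree are affected, a scan for entries whose permission bits are all zero can help. The following is a minimal sketch, not part of the original report; the function name `find_zero_permission_entries` is hypothetical, and the scan may need to run as root so that it can descend into directories that themselves have no permissions.

```python
import os
import stat


def find_zero_permission_entries(root):
    """Yield paths under root whose mode bits (permissions plus
    setuid/setgid/sticky) are all zero, as in the listing above."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat() so that symlinks are inspected, not followed.
            mode = os.lstat(path).st_mode
            if stat.S_IMODE(mode) == 0:
                yield path


if __name__ == "__main__":
    import sys

    # Usage (path is an example): python scan.py /exafs/home/sihara/mdtest.out
    for bad_path in find_zero_permission_entries(sys.argv[1]):
        print(bad_path)
```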

        Activity

          [LU-16056] rarely directories got created with the wrong permission

          gerrit Gerrit Updater added a comment -

          "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48414
          Subject: LU-16056 libcfs: restore umask handling in kernel threads
          Project: fs/lustre-release
          Branch: b2_15
          Current Patch Set: 1
          Commit: ac44a6c05afd67d9006c93132c354aef994c4383

          pjones Peter Jones added a comment -

          Landed for 2.16


          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/48233/
          Subject: LU-16056 libcfs: restore umask handling in kernel threads
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: c92bdd97d99cc755a187987f3d5963adeb3ea475

          gerrit Gerrit Updater added a comment -

          "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48233
          Subject: LU-16056 libcfs: restore umask handling in kernel threads
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 06fe5984b7ebfe0153155546995576ea35f7ef21

          sihara Shuichi Ihara added a comment -

          The latest patch worked and dumped a debug log when a file with no permission bits was created.

          [root@ai400x2-1-vm1 ~]# lctl debug_file /scratch/log/lustre-log.1659427201.8549 | grep '##'
          00000004:00020000:19.0:1659427201.471371:0:8549:0:(mdt_handler.c:877:mdt_pack_attr2body()) ##### [0x200000483:0xfd91:0x0] with mode 32768 uid 8888 gid 100 valid 1100000080002f8f

          [root@ec01 ~]# lfs fid2path /exafs "[0x200000483:0xfd91:0x0]"
          /exafs/home/sihara/mdtest.out/test-dir.0-0/mdtest_tree.162.0/file.mdtest.162.13744

          [root@ec01 ~]# ls -l /exafs/home/sihara/mdtest.out/test-dir.0-0/mdtest_tree.162.0/file.mdtest.162.13744
          ---------- 1 sihara users 0 Aug  2 17:00 /exafs/home/sihara/mdtest.out/test-dir.0-0/mdtest_tree.162.0/file.mdtest.162.13744
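The logged value `mode 32768` is decimal. Decoding it shows why `ls` prints `----------`: 32768 is octal 0100000, which is the regular-file type bit with every permission bit cleared. The sketch below is an illustration added for readers, not taken from the patch; the helper name `describe_mode` is hypothetical.

```python
import stat


def describe_mode(st_mode):
    """Split a raw st_mode value into file type, permission bits, and ls-style text."""
    return {
        "is_regular_file": stat.S_ISREG(st_mode),
        "permission_bits": stat.S_IMODE(st_mode),
        "ls_style": stat.filemode(st_mode),
    }


# Value copied from the mdt_pack_attr2body() debug line above.
logged_mode = 32768
info = describe_mode(logged_mode)
print(oct(logged_mode), info)

# For comparison, a regular file with the 755 permissions mentioned in the description.
expected = describe_mode(stat.S_IFREG | 0o755)
print(expected["ls_style"])
```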

          adilger Andreas Dilger added a comment -

          If there isn't enough logging enabled for this test, could you push a patch on master that modifies the test, in case it is hit again in the future? Maybe include your extra CDEBUG changes?

          laisiyao Lai Siyao added a comment -

          I checked the logs, but strangely didn't find any messages about these two files. Also, this test has never failed in Maloo testing.


          adilger Andreas Dilger added a comment -

          Lai, I noticed a sanity test_103b failure on the Gerrit Janitor testing of an unrelated patch that might be the same as this issue:
          https://testing.whamcloud.com/gerrit-janitor/24464/testresults/sanity2-ldiskfs-DNE-centos7_x86_64-centos7_x86_64-retry2/

          == sanity test 103b: umask lfs setstripe ================= 18:43:45 (1659134625)
           sanity test_103b: @@@@@@ FAIL: lfs setstripe /mnt/lustre/f103b.sanity.s0074 '0000' != '0074' 
           sanity test_103b: @@@@@@ FAIL: lfs setstripe -N2 /mnt/lustre/f103b.sanity.m0061 '0000' != '0061' 

          It looks like some kind of race condition/bug with umask handling, and it has full debug logs on both the client and server, so this might be easier to investigate than the huge mdtest reproducer.
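For reference, the general POSIX rule at issue is that a newly created file gets `requested_mode & ~umask`. The sketch below demonstrates that rule on a local filesystem without default ACLs; it is not how test_103b is implemented. The helper name `create_with_umask` is hypothetical, and the umask values 0074 and 0061 are taken from the file name suffixes in the output above on the assumption that they encode the umask in effect.

```python
import os
import stat
import tempfile


def create_with_umask(path, requested_mode, umask):
    """Create path with requested_mode under the given umask and
    return the permission bits the file actually ended up with."""
    old_umask = os.umask(umask)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, requested_mode)
        os.close(fd)
    finally:
        # Always restore the process umask, even if creation fails.
        os.umask(old_umask)
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    for mask in (0o074, 0o061):
        path = os.path.join(tmp, "f.%04o" % mask)
        actual = create_with_umask(path, 0o666, mask)
        # A correct umask implementation gives 0666 & ~mask.
        print("umask %04o -> mode %04o (expected %04o)" % (mask, actual, 0o666 & ~mask))
```

A file that comes back with mode 0000 regardless of the requested mode would be consistent with the wrong umask being applied somewhere in the creation path.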

          gerrit Gerrit Updater added a comment -

          "Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48071
          Subject: LU-16056 mdt: debug patch for wrong dir permission
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 7291ac8308437f11fffb4ef7b2a7dded2c59c0fd

          People

            laisiyao Lai Siyao
            laisiyao Lai Siyao