Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • None
    • Alma 8 Lustre 2.15.4 client
      Centos 7 Lustre 2.12.9 server
    • 3

    Description

      Environment:

      • No DNE
      • No DOM
      • Opencache disabled

      Reproducer:

      On the first client:

      [root@client1 test_bug_2.12-2.15]# mkdir foo
      [root@client1 test_bug_2.12-2.15]# for i in 1 2 3 4 ; do dd if=/dev/zero bs=1k count=1k of=foo/$i  ; done
      [root@client1 test_bug_2.12-2.15]# cd foo
      [root@client1 foo]# tail -f 2
      ... 
      

      On the second client:

      [root@client2 test_bug_2.12-2.15]# mv foo/ foo.new ; mkdir foo ; mv foo.new foo/foo
      [root@client2 test_bug_2.12-2.15]# find
      .
      ./foo
      ./foo/foo
      ./foo/foo/4
      ./foo/foo/1
      ./foo/foo/2
      ./foo/foo/3 
      

      Back on the first one:

      ^C
      [root@client1 foo]# ls
      1  2  3  4
      [root@client1 foo]# cd ..
      [root@client1 test_bug_2.12-2.15]# ls foo/
      1  2  3  4 
      

      Here, we should have:

      [root@client1 test_bug_2.12-2.15]# ls foo/
      foo
      

      Then, once a file is touched through the stale directory path, the directory entry is resynchronized:

      [root@client1 test_bug_2.12-2.15]# touch foo/2
      [root@client1 test_bug_2.12-2.15]# ls foo/
      foo
      [root@client1 test_bug_2.12-2.15]# ls foo/foo/
      1  2  3  4
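
The rename sequence from the reproducer can be wrapped in a small check. This is a minimal sketch under stated assumptions: `rename_and_list` is a hypothetical helper name, and the script runs everything on one node in a scratch directory, so on a local filesystem it only demonstrates the expected listing. To exercise the bug, the `mv`/`mkdir`/`mv` steps would have to run on a second Lustre client while the first client keeps `foo/2` open, as in the report.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre or its test suite): runs the rename
# sequence from the report inside WORKDIR and prints what "ls foo/" shows.
# Assumption: single node, local filesystem, so this shows the EXPECTED result.
rename_and_list() {
    workdir=$1
    (
        cd "$workdir" || exit 1
        mkdir foo
        for i in 1 2 3 4; do
            dd if=/dev/zero bs=1k count=1k of="foo/$i" 2>/dev/null
        done
        # Steps performed on the second client in the report.
        mv foo foo.new && mkdir foo && mv foo.new foo/foo
        ls foo/
    )
}

scratch=$(mktemp -d)
listing=$(rename_and_list "$scratch")
if [ "$listing" = "foo" ]; then
    echo "OK: foo/ contains only foo"
else
    echo "STALE: foo/ shows: $listing"
fi
rm -rf "$scratch"
```

On an affected client, the same comparison against `ls foo/` would be expected to take the STALE branch until the dentry is refreshed.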
      

          Activity

            [LU-17583] Unclean direntry after rename

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/54354/
            Subject: LU-17583 llite: getattr/open should not revalidate dentry
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 32582842cae452984f74e76a4eb69379cc48ce5f

            gerrit Gerrit Updater added the comment above.

            It still needs to be confirmed whether patch https://review.whamcloud.com/32157 boosts stat performance on a single client.

            50 clients (VMs), 20 cores per node
            TCP network
            mpirun --hostfile hfile -np 1000 ./src/mdtest -d /mnt/cache/ -v -F -n 200 -w 1k -C -T -r -i3

            It has not been determined whether the 1.2M ops/sec stat rate is due to the 50 clients or not.

            Here are my test results with and without https://review.whamcloud.com/#/c/fs/lustre-release/+/54354/ on a single client.

            Server: Rockylinux8.8
            Client: Rockylinux8.9 (4.18.0-513.18.1.el8_9.x86_64)
            

            master (commit: 424b9ccb00)

            $ mpirun -np 24 --allow-run-as-root /work/tools/bin/mdtest -n 50000 -d /lustre/mdtest.out -F -C -T -v -w 32k
            
            SUMMARY rate: (of 1 iterations)
               Operation                      Max            Min           Mean        Std Dev
               ---------                      ---            ---           ----        -------
               File creation             :       5696.717       5696.717       5696.717          0.000
               File stat                 :     321496.308     321496.163     321496.240          0.030
               File read                 :          0.000          0.000          0.000          0.000
               File removal              :          0.000          0.000          0.000          0.000
               Tree creation             :        139.307        139.307        139.307          0.000
               Tree removal              :          0.000          0.000          0.000          0.000
            V-1: Entering PrintTimestamp...
            

            master (commit: 424b9ccb00) + patch https://review.whamcloud.com/#/c/fs/lustre-release/+/54354/

            $ mpirun -np 24 --allow-run-as-root /work/tools/bin/mdtest -n 50000 -d /lustre/mdtest.out -F -C -T -v -w 32k
            
            SUMMARY rate: (of 1 iterations)
               Operation                      Max            Min           Mean        Std Dev
               ---------                      ---            ---           ----        -------
               File creation             :       5713.001       5713.001       5713.001          0.000
               File stat                 :     327804.444     327804.270     327804.367          0.050
               File read                 :          0.000          0.000          0.000          0.000
               File removal              :          0.000          0.000          0.000          0.000
               Tree creation             :         41.311         41.311         41.311          0.000
               Tree removal              :          0.000          0.000          0.000          0.000
            V-1: Entering PrintTimestamp...
            

            I didn't see any obvious regression with the patch in this configuration.
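
To put a number on the comparison, the relative change between two mdtest rates can be computed with a short helper. This is only a sketch: `pct_change` is a hypothetical name, and the inputs are the mean "File stat" rates copied from the two single-client runs above.

```shell
#!/bin/sh
# pct_change OLD NEW: prints the relative change from OLD to NEW in percent.
# Hypothetical helper, not part of mdtest.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", (new - old) / old * 100 }'
}

# Mean "File stat" rate (ops/sec): master vs. master + patch 54354.
pct_change 321496.240 327804.367    # prints 1.96
```

Each run reports a single iteration, so a difference of this size says little on its own; the same helper can be applied to any other row of the tables.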

            sihara Shuichi Ihara added the comment above.

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/55295
            Subject: LU-17583 llite: getattr/open should not revalidate dentry
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: db81162b82315dc15926a02f4cb9f165c75709e3

            gerrit Gerrit Updater added the comment above.

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54854
            Subject: LU-17583 mdt: don't fetch LOOKUP lock for getattr/open by fid
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ac6daa3a6fd8ddc87af2111c6258812ca85ab91b

            gerrit Gerrit Updater added the comment above.

            eaujames Etienne Aujames added a comment (edited):

            I benched the 2 patches and the master branch:

            Environment
            The test is similar to the one used for "LU-10948 llite: Revalidate dentries in ll_intent_file_open" (https://review.whamcloud.com/32157):

            • 50 clients (VMs), 20 cores per node
            • TCP network
            • mpirun --hostfile hfile -np 1000 ./src/mdtest -d /mnt/cache/ -v -F -n 200 -w 1k -C -T -r -i3

            Test results

            baseline (master branch: 2.15.61_225_gbb6a2d2)
            -------
            SUMMARY rate (in ops/sec): (of 3 iterations)
               Operation                     Max            Min           Mean        Std Dev
               ---------                     ---            ---           ----        -------
               File creation                 973.425        970.273        971.558          1.655
               File stat                 1289674.130    1275339.298    1280954.250       7655.213
               File read                       0.000          0.000          0.000          0.000
               File removal                 1597.736       1586.937       1593.411          5.711
               Tree creation                 360.366         25.710        243.243        188.575
               Tree removal                   54.075         45.861         51.261          4.678
            
            no revalidate patch
            -------------------
            SUMMARY rate (in ops/sec): (of 3 iterations)
               Operation                     Max            Min           Mean        Std Dev
               ---------                     ---            ---           ----        -------
               File creation                 964.990        964.175        964.627          0.415
               File stat                 1290872.830    1277123.666    1282763.829       7199.393
               File read                       0.000          0.000          0.000          0.000
               File removal                 1586.650       1579.566       1583.856          3.772
               Tree creation                 357.906        127.050        276.751        129.798
               Tree removal                   54.996         52.189         54.031          1.596
            
            revalidate by name patch
            -----------------------
            SUMMARY rate (in ops/sec): (of 3 iterations)
               Operation                     Max            Min           Mean        Std Dev
               ---------                     ---            ---           ----        -------
               File creation                 965.137        963.721        964.344          0.723
               File stat                 1288246.143    1275527.402    1280273.795       6945.983
               File read                       0.000          0.000          0.000          0.000
               File removal                 1586.897       1579.279       1582.628          3.892
               Tree creation                 321.945         29.644        206.441        155.491
               Tree removal                   56.468         41.925         48.071          7.529
            

            So, no major performance regression was detected with or without the open/getattr dentry revalidation.
            If there is no objection, I will abandon "LU-17583 llite: getattr/open should revalidate dentry by name" (https://review.whamcloud.com/54607), since that patch does not improve performance significantly.
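
One crude way to read the three tables is to check whether a patched mean falls within one baseline standard deviation of the baseline mean. The sketch below assumes that criterion (it is not something the benchmark itself reports), and `within_noise` is a hypothetical helper name. The values are the "File stat" rows from the tables above.

```shell
#!/bin/sh
# within_noise BASE_MEAN BASE_SD NEW_MEAN
# Prints "within" if |NEW_MEAN - BASE_MEAN| <= BASE_SD, otherwise "outside".
# Hypothetical helper; one standard deviation over 3 iterations is a rough bar.
within_noise() {
    awk -v m="$1" -v sd="$2" -v n="$3" 'BEGIN {
        d = n - m
        if (d < 0) d = -d
        print ((d <= sd) ? "within" : "outside")
    }'
}

# File stat: baseline vs. "no revalidate" patch.
within_noise 1280954.250 7655.213 1282763.829    # prints within
# File stat: baseline vs. "revalidate by name" patch.
within_noise 1280954.250 7655.213 1280273.795    # prints within
```

With only three iterations per configuration this is an indication rather than a statistical test.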


            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54607
            Subject: LU-17583 llite: getattr/open should revalidate dentry by name
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 9e69ad31f0116a6eb4089a631a164ee45cd8c0e7

            gerrit Gerrit Updater added the comment above.

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54354
            Subject: LU-17583 llite: ll_getattr should not revalidate dentry
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 37a19ec08985248bb2a8727d7e122b068d7e2ca6

            gerrit Gerrit Updater added the comment above.

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54352
            Subject: LU-17583 tests: bug reproducer
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: e6891440f7989fc4b2f666dd236647c865577f60

            gerrit Gerrit Updater added the comment above.

            eaujames Etienne Aujames added a comment (edited):

            I have reproduced the issue with a master branch client (RHEL 8.9) and a 2.15.4 server.

            Reverting commit 92fadf9 "LU-15200 llite: revalidate dentry if LOOKUP lock fetched" (https://review.whamcloud.com/45599) seems to fix the issue.

            int ll_revalidate_it_finish()
            ...
                    if (bits & MDS_INODELOCK_LOOKUP) {
                            if (!ll_d_setup(de, true))
                                    RETURN(-ENOMEM);
                            d_lustre_revalidate(de);                    <-------------
            

            ll_getattr checks the entry by FID; it does not resolve it by name, so it cannot revalidate the dentry.
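
The distinction between checking an object by identity and resolving it by name can be illustrated on any local filesystem. The sketch below is an analogy only: a local inode number stands in for a Lustre FID, and nothing here touches Lustre code. After the rename sequence from the reproducer, the file is still found by its identity, but the old name no longer leads to it, so a successful identity check says nothing about whether a cached name is still valid.

```shell
#!/bin/sh
# Analogy only: a local inode number plays the role of a Lustre FID.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir foo && : > foo/2
ino=$(ls -i foo/2 | awk '{print $1}')    # identity of the file

# Same rename sequence as in the reproducer.
mv foo foo.new && mkdir foo && mv foo.new foo/foo

# Lookup by identity still succeeds: the object exists somewhere in the tree.
by_id=$(find . -inum "$ino")

# Lookup by the old name fails: the name-to-object mapping has changed.
if [ -e foo/2 ]; then by_name=present; else by_name=missing; fi

echo "by identity: $by_id"
echo "by old name: $by_name"
cd / && rm -rf "$scratch"
```

On a local filesystem this prints `./foo/foo/2` for the identity lookup and `missing` for the old name.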


            People

              eaujames Etienne Aujames
              eaujames Etienne Aujames