rename stress test leads to REMOTE_PARENT_DIR corruption

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.16.0

    Description

      A stress test with active renaming of files and directories inside a striped directory finished with an error:

      Performing actions for 'rename(2,676pp674,aaaaaaaaaa,bbbbbbbbbb,cccccccccc,dddddddddd,eeeeeeeeee,ffffffffff,gggggggggg,hhhhhhhhhh,iiiiiiiiii,jjjjjjjjjj,kkkkkkkkkk,llllllllll,mmmmmmmmmm,nnnnnnnnnn,oooooo -> 3,676pp674,aaaaaaaaaa,bbbbbbbbbb,cccccccccc,dddddddddd,eeeeeeeeee,ffffffffff,gggggggggg,hhhhhhhhhh,i) error Input/output error (5)'
      numerrs=1
      Fri Jun  3 01:06:30 CDT 2022 

      This leads to a corrupted MDT partition. E2fsck reports the following symptoms:

      1) an entry with an incorrect filetype

      2) an entry that is a link to a directory

      3) '..' points to REMOTE_PARENT_DIR but should point to the special directory with the sequence name

      ./data.20220423/server/e2fsck.pre_read_only.kjcf04n03.6.1-010.43.20220504165818.out.kjcf04n03:Entry '14,37275pp37273,aaaaaaaaaa,bbbbbbbbbb,cccccccccc,dddddddddd,eeeeeeeeee,ffffffffff,gggggggggg,hhh' in /REMOTE_PARENT_DIR/0x24006497a:0x129bc:0x0 (4014289161) has an incorrect filetype (was 17, should be 2).
      
      ./data.20220423/server/e2fsck.pre_read_only.kjcf04n03.6.1-010.43.20220504165818.out.kjcf04n03:Entry '0x24006b05d:0xa4c8:0x0' in /REMOTE_PARENT_DIR (4030089985) is a link to directory /REMOTE_PARENT_DIR/0x24006497a:0x129bc:0x0/14,37275pp37273,aaaaaaaaaa,bbbbbbbbbb,cccccccccc,dddddddddd,eeeeeeeeee,ffffffffff,gggggggggg,hhh (4014289162 fid=[0x20006a4b1:0x42c6:0x0]).
      
      ./data.20220423/server/e2fsck.pre_read_only.kjcf04n03.6.1-010.43.20220504165818.out.kjcf04n03:'..' in /REMOTE_PARENT_DIR/0x24006497a:0x129bc:0x0/14,37275pp37273,aaaaaaaaaa,bbbbbbbbbb,cccccccccc,dddddddddd,eeeeeeeeee,ffffffffff,gggggggggg,hhh (4014289162) is /REMOTE_PARENT_DIR (4030089985), should be /REMOTE_PARENT_DIR/0x24006497a:0x129bc:0x0 (4014289161).  
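
When triaging a large e2fsck log, it can help to bucket lines by the three symptoms listed above. The following is a minimal sketch, not part of the original report: the regular expressions are inferred only from the three sample lines quoted here, and the names `SYMPTOMS`, `classify`, and `summarize` are hypothetical. Other e2fsck versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns inferred from the sample lines in this ticket (an assumption,
# not an exhaustive description of e2fsck output).
SYMPTOMS = [
    ("incorrect_filetype",
     re.compile(r"has an incorrect filetype \(was \d+, should be \d+\)")),
    ("link_to_directory",
     re.compile(r"is a link to directory ")),
    ("bad_dotdot",
     re.compile(r"'\.\.' in .* is \S+ \(\d+\), should be \S+ \(\d+\)")),
]


def classify(line):
    """Return the symptom name for one e2fsck line, or None if no pattern matches."""
    for name, pattern in SYMPTOMS:
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines fall under each known symptom."""
    counts = Counter()
    for line in lines:
        name = classify(line)
        if name is not None:
            counts[name] += 1
    return counts


# Shortened sample lines; the entry name 'NAME' is a placeholder.
SAMPLE = [
    "Entry 'NAME' in /REMOTE_PARENT_DIR/0x24006497a:0x129bc:0x0 (4014289161) "
    "has an incorrect filetype (was 17, should be 2).",
    "Entry '0x24006b05d:0xa4c8:0x0' in /REMOTE_PARENT_DIR (4030089985) "
    "is a link to directory /REMOTE_PARENT_DIR/0x24006497a:0x129bc:0x0/NAME "
    "(4014289162 fid=[0x20006a4b1:0x42c6:0x0]).",
    "'..' in /REMOTE_PARENT_DIR/0x24006497a:0x129bc:0x0/NAME (4014289162) "
    "is /REMOTE_PARENT_DIR (4030089985), "
    "should be /REMOTE_PARENT_DIR/0x24006497a:0x129bc:0x0 (4014289161).",
]

if __name__ == "__main__":
    for name, count in sorted(summarize(SAMPLE).items()):
        print(name, count)
```

Because `classify` uses `search`, it also works on lines that carry a log-file prefix, as the lines quoted above do.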

       

          Activity

            adilger Andreas Dilger added a comment - Patch was landed to master in 2.15.51.

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57815
            Subject: LU-15913 mdt: disable parallel rename for striped dirs
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 67c9076e4aed3b20f1bae576de4c9f3f9619a130

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/55143/
            Subject: LU-15913 tests: fix set_params_xxx
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 214432e2595715458102365870e4573eb672439c

            "Sergey Cheremencev <scherementsev@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/55143
            Subject: LU-15913 tests: fix set_params_xxx
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: bcf4095b63d7e46ab017afaaae0bfb7a5e51112c

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/47643/
            Subject: LU-15913 tests: add rename stress test via racer
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 9a1d68f9b8d9dc7edcbcbc9543450e046c8303b4

            "Sergey Cheremencev <scherementsev@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53981
            Subject: LU-15913 tests: clean between racer 1 and 2
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 81cb6f72de12dc6a3aca486a8585405dca755fc3

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53768
            Subject: LU-15913 tests: add rename stress test via racer (testing)
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b49b73d379b33705688a82c535b824c660c64641

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47593/
            Subject: LU-15913 mdt: disable parallel rename for striped dirs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: f238540c879dc668e18cf99cba62f117ccae64d6

            spitzcor Cory Spitz added a comment -

            > Was testing not being done with 2.15.0-RC5?
            Correct. The report was filed for the previous RC. I regret that that wasn't made clear. But, at least it has uncovered this broken rollback, which could still happen in theory for other reasons (say, an ENOMEM experienced along the way).


            bzzz Alex Zhuravlev added a comment - one option is to make the failing update (e.g. insert) the very first update in the batch so there would be nothing to rollback.
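
The ordering idea in the comment above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not Lustre code, the batch is modeled as a list of closures over a plain dict, and every name (`UpdateFailed`, `insert`, `delete`, `run_batch`) is hypothetical. It shows that when the update that can fail runs first, a failure leaves the state untouched, so no rollback is needed.

```python
class UpdateFailed(Exception):
    """Raised by an update that cannot be applied."""


def insert(name, value):
    """Build an update that fails if the name already exists (the fallible step)."""
    def op(state):
        if name in state:
            raise UpdateFailed(f"{name} already exists")
        state[name] = value
    return op


def delete(name):
    """Build an update that always succeeds in this toy model."""
    def op(state):
        state.pop(name, None)
    return op


def run_batch(state, updates):
    """Apply updates in order with NO rollback; return True if all were applied."""
    try:
        for update in updates:
            update(state)
    except UpdateFailed:
        return False
    return True


if __name__ == "__main__":
    # Fallible insert last: the delete has already happened when the insert
    # fails, so the state is left half-updated.
    late = {"a": 1, "b": 2}
    print(run_batch(late, [delete("a"), insert("b", 1)]), late)

    # Fallible insert first: the failure happens before anything changed.
    early = {"a": 1, "b": 2}
    print(run_batch(early, [insert("b", 1), delete("a")]), early)
```

This ordering only helps when a single update in the batch can fail. If several can, or if a later step can hit something like ENOMEM, a working rollback is still required.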

            People

              adilger Andreas Dilger
              artem_blagodarenko Artem Blagodarenko (Inactive)