[LU-11643] create disk images for Lustre 2.10 and 2.12 for ldiskfs

Details

    • Type: Question/Request
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.15.0
    • Affects Version/s: Lustre 2.12.0

    Description

      We need to create disk images for Lustre 2.10.0 and 2.12.0 for conf-sanity test_32 upgrade regression testing, similar to the existing lustre/tests/disk*.tar.bz2 files.

      These new test filesystems should include files using the new features for those releases, and tests that verify they are working properly:

      • PFL file layouts
      • project quotas
      • FLR mirrored files
      • DoM files on the MDTs

      Also, some of the files need to be striped over multiple OSTs (e.g. "lfs setstripe -c 2"), but store data on only a single object (i.e. size = 4KB). This relates to an LFSCK issue that I saw with filter_fid and would be good to test.
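      A minimal sketch of how such files could be created on a client before the filesystem is imaged. The mount point, directory names, file names, and project ID are hypothetical, and this is not the actual image-creation code. By default the commands are only printed; set RUN to an empty string to execute them on a real Lustre client.

```shell
# Dry-run by default: print each command instead of executing it.
RUN=${RUN-echo}
# Hypothetical client mount point.
DIR=${DIR:-/mnt/lustre}

populate_image() {
    $RUN mkdir -p "$DIR/pfl_dir" "$DIR/flr_dir" "$DIR/dom_dir" "$DIR/project_quota_dir"
    # PFL: first 1 MiB on one stripe, the rest of the file on two stripes
    $RUN lfs setstripe -E 1M -c 1 -E -1 -c 2 "$DIR/pfl_dir/pfl_file"
    # FLR: file with two mirrors
    $RUN lfs mirror create -N2 "$DIR/flr_dir/flr_file"
    # DoM: first 1 MiB stored on the MDT, the rest on one OST stripe
    $RUN lfs setstripe -E 1M -L mdt -E -1 -c 1 "$DIR/dom_dir/dom_file"
    # project quota: assign a (hypothetical) project ID with the inherit flag
    $RUN lfs project -p 1000 -s "$DIR/project_quota_dir"
    # striped over two OSTs, but only 4KB written, so one object holds data
    $RUN lfs setstripe -c 2 "$DIR/striped_small"
    $RUN dd if=/dev/zero of="$DIR/striped_small" bs=4k count=1
}

populate_image
```

      Keeping the commands behind a RUN prefix makes it possible to review what would be written into the image without having a Lustre client available.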

          Activity

            sarah Sarah Liu added a comment -

            Hello,
            The test_32newtarball I used for 2.12 should be in the master code.


            zam Alexander Zarochentsev added a comment -

            sarah,
            which code was used to create the disk2_12 image? I am trying to add one more upgrade test (LU-16082) but cannot recreate the image easily.

            pjones Peter Jones added a comment -

            Fixed in 2.15

            adilger Andreas Dilger added a comment - - edited

            It looks like the test should be checking both remote_dir and striped_dir_old. A few parts of the test check for dne_upgrade, but that part of the test should always run if mds2_is_available is true and the directory exists. It may be that we don't need to regenerate the images at all.

            Please make any changes starting with my patch https://review.whamcloud.com/46404 so that we keep the debug messages.
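            A minimal sketch of that idea, not the actual conf-sanity code: check_upgrade_dirs is a hypothetical helper, and passing mds2_is_available as a "true"/"false" string argument is an assumption made for illustration.

```shell
# check_upgrade_dirs MNT MDS2_IS_AVAILABLE
# Hypothetical helper: visit every multi-MDT directory that the image
# actually contains, instead of keying the check off dne_upgrade.
check_upgrade_dirs() {
    local mnt=$1
    local mds2_is_available=$2
    local dir

    # Without a second MDT there is nothing to check.
    [ "$mds2_is_available" = "true" ] || return 0

    for dir in remote_dir striped_dir_old; do
        # Skip directories that this particular image does not contain.
        [ -d "$mnt/$dir" ] || continue
        echo "checking $mnt/$dir"
    done
}
```

            For example, "check_upgrade_dirs /tmp/t32/mnt/lustre true" would report only the directories that are present in the mounted image.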

            sarah Sarah Liu added a comment - - edited

            For these two images, the sha1sums check needs to run in the "remote_dir" directory instead of ROOT or striped_dir. I think we need to set "pfl_upgrade=yes" (or one of the other variables) in the test_32x subtests that failed this part.

            if $r test -f $tmp/sha1sums; then
                    # LU-2393 - do both sorts on same node to ensure locale
                    # is identical
                    $r cat $tmp/sha1sums | sort -k 2 >$tmp/sha1sums.orig
                    if [ "$dne_upgrade" != "no" ]; then
                            pushd $tmp/mnt/lustre/striped_dir
                    elif [ "$pfl_upgrade" != "no" ] ||
                            [ "$flr_upgrade" != "no" ] ||
                            [ "$dom_new_upgrade" != "no" ] ||
                            [ "$project_quota_upgrade" != "no" ]; then
                            pushd $tmp/mnt/lustre/remote_dir
                    else
                            pushd $tmp/mnt/lustre
                    fi

            I will try locally to verify first.


            adilger Andreas Dilger added a comment -

            Looking at the Janitor testing, it also failed test_32c in the same way, which is checking the striped_dir, but the files are missing, so it may not be an image problem. What is needed at this point is to check with master whether the mounted filesystem shows the files to be present in the root directory and striped_dir, and then check whether this matches 2.12, or if there really is an upgrade problem.

            adilger Andreas Dilger added a comment -

            Yes, the patch https://review.whamcloud.com/46354 "LU-13514 tests: replace nid in conf-sanity test_32" fixed the conf-sanity test_33a failure and allowed the mdt2 image to be mounted.

            The current problem is that in test_33b the filesystem image appears to be missing files in the ROOT directory that the test expects to see when there are multiple MDTs (LU-15506). It looks like the test is failing before it checks the striped_dir:

            == checking sha1sums ==
            CMD: onyx-71vm5 cat /tmp/t32/sha1sums
            /tmp/t32/mnt/lustre
            --- /tmp/t32/sha1sums.orig	2022-02-01 23:27:02.120240417 +0000
            +++ /tmp/t32/sha1sums	2022-02-01 23:27:02.123240424 +0000
            @@ -1,10 +0,0 @@
            -59ced6686342e5fdff70a29277632622ad271168  ./init.d/functions
            -ff4f8d1bcd9ab4a9edcf77496e23963e5c6f6a2c  ./init.d/lsvcgss
            -f8f634b92b75af4112634a6f14464e562cd82454  ./init.d/lustre
            -dff7d87de75271f0714c3b82921d40c96598f67a  ./init.d/netconsole
            -21414c2b3c89f95d3eab00dafc954d3f6cf3ba9f  ./init.d/network
            -f87a11aceaf7dc0e1614ea074fda14d6896ac66f  ./init.d/README
            -92624163580750ca250a2c1cc8bd531d0609702a  ./init.d/rhnsd
            -a17ecaeb91c0218092c8b01308a132698da9b81f  ./pfl_dir/pfl_file
            -da39a3ee5e6b4b0d3255bfef95601890afd80709  ./project_quota_dir/pj_quota_file_old
            -2c72448b440f16c9fae18e287ca827c25d29a7cb  ./rc.local
            ==** find returned files **==

            My patch https://review.whamcloud.com/46404 "LU-15506 tests: improve conf-sanity test_32 messages" adds some debugging to make it clearer which phases the test is running, and where the files are missing. I haven't had time to actually mount the disk2_10 filesystem locally and check whether the files are there; so far I have been doing everything via autotest/Maloo.

            The best solution would be to fix the test32newtarball() function to properly fill the image for what the tests expect, if that is really the problem, rather than fixing the images by hand.

            It also looks like the disk2_5-ldiskfs.tar.bz2 image has a similar problem, which is causing the Janitor test failures, but it has never been tested because it wasn't included in the lustre-tests RPM due to not being added to lustre/tests/Makefile.am. I'm not sure whether it is worthwhile to fix that image at this point, or maybe it should be removed from Git entirely (though we would "lose" some test coverage in this case).
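            The failing check above can be reproduced outside the test framework with a short sketch like the following. It assumes GNU coreutils sha1sum is available; make_sha1sums and compare_sha1sums are hypothetical helper names, not functions from conf-sanity.sh. Both lists are sorted by path with the same locale, in the spirit of the LU-2393 comment in the test.

```shell
# make_sha1sums DIR
# Print "checksum  ./path" for every regular file under DIR, sorted by path.
# Prints nothing if DIR contains no regular files.
make_sha1sums() {
    (cd "$1" && find . -type f -exec sha1sum {} + | LC_ALL=C sort -k 2)
}

# compare_sha1sums ORIG DIR
# Diff a saved, sorted checksum list against the current contents of DIR.
# Returns non-zero and prints a unified diff if they differ.
compare_sha1sums() {
    local orig=$1
    local dir=$2

    make_sha1sums "$dir" | diff -u "$orig" -
}
```

            If the mounted directory contains none of the expected files, every line of the saved list shows up as removed in the diff, which is the same pattern as the "@@ -1,10 +0,0 @@" hunk in the log.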
            sarah Sarah Liu added a comment - - edited

            The test_32a hang with the new disk images is a known issue; I commented on it at https://jira.whamcloud.com/browse/LU-11643?focusedCommentId=290110&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-290110. That "mount client" code was added by LU-12846, and if I do a writeconf, the mount passes.

            pjones Peter Jones added a comment -

            Is this fixed by the landing of https://review.whamcloud.com/46354/?

            gerrit Gerrit Updater added a comment -

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46353
            Subject: LU-11643 tests: skip some conf-sanity test_32 tests
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fcac417a66e49f99522e4d124783e43bb36f793b

            People

              sarah Sarah Liu
              adilger Andreas Dilger
              Votes: 0
              Watchers: 6
