
Test failure on test suite sanity, subtest test_200c

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.4.0
    • Affects Version/s: Lustre 2.3.0, Lustre 1.8.9
    • None
    • 3
    • 4481

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/b43d3072-9ecb-11e1-b567-52540035b04c.

      The sub-test test_200c failed with the following error:

      Cannot set pool cea1 to /mnt/lustre/d200.pools/dir_tst

      I got this error when doing a rolling upgrade from 1.8.7 to 2.2.52. The MDS is upgraded to 2.2.52-RHEL6 while the OSTs and clients are 1.8.7.
      The original configuration is:
      MDS, OST: 1.8.7-RHEL5; client 1: 1.8.7-RHEL5; client 2: 1.8.7-RHEL6
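For context, the failing step belongs to the OST-pool setup that sanity test_200* exercises. A hedged dry-run sketch of those steps follows; the variable names and exact sequence are assumptions for illustration, not taken verbatim from sanity.sh:

```shell
#!/bin/sh
# Dry-run sketch of the pool operations behind the reported error.
# FSNAME/POOL/TSTDIR values are taken from the log above; the run()
# wrapper prints each command instead of executing lctl/lfs.
FSNAME=lustre
POOL=cea1
TSTDIR=/mnt/lustre/d200.pools/dir_tst

run() { echo "+ $*"; }

run lctl pool_new "${FSNAME}.${POOL}"                        # create the pool on the MGS
run lctl pool_add "${FSNAME}.${POOL}" "${FSNAME}-OST[0-1/2]" # add OST index 0 to the pool
run lfs setstripe -p "${POOL}" "${TSTDIR}"                   # associate the test directory with the pool
```

If the pool_add step receives a malformed index list, the later attempt to associate the directory with the pool is what surfaces as "Cannot set pool cea1".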

      Attachments

        Activity


          keith Keith Mannthey (Inactive) added a comment -

          B1_8 patch now landed.
          pjones Peter Jones added a comment -

          OK, dropping the priority then. The patches can still land to improve the flexibility of the test in the long term, but this is really only a problem that will crop up in testing situations, not in production.
          sarah Sarah Liu added a comment -

          Hi Peter, yes, I know how to work around this and run the test.
          pjones Peter Jones added a comment -

          Sarah

          Given that upgrade/downgrade testing is done manually, does knowing what triggers this issue allow you to set up in a way that works around it and complete the rest of the testing?

          Peter


          keith Keith Mannthey (Inactive) added a comment -

          I have given up my nodes, as the initial issue has patches pending. An official retest should be the next step.

          A patched 1.8 and master will allow the automated sanity 200 tests to run on a one-OST setup.

          keith Keith Mannthey (Inactive) added a comment -

          Master also has this test issue. I have submitted patches for both b1_8 and master for further review and test.

          b1_8:
          http://review.whamcloud.com/3731

          master:
          http://review.whamcloud.com/3730

          keith Keith Mannthey (Inactive) added a comment -

          Well, this issue is in the sanity test.

          There are 2 versions of this test: the 1.8.8 version and the master version. The main client is 1.8.8, so it uses the older test code. When it upgrades (clients upgrade last) it will use the master test code. This is very likely a 1.8.8 branch issue with sanity 200 on a 1-OST-only configuration. A full rolling upgrade test (it will take a while) will tell us if a 2.3 change is needed. Very likely running with 2 or more OSTs and 1.8 is fine with this test.

          At this point there is no indication of a problem with master, but more testing is needed.

          keith Keith Mannthey (Inactive) added a comment -

          Ok, so the root issue is this line:

           pdsh -l root -t 100 -S -w client-12vm3 '(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; sh -c "/usr/sbin/lctl' pool_add lustre.cea1 'lustre-OST[1-0/2]")'

          lustre-OST[1-0/2] is the ostname index list argument, and it should be lustre-OST[0-1/2] for the one-OST case. Sarah and I are working to find the correct part of the testing macro stack to fix.

          I have manually tested the lustre-OST[0-1/2] change and it works.

          I will update when the issue is fixed.
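The bad argument can be reproduced with a few lines of shell. A minimal sketch (the variable names are assumptions for illustration, not the actual sanity.sh macros): with OSTCOUNT=1, building the range "last-first" yields the invalid descending list seen in the failure, while "first-last" yields the valid one.

```shell
#!/bin/sh
# Sketch of how the pool_add index list goes wrong in the one-OST case.
OSTCOUNT=1        # the single-OST configuration that trips the test
FSNAME=lustre
first=0
last=$OSTCOUNT    # 1 in the one-OST case
step=2

bad="${FSNAME}-OST[${last}-${first}/${step}]"    # descending range: rejected by lctl
good="${FSNAME}-OST[${first}-${last}/${step}]"   # ascending range: valid
echo "$bad"       # lustre-OST[1-0/2]
echo "$good"      # lustre-OST[0-1/2]
```

With two or more OSTs the naive ordering happens to come out ascending, which is consistent with the observation that the test passes on multi-OST setups.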

          keith Keith Mannthey (Inactive) added a comment -

          With Chris's help I have a virtual node with persistent storage. YEA! I have just kicked off the first test run; I will update when I know more.

          keith Keith Mannthey (Inactive) added a comment -

          It appears we can add persistent storage to the VM nodes. I am working to enable this so virtual (easy to get) nodes are able to complete this test.

          People

            Assignee: keith Keith Mannthey (Inactive)
            Reporter: maloo Maloo
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved: