[LU-12415] conf-sanity test 69 fails with 'OST replacement created too many inodes; X'

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.13.0
    • ZFS
    • Severity: 3

    Description

      conf-sanity test_69 fails with 'OST replacement created too many inodes; 96444' in ZFS testing only. The client test_log shows:

      trevis-14vm12: osc.lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 11 sec
      mount lustre on /mnt/lustre.....
      Starting client: trevis-14vm9.trevis.whamcloud.com:  -o user_xattr,flock trevis-14vm12@tcp:/lustre /mnt/lustre
      CMD: trevis-14vm9.trevis.whamcloud.com mkdir -p /mnt/lustre
      CMD: trevis-14vm9.trevis.whamcloud.com mount -t lustre -o user_xattr,flock trevis-14vm12@tcp:/lustre /mnt/lustre
      On OST0, 96444 used inodes
       conf-sanity test_69: @@@@@@ FAIL: OST replacement created too many inodes; 96444 
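
When comparing several failed runs, it can help to pull the used-inode count out of each test_log automatically. The following is a minimal sketch, assuming the two log lines keep the format quoted above; the function name `summarize_test_69` and the regular expressions are illustrative only and are not part of conf-sanity.sh or the Lustre test framework.

```python
import re

# Patterns mirror the two log lines quoted in the description above.
USED_RE = re.compile(r"On OST(\d+), (\d+) used inodes")
FAIL_RE = re.compile(r"FAIL: OST replacement created too many inodes; (\d+)")


def summarize_test_69(log_text):
    """Return (ost_index, used_inodes, failed_count) from a test_log excerpt.

    Hypothetical helper: each element is None when its line is absent.
    """
    used = USED_RE.search(log_text)
    fail = FAIL_RE.search(log_text)
    ost_index = int(used.group(1)) if used else None
    used_inodes = int(used.group(2)) if used else None
    failed_count = int(fail.group(1)) if fail else None
    return ost_index, used_inodes, failed_count


# Sample text copied from the log excerpt in this ticket.
sample = (
    "On OST0, 96444 used inodes\n"
    " conf-sanity test_69: @@@@@@ FAIL: OST replacement created too many inodes; 96444\n"
)
print(summarize_test_69(sample))
```

For the excerpt in this ticket the reported used-inode count and the count in the failure message agree, which is what one would expect if the failure message simply echoes the measured value.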
      

      There are no error messages in the console logs indicating that a problem occurred.

      conf-sanity test 69 started failing with this error message on 2019-05-27 with Lustre version 2.12.53.62. Here are some links to failed conf-sanity test 69 logs:
      https://testing.whamcloud.com/test_sets/c388c3a4-8175-11e9-a028-52540065bddc
      https://testing.whamcloud.com/test_sets/d4fb934c-8412-11e9-af1f-52540065bddc
      https://testing.whamcloud.com/test_sets/7ed3d6f0-86dd-11e9-b8e0-52540065bddc
      https://testing.whamcloud.com/test_sets/93bc20f4-8b6a-11e9-9bb5-52540065bddc

          Activity

            adilger Andreas Dilger added a comment - This is the same root cause as LU-12404, namely the patch from LU-11760 allowing too many objects to be created. That is exactly what test_69 is trying to detect, but the test was skipped because it is in the SLOW group (about 10 minutes per test).

            pfarrell Patrick Farrell (Inactive) added a comment (edited) - While the test doesn't give enough output for me to be sure, I suspect the culprit is LU-12396 (for a lot of ZFS failures).

            People

              Assignee: wc-triage WC Triage
              Reporter: jamesanunez James Nunez (Inactive)