  1. Lustre
  2. LU-12757

sanity-lfsck test 36a fails with '(N) Fail to resync /mnt/lustre/d36a.sanity-lfsck/f2'


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.14.0
    • Affects Version/s: Lustre 2.13.0, Lustre 2.12.3, Lustre 2.12.4, Lustre 2.12.5
    • Severity: 3

    Description

      sanity-lfsck test_36a fails during resync in the last two of the following three calls to ‘lfs mirror resync’:

      $LFS mirror resync $DIR/$tdir/f0 ||
              error "(6) Fail to resync $DIR/$tdir/f0"
      $LFS mirror resync $DIR/$tdir/f1 ||
              error "(7) Fail to resync $DIR/$tdir/f1"
      $LFS mirror resync $DIR/$tdir/f2 ||
              error "(8) Fail to resync $DIR/$tdir/f2"
      

      It looks like this test started failing with these two errors on 07-September-2019 with Lustre master version 2.12.57.54.

      Looking at the suite_log for https://testing.whamcloud.com/test_sets/a5f2b938-d438-11e9-a2b6-52540065bddc, we see

      lfs mirror mirror: component 131075 not synced
      : No space left on device (28)
      lfs mirror mirror: component 131076 not synced
      : No space left on device (28)
      lfs mirror mirror: component 196613 not synced
      : No space left on device (28)
      lfs mirror: '/mnt/lustre/d36a.sanity-lfsck/f1' llapi_mirror_resync_many: No space left on device.
       sanity-lfsck test_36a: @@@@@@ FAIL: (7) Fail to resync /mnt/lustre/d36a.sanity-lfsck/f1 
      

      Similarly, looking at the suite_log for https://testing.whamcloud.com/test_sets/42fbb9fe-d575-11e9-9fc9-52540065bddc, we see

      lfs mirror mirror: component 131075 not synced
      : No space left on device (28)
      lfs mirror mirror: component 131076 not synced
      : No space left on device (28)
      lfs mirror mirror: component 196613 not synced
      : No space left on device (28)
      lfs mirror: '/mnt/lustre/d36a.sanity-lfsck/f2' llapi_mirror_resync_many: No space left on device.
       sanity-lfsck test_36a: @@@@@@ FAIL: (8) Fail to resync /mnt/lustre/d36a.sanity-lfsck/f2 
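
Both logs report the same three component IDs. As a reading aid, the sketch below splits each ID into a mirror ID and a per-file component number. It assumes the FLR convention that the mirror ID occupies the upper 16 bits of the component ID (mask 0x7FFF0000, shift 16); the helper name decode_comp_id is hypothetical and not part of the test suite.

```shell
#!/bin/bash
# Hypothetical helper: split an FLR component ID into mirror ID and
# component number, assuming mirror ID = (id & 0x7FFF0000) >> 16.
decode_comp_id() {
    local id=$1
    echo "mirror=$(( (id & 0x7FFF0000) >> 16 )) comp=$(( id & 0xFFFF ))"
}

# Component IDs taken from the suite_log excerpts above.
for id in 131075 131076 196613; do
    echo "$id: $(decode_comp_id "$id")"
done
```

Under that assumption, the unsynced components belong to mirrors 2 and 3 of the file, which would suggest that the first mirror is intact and that the ENOSPC is hit while writing the other replicas.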
       

      It is possible that we are running out of disk space on an OST, but it seems strange that this just started earlier this month.
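
One way to check the out-of-space theory is to look at per-OST usage at the point of failure. The sketch below filters the output of 'lfs df' for OSTs at or above a given usage percentage. The function name full_osts is hypothetical, and the parsing assumes the usual six-column 'lfs df' layout (UUID, 1K-blocks, Used, Available, Use%, Mounted on) with OST lines tagged "[OST:n]"; adjust it if the output on the test node differs.

```shell
#!/bin/bash
# Hypothetical helper: read 'lfs df' output on stdin and print the
# UUID and Use% of every OST whose usage is >= the given threshold.
# Assumes column 5 is Use% and OST lines contain "[OST:<index>]".
full_osts() {
    local min_pct=$1
    awk -v min="$min_pct" '
        /\[OST:[0-9]+\]/ {
            pct = $5
            sub(/%/, "", pct)          # strip the trailing percent sign
            if (pct + 0 >= min)
                print $1, $5
        }'
}

# Example usage on a client with the file system mounted:
#   lfs df /mnt/lustre | full_osts 90
```

If no OST is close to full when resync fails, the ENOSPC is more likely to come from object allocation or from the test setup than from genuinely exhausted disk space.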

      Logs for other failures are at
      https://testing.whamcloud.com/test_sessions/279dd05c-e122-4f8f-bafe-b8299e8e0e61
      https://testing.whamcloud.com/test_sessions/fe936f3a-df7d-4d23-9d28-721da7ab8f76


            People

              Assignee: WC Triage (wc-triage)
              Reporter: James Nunez (jamesanunez, Inactive)
