  1. Lustre
  2. LU-15553

replay-vbr test 12a fails with 'test_12a failed with 4'

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.16.0, Lustre 2.15.0, Lustre 2.15.4, Lustre 2.15.5, Lustre 2.15.6
    • 3
    • 9223372036854775807

    Description

      replay-vbr test_12a started failing with 'test_12a failed with 4' on August 4, 2021, for Lustre 2.14.53.7, with logs at https://testing.whamcloud.com/test_sets/17efe0ba-7e4a-4e7f-b7f5-02383e1314c5. We've seen this test fail on both ZFS and ldiskfs but, so far, only in DNE configurations.

      Looking at a recent failure at https://testing.whamcloud.com/test_sets/014ce4c3-c654-47f9-9333-1c58ebf545c3, the suite_log shows:

      CMD: onyx-24vm7 e2label /dev/mapper/mds1_flakey 2>/dev/null
      Started lustre-MDT0000
      CMD: onyx-55vm7.onyx.whamcloud.com unlinkmany /mnt/lustre/f12a.replay-vbr- 25
       - unlinked 0 (time 1643080125 ; total 0 ; last 0)
      total: 25 unlinks in 0 seconds: inf unlinks/second
      CMD: onyx-55vm7.onyx.whamcloud.com unlinkmany /mnt/lustre/f12a.replay-vbr-3- 25
       - unlinked 0 (time 1643080125 ; total 0 ; last 0)
      total: 25 unlinks in 0 seconds: inf unlinks/second
      CMD: onyx-55vm7.onyx.whamcloud.com checkstat -v /mnt/lustre/d12a.replay-vbr/f12a.replay-vbr
       replay-vbr test_12a: @@@@@@ FAIL: test_12a failed with 4 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:6391:error()
        = /usr/lib64/lustre/tests/test-framework.sh:6695:run_one()
      

      Looking at the code for this test,

          # All 50 files should have been replayed
          do_node $CLIENT1 unlinkmany $DIR/$tfile- 25 || return 2
          do_node $CLIENT1 unlinkmany $DIR/$tfile-3- 25 || return 3
          do_node $CLIENT1 $CHECKSTAT $DIR/$tdir/$tfile && return 4

          return 0
      }
      run_test 12a "lost data due to missed REMOTE client during replay"
      

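      Return code 4 comes from the final checkstat call. Because that line is written as `$CHECKSTAT ... && return 4`, the test returns 4 only when checkstat succeeds, i.e. when $DIR/$tdir/$tfile (here /mnt/lustre/d12a.replay-vbr/f12a.replay-vbr) is still visible after replay. Both unlinkmany calls passed, so return codes 2 and 3 were not triggered.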
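
To make the inverted check concrete, here is a minimal standalone sketch of the `cmd && return 4` pattern. It is not the test's code: `check_absent` is a hypothetical helper name, and `[ -e ... ]` stands in for `$CHECKSTAT` purely for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for the last check in test_12a.
# Assumption: [ -e PATH ] plays the role of $CHECKSTAT here.
check_absent() {
    # "cmd && return 4" fires only when cmd exits 0, i.e. the path exists.
    [ -e "$1" ] && return 4
    return 0
}

tmpdir=$(mktemp -d)
touch "$tmpdir/leftover"

rc=0; check_absent "$tmpdir/leftover" || rc=$?
echo "existing file: rc=$rc"    # prints "existing file: rc=4"

rc=0; check_absent "$tmpdir/missing" || rc=$?
echo "missing file: rc=$rc"     # prints "missing file: rc=0"

rm -rf "$tmpdir"
```

Under this reading, 'test_12a failed with 4' means the path was found, not that checkstat itself errored out.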

            People

              laisiyao Lai Siyao
              jamesanunez James Nunez (Inactive)