  1. Lustre
  2. LU-15338

sanity test_205a: No jobstats for id.205a.dd.320 found on ost1::*.lustre-OST0000.job_stats

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.15.0
    • None
    • None
    • 3

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/32ad6edf-a1b8-46ac-91ff-92eecbe0efe3

      test_205a failed with the following error:

       sanity test_205a: @@@@@@ FAIL: No jobstats for id.205a.dd.320 found on ost1::*.lustre-OST0000.job_stats 
      

      However, the dumped jobstats do contain a result for id.205a.dd.320. They also contain a result from the previous "dd" write operation, whose job ID id.205a.dd.32075 is longer but begins with the same string. The "grep -c" in the test therefore matches two job IDs instead of one, and the test fails:

      - job_id:          id.205a.dd.32075
        snapshot_time:   1615208459
        write_bytes:     { samples:           1, unit: bytes, min: 1048576, max: 1048576, sum:         1048576, sumsq:      1099511627776 }
        write:           { samples:           1, unit: usecs, min:     141, max:     141, sum:             141, sumsq:              19881 }
        punch:           { samples:           1, unit: usecs, min:      31, max:      31, sum:              31, sumsq:                961 }
        sync:            { samples:           1, unit: usecs, min:   42111, max:   42111, sum:           42111, sumsq:         1773336321 }
      - job_id:          id.205a.dd.320
        snapshot_time:   1615208461
        read_bytes:      { samples:           1, unit: bytes, min: 1048576, max: 1048576, sum:         1048576, sumsq:      1099511627776 }
        read:            { samples:           1, unit: usecs, min:      53, max:      53, sum:              53, sumsq:               2809 }
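
The double match can be reproduced outside the test framework. The following is a minimal sketch, not the code from the test script: count_substring_matches is a hypothetical helper name, the sample text is just the two job_id lines from the dump above, and it assumes the test counts matches with an unanchored "grep -c" as described.

```shell
#!/bin/sh
# Two job_id lines copied from the dumped job_stats above.
sample_stats='- job_id:          id.205a.dd.32075
- job_id:          id.205a.dd.320'

# Hypothetical helper: count the lines that contain the job ID as a plain
# substring, which is how an unanchored "grep -c" behaves.
count_substring_matches() {
    # $1 = stats text, $2 = job ID to look for
    # -F treats the job ID as a fixed string, so "." is not a wildcard.
    printf '%s\n' "$1" | grep -c -F -- "$2"
}

# The shorter ID is a prefix of the longer one, so both lines are counted.
count_substring_matches "$sample_stats" "id.205a.dd.320"      # prints 2
# The longer ID only occurs in its own line.
count_substring_matches "$sample_stats" "id.205a.dd.32075"    # prints 1
```

A test that expects exactly one match sees 2 for the shorter ID, which is consistent with the failure reported above.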
      

      I don't know the exact probability of this happening, maybe around 4/32768 = 1/8192, since any 5-digit random number for the first "dd" has 4-, 3-, 2-, and 1-digit prefixes, and if the second job ID ends in one of those prefixes, the search for it also matches the first job ID. In any case, full-string matching should fix this problem.
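
One way to get full-string matching is to compare the whole job_id value instead of searching for a substring. The sketch below only illustrates the idea and is not necessarily the fix that was landed: count_exact_job_id is a hypothetical helper name, and it assumes the job_id line layout shown in the dump above ("- job_id:" followed by the ID as the third whitespace-separated field).

```shell
#!/bin/sh
# Sample entries in the layout of the dumped job_stats above.
sample_stats='- job_id:          id.205a.dd.32075
  snapshot_time:   1615208459
- job_id:          id.205a.dd.320
  snapshot_time:   1615208461'

# Hypothetical helper: count job_id entries whose value equals the
# requested job ID exactly, instead of merely containing it.
count_exact_job_id() {
    # $1 = stats text, $2 = job ID to look for
    # Appending "" forces awk to compare the two values as strings.
    printf '%s\n' "$1" |
        awk -v id="$2" '
            $2 == "job_id:" && ($3 "") == (id "") { n++ }
            END { print n + 0 }'
}

count_exact_job_id "$sample_stats" "id.205a.dd.320"      # prints 1
count_exact_job_id "$sample_stats" "id.205a.dd.32075"    # prints 1
```

With an exact comparison, the longer ID from the earlier write no longer counts toward the shorter ID, so each job ID is found exactly once.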

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_205a - No jobstats for id.205a.dd.4 found on ost1::*.lustre-OST0000.job_stats

          People

            adilger Andreas Dilger
            maloo Maloo