  1. Lustre
  2. LU-3438

replay-ost-single test_5 failed with error in check_write_rcs(): "Unexpected # bytes transferred"

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.4.0
    • Labels: None
    • Environment: Lustre master branch
    • Severity: 3
    • Rank: 8566

    Description

      Our testing system shows that replay-ost-single test_5 failed:

      Lustre: DEBUG MARKER: == replay-ost-single test 5: Fail OST during iozone == 21:21:13 (1369851673)
      Lustre: Failing over lustre-OST0000
      LustreError: 11-0: an error occurred while communicating with 0@lo. The ost_write operation failed with -19
      LustreError: Skipped 1 previous similar message
      Lustre: lustre-OST0000-osc-ffff8800514d3400: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
      Lustre: Skipped 1 previous similar message
      Lustre: lustre-OST0000: shutting down for failover; client state will be preserved.
      Lustre: OST lustre-OST0000 has stopped.
      Lustre: server umount lustre-OST0000 complete
      LustreError: 137-5: UUID 'lustre-OST0000_UUID' is not available for connect (no target)
      LustreError: Skipped 1 previous similar message
      LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
      LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
      Lustre: 16962:0:(ldlm_lib.c:2195:target_recovery_init()) RECOVERY: service lustre-OST0000, 2 recoverable clients, last_transno 1322
      Lustre: lustre-OST0000: Now serving lustre-OST0000 on /dev/loop1 with recovery enabled
      Lustre: 2398:0:(ldlm_lib.c:1021:target_handle_connect()) lustre-OST0000: connection from lustre-MDT0000-mdtlov_UUID@0@lo recovering/t0 exp ffff88005ca19c00 cur 1369851700 last 1369851697
      Lustre: 2398:0:(ldlm_lib.c:1021:target_handle_connect()) Skipped 3 previous similar messages
      Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 2 clients reconnect
      Lustre: lustre-OST0000: Recovery over after 0:01, of 2 clients 2 recovered and 0 were evicted.
      Lustre: lustre-OST0000-osc-MDT0000: Connection restored to lustre-OST0000 (at 0@lo)
      Lustre: Skipped 1 previous similar message
      LustreError: 1716:0:(osc_request.c:1232:check_write_rcs()) Unexpected # bytes transferred: 65536 (requested 32768)
      LustreError: 1716:0:(osc_request.c:1232:check_write_rcs()) Unexpected # bytes transferred: 2097152 (requested 1048576)
      Lustre: lustre-OST0000: received MDS connection from 0@lo
      Lustre: MDS mdd_obd-lustre-MDT0000: lustre-OST0000_UUID now active, resetting orphans
      Lustre: DEBUG MARKER: iozone rc=1
      Lustre: DEBUG MARKER: replay-ost-single test_5: @@@@@@ FAIL: iozone failed
      

      These messages look related to the 4 MB I/O patch:

      LustreError: 1716:0:(osc_request.c:1232:check_write_rcs()) Unexpected # bytes transferred: 65536 (requested 32768)
      LustreError: 1716:0:(osc_request.c:1232:check_write_rcs()) Unexpected # bytes transferred: 2097152 (requested 1048576)
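
One way to quantify the mismatch in the two messages above is to extract both byte counts and compare them. The following is a minimal sketch, not part of the original report: the function name `transfer_mismatches` and the regular expression are illustrative assumptions, and the only input is the pair of log lines quoted in this ticket.

```python
import re

# The two check_write_rcs() messages quoted in this report.
LOG = """\
LustreError: 1716:0:(osc_request.c:1232:check_write_rcs()) Unexpected # bytes transferred: 65536 (requested 32768)
LustreError: 1716:0:(osc_request.c:1232:check_write_rcs()) Unexpected # bytes transferred: 2097152 (requested 1048576)
"""

# Hypothetical pattern written for this message format only.
PATTERN = re.compile(r"Unexpected # bytes transferred: (\d+) \(requested (\d+)\)")


def transfer_mismatches(log_text):
    """Return (transferred, requested, ratio) for each matching message."""
    results = []
    for match in PATTERN.finditer(log_text):
        transferred = int(match.group(1))
        requested = int(match.group(2))
        results.append((transferred, requested, transferred / requested))
    return results


for transferred, requested, ratio in transfer_mismatches(LOG):
    print(f"transferred={transferred} requested={requested} ratio={ratio:g}")
```

In both messages the transferred count is exactly twice the requested count. That is only an observation about these two log lines; it does not by itself establish which patch caused the failure.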
      

      I believe this test also fails on the master branch, but it is skipped as SLOW during testing:
      https://maloo.whamcloud.com/test_sets/dd033a98-7264-11e2-aad1-52540035b04c

      test_5	SKIP	0	0	skipping SLOW test 5
      

            People

              keith Keith Mannthey (Inactive)
              artem_blagodarenko Artem Blagodarenko (Inactive)
              Votes: 0
              Watchers: 5