Lustre / LU-18292

recovery-mds-scale test_failover_mds: dd: error writing: Cannot send after transport endpoint shutdown

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.16.0
    • 3

    Description

      This issue was created by maloo for jianyu <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/105ee112-176a-495b-bb3b-4d9ddefab646

      test_failover_mds failed with the following error:

      Found the END_RUN_FILE file: /autotest/autotest-2/2024-09-30/lustre-master_failover-part-3_4581_170_79ec5dab-e1af-4efc-8017-9c4f9dc656d5//end_run_file
      onyx-106vm10.onyx.whamcloud.com
      Client load  failed on node onyx-106vm10.onyx.whamcloud.com:
      /autotest/autotest-2/2024-09-30/lustre-master_failover-part-3_4581_170_79ec5dab-e1af-4efc-8017-9c4f9dc656d5//recovery-mds-scale.test_failover_mds.run__stdout.onyx-106vm10.onyx.whamcloud.com.log
      /autotest/autotest-2/2024-09-30/lustre-master_failover-part-3_4581_170_79ec5dab-e1af-4efc-8017-9c4f9dc656d5//recovery-mds-scale.test_failover_mds.run__debug.onyx-106vm10.onyx.whamcloud.com.log
      2024-10-01 18:28:38 Terminating clients loads ...
      Duration:               86400 seconds
      Server failover period: 1200 seconds
      Exited after:           72191 seconds
      Number of failovers before exit:
      mds1: 61 times
      ost1: 0 times
      ost2: 0 times
      ost3: 0 times
      ost4: 0 times
      ost5: 0 times
      ost6: 0 times
      ost7: 0 times
      Status: FAIL
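
As a rough sanity check, the figures in the run summary can be related to each other. The sketch below is not part of the test framework: the variable names are illustrative, and the values are copied from the log lines above. It only does arithmetic and makes no claim about how the test schedules failovers.

```python
# Values copied from the run summary above; the names are illustrative only.
planned_duration_s = 86400   # "Duration"
failover_period_s = 1200     # "Server failover period"
elapsed_s = 72191            # "Exited after"
mds1_failovers = 61          # "Number of failovers before exit", mds1

# How far the run got before the client load failure ended it.
fraction_completed = elapsed_s / planned_duration_s
remaining_s = planned_duration_s - elapsed_s

# Average wall-clock time per mds1 failover, for comparison with the period.
mean_interval_s = elapsed_s / mds1_failovers

print(f"completed {fraction_completed:.1%} of the planned run")
print(f"time left when the run stopped: {remaining_s} s")
print(f"mean time per mds1 failover: {mean_interval_s:.0f} s "
      f"(configured period: {failover_period_s} s)")
```

The run therefore survived most of its planned 24 hours, with mds1 failing over at roughly the configured period, before the client load failed.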
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-master/4581 - 5.14.0-427.31.1.el9_4.x86_64
      servers: https://build.whamcloud.com/job/lustre-master/4581 - 5.14.0-427.31.1_lustre.el9.x86_64

      run_dd_debug log on client onyx-106vm10:

      + dd bs=4k count=5378229 status=noxfer if=/dev/zero of=/mnt/lustre/d0.dd-onyx-106vm10.onyx.whamcloud.com/dd-file
      dd: error writing '/mnt/lustre/d0.dd-onyx-106vm10.onyx.whamcloud.com/dd-file': Cannot send after transport endpoint shutdown
      2725905+0 records in
      2725904+0 records out
      + '[' 1 -eq 0 ']'
      ++ date '+%F %H:%M:%S'
      + echoerr '2024-10-01 18:12:01: dd failed'
      + echo '2024-10-01 18:12:01: dd failed'
      2024-10-01 18:12:01: dd failed
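
To put the dd output in context, the following sketch converts the record counts into bytes and measures the gap between the dd failure and the "Terminating clients loads" message in the summary above. The variable names are illustrative and the values are copied from the logs. On Linux the message dd printed is the text for `ESHUTDOWN`; the string returned by `os.strerror` is platform dependent, so it is printed here rather than assumed.

```python
import errno
import os
from datetime import datetime

# Values copied from the dd command line and its output above.
block_size = 4096             # bs=4k
records_requested = 5378229   # count=5378229
records_written = 2725904     # "2725904+0 records out"

bytes_requested = records_requested * block_size
bytes_written = records_written * block_size
fraction_written = records_written / records_requested

gib = 1024 ** 3
print(f"written: {bytes_written / gib:.2f} GiB of "
      f"{bytes_requested / gib:.2f} GiB ({fraction_written:.1%})")

# Timestamps copied from the logs: dd failure, then client load termination.
fmt = "%Y-%m-%d %H:%M:%S"
dd_failed_at = datetime.strptime("2024-10-01 18:12:01", fmt)
terminated_at = datetime.strptime("2024-10-01 18:28:38", fmt)
gap_s = int((terminated_at - dd_failed_at).total_seconds())
print(f"dd failed {gap_s} s before the client loads were terminated")

# On Linux, ESHUTDOWN is the errno behind the message dd reported.
print(f"ESHUTDOWN = {errno.ESHUTDOWN}: {os.strerror(errno.ESHUTDOWN)}")
```

dd had written about half of the requested file when the write failed, and the failure was noticed by the test framework a little under 17 minutes later.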
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      recovery-mds-scale test_failover_mds - test_failover_mds returned 1

            People

              wc-triage WC Triage
              maloo Maloo