Lustre / LU-18948

recovery-small test 136 never works on a single node

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker

    Description

      git blame (commit 83ffa859bc62, Sebastien Buisson, 2019-04-16 22:32:43 +0900, lines 2900-2905):

              # fail_loc will make MDS sleep in the middle of changelog_deregister
              # take this opportunity to abruptly kill MDS
              FAILURE_MODE_save=$FAILURE_MODE
              FAILURE_MODE=HARD
              fail mds1
              FAILURE_MODE=$FAILURE_MODE_save
      

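      Forcing FAILURE_MODE=HARD makes the test try to power-cycle the node, which cannot work on a single-node setup. The run produces: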

      == recovery-small test 136: changelog_deregister leaving pending records ========================================================== 15:10:22 (1745496622)
      mdd.lustre-MDT0000.changelog_mask=ALL
      mdd.lustre-MDT0001.changelog_mask=ALL
      mdd.lustre-MDT0002.changelog_mask=ALL
      mdd.lustre-MDT0003.changelog_mask=ALL
      striped dir -i0 -c0 -H fnv_1a_64 /mnt/lustre/d136.recovery-small
       - create 7082 (time 1745496634.42 total 10.00 last 708.13)
      total: 10000 create in 14.06 seconds: 710.99 ops/second
      Changelog size 1991952
      fail_loc=0x1318
      fail_val=30
      Failing mds1,mds2,mds3,mds4,ost1,ost2,ost3,ost4,ost5,ost6,ost7,ost8 on devel1
      + powerman --off devel1
      ./../tests/test-framework.sh: line 3104: powerman: command not found
      waiting ! ping -w 3 -c 1 devel1, 5 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 4 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 3 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 2 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 1 secs left ...
      waiting for devel1 to fail attempts=3
      + powerman --off devel1
      ./../tests/test-framework.sh: line 3104: powerman: command not found
      waiting ! ping -w 3 -c 1 devel1, 5 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 4 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 3 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 2 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 1 secs left ...
      waiting for devel1 to fail attempts=3
      + powerman --off devel1
      ./../tests/test-framework.sh: line 3104: powerman: command not found
      waiting ! ping -w 3 -c 1 devel1, 5 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 4 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 3 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 2 secs left ...
      waiting ! ping -w 3 -c 1 devel1, 1 secs left ...
      waiting for devel1 to fail attempts=3
      devel1 still pingable after power down! attempts=3
      15:11:17 (1745496677) shut down
      facet: mds1 facet_host: devel1 facet_failover_host: devel1
      + powerman --on devel1
      ./../tests/test-framework.sh: line 3196: powerman: command not found
      15:11:17 (1745496677) devel1 rebooted; waithostlist: devel1
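
The log above suggests two preconditions that a HARD failover needs and that a single-node setup does not meet: the facet host must differ from the node running the test, and a power control command must be installed. A minimal sketch of such a check follows. The function name `hard_failover_possible` and the variable `POWER_TOOL` are hypothetical and are not part of test-framework.sh; this is an illustration under those assumptions, not a proposed patch.

```shell
#!/bin/sh
# Hypothetical helper (NOT part of test-framework.sh): report whether a
# HARD (power-cycle) failover of a facet host could work at all.
#   $1 = host that runs the facet (e.g. mds1's host)
#   $2 = host that runs the test script (defaults to the local short hostname)
# POWER_TOOL is an assumed variable naming the power control command.
hard_failover_possible() {
    facet_host=$1
    local_host=${2:-$(hostname -s)}

    # Powering off the node that runs the test would kill the test itself.
    if [ "$facet_host" = "$local_host" ]; then
        echo "facet host $facet_host is also the local node"
        return 1
    fi

    # The log shows "powerman: command not found"; check before relying on it.
    if ! command -v "${POWER_TOOL:-powerman}" >/dev/null 2>&1; then
        echo "power control command not found"
        return 1
    fi

    return 0
}

# Example: the situation from the log, where every facet runs on devel1.
if ! reason=$(hard_failover_possible devel1 devel1); then
    echo "HARD failover not possible: $reason"
fi
```

A test that forces FAILURE_MODE=HARD could call a check like this first and skip, or keep the configured failure mode, when it returns non-zero.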
      

          People

            Assignee: WC Triage (wc-triage)
            Reporter: Alexey Lyashkov (shadow)