[LU-4104] Failure on test suite replay-dual test_21b: Not all renames are replayed. COS=1

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.5.0, Lustre 2.7.0
    • Labels: None
    • Environment: client and server: lustre-b2_5 RHEL6 build #2
    • Severity: 3
    • Rank: 11020

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/bd37a616-3458-11e3-9356-52540035b04c.

      The sub-test test_21b failed with the following error:

      Not all renames are replayed. COS=1

      Info required for matching: replay-dual 21b

          Activity

            Hit this failure on lustre-master tag 2.6.93; results are at https://testing.hpdd.intel.com/test_sets/9c692176-adec-11e4-a0b6-5254006e85c2

            The client log shows that the unlink failed:

            == replay-dual test 21b: commit on sharing, two clients == 08:06:34 (1423065994)
            Stopping clients: c11,c12,c13 /lustre/scratch2 (opts:)
            Stopping client c11 /lustre/scratch2 opts:
            Stopping client c12 /lustre/scratch2 opts:
            Starting client c11,c12,c13:  -o user_xattr,flock mds01@o2ib:/scratch /lustre/scratch
            Started clients c11,c12,c13: 
            mds01@o2ib:/scratch on /lustre/scratch type lustre (rw,user_xattr,flock)
            mds01@o2ib:/scratch on /lustre/scratch type lustre (rw,user_xattr,flock)
            mds01@o2ib:/scratch on /lustre/scratch type lustre (rw,user_xattr,flock)
            mdt.scratch-MDT0000.commit_on_sharing=1
            Replay barrier on scratch-MDT0000
            Stopping clients: c11 /lustre/scratch (opts:-f)
            Stopping client c11 /lustre/scratch opts:-f
            Failing mds1 on mds01
            Stopping /lustre/scratch/mdt0 (opts:) on mds01
            pdsh@c13: mds01: ssh exited with exit code 1
            reboot facets: mds1
            Failover mds1 to mds01
            08:06:55 (1423066015) waiting for mds01 network 900 secs ...
            08:06:55 (1423066015) network interface is UP
            mount facets: mds1
            Starting mds1:   /dev/lvm-sdc/MDT0 /lustre/scratch/mdt0
            Started scratch-MDT0000
            UNLINK /lustre/scratch/f21b.replay-dual-3
            unlink: cannot unlink `/lustre/scratch/f21b.replay-dual-3': Input/output error
            unlink f21b.replay-dual-3 fail!
            Starting client c11:  -o user_xattr,flock mds01@o2ib:/scratch /lustre/scratch
            Started clients c11: 
            mds01@o2ib:/scratch on /lustre/scratch type lustre (rw,user_xattr,flock)
             replay-dual test_21b: @@@@@@ FAIL: Not all renames are replayed. COS=1 
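
            When triaging repeated hits of this failure, it can help to pull the failed unlink lines out of a client log automatically. The following is a minimal sketch, not part of the Lustre test framework: the helper name `failed_unlinks` and the regular expression are assumptions based only on the message format shown in the log above.

```python
import re

# Hypothetical helper for triage; the pattern is an assumption derived from
# the "unlink: cannot unlink `<path>': <reason>" line in the log above.
UNLINK_ERR = re.compile(
    r"unlink: cannot unlink [`'](?P<path>[^']+)': (?P<reason>.+)$"
)


def failed_unlinks(log_text):
    """Return (path, reason) pairs for every failed unlink in a client log."""
    hits = []
    for line in log_text.splitlines():
        match = UNLINK_ERR.search(line.strip())
        if match:
            hits.append((match.group("path"), match.group("reason")))
    return hits


# Excerpt copied from the client log in this comment.
sample = """\
Started scratch-MDT0000
UNLINK /lustre/scratch/f21b.replay-dual-3
unlink: cannot unlink `/lustre/scratch/f21b.replay-dual-3': Input/output error
unlink f21b.replay-dual-3 fail!
"""

for path, reason in failed_unlinks(sample):
    print(f"{path}: {reason}")
```

            Run against the excerpt, the sketch reports the single failed unlink of f21b.replay-dual-3 with the reason "Input/output error".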
              
            Comment above added by jamesanunez James Nunez (Inactive).
            sarah Sarah Liu added a comment -

            Hit this failure again in a hard failover test on RHEL6 with build lustre-master #2835:

            https://testing.hpdd.intel.com/test_sets/2cf8b852-a819-11e4-93dd-5254006e85c2

            sarah Sarah Liu added a comment - Also hit this error on SLES11 SP3 client: https://maloo.whamcloud.com/test_sets/c6bc907a-381d-11e3-844f-52540035b04c

            People

              wc-triage WC Triage
              maloo Maloo