Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-9157

replay-single test_80c: rmdir failed

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.10.0, Lustre 2.10.1, Lustre 2.11.0, Lustre 2.10.2, Lustre 2.12.0, Lustre 2.10.3, Lustre 2.10.4, Lustre 2.10.5, Lustre 2.10.7
    • None
    • trevis-46vm1/2/3/7/8, Full Group test w/DNE,
      RHEL7.3/zfs, branch master, v2.9.53.19.gdad9ce4, b3534
    • 3
    • 9223372036854775807

    Description

      https://testing.hpdd.intel.com/test_sessions/13a7c94c-f92a-11e6-aa39-5254006e85c2

      From test_log:

      rm: cannot remove '/mnt/lustre/d80c.replay-single/remote_dir': Input/output error
       replay-single test_80c: @@@@@@ FAIL: rmdir failed 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4841:error()
        = /usr/lib64/lustre/tests/replay-single.sh:2637:test_80c()
        = /usr/lib64/lustre/tests/test-framework.sh:5117:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:5156:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:5003:run_test()
        = /usr/lib64/lustre/tests/replay-single.sh:2641:main()
      

      Attachments

        Issue Links

          Activity

            [LU-9157] replay-single test_80c: rmdir failed
            pjones Peter Jones added a comment -

            duplicate of LU-10740

            pjones Peter Jones added a comment - duplicate of LU-10740
            laisiyao Lai Siyao added a comment -

            The failure should be the same as LU-10740, the logs shows the same FID was allocated twice, which cause llog_osd_create() failed with -EEXIST, and mdt_reint_unlink() failed with -EIO in the end:

            40000000:00000040:0.0:1535086843.183909:0:24271:0:(fid_request.c:400:seq_client_alloc_fid()) cli-cli-lustre-MDT0000-osp-MDT0001: Allocated FID [0x20000bb98:0x1:0x0]
            00000004:00000040:0.0:1535086843.183917:0:24271:0:(mdt_handler.c:5324:mdt_object_init()) object init, fid = [0x20000bb98:0x1:0x0]
            00000004:00000040:1.0:1535086843.184867:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = create params = 1 batchid = 0 size = 256 repsize 0
            00000040:00001000:0.0:1535086843.213847:0:24271:0:(llog_osd.c:399:llog_osd_write_rec()) new record 106a0000 to [0x20000bb98:0x1:0x0]
            00000004:00000040:0.0:1535086843.213858:0:24271:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = attr_get params = 0 batchid = 0 size = 40 repsize 208
            00000004:00000040:0.0:1535086843.214723:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 0 length = 32768
            00000004:00000040:0.0:1535086843.214752:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 32768 length = 504
            00000040:00080000:0.0:1535086843.214755:0:24271:0:(llog_osd.c:698:llog_osd_write_rec()) added record [0x20000bb98:0x1:0x0].1, 504 off33272
            00000004:00000040:1.0:1535086843.214788:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 32832 repsize 0
            00000004:00000040:1.0:1535086843.214789:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 1 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 568 repsize 0
            00000040:00001000:1.0:1535086846.270681:0:24271:0:(llog_osd.c:399:llog_osd_write_rec()) new record 106a0000 to [0x20000bb98:0x1:0x0]
            00000004:00000040:1.0:1535086846.270684:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 0 length = 88
            00000004:00000040:1.0:1535086846.270692:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 88 length = 4
            00000004:00000040:1.0:1535086846.270695:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 32760 length = 8
            00000004:00000040:1.0:1535086846.270699:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 33272 length = 1160
            00000040:00080000:1.0:1535086846.270702:0:24271:0:(llog_osd.c:698:llog_osd_write_rec()) added record [0x20000bb98:0x1:0x0].2, 1160 off34432
            00000004:00000040:1.0:1535086872.182169:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 3 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 152 repsize 0
            00000004:00000040:1.0:1535086872.182172:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 4 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 72 repsize 0
            00000004:00000040:1.0:1535086872.182174:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 5 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 72 repsize 0
            00000004:00000040:1.0:1535086872.182176:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 6 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 1224 repsize 0
            00000004:00000040:0.0:1535086877.998347:0:26196:0:(mdt_handler.c:5344:mdt_object_free()) object free, fid = [0x20000bb98:0x1:0x0]
            00000004:00000040:1.0:1535086890.715650:0:27095:0:(mdt_handler.c:5324:mdt_object_init()) object init, fid = [0x20000bb98:0x1:0x0]
            00000004:00000040:1.0:1535086890.715674:0:27095:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = attr_get params = 0 batchid = 0 size = 40 repsize 208
            00000004:00000040:1.0:1535086890.716560:0:27095:0:(osp_md_object.c:1238:osp_md_read()) lustre-MDT0000-osp-MDT0001 [0x20000bb98:0x1:0x0] read offset 0 size 32768
            00000004:00000040:1.0:1535086890.716562:0:27095:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = read params = 2 batchid = 0 size = 72 repsize 32768
            00000004:00000040:1.0:1535086890.721227:0:27095:0:(osp_md_object.c:1238:osp_md_read()) lustre-MDT0000-osp-MDT0001 [0x20000bb98:0x1:0x0] read offset 32768 size 32768
            00000004:00000040:1.0:1535086890.721229:0:27095:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = read params = 2 batchid = 0 size = 72 repsize 32768
            40000000:00000040:0.0:1535086899.333519:0:26923:0:(fid_request.c:400:seq_client_alloc_fid()) cli-cli-lustre-MDT0000-osp-MDT0001: Allocated FID [0x20000bb98:0x1:0x0]
            00000004:00000040:0.0:1535086899.333537:0:26923:0:(mdt_handler.c:5324:mdt_object_init()) object init, fid = [0x20000bb98:0x1:0x0]
            00000004:00000040:1.0:1535086899.334459:0:26954:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = create params = 1 batchid = 0 size = 256 repsize 0
            00000040:00001000:0.0:1535086899.371627:0:26923:0:(llog_osd.c:399:llog_osd_write_rec()) new record 106a0000 to [0x20000bb98:0x1:0x0]
            00000004:00000040:0.0:1535086899.371630:0:26923:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 0 length = 32768
            00000004:00000040:0.0:1535086899.371679:0:26923:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 32768 length = 504
            00000040:00080000:0.0:1535086899.371684:0:26923:0:(llog_osd.c:698:llog_osd_write_rec()) added record [0x20000bb98:0x1:0x0].1, 504 off33272
            00000004:00080000:1.0:1535086899.394315:0:26954:0:(osp_object.c:1304:osp_invalidate()) Invalidate osp_object [0x20000bb98:0x1:0x0]
            
            laisiyao Lai Siyao added a comment - The failure should be the same as LU-10740 , the logs shows the same FID was allocated twice, which cause llog_osd_create() failed with -EEXIST, and mdt_reint_unlink() failed with -EIO in the end: 40000000:00000040:0.0:1535086843.183909:0:24271:0:(fid_request.c:400:seq_client_alloc_fid()) cli-cli-lustre-MDT0000-osp-MDT0001: Allocated FID [0x20000bb98:0x1:0x0] 00000004:00000040:0.0:1535086843.183917:0:24271:0:(mdt_handler.c:5324:mdt_object_init()) object init, fid = [0x20000bb98:0x1:0x0] 00000004:00000040:1.0:1535086843.184867:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = create params = 1 batchid = 0 size = 256 repsize 0 00000040:00001000:0.0:1535086843.213847:0:24271:0:(llog_osd.c:399:llog_osd_write_rec()) new record 106a0000 to [0x20000bb98:0x1:0x0] 00000004:00000040:0.0:1535086843.213858:0:24271:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = attr_get params = 0 batchid = 0 size = 40 repsize 208 00000004:00000040:0.0:1535086843.214723:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 0 length = 32768 00000004:00000040:0.0:1535086843.214752:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 32768 length = 504 00000040:00080000:0.0:1535086843.214755:0:24271:0:(llog_osd.c:698:llog_osd_write_rec()) added record [0x20000bb98:0x1:0x0].1, 504 off33272 00000004:00000040:1.0:1535086843.214788:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 32832 repsize 0 00000004:00000040:1.0:1535086843.214789:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 1 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 568 repsize 0 00000040:00001000:1.0:1535086846.270681:0:24271:0:(llog_osd.c:399:llog_osd_write_rec()) new record 106a0000 to [0x20000bb98:0x1:0x0] 00000004:00000040:1.0:1535086846.270684:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 0 length = 88 00000004:00000040:1.0:1535086846.270692:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 88 length = 4 00000004:00000040:1.0:1535086846.270695:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 32760 length = 8 00000004:00000040:1.0:1535086846.270699:0:24271:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 33272 length = 1160 00000040:00080000:1.0:1535086846.270702:0:24271:0:(llog_osd.c:698:llog_osd_write_rec()) added record [0x20000bb98:0x1:0x0].2, 1160 off34432 00000004:00000040:1.0:1535086872.182169:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 3 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 152 repsize 0 00000004:00000040:1.0:1535086872.182172:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 4 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 72 repsize 0 00000004:00000040:1.0:1535086872.182174:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 5 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 72 repsize 0 00000004:00000040:1.0:1535086872.182176:0:23089:0:(osp_trans.c:265:object_update_request_dump()) i = 6 fid = [0x20000bb98:0x1:0x0] op = write params = 2 batchid = 0 size = 1224 repsize 0 00000004:00000040:0.0:1535086877.998347:0:26196:0:(mdt_handler.c:5344:mdt_object_free()) object free, fid = [0x20000bb98:0x1:0x0] 00000004:00000040:1.0:1535086890.715650:0:27095:0:(mdt_handler.c:5324:mdt_object_init()) object init, fid = [0x20000bb98:0x1:0x0] 00000004:00000040:1.0:1535086890.715674:0:27095:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = attr_get params = 0 batchid = 0 size = 40 repsize 208 00000004:00000040:1.0:1535086890.716560:0:27095:0:(osp_md_object.c:1238:osp_md_read()) lustre-MDT0000-osp-MDT0001 [0x20000bb98:0x1:0x0] read offset 0 size 32768 00000004:00000040:1.0:1535086890.716562:0:27095:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = read params = 2 batchid = 0 size = 72 repsize 32768 00000004:00000040:1.0:1535086890.721227:0:27095:0:(osp_md_object.c:1238:osp_md_read()) lustre-MDT0000-osp-MDT0001 [0x20000bb98:0x1:0x0] read offset 32768 size 32768 00000004:00000040:1.0:1535086890.721229:0:27095:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = read params = 2 batchid = 0 size = 72 repsize 32768 40000000:00000040:0.0:1535086899.333519:0:26923:0:(fid_request.c:400:seq_client_alloc_fid()) cli-cli-lustre-MDT0000-osp-MDT0001: Allocated FID [0x20000bb98:0x1:0x0] 00000004:00000040:0.0:1535086899.333537:0:26923:0:(mdt_handler.c:5324:mdt_object_init()) object init, fid = [0x20000bb98:0x1:0x0] 00000004:00000040:1.0:1535086899.334459:0:26954:0:(osp_trans.c:265:object_update_request_dump()) i = 0 fid = [0x20000bb98:0x1:0x0] op = create params = 1 batchid = 0 size = 256 repsize 0 00000040:00001000:0.0:1535086899.371627:0:26923:0:(llog_osd.c:399:llog_osd_write_rec()) new record 106a0000 to [0x20000bb98:0x1:0x0] 00000004:00000040:0.0:1535086899.371630:0:26923:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 0 length = 32768 00000004:00000040:0.0:1535086899.371679:0:26923:0:(osp_md_object.c:1163:osp_md_write()) write [0x20000bb98:0x1:0x0] offset = 32768 length = 504 00000040:00080000:0.0:1535086899.371684:0:26923:0:(llog_osd.c:698:llog_osd_write_rec()) added record [0x20000bb98:0x1:0x0].1, 504 off33272 00000004:00080000:1.0:1535086899.394315:0:26954:0:(osp_object.c:1304:osp_invalidate()) Invalidate osp_object [0x20000bb98:0x1:0x0]
            pjones Peter Jones added a comment -

            Lai

            Could you please advise?

            Thanks

            Peter

            pjones Peter Jones added a comment - Lai Could you please advise? Thanks Peter
            jamesanunez James Nunez (Inactive) added a comment - - edited Some recent failures: https://testing.whamcloud.com/test_sets/fb3a0a6c-99e8-11e8-b0aa-52540065bddc https://testing.whamcloud.com/test_sets/fc2b404a-9a78-11e8-a9f7-52540065bddc It fails 100% of the time on master with review-dne-zfs-part-4.
            sarah Sarah Liu added a comment -

            MDS console

            https://testing.hpdd.intel.com/test_sets/93814ab4-53f0-11e7-bacf-5254006e85c2

            10:16:00:[59442.222216] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre   		                   lustre-mdt2/mdt2 /mnt/lustre-mds2
            10:16:00:[59442.618812] Lustre: lustre-MDT0001: Imperative Recovery not enabled, recovery window 60-180
            10:16:00:[59442.622180] LustreError: 3386:0:(osd_oi.c:503:osd_oid()) lustre-MDT0001-osd: unsupported quota oid: 0x16
            10:16:00:[59442.829897] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
            10:16:00:[59443.177304] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
            10:16:00:[59443.788151] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-55vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
            10:16:00:[59443.792192] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-55vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
            10:16:00:[59444.131692] Lustre: DEBUG MARKER: trevis-55vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
            10:19:58:[59444.136692] Lustre: DEBUG MARKER: trevis-55vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
            10:19:58:[59444.397031] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
            10:19:58:[59444.739127] Lustre: DEBUG MARKER: zfs get -H -o value 				lustre:svname lustre-mdt2/mdt2 2>/dev/null | 				grep -E ':[a-zA-Z]{3}[0-9]{4}'
            10:19:58:[59445.082988] Lustre: DEBUG MARKER: zfs get -H -o value 				lustre:svname lustre-mdt2/mdt2 2>/dev/null | 				grep -E ':[a-zA-Z]{3}[0-9]{4}'
            10:19:58:[59445.432320] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname 		                           lustre-mdt2/mdt2 2>/dev/null
            10:19:58:[59446.305312] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-55vm2.trevis.hpdd.intel.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
            10:19:58:[59446.305685] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-55vm1.trevis.hpdd.intel.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
            10:19:58:[59446.626587] Lustre: DEBUG MARKER: trevis-55vm1.trevis.hpdd.intel.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
            10:19:58:[59446.628706] Lustre: DEBUG MARKER: trevis-55vm2.trevis.hpdd.intel.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
            10:19:58:[59447.057549] Lustre: lustre-MDT0001: Will be in recovery for at least 1:00, or until 5 clients reconnect
            10:19:58:[59447.570240] Lustre: lustre-MDT0001: Recovery over after 0:01, of 5 clients 5 recovered and 0 were evicted.
            10:19:58:[59448.124116] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 1 sec
            10:19:58:[59448.124291] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 1 sec
            10:19:58:[59448.512706] Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 1 sec
            10:19:58:[59448.527843] Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 1 sec
            10:19:58:[59450.364950] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  replay-single test_80c: @@@@@@ FAIL: rmdir failed 
            10:19:58:[59450.685468] Lustre: DEBUG MARKER: replay-single test_80c: @@@@@@ FAIL: rmdir failed
            10:19:58:[59451.049620] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /test_logs/2017-06-16/lustre-master-el7-x86_64--full--1_19_1__3603__-70308503758260-171737/replay-single.test_80c.debug_log.$(hostname -s).1497694559.log;
            10:19:58:[59451.049620]          dmesg > /test_logs/2017-06-16/lustre-master-el7-x86_64--full--1_19_1__3
            10:19:58:[59453.691066] Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null
            10:19:58:[59455.834157] Lustre: DEBUG MARKER: for i in $(find /proc/spl -type f); do
            10:19:58:[59455.834157] 				echo ====${i}=======================;
            10:19:58:[59455.834157] 				cat ${i};
            10:19:58:[59455.834157] 				done
            10:19:58:[59456.934168] Lustre: DEBUG MARKER: echo ====zpool events -v=======================;
            10:19:58:[59456.934168] 				zpool events -v
            10:19:58:[59457.623449] LustreError: 3434:0:(llog_cat.c:763:llog_cat_cancel_records()) lustre-MDT0000-osp-MDT0001: fail to cancel 0 of 1 llog-records: rc = -116
            10:19:58:[59457.752332] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == replay-single test 80d: DNE: create remote dir, drop update rep from MDT1, fail 2 MDTs ============ 10:16:06 \(1497694566\)
            
            sarah Sarah Liu added a comment - MDS console https://testing.hpdd.intel.com/test_sets/93814ab4-53f0-11e7-bacf-5254006e85c2 10:16:00:[59442.222216] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre lustre-mdt2/mdt2 /mnt/lustre-mds2 10:16:00:[59442.618812] Lustre: lustre-MDT0001: Imperative Recovery not enabled, recovery window 60-180 10:16:00:[59442.622180] LustreError: 3386:0:(osd_oi.c:503:osd_oid()) lustre-MDT0001-osd: unsupported quota oid: 0x16 10:16:00:[59442.829897] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check 10:16:00:[59443.177304] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u 10:16:00:[59443.788151] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-55vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4 10:16:00:[59443.792192] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-55vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4 10:16:00:[59444.131692] Lustre: DEBUG MARKER: trevis-55vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4 10:19:58:[59444.136692] Lustre: DEBUG MARKER: trevis-55vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4 10:19:58:[59444.397031] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1 10:19:58:[59444.739127] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt2/mdt2 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' 10:19:58:[59445.082988] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt2/mdt2 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' 10:19:58:[59445.432320] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt2/mdt2 2>/dev/null 10:19:58:[59446.305312] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-55vm2.trevis.hpdd.intel.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid 10:19:58:[59446.305685] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-55vm1.trevis.hpdd.intel.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid 10:19:58:[59446.626587] Lustre: DEBUG MARKER: trevis-55vm1.trevis.hpdd.intel.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid 10:19:58:[59446.628706] Lustre: DEBUG MARKER: trevis-55vm2.trevis.hpdd.intel.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid 10:19:58:[59447.057549] Lustre: lustre-MDT0001: Will be in recovery for at least 1:00, or until 5 clients reconnect 10:19:58:[59447.570240] Lustre: lustre-MDT0001: Recovery over after 0:01, of 5 clients 5 recovered and 0 were evicted. 10:19:58:[59448.124116] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 1 sec 10:19:58:[59448.124291] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 1 sec 10:19:58:[59448.512706] Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 1 sec 10:19:58:[59448.527843] Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 1 sec 10:19:58:[59450.364950] Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-single test_80c: @@@@@@ FAIL: rmdir failed 10:19:58:[59450.685468] Lustre: DEBUG MARKER: replay-single test_80c: @@@@@@ FAIL: rmdir failed 10:19:58:[59451.049620] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /test_logs/2017-06-16/lustre-master-el7-x86_64--full--1_19_1__3603__-70308503758260-171737/replay-single.test_80c.debug_log.$(hostname -s).1497694559.log; 10:19:58:[59451.049620] dmesg > /test_logs/2017-06-16/lustre-master-el7-x86_64--full--1_19_1__3 10:19:58:[59453.691066] Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 fail_val=0 2>/dev/null 10:19:58:[59455.834157] Lustre: DEBUG MARKER: for i in $(find /proc/spl -type f); do 10:19:58:[59455.834157] echo ====${i}=======================; 10:19:58:[59455.834157] cat ${i}; 10:19:58:[59455.834157] done 10:19:58:[59456.934168] Lustre: DEBUG MARKER: echo ====zpool events -v=======================; 10:19:58:[59456.934168] zpool events -v 10:19:58:[59457.623449] LustreError: 3434:0:(llog_cat.c:763:llog_cat_cancel_records()) lustre-MDT0000-osp-MDT0001: fail to cancel 0 of 1 llog-records: rc = -116 10:19:58:[59457.752332] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == replay-single test 80d: DNE: create remote dir, drop update rep from MDT1, fail 2 MDTs ============ 10:16:06 \(1497694566\)

            People

              laisiyao Lai Siyao
              jcasper James Casper (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: