Lustre / LU-13974

Wrong behavior for out operations create+write especially for update log

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.12.9, Lustre 2.15.0
    • Affects Version/s: Lustre 2.14.0
    • None
    • 3

    Description

      I've caught the following problem with the update log.

      When an MDT writes to a new update log after a failover, the writer creates two update requests: the first carries the llog CREATE and the catalog modification, the second carries a WRITE of the llog header and of the record with index 1. Both are sent through the OUT queue.

      After the first update request, osp_create_interpreter() sets obj->opo_attr.la_valid to zero. Any new llog record add then calls dt_attr_get() for the llog object before dt_write(), because the cache is invalid. The size comes back as 0 (the WRITE carrying the header and record 1 has not been executed on the target yet), so the llog record with index 2 is written right after the llog header.

      So any new llog add that happens between the end of osp_create_interpreter() (first update request) and osp_write_interpreter() (second update request) leads to update log corruption.
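
      To make the race easier to see, here is a minimal standalone C sketch (hypothetical names and a simplified model, not the real Lustre code paths) of how clearing the attribute cache between the CREATE and WRITE interpreters makes a concurrent llog add read a size of 0 and pick the same offset as record 1:

      /*
       * Toy model of the race, not Lustre code: llog_object stands in for the
       * OSP llog object, cache_valid/cached_size for obj->opo_attr, and the
       * explicit invalidation for what osp_create_interpreter() does.
       */
      #include <stdio.h>
      #include <stdbool.h>

      #define LLOG_HDR_SIZE 32768             /* llog header fills the first chunk */

      struct llog_object {
              bool   cache_valid;             /* models opo_attr.la_valid != 0 */
              size_t cached_size;             /* models the cached la_size */
              size_t committed_size;          /* what has actually landed on the target */
      };

      /* models dt_attr_get(): refresh the cache from the (remote) target */
      static size_t attr_get(struct llog_object *o)
      {
              if (!o->cache_valid) {
                      o->cached_size = o->committed_size;
                      o->cache_valid = true;
              }
              return o->cached_size;
      }

      /* models one llog record append: the offset comes from the cached size */
      static size_t llog_append(struct llog_object *o, size_t reclen)
      {
              size_t off = attr_get(o);

              if (off < LLOG_HDR_SIZE)
                      off = LLOG_HDR_SIZE;    /* records start after the header */
              o->cached_size = off + reclen;
              return off;
      }

      int main(void)
      {
              struct llog_object o = { .cache_valid = false };

              /* update req 1: CREATE plus the WRITE of header + record 1 are queued */
              size_t off1 = llog_append(&o, 464);   /* record 1 -> offset 32768 */

              /* osp_create_interpreter() replies first and clears the attr cache,
               * while the WRITE carrying header + record 1 has not landed yet */
              o.cache_valid = false;

              /* a new llog add in this window sees size 0 via attr_get() */
              size_t off2 = llog_append(&o, 424);   /* record 2 -> offset 32768 again */

              printf("record 1 at %zu, record 2 at %zu%s\n", off1, off2,
                     off2 <= off1 ? " -> record 2 overwrites record 1" : "");
              return 0;
      }

      Run as-is, the sketch prints both records at offset 32768, which matches the off33232/off33192 overlap in the log below.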

      Here is a debug log of the problem:

      40000000:00000040:1.0:1600183643.158176:0:15651:0:(fid_request.c:400:seq_client_alloc_fid()) cli-cli-lustre-MDT0000-osp-MDT0001: Allocated FID [0x2000061c1:0x1:0x0]
      00000004:00000040:1.0:1600183643.158181:0:15651:0:(mdt_handler.c:5793:mdt_object_init()) object init, fid = [0x2000061c1:0x1:0x0]
      00000040:00001000:1.0:1600183643.158557:0:15651:0:(llog_osd.c:400:llog_osd_write_rec()) new record 106a0000 to [0x2000061c1:0x1:0x0]
      00000004:00000040:1.0:1600183643.158559:0:15651:0:(osp_md_object.c:1177:osp_md_write()) write [0x2000061c1:0x1:0x0] offset = 0 length = 32768
      00000004:00000040:1.0:1600183643.158597:0:15651:0:(osp_md_object.c:1177:osp_md_write()) write [0x2000061c1:0x1:0x0] offset = 32768 length = 464
      00000040:00080000:1.0:1600183643.158600:0:15651:0:(llog_osd.c:708:llog_osd_write_rec()) added record [0x2000061c1:0x1:0x0].1, 464 off33232
      00000004:00000040:1.0:1600183643.158614:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 0 fid = [0x2000061c1:0x1:0x0] op = create params = 1 batchid = 0 size = 256 repsize 0
      00000004:00000040:0.0:1600183643.158743:0:15657:0:(mdt_handler.c:5793:mdt_object_init()) object init, fid = [0x2000061c1:0x1:0x0]
      00000020:00000040:0.0:1600183643.158763:0:15657:0:(out_handler.c:574:out_index_insert()) lustre-MDT0000: [0x20000000a:0x0:0x0] index insert [0x2000061c1:0x1:0x0]: rc = 0
      00000020:00001000:0.0:1600183643.158777:0:15657:0:(out_lib.c:548:out_tx_create_exec()) lustre-MDT0000-osd: create [0x2000061c1:0x1:0x0]: dof 0, mode 100000
      00000020:00000040:0.0:1600183643.158819:0:15657:0:(out_lib.c:1092:out_obj_index_insert()) lustre-MDT0000-osd: index insert [0x20000000a:0x0:0x0] name: [0x2000061c1:0x1:0x0] fid [0x2000061c1:0x1:0x0], type 32768
      00000040:00001000:1.0:1600183643.163358:0:16581:0:(llog_osd.c:400:llog_osd_write_rec()) new record 106a0000 to [0x2000061c1:0x1:0x0]
      00000004:00000040:1.0:1600183643.163374:0:16581:0:(osp_trans.c:266:object_update_request_dump()) i = 0 fid = [0x2000061c1:0x1:0x0] op = attr_get params = 0 batchid = 0 size = 40 repsize 208
      00000004:00000040:1.0:1600183643.163641:0:16581:0:(osp_md_object.c:1177:osp_md_write()) write [0x2000061c1:0x1:0x0] offset = 0 length = 32768
      00000004:00000040:1.0:1600183643.163686:0:16581:0:(osp_md_object.c:1177:osp_md_write()) write [0x2000061c1:0x1:0x0] offset = 32768 length = 424
      00000040:00080000:1.0:1600183643.163690:0:16581:0:(llog_osd.c:708:llog_osd_write_rec()) added record [0x2000061c1:0x1:0x0].2, 424 off33192
      00000004:00000040:1.0:1600183643.163790:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 0 fid = [0x2000061c1:0x1:0x0] op = write params = 2 batchid = 0 size = 32832 repsize 0
      00000004:00000040:1.0:1600183643.163793:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 1 fid = [0x2000061c1:0x1:0x0] op = write params = 2 batchid = 0 size = 528 repsize 0
      00000020:00000040:0.0:1600183643.164058:0:15657:0:(out_lib.c:695:out_tx_write_exec()) write [0x2000061c1:0x1:0x0] pos 0 buf ffff9bdfea590038, len 32768
      00000020:00000040:0.0:1600183643.164174:0:15657:0:(out_lib.c:695:out_tx_write_exec()) write [0x2000061c1:0x1:0x0] pos 32768 buf ffff9bdfea598078, len 464
      00000004:00000040:1.0:1600183643.164381:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 0 fid = [0x2000061c1:0x1:0x0] op = write params = 2 batchid = 0 size = 32832 repsize 0
      00000004:00000040:1.0:1600183643.164383:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 1 fid = [0x2000061c1:0x1:0x0] op = write params = 2 batchid = 0 size = 488 repsize 0
      00000020:00000040:0.0:1600183643.164533:0:15657:0:(out_lib.c:695:out_tx_write_exec()) write [0x2000061c1:0x1:0x0] pos 0 buf ffff9bdfea590038, len 32768
      00000020:00000040:0.0:1600183643.164545:0:15657:0:(out_lib.c:695:out_tx_write_exec()) write [0x2000061c1:0x1:0x0] pos 32768 buf ffff9bdfea598078, len 424
      00000020:00080000:0.0:1600183643.164547:0:15657:0:(out_lib.c:718:out_tx_write_exec()) llog wrong idx [0x2000061c1:0x1:0x0] pos 32768, len 424, size 32768 idx 2->275382272
      

      Log records 1 and 2 were added at the same starting offset, right after the llog header:

      llog_osd_write_rec()) added record [0x2000061c1:0x1:0x0].1, 464 off33232
      llog_osd_write_rec()) added record [0x2000061c1:0x1:0x0].2, 424 off33192
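
      Both record writes start at offset 32768 (the osp_md_write() lines above show "offset = 32768 length = 464" and "offset = 32768 length = 424"), which suggests the printed off values are the end offsets of each write. With the 32768-byte llog header, the layout works out to:

      llog header :     0 .. 32768
      record 1    : 32768 .. 33232   (32768 + 464 = 33232, matches off33232)
      record 2    : 32768 .. 33192   (32768 + 424 = 33192, matches off33192)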

      So the record with index 2 would overwrite record 1.
      Any thoughts on how best to fix this?

      Attachments

        Issue Links

          Activity

            [LU-13974] Wrong behavior for out operations create+write especially for update log

            "Mike Pershin <mpershin@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46804
            Subject: LU-13974 tests: update log corruption
            Project: fs/lustre-release
            Branch: b2_14
            Current Patch Set: 1
            Commit: 35a62ff76299ab6437acd5c5c0db8f130025973f

            "Mike Pershin <mpershin@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46803
            Subject: LU-13974 llog: check stale osp object
            Project: fs/lustre-release
            Branch: b2_14
            Current Patch Set: 1
            Commit: 147f74c142de3db8bb4025af825dbcdd04c1fbbb

            pjones Peter Jones added a comment -

            Landed for 2.15

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40743/
            Subject: LU-13974 tests: update log corruption
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 562837124ec7bffeba7edb4b4b899bc271833374

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40742/
            Subject: LU-13974 llog: check stale osp object
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 82c6e42d6137f39a1f2394b7bc6e8d600eb36181

            Alexander Boyko (alexander.boyko@hpe.com) uploaded a new patch: https://review.whamcloud.com/40743
            Subject: LU-13974 tests: update log corruption
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 9b1a4b48de586c391a944e53c0dc916ec0ce84bb

            Alexander Boyko (alexander.boyko@hpe.com) uploaded a new patch: https://review.whamcloud.com/40742
            Subject: LU-13974 llog: check stale osp object
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 12804cc17552bd01968e447ca71cae5deff6d4b1

            aboyko Alexander Boyko added a comment -

            @Peter Jones, I don't have a good solution for these bugs, and unfortunately I did not get any response, comments, or ideas from the WC side (from Mike or Alex). I'm not the right person to be the assignee here.

            aboyko Alexander Boyko added a comment -

            One more update log corruption case:

            fs1-MDT0001-osp-MDT0003: [0x2:0x400024d0:0x2] Invalid record: index 112926 but expected 112925

            • An OUT update request containing llog updates failed.
            • A new OUT request (number 2) was declared.
            • The llog OSP object was invalidated (because of the failure), leaving an invalid in-memory llog_handle->llog_log_hdr (count, last_index, bitmap).
            • OUT request N2 called llog_add(); dt_attr_get() made an OSP request to obtain a valid size and marked the object valid again.
            • The llog record was added with the wrong index, so the file contains recX followed by recX+2.
            • A debug patch returned -29 when it detected the wrong previous index; without it, the error would only show up later in llog_process_thread() as "Invalid record ..." if a failover takes place.

            The main condition is invalidation of the llog OSP object between the declaration of a new OUT request and the llog_add() for it.
            If the declaration comes after the invalidation, there is no error, because the llog header is re-read. A minimal sketch of this race follows the log below.

            00000020:00080000:3.0:1599041250.939177:0:51269:0:(update_trans.c:94:top_multiple_thandle_dump())  cookie [0x2:0x80006997:0x5].145572
            00000040:00080000:1.0:1599041250.940534:0:51191:0:(llog_osd.c:709:llog_osd_write_rec()) added record [0x580006997:0x2:0x0].145575, 600 off74673832
            00000020:00080000:2.0:1599041250.940900:0:30740:0:(update_trans.c:94:top_multiple_thandle_dump())  cookie [0x2:0x80006997:0x5].145575
            00000020:00080000:2.0:1599041250.940911:0:30740:0:(update_trans.c:94:top_multiple_thandle_dump())  cookie [0x2:0x80006997:0x5].145573
            00000020:00080000:2.0:1599041250.940925:0:30740:0:(update_trans.c:94:top_multiple_thandle_dump())  cookie [0x2:0x80006997:0x5].145574
            00000004:00080000:2.0:1599041250.940935:0:30740:0:(osp_md_object.c:1138:osp_write_interpreter()) error [0x580006997:0x2:0x0]: rc = -2
            00000004:00080000:2.0:1599041250.940936:0:30740:0:(osp_md_object.c:1138:osp_write_interpreter()) error [0x580006997:0x2:0x0]: rc = -2
            00000004:00080000:2.0:1599041250.940937:0:30740:0:(osp_md_object.c:1138:osp_write_interpreter()) error [0x580006997:0x2:0x0]: rc = -2
            00000004:00080000:2.0:1599041250.940938:0:30740:0:(osp_md_object.c:1138:osp_write_interpreter()) error [0x580006997:0x2:0x0]: rc = -2
            00000004:00080000:2.0:1599041250.940939:0:30740:0:(osp_object.c:1347:osp_invalidate()) Invalidate osp_object [0x580006997:0x2:0x0]
            00000040:00080000:1.0:1599041250.964601:0:51211:0:(llog_osd.c:709:llog_osd_write_rec()) added record [0x580006997:0x2:0x0].145576, 552 off74673784  <--- 145576 record from the same offset as 145575, llog bitmap still have 145575 record
            00000020:00080000:0.0:1599041251.006396:0:30740:0:(update_trans.c:94:top_multiple_thandle_dump())  cookie [0x2:0x80006997:0x5].145576
            00000004:00080000:0.0:1599041251.006402:0:30740:0:(osp_md_object.c:1138:osp_write_interpreter()) error [0x580006997:0x2:0x0]: rc = -29
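
            A minimal sketch of this second race (again hypothetical names and a simplified model, not the real Lustre code): the invalidation forces the size to be re-fetched, but the in-memory llog header (count, last_index, bitmap) is not re-read, so the next add reuses the failed record's offset while taking the failed record's index + 1:

            /*
             * Toy model, not Lustre code: the failed write never reaches the
             * target, yet last_index in the cached header keeps counting.
             */
            #include <stdio.h>
            #include <stdbool.h>

            struct osp_llog {
                    bool   attr_valid;      /* models opo_attr.la_valid */
                    size_t cached_size;
                    size_t committed_size;  /* what actually landed on the target */
                    int    last_index;      /* models the cached llog_log_hdr state */
            };

            static size_t attr_get(struct osp_llog *o)
            {
                    if (!o->attr_valid) {
                            o->cached_size = o->committed_size; /* size is refreshed ... */
                            o->attr_valid = true;               /* ... but the header is not re-read */
                    }
                    return o->cached_size;
            }

            static void llog_add(struct osp_llog *o, size_t reclen, bool write_fails)
            {
                    size_t off = attr_get(o);
                    int    idx = ++o->last_index;       /* index from the stale in-memory header */

                    printf("record %d at offset %zu%s\n", idx, off,
                           write_fails ? " (write fails, never lands)" : "");
                    o->cached_size = off + reclen;
                    if (write_fails)
                            o->attr_valid = false;      /* models osp_invalidate() on failure */
                    else
                            o->committed_size = off + reclen;
            }

            int main(void)
            {
                    /* numbers taken from the log above: both records start at 74673232 */
                    struct osp_llog o = { .attr_valid = true, .cached_size = 74673232,
                                          .committed_size = 74673232, .last_index = 145574 };

                    llog_add(&o, 600, true);    /* record 145575: the OUT request fails */
                    llog_add(&o, 552, false);   /* record 145576 reuses 145575's offset */
                    /* on disk: ...145574, then 145576 -> llog_process() later reports
                     * "Invalid record: index N but expected N-1" */
                    return 0;
            }

            The point the sketch tries to capture is that dt_attr_get() revalidates only the attributes, not the cached llog header.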
            

            aboyko Alexander Boyko added a comment -

            I've pushed a debug patch to catch these types of errors:
            https://review.whamcloud.com/#/c/39995/

            People

              Assignee: Alexander Boyko (aboyko)
              Reporter: Alexander Boyko (aboyko)
              Votes: 0
              Watchers: 7
