Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-13974

Wrong behavior for out operations create+write especially for update log

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.12.9, Lustre 2.15.0
    • Lustre 2.14.0
    • None
    • 3
    • 9223372036854775807

    Description

      I've catch a next problem with update log.

      When MDT writes to a new update log, after failover, a writer creates two update request. A first update with llog CREATE and  catalog modification. A second update with WRITE header and record with index 1. Both are sending through a out queue. 

      After first update request osp_create_interpreter set zero to obj->opo_attr.la_valid. Any new adding llog record would  call dt_attr_get(cahe is invalid) for llog before dt_write. A size is 0, and llog record with index 2 would be written right after a llog header.

      So any new llog add  between end of osp_create_interpreter(1 update req) and osp_write_interpreter(2 update req) would lead to update log corruption.

      Here is a log of the problem

      40000000:00000040:1.0:1600183643.158176:0:15651:0:(fid_request.c:400:seq_client_alloc_fid()) cli-cli-lustre-MDT0000-osp-MDT0001: Allocated FID [0x2000061c1:0x1:0x0]
      00000004:00000040:1.0:1600183643.158181:0:15651:0:(mdt_handler.c:5793:mdt_object_init()) object init, fid = [0x2000061c1:0x1:0x0]
      00000040:00001000:1.0:1600183643.158557:0:15651:0:(llog_osd.c:400:llog_osd_write_rec()) new record 106a0000 to [0x2000061c1:0x1:0x0]
      00000004:00000040:1.0:1600183643.158559:0:15651:0:(osp_md_object.c:1177:osp_md_write()) write [0x2000061c1:0x1:0x0] offset = 0 length = 32768
      00000004:00000040:1.0:1600183643.158597:0:15651:0:(osp_md_object.c:1177:osp_md_write()) write [0x2000061c1:0x1:0x0] offset = 32768 length = 464
      00000040:00080000:1.0:1600183643.158600:0:15651:0:(llog_osd.c:708:llog_osd_write_rec()) added record [0x2000061c1:0x1:0x0].1, 464 off33232
      00000004:00000040:1.0:1600183643.158614:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 0 fid = [0x2000061c1:0x1:0x0] op = create params = 1 batchid = 0 size = 256 repsize 0
      00000004:00000040:0.0:1600183643.158743:0:15657:0:(mdt_handler.c:5793:mdt_object_init()) object init, fid = [0x2000061c1:0x1:0x0]
      00000020:00000040:0.0:1600183643.158763:0:15657:0:(out_handler.c:574:out_index_insert()) lustre-MDT0000: [0x20000000a:0x0:0x0] index insert [0x2000061c1:0x1:0x0]: rc = 0
      00000020:00001000:0.0:1600183643.158777:0:15657:0:(out_lib.c:548:out_tx_create_exec()) lustre-MDT0000-osd: create [0x2000061c1:0x1:0x0]: dof 0, mode 100000
      00000020:00000040:0.0:1600183643.158819:0:15657:0:(out_lib.c:1092:out_obj_index_insert()) lustre-MDT0000-osd: index insert [0x20000000a:0x0:0x0] name: [0x2000061c1:0x1:0x0] fid [0x2000061c1:0x1:0x0], type 32768
      00000040:00001000:1.0:1600183643.163358:0:16581:0:(llog_osd.c:400:llog_osd_write_rec()) new record 106a0000 to [0x2000061c1:0x1:0x0]
      00000004:00000040:1.0:1600183643.163374:0:16581:0:(osp_trans.c:266:object_update_request_dump()) i = 0 fid = [0x2000061c1:0x1:0x0] op = attr_get params = 0 batchid = 0 size = 40 repsize 208
      00000004:00000040:1.0:1600183643.163641:0:16581:0:(osp_md_object.c:1177:osp_md_write()) write [0x2000061c1:0x1:0x0] offset = 0 length = 32768
      00000004:00000040:1.0:1600183643.163686:0:16581:0:(osp_md_object.c:1177:osp_md_write()) write [0x2000061c1:0x1:0x0] offset = 32768 length = 424
      00000040:00080000:1.0:1600183643.163690:0:16581:0:(llog_osd.c:708:llog_osd_write_rec()) added record [0x2000061c1:0x1:0x0].2, 424 off33192
      00000004:00000040:1.0:1600183643.163790:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 0 fid = [0x2000061c1:0x1:0x0] op = write params = 2 batchid = 0 size = 32832 repsize 0
      00000004:00000040:1.0:1600183643.163793:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 1 fid = [0x2000061c1:0x1:0x0] op = write params = 2 batchid = 0 size = 528 repsize 0
      00000020:00000040:0.0:1600183643.164058:0:15657:0:(out_lib.c:695:out_tx_write_exec()) write [0x2000061c1:0x1:0x0] pos 0 buf ffff9bdfea590038, len 32768
      00000020:00000040:0.0:1600183643.164174:0:15657:0:(out_lib.c:695:out_tx_write_exec()) write [0x2000061c1:0x1:0x0] pos 32768 buf ffff9bdfea598078, len 464
      00000004:00000040:1.0:1600183643.164381:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 0 fid = [0x2000061c1:0x1:0x0] op = write params = 2 batchid = 0 size = 32832 repsize 0
      00000004:00000040:1.0:1600183643.164383:0:29981:0:(osp_trans.c:266:object_update_request_dump()) i = 1 fid = [0x2000061c1:0x1:0x0] op = write params = 2 batchid = 0 size = 488 repsize 0
      00000020:00000040:0.0:1600183643.164533:0:15657:0:(out_lib.c:695:out_tx_write_exec()) write [0x2000061c1:0x1:0x0] pos 0 buf ffff9bdfea590038, len 32768
      00000020:00000040:0.0:1600183643.164545:0:15657:0:(out_lib.c:695:out_tx_write_exec()) write [0x2000061c1:0x1:0x0] pos 32768 buf ffff9bdfea598078, len 424
      00000020:00080000:0.0:1600183643.164547:0:15657:0:(out_lib.c:718:out_tx_write_exec()) llog wrong idx [0x2000061c1:0x1:0x0] pos 32768, len 424, size 32768 idx 2->275382272
      

      At log record 1 and 2 was added at the same offset.

      llog_osd_write_rec()) added record [0x2000061c1:0x1:0x0].1, 464 off33232
      llog_osd_write_rec()) added record [0x2000061c1:0x1:0x0].2, 424 off33192

      So our with index 2 would rewrite the 1.
      Any thoughts on how better to fix it?

      Attachments

        Issue Links

          Activity

            People

              aboyko Alexander Boyko
              aboyko Alexander Boyko
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: