  Lustre / LU-18671

LustreError lustre-MDT0000-mdc-ffffa0816bceb000 ost_read failed -116

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.17.0
    • None
    • 3

    Description

      The issue is reproducible, though not on every run, with the DoM test from sanity (test 273b).
      The error relates to the DoM component and shows a read failing with ESTALE (rc = -116).

      Jan 20 17:21:38 devvm-rk1 kernel: Lustre: DEBUG MARKER: == sanity test 273b: DoM: race writeback and object destroy ========================================================== 17:21:38 (1737390098)
      Jan 20 17:21:41 devvm-rk1 kernel: LustreError: lustre-MDT0000-mdc-ffffa0816bceb000: operation ost_read to node 192.168.122.55@tcp failed: rc = -116
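
As a reading aid for the `rc = -116` in the message above, here is a minimal sketch that maps a negative return code to its errno name. The table is hand-written and holds only a few Linux errno values; the helper name `decode_rc` is hypothetical and is not part of Lustre.

```python
# Hand-written subset of Linux errno values (assumption: the rc printed by
# the client follows the Linux numbering, as it does in the log above).
LINUX_ERRNO = {
    2: "ENOENT",
    5: "EIO",
    11: "EAGAIN",
    110: "ETIMEDOUT",
    116: "ESTALE",
}


def decode_rc(rc):
    """Return the errno name for a negative return code, or None if unknown."""
    if rc >= 0:
        return None  # zero or positive values are not error codes here
    return LINUX_ERRNO.get(-rc)


print(decode_rc(-116))  # ESTALE
```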
      

      The sequence of events was as follows:

      • the client closes a file
      • the MDT sends a blocking AST to the client for an IBIT 0x40 lock
      • the client processes the blocking AST (BLAST) and calls
        mdc_lock_flush()->mdc_lock_discard_pages()->osc_page_gang_lookup()->osc_brw_prep_request();
        this BRW is for a single byte and lies outside the DoM component size
      • the server does not find a lock for this BRW and returns ESTALE
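
The "outside DoM component size" condition can be illustrated with a small range check. This is only a sketch, not Lustre code: the 1 MiB DoM component size is an assumption inferred from the BRW offset 1048576 in the debug log, and `brw_outside_dom` is a hypothetical helper name.

```python
# Assumption: the DoM component is 1 MiB; the ticket does not state the size,
# but the BRW in the debug log starts exactly at offset 1048576.
DOM_COMPONENT_SIZE = 1 << 20


def brw_outside_dom(start, end, dom_size=DOM_COMPONENT_SIZE):
    """Return True if the half-open byte range [start, end) begins at or
    beyond the end of the DoM component."""
    if end <= start:
        raise ValueError("empty or inverted extent")
    return start >= dom_size


# Extent taken from the debug log: "offset 1048576<>1048577"
start, end = 1048576, 1048577
print(end - start)                  # 1
print(brw_outside_dom(start, end))  # True
```

Under that assumption the one-byte read starts at the first byte past the DoM component, which would be consistent with the server finding no lock covering it.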

      The interesting messages are:

      00000008:00000020:2.0:1737390101.104518:0:1928870:0:(osc_request.c:1833:osc_brw_prep_request()) Using short io for data transfer, size = 1
      00000008:00000020:2.0:1737390101.104520:0:1928870:0:(osc_request.c:741:osc_announce_cached()) lustre-MDT0000-mdc-ffffa0816bceb000: dirty: 0 undirty: 1008074752 dropped 0 grant: 30810112 cl_lost_grant 0
      00000008:00100000:2.0:1737390101.104522:0:1928870:0:(osc_request.c:2003:osc_brw_prep_request()) brw rpc ffffa08172c109c0 - object 0x200004284:2265 offset 1048576<>1048577
      00000008:00000001:2.0:1737390101.104522:0:1928870:0:(osc_request.c:2006:osc_brw_prep_request()) Process leaving (rc=0 : 0 : 0)
      00000100:00000001:2.0:1737390101.104523:0:1928870:0:(jobid.c:937:lustre_get_jobid()) Process entered
      00000100:00000001:2.0:1737390101.104524:0:1928870:0:(jobid.c:977:lustre_get_jobid()) Process leaving (rc=0 : 0 : 0)
      00000100:00000040:2.0:1737390101.104527:0:1928870:0:(ptlrpcd.c:300:ptlrpcd_add_req()) @@@ add req [00000000d3889700] to pc [ptlrpcd_01_01+1]  req@ffffa08172c109c0 x1821780884276864/t0(0) o3->lustre-MDT0000-mdc-ffffa0816bceb000@192.168.122.55@tcp:13/10 lens 488/448 e 0 to 0 dl 0 ref 1 fl New:NQU/200/ffffffff rc 0/-1 job:'ldlm_bl.0' uid:0 gid:0
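
To pull the BRW extent out of lines like the ones above, a small parser can help. This is a sketch that assumes the colon-separated layout visible in this excerpt (subsystem mask, debug mask, CPU, timestamp, stack size, PID, host PID, then `(file:line:function())` and the message); the field names and the helpers `parse_debug_line` and `brw_extent` are illustrative, not Lustre tooling.

```python
import re

# Layout assumed from the excerpt above; field names are illustrative.
LINE_RE = re.compile(
    r"^(?P<subsys>[0-9a-fA-F]+):(?P<mask>[0-9a-fA-F]+):(?P<cpu>[\d.]+):"
    r"(?P<time>\d+\.\d+):(?P<stack>\d+):(?P<pid>\d+):(?P<hostpid>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)$"
)
EXTENT_RE = re.compile(r"offset (\d+)<>(\d+)")


def parse_debug_line(text):
    """Split one debug log line into a dict of fields, or return None."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["line"] = int(rec["line"])
    rec["pid"] = int(rec["pid"])
    return rec


def brw_extent(msg):
    """Return (start, end) from a 'brw rpc ... offset A<>B' message, or None."""
    m = EXTENT_RE.search(msg)
    return (int(m.group(1)), int(m.group(2))) if m else None


SAMPLE = (
    "00000008:00100000:2.0:1737390101.104522:0:1928870:0:"
    "(osc_request.c:2003:osc_brw_prep_request()) brw rpc ffffa08172c109c0"
    " - object 0x200004284:2265 offset 1048576<>1048577"
)

rec = parse_debug_line(SAMPLE)
print(rec["func"], rec["line"])  # osc_brw_prep_request 2003
print(brw_extent(rec["msg"]))    # (1048576, 1048577)
```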
      

          People

            aboyko Alexander Boyko