lock timeout expiring while bulk read still in progress (req timeout extended via early reply)

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.7.0, Lustre 2.5.4
    • Lustre 2.7.0
    • None
    • 3
    • 15492

    Description

      Seen on lola: lock cancellation is blocked by a bulk read whose request deadline was extended by an early reply (a message of the bulk transfer was dropped by a router). The lock timeout was extended when the request arrived, but not when the early reply prolonged the request timeout. As a result, the client was evicted while the server was still processing the bulk read.

      Aug 17 18:40:26 lola-3 kernel: LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 192.168.1.118@o2ib1  ns: filter-soaked-OST0001_UUID lock: ffff8803e94983c0/0x72e211b40ba15a85 lrc: 3/0,0 mode: PW/PW res: [0x235ac2:0x0:0x0].0 rrc: 48 type: EXT [85899354112->94489288703] (req 85899354112->85899362303) flags: 0x60000000000020 nid: 192.168.1.118@o2ib1 remote: 0xab97f12518cf2c8 expref: 12 pid: 6455 timeout: 4534352032 lvb_type: 0
      ...
      Aug 17 18:40:27 lola-3 kernel: LustreError: 9271:0:(ldlm_lib.c:2693:target_bulk_io()) @@@ Eviction on bulk PUT  req@ffff880248c1cc00 x1476486685884824/t0(0) o3->94bd0a03-46cf-eb0b-78dd-b52825d72007@192.168.1.118@o2ib1:0/0 lens 488/432 e 0 to 0 dl 1408326682 ref 1 fl Interpret:/0/0 rc 0/0
      Aug 17 18:40:27 lola-3 kernel: Lustre: soaked-OST0001: Bulk IO read error with 94bd0a03-46cf-eb0b-78dd-b52825d72007 (at 192.168.1.118@o2ib1), client will retry: rc -107
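
The timing problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Lustre code: the names `ToyLock`, `ToyRequest`, `on_request_arrival`, `on_early_reply`, `client_evicted` and `simulate`, and all the numbers, are hypothetical. The `prolong_lock` flag exists only to contrast the two behaviours; it is not meant to represent the merged patch, whose subject is "limit bulk transfer time".

```python
from dataclasses import dataclass


@dataclass
class ToyLock:
    timeout: int = 0   # time (s) at which the lock callback timer fires


@dataclass
class ToyRequest:
    deadline: int = 0  # time (s) until which the server keeps serving the request


def on_request_arrival(lock, req, now, service_estimate):
    """Request arrives: set its deadline and extend the lock timeout to match."""
    req.deadline = now + service_estimate
    lock.timeout = max(lock.timeout, req.deadline)


def on_early_reply(lock, req, extra, prolong_lock):
    """Early reply pushes the request deadline out by `extra` seconds.

    With prolong_lock=False the lock timeout is left untouched, which is
    the behaviour reported in this ticket.
    """
    req.deadline += extra
    if prolong_lock:
        lock.timeout = max(lock.timeout, req.deadline)


def client_evicted(lock, bulk_done_at):
    """The client is evicted if the lock timer fires before the bulk read ends."""
    return lock.timeout < bulk_done_at


def simulate(prolong_lock):
    lock, req = ToyLock(), ToyRequest()
    on_request_arrival(lock, req, now=0, service_estimate=100)
    on_early_reply(lock, req, extra=50, prolong_lock=prolong_lock)
    # The bulk read finishes at t=120: within the extended request deadline
    # (150), but after the original lock timeout (100).
    return client_evicted(lock, bulk_done_at=120)


if __name__ == "__main__":
    print("evicted without lock prolongation:", simulate(prolong_lock=False))
    print("evicted with lock prolongation:   ", simulate(prolong_lock=True))
```

In this model the request is still within its extended deadline when the lock timer fires, which mirrors the sequence in the log above: the lock callback timer expires first, and the bulk PUT then fails because the client has already been evicted.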
      

          Activity

            gerrit Gerrit Updater added a comment -
            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12242/
            Subject: LU-5556 target: limit bulk transfer time
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set:
            Commit: 9d7344649b533f19d2a7499764d60a23823f6030

            tappro Mikhail Pershin added a comment - http://review.whamcloud.com/12242 - patch for 2.5

            johann Johann Lombardi (Inactive) added a comment - patch landed.

            People

              johann Johann Lombardi (Inactive)
              johann Johann Lombardi (Inactive)
              Votes: 0
              Watchers: 5
