Lustre / LU-1496

Client evicted frequently due to lock callback timer expiration

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 1.8.9
    • Lustre 1.8.x (1.8.0 - 1.8.5)
    • None
    • 3
    • 6385

    Description

      Our customer is seeing clients evicted relatively frequently due to lock callback timer expiration.
      The evicted client is not always the same one, but evictions occurred three times on June 3rd.
      The customer checked the network and found no errors reported.

      << OSS >>
      2012/06/03 17:32:06 kern.err@oss4 kernel[-]: [261289.824573] LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 172.17.11.9@o2ib ns: filter-data-OST0007_UUID lock: ffff8100debd3000/0x2babc16b9ceef3a2 lrc: 3/0,0 mode: PW/PW res: 345385/0 rrc: 8 type: EXT [32768->159743] (req 32768->36863) flags: 0x20 remote: 0xd4d1e1a63a1900b8 expref: 14 pid: 2520 timeout 4556489496
      2012/06/03 17:32:08 kern.err@oss4 kernel[-]: [261292.335250] LustreError: 26916:0:(ldlm_lib.c:1914:target_send_reply_msg()) @@@ processing error (107) req@ffff81049eabb000 x1403470492409861/t0 o13><?>@<?>:0/0 lens 192/0 e 0 to 0 dl 1338712334 ref 1 fl Interpret:/0/0 rc -107/0
      2012/06/03 17:32:22 kern.err@oss4 kernel[-]: [261306.299203] LustreError: 2435:0:(ldlm_lib.c:1914:target_send_reply_msg()) @@@ processing error (114) req@ffff8102ac2e3c00 x1403470492409878/t0 o8><?>@<?>:0/0 lens 368/264 e 0 to 0 dl 1338712442 ref 1 fl Interpret:/0/0 rc -114/0

      << client >>
      2012/06/03 17:32:08 kern.err@cnode009 kernel[-]: kernel: LustreError: 11-0: an error occurred while communicating with 172.17.13.36@o2ib. The ost_statfs operation failed with -107
      2012/06/03 17:32:08 kern.warning@cnode009 kernel[-]: kernel: Lustre: data-OST0007-osc-ffff81063bc60800: Connection to service data-OST0007 via nid 172.17.13.36@o2ib was lost; in progress operations using this service will wait for recovery to complete.
      2012/06/03 17:32:14 kern.warning@cnode009 kernel[-]: kernel: Lustre: 3960:0:(client.c:1482:ptlrpc_expire_one_request()) @@@ Request x1403470492409862 sent from data-OST0007-osc-ffff81063bc60800 to NID 172.17.13.36@o2ib 6s ago has timed out (6s prior to deadline).
      2012/06/03 17:32:14 kern.warning@cnode009 kernel[-]: kernel: req@ffff810256624800 x1403470492409862/t0 o8->data-OST0007_UUID@172.17.13.36@o2ib:28/4 lens 368/584 e 0 to 1 dl 1338712334 ref 1 fl Rpc:N/0/0 rc 0/0
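
      To see whether the evictions cluster on particular clients or OSTs, the OSS syslog can be
      scanned for the eviction message shown above. The following is a minimal sketch, not a
      Lustre tool: the regular expression is derived only from the single OSS line in this
      report, so other Lustre versions or syslog layouts may need a different pattern. The names
      EVICTION_RE and find_evictions are hypothetical.

```python
import re

# Assumption: the pattern mirrors the one OSS eviction line quoted in this
# report; it is not taken from Lustre documentation.
EVICTION_RE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"\S+@(?P<host>\S+) .*?"
    r"lock callback timer expired after (?P<seconds>\d+)s: "
    r"evicting client at (?P<nid>\S+)\s+ns: (?P<namespace>\S+)"
)


def find_evictions(lines):
    """Return one dict per eviction message found in an iterable of log lines."""
    events = []
    for line in lines:
        match = EVICTION_RE.search(line.strip())
        if match:
            event = match.groupdict()
            event["seconds"] = int(event["seconds"])  # timer duration as a number
            events.append(event)
    return events


# OSS line from this report, cut off after the lock handle for brevity.
sample_oss_line = (
    "2012/06/03 17:32:06 kern.err@oss4 kernel[-]: [261289.824573] "
    "LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### "
    "lock callback timer expired after 101s: evicting client at "
    "172.17.11.9@o2ib ns: filter-data-OST0007_UUID "
    "lock: ffff8100debd3000/0x2babc16b9ceef3a2"
)

if __name__ == "__main__":
    for event in find_evictions([sample_oss_line]):
        print(event["date"], event["time"], event["host"],
              event["nid"], event["namespace"], event["seconds"])
```

      Counting the resulting events per nid or per namespace over a few days of logs would show
      whether the problem follows specific clients or specific OSTs.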

            People

              bobijam Zhenyu Xu
              mnishizawa Mitsuhiro Nishizawa
