Lustre / LU-3399

MDT doesn't update the client's last_committed correctly, which causes OOM on the client

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.0.0, Lustre 2.1.0, Lustre 2.2.0, Lustre 2.3.0, Lustre 2.4.0, Lustre 1.8.x (1.8.0 - 1.8.5)
    • Environment: Lustre mounted with the noatime or relatime mount option.
    • 3
    • 8414

    Description

      A Cray customer hit an OOM on a client while running the Robin-Hood application on Lustre.
      Internal investigation showed that the application issued an ls -lRa /mnt/lustre on a Lustre mount with the noatime option. In that case all open/close requests stay in the replay cache until they are committed (in the Cray case, ~100k to 200k requests were sitting in the slab cache). These requests wait for a commit from the MDT side, but none arrives, because the client's open/close/getattr requests do not generate any real transaction.
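The retention rule described above can be sketched as follows. This is a simplified model, not the actual ptlrpc code: the struct replay_req type and the prune_replay_list() helper are hypothetical names introduced for illustration, and the model assumes every retained request carries a non-zero transno and is released only once the server reports a last_committed value that covers it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a request kept for replay. */
struct replay_req {
    uint64_t transno;   /* transaction number assigned by the server */
    int      freed;     /* set once the client has dropped the request */
};

/*
 * Drop every retained request whose transno is covered by last_committed.
 * Returns the number of requests still held after pruning.
 */
static size_t prune_replay_list(struct replay_req *reqs, size_t n,
                                uint64_t last_committed)
{
    size_t held = 0;

    for (size_t i = 0; i < n; i++) {
        /* Model assumption: only requests with a transno are retained. */
        assert(reqs[i].transno > 0);

        if (reqs[i].freed)
            continue;
        if (reqs[i].transno <= last_committed)
            reqs[i].freed = 1;      /* committed on the server: safe to drop */
        else
            held++;                 /* still needed for a possible replay */
    }
    return held;
}
```

In this model, a client that keeps receiving last_committed == 0 never frees anything, so the list grows with every open/close it issues.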
      This is easy to reproduce: prepare a filesystem with a large number of directories, stop Lustre, then run:
      1) MOUNTOPT="-o user_xattr,flock,noatime" PTLDEBUG=-1 SUBSYSTEM=-1 DEBUG_SIZE=800 NOFORMAT=yes sh llmount.sh
      2) Starting client: rhel6-64.shadowland: -o user_xattr,flock,noatime rhel6-64.shadowland@tcp:/lustre /mnt/lustre
      Using TIMEOUT=20
      disable quota as required
      [root@rhel6-64 tests]# lctl dk log1
      Debug log: 148739 lines, 148739 kept, 0 dropped, 0 bad.
      [root@rhel6-64 tests]# ls -lRa /mnt/lustre > /dev/null
      [root@rhel6-64 tests]# mount -t lustre -o user_xattr,flock,noatime rhel6-64.shadowland@tcp:/lustre /mnt/lustre2
      [root@rhel6-64 tests]# grep rpc_cache /proc/slabinfo
      rpc_cache 2466 2695 1168 7 2 : tunables 24 12 8 : slabdata 385 385 0 : globalstat 11338 2695 389 3 0 10 0 0 0 : cpustat 29477 1686 27471 1238

      (The patch that moves ptlrpc requests into a slab cache is awaiting WC/Intel review.)

      So after a simple ls -lRa we have about 2500 active requests.
      The bug is hit.
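To put a number on the slab usage, the rpc_cache line from /proc/slabinfo can be parsed directly. The sketch below relies only on the standard leading fields of a slabinfo line (name, active_objs, num_objs, objsize); struct slab_usage and both helper functions are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct slab_usage {
    unsigned long active_objs;  /* objects currently in use */
    unsigned long num_objs;     /* objects allocated in total */
    unsigned long objsize;      /* bytes per object */
};

/*
 * Parse one /proc/slabinfo line if it belongs to the cache called name.
 * Returns 0 on success, -1 if the line is malformed or for another cache.
 */
static int parse_slab_line(const char *line, const char *name,
                           struct slab_usage *out)
{
    char cache[64];

    assert(line != NULL && name != NULL && out != NULL);

    if (sscanf(line, "%63s %lu %lu %lu", cache,
               &out->active_objs, &out->num_objs, &out->objsize) != 4)
        return -1;
    return strcmp(cache, name) == 0 ? 0 : -1;
}

/* Bytes occupied by live objects in the cache. */
static unsigned long slab_active_bytes(const struct slab_usage *u)
{
    return u->active_objs * u->objsize;
}
```

For the line shown above (2466 active objects of 1168 bytes) this gives 2,880,288 bytes. At the same object size, the 100k to 200k requests seen in the Cray case would correspond to roughly 117 to 234 MB of request structures alone, presumably not counting the message buffers they reference.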

      To flush these requests, one has to touch $some_file_on_lustre from the affected mount.
      A touch from a different mount point does not flush the request cache.

      After additional investigation, I found that the MDT sends zero as last_committed to the affected client, whereas client2 receives correct last_committed updates.

      As far as I can see, target_committed_to_req sends only the per-export committed value (exp_last_committed) to the client instead of using the global one. As a result, the affected client is unable to commit its requests on the basis of other clients' activity.
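The difference between the two values can be illustrated with a small model. The field names exp_last_committed and obd_last_committed come from the patch in this report; everything else (struct obd_model, struct export_model, the helper functions, and the assumption that a commit advances both counters) is a hypothetical reduction for illustration, not the real server code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced models of the server-side structures. */
struct obd_model {
    uint64_t obd_last_committed;   /* highest transno committed on the target */
};

struct export_model {
    struct obd_model *exp_obd;     /* target this client is connected to */
    uint64_t exp_last_committed;   /* highest committed transno of this client */
};

/* Model of a commit notification for a transaction issued through exp. */
static void commit_transaction(struct export_model *exp, uint64_t transno)
{
    assert(exp->exp_obd);

    if (transno > exp->exp_last_committed)
        exp->exp_last_committed = transno;
    if (transno > exp->exp_obd->obd_last_committed)
        exp->exp_obd->obd_last_committed = transno;
}

/* Value placed in the reply by the current code: per-export. */
static uint64_t reply_last_committed_per_export(const struct export_model *exp)
{
    return exp->exp_last_committed;
}

/* Value placed in the reply with the proposed change: target-wide. */
static uint64_t reply_last_committed_global(const struct export_model *exp)
{
    return exp->exp_obd->obd_last_committed;
}
```

In this model, a client that issues only non-transactional requests keeps seeing 0 from the per-export variant no matter how much other clients commit, while the global variant advances for every client connected to the target.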

      Can someone explain why the per-export value is used instead of the global obd_last_committed?
      A simple patch has reduced the problem for me.

      diff --git a/lustre/ldlm/ldlm_lib.c b/lustre/ldlm/ldlm_lib.c
      index 31ffbc9..99f2f26 100644
      --- a/lustre/ldlm/ldlm_lib.c
      +++ b/lustre/ldlm/ldlm_lib.c
      @@ -2529,7 +2529,8 @@ int target_committed_to_req(struct ptlrpc_request *req)
       
               if (likely(!exp->exp_obd->obd_no_transno && req->rq_repmsg != NULL)) {
                       lustre_msg_set_last_committed(req->rq_repmsg,
      -                                              exp->exp_last_committed);
      +                                               exp->exp_obd->obd_last_committed);
      +//                                              exp->exp_last_committed);
               } else {
                       DEBUG_REQ(D_IOCTL, req, "not sending last_committed update (%d/"
                                 "%d)", exp->exp_obd->obd_no_transno,
      @@ -2537,8 +2538,8 @@ int target_committed_to_req(struct ptlrpc_request *req)
                      ret = 0;
              }
       
      -        CDEBUG(D_INFO, "last_committed "LPU64", transno "LPU64", xid "LPU64"\n",
      -               exp->exp_last_committed, req->rq_transno, req->rq_xid);
      +        CDEBUG(D_INFO, "last_committed "LPU64"/"LPU64", transno "LPU64", xid "LPU64"\n",
      +               exp->exp_last_committed, exp->exp_obd->obd_last_committed, req->rq_transno, req->rq_xid);
              return ret;
       }
       EXPORT_SYMBOL(target_committed_to_req);
      

      With the patch, the cache shrank after the next ping.

            People

              Assignee: bogl Bob Glossman (Inactive)
              Reporter: shadow Alexey Lyashkov
              Votes: 0
              Watchers: 8
