Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Severity:
      3
    • Rank (Obsolete):
      9223372036854775807

      Description

      When Lustre is used underneath an NFS server, we observe that the write
      requests reaching Lustre are not page-aligned even when the application
      issues aligned writes. Mostly it is the first and last pages that are not
      aligned.

      From code analysis, the following function appears to cause the issue:

       
      
      static int fill_in_write_vector(struct kvec *vec, struct nfsd4_write *write)
      {
          int i = 1;
          int buflen = write->wr_buflen;

          vec[0].iov_base = write->wr_head.iov_base;
          vec[0].iov_len = min_t(int, buflen, write->wr_head.iov_len); /* <====== */
          buflen -= vec[0].iov_len;

          while (buflen) {
              vec[i].iov_base = page_address(write->wr_pagelist[i - 1]);
              vec[i].iov_len = min_t(int, PAGE_SIZE, buflen);
              buflen -= vec[i].iov_len;
              i++;
          }
          return i;
      }

      nfsd4_write()
      {
          :
          nvecs = fill_in_write_vector(rqstp->rq_vec, write);
          :
      }
      
      

       

      That is, vector 0 is filled with the smaller of buflen and the wr_head
      length, while each remaining vector takes up to PAGE_SIZE bytes from the
      page list.

      Because of this, the first and last pages are not aligned.
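
      To make the split concrete, here is a minimal userspace sketch that
      mirrors the rule in fill_in_write_vector(). The struct seg type, the
      split_like_nfsd() and count_unaligned() helpers, the fixed 4096-byte
      page size, and the assumption that the write starts at a page-aligned
      file offset are all illustrative and not part of the kernel code.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size; the kernel uses PAGE_SIZE */

/* Hypothetical stand-in for struct kvec: only offsets and lengths matter here. */
struct seg {
    size_t file_off;  /* offset within the write, assuming it starts page-aligned */
    size_t len;       /* number of bytes carried by this segment */
};

/*
 * Mirror the splitting rule: segment 0 takes min(buflen, head_len) bytes,
 * every later segment takes up to one page. Unlike the original, this sketch
 * bounds the loop by max_segs. Returns the number of segments filled in.
 */
static int split_like_nfsd(size_t buflen, size_t head_len,
                           struct seg *out, int max_segs)
{
    size_t off = 0;
    size_t n;
    int i = 1;

    if (max_segs < 1)
        return 0;

    n = buflen < head_len ? buflen : head_len;
    out[0].file_off = 0;
    out[0].len = n;
    off += n;
    buflen -= n;

    while (buflen > 0 && i < max_segs) {
        n = buflen < SKETCH_PAGE_SIZE ? buflen : SKETCH_PAGE_SIZE;
        out[i].file_off = off;
        out[i].len = n;
        off += n;
        buflen -= n;
        i++;
    }
    return i;
}

/* Count segments whose start offset or length is not a multiple of the page size. */
static int count_unaligned(const struct seg *segs, int nsegs)
{
    int i, n = 0;

    for (i = 0; i < nsegs; i++)
        if (segs[i].file_off % SKETCH_PAGE_SIZE || segs[i].len % SKETCH_PAGE_SIZE)
            n++;
    return n;
}
```

      For a 1 MiB write with a hypothetical 100-byte head, this yields 257
      segments: 100 bytes, then 255 full pages that each start 100 bytes past
      a page boundary, then a 3996-byte tail.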

      When such a request hits the Lustre write path, the first page is
      unaligned (partial), so the whole write is treated as unaligned, which
      degrades performance. The degradation seems to come mostly from the
      unaligned writes triggering read RPCs.
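
      A rough way to see why this hurts: assume each segment is handled as its
      own write, and that a page covered only partially must be read before it
      can be modified. This is a simplified model, not a description of the
      actual Lustre client logic. The sketch below counts partial-page touches
      under that model; partial_pages(), partial_touches_when_split(), and the
      4096-byte page size are hypothetical names and values for illustration.

```c
#define SKETCH_PAGE_SIZE 4096ULL  /* assumed page size */

/* Number of pages that the byte range [off, off + len) covers only partially. */
static unsigned partial_pages(unsigned long long off, unsigned long long len)
{
    unsigned long long first, last;
    unsigned head_partial, tail_partial;

    if (len == 0)
        return 0;

    first = off / SKETCH_PAGE_SIZE;
    last = (off + len - 1) / SKETCH_PAGE_SIZE;
    head_partial = (off % SKETCH_PAGE_SIZE) != 0;
    tail_partial = ((off + len) % SKETCH_PAGE_SIZE) != 0;

    if (first == last)              /* range lies within a single page */
        return head_partial || tail_partial;
    return head_partial + tail_partial;
}

/*
 * Sum of partial-page touches when a write of `total` bytes, starting at a
 * page-aligned offset, is split into a head of `head_len` bytes followed by
 * page-sized segments, and each segment is counted on its own.
 */
static unsigned long partial_touches_when_split(unsigned long long total,
                                                unsigned long long head_len)
{
    unsigned long long off = 0;
    unsigned long long n = total < head_len ? total : head_len;
    unsigned long touches = partial_pages(off, n);

    off += n;
    total -= n;
    while (total > 0) {
        n = total < SKETCH_PAGE_SIZE ? total : SKETCH_PAGE_SIZE;
        touches += partial_pages(off, n);
        off += n;
        total -= n;
    }
    return touches;
}
```

      Under this model, a 1 MiB write split after a 100-byte head touches 512
      pages partially, whereas the same range handled as one extent touches
      none.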

      Possible Solution:
      ==================

      As we can see here, the partial first vector causes all of the following
      page-sized vectors to be treated as partial, which in turn causes read
      requests/RPCs. I think we can avoid those reads, since we know the
      partial first vector is the cause. We can detect it by scanning the
      vector (we are already iterating over it). I have cooked up a prototype
      patch based on this approach.
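
      The detection idea can be sketched in userspace as follows. This is not
      the prototype patch (which is not shown in this ticket); it only
      illustrates scanning a vector to recognize that individually partial
      segments add up to a page-aligned extent. It uses struct iovec from
      <sys/uio.h> as a stand-in for struct kvec; the function name, the
      file_off parameter, and the 4096-byte page size are assumptions.

```c
#include <stdbool.h>
#include <sys/uio.h>

#define SKETCH_PAGE_SIZE 4096ULL  /* assumed page size */

/*
 * Return true when at least one segment is not a whole number of pages,
 * yet the extent as a whole (start offset and total length) is page-aligned.
 * In that case every page in the extent is fully overwritten, so under the
 * approach described above no read should be needed for it.
 */
static bool extent_aligned_despite_partial_segs(const struct iovec *vec,
                                                int nvecs,
                                                unsigned long long file_off)
{
    unsigned long long total = 0;
    bool saw_partial = false;
    int i;

    for (i = 0; i < nvecs; i++) {
        total += vec[i].iov_len;
        if (vec[i].iov_len % SKETCH_PAGE_SIZE != 0)
            saw_partial = true;   /* e.g. the short head segment */
    }

    return saw_partial &&
           total > 0 &&
           file_off % SKETCH_PAGE_SIZE == 0 &&
           total % SKETCH_PAGE_SIZE == 0;
}
```

      With segment lengths of 100, 4096, and 3996 bytes at file offset 0, the
      check returns true, because the segments sum to exactly two pages. It
      returns false if the total length or the starting offset is not a
      multiple of the page size.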

      Feedback on this approach is welcome, as are any questions, suggestions,
      or concerns.

      The following stats were collected without and with the patch.

      1. salloc -N 8 --ntasks-per-node=16 mpirun --allow-run-as-root /work/tools/bin/ior -a POSIX -w -r -vv -e -t 1m -b 2g -C -Q 21 -F -o /scratch1_nfs/file

      lustre-2.10.4
      Max Write: 379.60 MiB/sec (398.03 MB/sec)
      Max Read: 4618.64 MiB/sec (4842.99 MB/sec)

      lustre-2.10.4 w/ prototype patch
      Max Write: 3817.34 MiB/sec (4002.77 MB/sec)
      Max Read: 4474.30 MiB/sec (4691.64 MB/sec)

            People

            • Assignee:
              rdeshmukh_wc Rahul Deshmukh
              Reporter:
              rdeshmukh_ddn Rahul Deshmukh (Inactive)