LU-12536

Processes stuck in unkillable sleep waiting on IO during Lustre re-export of NFS testing

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.13.0
    • Affects Version/s: Lustre 2.7.0
    • Labels: None
    • Severity: 3

    Description

      When a regression test suite is run on an NFS client against an NFS-exported Lustre file system, the NFS server/Lustre client node slows down. Many of the nfsd threads are stuck in osc_extent_wait():

      PID: 5989, 6017, 6018, 6022, 6023, 6024, 6025, 6026, 6027, 6028, 6029, 6030, 6031, 6032, 6033, 6034, 6035, 6036, 6037, 6038, 6039, 6040, 6041, 6042, 6043
      TASKS: 25
              schedule at ffffffff8161523e
              osc_extent_wait at ffffffffa0ec96b0 [osc]
              osc_cache_wait_range at ffffffffa0ecff5c [osc]
              osc_io_fsync_end at ffffffffa0ebc7c6 [osc]
              cl_io_end at ffffffffa09d6ac5 [obdclass]
              lov_io_end_wrapper at ffffffffa0ca3314 [lov]
              lov_io_fsync_end at ffffffffa0ca366e [lov]
              cl_io_end at ffffffffa09d6ac5 [obdclass]
              cl_io_loop at ffffffffa09da0dc [obdclass]
              cl_sync_file_range at ffffffffa0d9aea5 [lustre]
              ll_writepages at ffffffffa0dc1e83 [lustre]
              do_writepages at ffffffff811519ae
              __filemap_fdatawrite_range at ffffffff81146121
              filemap_write_and_wait_range at ffffffff8114623a
              ll_fsync at ffffffffa0d9b09a [lustre]
              vfs_fsync_range at ffffffff811d925b
              vvp_io_write_start at ffffffffa0df29f7 [lustre]
              cl_io_start at ffffffffa09d6d0e [obdclass]
              cl_io_loop at ffffffffa09da0ce [obdclass]
              ll_file_io_generic at ffffffffa0d91f88 [lustre]
              ll_file_write_iter at ffffffffa0d9257d [lustre]
              do_iter_readv_writev at ffffffff811a988a
              do_readv_writev at ffffffff811aa258
              vfs_writev at ffffffff811aa50c
              nfsd_vfs_write at ffffffff812e5e02
              nfsd_write at ffffffff812e84f8
              nfsd3_proc_write at ffffffff812ed523
              nfsd_dispatch at ffffffff812e14ae
              svc_process at ffffffff815ec536
              nfsd at ffffffff812e0ef0
              kthread at ffffffff81074376
              ret_from_fork at ffffffff8161983f
      

      They are waiting for the extent's oe_state to change to OES_INV, but there is no I/O pending that would cause the state to change. The ptlrpcd queues are empty, and no threads are performing synchronous I/O.
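
      Conceptually, each stuck nfsd thread is sleeping in a wait of the following shape (a simplified sketch only; the field names are illustrative of the pattern, not quoted from osc_cache.c):

      /*
       * The thread sleeps until the extent reaches its terminal state.
       * The wakeup comes from the I/O completion path, so with no RPCs
       * in flight the condition never becomes true and the sleep,
       * being uninterruptible, cannot be killed.
       */
      wait_event(ext->oe_waitq, ext->oe_state == OES_INV);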

      The problem was traced to a kernel change in generic_write_sync(): it now checks for IOCB_DSYNC in the kiocb's ki_flags instead of O_DSYNC in the file flags and IS_SYNC() on the inode. As a result, generic_write_sync() does not write anything, and the osc_extents are not released before the wait begins.

      Old function:

      int generic_write_sync(struct file *file, loff_t pos, loff_t count)
      {
              if (!(file->f_flags & O_DSYNC) && !IS_SYNC(file->f_mapping->host))
                      return 0;
              return vfs_fsync_range(file, pos, pos + count - 1,
                                     (file->f_flags & __O_SYNC) ? 0 : 1);
      }
      

      New function:

      static inline ssize_t generic_write_sync(struct kiocb *iocb, ssize_t count)
      {
              if (iocb->ki_flags & IOCB_DSYNC) {
                      int ret = vfs_fsync_range(iocb->ki_filp,
                                      iocb->ki_pos - count, iocb->ki_pos - 1,
                                      (iocb->ki_flags & IOCB_SYNC) ? 0 : 1);
                      if (ret)
                              return ret;
              }
      
              return count;
      }
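
      For reference, the new interface performs the sync only when the caller has propagated the open flags into the kiocb. That mapping happens when the kiocb is initialized, along the lines of the kernel's iocb_flags() helper, sketched below (illustrative name and body; the exact helper varies by kernel version):

      static inline int sketch_iocb_sync_flags(struct file *file)
      {
              int flags = 0;

              /* O_DSYNC, or an inode flagged for synchronous I/O,
               * requests a data sync on every write... */
              if ((file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host))
                      flags |= IOCB_DSYNC;
              /* ...and full O_SYNC additionally requests a metadata
               * sync (note O_SYNC includes the O_DSYNC bit, so both
               * flags end up set). */
              if (file->f_flags & __O_SYNC)
                      flags |= IOCB_SYNC;
              return flags;
      }

      A write path that builds its own kiocb without applying this mapping leaves ki_flags clear, so the new generic_write_sync() returns without writing anything, which matches the behavior described above.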
      

People

    • Assignee: Ann Koehler (Inactive)
    • Reporter: Ann Koehler (Inactive)
