[LU-16935] deadlock between ll_filemap_fault and ll_imp_inval Created: 29/Jun/23  Updated: 29/Jun/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Vladimir Saveliev Assignee: Vladimir Saveliev
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The following loop in ll_filemap_fault()

int ll_filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
..
        do {
                seq = read_seqbegin(&ll_i2info(inode)->lli_page_inv_lock);
                ret = __ll_filemap_fault(vma, vmf);
        } while (read_seqretry(&ll_i2info(inode)->lli_page_inv_lock, seq) &&
                 (ret & VM_FAULT_SIGBUS));

may become endless:

ll_filemap_fault()
  filemap_fault()
  ...
    ll_readpage()
      ll_io_read_page()
        rc = cl_sync_io_wait(env, anchor, 0);
        if (!PageUptodate(cl_page_vmpage(page)))
          cl_page_discard()
            vvp_page_discard()
              generic_error_remove_page()
                truncate_complete_page()
                  ...
                    vvp_page_delete()
                      write_seqlock(&ll_i2info(inode)->lli_page_inv_lock);
                      ClearPageUptodate(vmpage);
                      write_sequnlock(&ll_i2info(inode)->lli_page_inv_lock);

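If the page is not uptodate after cl_sync_io_wait(), vvp_page_delete(), which is called deep inside cl_page_discard(), takes the lli_page_inv_lock seqlock for writing and thereby increments its sequence count. The read_seqretry() check in ll_filemap_fault() then sees a changed sequence on every pass.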
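
The retry condition can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not Lustre code: every model_* name, the MODEL_VM_FAULT_SIGBUS placeholder value, and the max_attempts cap (added so the demonstration terminates) are hypothetical. The discard_on_failure switch reflects only what the subject line of the uploaded patch suggests ("do not discard page after readpage failure"); it is not taken from the patch itself.

```c
#include <stdbool.h>

/* Placeholder flag value; only used as a bit in this model. */
#define MODEL_VM_FAULT_SIGBUS 0x2

/* Hypothetical stand-in for lli_page_inv_lock: only the sequence counter. */
struct model_seqlock {
	unsigned int seq;
};

static unsigned int model_read_seqbegin(const struct model_seqlock *l)
{
	return l->seq;
}

static bool model_read_seqretry(const struct model_seqlock *l,
				unsigned int start)
{
	return l->seq != start;
}

/* Models write_seqlock() followed by write_sequnlock(). */
static void model_write_section(struct model_seqlock *l)
{
	l->seq += 2;
}

/*
 * Models one __ll_filemap_fault() call. With an invalid import the read
 * fails; if the page is discarded on failure, the sequence is bumped
 * before SIGBUS is returned.
 */
static int model_fault_once(struct model_seqlock *l, bool import_invalid,
			    bool discard_on_failure)
{
	if (!import_invalid)
		return 0;
	if (discard_on_failure)
		model_write_section(l);
	return MODEL_VM_FAULT_SIGBUS;
}

/* Same loop shape as ll_filemap_fault(), capped at max_attempts. */
static int model_fault_loop(struct model_seqlock *l, bool import_invalid,
			    bool discard_on_failure, int max_attempts)
{
	unsigned int seq;
	int attempts = 0;
	int ret;

	do {
		seq = model_read_seqbegin(l);
		ret = model_fault_once(l, import_invalid, discard_on_failure);
		attempts++;
	} while (model_read_seqretry(l, seq) &&
		 (ret & MODEL_VM_FAULT_SIGBUS) &&
		 attempts < max_attempts);

	return attempts;
}
```

In this model the loop uses every allowed attempt when the import is invalid and the page is discarded on failure, and exits after a single attempt when either condition is false.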

filemap_fault() (true for 4.12.14_122.147 and probably a few other SLES12 SP5 kernels) returns VM_FAULT_SIGBUS if readpage fails:

int filemap_fault(struct vm_fault *vmf)
..
        error = mapping->a_ops->readpage(file, page);
        if (!error) {
                wait_on_page_locked(page);
                if (!PageUptodate(page))
                        error = -EIO;
        }
        put_page(page);

        if (!error || error == AOP_TRUNCATED_PAGE)
                goto retry_find;

        /* Things didn't work out. Return zero to tell the mm layer so. */
        shrink_readahead_size_eio(file, ra);
        return VM_FAULT_SIGBUS;

When readpage fails as a result of eviction from the server side, the following deadlock forms:

ll_imp_inval gets stuck in

ptlrpc_invalidate_import_thread
  obd_import_event(IMP_EVENT_INVALIDATE)
    ..
      osc_object_invalidate
        l_wait_event(osc->oo_io_waitq, atomic_read(&osc->oo_nr_ios) == 0, &lwi);

and cannot proceed to recovery.

ll_filemap_fault() spins and keeps osc->oo_nr_ios != 0, because readpage()'s read RPC fails with -108 (-ESHUTDOWN) while the import is invalid:

static int ptlrpc_import_delay_req(struct obd_import *imp,
..
        } else if (imp->imp_invalid || imp->imp_obd->obd_no_recov) {
                if (!imp->imp_deactive)
                        DEBUG_REQ(D_NET, req, "IMP_INVALID");
                *status = -ESHUTDOWN; /* b=12940 */


 Comments   
Comment by Gerrit Updater [ 29/Jun/23 ]

"Vladimir Saveliev <vladimir.saveliev@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51505
Subject: LU-16935 llite: do not discard page after readpage failure
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ad4208e16bd537bdc93dc52b4e381b5148e6f880

Generated at Sat Feb 10 03:31:13 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.