Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8299

(vvp_io.c:922:vvp_io_fault_start()) ASSERTION( cl_index(obj, offset) == fio->ft_index )

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.13.0
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      Full stack of the panic:

      <0>LustreError: 15833:0:(vvp_io.c:1233:vvp_io_fault_start()) ASSERTION( cl_index(obj, offset) == fio->ft_index ) failed:
      <0>LustreError: 15833:0:(vvp_io.c:1233:vvp_io_fault_start()) LBUG
      <4>Pid: 15833, comm: a.out
      <4>
      <4>Call Trace:
      <4> [<ffffffffa121c895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4> [<ffffffffa121ce97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4> [<ffffffffa192b6fb>] vvp_io_fault_start+0x9ab/0xc70 [lustre]
      <4> [<ffffffffa122b121>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      <4> [<ffffffffa1225778>] ? libcfs_log_return+0x28/0x40 [libcfs]
      <4> [<ffffffffa135560a>] cl_io_start+0x6a/0x140 [obdclass]
      <4> [<ffffffffa1359574>] cl_io_loop+0xb4/0x1b0 [obdclass]
      <4> [<ffffffffa1909cee>] ll_fault+0x2de/0x510 [lustre]
      <4> [<ffffffff8114a234>] __do_fault+0x54/0x530
      <4> [<ffffffff8114a807>] handle_pte_fault+0xf7/0xb00
      <4> [<ffffffff8117b2a9>] ? __mem_cgroup_try_charge+0x19/0x5d0
      <4> [<ffffffffa121c27b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
      <4> [<ffffffff8111f93e>] ? find_get_page+0x1e/0xa0
      <4> [<ffffffff8152a0b9>] ? _cond_resched+0x9/0x40
      <4> [<ffffffff81120e23>] ? filemap_fault+0xd3/0x500
      <4> [<ffffffff8114b43a>] handle_mm_fault+0x22a/0x300
      <4> [<ffffffff8104a8d8>] __do_page_fault+0x138/0x480
      <4> [<ffffffff812307e4>] ? inode_has_perm+0x54/0xa0
      <4> [<ffffffff8122ea61>] ? avc_has_perm+0x71/0x90
      <4> [<ffffffff81334bbd>] ? tty_wakeup+0x3d/0x80
      <4> [<ffffffff8152f23e>] do_page_fault+0x3e/0xa0
      <4> [<ffffffff8152c5f5>] page_fault+0x25/0x30
      <4> [<ffffffff8128e6ef>] ? copy_user_generic_unrolled+0x2f/0xb0
      <4> [<ffffffff8133494f>] ? tty_write+0x1ff/0x2a0
      <4> [<ffffffff81337850>] ? n_tty_write+0x0/0x460
      <4> [<ffffffff81189278>] vfs_write+0xb8/0x1a0
      <4> [<ffffffff81189c41>] sys_write+0x51/0x90
      <4> [<ffffffff810e202e>] ? __audit_syscall_exit+0x25e/0x290
      <4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      

      This is triggered by a trivial mmap test code which accidentally passes in an argument >2^63. I'll attach code for the test.

      Running it like this:

      truncate -s 8589934592 file
      ./mmap_test file 8589934500 8
      

      Will cause that panic. The test is incorrect, but it still shouldn't panic nodes.

      In vvp_io_fault_start, we calculate the index with cl_index. This treats its argument (offset) as signed... But offset (which is the same as ft_index here, which is that argument from userspace) is greater than 2^63, so it's negative, so this doesn't work.

      /**
       * Converts a page index into a byte offset within object \a obj.
       */
      pgoff_t cl_index(const struct cl_object *obj, loff_t offset)
      {
              return offset >> PAGE_CACHE_SHIFT;
      }
      EXPORT_SYMBOL(cl_index);
      

      When offset is negative, bit shift division is invalid. Oops.

      This is corrected by making the function in question (cl_index) accept its argument as unsigned. This gets the bits twiddled correctly, but it's nasty, because offset in vvp_io_fault_start is still loff_t... and loff_t is still signed.

      I don't have a larger solution, because loff_t being signed is something we get from the kernel. It seems wrong, but I'm not sure what to do about it.

      I'll post my little patch and the powers that be can ponder the larger situation.

      Attachments

        Activity

          People

            zam Alexander Zarochentsev
            paf Patrick Farrell
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: