[LU-4040] mdt_object_find() and mdt_body_unpack() assert on some FIDs

Details

    • Bug
    • Resolution: Not a Bug
    • Critical
    • None
    • Lustre 2.5.0
    • 3
    • 10849

    Description

      Passing a valid local storage FID to mdt_object_find() or mdt_body_unpack() will cause a successful lu_object lookup but trigger a failed assertion in mdt_obj():

      static struct mdt_object *mdt_obj(struct lu_object *o)
      {
              LASSERT(lu_device_is_mdt(o->lo_dev));
              return container_of0(o, struct mdt_object, mot_obj);
      }
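
      One way to avoid the LBUG would be a checked variant of mdt_obj() that reports a foreign object to the caller instead of asserting. The following is a minimal user-space sketch, not Lustre code: the struct layouts, the ld_is_mdt flag, and mdt_obj_checked() are hypothetical stand-ins. Only the names lu_device_is_mdt, lo_dev, and mot_obj are taken from the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct lu_device {
        int ld_is_mdt;  /* assumption: the real check inspects the device type */
};

struct lu_object {
        struct lu_device *lo_dev;
};

struct mdt_object {
        int              mot_flags;  /* placeholder so mot_obj is not at offset 0 */
        struct lu_object mot_obj;
};

static int lu_device_is_mdt(const struct lu_device *d)
{
        return d != NULL && d->ld_is_mdt;
}

/*
 * Hypothetical checked variant of mdt_obj(): returns NULL when the
 * object does not belong to an MDT device, so the caller can fail the
 * request instead of hitting an assertion.
 */
static struct mdt_object *mdt_obj_checked(struct lu_object *o)
{
        if (o == NULL || !lu_device_is_mdt(o->lo_dev))
                return NULL;

        /* Same pointer arithmetic that container_of() performs. */
        return (struct mdt_object *)((char *)o -
                                     offsetof(struct mdt_object, mot_obj));
}
```

      A caller in this sketch would treat a NULL return as an invalid FID and return an error to the client rather than continuing with the object.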
      

      An easy way to exploit this is through the new HSM ioctls which ship unvalidated FIDs to the MDT:

      # sys_hsm archive /mnt/lustre '[0xa:0x0:0x0]'
      
      Message from syslogd@t at Oct  1 15:15:22 ...
       kernel:LustreError: 12435:0:(mdt_handler.c:2395:mdt_obj()) ASSERTION( lu_device_is_mdt(o->lo_dev) ) failed: 
      
      Message from syslogd@t at Oct  1 15:15:22 ...
       kernel:LustreError: 12435:0:(mdt_handler.c:2395:mdt_obj()) LBUG
      
      Call Trace:
       [<ffffffffa0d93895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0d93e97>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa0534365>] mdt_obj+0x55/0x80 [mdt]
       [<ffffffffa0538b06>] mdt_object_find+0x66/0x170 [mdt]
       [<ffffffffa058aa8b>] mdt_hsm_get_md_hsm+0x6b/0x480 [mdt]
       [<ffffffffa0582765>] mdt_hsm_add_actions+0x435/0x10a0 [mdt]
       [<ffffffff81168bab>] ? cache_alloc_refill+0x15b/0x240
       [<ffffffff81169b7c>] ? __kmalloc+0x20c/0x220
       [<ffffffffa0579e17>] mdt_hsm_request+0x647/0xba0 [mdt]
       [<ffffffffa0542a8a>] mdt_handle_common+0x52a/0x1470 [mdt]
       [<ffffffffa057ca25>] mds_regular_handle+0x15/0x20 [mdt]
       [<ffffffffa106ee45>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
       [<ffffffffa0da527f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
       [<ffffffffa10664e9>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
       [<ffffffffa10701ad>] ptlrpc_main+0xaed/0x1740 [ptlrpc]
       [<ffffffffa106f6c0>] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
       [<ffffffff81096a36>] kthread+0x96/0xa0
       [<ffffffff8100c0ca>] child_rip+0xa/0x20
       [<ffffffff810969a0>] ? kthread+0x0/0xa0
       [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      
      # sys_hsm release /mnt/lustre '[0xa:0x0:0x0]'
      
      Message from syslogd@t at Oct  1 15:24:08 ...
       kernel:LustreError: 3456:0:(mdt_handler.c:2395:mdt_obj()) ASSERTION( lu_device_is_mdt(o->lo_dev) ) failed: 
      
      Message from syslogd@t at Oct  1 15:24:08 ...
       kernel:LustreError: 3456:0:(mdt_handler.c:2395:mdt_obj()) LBUG
      
      Kernel panic - not syncing: LBUG
      Pid: 3456, comm: mdt01_002 Not tainted 2.6.32-358.18.1.el6.lustre.x86_64 #1
      Call Trace:
       [<ffffffff8150f018>] ? panic+0xa7/0x16f
       [<ffffffffa02a9eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
       [<ffffffffa0ac8365>] ? mdt_obj+0x55/0x80 [mdt]
       [<ffffffffa0accb06>] ? mdt_object_find+0x66/0x170 [mdt]
       [<ffffffffa0accecc>] ? mdt_unpack_req_pack_rep+0x2bc/0x4d0 [mdt]
       [<ffffffffa0ad6a54>] ? mdt_handle_common+0x4f4/0x1470 [mdt]
       [<ffffffffa0b10a25>] ? mds_regular_handle+0x15/0x20 [mdt]
       [<ffffffffa05dfe45>] ? ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
       [<ffffffffa02bb27f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
       [<ffffffffa05d74e9>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
       [<ffffffffa05e11ad>] ? ptlrpc_main+0xaed/0x1740 [ptlrpc]
       [<ffffffffa05e06c0>] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
       [<ffffffff81096a36>] ? kthread+0x96/0xa0
       [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
       [<ffffffff810969a0>] ? kthread+0x0/0xa0
       [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
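
      Because both traces start from a FID that was never validated, a sanity check before the lookup is one possible mitigation. The sketch below parses the bracketed FID syntax used in the commands above and rejects sequences below an assumed "normal" threshold. The struct, the helper names, and the threshold constant are assumptions made for illustration. A real check would need to allow the other sequence ranges that Lustre treats as valid.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space mirror of a FID: [seq:oid:ver]. */
struct fid {
        uint64_t seq;
        uint32_t oid;
        uint32_t ver;
};

/* Assumed lower bound for ordinary file sequences; not verified here. */
#define ASSUMED_FID_SEQ_NORMAL 0x200000400ULL

/* Parse "[0xSEQ:0xOID:0xVER]". Returns 0 on success, -1 on bad syntax. */
static int fid_parse(const char *s, struct fid *f)
{
        char close = 0, extra = 0;
        int n = sscanf(s, " [%" SCNx64 ":%" SCNx32 ":%" SCNx32 "%c%c",
                       &f->seq, &f->oid, &f->ver, &close, &extra);

        /* Exactly four conversions: three numbers and the closing ']'. */
        return (n == 4 && close == ']') ? 0 : -1;
}

/* Hypothetical filter: accept only sequences in the assumed normal range. */
static int fid_seq_looks_normal(const struct fid *f)
{
        return f->seq >= ASSUMED_FID_SEQ_NORMAL;
}
```

      With this filter, the FID [0xa:0x0:0x0] from the reproducer parses correctly but is rejected before any object lookup takes place.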
      

        Activity

          jhammond John Hammond added a comment -

          Bruno, you're correct about mdt_obj(). There may still be some bugs with bad FIDs but that's not new for the MDT handlers. Let's close this one.

          bfaccini Bruno Faccini (Inactive) added a comment -

          John, I wonder whether this is still an issue with current master?
          BTW, the current mdt_obj() code no longer has the "LASSERT(lu_device_is_mdt(o->lo_dev))".
          Also, do you remember which of "the new HSM ioctls which ship unvalidated FIDs to the MDT" you used to trigger the LBUG, so that we can retry against the new code and see whether there are other or new issues when using unvalidated FID arguments?

          People

            bfaccini Bruno Faccini (Inactive)
            jhammond John Hammond