Lustre / LU-16206

PCC crashes MDS: mdt_big_xattr_get()) ASSERTION( info->mti_big_lmm_used == 0 ) failed

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Affects Versions: Lustre 2.14.0, Lustre 2.15.1
    • Environment: Linux 5.4.0-1091-azure #96~18.04.1-Ubuntu SMP Tue Aug 30 19:15:32 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
    • Severity: 3

    Description

      Reproducible on 2.15.1 and 2.14.0.  Both clients and servers are running Ubuntu 18.04 as shown in Environment.

      Steps to reproduce:

      # confirm hsm is enabled
      mds-node:~# lctl get_param mdt.lustrefs-MDT0000.hsm_control
      mdt.lustrefs-MDT0000.hsm_control=enabled

      # setup pcc on client 0
      client-0:~# mkdir /pcc
      client-0:~# chmod 777 /pcc /lustre
      client-0:~# lhsmtool_posix --daemon --hsm-root /pcc --archive=2 /lustre < /dev/null > /tmp/copytool_log 2>&1
      client-0:~# lctl pcc add /lustre /pcc -p "gid={0},gid={2001} rwid=2"
      # setup pcc on client 1
      client-1:~# mkdir /pcc
      client-1:~# chmod 777 /pcc /lustre
      client-1:~# lhsmtool_posix --daemon --hsm-root /pcc --archive=3 /lustre < /dev/null > /tmp/copytool_log 2>&1
      client-1:~# lctl pcc add /lustre /pcc -p "gid={0},gid={2001} rwid=3"
      # create file on client 0 and confirm in-cache
      client-0:~# echo "test" > /lustre/test
      client-0:~# lfs pcc state /lustre/test
      file: /lustre/test, type: readwrite, PCC file: /pcc/0001/0000/0402/0000/0002/0000/0x200000402:0x1:0x0, user number: 0, flags: 0
      # read file from client 1
      client-1:~# lfs pcc state /lustre/test
      file: /lustre/test, type: none
      client-1:~# cat /lustre/test
      cat: /lustre/test: No data available
      client-1:~# cat /lustre/test
      test
      client-1:~# lfs pcc state /lustre/test
      file: /lustre/test, type: none
      # check pcc state, and attempt to attach again on client 0
      client-0:~# lfs pcc state /lustre/test
      file: /lustre/test, type: none
      client-0:~# lfs pcc attach -i 2 /lustre/test
      ^C^C^C^C^C^C^C^C^C   <---- hang
      # while client 0 is hanging, check state on client 1
      client-1:~# lfs pcc state /lustre/test
      ^C^C^C^C  <---- hang

      A few minutes later the stuck commands return. Examining the MDS shows that it crashed and rebooted. Relevant
      output from dmesg:

      [ 3266.211270] LustreError: 11458:0:(mdt_handler.c:960:mdt_big_xattr_get()) ASSERTION( info->mti_big_lmm_used == 0 ) failed:
      [ 3266.217023] LustreError: 11458:0:(mdt_handler.c:960:mdt_big_xattr_get()) LBUG
      [ 3266.220653] Pid: 11458, comm: mdt_rdpg02_001 5.4.0-1091-azure #96~18.04.1-Ubuntu SMP Tue Aug 30 19:15:32 UTC 2022
      [ 3266.220653] Call Trace TBD:
      [ 3266.220654] Kernel panic - not syncing: LBUG
      [ 3266.222778] CPU: 8 PID: 11458 Comm: mdt_rdpg02_001 Kdump: loaded Tainted: P           OE     5.4.0-1091-azure #96~18.04.1-Ubuntu
      [ 3266.224582] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
      [ 3266.224582] Call Trace:
      [ 3266.224582]  dump_stack+0x57/0x6d
      [ 3266.224582]  panic+0xf8/0x2d4
      [ 3266.224582]  lbug_with_loc+0x89/0x2c0 [libcfs]
      [ 3266.224582]  mdt_big_xattr_get+0x398/0x8b0 [mdt]
      [ 3266.224582]  ? mdd_read_unlock+0x2d/0xc0 [mdd]
      [ 3266.224582]  ? mdd_readpage+0x1919/0x1ed0 [mdd]
      [ 3266.224582]  __mdt_stripe_get+0x1d4/0x430 [mdt]
      [ 3266.224582]  mdt_attr_get_complex+0x56e/0x1af0 [mdt]
      [ 3266.224582]  mdt_mfd_close+0x2062/0x41c0 [mdt]
      [ 3266.224582]  ? lustre_msg_buf+0x17/0x50 [ptlrpc]
      [ 3266.224582]  ? __req_capsule_offset+0x5ae/0x6e0 [ptlrpc]
      [ 3266.224582]  mdt_close_internal+0x1f0/0x250 [mdt]
      [ 3266.259003]  mdt_close+0x483/0x13f0 [mdt]
      [ 3266.259003]  tgt_request_handle+0xc9a/0x1950 [ptlrpc]
      [ 3266.259003]  ? lustre_msg_get_transno+0x22/0xe0 [ptlrpc]
      [ 3266.259003]  ptlrpc_register_service+0x25e6/0x4610 [ptlrpc]
      [ 3266.259003]  ? __switch_to_asm+0x34/0x70
      [ 3266.259003]  kthread+0x121/0x140
      [ 3266.259003]  ? ptlrpc_register_service+0x1590/0x4610 [ptlrpc]
      [ 3266.259003]  ? kthread_park+0x90/0x90
      [ 3266.259003]  ret_from_fork+0x35/0x40
      [ 3266.259003] Kernel Offset: 0x1be00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

      Attachments

      Issue Links

      Activity

            elliswilson Ellis Wilson added a comment -

            I've linked three potentially related bugs.  The last one has a description that's particularly enlightening:

            "This is result of inappropriate usage of mti_big_lmm buffer in various places. Originally it was introduced to be used for getting big LOV/LMV EA and passing them to reply buffers. Meanwhile it is widely used now for internal server needs. These cases should be distinguished and if there is no intention to return EA in reply then flag mti_big_lmm_used should not be set. Maybe it is worth to rename it as mti_big_lmm_keep to mark that is to be kept until reply is packed."
             
            This aligns with a comment near the non-internal version of get_stripe:

              LASSERT(!info->mti_big_lmm_used);

              rc = __mdt_stripe_get(info, o, ma, name);
              /* since big_lmm is always used here, clear 'used' flag to avoid
               * assertion in mdt_big_xattr_get().
               */
              info->mti_big_lmm_used = 0;
             
            I wonder if some code path is (ab)using mti_big_lmm_used in a similar fashion but, unlike this code path, never clears the flag afterwards.

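            To make the suspected failure mode concrete, here is a minimal standalone C sketch of the buffer protocol described above. This is not Lustre source: the struct, field, and function names are simplified analogues of mti_big_lmm_used, mdt_big_xattr_get(), and the get_stripe path, and the kernel LASSERT is modeled with a plain assert().

            ```c
            #include <assert.h>
            #include <stdio.h>

            /* Simplified model of the per-thread big_lmm scratch buffer plus a
             * "used" flag marking its contents as reserved for the RPC reply. */
            struct thread_info {
                char big_lmm[64];   /* shared scratch buffer */
                int  big_lmm_used;  /* nonzero: contents reserved for the reply */
            };

            /* Analogue of mdt_big_xattr_get(): refuses to overwrite a buffer
             * still reserved for the reply -- this is the invariant whose
             * LASSERT fired on the MDS. */
            static void big_xattr_get(struct thread_info *info, const char *ea)
            {
                assert(info->big_lmm_used == 0);  /* the failed assertion */
                snprintf(info->big_lmm, sizeof(info->big_lmm), "%s", ea);
                info->big_lmm_used = 1;
            }

            /* Internal-use caller: fetches an EA for its own needs and clears
             * the flag afterwards, as the quoted get_stripe snippet does, so a
             * later fetch in the same RPC does not trip the assertion. */
            static void stripe_get_internal(struct thread_info *info)
            {
                big_xattr_get(info, "lov-ea");
                info->big_lmm_used = 0;  /* EA is not going into the reply */
            }

            int main(void)
            {
                struct thread_info info = { .big_lmm_used = 0 };

                stripe_get_internal(&info);      /* internal path clears the flag */
                big_xattr_get(&info, "lmv-ea");  /* so a reply-path fetch is safe */
                printf("flag after reply-path fetch: %d\n", info.big_lmm_used);
                /* An internal caller that skipped the clearing step -- the
                 * suspected bug here -- would abort on the assert instead. */
                return 0;
            }
            ```

            The sketch shows why a single missed clear is enough: the flag is per-thread state, so any earlier handler step that sets it for internal use and forgets to reset it leaves a trap for the next mdt_big_xattr_get() call on that service thread.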

            People

              Assignee: wc-triage WC Triage
              Reporter: elliswilson Ellis Wilson
              Votes: 0
              Watchers: 1