Lustre / LU-14918

too many ldiskfs transaction credits for llog when unlinking overstriped files

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: Lustre 2.14.0, Lustre 2.15.0
    • Severity: 3

    Description

      Removing widely overstriped files from an ldiskfs MDT causes far too many transaction credits to be reserved for the unlink transaction. This can be seen in the MDS console logs:

      Lustre: DEBUG MARKER: == sanity test 130g: FIEMAP (overstripe file) ========
      Lustre: 25401:0:(osd_handler.c:1934:osd_trans_start()) lustre-MDT0000: credits 54595 > trans_max 2592
      Lustre: 25401:0:(osd_handler.c:1863:osd_trans_dump_creds())   create: 800/6400/0, destroy: 1/4/0
      Lustre: 25401:0:(osd_handler.c:1870:osd_trans_dump_creds())   attr_set: 3/3/0, xattr_set: 804/148/0
      Lustre: 25401:0:(osd_handler.c:1880:osd_trans_dump_creds())   write: 4001/34410/0, punch: 0/0/0, quota 6/6/0
      Lustre: 25401:0:(osd_handler.c:1887:osd_trans_dump_creds())   insert: 801/13616/0, delete: 2/5/0
      Lustre: 25401:0:(osd_handler.c:1894:osd_trans_dump_creds())   ref_add: 1/1/0, ref_del: 2/2/0
      Pid: 25401, comm: mdt00_004 3.10.0-1160.36.2.el7_lustre.x86_64 #1 SMP Tue Aug 3 23:03:31 UTC 2021
      Call Trace:
      libcfs_call_trace+0x90/0xf0 [libcfs]
      libcfs_debug_dumpstack+0x26/0x30 [libcfs]
      osd_trans_start+0x4bb/0x4e0 [osd_ldiskfs]
      top_trans_start+0x702/0x940 [ptlrpc]
      lod_trans_start+0x34/0x40 [lod]
      mdd_trans_start+0x1a/0x20 [mdd]
      mdd_unlink+0x4ee/0xae0 [mdd]
      mdo_unlink+0x1b/0x1d [mdt]
      mdt_reint_unlink+0xb64/0x1890 [mdt]
      mdt_reint_rec+0x83/0x210 [mdt]
      mdt_reint_internal+0x720/0xaf0 [mdt]
      mdt_reint+0x67/0x140 [mdt]
      tgt_request_handle+0x7ea/0x1750 [ptlrpc]
      ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
      ptlrpc_main+0xb3c/0x14e0 [ptlrpc]
      Lustre: 25401:0:(osd_internal.h:1325:osd_trans_exec_op()) lustre-MDT0000: opcode 7: before 2589 < left 34410, rollback = 7
      

      and

      Lustre: DEBUG MARKER: == sanity test 27Cd: test maximum stripe count ========
      Lustre: 12686:0:(osd_handler.c:1934:osd_trans_start()) lustre-MDT0003: credits 136195 > trans_max 2592
      Lustre: 12686:0:(osd_handler.c:1863:osd_trans_dump_creds())   create: 2000/16000/0, destroy: 1/4/0
      Lustre: 12686:0:(osd_handler.c:1870:osd_trans_dump_creds())   attr_set: 3/3/0, xattr_set: 2004/148/0
      Lustre: 12686:0:(osd_handler.c:1880:osd_trans_dump_creds())   write: 10001/86010/0, punch: 0/0/0, quota 6/6/0
      Lustre: 12686:0:(osd_handler.c:1887:osd_trans_dump_creds())   insert: 2001/34016/0, delete: 2/5/0
      Lustre: 12686:0:(osd_handler.c:1894:osd_trans_dump_creds())   ref_add: 1/1/0, ref_del: 2/2/0
      Pid: 12686, comm: mdt00_000 3.10.0-1160.36.2.el7_lustre.x86_64 #1 SMP Tue Aug 3 23:03:31 UTC 2021
      Call Trace:
      libcfs_call_trace+0x90/0xf0 [libcfs]
      libcfs_debug_dumpstack+0x26/0x30 [libcfs]
      osd_trans_start+0x4bb/0x4e0 [osd_ldiskfs]
      top_trans_start+0x702/0x940 [ptlrpc]
      lod_trans_start+0x34/0x40 [lod]
      mdd_trans_start+0x1a/0x20 [mdd]
      mdd_unlink+0x4ee/0xae0 [mdd]
      mdo_unlink+0x1b/0x1d [mdt]
      mdt_reint_unlink+0xb64/0x1890 [mdt]
      mdt_reint_rec+0x83/0x210 [mdt]
      mdt_reint_internal+0x720/0xaf0 [mdt]
      mdt_reint+0x67/0x140 [mdt]
      tgt_request_handle+0x7ea/0x1750 [ptlrpc]
      ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
      ptlrpc_main+0xb3c/0x14e0 [ptlrpc]
      

      and similarly in sanity test_130e, sanity-pfl test_0b, test_1c, always during unlink.
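
As a cross-check on the dumps above, the middle number of each triple, summed over all operation types, reproduces the total reported by osd_trans_start(). The sketch below does this for the test 130g dump. It assumes the three fields are ops/credits/used; that reading is inferred from the arithmetic rather than stated in this ticket, and the helper names are made up for illustration.

```python
import re

# Per-operation lines copied from the sanity test 130g dump above.
DUMP_130G = """
create: 800/6400/0, destroy: 1/4/0
attr_set: 3/3/0, xattr_set: 804/148/0
write: 4001/34410/0, punch: 0/0/0, quota 6/6/0
insert: 801/13616/0, delete: 2/5/0
ref_add: 1/1/0, ref_del: 2/2/0
"""

# Matches "name: a/b/c" and also "quota 6/6/0" (the log has no colon there).
TRIPLE = re.compile(r"(\w+):?\s+(\d+)/(\d+)/(\d+)")


def credits_by_op(dump):
    """Return {op_name: credits}, assuming the middle field is declared credits."""
    return {name: int(credits)
            for name, _ops, credits, _used in TRIPLE.findall(dump)}


def total_credits(dump):
    """Sum the assumed credits field across all operation types."""
    return sum(credits_by_op(dump).values())


by_op = credits_by_op(DUMP_130G)
print(total_credits(DUMP_130G))   # 54595, matching "credits 54595 > trans_max 2592"
print(max(by_op, key=by_op.get))  # write
```

Under that reading, the write declarations (34410 credits) and insert declarations (13616 credits) account for most of the reservation in this dump.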

      The two examples shown are trying to reserve 54595 and 136195 credits against a trans_max of 2592, which at 4KiB per credit is a whopping 213MiB and 532MiB of journal space, respectively. Since the maximum xattr size for an overstriped file is 64KiB, this is pretty excessive.
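
The journal space figures can be reproduced from the credit totals in the logs. This is a minimal sketch, assuming one credit corresponds to one 4KiB journal block; the block size is an assumption that happens to match the quoted figures, it is not printed in the logs.

```python
BLOCK_SIZE = 4096  # bytes per journal block; assumed (4KiB), not shown in the logs


def credits_to_mib(credits, block_size=BLOCK_SIZE):
    """Convert a transaction credit count to MiB of journal space."""
    return credits * block_size / (1024 * 1024)


# Credit totals taken from the two osd_trans_start() messages above.
for test, credits in (("130g", 54595), ("27Cd", 136195)):
    print(f"sanity test {test}: {credits} credits ~ {credits_to_mib(credits):.1f} MiB")

# The trans_max limit from the same messages, for comparison.
print(f"trans_max: 2592 credits ~ {credits_to_mib(2592):.1f} MiB")
```

For comparison, the same conversion puts the trans_max limit of 2592 credits at roughly 10MiB.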

            People

              bzzz Alex Zhuravlev
              adilger Andreas Dilger