Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-4494

sanity-scrub test 4: lustre-MDT0001: too many transaction credits (309 > 256)

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.6.0

    • Lustre branch: master
      MDSCOUNT=4
    • 3
    • 12298

    Description

      While testing patch http://review.whamcloud.com/8523 on master branch with MDSCOUNT=4, the following issues occurred on one of the MDTs during create:

      Lustre: DEBUG MARKER: mkdir -p /mnt/mds4; mount -t lustre -o user_xattr,acl                                /dev/lvm-Role_MDS/P4 /mnt/mds4^M
      LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts: ^M
      LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts: ^M
      Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u^M
      Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P4 2>/dev/null^M
      Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20^M
      Lustre: DEBUG MARKER: Using TIMEOUT=20^M
      Lustre: cli-ctl-lustre-MDT0001: Allocated super-sequence [0x0000000440000400-0x0000000480000400):1:mdt]^M
      Lustre: 32030:0:(osd_handler.c:872:osd_trans_start()) lustre-MDT0001: too many transaction credits (309 > 256)
      Lustre: 32030:0:(osd_handler.c:879:osd_trans_start())   create: 2/50, delete: 0/0, destroy: 0/0
      Lustre: 32030:0:(osd_handler.c:884:osd_trans_start())   attr_set: 2/2, xattr_set: 4/43
      Lustre: 32030:0:(osd_handler.c:891:osd_trans_start())   write: 11/154, punch: 2/8, quota 2/2
      Lustre: 32030:0:(osd_handler.c:896:osd_trans_start())   insert: 3/50, delete: 0/0
      Lustre: 32030:0:(osd_handler.c:901:osd_trans_start())   ref_add: 0/0, ref_del: 0/0
      Pid: 32030, comm: mdt00_002
      
      Call Trace:
       [<ffffffffa0903895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0a95a43>] osd_trans_start+0x653/0x670 [osd_ldiskfs]
       [<ffffffffa0dfc579>] lod_trans_start+0x1b9/0x250 [lod]
       [<ffffffffa0e72317>] mdd_trans_start+0x17/0x20 [mdd]
       [<ffffffffa0e65dd9>] mdd_create+0x929/0x17a0 [mdd]
       [<ffffffffa0d52fa8>] mdo_create+0x18/0x50 [mdt]
       [<ffffffffa0d5d037>] mdt_reint_open+0x12a7/0x2120 [mdt]
       [<ffffffffa0d47a91>] mdt_reint_rec+0x41/0xe0 [mdt]
       [<ffffffffa0d2ddb3>] mdt_reint_internal+0x4c3/0x780 [mdt]
       [<ffffffffa0d2e265>] mdt_intent_reint+0x1f5/0x520 [mdt]
       [<ffffffffa0d2c989>] mdt_intent_policy+0x499/0xca0 [mdt]
       [<ffffffffa0527509>] ldlm_lock_enqueue+0x359/0x920 [ptlrpc]
       [<ffffffffa0550edf>] ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc]
       [<ffffffffa05cb492>] tgt_enqueue+0x62/0x1d0 [ptlrpc]
       [<ffffffffa05cdfea>] tgt_handle_request0+0x2ea/0x1490 [ptlrpc]
       [<ffffffffa05cf5ca>] tgt_request_handle+0x43a/0x980 [ptlrpc]
       [<ffffffffa0582725>] ptlrpc_main+0xd25/0x1970 [ptlrpc]
      

      And then the nodes were re-provisioned by autotest.

      Maloo report: https://maloo.whamcloud.com/test_sets/12c1c02a-7e6f-11e3-925a-52540035b04c

      Attachments

        Issue Links

          Activity

            [LU-4494] sanity-scrub test 4: lustre-MDT0001: too many transaction credits (309 > 256)

            Closing as a duplicate of LU-4611, which shows the problem much more significantly with a large number of stripes.

            adilger Andreas Dilger added a comment - Closing as a duplicate of LU-4611 , which shows the problem much more significantly with a large number of stripes.
            green Oleg Drokin added a comment -

            Well, it almost looks like one of the filesystems was formatted with 4M journal size (256 blocks max transactions).
            So then when you try to do a create it fails.

            Could it just be that mdscount=4 overwhelms the journal size by doing a larger transaction than in normal case for some sort of a special purpose small filesystem?
            If so, this woul likely be visible on any test run run in such conditions and we probably can do nothing about it other than changing the fs journal size used for this test

            green Oleg Drokin added a comment - Well, it almost looks like one of the filesystems was formatted with 4M journal size (256 blocks max transactions). So then when you try to do a create it fails. Could it just be that mdscount=4 overwhelms the journal size by doing a larger transaction than in normal case for some sort of a special purpose small filesystem? If so, this woul likely be visible on any test run run in such conditions and we probably can do nothing about it other than changing the fs journal size used for this test

            People

              wc-triage WC Triage
              yujian Jian Yu
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: