Lustre / LU-2085

sanityn test_16 (fsx) ran over its Autotest time

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Affects Version/s: None
    • Fix Version/s: Lustre 2.4.0
    • Severity: 3
    • Rank: 2884

    Description

      Commit: 9337ac981608767e9d15c9cc51cb43ad42ffb42a (Jan 25, 2012)
      Maloo: https://maloo.whamcloud.com/test_sets/220f8e3e-4859-11e1-b292-5254004bbbd3

      == sanityn test 16: 2500 iterations of dual-mount fsx =================== 15:50:32 (1327535432)
      Chance of close/open is 1 in 50
      Seed set to 5432
      fd 0: /mnt/lustre/fsxfile
      fd 1: /mnt/lustre2/fsxfile
      skipping zero size read
      skipping zero size read
      skipping zero size read
      truncating to largest ever: 0x790c8c
      truncating to largest ever: 0x7a6801
      truncating to largest ever: 0x7dc953
      truncating to largest ever: 0x81487d
      truncating to largest ever: 0x9c7da8
      000100 1327535647.851342 0 write      0x2b3f91 thru 0x2c0de1	(0xce51 bytes)
      000200 1327535848.080671 1 write      0x14fa62 thru 0x15ef47	(0xf4e6 bytes)
      truncating to largest ever: 0x9e4443
      000300 1327535971.751220 1 read       0x1fdb8c thru 0x208c53	(0xb0c8 bytes)
      000400 1327536143.120379 0 mapwrite   0x73beca thru 0x73f9aa	(0x3ae1 bytes)
      000500 1327536343.371779 1 mapread    0x03c3ab thru 0x045b62	(0x97b8 bytes)
      000600 1327536576.819509 0 trunc from 0x8a8fa4  to  0xd6e090	(0x4c50ed bytes)
      000700 1327536779.280336 0 trunc from 0x6b9883  to  0xaa9d52	(0x3f04d0 bytes)
      truncating to largest ever: 0x9fb613
      000800 1327537034.643260 0 mapread    0x62396f thru 0x62f99b	(0xc02d bytes)
      000900 1327537212.838845 0 read       0x32c1de thru 0x32ee26	(0x2c49 bytes)
      001000 1327537438.173226 0 read       0x0b27aa thru 0x0be90d	(0xc164 bytes)
      001100 1327537653.305146 0 read       0x361731 thru 0x363077	(0x1947 bytes)
      001200 1327537802.126101 1 write      0x214dcb thru 0x21ae71	(0x60a7 bytes)
      001300 1327538007.965572 1 mapwrite   0x251482 thru 0x261322	(0xfea1 bytes)
      001400 1327538174.912600 1 mapread    0x139172 thru 0x140b9f	(0x7a2e bytes)
      001500 1327538353.902998 1 mapwrite   0x613416 thru 0x61faec	(0xc6d7 bytes)
      001600 1327538611.778969 1 mapwrite   0x0a1504 thru 0x0ad80b	(0xc308 bytes)
      001700 1327538895.176592 0 trunc from 0x0bdf9c  to  0x1ef680	(0x1316e5 bytes)
      001800 1327539112.389673 0 trunc from 0x9dce84  to  0x10ac2e8	(0x6cf465 bytes)
      

Activity

bzzz Alex Zhuravlev added a comment - will be solved by ZIL support in LU-4009
            liwei Li Wei (Inactive) added a comment - https://maloo.whamcloud.com/test_sets/af0adc66-fedf-11e1-b4cd-52540035b04c This was b2_3 with OFD and ZFS OSTs.

hudson Build Master (Inactive) added a comment - Integrated in lustre-dev » x86_64,client,el6,inkernel #340
ORI-487 tests: print verbose fsx logs, run less (Revision eb62df48785b5ac8ba19002bccd5f5ba164b7aee)

Result = SUCCESS
Mikhail Pershin : eb62df48785b5ac8ba19002bccd5f5ba164b7aee
Files :

• lustre/tests/sanityn.sh

hudson Build Master (Inactive) added a comment - Integrated in lustre-dev » x86_64,server,el5,inkernel #340
ORI-487 tests: print verbose fsx logs, run less (Revision eb62df48785b5ac8ba19002bccd5f5ba164b7aee)

Result = SUCCESS
Mikhail Pershin : eb62df48785b5ac8ba19002bccd5f5ba164b7aee
Files :

• lustre/tests/sanityn.sh

hudson Build Master (Inactive) added a comment - Integrated in lustre-dev » i686,client,el5,inkernel #340
ORI-487 tests: print verbose fsx logs, run less (Revision eb62df48785b5ac8ba19002bccd5f5ba164b7aee)

Result = SUCCESS
Mikhail Pershin : eb62df48785b5ac8ba19002bccd5f5ba164b7aee
Files :

• lustre/tests/sanityn.sh

hudson Build Master (Inactive) added a comment - Integrated in lustre-dev » x86_64,server,el6,inkernel #340
ORI-487 tests: print verbose fsx logs, run less (Revision eb62df48785b5ac8ba19002bccd5f5ba164b7aee)

Result = SUCCESS
Mikhail Pershin : eb62df48785b5ac8ba19002bccd5f5ba164b7aee
Files :

• lustre/tests/sanityn.sh

hudson Build Master (Inactive) added a comment - Integrated in lustre-dev » i686,server,el5,inkernel #340
ORI-487 tests: print verbose fsx logs, run less (Revision eb62df48785b5ac8ba19002bccd5f5ba164b7aee)

Result = SUCCESS
Mikhail Pershin : eb62df48785b5ac8ba19002bccd5f5ba164b7aee
Files :

• lustre/tests/sanityn.sh

hudson Build Master (Inactive) added a comment - Integrated in lustre-dev » i686,client,el6,inkernel #340
ORI-487 tests: print verbose fsx logs, run less (Revision eb62df48785b5ac8ba19002bccd5f5ba164b7aee)

Result = SUCCESS
Mikhail Pershin : eb62df48785b5ac8ba19002bccd5f5ba164b7aee
Files :

• lustre/tests/sanityn.sh

hudson Build Master (Inactive) added a comment - Integrated in lustre-dev » x86_64,client,el5,inkernel #340
ORI-487 tests: print verbose fsx logs, run less (Revision eb62df48785b5ac8ba19002bccd5f5ba164b7aee)

Result = SUCCESS
Mikhail Pershin : eb62df48785b5ac8ba19002bccd5f5ba164b7aee
Files :

• lustre/tests/sanityn.sh

adilger Andreas Dilger added a comment - Looking at the test results (which at least did not time out):
            https://maloo.whamcloud.com/sub_tests/88beba7e-6da2-11e1-9d47-5254004bbbd3

            000034 1331655558.536392 1 mapread    0x140d20 thru 0x149017	(0x82f8 bytes)
            000035 1331655560.687850 1 mapwrite   0x332aba thru 0x33f074	(0xc5bb bytes)
            000036 1331655561.719549 0 write      0x9cfd1e thru 0x9da5c0	(0xa8a3 bytes)
            000037 1331655561.730190 1 trunc from 0x9da5c1  to  0xe69187	(0x48ebc7 bytes)
            000038 1331655570.962937 1 mapread    0x222fca thru 0x22c098	(0x90cf bytes)
            000039 1331655571.012619 1 trunc from 0x48ebc7  to  0xace5fa	(0x63fa34 bytes)
            000040 1331655571.029985 1 mapread    0x476e02 thru 0x478953	(0x1b52 bytes)
            000041 1331655571.036898 0 trunc from 0x63fa34  to  0x7a4e44	(0x165411 bytes)
            000042 1331655579.471972 0 mapread    0x067b42 thru 0x076009	(0xe4c8 bytes)
            000043 1331655579.511741 1 mapwrite   0x5f7798 thru 0x605ff1	(0xe85a bytes)
            000044 1331655590.064530 1 mapwrite   0x5c79cb thru 0x5d718f	(0xf7c5 bytes)
            000045 1331655590.088253 1 write      0x4ea98b thru 0x4f1700	(0x6d76 bytes)
            

We can see that when the DLM lock moves from one client to the other (presumably forcing a cache flush and a transaction commit due to commit-on-cancel), the operation latency is very high: over 10s in some cases, and such stalls are a common occurrence.
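The stalls can be read directly off the second column of the fsx log (seconds since the epoch). A minimal sketch of extracting the per-operation gaps, using two consecutive lines copied from the log above:

```python
import re

# Two consecutive operations from the fsx log above: client 1 issues a
# truncate right after client 0's write, then stalls ~9.2s before its
# next mapread while the DLM lock (and cached data) migrates.
LOG = """\
000037 1331655561.730190 1 trunc from 0x9da5c1  to  0xe69187 (0x48ebc7 bytes)
000038 1331655570.962937 1 mapread    0x222fca thru 0x22c098 (0x90cf bytes)
"""

def op_gaps(log):
    """Yield (op_number, seconds_since_previous_op) for each fsx log line."""
    prev = None
    for line in log.splitlines():
        m = re.match(r"(\d{6}) (\d+\.\d+)", line)
        if not m:
            continue
        num, ts = m.group(1), float(m.group(2))
        if prev is not None:
            yield num, ts - prev
        prev = ts

for num, gap in op_gaps(LOG):
    print(f"op {num}: {gap:.1f}s since previous op")
```

Applying the same diff to operations 000034-000045 above shows gaps of roughly 9.2s, 8.4s, and 10.6s at operations 000038, 000042, and 000044, each immediately after the other client touched an overlapping region.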

            It would be useful to know whether IOR SSF has similar performance problems with ZFS, or if it is OK by virtue of the DLM lock getting split into disjoint areas and the threads being able to run without contention.

This would likely benefit from a ZIL, but we'd need to figure out how to handle out-of-order ZIL operation commit within the Lustre ordered transaction model. Possibly we could change osd-zfs to return "success" to the BRW RPC immediately, along with a ZIL record (possibly combined with the multi-slot client llog idea?) that would allow the RPC to be replayed locally even if the client failed, and not bump the last_committed value until the proper transaction commit callbacks are run. That way the client's write is persistent on disk before the data is returned to the other client (without waiting for a full commit), but we don't risk data inconsistency if the writing client crashes and is unable to replay its write.
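The ordering being proposed can be modeled abstractly. The sketch below is purely illustrative (the class, method names, and single-integer last_committed are assumptions for the model, not osd-zfs code); it shows the key invariant: the RPC is acknowledged once the log record is durable, but last_committed only advances at real transaction commit.

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    """Toy model of the scheme: a write is acknowledged as soon as its
    ZIL-style record is durable, but last_committed (which lets clients
    discard replay state) only advances when the full transaction commits."""
    last_committed: int = 0
    zil_durable: set = field(default_factory=set)    # replayable locally
    txg_pending: list = field(default_factory=list)

    def brw_write(self, transno):
        # 1. Log the operation and wait only for the log record to be durable.
        self.zil_durable.add(transno)
        self.txg_pending.append(transno)
        # 2. Reply "success" now: the data is recoverable even if the client
        #    dies, because the server can replay the log record itself.
        #    Note last_committed is NOT bumped yet.
        return {"status": 0, "transno": transno,
                "last_committed": self.last_committed}

    def txg_commit(self):
        # 3. Real transaction commit callback: only now bump last_committed,
        #    allowing clients to drop their replay state.
        if self.txg_pending:
            self.last_committed = max(self.txg_pending)
            self.txg_pending.clear()

srv = Server()
reply = srv.brw_write(transno=7)
assert reply["last_committed"] == 0   # client must keep its replay state
srv.txg_commit()
assert srv.last_committed == 7        # safe to discard replay state now
```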

            As a first step, it makes sense to disable commit-on-cancel and see if this fixes the performance problem or not.


People

  Assignee: bzzz Alex Zhuravlev
  Reporter: liwei Li Wei (Inactive)
  Votes: 0
  Watchers: 1
