Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-11205

Failure to clear the changelog for user 1 on MDT

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.12.0, Lustre 2.10.4, Lustre 2.10.6
    • CentOS 7.4 (3.10.0-693.2.2.el7_lustre.pl1.x86_64)
    • 3
    • 9223372036854775807

    Description

      Hello,

      We're seeing the following messages on Oak's MDT in 2.10.4:

      Aug 03 09:21:39 oak-md1-s2 kernel: Lustre: 11137:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
      Aug 03 09:31:38 oak-md1-s2 kernel: Lustre: 11271:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
      

      Robinhood (also running 2.10.4) shows this:

      2018/08/03 10:00:47 [13766/22] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 13975842301) returned -22
      2018/08/03 10:00:47 [13766/22] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
      2018/08/03 10:00:47 [13766/16] llapi | cannot purge records for 'cl1'
      2018/08/03 10:00:47 [13766/16] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 13975842303) returned -22
      2018/08/03 10:00:47 [13766/16] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
      2018/08/03 10:00:47 [13766/4] llapi | cannot purge records for 'cl1'
      2018/08/03 10:00:47 [13766/4] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 13975842304) returned -22
      2018/08/03 10:00:47 [13766/4] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
      

      Oak's MDT usage is as follow:

      [root@oak-md1-s2 ~]# df -h -t lustre
      Filesystem                  Size  Used Avail Use% Mounted on
      /dev/mapper/md1-rbod1-mdt0  1.3T  131G 1022G  12% /mnt/oak/mdt/0
      [root@oak-md1-s2 ~]# df -i -t lustre
      Filesystem                    Inodes     IUsed     IFree IUse% Mounted on
      /dev/mapper/md1-rbod1-mdt0 873332736 266515673 606817063   31% /mnt/oak/mdt/0
      

      I'm concerned that the MDT might fill up with changelogs. Could you please assist in troubleshooting this issue?
      Thanks!
      Stephane

      Attachments

        1. changelog-reader.tgz
          0.8 kB
        2. dk_ornl_20190328_1216.gz
          4.61 MB
        3. dk_ornl_20190328.gz
          4.59 MB
        4. dk.1547747365.gz
          8.56 MB
        5. dk.1547747668.gz
          8.58 MB
        6. dk.1547828521.gz
          7.96 MB
        7. f2_llog_reader_20190328.gz
          1.03 MB
        8. lu-11205-ssec.log.gz
          647 kB
        9. ornl_0x1_0xd8_0x0.gz
          1.19 MB

        Issue Links

          Activity

            [LU-11205] Failure to clear the changelog for user 1 on MDT

            Agreed that the proposed fix for LU-11426 should also resolve this.

            Related to the "orphan records" that James mentioned, in my testing of this patch it did not (fully) resolve that issue: the only way I could get them to go away was to rebuild the changelog by deregistering all readers.

            olaf Olaf Weber (Inactive) added a comment - Agreed that the proposed fix for LU-11426 should also resolve this. Related to the "orphan records" that James mentioned, in my testing of this patch it did not (fully) resolve that issue: the only way I could get them to go away was to rebuild the changelog by deregistering all readers.

            LU-11426 would fix the ordering. So error -22 wouldn't happen.

            aboyko Alexander Boyko added a comment - LU-11426 would fix the ordering. So error -22 wouldn't happen.

            @Olaf Weber, the patch allows clear of unordered records. The processing is different case. I do think that software should take care. Read a number of records and process it in order. In most cases it can process unordered too. We should not see unordered operation for a same file or directory cause they are synchronized by a parent lock.

            aboyko Alexander Boyko added a comment - @Olaf Weber, the patch allows clear of unordered records. The processing is different case. I do think that software should take care. Read a number of records and process it in order. In most cases it can process unordered too. We should not see unordered operation for a same file or directory cause they are synchronized by a parent lock.

            It is hard to say whether it really reduces the lost records issue, as too much changed in the cluster I tested on to make it truly apples/apples.

            olaf Olaf Weber (Inactive) added a comment - It is hard to say whether it really reduces the lost records issue, as too much changed in the cluster I tested on to make it truly apples/apples.

            Was it better? In our testing it did make the problem go away. I wonder if the problem is less common.

            simmonsja James A Simmons added a comment - Was it better? In our testing it did make the problem go away. I wonder if the problem is less common.

            When I tested the code posted in the review earlier today, I still saw out-of-order and lost changelog records. From this and other tests it does appear that lost changelog records are out-of-order records that are not reliably (and often reliably not) returned to userspace.

            olaf Olaf Weber (Inactive) added a comment - When I tested the code posted in the review earlier today, I still saw out-of-order and lost changelog records. From this and other tests it does appear that lost changelog records are out-of-order records that are not reliably (and often reliably not) returned to userspace.

            running lfs changelog clear 0 restores the changelog_size back to 4711216. There is some strange issue there but that is independent of this patch. The testing has gone very well!! Thanks Boyko.

            simmonsja James A Simmons added a comment - running lfs changelog clear 0 restores the changelog_size back to 4711216. There is some strange issue there but that is independent of this patch. The testing has gone very well!! Thanks Boyko.

            We did some testing and saw these results. Before we started robinhood to purge things we had:

            lctl get_param ..changelog_size mdd.f2-tds-MDT0000.changelog_size=4711216

            Afterwards we saw

            lctl get_param ..changelog_size mdd.f2-tds-MDT0000.changelog_size=928368

            Does this patch help resolve some of the orphan records as well that we couldn't cleanup before?

            simmonsja James A Simmons added a comment - We did some testing and saw these results. Before we started robinhood to purge things we had: lctl get_param . .changelog_size mdd.f2-tds-MDT0000.changelog_size=4711216 Afterwards we saw lctl get_param . .changelog_size mdd.f2-tds-MDT0000.changelog_size=928368 Does this patch help resolve some of the orphan records as well that we couldn't cleanup before?
            simmonsja James A Simmons added a comment - - edited

            Yep, setting it up right now. Thank you.

            simmonsja James A Simmons added a comment - - edited Yep, setting it up right now. Thank you.
            aboyko Alexander Boyko added a comment - - edited

            @James A Simmons, could you try the patch above? It also includes the reproducer for this issue.

            == sanity test 160j: Verify that reordered records could be cancelled in changelog =================== 08:27:16 (1563352036)
            mdd.lustre-MDT0000.changelog_mask=+hsm
            mdd.lustre-MDT0001.changelog_mask=+hsm
            Registered 2 changelog users: 'cl1 cl1'
            fail_loc=0x8000015d
            lustre-MDT0000.2 01CREAT 08:27:18.551297912 2019.07.17 0x0 t=[0x200000402:0x13:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x3:0x0] f2
            lustre-MDT0000.3 06UNLNK 08:27:18.553672555 2019.07.17 0x1 t=[0x200000402:0xa:0x0] j=rm.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x5:0x0] f3
            lustre-MDT0000.4 14SATTR 08:27:18.554017844 2019.07.17 0x14 t=[0x200000402:0x8:0x0] j=chmod.0 ef=0xf u=0:0 nid=192.168.101.35@tcp
            lustre-MDT0000.5 11CLOSE 08:27:18.555499959 2019.07.17 0x42 t=[0x200000402:0x13:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp
            lustre-MDT0000.1 01CREAT 08:27:18.550045424 2019.07.17 0x0 t=[0x200000402:0x12:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x7:0x0] 1
            lustre-MDT0000.6 11CLOSE 08:27:19.562156592 2019.07.17 0x42 t=[0x200000402:0x12:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp
            2
            lustre-MDT0000: clear the changelog for cl1 to record #2
            3
            lustre-MDT0000: clear the changelog for cl1 to record #3
            4
            lustre-MDT0000: clear the changelog for cl1 to record #4
            5
            lustre-MDT0000: clear the changelog for cl1 to record #5
            1
            lustre-MDT0000: clear the changelog for cl1 to record #1
            lfs changelog_clear: cannot purge records for 'cl1': Invalid argument (22)
            changelog_clear: record out of range: 1
             sanity test_160j: @@@@@@ FAIL: Fail to clear changelog record 1 
              Trace dump:
              = /usr/lib64/lustre/tests/test-framework.sh:5919:error()
              = /usr/lib64/lustre/tests/sanity.sh:13140:test_160j()
              = /usr/lib64/lustre/tests/test-framework.sh:6203:run_one()
              = /usr/lib64/lustre/tests/test-framework.sh:6242:run_one_logged()
              = /usr/lib64/lustre/tests/test-framework.sh:6038:run_test()
              = /usr/lib64/lustre/tests/sanity.sh:13145:main()
            Dumping lctl log to /tmp/test_logs/1563352024/sanity.test_160j.*.1563352042.log
            
            aboyko Alexander Boyko added a comment - - edited @James A Simmons, could you try the patch above? It also includes the reproducer for this issue. == sanity test 160j: Verify that reordered records could be cancelled in changelog =================== 08:27:16 (1563352036) mdd.lustre-MDT0000.changelog_mask=+hsm mdd.lustre-MDT0001.changelog_mask=+hsm Registered 2 changelog users: 'cl1 cl1' fail_loc=0x8000015d lustre-MDT0000.2 01CREAT 08:27:18.551297912 2019.07.17 0x0 t=[0x200000402:0x13:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x3:0x0] f2 lustre-MDT0000.3 06UNLNK 08:27:18.553672555 2019.07.17 0x1 t=[0x200000402:0xa:0x0] j=rm.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x5:0x0] f3 lustre-MDT0000.4 14SATTR 08:27:18.554017844 2019.07.17 0x14 t=[0x200000402:0x8:0x0] j=chmod.0 ef=0xf u=0:0 nid=192.168.101.35@tcp lustre-MDT0000.5 11CLOSE 08:27:18.555499959 2019.07.17 0x42 t=[0x200000402:0x13:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp lustre-MDT0000.1 01CREAT 08:27:18.550045424 2019.07.17 0x0 t=[0x200000402:0x12:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x7:0x0] 1 lustre-MDT0000.6 11CLOSE 08:27:19.562156592 2019.07.17 0x42 t=[0x200000402:0x12:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp 2 lustre-MDT0000: clear the changelog for cl1 to record #2 3 lustre-MDT0000: clear the changelog for cl1 to record #3 4 lustre-MDT0000: clear the changelog for cl1 to record #4 5 lustre-MDT0000: clear the changelog for cl1 to record #5 1 lustre-MDT0000: clear the changelog for cl1 to record #1 lfs changelog_clear: cannot purge records for 'cl1': Invalid argument (22) changelog_clear: record out of range: 1 sanity test_160j: @@@@@@ FAIL: Fail to clear changelog record 1 Trace dump: = /usr/lib64/lustre/tests/test-framework.sh:5919:error() = /usr/lib64/lustre/tests/sanity.sh:13140:test_160j() = /usr/lib64/lustre/tests/test-framework.sh:6203:run_one() = /usr/lib64/lustre/tests/test-framework.sh:6242:run_one_logged() = /usr/lib64/lustre/tests/test-framework.sh:6038:run_test() = /usr/lib64/lustre/tests/sanity.sh:13145:main() Dumping lctl log to /tmp/test_logs/1563352024/sanity.test_160j.*.1563352042.log

            Alexandr Boyko (c17825@cray.com) uploaded a new patch: https://review.whamcloud.com/35650
            Subject: LU-11205 mdd: process mixing changelog records
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: f9f82470637f7769884a863c8ef90d79fcdfc12f

            gerrit Gerrit Updater added a comment - Alexandr Boyko (c17825@cray.com) uploaded a new patch: https://review.whamcloud.com/35650 Subject: LU-11205 mdd: process mixing changelog records Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: f9f82470637f7769884a863c8ef90d79fcdfc12f

            People

              tappro Mikhail Pershin
              sthiell Stephane Thiell
              Votes:
              3 Vote for this issue
              Watchers:
              26 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: