
2.1.4<->2.4.0 interop: sanity test_76: FAIL: inode slab grew from 11183 to 12183


    Description

      The sanity test 76 failed as follows:

      == sanity test 76: confirm clients recycle inodes properly ====== 01:56:01 (1361526961)
      before inodes: 11183
      after inodes: 12183
      wait 2 seconds inodes: 12183
      wait 4 seconds inodes: 12183
      wait 6 seconds inodes: 12183
      wait 8 seconds inodes: 12183
      wait 10 seconds inodes: 12183
      wait 12 seconds inodes: 12183
      wait 14 seconds inodes: 12183
      wait 16 seconds inodes: 12183
      wait 18 seconds inodes: 12183
      wait 20 seconds inodes: 12183
      wait 22 seconds inodes: 12183
      wait 24 seconds inodes: 12183
      wait 26 seconds inodes: 12183
      wait 28 seconds inodes: 12183
      wait 30 seconds inodes: 12183
      wait 32 seconds inodes: 12183
       sanity test_76: @@@@@@ FAIL: inode slab grew from 11183 to 12183
      

      Maloo report: https://maloo.whamcloud.com/test_sets/8b6bd050-7d77-11e2-85d0-52540035b04c
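
      For reference, the shape of the check test_76 performs is roughly the following. This is a minimal standalone sketch, not the sanity.sh implementation: the slab name lustre_inode_cache, the mount point /mnt/lustre, the file count, and the fixed 32-second wait are assumptions that can differ per setup, and reading /proc/slabinfo typically requires root.

      #!/bin/bash
      # Rough sketch of the test_76 check: record the client inode slab usage,
      # create and unlink a batch of files, then wait for the slab to shrink back.
      # Assumptions: slab name "lustre_inode_cache", client mount at /mnt/lustre.
      MNT=${MNT:-/mnt/lustre}
      SLAB=${SLAB:-lustre_inode_cache}

      inode_count() {
          # /proc/slabinfo: column 1 is the slab name, column 2 is <active_objs>.
          awk -v s="$SLAB" '$1 == s { print $2 }' /proc/slabinfo
      }

      before=$(inode_count)
      echo "before inodes: $before"

      # Cycle a batch of inodes through the client cache.
      mkdir -p "$MNT/d76"
      for i in $(seq 1 1000); do touch "$MNT/d76/f$i"; done
      rm -rf "$MNT/d76"

      echo "after inodes: $(inode_count)"

      # Give the client up to 32 seconds to recycle the inodes.
      for w in $(seq 2 2 32); do
          sleep 2
          now=$(inode_count)
          echo "wait $w seconds inodes: $now"
          [ "$now" -le "$before" ] && exit 0
      done

      echo "FAIL: inode slab grew from $before to $(inode_count)"
      exit 1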

    Activity

simmonsja James A Simmons added a comment - Old ticket for unsupported version

parinay parinay v kondekar (Inactive) added a comment -
This issue is also seen on Seagate Test infrastructure quite regularly.
Client - 2.4.3, 2.5.1.x6
Server - 2.5.1.x6

jamesanunez James Nunez (Inactive) added a comment - sanity test 76 is still failing on master (pre-2.8.0). Recent result with logs is at https://testing.hpdd.intel.com/test_sets/7a914250-e8a1-11e4-b3c6-5254006e85c2

yujian Jian Yu added a comment -
Lustre client: http://build.whamcloud.com/job/lustre-b2_3/41/ (2.3.0)
Lustre server: http://build.whamcloud.com/job/lustre-b2_4/69/ (2.4.2 RC1)
sanity test 76 hit the same failure: https://maloo.whamcloud.com/test_sets/3c82ea6e-685e-11e3-a16f-52540035b04c

yujian Jian Yu added a comment -
Lustre client build: http://build.whamcloud.com/job/lustre-b1_8/258/ (1.8.9-wc1)
Lustre server build: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1)
sanity test 76 hit the same failure: https://maloo.whamcloud.com/test_sets/77da1820-15ad-11e3-87cb-52540035b04c

yujian Jian Yu added a comment -
Lustre client: http://build.whamcloud.com/job/lustre-b2_3/41/ (2.3.0)
Lustre server: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1)
sanity test 76 hit the same failure: https://maloo.whamcloud.com/test_sets/cde5bab8-14f3-11e3-9828-52540035b04c

yujian Jian Yu added a comment -
Lustre client build: http://build.whamcloud.com/job/lustre-b1_8/258/ (1.8.9-wc1)
Lustre server build: http://build.whamcloud.com/job/lustre-b2_4/31/
sanity test 76 hit the same failure: https://maloo.whamcloud.com/test_sets/03a55dac-0478-11e3-90ba-52540035b04c

sarah Sarah Liu added a comment -
also seen in interop between 2.3.0 client and 2.4 server:
https://maloo.whamcloud.com/test_sets/491428e6-8fed-11e2-9b28-52540035b04c

adilger Andreas Dilger added a comment -
I've pushed http://review.whamcloud.com/5755 to disable this test on b2_1.

Alex's comments in LU-2036:

OSP objects are not removed immediately, so ll_d_iput() in the unlink path finds extent locks and does not reset nlink to 0, leaving the inode in the cache.

We need to figure out another way to purge the inode...

On the MDS, the corresponding inode only gets nlink=0 if it is actually being destroyed (orphans keep nlink=1 while on PENDING/). Could this zero nlink, propagated to the client, be a signal for ELC on the client?
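
One way to check on a client whether unused extent locks are what keeps these inodes cached, as described above, is to drain the LDLM lock LRU with lru_size=clear and then drop the VFS caches, and see whether the inode slab shrinks. This is a rough diagnostic sketch, not part of the test suite; the slab name lustre_inode_cache is an assumption, and cancelling all unused locks is disruptive, so it is not something to run on a busy production client.

#!/bin/bash
# Diagnostic sketch: if unused extent locks are pinning client inodes, then
# cancelling the lock LRU and dropping the VFS caches should let the
# inode slab shrink. Run as root on the client.
# Assumption: the client inode slab is named "lustre_inode_cache".
SLAB=${SLAB:-lustre_inode_cache}

inode_count() {
    awk -v s="$SLAB" '$1 == s { print $2 }' /proc/slabinfo
}

echo "inodes before: $(inode_count)"

# Cancel all unused locks in every LDLM namespace (drains the lock LRU).
lctl set_param ldlm.namespaces.*.lru_size=clear

# Drop clean page cache, dentries and inodes so freed inodes are reclaimed.
sync
echo 3 > /proc/sys/vm/drop_caches

echo "inodes after:  $(inode_count)"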

adilger Andreas Dilger added a comment -
Another hit on 2013-03-15 08:47:01 UTC:

RC2--PRISTINE-2.6.32-279.14.1.el6.x86_64 (x86_64)
jenkins-arch=x86_64,build_type=server,distro=el6,ib_stack=inkern (x86_64)

https://maloo.whamcloud.com/test_sets/92de9552-8dfd-11e2-81eb-52540035b04c

yujian Jian Yu added a comment -
Lustre b2_1 client build: http://build.whamcloud.com/job/lustre-b2_1/186
Lustre master server build: http://build.whamcloud.com/job/lustre-master/1302
Distro/Arch: RHEL6.3/x86_64
The same issue occurred: https://maloo.whamcloud.com/test_sets/c8a1a6e8-8b55-11e2-965f-52540035b04c

    People

      wc-triage WC Triage
      yujian Jian Yu
      Votes: 0
      Watchers: 8
