  Lustre / LU-4600

Test failure on test suite conf-sanity, subtest test_50h "some OSC imports are still not connected"

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Affects Versions: Lustre 2.7.0, Lustre 2.5.3, Lustre 2.9.0, Lustre 2.10.0

    Description

      This issue was created by maloo for wangdi <di.wang@intel.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/994ad81c-8fbc-11e3-92cc-52540035b04c.

      The sub-test test_50h failed with the following error:

      some OSC imports are still not connected

      Info required for matching: conf-sanity 50h
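
      For context, the failure message comes from the test framework polling each OSC device's import state. The helper name (wait_osc_import_state in test-framework.sh) and the canned `lctl get_param osc.*.import` output below are assumptions; this is only a minimal sketch of the check, run over sample text rather than a live client:

      ```shell
      # Hypothetical snapshot of `lctl get_param osc.*.import` output on a
      # client; the real framework polls until every import reports
      # "state: FULL" or times out with the failure seen in this ticket.
      imports='osc.lustre-OST0000-osc-ffff880037b24000.import=
      import:
          name: lustre-OST0000-osc-ffff880037b24000
          state: FULL
      osc.lustre-OST0001-osc-ffff880037b24000.import=
      import:
          name: lustre-OST0001-osc-ffff880037b24000
          state: DISCONN'

      # Count imports whose state is anything other than FULL.
      not_full=$(printf '%s\n' "$imports" | grep 'state:' | grep -cv 'FULL')
      if [ "$not_full" -gt 0 ]; then
          echo "some OSC imports are still not connected"
      fi
      ```

      With the sample above, one import is stuck in DISCONN, so the check reports the same message recorded in this ticket.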


          Activity

            bogl Bob Glossman (Inactive) added a comment - another on master: https://testing.hpdd.intel.com/test_sets/9fdfff5e-0751-11e6-9e5d-5254006e85c2
            jamesanunez James Nunez (Inactive) added a comment - - edited

            There are two different error messages found in the test_log for the failures listed in this ticket.

            Some of the logs listed here have the following error in the test_log:

            open(/mnt/lustre/d50h.conf-sanity/2/f50h.conf-sanity-0) error: No space left on device
            

            which may be related to LU-7309.

            Logs with this failure are at:
            2015-11-22 10:46:30 - https://testing.hpdd.intel.com/test_sets/7a1ec004-9134-11e5-b507-5254006e85c2
            2016-01-05 07:30:40 - https://testing.hpdd.intel.com/test_sets/33dac2da-b3aa-11e5-8114-5254006e85c2
            2016-02-01 12:07:07 - https://testing.hpdd.intel.com/test_sets/9a834e80-c908-11e5-aaa9-5254006e85c2
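
            Since the two error strings are distinct, triaging a run is mechanical. A minimal sketch (the sample log line below is hypothetical) that classifies a line pulled from a saved test_log into one bucket or the other:

            ```shell
            # Hypothetical triage for the two client-side errors mixed into
            # this ticket; $log stands in for a line from a saved test_log.
            log='open(/mnt/lustre/d50h.conf-sanity/2/f50h.conf-sanity-0) error: No space left on device'

            case "$log" in
                *'No space left on device'*) bucket='ENOSPC - possibly LU-7309' ;;
                *'File too large'*)          bucket='EFBIG - possibly LU-4340' ;;
                *)                           bucket='other' ;;
            esac
            echo "$bucket"
            ```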

            Other test_logs mentioned in this ticket have the following error:

            open(/mnt/lustre/d50h.conf-sanity/2/f50h.conf-sanity-0) error: File too large
            
            niu Niu Yawei (Inactive) added a comment - https://testing.hpdd.intel.com/test_sets/786f02ae-6255-11e5-8cee-5254006e85c2
            bogl Bob Glossman (Inactive) added a comment - another seen on b2_5: https://testing.hpdd.intel.com/test_sets/2cdb6ca2-ca74-11e4-9330-5254006e85c2
            doug Doug Oucharek (Inactive) added a comment - Seen in master (2.7): https://testing.hpdd.intel.com/test_sets/218a95fe-3e0f-11e4-b06a-5254006e85c2
            yujian Jian Yu added a comment -

            Lustre Build: https://build.hpdd.intel.com/job/lustre-b2_5/84/
            Distro/Arch: RHEL6.5/x86_64
            Network: o2ib

            https://testing.hpdd.intel.com/test_sets/9c8fcf9a-2b62-11e4-8687-5254006e85c2

            Dmesg on MDS:

            LustreError: 14985:0:(osp_precreate.c:719:osp_precreate_cleanup_orphans()) lustre-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -5
            Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre-OST0000.osc.active='1'
            Lustre: Permanently reactivating lustre-OST0000
            LustreError: 14718:0:(lod_qos.c:946:lod_alloc_specific()) can't lstripe objid [0x200000bd0:0x5:0x0]: have 0 want 1
            LustreError: 167-0: lustre-OST0000-osc-MDT0000: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
            Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_50h: @@@@@@ FAIL: some OSC imports are still not connected 
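
            Lustre console errors report negated errno values, so the "rc = -5" in the orphan-cleanup line above is -EIO. A small decoding sketch (the table is a hand-picked subset of Linux errno names, not anything from the Lustre tree):

            ```shell
            # Map a negated kernel return code to its Linux errno name.
            rc_name() {
                case "${1#-}" in
                    2)  echo ENOENT ;;
                    5)  echo EIO ;;
                    11) echo EAGAIN ;;
                    27) echo EFBIG ;;
                    28) echo ENOSPC ;;
                    *)  echo "errno $1" ;;
                esac
            }

            rc_name -5   # the orphan-cleanup failure above: EIO
            ```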
            

            jamesanunez James Nunez (Inactive) added a comment -

            I hit this on master (2.7): https://testing.hpdd.intel.com/test_sets/2046baf4-11e3-11e4-90ac-5254006e85c2

            In the client log, I see

            open(/mnt/lustre/d50h.conf-sanity/2/f50h.conf-sanity-0) error: File too large

            Maybe related to LU-4340?
            adilger Andreas Dilger added a comment - Seen on master (2.6-pre): https://maloo.whamcloud.com/test_sets/3cd0ba26-e70e-11e3-badc-52540035b04c
            bogl Bob Glossman (Inactive) added a comment - seen in b2_4 https://maloo.whamcloud.com/test_sets/450f675e-b8b6-11e3-bc82-52540035b04c

            adilger Andreas Dilger added a comment -

            It looks like this problem has been hit only once, with patch 7196 applied. Until it is seen elsewhere, we should assume it is caused by that patch.

            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 15

              Dates

                Created:
                Updated:
                Resolved: