Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • None
    • Lustre 2.6.0
    • 3
    • 12103

    Description

      This issue was created by maloo for wangdi <di.wang@intel.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/e08778b8-6a67-11e3-9248-52540035b04c.

      The sub-test test_10 failed with the following error:

      Lustre: DEBUG MARKER: /usr/sbin/lctl mark == insanity test 10: Tenth Failure Mode: MDT0\/OST\/MDT1 Fri Dec 20 15:09:13 PST 2013 == 15:09:13 (1387580953)
      Lustre: DEBUG MARKER: == insanity test 10: Tenth Failure Mode: MDT0/OST/MDT1 Fri Dec 20 15:09:13 PST 2013 == 15:09:13 (1387580953)
      Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
      Lustre: DEBUG MARKER: umount -d /mnt/mds1
      Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      Lustre: DEBUG MARKER: hostname
      Lustre: DEBUG MARKER: mkdir -p /mnt/mds1
      Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P1
      Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl /dev/lvm-Role_MDS/P1 /mnt/mds1
      LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
      LustreError: 11-0: lustre-OST0000-osc-MDT0000: Communicating with 10.10.16.180@tcp, operation ost_connect failed with -16.
      LustreError: Skipped 4 previous similar messages
      Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
      LNet: 20318:0:(debug.c:218:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
      LNet: 20318:0:(debug.c:218:libcfs_debug_str2mask()) Skipped 3 previous similar messages
      Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P1 2>/dev/null
      Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 5 clients reconnect
      Lustre: 7793:0:(client.c:1903:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1387581110/real 1387581110] req@ffff88005dd0ec00 x1454975111842816/t0(0) o38->lustre-MDT0001-osp-MDT0000@10.10.16.183@tcp:24/4 lens 400/544 e 0 to 1 dl 1387581115 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      Lustre: 7793:0:(client.c:1903:ptlrpc_expire_one_request()) Skipped 12 previous similar messages
      Lustre: lustre-MDT0000: Recovery over after 0:21, of 5 clients 5 recovered and 0 were evicted.
      LustreError: 11-0: lustre-OST0000-osc-MDT0000: Communicating with 10.10.16.180@tcp, operation ost_connect failed with -16.
      LustreError: Skipped 122 previous similar messages
      LustreError: 11-0: lustre-OST0000-osc-MDT0000: Communicating with 10.10.16.180@tcp, operation ost_connect failed with -16.
      LustreError: Skipped 120 previous similar messages
      LustreError: 11-0: lustre-OST0000-osc-MDT0000: Communicating with 10.10.16.180@tcp, operation ost_connect failed with -16.
      LustreError: Skipped 120 previous similar messages
      LustreError: 11-0: lustre-OST0000-osc-MDT0000: Communicating with 10.10.16.180@tcp, operation ost_connect failed with -16.
      LustreError: Skipped 120 previous similar messages
      LustreError: 11-0: lustre-OST0000-osc-MDT0000: Communicating with 10.10.16.180@tcp, operation ost_connect failed with -16.
      LustreError: Skipped 119 previous similar messages
      SysRq : Show State
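
The repeated `ost_connect failed with -16` lines above report a negated errno value. As a triage aid, here is a minimal sketch that decodes such a line. It is not part of the Lustre test suite: the `describe_failure` helper and its regular expression are hypothetical, written only against the message format shown in this log, and it assumes Linux errno numbering (where 16 is `EBUSY`).

```python
import errno
import os
import re

# Hypothetical pattern, derived only from the console lines in this ticket, e.g.:
#   LustreError: 11-0: lustre-OST0000-osc-MDT0000: Communicating with
#   10.10.16.180@tcp, operation ost_connect failed with -16.
FAILED_OP = re.compile(
    r"LustreError: \S+ (?P<device>\S+): Communicating with (?P<nid>\S+), "
    r"operation (?P<op>\w+) failed with (?P<rc>-\d+)"
)


def describe_failure(line):
    """Return a readable summary of a failed-operation line, or None."""
    match = FAILED_OP.search(line)
    if match is None:
        return None
    # The logged return code is a negated errno value.
    code = -int(match.group("rc"))
    name = errno.errorcode.get(code, "unknown")
    return "{op} on {device} (peer {nid}): {name} ({text})".format(
        op=match.group("op"),
        device=match.group("device"),
        nid=match.group("nid"),
        name=name,
        text=os.strerror(code),
    )


if __name__ == "__main__":
    sample = (
        "LustreError: 11-0: lustre-OST0000-osc-MDT0000: Communicating with "
        "10.10.16.180@tcp, operation ost_connect failed with -16."
    )
    print(describe_failure(sample))
```

Lines that do not match the pattern, such as the `Skipped N previous similar messages` summaries, return `None` and can be filtered out by the caller.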

      Info required for matching: insanity 10

          Activity

            [LU-4409] insanity test_10 (MDT0/OST/MDT1)

            jamesanunez James Nunez (Inactive) added a comment - TEI-3188 requests insanity test 10 be re-enabled in autotest.

            adilger Andreas Dilger added a comment -

            Closer investigation shows that insanity test_10 is still being skipped in
            all recent lustre-review test runs, with one of these messages:

            skipping ALWAYS excluded test 10
            needs >= 2 MDTs

            Since it is currently passing on full test runs, it seems safe enough to file a TEI ticket to request that test_10 be removed from the autotest exception list.
            di.wang Di Wang added a comment -

            Test_10 has been re-enabled by http://review.whamcloud.com/10311. So I guess this ticket has been fixed by LU-2059.

            adilger Andreas Dilger added a comment - I'm happy if you can re-enable the test in autotest.

            jamesanunez James Nunez (Inactive) added a comment -

            Is this ticket still an open problem? A patch for LU-2059, http://review.whamcloud.com/#/c/10311, re-enabled insanity test 10 for all cases except when there are fewer than two MDTs. Yet insanity test 10 is still skipped because it is disabled in autotest (TEI-1312).

            Insanity test 10 is passing in recent full test sessions:
            https://testing.hpdd.intel.com/test_sessions/d7711864-b904-11e4-a983-5254006e85c2
            https://testing.hpdd.intel.com/test_sessions/63504c66-b8d7-11e4-a983-5254006e85c2
            https://testing.hpdd.intel.com/test_sessions/dba0429c-b590-11e4-9366-5254006e85c2

            adilger Andreas Dilger added a comment - This subtest is currently disabled at the autotest level, so a regular test run will skip it. We need to explicitly request testing on the patch - hopefully Test-Parameters works.
            di.wang Di Wang added a comment -

            I just updated the patch 8650 to fix the problem. http://review.whamcloud.com/#/c/8650/

            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 4
