  Lustre / LU-10723

Interop 2.10.3<->2.11 sanity test_232b: OSS hung


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Affects Version/s: Lustre 2.11.0
    • Fix Version/s: Lustre 2.11.0
    • Labels: None
    • Severity: 3

    Description

      sanity test_232b - Timeout occurred after 168 mins, last suite running was sanity, restarting cluster to continue tests
      ^^^^^^^^^^^^^ DO NOT REMOVE LINE ABOVE ^^^^^^^^^^^^^

      This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

      This issue relates to the following test suite run:
      https://testing.hpdd.intel.com/test_sets/4c514328-12aa-11e8-a6ad-52540065bddc
      test_232b failed with the following error:

      Timeout occurred after 168 mins, last suite running was sanity, restarting cluster to continue tests
      

      client: lustre-master tag-2.10.58
      server: 2.10.3
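
      The failing sequence boils down to the steps below (a minimal sketch assembled
      from the commands visible in the OSS console that follows; the client-side step
      that requests the data version is part of sanity.sh test_232b and is assumed,
      not taken from this log):

      # OSS: make the delayed LVB init fail so the data-version PR lock enqueue
      # on lustre-OST0000 returns an error (fail_loc 0x31c, as in the log below).
      lctl set_param fail_loc=0x31c

      # Client: request the data version of a file striped on OST0000, presumably
      # via something like "lfs data_version <file>" (assumption; the exact command
      # is in sanity.sh test_232b).

      # OSS: clear the fail_loc and try to stop the OST.
      lctl set_param fail_loc=0
      umount -d /mnt/lustre-ost1   # hangs: forced cleanup keeps waiting on one
                                   # resource left in the filter namespace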

      OSS console

      [ 6415.456311] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity test 232b: failed data version lock should not block umount ================================ 22:43:59 \(1518648239\)
      [ 6415.640059] Lustre: DEBUG MARKER: == sanity test 232b: failed data version lock should not block umount ================================ 22:43:59 (1518648239)
      [ 6416.022119] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x31c
      [ 6416.188210] Lustre: *** cfs_fail_loc=31c, val=0***
      [ 6416.188807] LustreError: 5316:0:(ldlm_request.c:469:ldlm_cli_enqueue_local()) ### delayed lvb init failed (rc -12) ns: filter-lustre-OST0000_UUID lock: ffff88005e9f6400/0xf4fc5b448cfd123c lrc: 2/0,0 mode: --/PR res: [0x99c2:0x0:0x0].0x0 rrc: 2 type: EXT [0->0] (req 0->0) flags: 0x40000000000000 nid: local remote: 0x0 expref: -99 pid: 5316 timeout: 0 lvb_type: 0
      [ 6416.345912] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0
      [ 6416.882703] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost1' ' /proc/mounts
      [ 6417.199191] Lustre: DEBUG MARKER: umount -d /mnt/lustre-ost1
      [ 6417.366042] Lustre: Failing over lustre-OST0000
      [ 6417.427264] LustreError: 24697:0:(ldlm_resource.c:1100:ldlm_resource_complain()) filter-lustre-OST0000_UUID: namespace resource [0x99c2:0x0:0x0].0x0 (ffff88005d0809c0) refcount nonzero (1) after lock cleanup; forcing cleanup.
      [ 6417.429304] LustreError: 24697:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource: [0x99c2:0x0:0x0].0x0 (ffff88005d0809c0) refcount = 2
      [ 6418.378000] Lustre: lustre-OST0000: Not available for connect from 10.2.8.127@tcp (stopping)
      [ 6421.517980] Lustre: lustre-OST0000: Not available for connect from 10.2.8.125@tcp (stopping)
      [ 6422.430852] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
      [ 6426.515485] Lustre: lustre-OST0000: Not available for connect from 10.2.8.125@tcp (stopping)
      [ 6426.516597] Lustre: Skipped 1 previous similar message
      [ 6427.431867] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
      [ 6431.515551] Lustre: lustre-OST0000: Not available for connect from 10.2.8.125@tcp (stopping)
      [ 6431.516800] Lustre: Skipped 2 previous similar messages
      [ 6432.432858] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
      [ 6436.515436] Lustre: lustre-OST0000: Not available for connect from 10.2.8.125@tcp (stopping)
      [ 6436.516663] Lustre: Skipped 2 previous similar messages
      [ 6437.433853] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
      [ 6442.434854] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
      [ 6446.515570] Lustre: lustre-OST0000: Not available for connect from 10.2.8.125@tcp (stopping)
      [ 6446.516867] Lustre: Skipped 5 previous similar messages
      [ 6452.435856] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
      [ 6452.437151] LustreError: Skipped 1 previous similar message
      [ 6466.515613] Lustre: lustre-OST0000: Not available for connect from 10.2.8.125@tcp (stopping)
      [ 6466.516623] Lustre: Skipped 11 previous similar messages
      [ 6472.436880] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
      [ 6472.438283] LustreError: Skipped 3 previous similar messages
      [ 6501.515401] Lustre: lustre-OST0000: Not available for connect from 10.2.8.125@tcp (stopping)
      [ 6501.516599] Lustre: Skipped 20 previous similar messages
      [ 6507.438850] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
      [ 6507.440165] LustreError: Skipped 6 previous similar messages
      [ 6530.986984] LustreError: 24702:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff880007849200 x1592411468623712/t0(0) o101->lustre-MDT0000-lwp-OST0000@10.2.8.127@tcp:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
      [ 6530.989429] LustreError: 24702:0:(qsd_reint.c:56:qsd_reint_completion()) lustre-OST0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x20000:0x0], rc:-5
      [ 6530.990942] LustreError: 24702:0:(qsd_reint.c:56:qsd_reint_completion()) Skipped 1 previous similar message
      [ 6566.515348] Lustre: lustre-OST0000: Not available for connect from 10.2.8.125@tcp (stopping)
      [ 6566.516397] Lustre: Skipped 38 previous similar messages
      [ 6572.439881] LustreError: 0-0: Forced cleanup waiting for filter-lustre-OST0000_UUID namespace with 1 resources in use, (rc=-110)
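
      The repeating "Forced cleanup waiting for filter-lustre-OST0000_UUID namespace
      with 1 resources in use" messages suggest the resource dumped at 6417.429304
      never drops its remaining reference, so the forced cleanup keeps retrying and
      the umount never completes. While the umount is stuck, the leak can be checked
      from the OSS with the standard ldlm namespace counters (a hedged sketch; the
      parameter names assume a stock OSS and are not taken from this run):

      # DLM resources and locks still held by the OST0000 filter namespace.
      lctl get_param ldlm.namespaces.filter-lustre-OST0000_UUID.resource_count
      lctl get_param ldlm.namespaces.filter-lustre-OST0000_UUID.lock_count

      # Dump the kernel debug buffer to capture the ldlm traces around the hang.
      lctl dk /tmp/oss-ldlm-debug.log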
      


            People

              bougetq Quentin Bouget (Inactive)
              maloo Maloo
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue
