Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-12769

replay-dual test 0b hangs in client mount

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.13.0, Lustre 2.12.4
    • Lustre 2.13.0, Lustre 2.12.4
    • None
    • 3
    • 9223372036854775807

    Description

      replay-dual test_0b hangs in when the client tries to mount the file system. Looking at the logs at https://testing.whamcloud.com/test_sets/2165cd40-d70d-11e9-9fc9-52540065bddc, the last lines seen in the suite_log are:

      Started lustre-MDT0000
      Starting client: trevis-16vm11.trevis.whamcloud.com:  -o user_xattr,flock trevis-20vm9@tcp:/lustre /mnt/lustre
      CMD: trevis-16vm11.trevis.whamcloud.com mkdir -p /mnt/lustre
      CMD: trevis-16vm11.trevis.whamcloud.com mount -t lustre -o user_xattr,flock trevis-20vm9@tcp:/lustre /mnt/lustre
      

      On the MDS (vm11), we see the following in the console log

      [ 1298.607167] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
      [ 1309.346832] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:43
      [ 1314.381684] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:38
      [ 1314.383960] Lustre: Skipped 1 previous similar message
      [ 1319.501601] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:33
      [ 1324.621716] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:28
      [ 1329.741784] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:23
      [ 1339.982308] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:12
      [ 1339.984388] Lustre: Skipped 1 previous similar message
      [ 1352.877227] Lustre: lustre-MDT0000: recovery is timed out, evict stale exports
      [ 1352.878085] Lustre: lustre-MDT0000: disconnecting 1 stale clients
      [ 1360.462321] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) to recover in 0:27
      [ 1360.464241] Lustre: Skipped 3 previous similar messages
      [ 1396.303505] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 0:07
      [ 1396.305383] Lustre: Skipped 6 previous similar messages
      [ 1462.863612] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 1:14
      [ 1462.865659] Lustre: Skipped 12 previous similar messages
      [ 1590.865592] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 3:22
      [ 1590.867668] Lustre: Skipped 24 previous similar messages
      [ 1846.869597] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 7:38
      [ 1846.871619] Lustre: Skipped 49 previous similar messages
      [ 2358.877590] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 16:10
      [ 2358.879608] Lustre: Skipped 99 previous similar messages
      [ 2963.046705] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 26:14
      [ 2963.048899] Lustre: Skipped 117 previous similar messages
      [ 3567.216150] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 36:18
      [ 3567.218114] Lustre: Skipped 117 previous similar messages
      [ 4171.385464] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 46:23
      [ 4171.387400] Lustre: Skipped 117 previous similar messages
      [ 4775.554195] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 56:27
      [ 4775.556078] Lustre: Skipped 117 previous similar messages
      [ 5379.723493] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 66:31
      [ 5379.725448] Lustre: Skipped 117 previous similar messages
      [ 5983.892872] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 76:35
      [ 5983.894832] Lustre: Skipped 117 previous similar messages
      [ 6588.062279] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 86:39
      [ 6588.064310] Lustre: Skipped 117 previous similar messages
      [ 7192.231245] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 96:43
      [ 7192.233232] Lustre: Skipped 117 previous similar messages
      [ 7796.400758] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 106:48
      [ 7796.403030] Lustre: Skipped 117 previous similar messages
      [ 8400.570060] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 116:52
      [ 8400.572101] Lustre: Skipped 117 previous similar messages
      

      On the OSS (vm4) console log, we see

      [16003.344264] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-20vm9.trevis.whamcloud.com: executing set_default_debug -1 all 4
      [16003.724818] Lustre: DEBUG MARKER: trevis-20vm9.trevis.whamcloud.com: executing set_default_debug -1 all 4
      [16009.720340] Lustre: Evicted from MGS (at 10.9.4.245@tcp) after server handle changed from 0xb142ad53c4c0bc64 to 0xb142ad53c4c0d363
      [16014.732310] LustreError: 10096:0:(ldlm_resource.c:1147:ldlm_resource_complain()) lustre-MDT0000-lwp-OST0001: namespace resource [0x200000006:0x20000:0x0].0x0 (ffff8f95b862c0c0) refcount nonzero (1) after lock cleanup; forcing cleanup.
      [16014.735727] LustreError: 10096:0:(ldlm_resource.c:1147:ldlm_resource_complain()) Skipped 1 previous similar message
      

      This may be related to LU-11038 and/or LU-12340.

      Attachments

        Issue Links

          Activity

            People

              bzzz Alex Zhuravlev
              jamesanunez James Nunez (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: