  1. Lustre
  2. LU-14780

LustreError: 4936:0:(file.c:4985:ll_layout_lock_set()) LBUG


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Lustre 2.15.0
    • None
    • 3
    • 9223372036854775807

    Description

      Client nodes hit an LBUG when running "ls -l". However, if I run "lfs getstripe" on the same target first, the subsequent "ls -l" completes without triggering the LBUG.
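
      The ordering described above can be sketched as a small shell helper. This only reproduces the observed workaround and is not a fix. The function name `safe_ls_l` and the `LFS_BIN` override are hypothetical, introduced here so the sketch can be exercised on a machine without Lustre. On an affected client it would run the real `lfs getstripe` before `ls -l`.

```shell
#!/bin/sh
# Hypothetical helper illustrating the workaround order from the report:
# query the layout with "lfs getstripe" first, then run "ls -l".
# LFS_BIN is an assumed override (not a Lustre setting); it defaults to "lfs".
LFS_BIN="${LFS_BIN:-lfs}"

safe_ls_l() {
    target="$1"

    # Step 1: fetch the layout explicitly. If this fails, stop here
    # rather than issuing the stat that triggered the LBUG.
    "$LFS_BIN" getstripe "$target" >/dev/null || return 1

    # Step 2: the stat-heavy listing that previously hit the assertion.
    ls -l "$target"
}
```

      Usage would be `safe_ls_l /path/on/lustre`, where the path is a placeholder. Whether this avoids the assertion in every case is unknown; the report only states that it did so on the affected clients.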

      2021-06-03 10:50:33 [ 1584.601278] LustreError: 4936:0:(file.c:4985:ll_layout_lock_set()) ASSERTION( ldlm_has_layout(lock) ) failed: 
      2021-06-03 10:50:33 [ 1584.606428] LustreError: 4938:0:(file.c:4985:ll_layout_lock_set()) ASSERTION( ldlm_has_layout(lock) ) failed: 
      2021-06-03 10:50:33 [ 1584.612486] LustreError: 4936:0:(file.c:4985:ll_layout_lock_set()) LBUG
      2021-06-03 10:50:33 [ 1584.612489] Pid: 4936, comm: dsync 4.18.0-240.15.1.el8.x86_64 #1 SMP Wed Mar 17 23:53:03 UTC 2021
      2021-06-03 10:50:33 [ 1584.623732] LustreError: 4938:0:(file.c:4985:ll_layout_lock_set()) LBUG
      2021-06-03 10:50:33 [ 1584.631129] Call Trace:
      2021-06-03 10:50:33 [ 1584.631157]  libcfs_call_trace+0x86/0xc0 [libcfs]
      2021-06-03 10:50:33 [ 1584.656942]  lbug_with_loc+0x43/0x80 [libcfs]
      2021-06-03 10:50:33 [ 1584.661880]  ll_layout_lock_set+0xac/0x610 [lustre]
      2021-06-03 10:50:33 [ 1584.665010] LustreError: 4935:0:(file.c:4985:ll_layout_lock_set()) ASSERTION( ldlm_has_layout(lock) ) failed: 
      2021-06-03 10:50:33 [ 1584.667385]  ll_layout_refresh+0x1b7/0x310 [lustre]
      2021-06-03 10:50:33 [ 1584.678675] LustreError: 4935:0:(file.c:4985:ll_layout_lock_set()) LBUG
      2021-06-03 10:50:33 [ 1584.684169]  vvp_io_init+0x20c/0x340 [lustre]
      2021-06-03 10:50:33 [ 1584.695274] LustreError: 4931:0:(file.c:4985:ll_layout_lock_set()) ASSERTION( ldlm_has_layout(lock) ) failed: 
      2021-06-03 10:50:33 [ 1584.696524]  cl_io_init0.isra.16+0x83/0x130 [obdclass]
      2021-06-03 10:50:33 [ 1584.707788] LustreError: 4931:0:(file.c:4985:ll_layout_lock_set()) LBUG
      2021-06-03 10:50:33 [ 1584.713559]  cl_glimpse_size0+0x97/0x240 [lustre]
      2021-06-03 10:50:33 [ 1584.726287]  ll_getattr+0x1d6/0x460 [lustre]
      2021-06-03 10:50:33 [ 1584.726847] LustreError: 4939:0:(file.c:4985:ll_layout_lock_set()) ASSERTION( ldlm_has_layout(lock) ) failed: 
      2021-06-03 10:50:33 [ 1584.731123]  vfs_statx+0x8a/0xd0
      2021-06-03 10:50:33 [ 1584.731125]  __do_sys_newlstat+0x39/0x70
      2021-06-03 10:50:33 [ 1584.731129]  do_syscall_64+0x5b/0x1a0
      2021-06-03 10:50:33 [ 1584.731132]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      2021-06-03 10:50:33 [ 1584.731135]  0xffffffffffffffff
      2021-06-03 10:50:33 [ 1584.731138] Kernel panic - not syncing: LBUG
      2021-06-03 10:50:33 [ 1584.731139] Pid: 4938, comm: dsync 4.18.0-240.15.1.el8.x86_64 #1 SMP Wed Mar 17 23:53:03 UTC 2021
      2021-06-03 10:50:33 [ 1584.731140] Call Trace:
      2021-06-03 10:50:33 [ 1584.731170]  libcfs_call_trace+0x86/0xc0 [libcfs]
      2021-06-03 10:50:33 [ 1584.731175]  lbug_with_loc+0x43/0x80 [libcfs]
      2021-06-03 10:50:33 [ 1584.731201]  ll_layout_lock_set+0xac/0x610 [lustre]
      2021-06-03 10:50:34 [ 1584.731209]  ll_layout_refresh+0x1b7/0x310 [lustre]
      2021-06-03 10:50:34 [ 1584.731220]  vvp_io_init+0x20c/0x340 [lustre]
      2021-06-03 10:50:34 [ 1584.731257]  cl_io_init0.isra.16+0x83/0x130 [obdclass]
      2021-06-03 10:50:34 [ 1584.731267]  cl_glimpse_size0+0x97/0x240 [lustre]
      2021-06-03 10:50:34 [ 1584.731275]  ll_getattr+0x1d6/0x460 [lustre]
      2021-06-03 10:50:34 [ 1584.731280]  vfs_statx+0x8a/0xd0
      2021-06-03 10:50:34 [ 1584.731281]  __do_sys_newlstat+0x39/0x70
      2021-06-03 10:50:34 [ 1584.731285]  do_syscall_64+0x5b/0x1a0
      2021-06-03 10:50:34 [ 1584.731288]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      2021-06-03 10:50:34 [ 1584.731290]  0xffffffffffffffff
      2021-06-03 10:50:34 [ 1584.731292] Pid: 4935, comm: dsync 4.18.0-240.15.1.el8.x86_64 #1 SMP Wed Mar 17 23:53:03 UTC 2021
      2021-06-03 10:50:34 [ 1584.731293] Call Trace:
      2021-06-03 10:50:34 [ 1584.731314]  libcfs_call_trace+0x86/0xc0 [libcfs]
      2021-06-03 10:50:34 [ 1584.731318]  lbug_with_loc+0x43/0x80 [libcfs]
      2021-06-03 10:50:34 [ 1584.731338]  ll_layout_lock_set+0xac/0x610 [lustre]
      2021-06-03 10:50:34 [ 1584.731347]  ll_layout_refresh+0x1b7/0x310 [lustre]
      2021-06-03 10:50:34 [ 1584.731357]  vvp_io_init+0x20c/0x340 [lustre]
      2021-06-03 10:50:34 [ 1584.731386]  cl_io_init0.isra.16+0x83/0x130 [obdclass]
      2021-06-03 10:50:34 [ 1584.731396]  cl_glimpse_size0+0x97/0x240 [lustre]
      2021-06-03 10:50:34 [ 1584.731405]  ll_getattr+0x1d6/0x460 [lustre]
      2021-06-03 10:50:34 [ 1584.731408]  vfs_statx+0x8a/0xd0
      2021-06-03 10:50:34 [ 1584.731410]  __do_sys_newlstat+0x39/0x70
      2021-06-03 10:50:34 [ 1584.731413]  do_syscall_64+0x5b/0x1a0
      2021-06-03 10:50:34 [ 1584.731415]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      2021-06-03 10:50:34 [ 1584.731417]  0xffffffffffffffff
      2021-06-03 10:50:34 [ 1584.731418] Pid: 4931, comm: dsync 4.18.0-240.15.1.el8.x86_64 #1 SMP Wed Mar 17 23:53:03 UTC 2021
      2021-06-03 10:50:34 [ 1584.731420] Call Trace:
      2021-06-03 10:50:34 [ 1584.731434]  libcfs_call_trace+0x86/0xc0 [libcfs]
      2021-06-03 10:50:34 [ 1584.731439]  lbug_with_loc+0x43/0x80 [libcfs]
      2021-06-03 10:50:34 [ 1584.731451]  ll_layout_lock_set+0xac/0x610 [lustre]
      2021-06-03 10:50:34 [ 1584.731462]  ll_layout_refresh+0x1b7/0x310 [lustre]
      2021-06-03 10:50:34 [ 1584.731474]  vvp_io_init+0x20c/0x340 [lustre]
      2021-06-03 10:50:34 [ 1584.731492]  cl_io_init0.isra.16+0x83/0x130 [obdclass]
      2021-06-03 10:50:34 [ 1584.731503]  cl_glimpse_size0+0x97/0x240 [lustre]
      2021-06-03 10:50:34 [ 1584.731513]  ll_getattr+0x1d6/0x460 [lustre]
      2021-06-03 10:50:34 [ 1584.731516]  vfs_statx+0x8a/0xd0
      2021-06-03 10:50:34 [ 1584.731518]  __do_sys_newlstat+0x39/0x70
      2021-06-03 10:50:34 [ 1584.731520]  do_syscall_64+0x5b/0x1a0
      2021-06-03 10:50:34 [ 1584.731522]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      2021-06-03 10:50:34 [ 1584.731523]  0xffffffffffffffff
      2021-06-03 10:50:34 [ 1584.742387] LustreError: 4939:0:(file.c:4985:ll_layout_lock_set()) LBUG
      2021-06-03 10:50:34 [ 1584.746013] CPU: 29 PID: 4936 Comm: dsync Tainted: G           OE    --------- -  - 4.18.0-240.15.1.el8.x86_64 #1
      2021-06-03 10:50:34 [ 1584.750438] Pid: 4939, comm: dsync 4.18.0-240.15.1.el8.x86_64 #1 SMP Wed Mar 17 23:53:03 UTC 2021
      2021-06-03 10:50:34 [ 1584.754554] Call Trace:
      2021-06-03 10:50:34 [ 1584.760241] Call Trace:
      2021-06-03 10:50:34 [ 1584.763786]  dump_stack+0x5c/0x80
      2021-06-03 10:50:34 [ 1584.768589]  libcfs_call_trace+0x86/0xc0 [libcfs]
      2021-06-03 10:50:34 [ 1584.778951]  panic+0xe7/0x2a9
      2021-06-03 10:50:34 [ 1584.781733]  lbug_with_loc+0x43/0x80 [libcfs]
      2021-06-03 10:50:34 [ 1584.787014]  lbug_with_loc.cold.3+0x18/0x18 [libcfs]
      2021-06-03 10:50:34 [ 1584.791932]  ll_layout_lock_set+0xac/0x610 [lustre]
      2021-06-03 10:50:34 [ 1584.797413]  ll_layout_lock_set+0xac/0x610 [lustre]
      2021-06-03 10:50:34 [ 1584.802900]  ll_layout_refresh+0x1b7/0x310 [lustre]
      2021-06-03 10:50:34 [ 1584.807797]  ll_layout_refresh+0x1b7/0x310 [lustre]
      2021-06-03 10:50:34 [ 1584.813573]  vvp_io_init+0x20c/0x340 [lustre]
      2021-06-03 10:50:34 [ 1584.818854]  vvp_io_init+0x20c/0x340 [lustre]
      2021-06-03 10:50:34 [ 1584.823667]  cl_io_init0.isra.16+0x83/0x130 [obdclass]
      2021-06-03 10:50:34 [ 1584.827821]  cl_io_init0.isra.16+0x83/0x130 [obdclass]
      2021-06-03 10:50:34 [ 1584.832766]  cl_glimpse_size0+0x97/0x240 [lustre]
      2021-06-03 10:50:34 [ 1584.837401]  cl_glimpse_size0+0x97/0x240 [lustre]
      2021-06-03 10:50:34 [ 1584.843609]  ll_getattr+0x1d6/0x460 [lustre]
      2021-06-03 10:50:34 [ 1584.847642]  ll_getattr+0x1d6/0x460 [lustre]
      2021-06-03 10:50:34 [ 1584.858503]  vfs_statx+0x8a/0xd0
      2021-06-03 10:50:34 [ 1584.861769]  vfs_statx+0x8a/0xd0
      2021-06-03 10:50:34 [ 1584.867591]  __do_sys_newlstat+0x39/0x70
      2021-06-03 10:50:34 [ 1584.872992]  __do_sys_newlstat+0x39/0x70
      2021-06-03 10:50:34 [ 1584.878985]  do_syscall_64+0x5b/0x1a0
      2021-06-03 10:50:34 [ 1584.884952]  do_syscall_64+0x5b/0x1a0
      2021-06-03 10:50:34 [ 1584.890314]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      2021-06-03 10:50:34 [ 1584.896536]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      2021-06-03 10:50:34 [ 1584.902384]  0xffffffffffffffff
      2021-06-03 10:50:34 [ 1584.907682] RIP: 0033:0x7f1686fc6d89
      2021-06-03 10:50:34 [ 1585.205327] Code: 64 c7 00 16 00 00 00 b8 ff ff ff ff c3 0f 1f 40 00 f3 0f 1e fa 48 89 f0 83 ff 01 77 34 48 89 c7 48 89 d6 b8 06 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 07 c3 66 0f 1f 44 00 00 48 8b 15 c9 00 2d 00
      2021-06-03 10:50:34 [ 1585.226999] RSP: 002b:00007ffdf03e1458 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
      2021-06-03 10:50:34 [ 1585.235853] RAX: ffffffffffffffda RBX: 00007ffdf03e2750 RCX: 00007f1686fc6d89
      2021-06-03 10:50:34 [ 1585.244257] RDX: 00007ffdf03e15f0 RSI: 00007ffdf03e15f0 RDI: 00007ffdf03e1680
      2021-06-03 10:50:34 [ 1585.252633] RBP: 00007ffdf03e1480 R08: 0000000000000000 R09: ffffff00000fffff
      2021-06-03 10:50:34 [ 1585.260999] R10: 00262f4a53a31a46 R11: 0000000000000246 R12: 00000000023e72f0
      2021-06-03 10:50:34 [ 1585.269375] R13: 00007f1687dda308 R14: 00007f1687dda380 R15: 00007f1687dda2e8
      2021-06-03 10:50:35 [ 1586.310396] Shutting down cpus with NMI
      2021-06-03 10:50:35 [ 1586.438268] Kernel Offset: 0x37a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      2021-06-03 10:50:35 [ 1586.499673] ---[ end Kernel panic - not syncing: LBUG ]---
      

      After rebooting one of our MDSs several times (in relation to other Lustre issues), this is no longer reproducible, so it may have been induced by one of the MDTs being in a bad state.

            People

              bobijam Zhenyu Xu
              bobijam Zhenyu Xu