Lustre / LU-13000

MDS hit (osd_handler.c:2165:osd_object_release()) ASSERTION( !(o->oo_destroyed == 0 && o->oo_inode && o->oo_inode->i_nlink == 0) ) failed


Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.13.0
    • b2.13-ib #2
    • 3
    • 9223372036854775807

    Description

      Two of the four MDSs hit this issue after SOAK was restarted. SOAK was restarted because of LU-12990. After the restart, with the update_log left uncleaned and MDS failover performed, the problem was hit right away. The restart process was:
      1. stop soak
      2. umount everything
      3. reboot
      4. mount
      5. restart soak

      soak-9 console

      [ 1731.742403] Lustre: soaked-MDT0002-osp-MDT0001: Connection restored to 192.168.1.110@o2ib (at 192.168.1.110@o2ib)
      [ 1731.753872] Lustre: Skipped 2 previous similar messages
      [ 1736.627026] Lustre: soaked-MDT0001: Recovery over after 5:49, of 12 clients 3 recovered and 9 were evicted.
      [23485.235434] LustreError: 5279:0:(osd_handler.c:2165:osd_object_release()) ASSERTION( !(o->oo_destroyed == 0 && o->oo_inode && o->oo_inode->i_nlink == 0) ) failed:
      [23485.251769] LustreError: 5279:0:(osd_handler.c:2165:osd_object_release()) LBUG
      [23485.259857] Pid: 5279, comm: mdt01_001 3.10.0-1062.1.1.el7_lustre.x86_64 #1 SMP Fri Nov 8 18:37:40 UTC 2019
      [23485.270743] Call Trace:
      [23485.273503]  [<ffffffffc0dab8ac>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [23485.280832]  [<ffffffffc0dab95c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [23485.287762]  [<ffffffffc169466c>] osd_object_release+0x7c/0x80 [osd_ldiskfs]
      [23485.295670]  [<ffffffffc0eef0a8>] lu_object_put+0x198/0x3e0 [obdclass]
      [23485.303009]  [<ffffffffc0eb3a8a>] llog_osd_regular_fid_add_name_entry+0x27a/0x500 [obdclass]
      [23485.312472]  [<ffffffffc0eb4a5f>] llog_osd_declare_create+0x3af/0x710 [obdclass]
      [23485.320769]  [<ffffffffc0ea07f5>] llog_declare_create+0x75/0x1f0 [obdclass]
      [23485.328579]  [<ffffffffc0ea6fbd>] llog_cat_prep_log+0x11d/0x360 [obdclass]
      [23485.336290]  [<ffffffffc0ea7260>] llog_cat_declare_add_rec+0x60/0x260 [obdclass]
      [23485.344586]  [<ffffffffc0e9e178>] llog_declare_add+0x78/0x1a0 [obdclass]
      [23485.352100]  [<ffffffffc124c52e>] top_trans_start+0x17e/0x940 [ptlrpc]
      [23485.359490]  [<ffffffffc18ce2b4>] lod_trans_start+0x34/0x40 [lod]
      [23485.366340]  [<ffffffffc1989f9a>] mdd_trans_start+0x1a/0x20 [mdd]
      [23485.373197]  [<ffffffffc196e2f2>] mdd_create+0xbe2/0x1630 [mdd]
      [23485.379838]  [<ffffffffc17f8e84>] mdt_create+0xb54/0x10e0 [mdt]
      [23485.386519]  [<ffffffffc17f957b>] mdt_reint_create+0x16b/0x360 [mdt]
      [23485.393643]  [<ffffffffc17feab3>] mdt_reint_rec+0x83/0x210 [mdt]
      [23485.400378]  [<ffffffffc17d89e0>] mdt_reint_internal+0x7b0/0xba0 [mdt]
      [23485.407703]  [<ffffffffc17e46d7>] mdt_reint+0x67/0x140 [mdt]
      [23485.414052]  [<ffffffffc123b83a>] tgt_request_handle+0x98a/0x1630 [ptlrpc]
      [23485.421781]  [<ffffffffc11dda96>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
      [23485.430384]  [<ffffffffc11e15cc>] ptlrpc_main+0xbac/0x1540 [ptlrpc]
      [23485.437434]  [<ffffffff848c50d1>] kthread+0xd1/0xe0
      [23485.442910]  [<ffffffff84f8cd37>] ret_from_fork_nospec_end+0x0/0x39
      [23485.449927]  [<ffffffffffffffff>] 0xffffffffffffffff
      [23485.455515] Kernel panic - not syncing: LBUG
      [23485.460280] CPU: 12 PID: 5279 Comm: mdt01_001 Kdump: loaded Tainted: G           OE  ------------   3.10.0-1062.1.1.el7_lustre.x86_64 #1
      [23485.473964] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
      [23485.486499] Call Trace:
      [23485.489230]  [<ffffffff84f792c2>] dump_stack+0x19/0x1b
      [23485.494965]  [<ffffffff84f72941>] panic+0xe8/0x21f
      [23485.500316]  [<ffffffffc0dab9ab>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [23485.507217]  [<ffffffffc169466c>] osd_object_release+0x7c/0x80 [osd_ldiskfs]
      [23485.515102]  [<ffffffffc0eef0a8>] lu_object_put+0x198/0x3e0 [obdclass]
      [23485.522401]  [<ffffffffc0eb3a8a>] llog_osd_regular_fid_add_name_entry+0x27a/0x500 [obdclass]
      [23485.531833]  [<ffffffffc0eb4a5f>] llog_osd_declare_create+0x3af/0x710 [obdclass]
      [23485.540099]  [<ffffffffc0ea07f5>] llog_declare_create+0x75/0x1f0 [obdclass]
      [23485.547881]  [<ffffffffc0ea6fbd>] llog_cat_prep_log+0x11d/0x360 [obdclass]
      [23485.555566]  [<ffffffffc0ea7260>] llog_cat_declare_add_rec+0x60/0x260 [obdclass]
      [23485.563832]  [<ffffffffc0e9e178>] llog_declare_add+0x78/0x1a0 [obdclass]
      [23485.571336]  [<ffffffffc124c52e>] top_trans_start+0x17e/0x940 [ptlrpc]
      [23485.578627]  [<ffffffffc18ce2b4>] lod_trans_start+0x34/0x40 [lod]
      [23485.585432]  [<ffffffffc1989f9a>] mdd_trans_start+0x1a/0x20 [mdd]
      [23485.592239]  [<ffffffffc196e2f2>] mdd_create+0xbe2/0x1630 [mdd]
      [23485.598859]  [<ffffffffc17f8e84>] mdt_create+0xb54/0x10e0 [mdt]
      [23485.605489]  [<ffffffffc0ecc3c4>] ? lprocfs_stats_lock+0x24/0xd0 [obdclass]
      [23485.613270]  [<ffffffffc17f957b>] mdt_reint_create+0x16b/0x360 [mdt]
      [23485.620370]  [<ffffffffc17feab3>] mdt_reint_rec+0x83/0x210 [mdt]
      [23485.627080]  [<ffffffffc17d89e0>] mdt_reint_internal+0x7b0/0xba0 [mdt]
      [23485.634378]  [<ffffffffc17e1f67>] ? mdt_thread_info_init+0xa7/0x1e0 [mdt]
      [23485.641963]  [<ffffffffc17e46d7>] mdt_reint+0x67/0x140 [mdt]
      [23485.648312]  [<ffffffffc123b83a>] tgt_request_handle+0x98a/0x1630 [ptlrpc]
      [23485.656022]  [<ffffffffc1212c41>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
      [23485.664474]  [<ffffffffc0dabcbe>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
      [23485.672372]  [<ffffffffc11dda96>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
      [23485.680951]  [<ffffffffc11da4e1>] ? ptlrpc_wait_event+0xd1/0x3a0 [ptlrpc]
      [23485.688535]  [<ffffffff848d2643>] ? __wake_up+0x13/0x20
      [23485.694392]  [<ffffffffc11e15cc>] ptlrpc_main+0xbac/0x1540 [ptlrpc]
      [23485.701412]  [<ffffffffc11e0a20>] ? ptlrpc_register_service+0xf90/0xf90 [ptlrpc]
      [23485.709666]  [<ffffffff848c50d1>] kthread+0xd1/0xe0
      [23485.715115]  [<ffffffff848c5000>] ? insert_kthread_work+0x40/0x40
      [23485.721914]  [<ffffffff84f8cd37>] ret_from_fork_nospec_begin+0x21/0x21
      [23485.729199]  [<ffffffff848c5000>] ? insert_kthread_work+0x40/0x40
      [    0.000000] Initializing cgroup subsys cpuset
      [    0.000000] Initializing cgroup subsys cpu
      [    0.000000] Initializing cgroup subsys cpuacct
      [    0.000000] Linux version 3.10.0-1062.1.1.el7_lustre.x86_64 (jenkins@trevis-306-el7-x8664-2.trevis.whamcloud.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Fri Nov 8 18:37:40 UTC 2019
      [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-1062.1.1.el7_lustre.x86_64 ro console=ttyS0,115200 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never nokaslr novmcoredd disable_cpu_apicid=0 elfcorehdr=869816K
      [    0.000000] e820: BIOS-provided physical RAM map:
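
The assertion text in the log above names three fields: `oo_destroyed`, `oo_inode`, and `i_nlink`. As a reading aid, here is a minimal sketch of that predicate. It is a model of the condition only, not the Lustre code; the class names `InodeModel` and `OsdObjectModel` and the function `release_assertion_holds` are hypothetical stand-ins that carry just the fields quoted in the assertion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InodeModel:
    """Hypothetical stand-in for the inode; only i_nlink is modeled."""
    i_nlink: int


@dataclass
class OsdObjectModel:
    """Hypothetical stand-in for the OSD object; only the asserted fields."""
    oo_destroyed: int
    oo_inode: Optional[InodeModel]


def release_assertion_holds(o: OsdObjectModel) -> bool:
    """Mirror the logged condition:
    !(o->oo_destroyed == 0 && o->oo_inode && o->oo_inode->i_nlink == 0)
    """
    return not (
        o.oo_destroyed == 0
        and o.oo_inode is not None
        and o.oo_inode.i_nlink == 0
    )


# The only failing combination: an inode is attached, its link count is
# zero, and the object was never marked destroyed.
failing = OsdObjectModel(oo_destroyed=0, oo_inode=InodeModel(i_nlink=0))
print(release_assertion_holds(failing))
```

Under this model the assertion fails only when an object is released while it still has an inode with a zero link count and has not been marked destroyed. Any other combination of the three fields passes.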
      

      soak-10 console

      [ 2564.640197] Lustre: soaked-MDT0002: disconnecting 9 stale clients
      [ 2564.727697] Lustre: soaked-MDT0002: Recovery over after 5:00, of 12 clients 3 recovered and 9 were evicted.
      [ 2567.239742] Lustre: soaked-MDT0000-osp-MDT0002: Connection restored to 192.168.1.108@o2ib (at 192.168.1.108@o2ib)
      [ 2567.251222] Lustre: Skipped 8 previous similar messages
      [24318.991635] LustreError: 5459:0:(osd_handler.c:2165:osd_object_release()) ASSERTION( !(o->oo_destroyed == 0 && o->oo_inode && o->oo_inode->i_nlink == 0) ) failed:
      [24319.007997] LustreError: 5459:0:(osd_handler.c:2165:osd_object_release()) LBUG
      [24319.016083] Pid: 5459, comm: mdt_out01_002 3.10.0-1062.1.1.el7_lustre.x86_64 #1 SMP Fri Nov 8 18:37:40 UTC 2019
      [24319.027375] Call Trace:
      [24319.030151]  [<ffffffffc0c7d8ac>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [24319.037494]  [<ffffffffc0c7d95c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [24319.044452]  [<ffffffffc158166c>] osd_object_release+0x7c/0x80 [osd_ldiskfs]
      [24319.052389]  [<ffffffffc0dc10a8>] lu_object_put+0x198/0x3e0 [obdclass]
      [24319.059807]  [<ffffffffc1110b9c>] out_tx_end+0x1ec/0x5c0 [ptlrpc]
      [24319.066758]  [<ffffffffc1114d52>] out_handle+0x1442/0x1bb0 [ptlrpc]
      [24319.073859]  [<ffffffffc110d83a>] tgt_request_handle+0x98a/0x1630 [ptlrpc]
      [24319.081638]  [<ffffffffc10afa96>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
      [24319.090301]  [<ffffffffc10b35cc>] ptlrpc_main+0xbac/0x1540 [ptlrpc]
      [24319.097391]  [<ffffffff8bcc50d1>] kthread+0xd1/0xe0
      [24319.102906]  [<ffffffff8c38cd37>] ret_from_fork_nospec_end+0x0/0x39
      [24319.109926]  [<ffffffffffffffff>] 0xffffffffffffffff
      [24319.115541] Kernel panic - not syncing: LBUG
      [24319.120317] CPU: 12 PID: 5459 Comm: mdt_out01_002 Kdump: loaded Tainted: G           OE  ------------   3.10.0-1062.1.1.el7_lustre.x86_64 #1
      [24319.134391] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
      [24319.146920] Call Trace:
      [24319.149656]  [<ffffffff8c3792c2>] dump_stack+0x19/0x1b
      [24319.155410]  [<ffffffff8c372941>] panic+0xe8/0x21f
      [24319.160762]  [<ffffffffc0c7d9ab>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [24319.167668]  [<ffffffffc158166c>] osd_object_release+0x7c/0x80 [osd_ldiskfs]
      [24319.175562]  [<ffffffffc0dc10a8>] lu_object_put+0x198/0x3e0 [obdclass]
      [24319.182897]  [<ffffffffc1110b9c>] out_tx_end+0x1ec/0x5c0 [ptlrpc]
      [24319.189744]  [<ffffffffc1114d52>] out_handle+0x1442/0x1bb0 [ptlrpc]
      [24319.196784]  [<ffffffffc110d83a>] tgt_request_handle+0x98a/0x1630 [ptlrpc]
      [24319.204509]  [<ffffffffc10e4c41>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
      [24319.212965]  [<ffffffffc0c7dcbe>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
      [24319.220877]  [<ffffffffc10afa96>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
      [24319.229471]  [<ffffffffc10ac4e1>] ? ptlrpc_wait_event+0xd1/0x3a0 [ptlrpc]
      [24319.237076]  [<ffffffff8bcd2643>] ? __wake_up+0x13/0x20
      [24319.242946]  [<ffffffffc10b35cc>] ptlrpc_main+0xbac/0x1540 [ptlrpc]
      [24319.249988]  [<ffffffffc10b2a20>] ? ptlrpc_register_service+0xf90/0xf90 [ptlrpc]
      [24319.258245]  [<ffffffff8bcc50d1>] kthread+0xd1/0xe0
      [24319.263695]  [<ffffffff8bcc5000>] ? insert_kthread_work+0x40/0x40
      [24319.270500]  [<ffffffff8c38cd37>] ret_from_fork_nospec_begin+0x21/0x21
      [24319.277791]  [<ffffffff8bcc5000>] ? insert_kthread_work+0x40/0x40
      [    0.000000] Initializing cgroup subsys cpuset
      [    0.000000] Initializing cgroup subsys cpu
      [    0.000000] Initializing cgroup subsys cpuacct
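
Both consoles reach `osd_object_release()` through `lu_object_put()`, but from different callers: `llog_osd_regular_fid_add_name_entry()` on soak-9 and `out_tx_end()` on soak-10. The sketch below shows one possible way to pull the reliable frames out of such console text so the two paths can be compared side by side. The regular expression and the helper name `parse_frames` are assumptions made for illustration, and the sample is a short excerpt of the soak-10 lines above.

```python
import re

# Matches a frame such as:
#   [<ffffffffc158166c>] osd_object_release+0x7c/0x80 [osd_ldiskfs]
# A "? " before the symbol marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_frames(console_text):
    """Return (function, module) pairs for the reliable frames, in order."""
    frames = []
    for line in console_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a symbolized frame line
        unreliable, func, module = match.groups()
        if unreliable:
            continue  # skip "?" frames
        # Frames without a [module] suffix belong to the core kernel.
        frames.append((func, module or "kernel"))
    return frames


SAMPLE = """\
[24319.044452]  [<ffffffffc158166c>] osd_object_release+0x7c/0x80 [osd_ldiskfs]
[24319.052389]  [<ffffffffc0dc10a8>] lu_object_put+0x198/0x3e0 [obdclass]
[24319.059807]  [<ffffffffc1110b9c>] out_tx_end+0x1ec/0x5c0 [ptlrpc]
[24319.212965]  [<ffffffffc0c7dcbe>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[24319.097391]  [<ffffffff8bcc50d1>] kthread+0xd1/0xe0
[24319.109926]  [<ffffffffffffffff>] 0xffffffffffffffff
"""

for func, module in parse_frames(SAMPLE):
    print(f"{module:12s} {func}")
```

Running the same helper over each console capture and comparing the frames directly above `lu_object_put` is a quick way to confirm that the two nodes hit the same assertion from different request paths.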
      

            People

              wc-triage WC Triage
              sarah Sarah Liu