Lustre / LU-9808

recovery-small test 102: osd_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.10.0
    • None
    • 3
    • 9223372036854775807

    Description

      master plus a test-only change that enables 2 recovery-small tests (https://review.whamcloud.com/#/c/27382/1) crashes very consistently for me in recovery-small test 102. The crash is somewhat similar to LU-8060:

      [ 4069.911603] Lustre: DEBUG MARKER: == recovery-small test 102: IR: New client gets updated nidtbl after MGS restart ===================== 18:41:40 (1501368100)
      [ 4074.881029] LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
      [ 4074.884748] LustreError: Skipped 1 previous similar message
      [ 4083.127141] LDISKFS-fs (loop1): file extents enabled, maximum tree depth=5
      [ 4083.133413] LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
      [ 4083.482673] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180
      [ 4085.776322] Lustre: DEBUG MARKER: centos-34.localnet: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-*.ost_server_uuid
      [ 4088.448773] Lustre: lustre-OST0000: deleting orphan objects from 0x0:16965 to 0x0:17057
      [ 4089.173897] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-*.ost_server_uuid in FULL state after 3 sec
      [ 4096.630036] Lustre: Unmounted lustre-client
      [ 4097.292327] LustreError: 32347:0:(osd_handler.c:3197:osd_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed: 
      [ 4097.293334] LustreError: 32347:0:(osd_handler.c:3197:osd_destroy()) LBUG
      [ 4097.294142] Pid: 32347, comm: umount
      [ 4097.294810] 
      Call Trace:
      [ 4097.295666]  [<ffffffffa02207ce>] libcfs_call_trace+0x4e/0x60 [libcfs]
      [ 4097.296291]  [<ffffffffa022085c>] lbug_with_loc+0x4c/0xb0 [libcfs]
      [ 4097.296861]  [<ffffffffa0addc1e>] osd_destroy+0x50e/0x740 [osd_ldiskfs]
      [ 4097.297366]  [<ffffffffa0adcbbf>] ? osd_ref_del+0x13f/0x690 [osd_ldiskfs]
      [ 4097.298166]  [<ffffffffa0345397>] llog_osd_destroy+0x347/0x890 [obdclass]
      [ 4097.298809]  [<ffffffffa0330608>] llog_destroy+0x2f8/0x3d0 [obdclass]
      [ 4097.299339]  [<ffffffffa0338b9b>] llog_cat_close+0xfb/0x230 [obdclass]
      [ 4097.300058]  [<ffffffffa0b9c584>] mdd_changelog_fini+0x64/0x1d0 [mdd]
      [ 4097.300538]  [<ffffffffa0b9dc8a>] mdd_process_config+0x16a/0x600 [mdd]
      [ 4097.301169]  [<ffffffffa0bfaa5c>] mdt_stack_fini+0x2bc/0xcf0 [mdt]
      [ 4097.301824]  [<ffffffffa0bfbccf>] mdt_device_fini+0x83f/0xfb0 [mdt]
      [ 4097.302359]  [<ffffffffa0368b64>] class_cleanup+0x7b4/0xcf0 [obdclass]
      [ 4097.303037]  [<ffffffffa036b0dd>] class_process_config+0x19cd/0x23b0 [obdclass]
      [ 4097.303888]  [<ffffffff811cd4f9>] ? __kmalloc+0x649/0x660
      [ 4097.304378]  [<ffffffffa022bcc7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [ 4097.305199]  [<ffffffffa036bc86>] class_manual_cleanup+0x1c6/0x6d0 [obdclass]
      [ 4097.305723]  [<ffffffffa03994de>] server_put_super+0x8ae/0xca0 [obdclass]
      [ 4097.306233]  [<ffffffff811efa46>] generic_shutdown_super+0x56/0xe0
      [ 4097.306897]  [<ffffffff811efe22>] kill_anon_super+0x12/0x20
      [ 4097.307404]  [<ffffffffa036e3c2>] lustre_kill_super+0x32/0x50 [obdclass]
      [ 4097.317922]  [<ffffffff811f0329>] deactivate_locked_super+0x49/0x60
      [ 4097.318539]  [<ffffffff811f0926>] deactivate_super+0x46/0x60
      [ 4097.319040]  [<ffffffff8120f115>] mntput_no_expire+0xc5/0x120
      [ 4097.319801]  [<ffffffff8121029f>] SyS_umount+0x9f/0x3c0
      [ 4097.320293]  [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
      [ 4097.322496] 
      [ 4097.323292] Kernel panic - not syncing: LBUG
      [ 4097.323791] CPU: 13 PID: 32347 Comm: umount Tainted: P           OE  ------------   3.10.0-debug #2
      [ 4097.324792] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 4097.325318]  ffffffffa023fed2 0000000014c35ea5 ffff8800986fb930 ffffffff816fd3e4
      [ 4097.326384]  ffff8800986fb9b0 ffffffff816f8c34 ffffffff00000008 ffff8800986fb9c0
      [ 4097.327395]  ffff8800986fb960 0000000014c35ea5 0000000014c35ea5 ffff88033e5ad948
      [ 4097.328526] Call Trace:
      [ 4097.329037]  [<ffffffff816fd3e4>] dump_stack+0x19/0x1b
      [ 4097.329501]  [<ffffffff816f8c34>] panic+0xd8/0x1e7
      [ 4097.330089]  [<ffffffffa0220874>] lbug_with_loc+0x64/0xb0 [libcfs]
      [ 4097.330668]  [<ffffffffa0addc1e>] osd_destroy+0x50e/0x740 [osd_ldiskfs]
      [ 4097.331221]  [<ffffffffa0adcbbf>] ? osd_ref_del+0x13f/0x690 [osd_ldiskfs]
      [ 4097.331790]  [<ffffffffa0345397>] llog_osd_destroy+0x347/0x890 [obdclass]
      [ 4097.332336]  [<ffffffffa0330608>] llog_destroy+0x2f8/0x3d0 [obdclass]
      [ 4097.333068]  [<ffffffffa0338b9b>] llog_cat_close+0xfb/0x230 [obdclass]
      [ 4097.333580]  [<ffffffffa0b9c584>] mdd_changelog_fini+0x64/0x1d0 [mdd]
      [ 4097.334171]  [<ffffffffa0b9dc8a>] mdd_process_config+0x16a/0x600 [mdd]
      [ 4097.334709]  [<ffffffffa0bfaa5c>] mdt_stack_fini+0x2bc/0xcf0 [mdt]
      [ 4097.335240]  [<ffffffffa0bfbccf>] mdt_device_fini+0x83f/0xfb0 [mdt]
      [ 4097.335775]  [<ffffffffa0368b64>] class_cleanup+0x7b4/0xcf0 [obdclass]
      [ 4097.336539]  [<ffffffffa036b0dd>] class_process_config+0x19cd/0x23b0 [obdclass]
      [ 4097.337676]  [<ffffffff811cd4f9>] ? __kmalloc+0x649/0x660
      [ 4097.338462]  [<ffffffffa022bcc7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [ 4097.339263]  [<ffffffffa036bc86>] class_manual_cleanup+0x1c6/0x6d0 [obdclass]
      [ 4097.339815]  [<ffffffffa03994de>] server_put_super+0x8ae/0xca0 [obdclass]
      [ 4097.340591]  [<ffffffff811efa46>] generic_shutdown_super+0x56/0xe0
      [ 4097.341146]  [<ffffffff811efe22>] kill_anon_super+0x12/0x20
      [ 4097.341889]  [<ffffffffa036e3c2>] lustre_kill_super+0x32/0x50 [obdclass]
      [ 4097.342685]  [<ffffffff811f0329>] deactivate_locked_super+0x49/0x60
      [ 4097.343460]  [<ffffffff811f0926>] deactivate_super+0x46/0x60
      [ 4097.344173]  [<ffffffff8120f115>] mntput_no_expire+0xc5/0x120
      [ 4097.344881]  [<ffffffff8121029f>] SyS_umount+0x9f/0x3c0
      [ 4097.345686]  [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
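
      For triage, the failing assertion site can be pulled out of a console log like the one above. The following is only a minimal sketch: it assumes the `LustreError: pid:n:(file:line:func()) ASSERTION( expr ) failed` layout seen in this log, and the helper name `find_assertions` is hypothetical, not part of any Lustre tooling.

```python
import re

# Assumed layout, taken from the log in this ticket:
# LustreError: 32347:0:(osd_handler.c:3197:osd_destroy()) ASSERTION( ... ) failed:
ASSERT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\( (?P<expr>.*) \) failed"
)


def find_assertions(log_text):
    """Return one dict per failed-assertion line found in log_text."""
    hits = []
    for m in ASSERT_RE.finditer(log_text):
        hit = m.groupdict()
        hit["pid"] = int(hit["pid"])
        hit["line"] = int(hit["line"])
        hits.append(hit)
    return hits


# Two lines copied from the log above; only the first is an assertion line.
sample = (
    "[ 4097.292327] LustreError: 32347:0:(osd_handler.c:3197:osd_destroy()) "
    "ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed: \n"
    "[ 4097.293334] LustreError: 32347:0:(osd_handler.c:3197:osd_destroy()) LBUG\n"
)

for hit in find_assertions(sample):
    print(hit["file"], hit["line"], hit["func"], hit["expr"])
```

      Messages in other formats, such as the `LustreError: 137-5:` connect error earlier in the log, are deliberately not matched.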
      

            People

              wc-triage WC Triage
              green Oleg Drokin