Lustre / LU-12003

Access to invalid semaphore in osd_trunc_unlock_all (ldiskfs)

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.14.0, Lustre 2.12.7
    • Lustre 2.13.0
    • None
    • 3

    Description

      I am seeing these crashes mostly in racer, but sometimes in other tests too: all of a sudden the transaction unlock path steps on an invalid memory pointer and explodes.

      Here's a sample from racer:

      [ 1956.129169] BUG: unable to handle kernel paging request at ffff88031ba1ce50
      [ 1956.161700] IP: [<ffffffff810ba263>] up_read+0x13/0x30
      [ 1956.161700] PGD 241b067 PUD 241e067 PMD 33ff04067 PTE 800000031ba1c060
      [ 1956.169565] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
      [ 1956.169565] Modules linked in: loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) jbd2 mbcache lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) dm_flakey dm_mod libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common i2c_piix4 virtio_console virtio_balloon pcspkr ip_tables rpcsec_gss_krb5 ata_generic pata_acpi drm_kms_helper ttm drm drm_panel_orientation_quirks ata_piix serio_raw i2c_core virtio_blk libata floppy
      [ 1956.179867] CPU: 5 PID: 16415 Comm: mdt02_005 Kdump: loaded Tainted: P           OE  ------------   3.10.0-7.6-debug #1
      [ 1956.179867] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 1956.179867] task: ffff880285c8c8c0 ti: ffff880285c90000 task.ti: ffff880285c90000
      [ 1956.193466] RIP: 0010:[<ffffffff810ba263>]  [<ffffffff810ba263>] up_read+0x13/0x30
      [ 1956.193466] RSP: 0018:ffff880285c938d8  EFLAGS: 00010202
      [ 1956.193466] RAX: ffff88031ba1ce50 RBX: ffff880226d79200 RCX: 0000000000000000
      [ 1956.193466] RDX: ffffffffffffffff RSI: 000000000000006b RDI: ffff88031ba1ce50
      [ 1956.193466] RBP: ffff880285c938d8 R08: ffff880226d79e80 R09: ffff880226d79e80
      [ 1956.193466] R10: ffff8802d8880000 R11: ffff8802d88806a8 R12: ffff88021943be40
      [ 1956.193466] R13: ffff880226d79200 R14: ffff880285c93948 R15: ffff8802d88806b0
      [ 1956.193466] FS:  0000000000000000(0000) GS:ffff88033db40000(0000) knlGS:0000000000000000
      [ 1956.193466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1956.225299] CR2: ffff88031ba1ce50 CR3: 00000002210e4000 CR4: 00000000000006e0
      [ 1956.225299] Call Trace:
      [ 1956.225299]  [<ffffffffa0ae6fb5>] osd_trunc_unlock_all+0x35/0x150 [osd_ldiskfs]
      [ 1956.225299]  [<ffffffffa0accda5>] osd_trans_stop+0x205/0x820 [osd_ldiskfs]
      [ 1956.316681]  [<ffffffffa060bf23>] dt_trans_stop+0x13/0x30 [ptlrpc]
      [ 1956.317199]  [<ffffffffa060f82d>] top_trans_stop+0x30d/0xa10 [ptlrpc]
      [ 1956.317199]  [<ffffffffa0ce9b3c>] lod_trans_stop+0x25c/0x340 [lod]
      [ 1956.317199]  [<ffffffffa060e6ba>] ? top_trans_start+0x34a/0x960 [ptlrpc]
      [ 1956.317199]  [<ffffffffa0bdf108>] mdd_trans_stop+0x28/0x16e [mdd]
      [ 1956.317199]  [<ffffffffa0bd34a6>] mdd_attr_set+0x5e6/0xcf0 [mdd]
      [ 1956.317199]  [<ffffffffa0594032>] ? lustre_msg_get_versions+0x22/0xf0 [ptlrpc]
      [ 1956.317199]  [<ffffffffa0c41e44>] mdt_reint_setattr+0xad4/0x1510 [mdt]
      [ 1956.317199]  [<ffffffffa0c32c71>] ? mdt_root_squash+0x21/0x430 [mdt]
      [ 1956.317199]  [<ffffffffa0c325a2>] ? ucred_set_audit_enabled.isra.13+0x22/0x60 [mdt]
      [ 1956.369343]  [<ffffffffa0c45c80>] mdt_reint_rec+0x80/0x210 [mdt]
      [ 1956.369343]  [<ffffffffa0c22890>] mdt_reint_internal+0x790/0xb30 [mdt]
      [ 1956.369343]  [<ffffffffa0c2a9e7>] ? mdt_thread_info_init+0xa7/0x1e0 [mdt]
      [ 1956.369343]  [<ffffffffa0c2d9b7>] mdt_reint+0x67/0x140 [mdt]
      [ 1956.369343]  [<ffffffffa05fc2a5>] tgt_request_handle+0x915/0x1610 [ptlrpc]
      [ 1956.369343]  [<ffffffffa01a1fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [ 1956.369343]  [<ffffffffa05a13d9>] ptlrpc_server_handle_request+0x259/0xad0 [ptlrpc]
      [ 1956.369343]  [<ffffffff810bfbd8>] ? __wake_up_common+0x58/0x90
      [ 1956.369343]  [<ffffffff813fb7bb>] ? do_raw_spin_unlock+0x4b/0x90
      [ 1956.369343]  [<ffffffffa05a53bc>] ptlrpc_main+0xb7c/0x22c0 [ptlrpc]
      [ 1956.369343]  [<ffffffff813fb7bb>] ? do_raw_spin_unlock+0x4b/0x90
      [ 1956.369343]  [<ffffffff817b99fe>] ? _raw_spin_unlock_irq+0xe/0x30
      [ 1956.369343]  [<ffffffff813fb7bb>] ? do_raw_spin_unlock+0x4b/0x90
      [ 1956.369343]  [<ffffffffa05a4840>] ? ptlrpc_register_service+0xfb0/0xfb0 [ptlrpc]
      [ 1956.369343]  [<ffffffff810b4ed4>] kthread+0xe4/0xf0
      [ 1956.369343]  [<ffffffff810b4df0>] ? kthread_create_on_node+0x140/0x140
      [ 1956.369343]  [<ffffffff817c4c77>] ret_from_fork_nospec_begin+0x21/0x21
      [ 1956.369343]  [<ffffffff810b4df0>] ? kthread_create_on_node+0x140/0x140
      

      Here's a sample from sanity:

      [13322.028331] BUG: unable to handle kernel paging request at ffff880243adce50
      [13322.028331] IP: [<ffffffff810ba263>] up_read+0x13/0x30
      [13322.028331] PGD 241b067 PUD 33edfb067 PMD 33eddd067 PTE 8000000243adc060
      [13322.028331] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
      [13322.028331] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) dm_flakey dm_mod brd ext4 loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) jbd2 mbcache crc_t10dif crct10dif_generic crct10dif_common pcspkr virtio_balloon virtio_console i2c_piix4 ip_tables rpcsec_gss_krb5 ata_generic pata_acpi drm_kms_helper ttm drm drm_panel_orientation_quirks ata_piix i2c_core serio_raw virtio_blk libata floppy [last unloaded: libcfs]
      [13322.028331] CPU: 8 PID: 17372 Comm: mdt04_001 Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-7.6-debug #1
      [13322.028331] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [13322.028331] task: ffff88027d992500 ti: ffff88024c3ec000 task.ti: ffff88024c3ec000
      [13322.028331] RIP: 0010:[<ffffffff810ba263>]  [<ffffffff810ba263>] up_read+0x13/0x30
      [13322.028331] RSP: 0018:ffff88024c3ef8d8  EFLAGS: 00010202
      [13322.028331] RAX: ffff880243adce50 RBX: ffff88016832b640 RCX: 0000000000000000
      [13322.028331] RDX: ffffffffffffffff RSI: 000000000000006b RDI: ffff880243adce50
      [13322.028331] RBP: ffff88024c3ef8d8 R08: ffff88029bb13f98 R09: ffff880295b13e60
      [13322.028331] R10: ffff8802d9c12000 R11: ffff8802d9c12228 R12: ffff8800196687a0
      [13322.028331] R13: ffff88016832b640 R14: ffff88024c3ef948 R15: ffff8802d9c12230
      [13322.028331] FS:  0000000000000000(0000) GS:ffff88033dc00000(0000) knlGS:0000000000000000
      [13322.028331] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [13322.028331] CR2: ffff880243adce50 CR3: 000000006bb86000 CR4: 00000000000006e0
      [13322.028331] Call Trace:
      [13322.028331]  [<ffffffffa0c1bfb5>] osd_trunc_unlock_all+0x35/0x150 [osd_ldiskfs]
      [13322.028331]  [<ffffffffa0c01da5>] osd_trans_stop+0x205/0x820 [osd_ldiskfs]
      [13322.028331]  [<ffffffffa0b6c300>] ? ldiskfs_get_acl+0x400/0x410 [ldiskfs]
      [13322.028331]  [<ffffffffa0626f23>] dt_trans_stop+0x13/0x30 [ptlrpc]
      [13322.028331]  [<ffffffffa062a82d>] top_trans_stop+0x30d/0xa10 [ptlrpc]
      [13322.028331]  [<ffffffffa0d99b3c>] lod_trans_stop+0x25c/0x340 [lod]
      [13322.028331]  [<ffffffffa0938108>] mdd_trans_stop+0x28/0x16e [mdd]
      [13322.028331]  [<ffffffffa092c4a6>] mdd_attr_set+0x5e6/0xcf0 [mdd]
      [13322.028331]  [<ffffffffa05af032>] ? lustre_msg_get_versions+0x22/0xf0 [ptlrpc]
      [13322.028331]  [<ffffffffa0cf1e44>] mdt_reint_setattr+0xad4/0x1510 [mdt]
      [13322.028331]  [<ffffffffa0ce2c71>] ? mdt_root_squash+0x21/0x430 [mdt]
      [13322.028331]  [<ffffffffa0ce25a2>] ? ucred_set_audit_enabled.isra.13+0x22/0x60 [mdt]
      [13322.028331]  [<ffffffffa0cf5c80>] mdt_reint_rec+0x80/0x210 [mdt]
      [13322.028331]  [<ffffffffa0cd2890>] mdt_reint_internal+0x790/0xb30 [mdt]
      [13322.028331]  [<ffffffffa0cda9e7>] ? mdt_thread_info_init+0xa7/0x1e0 [mdt]
      [13322.028331]  [<ffffffffa0cdd9b7>] mdt_reint+0x67/0x140 [mdt]
      [13322.028331]  [<ffffffffa06172a5>] tgt_request_handle+0x915/0x1610 [ptlrpc]
      [13322.028331]  [<ffffffffa0214fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [13322.028331]  [<ffffffffa05bc3d9>] ptlrpc_server_handle_request+0x259/0xad0 [ptlrpc]
      [13322.028331]  [<ffffffff810bfbd8>] ? __wake_up_common+0x58/0x90
      [13322.028331]  [<ffffffff813fb7bb>] ? do_raw_spin_unlock+0x4b/0x90
      [13322.028331]  [<ffffffffa05c03bc>] ptlrpc_main+0xb7c/0x22c0 [ptlrpc]
      [13322.028331]  [<ffffffff813fb7bb>] ? do_raw_spin_unlock+0x4b/0x90
      [13322.028331]  [<ffffffffa05bf840>] ? ptlrpc_register_service+0xfb0/0xfb0 [ptlrpc]
      [13322.028331]  [<ffffffff810b4ed4>] kthread+0xe4/0xf0
      [13322.028331]  [<ffffffff810b4df0>] ? kthread_create_on_node+0x140/0x140
      [13322.028331]  [<ffffffff817c4c77>] ret_from_fork_nospec_begin+0x21/0x21
      [13322.028331]  [<ffffffff810b4df0>] ? kthread_create_on_node+0x140/0x140
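
Both traces report "Oops: 0002" and a PTE value ending in 060. As a rough aid for reading them, the sketch below decodes the x86 page-fault error code and the PTE present bit. The helper names (decode_oops, pte_present) are made up for illustration and are not part of Lustre or the kernel. The decoded result, a kernel-mode write to a non-present page on a DEBUG_PAGEALLOC kernel (which unmaps freed pages), would be consistent with the semaphore living in memory that had already been freed. That is only a reading of the log, not a confirmed root cause.

```python
# Illustrative helpers (hypothetical names) for reading the oops output above.

# x86 page-fault error code bits: (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops(code):
    """Return a list of human-readable meanings for a page-fault error code."""
    meanings = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            meanings.append(if_set)
        elif if_clear is not None:
            meanings.append(if_clear)
    return meanings


def pte_present(pte):
    """Bit 0 of an x86-64 page table entry is the Present flag."""
    return bool(pte & 0x1)


if __name__ == "__main__":
    # "Oops: 0002" appears in both the racer and the sanity trace.
    print(decode_oops(0x0002))
    # PTE values copied from the two traces.
    for pte in (0x800000031BA1C060, 0x8000000243ADC060):
        print(hex(pte), "present" if pte_present(pte) else "not present")
```

In both traces RAX, RDI and CR2 hold the same value, so the address that faulted is the semaphore pointer handed to up_read() from osd_trunc_unlock_all().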
      

          People

            bzzz Alex Zhuravlev
            green Oleg Drokin