Lustre / LU-10297

parallel-scale-nfsv4 test_metabench: ASSERTION( nfound <= inuse->op_count ) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.11.0, Lustre 2.10.4
    • Affects Version/s: Lustre 2.11.0, Lustre 2.10.2
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/bbaab274-d4fa-11e7-9c63-52540065bddc.

      The sub-test test_metabench failed with the following error:

      Timeout occurred after 347 mins, last suite running was parallel-scale-nfsv4, restarting cluster to continue tests
      

      Environment: Lustre 2.10.2 RC1, EL7.4, ZFS.
      The MDS crashed with an LBUG; console log follows:

      [13488.166260] Lustre: DEBUG MARKER: dmesg
      [13488.563678] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == parallel-scale-nfsv4 test metabench: metabench ==================================================== 09:11:38 \(1511946698\)
      [13488.724793] Lustre: DEBUG MARKER: == parallel-scale-nfsv4 test metabench: metabench ==================================================== 09:11:38 (1511946698)
      [13641.720773] LustreError: 1187:0:(lod_qos.c:1624:lod_alloc_qos()) ASSERTION( nfound <= inuse->op_count ) failed: nfound:3, op_count:0
      [13641.721991] LustreError: 1187:0:(lod_qos.c:1624:lod_alloc_qos()) LBUG
      [13641.722604] Pid: 1187, comm: mdt00_011
      [13641.722980] 
      [13641.722980] Call Trace:
      [13641.723386]  [<ffffffffc05c27ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
      [13641.724029]  [<ffffffffc05c283c>] lbug_with_loc+0x4c/0xb0 [libcfs]
      [13641.724636]  [<ffffffffc125c342>] lod_alloc_qos.constprop.17+0x1582/0x1590 [lod]
      [13641.725507]  [<ffffffffc125efe1>] lod_qos_prep_create+0x1291/0x17f0 [lod]
      [13641.726161]  [<ffffffffc062c4b6>] ? nvlist_lookup_byte_array+0x26/0x30 [znvpair]
      [13641.726921]  [<ffffffffc0fd1cc9>] ? __osd_xattr_get+0xa9/0x210 [osd_zfs]
      [13641.727558]  [<ffffffffc0fd0007>] ? osd_fid_lookup+0x47/0x3a0 [osd_zfs]
      [13641.728196]  [<ffffffffc125fab8>] lod_prepare_create+0x298/0x3f0 [lod]
      [13641.728900]  [<ffffffffc125463e>] lod_declare_striped_create+0x1ee/0x970 [lod]
      [13641.729620]  [<ffffffffc077dbc5>] ? sa_object_size+0x15/0x20 [zfs]
      [13641.730220]  [<ffffffffc12583d1>] lod_declare_xattr_set+0x221/0xe40 [lod]
      [13641.730953]  [<ffffffffc12b2d97>] mdd_create_data+0x487/0x720 [mdd]
      [13641.731568]  [<ffffffffc1186f8a>] mdt_mfd_open+0xc5a/0xe70 [mdt]
      [13641.732152]  [<ffffffffc118771b>] mdt_finish_open+0x57b/0x690 [mdt]
      [13641.732832]  [<ffffffffc1188fcc>] mdt_reint_open+0x179c/0x31a0 [mdt]
      [13641.733464]  [<ffffffffc0bbb717>] ? upcall_cache_get_entry+0x3f7/0x8f0 [obdclass]
      [13641.734187]  [<ffffffffc0bc042e>] ? lu_ucred+0x1e/0x30 [obdclass]
      [13641.734853]  [<ffffffffc116e925>] ? mdt_ucred+0x15/0x20 [mdt]
      [13641.735398]  [<ffffffffc116f1f1>] ? mdt_root_squash+0x21/0x430 [mdt]
      [13641.736035]  [<ffffffffc117e8a0>] mdt_reint_rec+0x80/0x210 [mdt]
      [13641.736676]  [<ffffffffc116030b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [13641.737313]  [<ffffffffc1160832>] mdt_intent_reint+0x162/0x430 [mdt]
      [13641.737993]  [<ffffffffc116b59e>] mdt_intent_policy+0x43e/0xc70 [mdt]
      [13641.738644]  [<ffffffffc0d312b7>] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
      [13641.739318]  [<ffffffffc0d5ad03>] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
      [13641.740102]  [<ffffffffc0d82ee0>] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
      [13641.740847]  [<ffffffffc0de01d2>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [13641.741488]  [<ffffffffc0de40d5>] tgt_request_handle+0x925/0x1370 [ptlrpc]
      [13641.742174]  [<ffffffffc0d8cf16>] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
      [13641.742975]  [<ffffffff810ba598>] ? __wake_up_common+0x58/0x90
      [13641.743565]  [<ffffffffc0d90652>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [13641.744168]  [<ffffffff81029557>] ? __switch_to+0xd7/0x510
      [13641.744770]  [<ffffffff816a9000>] ? __schedule+0x370/0x8b0
      [13641.745310]  [<ffffffffc0d8fbc0>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
      [13641.745927]  [<ffffffff810b099f>] kthread+0xcf/0xe0
      [13641.746448]  [<ffffffff810b08d0>] ? kthread+0x0/0xe0
      [13641.746936]  [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
      [13641.747498]  [<ffffffff810b08d0>] ? kthread+0x0/0xe0
      [13641.747980] 
      [13641.748184] Kernel panic - not syncing: LBUG
      [13641.748599] CPU: 0 PID: 1187 Comm: mdt00_011 Tainted: P           OE  ------------   3.10.0-693.5.2.el7_lustre.x86_64 #1
      [13641.749598] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [13641.750132]  ffff88005a2f2f00 00000000d5284124 ffff88004dbbf558 ffffffff816a3e2d
      [13641.750897]  ffff88004dbbf5d8 ffffffff8169dd14 ffffffff00000008 ffff88004dbbf5e8
      [13641.751664]  ffff88004dbbf588 00000000d5284124 00000000d5284124 0000000000000246
      [13641.752429] Call Trace:
      [13641.752678]  [<ffffffff816a3e2d>] dump_stack+0x19/0x1b
      [13641.753161]  [<ffffffff8169dd14>] panic+0xe8/0x20d
      [13641.753619]  [<ffffffffc05c2854>] lbug_with_loc+0x64/0xb0 [libcfs]
      [13641.754206]  [<ffffffffc125c342>] lod_alloc_qos.constprop.17+0x1582/0x1590 [lod]
      [13641.754914]  [<ffffffffc125efe1>] lod_qos_prep_create+0x1291/0x17f0 [lod]
      [13641.755559]  [<ffffffffc062c4b6>] ? nvlist_lookup_byte_array+0x26/0x30 [znvpair]
      [13641.756260]  [<ffffffffc0fd1cc9>] ? __osd_xattr_get+0xa9/0x210 [osd_zfs]
      [13641.756892]  [<ffffffffc0fd0007>] ? osd_fid_lookup+0x47/0x3a0 [osd_zfs]
      [13641.757517]  [<ffffffffc125fab8>] lod_prepare_create+0x298/0x3f0 [lod]
      [13641.758127]  [<ffffffffc125463e>] lod_declare_striped_create+0x1ee/0x970 [lod]
      [13641.758822]  [<ffffffffc077dbc5>] ? sa_object_size+0x15/0x20 [zfs]
      [13641.759407]  [<ffffffffc12583d1>] lod_declare_xattr_set+0x221/0xe40 [lod]
      [13641.760048]  [<ffffffffc12b2d97>] mdd_create_data+0x487/0x720 [mdd]
      [13641.760650]  [<ffffffffc1186f8a>] mdt_mfd_open+0xc5a/0xe70 [mdt]
      [13641.761217]  [<ffffffffc118771b>] mdt_finish_open+0x57b/0x690 [mdt]
      [13641.761818]  [<ffffffffc1188fcc>] mdt_reint_open+0x179c/0x31a0 [mdt]
      [13641.762425]  [<ffffffffc0bbb717>] ? upcall_cache_get_entry+0x3f7/0x8f0 [obdclass]
      [13641.763143]  [<ffffffffc0bc042e>] ? lu_ucred+0x1e/0x30 [obdclass]
      [13641.763723]  [<ffffffffc116e925>] ? mdt_ucred+0x15/0x20 [mdt]
      [13641.764266]  [<ffffffffc116f1f1>] ? mdt_root_squash+0x21/0x430 [mdt]
      [13641.764879]  [<ffffffffc117e8a0>] mdt_reint_rec+0x80/0x210 [mdt]
      [13641.765448]  [<ffffffffc116030b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [13641.766062]  [<ffffffffc1160832>] mdt_intent_reint+0x162/0x430 [mdt]
      [13641.766662]  [<ffffffffc116b59e>] mdt_intent_policy+0x43e/0xc70 [mdt]
      [13641.767285]  [<ffffffffc0d312b7>] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
      [13641.767938]  [<ffffffffc0d5ad03>] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
      [13641.768628]  [<ffffffffc0d82ee0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
      [13641.769359]  [<ffffffffc0de01d2>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [13641.769961]  [<ffffffffc0de40d5>] tgt_request_handle+0x925/0x1370 [ptlrpc]
      [13641.770634]  [<ffffffffc0d8cf16>] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
      [13641.771346]  [<ffffffff810ba598>] ? __wake_up_common+0x58/0x90
      [13641.771925]  [<ffffffffc0d90652>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [13641.772515]  [<ffffffff81029557>] ? __switch_to+0xd7/0x510
      [13641.773028]  [<ffffffff816a9000>] ? __schedule+0x370/0x8b0
      [13641.773567]  [<ffffffffc0d8fbc0>] ? ptlrpc_register_service+0xe30/0xe30 [ptlrpc]
      [13641.774255]  [<ffffffff810b099f>] kthread+0xcf/0xe0
      [13641.774718]  [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
      [13641.775281]  [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
      [13641.775790]  [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
      [    0.000000] Initializing cgroup subsys cpuset
      [    0.000000] Initializing cgroup subsys cpu
      [    0.000000] Initializing cgroup subsys cpuacct
      [    0.000000] Linux version 3.10.0-693.5.2.el7_lustre.x86_64 (jenkins@trevis-310-el7-x8664-2.trevis.hpdd.intel.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Mon Nov 27 15:30:51 UTC 2017
      [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.5.2.el7_lustre.x86_64 root=UUID=d6ed6b92-d297-4df2-adc0-6d2fa1ffd5ed ro console=tty0 LANG=en_US.UTF-8 console=ttyS0,115200 net.ifnames=0 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never disable_cpu_apicid=0 elfcorehdr=867700K
      [    0.000000] e820: BIOS-provided physical RAM map:
      [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
      [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009f7ff] usable
      [    0.000000] BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
      [    0.000000] BIOS-e820: [mem 0x0000000021000000-0x0000000034f5cfff] usable
      [    0.000000] BIOS-e820: [mem 0x0000000034fff800-0x0000000034ffffff] usable
      [    0.000000] BIOS-e820: [mem 0x000000007fffa000-0x000000007fffffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
      [    0.000000] NX (Execute Disable) protection: active
      [    0.000000] SMBIOS 2.4 present.
      [    0.000000] Hypervisor detected: KVM
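
When triaging many console logs for this failure, it can help to pull the assertion details out of the LustreError line automatically. The following is a minimal sketch, not part of Lustre or Maloo: the regular expression is inferred only from the single assertion line in the log above, and the names ASSERT_RE and parse_assertion are hypothetical helpers introduced for illustration.

```python
import re

# Pattern for a Lustre console assertion line. ASSUMPTION: the layout is
# inferred from the one example in the log above, not from an official spec.
ASSERT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+LustreError:\s+(?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"ASSERTION\(\s*(?P<expr>.*?)\s*\)\s+failed(?::\s*(?P<detail>.*))?$"
)


def parse_assertion(line):
    """Return a dict describing a failed assertion, or None if no match."""
    m = ASSERT_RE.search(line.strip())
    if m is None:
        return None
    info = m.groupdict()
    info["pid"] = int(info["pid"])
    info["line"] = int(info["line"])
    # The trailing "name:value, name:value" part carries the runtime values.
    values = {}
    for pair in (info.pop("detail") or "").split(","):
        name, sep, value = pair.strip().partition(":")
        if sep and value.strip().lstrip("-").isdigit():
            values[name] = int(value.strip())
    info["values"] = values
    return info


# The assertion line copied from the console log in this report.
SAMPLE = (
    "[13641.720773] LustreError: 1187:0:(lod_qos.c:1624:lod_alloc_qos()) "
    "ASSERTION( nfound <= inuse->op_count ) failed: nfound:3, op_count:0"
)

result = parse_assertion(SAMPLE)
print(result["file"], result["line"], result["func"])
print(result["expr"], result["values"])
```

For the line in this report, the parsed values show nfound (3) exceeding op_count (0), which is exactly the condition the assertion in lod_alloc_qos() rejects.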
      

            People

              Assignee: bobijam Zhenyu Xu
              Reporter: maloo Maloo