Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-11269

ptlrpc_set_add_req()) ASSERTION( req->rq_import->imp_state != LUSTRE_IMP_IDLE ) failed

Details

    • 3
    • 9223372036854775807

    Description

      Still seeing this issue:

      Feb 6 16:27:00 bcn1833 kernel: LustreError: 119911:0:(client.c:1087:ptlrpc_set_add_req()) ASSERTION( req->rq_import->imp_state != LUSTRE_IMP_IDLE ) failed:
      Feb 6 16:27:00 bcn1833 kernel: LustreError: 119911:0:(client.c:1087:ptlrpc_set_add_req()) LBUG
      Feb 6 16:27:00 bcn1833 kernel: Pid: 119911, comm: namd2 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019
      Feb 6 16:27:00 bcn1833 kernel: Call Trace:
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc11c37cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc11c387c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc1878ecc>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc187d809>] ptlrpc_queue_wait+0x79/0x230 [ptlrpc]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc1864942>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc1453b35>] osc_enqueue_base+0x2c5/0x6b0 [osc]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc145eb7b>] osc_lock_enqueue+0x38b/0x840 [osc]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc16fce35>] cl_lock_enqueue+0x65/0x120 [obdclass]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc12801a5>] lov_lock_enqueue+0x95/0x150 [lov]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc16fce35>] cl_lock_enqueue+0x65/0x120 [obdclass]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc16fd3c7>] cl_lock_request+0x67/0x1f0 [obdclass]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc170123b>] cl_io_lock+0x2bb/0x3d0 [obdclass]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc170156a>] cl_io_loop+0xba/0x1c0 [obdclass]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc157798b>] ll_file_io_generic+0x63b/0xc90 [lustre]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc1578479>] ll_file_aio_write+0x289/0x660 [lustre]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffc1578950>] ll_file_write+0x100/0x1c0 [lustre]
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffff9fe4a7f0>] vfs_write+0xc0/0x1f0
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffff9fe4b60f>] SyS_write+0x7f/0xf0
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffa038dede>] system_call_fastpath+0x25/0x2a
      Feb 6 16:27:00 bcn1833 kernel: [<ffffffffffffffff>] 0xffffffffffffffff
      Feb 6 16:27:00 bcn1833 kernel: Kernel panic - not syncing: LBUG
      

      Attachments

        Issue Links

          Activity

            [LU-11269] ptlrpc_set_add_req()) ASSERTION( req->rq_import->imp_state != LUSTRE_IMP_IDLE ) failed

            James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38437
            Subject: LU-11269 ptlrpc: revert "request's counter in import"
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0098de040cc3953034807c61f742077a3025beb0

            gerrit Gerrit Updater added a comment - James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38437 Subject: LU-11269 ptlrpc: revert "request's counter in import" Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 0098de040cc3953034807c61f742077a3025beb0

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38340
            Subject: LU-11269 ptlrpc: request's counter in import
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 5cef888e8524932c5913191e40d79b96e668b32a

            gerrit Gerrit Updater added a comment - Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38340 Subject: LU-11269 ptlrpc: request's counter in import Project: fs/lustre-release Branch: b2_12 Current Patch Set: 1 Commit: 5cef888e8524932c5913191e40d79b96e668b32a

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37722/
            Subject: LU-11269 ptlrpc: request's counter in import
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: b09afdf57643cbc1c437a42b4babb0837dd19e65

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37722/ Subject: LU-11269 ptlrpc: request's counter in import Project: fs/lustre-release Branch: master Current Patch Set: Commit: b09afdf57643cbc1c437a42b4babb0837dd19e65

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37649/
            Subject: LU-11269 ptlrpc: do not expose transient IDLE state
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: e668075a922a2a39247c00fe5f47d7a4368967f3

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37649/ Subject: LU-11269 ptlrpc: do not expose transient IDLE state Project: fs/lustre-release Branch: b2_12 Current Patch Set: Commit: e668075a922a2a39247c00fe5f47d7a4368967f3

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37649
            Subject: LU-11269 ptlrpc: do not expose transient IDLE state
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 22b238fb902d90229f25b0d109540b362d14be87

            gerrit Gerrit Updater added a comment - Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37649 Subject: LU-11269 ptlrpc: do not expose transient IDLE state Project: fs/lustre-release Branch: b2_12 Current Patch Set: 1 Commit: 22b238fb902d90229f25b0d109540b362d14be87
            pjones Peter Jones added a comment -

            Landed for 2.14

            pjones Peter Jones added a comment - Landed for 2.14

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37523/
            Subject: LU-11269 ptlrpc: do not expose transient IDLE state
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ea8d2ecc783fbaff12c581935ac426b9b8567031

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37523/ Subject: LU-11269 ptlrpc: do not expose transient IDLE state Project: fs/lustre-release Branch: master Current Patch Set: Commit: ea8d2ecc783fbaff12c581935ac426b9b8567031
            gerrit Gerrit Updater added a comment - - edited

            Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37523
            Subject: LU-11269 ptlrpc: do not expose transient IDLE state
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d9f015f84433100177792a8402ef3a289703d57d

            gerrit Gerrit Updater added a comment - - edited Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37523 Subject: LU-11269 ptlrpc: do not expose transient IDLE state Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: d9f015f84433100177792a8402ef3a289703d57d
            sarah Sarah Liu added a comment - - edited

            also hit this on SOAK on client side, lustre-b2_12-ib #35

            [180971.440478] Lustre: soaked-OST0007-osc-ffffa0200627b800: Connection restored to 192.168.1.107@o2ib (at 192.168.1.107@o2ib)
            [182441.030502] LustreError: 61428:0:(client.c:1088:ptlrpc_set_add_req()) ASSERTION( req->rq_import->imp_state != LUSTRE_IMP_IDLE ) failed: 
            [182441.044298] LustreError: 61428:0:(client.c:1088:ptlrpc_set_add_req()) LBUG
            [182441.052078] Pid: 61428, comm: simul 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019
            [182441.062176] Call Trace:
            [182441.065025]  [<ffffffffc0a2a7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
            [182441.072438]  [<ffffffffc0a2a87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
            [182441.079472]  [<ffffffffc0e0002c>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc]
            [182441.087102]  [<ffffffffc0e04959>] ptlrpc_queue_wait+0x79/0x230 [ptlrpc]
            [182441.094614]  [<ffffffffc0de97c2>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc]
            [182441.102122]  [<ffffffffc0f77a15>] osc_enqueue_base+0x2c5/0x6b0 [osc]
            [182441.109335]  [<ffffffffc0f82a1b>] osc_lock_enqueue+0x38b/0x840 [osc]
            [182441.116555]  [<ffffffffc0b9e315>] cl_lock_enqueue+0x65/0x120 [obdclass]
            [182441.124096]  [<ffffffffc0f281a5>] lov_lock_enqueue+0x95/0x150 [lov]
            [182441.131205]  [<ffffffffc0b9e315>] cl_lock_enqueue+0x65/0x120 [obdclass]
            [182441.138727]  [<ffffffffc0b9e8a7>] cl_lock_request+0x67/0x1f0 [obdclass]
            [182441.146221]  [<ffffffffc0ba270b>] cl_io_lock+0x2bb/0x3d0 [obdclass]
            [182441.153335]  [<ffffffffc0ba2a3a>] cl_io_loop+0xba/0x1c0 [obdclass]
            [182441.160371]  [<ffffffffc104f530>] ll_file_io_generic+0x590/0xcb0 [lustre]
            [182441.168080]  [<ffffffffc1050942>] ll_file_aio_read+0x352/0x410 [lustre]
            [182441.175584]  [<ffffffffc1050b00>] ll_file_read+0x100/0x1c0 [lustre]
            [182441.182695]  [<ffffffffac2416ff>] vfs_read+0x9f/0x170
            [182441.188443]  [<ffffffffac2425bf>] SyS_read+0x7f/0xf0
            [182441.194092]  [<ffffffffac775ddb>] system_call_fastpath+0x22/0x27
            [182441.200916]  [<ffffffffffffffff>] 0xffffffffffffffff
            [182441.206599] Kernel panic - not syncing: LBUG
            [182441.211472] CPU: 0 PID: 61428 Comm: simul Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.21.3.el7.x86_64 #1
            [182441.224185] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
            [182441.236801] Call Trace:
            [182441.239627]  [<ffffffffac763107>] dump_stack+0x19/0x1b
            [182441.245457]  [<ffffffffac75c810>] panic+0xe8/0x21f
            [182441.250903]  [<ffffffffc0a2a8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
            [182441.257911]  [<ffffffffc0e0002c>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc]
            [182441.265499]  [<ffffffffc0e04959>] ptlrpc_queue_wait+0x79/0x230 [ptlrpc]
            [182441.272988]  [<ffffffffc0de97c2>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc]
            [182441.280469]  [<ffffffffac63ce30>] ? dev_queue_xmit+0x10/0x20
            [182441.286893]  [<ffffffffc0de4990>] ? ldlm_expired_completion_wait+0x220/0x220 [ptlrpc]
            [182441.295732]  [<ffffffffc0f81fb0>] ? osc_lock_lockless_cancel+0xe0/0xe0 [osc]
            [182441.303689]  [<ffffffffc0f80d30>] ? osc_lock_upcall+0x580/0x580 [osc]
            [182441.310969]  [<ffffffffc0f77a15>] osc_enqueue_base+0x2c5/0x6b0 [osc]
            [182441.318162]  [<ffffffffc0f807b0>] ? osc_lock_lvb_update+0x330/0x330 [osc]
            [182441.325828]  [<ffffffffc0f82a1b>] osc_lock_enqueue+0x38b/0x840 [osc]
            [182441.333020]  [<ffffffffc0f807b0>] ? osc_lock_lvb_update+0x330/0x330 [osc]
            [182441.340694]  [<ffffffffc0b9e315>] cl_lock_enqueue+0x65/0x120 [obdclass]
            [182441.348176]  [<ffffffffc0f281a5>] lov_lock_enqueue+0x95/0x150 [lov]
            [182441.355267]  [<ffffffffc0b9e315>] cl_lock_enqueue+0x65/0x120 [obdclass]
            [182441.362756]  [<ffffffffc0b9e8a7>] cl_lock_request+0x67/0x1f0 [obdclass]
            [182441.370245]  [<ffffffffc0ba270b>] cl_io_lock+0x2bb/0x3d0 [obdclass]
            [182441.377345]  [<ffffffffc0ba2a3a>] cl_io_loop+0xba/0x1c0 [obdclass]
            [182441.384345]  [<ffffffffc104f530>] ll_file_io_generic+0x590/0xcb0 [lustre]
            [182441.392022]  [<ffffffffc1050942>] ll_file_aio_read+0x352/0x410 [lustre]
            [182441.399506]  [<ffffffffc1050b00>] ll_file_read+0x100/0x1c0 [lustre]
            [182441.406596]  [<ffffffffac2416ff>] vfs_read+0x9f/0x170
            [182441.412329]  [<ffffffffac2425bf>] SyS_read+0x7f/0xf0
            [182441.417977]  [<ffffffffac775d21>] ? system_call_after_swapgs+0xae/0x146
            [182441.425455]  [<ffffffffac775ddb>] system_call_fastpath+0x22/0x27
            [182441.432256]  [<ffffffffac775d21>] ? system_call_after_swapgs+0xae/0x146
            [    0.000000] Initializing cgroup subsys cpuset
            [    0.000000] Initializing cgroup subsys cpu
            [    0.000000] Initializing cgroup subsys cpuacct
            [    0.000000] Linux version 3.10.0-957.21.3.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Tue Jun 18 16:35:19 UTC 2019
            [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.21.3.el7.x86_64 ro console=ttyS0,115200 LANG=en_US.UTF-8 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never nokaslr disable_cpu_apicid=0 elfcorehdr=871860K
            
            
            
            sarah Sarah Liu added a comment - - edited also hit this on SOAK on client side, lustre-b2_12-ib #35 [180971.440478] Lustre: soaked-OST0007-osc-ffffa0200627b800: Connection restored to 192.168.1.107@o2ib (at 192.168.1.107@o2ib) [182441.030502] LustreError: 61428:0:(client.c:1088:ptlrpc_set_add_req()) ASSERTION( req->rq_import->imp_state != LUSTRE_IMP_IDLE ) failed: [182441.044298] LustreError: 61428:0:(client.c:1088:ptlrpc_set_add_req()) LBUG [182441.052078] Pid: 61428, comm: simul 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 [182441.062176] Call Trace: [182441.065025] [<ffffffffc0a2a7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs] [182441.072438] [<ffffffffc0a2a87c>] lbug_with_loc+0x4c/0xa0 [libcfs] [182441.079472] [<ffffffffc0e0002c>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc] [182441.087102] [<ffffffffc0e04959>] ptlrpc_queue_wait+0x79/0x230 [ptlrpc] [182441.094614] [<ffffffffc0de97c2>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc] [182441.102122] [<ffffffffc0f77a15>] osc_enqueue_base+0x2c5/0x6b0 [osc] [182441.109335] [<ffffffffc0f82a1b>] osc_lock_enqueue+0x38b/0x840 [osc] [182441.116555] [<ffffffffc0b9e315>] cl_lock_enqueue+0x65/0x120 [obdclass] [182441.124096] [<ffffffffc0f281a5>] lov_lock_enqueue+0x95/0x150 [lov] [182441.131205] [<ffffffffc0b9e315>] cl_lock_enqueue+0x65/0x120 [obdclass] [182441.138727] [<ffffffffc0b9e8a7>] cl_lock_request+0x67/0x1f0 [obdclass] [182441.146221] [<ffffffffc0ba270b>] cl_io_lock+0x2bb/0x3d0 [obdclass] [182441.153335] [<ffffffffc0ba2a3a>] cl_io_loop+0xba/0x1c0 [obdclass] [182441.160371] [<ffffffffc104f530>] ll_file_io_generic+0x590/0xcb0 [lustre] [182441.168080] [<ffffffffc1050942>] ll_file_aio_read+0x352/0x410 [lustre] [182441.175584] [<ffffffffc1050b00>] ll_file_read+0x100/0x1c0 [lustre] [182441.182695] [<ffffffffac2416ff>] vfs_read+0x9f/0x170 [182441.188443] [<ffffffffac2425bf>] SyS_read+0x7f/0xf0 [182441.194092] [<ffffffffac775ddb>] system_call_fastpath+0x22/0x27 [182441.200916] [<ffffffffffffffff>] 0xffffffffffffffff [182441.206599] Kernel panic - not syncing: LBUG [182441.211472] CPU: 0 PID: 61428 Comm: simul Kdump: loaded Tainted: G OE ------------ 3.10.0-957.21.3.el7.x86_64 #1 [182441.224185] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 [182441.236801] Call Trace: [182441.239627] [<ffffffffac763107>] dump_stack+0x19/0x1b [182441.245457] [<ffffffffac75c810>] panic+0xe8/0x21f [182441.250903] [<ffffffffc0a2a8cb>] lbug_with_loc+0x9b/0xa0 [libcfs] [182441.257911] [<ffffffffc0e0002c>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc] [182441.265499] [<ffffffffc0e04959>] ptlrpc_queue_wait+0x79/0x230 [ptlrpc] [182441.272988] [<ffffffffc0de97c2>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc] [182441.280469] [<ffffffffac63ce30>] ? dev_queue_xmit+0x10/0x20 [182441.286893] [<ffffffffc0de4990>] ? ldlm_expired_completion_wait+0x220/0x220 [ptlrpc] [182441.295732] [<ffffffffc0f81fb0>] ? osc_lock_lockless_cancel+0xe0/0xe0 [osc] [182441.303689] [<ffffffffc0f80d30>] ? osc_lock_upcall+0x580/0x580 [osc] [182441.310969] [<ffffffffc0f77a15>] osc_enqueue_base+0x2c5/0x6b0 [osc] [182441.318162] [<ffffffffc0f807b0>] ? osc_lock_lvb_update+0x330/0x330 [osc] [182441.325828] [<ffffffffc0f82a1b>] osc_lock_enqueue+0x38b/0x840 [osc] [182441.333020] [<ffffffffc0f807b0>] ? osc_lock_lvb_update+0x330/0x330 [osc] [182441.340694] [<ffffffffc0b9e315>] cl_lock_enqueue+0x65/0x120 [obdclass] [182441.348176] [<ffffffffc0f281a5>] lov_lock_enqueue+0x95/0x150 [lov] [182441.355267] [<ffffffffc0b9e315>] cl_lock_enqueue+0x65/0x120 [obdclass] [182441.362756] [<ffffffffc0b9e8a7>] cl_lock_request+0x67/0x1f0 [obdclass] [182441.370245] [<ffffffffc0ba270b>] cl_io_lock+0x2bb/0x3d0 [obdclass] [182441.377345] [<ffffffffc0ba2a3a>] cl_io_loop+0xba/0x1c0 [obdclass] [182441.384345] [<ffffffffc104f530>] ll_file_io_generic+0x590/0xcb0 [lustre] [182441.392022] [<ffffffffc1050942>] ll_file_aio_read+0x352/0x410 [lustre] [182441.399506] [<ffffffffc1050b00>] ll_file_read+0x100/0x1c0 [lustre] [182441.406596] [<ffffffffac2416ff>] vfs_read+0x9f/0x170 [182441.412329] [<ffffffffac2425bf>] SyS_read+0x7f/0xf0 [182441.417977] [<ffffffffac775d21>] ? system_call_after_swapgs+0xae/0x146 [182441.425455] [<ffffffffac775ddb>] system_call_fastpath+0x22/0x27 [182441.432256] [<ffffffffac775d21>] ? system_call_after_swapgs+0xae/0x146 [ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 3.10.0-957.21.3.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Tue Jun 18 16:35:19 UTC 2019 [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.21.3.el7.x86_64 ro console=ttyS0,115200 LANG=en_US.UTF-8 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never nokaslr disable_cpu_apicid=0 elfcorehdr=871860K
            yujian Jian Yu added a comment -

            +1 on master branch while running sanity-sec test 20 in review-dne-zfs-part-2 test group:
            https://testing.whamcloud.com/test_sets/dcbe721c-f9ab-11e8-b756-52540065bddc

            yujian Jian Yu added a comment - +1 on master branch while running sanity-sec test 20 in review-dne-zfs-part-2 test group: https://testing.whamcloud.com/test_sets/dcbe721c-f9ab-11e8-b756-52540065bddc
            sarah Sarah Liu added a comment -

            Hit similar issue on soak, lustre-master-ib 2.11.54_91_g605a139

            client dmesg

            [ 2388.749104] alg: No test for adler32 (adler32-zlib)
            [ 2389.628880] Lustre: Lustre: Build Version: 2.11.54_91_g605a139
            [ 2389.806899] LNet: Using FMR for registration
            [ 2389.825192] LNet: Added LNI 192.168.1.129@o2ib [8/256/0/180]
            [ 3597.244535] LustreError: 11-0: soaked-MDT0003-mdc-ffff8bc97a7fd000: operation mds_connect to node 192.168.1.111@o2ib failed: rc = -16
            [ 3597.258019] LustreError: 11-0: soaked-MDT0002-mdc-ffff8bc97a7fd000: operation mds_connect to node 192.168.1.110@o2ib failed: rc = -16
            [ 3597.273043] LustreError: 11-0: soaked-OST0002-osc-ffff8bc97a7fd000: operation ost_connect to node 192.168.1.106@o2ib failed: rc = -16
            [ 3597.275373] Lustre: Mounted soaked-client
            [ 3597.291219] LustreError: 11-0: soaked-OST0006-osc-ffff8bc97a7fd000: operation ost_connect to node 192.168.1.106@o2ib failed: rc = -16
            [ 3597.304919] LustreError: 11-0: soaked-OST000a-osc-ffff8bc97a7fd000: operation ost_connect to node 192.168.1.106@o2ib failed: rc = -16
            [ 3597.318665] LustreError: 11-0: soaked-OST000e-osc-ffff8bc97a7fd000: operation ost_connect to node 192.168.1.106@o2ib failed: rc = -16
            Sep  4 22:01:01 soak-29 TIME: Time stamp for console
            [ 6457.411019] LustreError: 4511:0:(client.c:1082:ptlrpc_set_add_req()) ASSERTION( req->rq_import->imp_state != LUSTRE_IMP_IDLE ) failed:
            [ 6457.424658] LustreError: 4511:0:(client.c:1082:ptlrpc_set_add_req()) LBUG
            [ 6457.432264] Pid: 4511, comm: simul 3.10.0-862.9.1.el7.x86_64 #1 SMP Mon Jul 16 16:29:36 UTC 2018
            [ 6457.442107] Call Trace:
            [ 6457.444861]  [<ffffffffc0b907cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
            [ 6457.452209]  [<ffffffffc0b9087c>] lbug_with_loc+0x4c/0xa0 [libcfs]
            [ 6457.459158]  [<ffffffffc0ea9dec>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc]
            [ 6457.466718]  [<ffffffffc0eae735>] ptlrpc_queue_wait+0x75/0x220 [ptlrpc]
            [ 6457.474167]  [<ffffffffc0e936a2>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc]
            [ 6457.481618]  [<ffffffffc101e965>] osc_enqueue_base+0x2b5/0x6a0 [osc]
            [ 6457.488776]  [<ffffffffc102917b>] osc_lock_enqueue+0x38b/0x840 [osc]
            [ 6457.495916]  [<ffffffffc0ce4de5>] cl_lock_enqueue+0x65/0x120 [obdclass]
            [ 6457.503386]  [<ffffffffc0fcf285>] lov_lock_enqueue+0x95/0x150 [lov]
            [ 6457.510458]  [<ffffffffc0ce4de5>] cl_lock_enqueue+0x65/0x120 [obdclass]
            [ 6457.517903]  [<ffffffffc0ce5377>] cl_lock_request+0x67/0x1f0 [obdclass]
            [ 6457.525352]  [<ffffffffc0ce926b>] cl_io_lock+0x2bb/0x3d0 [obdclass]
            [ 6457.532408]  [<ffffffffc0ce95fb>] cl_io_loop+0x11b/0xc70 [obdclass]
            [ 6457.539477]  [<ffffffffc10c6db2>] ll_file_io_generic+0x4e2/0xd10 [lustre]
            [ 6457.547145]  [<ffffffffc10c7b32>] ll_file_aio_write+0x372/0x540 [lustre]
            [ 6457.554681]  [<ffffffffc10c7da4>] ll_file_write+0xa4/0x170 [lustre]
            [ 6457.561731]  [<ffffffffb1a1b490>] vfs_write+0xc0/0x1f0
            [ 6457.567514]  [<ffffffffb1a1c2bf>] SyS_write+0x7f/0xf0
            [ 6457.573197]  [<ffffffffb1f20795>] system_call_fastpath+0x1c/0x21
            [ 6457.579954]  [<ffffffffffffffff>] 0xffffffffffffffff
            [ 6457.585582] Kernel panic - not syncing: LBUG
            [ 6457.591152] CPU: 18 PID: 4511 Comm: simul Tainted: G          IOE  ------------   3.10.0-862.9.1.el7.x86_64 #1
            [ 6457.603125] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
            [ 6457.615578] Call Trace:
            [ 6457.619126]  [<ffffffffb1f0e84e>] dump_stack+0x19/0x1b
            [ 6457.625657]  [<ffffffffb1f08b50>] panic+0xe8/0x21f
            [ 6457.631785]  [<ffffffffc0b908cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
            [ 6457.639483]  [<ffffffffc0ea9dec>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc]
            [ 6457.647758]  [<ffffffffc0eae735>] ptlrpc_queue_wait+0x75/0x220 [ptlrpc]
            [ 6457.655965]  [<ffffffffc0e936a2>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc]
            [ 6457.664156]  [<ffffffffc0e8e7c0>] ? ldlm_expired_completion_wait+0x220/0x220 [ptlrpc]
            [ 6457.673637]  [<ffffffffc1028710>] ? osc_lock_lockless_cancel+0xe0/0xe0 [osc]
            [ 6457.682276]  [<ffffffffc1027ad0>] ? osc_lock_upcall+0x580/0x580 [osc]
            [ 6457.690247]  [<ffffffffc101e965>] osc_enqueue_base+0x2b5/0x6a0 [osc]
            [ 6457.698079]  [<ffffffffc1027550>] ? osc_lock_lvb_update+0x330/0x330 [osc]
            [ 6457.706337]  [<ffffffffc102917b>] osc_lock_enqueue+0x38b/0x840 [osc]
            [ 6457.714139]  [<ffffffffc1027550>] ? osc_lock_lvb_update+0x330/0x330 [osc]
            [ 6457.722390]  [<ffffffffc0ce4de5>] cl_lock_enqueue+0x65/0x120 [obdclass]
            [ 6457.730416]  [<ffffffffc0fcf285>] lov_lock_enqueue+0x95/0x150 [lov]
            [ 6457.738100]  [<ffffffffc0ce4de5>] cl_lock_enqueue+0x65/0x120 [obdclass]
            [ 6457.746155]  [<ffffffffc0ce5377>] cl_lock_request+0x67/0x1f0 [obdclass]
            [ 6457.754191]  [<ffffffffc0ce926b>] cl_io_lock+0x2bb/0x3d0 [obdclass]
            [ 6457.761754]  [<ffffffffc0ce95fb>] cl_io_loop+0x11b/0xc70 [obdclass]
            [ 6457.769300]  [<ffffffffc10c6db2>] ll_file_io_generic+0x4e2/0xd10 [lustre]
            [ 6457.777391]  [<ffffffffc10c7b32>] ll_file_aio_write+0x372/0x540 [lustre]
            [ 6457.785395]  [<ffffffffc10c7da4>] ll_file_write+0xa4/0x170 [lustre]
            [ 6457.792946]  [<ffffffffb1a1b490>] vfs_write+0xc0/0x1f0
            [ 6457.799194]  [<ffffffffb1a1c2bf>] SyS_write+0x7f/0xf0
            [ 6457.805311]  [<ffffffffb1f20795>] system_call_fastpath+0x1c/0x21
            [ 6457.812550] Kernel Offset: 0x30800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
            
            sarah Sarah Liu added a comment - Hit similar issue on soak, lustre-master-ib 2.11.54_91_g605a139 client dmesg [ 2388.749104] alg: No test for adler32 (adler32-zlib) [ 2389.628880] Lustre: Lustre: Build Version: 2.11.54_91_g605a139 [ 2389.806899] LNet: Using FMR for registration [ 2389.825192] LNet: Added LNI 192.168.1.129@o2ib [8/256/0/180] [ 3597.244535] LustreError: 11-0: soaked-MDT0003-mdc-ffff8bc97a7fd000: operation mds_connect to node 192.168.1.111@o2ib failed: rc = -16 [ 3597.258019] LustreError: 11-0: soaked-MDT0002-mdc-ffff8bc97a7fd000: operation mds_connect to node 192.168.1.110@o2ib failed: rc = -16 [ 3597.273043] LustreError: 11-0: soaked-OST0002-osc-ffff8bc97a7fd000: operation ost_connect to node 192.168.1.106@o2ib failed: rc = -16 [ 3597.275373] Lustre: Mounted soaked-client [ 3597.291219] LustreError: 11-0: soaked-OST0006-osc-ffff8bc97a7fd000: operation ost_connect to node 192.168.1.106@o2ib failed: rc = -16 [ 3597.304919] LustreError: 11-0: soaked-OST000a-osc-ffff8bc97a7fd000: operation ost_connect to node 192.168.1.106@o2ib failed: rc = -16 [ 3597.318665] LustreError: 11-0: soaked-OST000e-osc-ffff8bc97a7fd000: operation ost_connect to node 192.168.1.106@o2ib failed: rc = -16 Sep 4 22:01:01 soak-29 TIME: Time stamp for console [ 6457.411019] LustreError: 4511:0:(client.c:1082:ptlrpc_set_add_req()) ASSERTION( req->rq_import->imp_state != LUSTRE_IMP_IDLE ) failed: [ 6457.424658] LustreError: 4511:0:(client.c:1082:ptlrpc_set_add_req()) LBUG [ 6457.432264] Pid: 4511, comm: simul 3.10.0-862.9.1.el7.x86_64 #1 SMP Mon Jul 16 16:29:36 UTC 2018 [ 6457.442107] Call Trace: [ 6457.444861] [<ffffffffc0b907cc>] libcfs_call_trace+0x8c/0xc0 [libcfs] [ 6457.452209] [<ffffffffc0b9087c>] lbug_with_loc+0x4c/0xa0 [libcfs] [ 6457.459158] [<ffffffffc0ea9dec>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc] [ 6457.466718] [<ffffffffc0eae735>] ptlrpc_queue_wait+0x75/0x220 [ptlrpc] [ 6457.474167] [<ffffffffc0e936a2>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc] [ 6457.481618] [<ffffffffc101e965>] osc_enqueue_base+0x2b5/0x6a0 [osc] [ 6457.488776] [<ffffffffc102917b>] osc_lock_enqueue+0x38b/0x840 [osc] [ 6457.495916] [<ffffffffc0ce4de5>] cl_lock_enqueue+0x65/0x120 [obdclass] [ 6457.503386] [<ffffffffc0fcf285>] lov_lock_enqueue+0x95/0x150 [lov] [ 6457.510458] [<ffffffffc0ce4de5>] cl_lock_enqueue+0x65/0x120 [obdclass] [ 6457.517903] [<ffffffffc0ce5377>] cl_lock_request+0x67/0x1f0 [obdclass] [ 6457.525352] [<ffffffffc0ce926b>] cl_io_lock+0x2bb/0x3d0 [obdclass] [ 6457.532408] [<ffffffffc0ce95fb>] cl_io_loop+0x11b/0xc70 [obdclass] [ 6457.539477] [<ffffffffc10c6db2>] ll_file_io_generic+0x4e2/0xd10 [lustre] [ 6457.547145] [<ffffffffc10c7b32>] ll_file_aio_write+0x372/0x540 [lustre] [ 6457.554681] [<ffffffffc10c7da4>] ll_file_write+0xa4/0x170 [lustre] [ 6457.561731] [<ffffffffb1a1b490>] vfs_write+0xc0/0x1f0 [ 6457.567514] [<ffffffffb1a1c2bf>] SyS_write+0x7f/0xf0 [ 6457.573197] [<ffffffffb1f20795>] system_call_fastpath+0x1c/0x21 [ 6457.579954] [<ffffffffffffffff>] 0xffffffffffffffff [ 6457.585582] Kernel panic - not syncing: LBUG [ 6457.591152] CPU: 18 PID: 4511 Comm: simul Tainted: G IOE ------------ 3.10.0-862.9.1.el7.x86_64 #1 [ 6457.603125] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015 [ 6457.615578] Call Trace: [ 6457.619126] [<ffffffffb1f0e84e>] dump_stack+0x19/0x1b [ 6457.625657] [<ffffffffb1f08b50>] panic+0xe8/0x21f [ 6457.631785] [<ffffffffc0b908cb>] lbug_with_loc+0x9b/0xa0 [libcfs] [ 6457.639483] [<ffffffffc0ea9dec>] ptlrpc_set_add_req+0xfc/0x100 [ptlrpc] [ 6457.647758] [<ffffffffc0eae735>] ptlrpc_queue_wait+0x75/0x220 [ptlrpc] [ 6457.655965] [<ffffffffc0e936a2>] ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc] [ 6457.664156] [<ffffffffc0e8e7c0>] ? ldlm_expired_completion_wait+0x220/0x220 [ptlrpc] [ 6457.673637] [<ffffffffc1028710>] ? osc_lock_lockless_cancel+0xe0/0xe0 [osc] [ 6457.682276] [<ffffffffc1027ad0>] ? osc_lock_upcall+0x580/0x580 [osc] [ 6457.690247] [<ffffffffc101e965>] osc_enqueue_base+0x2b5/0x6a0 [osc] [ 6457.698079] [<ffffffffc1027550>] ? osc_lock_lvb_update+0x330/0x330 [osc] [ 6457.706337] [<ffffffffc102917b>] osc_lock_enqueue+0x38b/0x840 [osc] [ 6457.714139] [<ffffffffc1027550>] ? osc_lock_lvb_update+0x330/0x330 [osc] [ 6457.722390] [<ffffffffc0ce4de5>] cl_lock_enqueue+0x65/0x120 [obdclass] [ 6457.730416] [<ffffffffc0fcf285>] lov_lock_enqueue+0x95/0x150 [lov] [ 6457.738100] [<ffffffffc0ce4de5>] cl_lock_enqueue+0x65/0x120 [obdclass] [ 6457.746155] [<ffffffffc0ce5377>] cl_lock_request+0x67/0x1f0 [obdclass] [ 6457.754191] [<ffffffffc0ce926b>] cl_io_lock+0x2bb/0x3d0 [obdclass] [ 6457.761754] [<ffffffffc0ce95fb>] cl_io_loop+0x11b/0xc70 [obdclass] [ 6457.769300] [<ffffffffc10c6db2>] ll_file_io_generic+0x4e2/0xd10 [lustre] [ 6457.777391] [<ffffffffc10c7b32>] ll_file_aio_write+0x372/0x540 [lustre] [ 6457.785395] [<ffffffffc10c7da4>] ll_file_write+0xa4/0x170 [lustre] [ 6457.792946] [<ffffffffb1a1b490>] vfs_write+0xc0/0x1f0 [ 6457.799194] [<ffffffffb1a1c2bf>] SyS_write+0x7f/0xf0 [ 6457.805311] [<ffffffffb1f20795>] system_call_fastpath+0x1c/0x21 [ 6457.812550] Kernel Offset: 0x30800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

            People

              bzzz Alex Zhuravlev
              green Oleg Drokin
              Votes:
              1 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: