[LU-12343] osc_lock.c:687:osc_ldlm_weigh_ast()) LBUG

    Description

      During an mdtest run from 10 clients against a directory configured with DNE2/DoM, a few clients crashed at the same time with the LBUG below.

      [403683.467928] LustreError: 11-0: cache1-MDT0000-mdc-ffff89de6e235000: operation mds_connect to node 10.0.10.175@o2ib10 failed: rc = -16
      [403709.580276] Lustre: Mounted cache1-client
      [403807.677757] LustreError: 192778:0:(osc_lock.c:687:osc_ldlm_weigh_ast()) ASSERTION( dlmlock->l_resource->lr_type == LDLM_EXTENT || ldlm_has_dom(dlmlock) ) failed: 
      [403807.681557] LustreError: 192778:0:(osc_lock.c:687:osc_ldlm_weigh_ast()) LBUG
      [403807.683405] Pid: 192778, comm: mdtest 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 2019
      [403807.683407] Call Trace:
      [403807.683419]  [<ffffffffc08867cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [403807.683446]  [<ffffffffc088687c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [403807.683452]  [<ffffffffc0d4b947>] osc_ldlm_weigh_ast+0x377/0x3a0 [osc]
      [403807.683474]  [<ffffffffc0d9aa91>] mdc_cancel_weight+0xe1/0x130 [mdc]
      [403807.683484]  [<ffffffffc0bb10d1>] ldlm_cancel_no_wait_policy+0x51/0x80 [ptlrpc]
      [403807.683527]  [<ffffffffc0bb1118>] ldlm_cancel_aged_no_wait_policy+0x18/0x70 [ptlrpc]
      [403807.683546]  [<ffffffffc0bb25ba>] ldlm_prepare_lru_list+0x1fa/0x4c0 [ptlrpc]
      [403807.683565]  [<ffffffffc0bb5f1a>] ldlm_cancel_lru_local+0x1a/0x30 [ptlrpc]
      [403807.683584]  [<ffffffffc0bb614e>] ldlm_prep_elc_req+0x21e/0x490 [ptlrpc]
      [403807.683605]  [<ffffffffc0bb63e8>] ldlm_prep_enqueue_req+0x28/0x30 [ptlrpc]
      [403807.683624]  [<ffffffffc0da9f55>] mdc_intent_getattr_pack.isra.15+0xa5/0x3a0 [mdc]
      [403807.683632]  [<ffffffffc0dace12>] mdc_enqueue_base+0x532/0x1500 [mdc]
      [403807.683640]  [<ffffffffc0dae545>] mdc_intent_lock+0x135/0x560 [mdc]
      [403807.683649]  [<ffffffffc0deb962>] lmv_intent_lock+0x472/0xaf0 [lmv]
      [403807.683659]  [<ffffffffc0e4760a>] ll_lookup_it+0x3aa/0x1910 [lustre]
      [403807.683685]  [<ffffffffc0e49fbb>] ll_lookup_nd+0xbb/0x190 [lustre]
      [403807.683696]  [<ffffffffa9a4c5b3>] lookup_real+0x23/0x60
      [403807.683702]  [<ffffffffa9a4cfd2>] __lookup_hash+0x42/0x60
      [403807.683705]  [<ffffffffa9a538bc>] do_unlinkat+0x14c/0x2d0
      [403807.683710]  [<ffffffffa9a549d6>] SyS_unlink+0x16/0x20
      [403807.683715]  [<ffffffffa9f75ddb>] system_call_fastpath+0x22/0x27
      [403807.683722]  [<ffffffffffffffff>] 0xffffffffffffffff
      [403807.683756] Kernel panic - not syncing: LBUG
      [403807.685577] CPU: 7 PID: 192778 Comm: mdtest Kdump: loaded Tainted: G           OEL ------------   3.10.0-957.10.1.el7.x86_64 #1
      [403807.689195] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS SE5C610.86B.01.01.0027.071020182329 07/10/2018
      [403807.691049] Call Trace:
      [403807.692858]  [<ffffffffa9f62e41>] dump_stack+0x19/0x1b
      [403807.694647]  [<ffffffffa9f5c550>] panic+0xe8/0x21f
      [403807.696419]  [<ffffffffc08868cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [403807.698183]  [<ffffffffc0d4b947>] osc_ldlm_weigh_ast+0x377/0x3a0 [osc]
      [403807.699954]  [<ffffffffc0d9aa91>] mdc_cancel_weight+0xe1/0x130 [mdc]
      [403807.701740]  [<ffffffffc0bb10d1>] ldlm_cancel_no_wait_policy+0x51/0x80 [ptlrpc]
      [403807.703481]  [<ffffffffc0bb1118>] ldlm_cancel_aged_no_wait_policy+0x18/0x70 [ptlrpc]
      [403807.706888]  [<ffffffffc0bb25ba>] ldlm_prepare_lru_list+0x1fa/0x4c0 [ptlrpc]
      [403807.710249]  [<ffffffffc0bb5f1a>] ldlm_cancel_lru_local+0x1a/0x30 [ptlrpc]
      [403807.711919]  [<ffffffffc0bb614e>] ldlm_prep_elc_req+0x21e/0x490 [ptlrpc]
      [403807.713536]  [<ffffffffc0bb63e8>] ldlm_prep_enqueue_req+0x28/0x30 [ptlrpc]
      [403807.715110]  [<ffffffffc0da9f55>] mdc_intent_getattr_pack.isra.15+0xa5/0x3a0 [mdc]
      [403807.716683]  [<ffffffffc0dace12>] mdc_enqueue_base+0x532/0x1500 [mdc]
      [403807.721334]  [<ffffffffc0dae545>] mdc_intent_lock+0x135/0x560 [mdc]
      [403807.727177]  [<ffffffffc0deb962>] lmv_intent_lock+0x472/0xaf0 [lmv]
      [403807.735265]  [<ffffffffc0e4760a>] ll_lookup_it+0x3aa/0x1910 [lustre]
      [403807.744991]  [<ffffffffc0e49fbb>] ll_lookup_nd+0xbb/0x190 [lustre]
      [403807.746080]  [<ffffffffa9a4c5b3>] lookup_real+0x23/0x60
      [403807.747136]  [<ffffffffa9a4cfd2>] __lookup_hash+0x42/0x60
      [403807.748161]  [<ffffffffa9a538bc>] do_unlinkat+0x14c/0x2d0
      [403807.750112]  [<ffffffffa9a549d6>] SyS_unlink+0x16/0x20
      [403807.751057]  [<ffffffffa9f75ddb>] system_call_fastpath+0x22/0x27
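For triage, a call trace like the one above can be reduced to its chain of functions and modules. The following is a minimal sketch, assuming the dmesg frame layout seen in this ticket (`[timestamp] [<address>] symbol+0xoffset/0xsize [module]`, with an optional `?` marking a frame the kernel unwinder considers unreliable). The names `FRAME_RE`, `Frame` and `parse_frames` are illustrative and not part of any Lustre or kernel tool; the sample lines are copied from this ticket.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed frame layout: "[ts] [<addr>] [? ]symbol+0xoff/0xsize [module]".
# Lines that do not fit (e.g. the bare "0xffffffffffffffff" entry) are skipped.
FRAME_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+"                    # dmesg timestamp
    r"\[<[0-9a-f]+>\]\s+"                    # return address
    r"(?P<unreliable>\?\s+)?"                # optional "?" marker
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"           # module name, absent for core kernel
)


class Frame(NamedTuple):
    symbol: str
    module: Optional[str]
    reliable: bool


def parse_frames(log: str) -> List[Frame]:
    """Return one Frame per call-trace line found in the log text."""
    frames = []
    for line in log.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(
                Frame(m["symbol"], m["module"], m["unreliable"] is None)
            )
    return frames


SAMPLE = """\
[403807.683452]  [<ffffffffc0d4b947>] osc_ldlm_weigh_ast+0x377/0x3a0 [osc]
[403807.683474]  [<ffffffffc0d9aa91>] mdc_cancel_weight+0xe1/0x130 [mdc]
[403807.705194] [<ffffffffc0ba0f18>] ? ldlm_lock_remove_from_lru_check+0x158/0x1a0 [ptlrpc]
[403807.683696]  [<ffffffffa9a4c5b3>] lookup_real+0x23/0x60
[403807.683722]  [<ffffffffffffffff>] 0xffffffffffffffff
"""

frames = parse_frames(SAMPLE)
for f in frames:
    marker = "" if f.reliable else " (unreliable)"
    print(f"{f.symbol} [{f.module or 'kernel'}]{marker}")
```

Keeping only the entries with `reliable` set to true leaves the call chain that leads from the `unlink` system call down to `osc_ldlm_weigh_ast`.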
      

        Activity

          [LU-12343] osc_lock.c:687:osc_ldlm_weigh_ast()) LBUG
          pjones Peter Jones made changes -
          Fix Version/s New: Lustre 2.12.3 [ 14418 ]
          pjones Peter Jones made changes -
          Labels Original: LTS12 ORNL New: ORNL
          pjones Peter Jones made changes -
          Link Original: This issue is related to JFC-17 [ JFC-17 ]
          pjones Peter Jones made changes -
          Link New: This issue is related to JFC-20 [ JFC-20 ]
          pjones Peter Jones made changes -
          Link New: This issue is related to JFC-17 [ JFC-17 ]
          pjones Peter Jones made changes -
          Labels Original: ORNL New: LTS12 ORNL
          pjones Peter Jones made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]
          adilger Andreas Dilger made changes -
          Description Original: during mdtest from 10 clients to a directory which configured with DNE2/DoM, few clients crashed at same time below.
          {noformat}
          [403683.467928] LustreError: 11-0: cache1-MDT0000-mdc-ffff89de6e235000: operation mds_connect to node 10.0.10.175@o2ib10 failed: rc = -16
          [403709.580276] Lustre: Mounted cache1-client
          [403807.677757] LustreError: 192778:0:(osc_lock.c:687:osc_ldlm_weigh_ast()) ASSERTION( dlmlock->l_resource->lr_type == LDLM_EXTENT || ldlm_has_dom(dlmlock) ) failed:
          [403807.681557] LustreError: 192778:0:(osc_lock.c:687:osc_ldlm_weigh_ast()) LBUG
          [403807.683405] Pid: 192778, comm: mdtest 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 2019
          [403807.683407] Call Trace:
          [403807.683419] [<ffffffffc08867cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
          [403807.683446] [<ffffffffc088687c>] lbug_with_loc+0x4c/0xa0 [libcfs]
          [403807.683452] [<ffffffffc0d4b947>] osc_ldlm_weigh_ast+0x377/0x3a0 [osc]
          [403807.683474] [<ffffffffc0d9aa91>] mdc_cancel_weight+0xe1/0x130 [mdc]
          [403807.683484] [<ffffffffc0bb10d1>] ldlm_cancel_no_wait_policy+0x51/0x80 [ptlrpc]
          [403807.683527] [<ffffffffc0bb1118>] ldlm_cancel_aged_no_wait_policy+0x18/0x70 [ptlrpc]
          [403807.683546] [<ffffffffc0bb25ba>] ldlm_prepare_lru_list+0x1fa/0x4c0 [ptlrpc]
          [403807.683565] [<ffffffffc0bb5f1a>] ldlm_cancel_lru_local+0x1a/0x30 [ptlrpc]
          [403807.683584] [<ffffffffc0bb614e>] ldlm_prep_elc_req+0x21e/0x490 [ptlrpc]
          [403807.683605] [<ffffffffc0bb63e8>] ldlm_prep_enqueue_req+0x28/0x30 [ptlrpc]
          [403807.683624] [<ffffffffc0da9f55>] mdc_intent_getattr_pack.isra.15+0xa5/0x3a0 [mdc]
          [403807.683632] [<ffffffffc0dace12>] mdc_enqueue_base+0x532/0x1500 [mdc]
          [403807.683640] [<ffffffffc0dae545>] mdc_intent_lock+0x135/0x560 [mdc]
          [403807.683649] [<ffffffffc0deb962>] lmv_intent_lock+0x472/0xaf0 [lmv]
          [403807.683659] [<ffffffffc0e4760a>] ll_lookup_it+0x3aa/0x1910 [lustre]
          [403807.683685] [<ffffffffc0e49fbb>] ll_lookup_nd+0xbb/0x190 [lustre]
          [403807.683696] [<ffffffffa9a4c5b3>] lookup_real+0x23/0x60
          [403807.683702] [<ffffffffa9a4cfd2>] __lookup_hash+0x42/0x60
          [403807.683705] [<ffffffffa9a538bc>] do_unlinkat+0x14c/0x2d0
          [403807.683710] [<ffffffffa9a549d6>] SyS_unlink+0x16/0x20
          [403807.683715] [<ffffffffa9f75ddb>] system_call_fastpath+0x22/0x27
          [403807.683722] [<ffffffffffffffff>] 0xffffffffffffffff
          [403807.683756] Kernel panic - not syncing: LBUG
          [403807.685577] CPU: 7 PID: 192778 Comm: mdtest Kdump: loaded Tainted: G OEL ------------ 3.10.0-957.10.1.el7.x86_64 #1
          [403807.689195] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS SE5C610.86B.01.01.0027.071020182329 07/10/2018
          [403807.691049] Call Trace:
          [403807.692858] [<ffffffffa9f62e41>] dump_stack+0x19/0x1b
          [403807.694647] [<ffffffffa9f5c550>] panic+0xe8/0x21f
          [403807.696419] [<ffffffffc08868cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
          [403807.698183] [<ffffffffc0d4b947>] osc_ldlm_weigh_ast+0x377/0x3a0 [osc]
          [403807.699954] [<ffffffffc0d9aa91>] mdc_cancel_weight+0xe1/0x130 [mdc]
          [403807.701740] [<ffffffffc0bb10d1>] ldlm_cancel_no_wait_policy+0x51/0x80 [ptlrpc]
          [403807.703481] [<ffffffffc0bb1118>] ldlm_cancel_aged_no_wait_policy+0x18/0x70 [ptlrpc]
          [403807.705194] [<ffffffffc0ba0f18>] ? ldlm_lock_remove_from_lru_check+0x158/0x1a0 [ptlrpc]
          [403807.706888] [<ffffffffc0bb25ba>] ldlm_prepare_lru_list+0x1fa/0x4c0 [ptlrpc]
          [403807.708576] [<ffffffffc0bb1100>] ? ldlm_cancel_no_wait_policy+0x80/0x80 [ptlrpc]
          [403807.710249] [<ffffffffc0bb5f1a>] ldlm_cancel_lru_local+0x1a/0x30 [ptlrpc]
          [403807.711919] [<ffffffffc0bb614e>] ldlm_prep_elc_req+0x21e/0x490 [ptlrpc]
          [403807.713536] [<ffffffffc0bb63e8>] ldlm_prep_enqueue_req+0x28/0x30 [ptlrpc]
          [403807.715110] [<ffffffffc0da9f55>] mdc_intent_getattr_pack.isra.15+0xa5/0x3a0 [mdc]
          [403807.716683] [<ffffffffc0dace12>] mdc_enqueue_base+0x532/0x1500 [mdc]
          [403807.718271] [<ffffffffc0a22781>] ? lprocfs_counter_sub+0xc1/0x130 [obdclass]
          [403807.719812] [<ffffffffc0e47c23>] ? ll_lookup_it+0x9c3/0x1910 [lustre]
          [403807.721334] [<ffffffffc0dae545>] mdc_intent_lock+0x135/0x560 [mdc]
          [403807.722814] [<ffffffffc0e46ba0>] ? ll_md_need_convert+0x1b0/0x1b0 [lustre]
          [403807.724291] [<ffffffffc0bb1ae0>] ? ldlm_expired_completion_wait+0x220/0x220 [ptlrpc]
          [403807.725744] [<ffffffffc0db18e0>] ? mdc_changelog_cdev_finish+0x1c0/0x1c0 [mdc]
          [403807.727177] [<ffffffffc0deb962>] lmv_intent_lock+0x472/0xaf0 [lmv]
          [403807.728578] [<ffffffffa992e262>] ? from_kgid+0x12/0x20
          [403807.729954] [<ffffffffc0e46e87>] ? ll_i2suppgid+0x37/0x40 [lustre]
          [403807.731323] [<ffffffffc0e46eb4>] ? ll_i2gids+0x24/0xb0 [lustre]
          [403807.732655] [<ffffffffa992e262>] ? from_kgid+0x12/0x20
          [403807.733968] [<ffffffffc0e46ba0>] ? ll_md_need_convert+0x1b0/0x1b0 [lustre]
          [403807.735265] [<ffffffffc0e4760a>] ll_lookup_it+0x3aa/0x1910 [lustre]
          [403807.736552] [<ffffffffc0e16893>] ? ll_inode_permission+0x93/0x3c0 [lustre]
          [403807.737817] [<ffffffffa9af9252>] ? security_inode_permission+0x22/0x30
          [403807.739054] [<ffffffffa9a4d472>] ? __inode_permission+0x52/0xd0
          [403807.740259] [<ffffffffa9a4d508>] ? inode_permission+0x18/0x50
          [403807.741452] [<ffffffffa9a5134e>] ? link_path_walk+0x27e/0x8b0
          [403807.742653] [<ffffffffc0a48550>] ? cl_env_put+0x140/0x1d0 [obdclass]
          [403807.743865] [<ffffffffc0e5e4b0>] ? cl_inode_fini+0x90/0x1c0 [lustre]
          [403807.744991] [<ffffffffc0e49fbb>] ll_lookup_nd+0xbb/0x190 [lustre]
          [403807.746080] [<ffffffffa9a4c5b3>] lookup_real+0x23/0x60
          [403807.747136] [<ffffffffa9a4cfd2>] __lookup_hash+0x42/0x60
          [403807.748161] [<ffffffffa9a538bc>] do_unlinkat+0x14c/0x2d0
          [403807.749153] [<ffffffffa9a5312d>] ? putname+0x3d/0x60
          [403807.750112] [<ffffffffa9a549d6>] SyS_unlink+0x16/0x20
          [403807.751057] [<ffffffffa9f75ddb>] system_call_fastpath+0x22/0x27
          {noformat}
          New: the same text and log, with the unreliable frames (those marked "?") removed from the second call trace.
          simmonsja James A Simmons made changes -
          Labels New: ORNL
          pjones Peter Jones made changes -
          Fix Version/s New: Lustre 2.13.0 [ 14290 ]

          People

            pfarrell Patrick Farrell (Inactive)
            sihara Shuichi Ihara
            Votes: 0
            Watchers: 6
