  1. Lustre
  2. LU-12582

lustre 2.10.3 QoS nrs_tbf_rule_match LBUG


Details

    • Bug
    • Resolution: Unresolved
    • Critical
    • None
    • Lustre 2.10.3
    • None
    • Lustre 2.10.3, CentOS 7.4, kernel 3.10.0-693, OFED 4.2
    • 1

    Description

      Hi all. The kernel assertion 'LASSERT((tmp_rule->tr_flags & NTRS_STOPPING) == 0)' is triggered in the function 'static struct nrs_tbf_rule *nrs_tbf_rule_match(struct nrs_tbf_head *head, struct nrs_tbf_client *cli)' in lustre/ptlrpc/nrs_tbf.c.
       
      It seems that either the tr_flags of some rule has NTRS_STOPPING set, or the variable tmp_rule holds a bad value. However, I have never run an lctl command such as 'lctl set_param mds.MDS.mdt.nrs_tbf_rule="stop rule1"'.
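
      To make the failing condition concrete, here is a minimal userspace sketch of the check. It is not the Lustre source: the struct model_rule type and the functions model_find_stopping and model_assert_clean are hypothetical names used only for illustration, and the NTRS_STOPPING value is an assumption taken from the 0x0000001 mask printed in the assertion message below.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: matches the 0x0000001 mask shown in the assertion message. */
#define NTRS_STOPPING 0x0000001u

/* Hypothetical, stripped-down stand-in for struct nrs_tbf_rule. */
struct model_rule {
	const char *name;          /* rule name, e.g. "rule1" */
	unsigned int flags;        /* models tr_flags */
	struct model_rule *next;   /* singly linked for brevity */
};

/*
 * Diagnostic walk: return the first rule that carries the STOPPING bit,
 * or NULL if no rule in the list has it set.
 */
static const struct model_rule *
model_find_stopping(const struct model_rule *head)
{
	const struct model_rule *r;

	for (r = head; r != NULL; r = r->next) {
		if ((r->flags & NTRS_STOPPING) != 0)
			return r;
	}
	return NULL;
}

/*
 * Asserting walk: models a per-rule check of the same shape as the
 * reported LASSERT. It aborts on the first rule with the STOPPING bit.
 */
static void model_assert_clean(const struct model_rule *head)
{
	const struct model_rule *r;

	for (r = head; r != NULL; r = r->next)
		assert((r->flags & NTRS_STOPPING) == 0);
}
```

      In this model, calling model_find_stopping on a list where one rule has flags set to NTRS_STOPPING returns that rule, and model_assert_clean aborts on the same list. A rule that is still linked in the list while flagged as stopping, or a tmp_rule pointing at freed or corrupted memory whose low bit happens to be set, would both produce the same symptom.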
       
      Here is the call trace:

      [1139547.817517] LustreError: 20086:0:(nrs_tbf.c:235:nrs_tbf_rule_match()) ASSERTION( (tmp_rule->tr_flags & 0x0000001) == 0 ) failed:
      [1139547.817540] LustreError: 20086:0:(nrs_tbf.c:235:nrs_tbf_rule_match()) LBUG
      [1139547.817544] Pid: 20086, comm: mdt00_068
      [1139547.817547]
      Call Trace:
      [1139547.817586]  [<ffffffffc09d57ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
      [1139547.817602]  [<ffffffffc09d583c>] lbug_with_loc+0x4c/0xb0 [libcfs]
      [1139547.817700]  [<ffffffffc0f804c5>] nrs_tbf_rule_match+0xc5/0xd0 [ptlrpc]
      [1139547.817780]  [<ffffffffc0f834ad>] nrs_tbf_res_get+0xad/0x4c0 [ptlrpc]
      [1139547.817852]  [<ffffffffc0f7621c>] nrs_resource_get+0x7c/0x100 [ptlrpc]
      [1139547.817922]  [<ffffffffc0f76790>] nrs_resource_get_safe+0x80/0xf0 [ptlrpc]
      [1139547.817993]  [<ffffffffc0f7a263>] ptlrpc_nrs_req_initialize+0x83/0x100 [ptlrpc]
      [1139547.818059]  [<ffffffffc0f48f31>] ptlrpc_main+0x1771/0x1e40 [ptlrpc]
      [1139547.818125]  [<ffffffffc0f477c0>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
      [1139547.818134]  [<ffffffff810b252f>] kthread+0xcf/0xe0
      [1139547.818141]  [<ffffffff810b2460>] ? kthread+0x0/0xe0
      [1139547.818149]  [<ffffffff816b8798>] ret_from_fork+0x58/0x90
      [1139547.818155]  [<ffffffff810b2460>] ? kthread+0x0/0xe0
      [1139547.818159]
      [1139547.818162] Kernel panic - not syncing: LBUG
      [1139547.818212] CPU: 16 PID: 20086 Comm: mdt00_068 Tainted: G           OEL ------------   3.10.0-693.11.6.el7_lustre.x86_64 #1
      [1139547.818355] Call Trace:
      [1139547.818385]  [<ffffffff816a5e7d>] dump_stack+0x19/0x1b
      [1139547.818433]  [<ffffffff8169fd64>] panic+0xe8/0x20d
      [1139547.818492]  [<ffffffffc09d5854>] lbug_with_loc+0x64/0xb0 [libcfs]
      [1139547.818611]  [<ffffffffc0f804c5>] nrs_tbf_rule_match+0xc5/0xd0 [ptlrpc]
      [1139547.818732]  [<ffffffffc0f834ad>] nrs_tbf_res_get+0xad/0x4c0 [ptlrpc]
      [1139547.818848]  [<ffffffffc0f7621c>] nrs_resource_get+0x7c/0x100 [ptlrpc]
      [1139547.818965]  [<ffffffffc0f76790>] nrs_resource_get_safe+0x80/0xf0 [ptlrpc]
      [1139547.819084]  [<ffffffffc0f7a263>] ptlrpc_nrs_req_initialize+0x83/0x100 [ptlrpc]
      [1139547.819203]  [<ffffffffc0f48f31>] ptlrpc_main+0x1771/0x1e40 [ptlrpc]
      [1139547.819316]  [<ffffffffc0f477c0>] ? ptlrpc_register_service+0xe30/0xe30 [ptlrpc]
      [1139547.819376]  [<ffffffff810b252f>] kthread+0xcf/0xe0
      [1139547.819419]  [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
      [1139547.819470]  [<ffffffff816b8798>] ret_from_fork+0x58/0x90
      [1139547.819516]  [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
      

       

          People

            wc-triage WC Triage
            anhua anhua (Inactive)