Lustre / LU-7448

sleeping under spinlock somewhere in nrs/tbf code

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor

    Description

      I got this telltale complaint in sanityn test 77e:

      .781807] BUG: spinlock wrong CPU on CPU#1, lctl/29799 (Not tainted)
      .782638]  lock: ffff880055277d08, .magic: dead4ead, .owner: lctl/29799, .owner_c
      .784053] Pid: 29799, comm: lctl Not tainted 2.6.32-rhe6.7-debug #1
      .784828] Call Trace:
      .785635]  [<ffffffff812a06fa>] ? spin_bug+0xaa/0x100
      .786402]  [<ffffffff812a07c6>] ? _raw_spin_unlock+0x76/0xa0
      .787184]  [<ffffffff81530afe>] ? _spin_unlock+0xe/0x10
      .787989]  [<ffffffffa153faa4>] ? nrs_policy_ctl+0xd4/0x2e0 [ptlrpc]
      .788835]  [<ffffffffa15414f2>] ? ptlrpc_nrs_policy_control+0xe2/0x2a0 [ptlrpc]
      .790282]  [<ffffffffa1522876>] ? ptlrpc_lprocfs_nrs_seq_write+0x3e6/0x600 [ptlrp
      .791762]  [<ffffffffa1522490>] ? ptlrpc_lprocfs_nrs_seq_write+0x0/0x600 [ptlrpc]
      .794101]  [<ffffffff811ff945>] ? proc_reg_write+0x85/0xc0
      .794872]  [<ffffffff81192f48>] ? vfs_write+0xb8/0x1a0
      .795616]  [<ffffffff811943f6>] ? fget_light_pos+0x16/0x50
      .796356]  [<ffffffff81193881>] ? sys_write+0x51/0xb0
      .797114]  [<ffffffff815312ee>] ? do_device_not_available+0xe/0x10
      .797878]  [<ffffffff8100b112>] ? system_call_fastpath+0x16/0x1b
      

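      This means that something in nrs_policy_ctl() slept while holding nrs->nrs_lock. By the time the task woke up, it had been rescheduled onto a different CPU, so the unlock ran on a CPU other than the one that took the lock, which is what the spinlock debug check complains about.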
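
To illustrate why sleeping with the lock held produces the "wrong CPU" message, here is a minimal userspace sketch of an owner-CPU check. It is a simplified model, not the kernel's spinlock debug code: the struct, the dbg_* functions and critical_section() are hypothetical names, and the "sleep" is modelled simply as the task continuing on another CPU number. Only the magic value dead4ead is taken from the log above.

```c
#include <assert.h>
#include <stdio.h>

#define SPINLOCK_MAGIC 0xdead4eadu  /* same .magic value as in the log */
#define NO_CPU (-1)

enum dbg_result { DBG_OK = 0, DBG_BAD_MAGIC, DBG_WRONG_CPU };

/* Hypothetical, simplified model of a debug spinlock. */
struct dbg_spinlock {
    unsigned int magic;
    int locked;
    int owner_cpu;   /* CPU that took the lock, NO_CPU when free */
};

static void dbg_spin_init(struct dbg_spinlock *l)
{
    l->magic = SPINLOCK_MAGIC;
    l->locked = 0;
    l->owner_cpu = NO_CPU;
}

/* Take the lock and remember which CPU took it. */
static void dbg_spin_lock(struct dbg_spinlock *l, int cpu)
{
    assert(!l->locked);
    l->locked = 1;
    l->owner_cpu = cpu;
}

/* Release the lock, reporting a mismatch between locking and unlocking CPU. */
static enum dbg_result dbg_spin_unlock(struct dbg_spinlock *l, int cpu)
{
    enum dbg_result res = DBG_OK;

    if (l->magic != SPINLOCK_MAGIC) {
        res = DBG_BAD_MAGIC;
    } else if (l->owner_cpu != cpu) {
        fprintf(stderr, "spinlock wrong CPU on CPU#%d (locked on CPU#%d)\n",
                cpu, l->owner_cpu);
        res = DBG_WRONG_CPU;
    }
    l->locked = 0;
    l->owner_cpu = NO_CPU;
    return res;
}

/*
 * Model of a critical section. If the code inside never sleeps, the task
 * stays on start_cpu. If it sleeps, the scheduler may resume it elsewhere,
 * which is modelled by passing a different cpu_after value.
 */
static enum dbg_result critical_section(struct dbg_spinlock *l,
                                        int start_cpu, int cpu_after)
{
    int cpu = start_cpu;

    dbg_spin_lock(l, cpu);
    cpu = cpu_after;            /* stand-in for "slept and was migrated" */
    return dbg_spin_unlock(l, cpu);
}
```

In this model, critical_section(&lock, 0, 0) passes the check, while critical_section(&lock, 0, 1) is flagged as DBG_WRONG_CPU. Note that the check fires at unlock time, so the trace points at the unlock in nrs_policy_ctl() rather than at the call that actually slept.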

      I am only running RHEL6 at the moment, so this is the best I have, and I do not see any obvious culprit right away.
      This probably needs to be rerun on RHEL7, which catches such offenders much better.
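
The ticket does not say which mechanism makes the newer kernel better at catching offenders. Assuming it refers to a sleep-in-atomic-context check of the kind mainline Linux provides through might_sleep() with CONFIG_DEBUG_ATOMIC_SLEEP, the idea can be sketched in userspace as follows. All names here (task_ctx, ctx_*, blocking_alloc) are hypothetical, and this is a model of the technique, not kernel code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-task state for the model. */
struct task_ctx {
    int atomic_depth;  /* number of spinlocks currently held */
    int violations;    /* sleeping calls made while a lock was held */
};

static void ctx_spin_lock(struct task_ctx *t)
{
    t->atomic_depth++;
}

static void ctx_spin_unlock(struct task_ctx *t)
{
    assert(t->atomic_depth > 0);
    t->atomic_depth--;
}

/*
 * Check placed at the top of every function that may block.
 * Returns 1 and records a violation if a spinlock is held.
 */
static int ctx_might_sleep(struct task_ctx *t, const char *where)
{
    if (t->atomic_depth > 0) {
        t->violations++;
        fprintf(stderr, "sleeping call %s made with %d lock(s) held\n",
                where, t->atomic_depth);
        return 1;
    }
    return 0;
}

/* Stand-in for any call that may block, such as a blocking allocation. */
static void blocking_alloc(struct task_ctx *t)
{
    ctx_might_sleep(t, "blocking_alloc");
    /* the real work would happen here */
}
```

The difference from the owner-CPU check is that this one reports the sleeping call itself, at the point where it is made, even if the task is never migrated. That would identify the culprit directly instead of only showing the later unlock.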

            People

              Assignee: emoly.liu Emoly Liu
              Reporter: green Oleg Drokin