Lustre / LU-11998

ASSERTION( req->rq_phase == expected_phase ) failed in sanity 411

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.13.0
    • Labels: None
    • Severity: 3

    Description

      In my testing this is one of the more frequent failures, so I have a lot of crash dumps if needed.

      Basically it unfolds like this:

      [17452.428803] Lustre: DEBUG MARKER: == sanity test 411: Slab allocation error with cgroup does not LBUG ================================== 06:38:53 (1550662733)
      [17460.153247] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
      [17460.173293]   cache: kmalloc-512(0:osc_slab_alloc), object size: 4096, order: 0
      [17460.177182]   node 0: slabs: 95/95, objs: 95/95, free: 0
      [17460.338187] SLAB: Unable to allocate memory on node 0 (gfp=0x180050)
      [17460.357819]   cache: radix_tree_node(0:osc_slab_alloc), object size: 4096, order: 0
      [17460.360806]   node 0: slabs: 20/20, objs: 20/20, free: 0
      [17460.404582] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
      [17460.407022]   cache: kmalloc-ac(0:osc_slab_alloc), object size: 64, order: 0
      [17460.413337]   node 0: slabs: 2/2, objs: 118/118, free: 0
      [17460.606578] SLAB: Unable to allocate memory on node 0 (gfp=0x180050)
      [17460.624043]   cache: radix_tree_node(0:osc_slab_alloc), object size: 4096, order: 0
      [17460.625738]   node 0: slabs: 37/37, objs: 37/37, free: 0
      [17460.906768] SLAB: Unable to allocate memory on node 0 (gfp=0x100000)
      [17460.907221]   cache: kmalloc-ac(0:osc_slab_alloc), object size: 64, order: 0
      [17460.907221]   node 0: slabs: 3/3, objs: 177/177, free: 0
      [17460.932028] SLAB: Unable to allocate memory on node 0 (gfp=0x100000)
      [17460.933775]   cache: kmalloc-ac(0:osc_slab_alloc), object size: 64, order: 0
      [17460.933775]   node 0: slabs: 3/3, objs: 177/177, free: 0
      [17461.005654] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
      [17461.025645]   cache: osc_extent_kmem(0:osc_slab_alloc), object size: 216, order: 0
      [17461.030831]   node 0: slabs: 4/4, objs: 72/72, free: 0
      [17461.293822] SLAB: Unable to allocate memory on node 0 (gfp=0x100020)
      [17461.294008]   cache: ptlrpc_cache(0:osc_slab_alloc), object size: 4096, order: 0
      [17461.294008]   node 0: slabs: 1/1, objs: 1/1, free: 0
      [17461.316586] LustreError: 20110:0:(events.c:326:request_in_callback()) Can't allocate incoming request descriptor: Dropping mdt_readpage RPC from 12345-0@lo
      [17461.319280] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
      [17461.321488]   cache: kmalloc-256(0:osc_slab_alloc), object size: 4096, order: 0
      [17461.324562]   node 0: slabs: 0/0, objs: 0/0, free: 0
      [17461.325578] LustreError: 20110:0:(client.c:1045:ptlrpc_set_destroy()) ASSERTION( req->rq_phase == expected_phase ) failed: 
      [17461.328245] LustreError: 20110:0:(client.c:1045:ptlrpc_set_destroy()) LBUG
      [17461.330257] Pid: 20110, comm: dd 3.10.0-7.6-debug #1 SMP Wed Nov 7 21:55:08 EST 2018
      [17461.333789] Call Trace:
      [17461.335607]  [<ffffffffa029f7dc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [17461.342130]  [<ffffffffa029f88c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [17461.343898]  [<ffffffffa0651891>] ptlrpc_set_destroy+0x331/0x410 [ptlrpc]
      [17461.346113]  [<ffffffffa0657c2d>] ptlrpc_queue_wait+0x8d/0x230 [ptlrpc]
      [17461.347822]  [<ffffffffa092d82e>] mdc_close+0x1ee/0x990 [mdc]
      [17461.348764]  [<ffffffffa05bf8fe>] lmv_close+0x17e/0x2a0 [lmv]
      [17461.349849]  [<ffffffffa143b0a1>] ll_close_inode_openhandle+0x2e1/0xcf0 [lustre]
      [17461.352105]  [<ffffffffa143fed0>] ll_md_real_close+0xf0/0x1e0 [lustre]
      [17461.354109]  [<ffffffffa14405d5>] ll_file_release+0x615/0x8c0 [lustre]
      [17461.357071]  [<ffffffff812386bc>] __fput+0xfc/0x300
      [17461.360240]  [<ffffffff8123899e>] ____fput+0xe/0x10
      [17461.361226]  [<ffffffff810b1885>] task_work_run+0xb5/0xf0
      [17461.362386]  [<ffffffff8102bc22>] do_notify_resume+0x92/0xb0
      [17461.364703]  [<ffffffff817c5158>] int_signal+0x12/0x17
      [17461.366681]  [<ffffffffffffffff>] 0xffffffffffffffff
      [17461.368926] Kernel panic - not syncing: LBUG
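
      When comparing several crash dumps of this failure, it can help to tally which slab caches failed to allocate before the LBUG and to pull out the assertion location. The following is only a minimal sketch, assuming the console log format shown above; `summarize`, `SAMPLE`, and the regular expressions are hypothetical helpers written for illustration and are not part of Lustre or its test suite.

```python
import re
from collections import Counter

# Patterns assume the console log format seen in this ticket.
SLAB_FAIL = re.compile(r"SLAB: Unable to allocate memory on node \d+ \(gfp=0x[0-9a-fA-F]+\)")
CACHE = re.compile(r"cache: ([^\s(]+)\(([^)]*)\), object size: (\d+)")
ASSERTION = re.compile(
    r"LustreError: (\d+):\d+:\(([\w.]+):(\d+):(\w+)\(\)\) ASSERTION\( (.*?) \) failed"
)

# Short excerpt copied from the log in this ticket.
SAMPLE = """\
[17461.293822] SLAB: Unable to allocate memory on node 0 (gfp=0x100020)
[17461.294008]   cache: ptlrpc_cache(0:osc_slab_alloc), object size: 4096, order: 0
[17461.294008]   node 0: slabs: 1/1, objs: 1/1, free: 0
[17461.319280] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
[17461.321488]   cache: kmalloc-256(0:osc_slab_alloc), object size: 4096, order: 0
[17461.324562]   node 0: slabs: 0/0, objs: 0/0, free: 0
[17461.325578] LustreError: 20110:0:(client.c:1045:ptlrpc_set_destroy()) ASSERTION( req->rq_phase == expected_phase ) failed:
"""


def summarize(log_text):
    """Return (Counter of failed slab caches, first assertion info or None)."""
    failures = Counter()
    assertion = None
    pending = False  # True after a "SLAB: Unable to allocate" line
    for line in log_text.splitlines():
        if SLAB_FAIL.search(line):
            pending = True
            continue
        cache = CACHE.search(line)
        if cache and pending:
            # Count the cache named on the line following the failure.
            failures[cache.group(1)] += 1
            pending = False
            continue
        hit = ASSERTION.search(line)
        if hit and assertion is None:
            assertion = {
                "pid": int(hit.group(1)),
                "file": hit.group(2),
                "line": int(hit.group(3)),
                "function": hit.group(4),
                "expr": hit.group(5),
            }
    return failures, assertion


if __name__ == "__main__":
    counts, info = summarize(SAMPLE)
    for name, n in counts.most_common():
        print(f"{name}: {n} failed allocation(s)")
    print(info)
```

      Running the same function over full console logs from different crash dumps would show whether the failed `ptlrpc_cache` allocation consistently precedes the assertion in `ptlrpc_set_destroy()`, which is what the log above suggests but does not prove.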
      

          People

            Assignee: WC Triage (wc-triage)
            Reporter: Oleg Drokin (green)