Lustre / LU-8435

LBUG (osc_cache.c:1290:osc_completion()) ASSERTION( equi(page->cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ) )

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: Lustre 2.7.0
    • Environment: Bull Lustre distribution based on Lustre 2.7.2

    Description

      Over the last month, one of our customers has hit a crash more than 100 times with the following signature:

      [506626.555125] SLUB: Unable to allocate memory on node -1 (gfp=0x80c0)
      [506626.562216]   cache: kvm_mmu_page_header(22:step_batch), object size: 168, buffer size: 168, default order: 1, min order: 0
      [506626.574729]   node 0: slabs: 0, objs: 0, free: 0
      [506626.579974]   node 1: slabs: 0, objs: 0, free: 0
      [506626.585219]   node 2: slabs: 60, objs: 2880, free: 0
      [506626.590852]   node 3: slabs: 0, objs: 0, free: 0
      [506626.596112] LustreError: 41604:0:(osc_cache.c:1290:osc_completion()) ASSERTION( equi(page->cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ) ) failed: cp_state:0, cmd:1
      [506626.612512] LustreError: 41604:0:(osc_cache.c:1290:osc_completion()) LBUG
      [506626.620186] Pid: 41604, comm: cat
      [506626.623978]
                      Call Trace:
      [506626.628573]  [<ffffffffa05eb853>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
      [506626.636448]  [<ffffffffa05ebdf5>] lbug_with_loc+0x45/0xc0 [libcfs]
      [506626.643456]  [<ffffffffa0dea859>] osc_ap_completion.isra.30+0x4d9/0x5b0 [osc]
      [506626.651526]  [<ffffffffa0df558d>] osc_queue_sync_pages+0x2dd/0x350 [osc]
      [506626.659108]  [<ffffffffa0de750f>] osc_io_submit+0x42f/0x530 [osc]
      [506626.666037]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.673531]  [<ffffffffa0b8d257>] lov_io_submit+0x2a7/0x420 [lov]
      [506626.680450]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.687961]  [<ffffffffa0c67f70>] ll_readpage+0x2d0/0x560 [lustre]
      [506626.694964]  [<ffffffff8116af87>] generic_file_aio_read+0x3b7/0x750
      [506626.702078]  [<ffffffffa0c98485>] vvp_io_read_start+0x3c5/0x470 [lustre]
      [506626.709674]  [<ffffffffa086f965>] cl_io_start+0x65/0x130 [obdclass]
      [506626.716785]  [<ffffffffa0872f85>] cl_io_loop+0xa5/0x190 [obdclass]
      [506626.723797]  [<ffffffffa0c34e8c>] ll_file_io_generic+0x5fc/0xae0 [lustre]
      [506626.731477]  [<ffffffffa0c35db2>] ll_file_aio_read+0x192/0x530 [lustre]
      [506626.738962]  [<ffffffffa0c3621b>] ll_file_read+0xcb/0x1e0 [lustre]
      [506626.745962]  [<ffffffff811dea1c>] vfs_read+0x9c/0x170
      [506626.751700]  [<ffffffff811df56f>] SyS_read+0x7f/0xe0
      [506626.757345]  [<ffffffff81646889>] system_call_fastpath+0x16/0x1b
      [506626.764138]
      [506626.765990] Kernel panic - not syncing: LBUG
      [506626.770850] CPU: 53 PID: 41604 Comm: cat Tainted: G           OE  ------------   3.10.0-327.22.2.el7.x86_64 #1
      [506626.782104] Hardware name: BULL bullx blade/CHPU, BIOS BIOSX07.037.01.003 10/23/2015
      [506626.790838]  ffffffffa0610ced 000000000f6a3070 ffff8817799eb8c0 ffffffff816360f4
      [506626.799228]  ffff8817799eb940 ffffffff8162f96a ffffffff00000008 ffff8817799eb950
      [506626.807618]  ffff8817799eb8f0 000000000f6a3070 ffffffffa0e01466 0000000000000246
      [506626.816005] Call Trace:
      [506626.818839]  [<ffffffff816360f4>] dump_stack+0x19/0x1b
      [506626.824668]  [<ffffffff8162f96a>] panic+0xd8/0x1e7
      [506626.830128]  [<ffffffffa05ebe5b>] lbug_with_loc+0xab/0xc0 [libcfs]
      [506626.837129]  [<ffffffffa0dea859>] osc_ap_completion.isra.30+0x4d9/0x5b0 [osc]
      [506626.845192]  [<ffffffffa0df558d>] osc_queue_sync_pages+0x2dd/0x350 [osc]
      [506626.852766]  [<ffffffffa0de750f>] osc_io_submit+0x42f/0x530 [osc]
      [506626.859702]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.867184]  [<ffffffffa0b8d257>] lov_io_submit+0x2a7/0x420 [lov]
      [506626.874099]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.881611]  [<ffffffffa0c67f70>] ll_readpage+0x2d0/0x560 [lustre]
      [506626.888609]  [<ffffffff8116af87>] generic_file_aio_read+0x3b7/0x750
      [506626.895721]  [<ffffffffa0c98485>] vvp_io_read_start+0x3c5/0x470 [lustre]
      [506626.903322]  [<ffffffffa086f965>] cl_io_start+0x65/0x130 [obdclass]
      [506626.910418]  [<ffffffffa0872f85>] cl_io_loop+0xa5/0x190 [obdclass]
      [506626.917420]  [<ffffffffa0c34e8c>] ll_file_io_generic+0x5fc/0xae0 [lustre]
      [506626.925091]  [<ffffffffa0c35db2>] ll_file_aio_read+0x192/0x530 [lustre]
      [506626.932575]  [<ffffffffa0c3621b>] ll_file_read+0xcb/0x1e0 [lustre]
      [506626.939569]  [<ffffffff811dea1c>] vfs_read+0x9c/0x170
      [506626.945300]  [<ffffffff811df56f>] SyS_read+0x7f/0xe0
      [506626.950938]  [<ffffffff81646889>] system_call_fastpath+0x16/0x1b
      

      Since the customer runs a black (classified) site, we can't provide the crashdump, but we will happily provide any text output you would find useful.

      Attachments

        1. crash_output.txt
          24 kB
        2. foreach_bt_merge.txt
          152 kB
        3. struct_analyze1.txt
          50 kB


          Activity


            paf Patrick Farrell (Inactive) added a comment -

            On that note, Aurelien, I think we should add a write component to the test after the memory limit is set... Or perhaps a separate test. But either way - write under pressure would be good to have as well.

            adegremont Aurelien Degremont (Inactive) added a comment -

            Bruno, this was exactly the purpose of this test. It seems it discovers other memory-management issues in the client code. I/O is not really expected to succeed under such constraints; it should only return -EIO or -ENOMEM, not crash.
            green Oleg Drokin added a comment -

            Ok, thanks.
            I had 4 more failures in the past 24 hours, btw.

            The crashdumps are on onyx-68 in /export/crashdumps.
            they are:
            192.168.123.199-2017-09-01-10:34:*
            192.168.123.111-2017-09-02-15:06:*
            192.168.123.195-2017-09-03-13:*
            192.168.123.151-2017-09-03-14:06:*
            192.168.123.135-2017-09-03-14:11:*

            build tree is currently in /export/centos7-nfsroot/home/green/git/lustre-release with all the modules (I'll update it on Tuesday, but it should be good for the next 30 or so hours).


            bfaccini Bruno Faccini (Inactive) added a comment -

            Oleg,
            my guess is that this new sub-test, sanity/test_411, introduced by change #21745, sets a highly constraining kernel memory limit that is very likely to trigger some memcg/slab bug.
            But I am OK to have a look at the crash dumps to try to confirm.
            green Oleg Drokin added a comment -

            Hm, I just had a failure in a test introduced by this patch:

            [38199.302263] Lustre: DEBUG MARKER: == sanity test 411: Slab allocation error with cgroup does not LBUG ================================== 10:34:27 (1504276467)
            [38212.118675] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
            [38212.120795] IP: [<ffffffff811dbb04>] __memcg_kmem_get_cache+0xe4/0x220
            [38212.121489] PGD 310c0a067 PUD 28e92c067 PMD 0 
            [38212.122192] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
            [38212.122849] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) osc(OE) mdc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) brd ext4 mbcache loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect ata_generic sysimgblt pata_acpi ttm drm_kms_helper ata_piix drm i2c_piix4 libata serio_raw virtio_balloon pcspkr virtio_console i2c_core virtio_blk floppy nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
            [38212.145920] CPU: 2 PID: 31539 Comm: dd Tainted: P        W  OE  ------------   3.10.0-debug #2
            [38212.147177] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            [38212.147821] task: ffff8802f2bf4800 ti: ffff880294f20000 task.ti: ffff880294f20000
            [38212.152755] RIP: 0010:[<ffffffff811dbb04>]  [<ffffffff811dbb04>] __memcg_kmem_get_cache+0xe4/0x220
            [38212.153730] RSP: 0018:ffff880294f237f0  EFLAGS: 00010286
            [38212.154194] RAX: 0000000000000000 RBX: ffff8803232c5c40 RCX: 0000000000000002
            [38212.154672] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000246
            [38212.155168] RBP: ffff880294f23810 R08: 0000000000000000 R09: 0000000000000000
            [38212.155647] R10: 0000000000000000 R11: 0000000200000007 R12: ffff8802f2bf4800
            [38212.156134] R13: ffff88031f6a6000 R14: ffff8803232c5c40 R15: ffff8803232c5c40
            [38212.156898] FS:  00007f1f35a4e740(0000) GS:ffff88033e440000(0000) knlGS:0000000000000000
            [38212.159271] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [38212.159923] CR2: 0000000000000008 CR3: 00000002f011d000 CR4: 00000000000006e0
            [38212.160625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            [38212.161320] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            [38212.163273] Stack:
            [38212.163852]  ffffffff811dba68 0000000000008050 ffff8802c59a5000 ffff8802a991ee00
            [38212.165119]  ffff880294f238a0 ffffffff811cca5c ffffffffa0570615 ffffc9000ab51000
            [38212.166468]  ffff880200000127 ffffffffa05a5547 ffff88028b683e80 ffff8803232c5c40
            [38212.168537] Call Trace:
            [38212.169340]  [<ffffffff811dba68>] ? __memcg_kmem_get_cache+0x48/0x220
            [38212.170547]  [<ffffffff811cca5c>] kmem_cache_alloc+0x1ec/0x640
            [38212.171879]  [<ffffffffa0570615>] ? ldlm_resource_putref+0x75/0x400 [ptlrpc]
            [38212.172659]  [<ffffffffa05a5547>] ? ptlrpc_request_cache_alloc+0x27/0x110 [ptlrpc]
            [38212.174145]  [<ffffffffa07c0f0d>] ? mdc_resource_get_unused+0x14d/0x2a0 [mdc]
            [38212.174871]  [<ffffffffa05a5547>] ptlrpc_request_cache_alloc+0x27/0x110 [ptlrpc]
            [38212.177273]  [<ffffffffa05a5655>] ptlrpc_request_alloc_internal+0x25/0x480 [ptlrpc]
            [38212.178618]  [<ffffffffa05a5ac3>] ptlrpc_request_alloc+0x13/0x20 [ptlrpc]
            [38212.179440]  [<ffffffffa07c6a60>] mdc_enqueue_base+0x6c0/0x18a0 [mdc]
            [38212.180168]  [<ffffffffa07c845b>] mdc_intent_lock+0x26b/0x520 [mdc]
            [38212.180869]  [<ffffffffa161dad0>] ? ll_invalidate_negative_children+0x1e0/0x1e0 [lustre]
            [38212.182291]  [<ffffffffa0584ab0>] ? ldlm_expired_completion_wait+0x240/0x240 [ptlrpc]
            [38212.183569]  [<ffffffffa079723d>] lmv_intent_lock+0xc0d/0x1b50 [lmv]
            [38212.184289]  [<ffffffff810ac3c1>] ? in_group_p+0x31/0x40
            [38212.184941]  [<ffffffffa161e5c5>] ? ll_i2suppgid+0x15/0x40 [lustre]
            [38212.185667]  [<ffffffffa161e614>] ? ll_i2gids+0x24/0xb0 [lustre]
            [38212.186372]  [<ffffffff811073d2>] ? from_kgid+0x12/0x20
            [38212.187062]  [<ffffffffa1609275>] ? ll_prep_md_op_data+0x235/0x520 [lustre]
            [38212.187754]  [<ffffffffa161dad0>] ? ll_invalidate_negative_children+0x1e0/0x1e0 [lustre]
            [38212.190244]  [<ffffffffa161fd34>] ll_lookup_it+0x2a4/0xef0 [lustre]
            [38212.190918]  [<ffffffffa1620ab7>] ll_atomic_open+0x137/0x12d0 [lustre]
            [38212.191636]  [<ffffffff817063d7>] ? _raw_spin_unlock+0x27/0x40
            [38212.192425]  [<ffffffff811f82fb>] ? lookup_dcache+0x8b/0xb0
            [38212.193270]  [<ffffffff811fd551>] do_last+0xa21/0x12b0
            [38212.194603]  [<ffffffff811fdea2>] path_openat+0xc2/0x4a0
            [38212.195481]  [<ffffffff811ff69b>] do_filp_open+0x4b/0xb0
            [38212.196351]  [<ffffffff817063d7>] ? _raw_spin_unlock+0x27/0x40
            [38212.197169]  [<ffffffff8120d137>] ? __alloc_fd+0xa7/0x130
            [38212.197815]  [<ffffffff811ec553>] do_sys_open+0xf3/0x1f0
            [38212.198506]  [<ffffffff811ec66e>] SyS_open+0x1e/0x20
            [38212.199225]  [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
            [38212.199896] Code: 01 00 00 41 f6 85 10 03 00 00 03 0f 84 f6 00 00 00 4d 85 ed 48 c7 c2 ff ff ff ff 74 07 49 63 95 98 06 00 00 48 8b 83 e0 00 00 00 <4c> 8b 64 d0 08 4d 85 e4 0f 85 d1 00 00 00 41 f6 45 10 01 0f 84 
            [38212.202617] RIP  [<ffffffff811dbb04>] __memcg_kmem_get_cache+0xe4/0x220
            [38212.203345]  RSP <ffff880294f237f0>
            

            I have a crashdump if anybody is interested.

            pjones Peter Jones added a comment -

            Landed for 2.11


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/21745/
            Subject: LU-8435 tests: slab alloc error does not LBUG
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 15dac618aabf2d5611a280bce13ca79c673f4f6d
            pjones Peter Jones added a comment -

            Yes I meant the testing patch

            simmonsja James A Simmons added a comment - edited

            Peter, the original fix https://review.whamcloud.com/#/c/13956 has already landed on master. I think this is safe to close. Or do you mean https://review.whamcloud.com/#/c/21745 ?

            pjones Peter Jones added a comment -

            I think that we need the ticket to remain open until the original patch has landed to master


            spiechurski Sebastien Piechurski added a comment -

            Hi Bruno,

            Thanks for the backport.
            You can close.

            Regards.

            People

              Assignee: bfaccini Bruno Faccini (Inactive)
              Reporter: spiechurski Sebastien Piechurski
              Votes: 0
              Watchers: 16
