
LBUG (osc_cache.c:1290:osc_completion()) ASSERTION( equi(page->cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ) )

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: Lustre 2.7.0
    • Environment: Bull Lustre distribution based on Lustre 2.7.2

    Description

      In the last month, one of our customers has hit a crash more than 100 times with the following signature:

      [506626.555125] SLUB: Unable to allocate memory on node -1 (gfp=0x80c0)
      [506626.562216]   cache: kvm_mmu_page_header(22:step_batch), object size: 168, buffer size: 168, default order: 1, min order: 0
      [506626.574729]   node 0: slabs: 0, objs: 0, free: 0
      [506626.579974]   node 1: slabs: 0, objs: 0, free: 0
      [506626.585219]   node 2: slabs: 60, objs: 2880, free: 0
      [506626.590852]   node 3: slabs: 0, objs: 0, free: 0
      [506626.596112] LustreError: 41604:0:(osc_cache.c:1290:osc_completion()) ASSERTION( equi(page->cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ) ) failed: cp_state:0, cmd:1
      [506626.612512] LustreError: 41604:0:(osc_cache.c:1290:osc_completion()) LBUG
      [506626.620186] Pid: 41604, comm: cat
      [506626.623978] Call Trace:
      [506626.628573]  [<ffffffffa05eb853>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
      [506626.636448]  [<ffffffffa05ebdf5>] lbug_with_loc+0x45/0xc0 [libcfs]
      [506626.643456]  [<ffffffffa0dea859>] osc_ap_completion.isra.30+0x4d9/0x5b0 [osc]
      [506626.651526]  [<ffffffffa0df558d>] osc_queue_sync_pages+0x2dd/0x350 [osc]
      [506626.659108]  [<ffffffffa0de750f>] osc_io_submit+0x42f/0x530 [osc]
      [506626.666037]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.673531]  [<ffffffffa0b8d257>] lov_io_submit+0x2a7/0x420 [lov]
      [506626.680450]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.687961]  [<ffffffffa0c67f70>] ll_readpage+0x2d0/0x560 [lustre]
      [506626.694964]  [<ffffffff8116af87>] generic_file_aio_read+0x3b7/0x750
      [506626.702078]  [<ffffffffa0c98485>] vvp_io_read_start+0x3c5/0x470 [lustre]
      [506626.709674]  [<ffffffffa086f965>] cl_io_start+0x65/0x130 [obdclass]
      [506626.716785]  [<ffffffffa0872f85>] cl_io_loop+0xa5/0x190 [obdclass]
      [506626.723797]  [<ffffffffa0c34e8c>] ll_file_io_generic+0x5fc/0xae0 [lustre]
      [506626.731477]  [<ffffffffa0c35db2>] ll_file_aio_read+0x192/0x530 [lustre]
      [506626.738962]  [<ffffffffa0c3621b>] ll_file_read+0xcb/0x1e0 [lustre]
      [506626.745962]  [<ffffffff811dea1c>] vfs_read+0x9c/0x170
      [506626.751700]  [<ffffffff811df56f>] SyS_read+0x7f/0xe0
      [506626.757345]  [<ffffffff81646889>] system_call_fastpath+0x16/0x1b
      [506626.764138]
      [506626.765990] Kernel panic - not syncing: LBUG
      [506626.770850] CPU: 53 PID: 41604 Comm: cat Tainted: G           OE  ------------   3.10.0-327.22.2.el7.x86_64 #1
      [506626.782104] Hardware name: BULL bullx blade/CHPU, BIOS BIOSX07.037.01.003 10/23/2015
      [506626.790838]  ffffffffa0610ced 000000000f6a3070 ffff8817799eb8c0 ffffffff816360f4
      [506626.799228]  ffff8817799eb940 ffffffff8162f96a ffffffff00000008 ffff8817799eb950
      [506626.807618]  ffff8817799eb8f0 000000000f6a3070 ffffffffa0e01466 0000000000000246
      [506626.816005] Call Trace:
      [506626.818839]  [<ffffffff816360f4>] dump_stack+0x19/0x1b
      [506626.824668]  [<ffffffff8162f96a>] panic+0xd8/0x1e7
      [506626.830128]  [<ffffffffa05ebe5b>] lbug_with_loc+0xab/0xc0 [libcfs]
      [506626.837129]  [<ffffffffa0dea859>] osc_ap_completion.isra.30+0x4d9/0x5b0 [osc]
      [506626.845192]  [<ffffffffa0df558d>] osc_queue_sync_pages+0x2dd/0x350 [osc]
      [506626.852766]  [<ffffffffa0de750f>] osc_io_submit+0x42f/0x530 [osc]
      [506626.859702]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.867184]  [<ffffffffa0b8d257>] lov_io_submit+0x2a7/0x420 [lov]
      [506626.874099]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.881611]  [<ffffffffa0c67f70>] ll_readpage+0x2d0/0x560 [lustre]
      [506626.888609]  [<ffffffff8116af87>] generic_file_aio_read+0x3b7/0x750
      [506626.895721]  [<ffffffffa0c98485>] vvp_io_read_start+0x3c5/0x470 [lustre]
      [506626.903322]  [<ffffffffa086f965>] cl_io_start+0x65/0x130 [obdclass]
      [506626.910418]  [<ffffffffa0872f85>] cl_io_loop+0xa5/0x190 [obdclass]
      [506626.917420]  [<ffffffffa0c34e8c>] ll_file_io_generic+0x5fc/0xae0 [lustre]
      [506626.925091]  [<ffffffffa0c35db2>] ll_file_aio_read+0x192/0x530 [lustre]
      [506626.932575]  [<ffffffffa0c3621b>] ll_file_read+0xcb/0x1e0 [lustre]
      [506626.939569]  [<ffffffff811dea1c>] vfs_read+0x9c/0x170
      [506626.945300]  [<ffffffff811df56f>] SyS_read+0x7f/0xe0
      [506626.950938]  [<ffffffff81646889>] system_call_fastpath+0x16/0x1b
      

      The customer site being a black site, we can't provide the crash dump, but we will happily provide any text output you would find useful.
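      For reference, the failed check can be modeled in plain C. The libcfs `equi()` helper is essentially "both or neither", so the assertion demands that a page is in CPS_PAGEIN state if and only if the command is OBD_BRW_READ. In the sketch below, the numeric value for CPS_PAGEIN is a placeholder; only `cp_state:0` and `cmd:1` come from the console output above.

      ```c
      #include <stdio.h>

      /* equi(a, b): a and b must be both true or both false
       * (this mirrors the libcfs helper used by the LASSERT). */
      #define equi(a, b) (!!(a) == !!(b))

      #define OBD_BRW_READ 1   /* matches "cmd:1" in the log */
      #define CPS_PAGEIN   3   /* placeholder value, for illustration only */

      /* Returns 1 when the osc_completion() assertion would hold. */
      static int completion_assert_holds(int cp_state, int cmd)
      {
          return equi(cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ);
      }

      int main(void)
      {
          /* The crashing combination from the log: cp_state:0, cmd:1.
           * The page is not in PAGEIN state, yet the command is a read,
           * so the equivalence fails and the LASSERT fires. */
          printf("%d\n", completion_assert_holds(0, OBD_BRW_READ));
          return 0;
      }
      ```

      With the values reported in the log, the equivalence evaluates to false, which is exactly why osc_completion() LBUGs.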

      Attachments

        1. crash_output.txt
          24 kB
        2. foreach_bt_merge.txt
          152 kB
        3. struct_analyze1.txt
          50 kB


          Activity

            [LU-8435] LBUG (osc_cache.c:1290:osc_completion()) ASSERTION( equi(page->cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ) )

            After browsing kernel sources on kernel.org, I think this issue is fixed in more recent 3.x and 4.x kernels, mainly by the changes introduced in commit "memcg: don't call memcg_update_all_caches if new cache id fits", available at
            https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/patch/mm/memcontrol.c?id=f3bb3043a092368a255bca5d1c6f4352c96a3b2d, where ALL kmem_caches are systematically walked to resize/initialize a new memcg_params when a kmem memory cgroup is started.

            And again, I need to insist that this bug is very unlikely to be triggered if we keep running with SLUB configured instead of switching back to SLAB.

            bfaccini Bruno Faccini (Inactive) added a comment -

            This problem should have already been addressed by the patch at https://patchwork.kernel.org/patch/2850152/.
            But since the RHEL7 kernel version we experienced these crashes with already includes it, after digging into the related kernel source code, I think the following one is required to fully fix this issue:

            [root@onyx-68 linux-3.10.0-327.22.2.el7-debug]# diff -urN mm/memcontrol.c mm/memcontrol.c.bfi
            --- mm/memcontrol.c     2016-06-09 06:31:12.000000000 -0700
            +++ mm/memcontrol.c.bfi 2017-09-08 07:37:18.647281366 -0700
             @@ -3163,7 +3163,16 @@ int memcg_update_cache_size(struct kmem_cache *s, int num_groups)
              
                     VM_BUG_ON(s->memcg_params && !s->memcg_params->is_root_cache);
              
             -       if (num_groups > memcg_limited_groups_array_size) {
             +       /* resize/grow existing memcg_params or allocate it if it had not
             +        * already been done during kmem_cache creation because none of the
             +        * previously used kmem memcg were present at that time (i.e.
             +        * memcg_limited_groups_array_size != 0 but memcg_kmem_enabled()
             +        * returned false). It would not have been necessary if
             +        * memcg_caches_array_size() was not used to anticipate more slots
             +        * than required and if memcg_limited_groups_array_size would simply
             +        * increment upon each new kmem memcg creation.
             +        */
             +       if (num_groups > memcg_limited_groups_array_size || !s->memcg_params) {
                             int i;
                             ssize_t size = memcg_caches_array_size(num_groups);
              
             @@ -3203,7 +3212,8 @@
                             * bigger than the others. And all updates will reset this
                             * anyway.
                             */
            -               kfree(cur_params);
            +               if (cur_params)
            +                       kfree(cur_params);
                    }
                    return 0;
             }
            [root@onyx-68 linux-3.10.0-327.22.2.el7-debug]# 
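            The effect of the extra `|| !s->memcg_params` test can be sketched with a small userspace model (all names and the layout here are hypothetical simplifications, not the kernel's actual structures): when the groups array size is already non-zero but the cache was created while no kmem memcg was enabled, the original condition never allocates memcg_params.

            ```c
            #include <stdlib.h>

            /* Minimal stand-in for the fields memcg_update_cache_size() looks at. */
            struct cache_model {
                void **memcg_params;  /* NULL if never allocated for this cache */
            };

            /* Models memcg_limited_groups_array_size: non-zero because kmem
             * memcgs existed before this cache was created. */
            static size_t groups_array_size = 4;

            /* Returns 1 if the allocation/grow path runs, 0 if it is skipped. */
            static int update_cache_size(struct cache_model *s, size_t num_groups,
                                         int with_fix)
            {
                int grow = num_groups > groups_array_size;

                if (with_fix)   /* the proposed "|| !s->memcg_params" test */
                    grow = grow || s->memcg_params == NULL;

                if (!grow)
                    return 0;   /* memcg_params stays NULL -> later deref oops */

                s->memcg_params = calloc(num_groups, sizeof(void *));
                return 1;
            }
            ```

            With num_groups == 2 and an array size of 4, the unpatched condition skips the allocation and leaves memcg_params NULL; the patched one allocates it.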
            

            Oleg, you may want to try it? On my side, I will check whether this problem is still present in recent kernels and report it if so.

            bfaccini Bruno Faccini (Inactive) added a comment -
            bfaccini Bruno Faccini (Inactive) added a comment - - edited

            Well, I have been quite puzzled by these crash dumps, because according to the kernel source code, the only case where memcg_params can be NULL is when a kmem_cache is created while !memcg_kmem_enabled() (tested in memcg_register_cache(); this is when no memory cgroup with a kmem limit is active yet). But then ALL kmem_caches should have memcg_params == NULL, not only the Lustre ones!! I have verified this on a fresh system after reboot with no memory cgroup configured: all kmem_caches, even the Lustre ones after a delayed Lustre start and module load, have memcg_params == NULL.

            Finally, I found a possible scenario where this can happen, and it is confirmed by the kernel logs in all the crash dumps: the sanity/test_411 sub-test (likely the only one to create and use a memory cgroup) has to be run at least twice, with the Lustre modules unloaded in between (causing the Lustre kmem_caches to be destroyed and re-created).
            I then tried to reproduce this on a test system but was unable to, since !memcg_kmem_enabled() could never be satisfied: the "child/memcg" kmem_caches (created during the first sub-test run and memory cgroup creation) were still present, even though fully empty/freed. But this could be where the SLAB vs. SLUB config makes the difference: the SLAB mechanism includes a regular kmem_cache shrink/drain cleanup that could garbage-collect this kind of leaked kmem_cache, whereas this never happens (or happens much later) with SLUB.

            But anyway, the fact that memcg_params can be NULL for a recently created kmem_cache when no kmem memory cgroup is currently active (but some had existed before!) needs to be handled by the current code. I will check whether this is already the case in recent kernels, and push a fix if not.

            Oleg, this looks like one more argument confirming the need to test kernels with "unsupported" configs!

            green Oleg Drokin added a comment -

            Note that I am running an unsupported Red Hat kernel config with a lot of debugging, and I also enable SLAB, whereas Red Hat defaults to SLUB.

            This does not make the bugs any less real, of course, but at times it takes a while to convince the RH people that it's OK to have an unsupported config, because it just helps to detect the bugs earlier rather than as random crashes elsewhere later.

            bfaccini Bruno Faccini (Inactive) added a comment - - edited

            Aurelien, I know you posted this new test to ensure memcg limits do not cause crashes in Lustre code, but given this new kind of crash in the kernel/memcg layer, it seems you should also propose it for the kernel regression test suite!

            My first crash-dump analysis points to a possible race between the lazy memcg registration into current kmem_caches and concurrent slab allocations, which triggers the unexpected situation, in __memcg_kmem_get_cache(), where memcg_params has still not been initialized in the "ptlrpc_cache" kmem_cache.

            It is also interesting to note that recent auto-test results for sanity/test_411 are all successes, while all these crashes occurred during single-node sessions, and that the only kmem_caches in the system without memcg_params initialized are those created in Lustre code.

            More to come.


            On that note, Aurelien, I think we should add a write component to the test after the memory limit is set, or perhaps a separate test. Either way, write under pressure would be good to have as well.

            paf Patrick Farrell (Inactive) added a comment -

            Bruno, this was exactly the purpose of this test. It seems it discovers other memory management issues in client code. I/O is not really expected to succeed under such constraints; it should just return EIO or ENOMEM, not crash.
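            As an illustrative userspace sketch of that expectation (not the test itself): a caller under a constraining kmem limit should see the syscall fail with an errno it can report, never an oops.

            ```c
            #include <errno.h>
            #include <stdio.h>
            #include <string.h>
            #include <unistd.h>

            /* Read once; on failure, surface -errno to the caller instead of
             * treating it as fatal. EIO/ENOMEM are the errors the test regards
             * as acceptable outcomes under a tight memory cgroup limit. */
            static int read_or_fail(int fd, void *buf, size_t len)
            {
                ssize_t rc = read(fd, buf, len);

                if (rc >= 0)
                    return 0;
                if (errno == EIO || errno == ENOMEM)
                    fprintf(stderr, "I/O failed cleanly: %s\n", strerror(errno));
                return -errno;
            }

            int main(void)
            {
                char c;
                /* A bad fd exercises the error path without needing a cgroup. */
                printf("%d\n", read_or_fail(-1, &c, 1) == -EBADF);
                return 0;
            }
            ```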

            adegremont Aurelien Degremont (Inactive) added a comment -
            green Oleg Drokin added a comment -

            Ok, thanks.
            I had 4 more failures in the past 24 hours, btw.

            The crashdumps are on onyx-68 in /export/crashdumps.
            they are:
            192.168.123.199-2017-09-01-10:34:*
            192.168.123.111-2017-09-02-15:06:*
            192.168.123.195-2017-09-03-13:*
            192.168.123.151-2017-09-03-14:06:*
            192.168.123.135-2017-09-03-14:11:*

            build tree is currently in /export/centos7-nfsroot/home/green/git/lustre-release with all the modules (I'll update it on Tuesday, but it should be good for the next 30 or so hours).


            Oleg,
            my guess is that this new sub-test, sanity/test_411, introduced by change #21745, sets a highly constraining kernel memory limit that is very likely to trigger some memcg/slab bug.
            But I am OK to have a look at the crash dumps to try to confirm.

            bfaccini Bruno Faccini (Inactive) added a comment -
            green Oleg Drokin added a comment -

            Hm, I just had a failure in a test introduced by this patch:

            [38199.302263] Lustre: DEBUG MARKER: == sanity test 411: Slab allocation error with cgroup does not LBUG ================================== 10:34:27 (1504276467)
            [38212.118675] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
            [38212.120795] IP: [<ffffffff811dbb04>] __memcg_kmem_get_cache+0xe4/0x220
            [38212.121489] PGD 310c0a067 PUD 28e92c067 PMD 0 
            [38212.122192] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
            [38212.122849] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) osc(OE) mdc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) brd ext4 mbcache loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect ata_generic sysimgblt pata_acpi ttm drm_kms_helper ata_piix drm i2c_piix4 libata serio_raw virtio_balloon pcspkr virtio_console i2c_core virtio_blk floppy nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
            [38212.145920] CPU: 2 PID: 31539 Comm: dd Tainted: P        W  OE  ------------   3.10.0-debug #2
            [38212.147177] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            [38212.147821] task: ffff8802f2bf4800 ti: ffff880294f20000 task.ti: ffff880294f20000
            [38212.152755] RIP: 0010:[<ffffffff811dbb04>]  [<ffffffff811dbb04>] __memcg_kmem_get_cache+0xe4/0x220
            [38212.153730] RSP: 0018:ffff880294f237f0  EFLAGS: 00010286
            [38212.154194] RAX: 0000000000000000 RBX: ffff8803232c5c40 RCX: 0000000000000002
            [38212.154672] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000246
            [38212.155168] RBP: ffff880294f23810 R08: 0000000000000000 R09: 0000000000000000
            [38212.155647] R10: 0000000000000000 R11: 0000000200000007 R12: ffff8802f2bf4800
            [38212.156134] R13: ffff88031f6a6000 R14: ffff8803232c5c40 R15: ffff8803232c5c40
            [38212.156898] FS:  00007f1f35a4e740(0000) GS:ffff88033e440000(0000) knlGS:0000000000000000
            [38212.159271] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [38212.159923] CR2: 0000000000000008 CR3: 00000002f011d000 CR4: 00000000000006e0
            [38212.160625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            [38212.161320] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            [38212.163273] Stack:
            [38212.163852]  ffffffff811dba68 0000000000008050 ffff8802c59a5000 ffff8802a991ee00
            [38212.165119]  ffff880294f238a0 ffffffff811cca5c ffffffffa0570615 ffffc9000ab51000
            [38212.166468]  ffff880200000127 ffffffffa05a5547 ffff88028b683e80 ffff8803232c5c40
            [38212.168537] Call Trace:
            [38212.169340]  [<ffffffff811dba68>] ? __memcg_kmem_get_cache+0x48/0x220
            [38212.170547]  [<ffffffff811cca5c>] kmem_cache_alloc+0x1ec/0x640
            [38212.171879]  [<ffffffffa0570615>] ? ldlm_resource_putref+0x75/0x400 [ptlrpc]
            [38212.172659]  [<ffffffffa05a5547>] ? ptlrpc_request_cache_alloc+0x27/0x110 [ptlrpc]
            [38212.174145]  [<ffffffffa07c0f0d>] ? mdc_resource_get_unused+0x14d/0x2a0 [mdc]
            [38212.174871]  [<ffffffffa05a5547>] ptlrpc_request_cache_alloc+0x27/0x110 [ptlrpc]
            [38212.177273]  [<ffffffffa05a5655>] ptlrpc_request_alloc_internal+0x25/0x480 [ptlrpc]
            [38212.178618]  [<ffffffffa05a5ac3>] ptlrpc_request_alloc+0x13/0x20 [ptlrpc]
            [38212.179440]  [<ffffffffa07c6a60>] mdc_enqueue_base+0x6c0/0x18a0 [mdc]
            [38212.180168]  [<ffffffffa07c845b>] mdc_intent_lock+0x26b/0x520 [mdc]
            [38212.180869]  [<ffffffffa161dad0>] ? ll_invalidate_negative_children+0x1e0/0x1e0 [lustre]
            [38212.182291]  [<ffffffffa0584ab0>] ? ldlm_expired_completion_wait+0x240/0x240 [ptlrpc]
            [38212.183569]  [<ffffffffa079723d>] lmv_intent_lock+0xc0d/0x1b50 [lmv]
            [38212.184289]  [<ffffffff810ac3c1>] ? in_group_p+0x31/0x40
            [38212.184941]  [<ffffffffa161e5c5>] ? ll_i2suppgid+0x15/0x40 [lustre]
            [38212.185667]  [<ffffffffa161e614>] ? ll_i2gids+0x24/0xb0 [lustre]
            [38212.186372]  [<ffffffff811073d2>] ? from_kgid+0x12/0x20
            [38212.187062]  [<ffffffffa1609275>] ? ll_prep_md_op_data+0x235/0x520 [lustre]
            [38212.187754]  [<ffffffffa161dad0>] ? ll_invalidate_negative_children+0x1e0/0x1e0 [lustre]
            [38212.190244]  [<ffffffffa161fd34>] ll_lookup_it+0x2a4/0xef0 [lustre]
            [38212.190918]  [<ffffffffa1620ab7>] ll_atomic_open+0x137/0x12d0 [lustre]
            [38212.191636]  [<ffffffff817063d7>] ? _raw_spin_unlock+0x27/0x40
            [38212.192425]  [<ffffffff811f82fb>] ? lookup_dcache+0x8b/0xb0
            [38212.193270]  [<ffffffff811fd551>] do_last+0xa21/0x12b0
            [38212.194603]  [<ffffffff811fdea2>] path_openat+0xc2/0x4a0
            [38212.195481]  [<ffffffff811ff69b>] do_filp_open+0x4b/0xb0
            [38212.196351]  [<ffffffff817063d7>] ? _raw_spin_unlock+0x27/0x40
            [38212.197169]  [<ffffffff8120d137>] ? __alloc_fd+0xa7/0x130
            [38212.197815]  [<ffffffff811ec553>] do_sys_open+0xf3/0x1f0
            [38212.198506]  [<ffffffff811ec66e>] SyS_open+0x1e/0x20
            [38212.199225]  [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
            [38212.199896] Code: 01 00 00 41 f6 85 10 03 00 00 03 0f 84 f6 00 00 00 4d 85 ed 48 c7 c2 ff ff ff ff 74 07 49 63 95 98 06 00 00 48 8b 83 e0 00 00 00 <4c> 8b 64 d0 08 4d 85 e4 0f 85 d1 00 00 00 41 f6 45 10 01 0f 84 
            [38212.202617] RIP  [<ffffffff811dbb04>] __memcg_kmem_get_cache+0xe4/0x220
            [38212.203345]  RSP <ffff880294f237f0>
            

            I have a crashdump if anybody is interested.
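            For what it's worth, the faulting address 0000000000000008 is consistent with memcg_params being NULL in the kmem_cache: on this 3.10 kernel, __memcg_kmem_get_cache() reads cachep->memcg_params->memcg_caches[idx], so with a NULL memcg_params the first slot lands at offset 8 on x86_64. The struct below is a simplified model for illustration, not the exact kernel layout.

            ```c
            #include <stddef.h>
            #include <stdio.h>

            struct kmem_cache_model;   /* opaque stand-in for struct kmem_cache */

            /* Simplified model of 3.10's memcg_cache_params: a small header
             * followed by the per-memcg cache pointer array. */
            struct memcg_cache_params_model {
                int is_root_cache;                        /* 4 bytes at offset 0 */
                struct kmem_cache_model *memcg_caches[];  /* 8-byte-aligned array */
            };

            int main(void)
            {
                /* With memcg_params == NULL, fetching memcg_caches[0] dereferences
                 * address 0 + offsetof(..., memcg_caches), i.e. 8 on x86_64,
                 * matching "NULL pointer dereference at 0000000000000008". */
                printf("%zu\n",
                       offsetof(struct memcg_cache_params_model, memcg_caches));
                return 0;
            }
            ```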

            pjones Peter Jones added a comment -

            Landed for 2.11


            People

              Assignee: bfaccini Bruno Faccini (Inactive)
              Reporter: spiechurski Sebastien Piechurski
              Votes: 0
              Watchers: 16