LU-93: Kernel panic caused by Lustre 1.8.5/1.8.6 on RH6 (patchless client)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version: Lustre 1.8.6
    • Environment: RH6, recompiled kernel, OFED 1.5.3 RC3/4/5 (bug does not depend on OFED version)

    Description

      This bug is already described in https://bugzilla.redhat.com/show_bug.cgi?id=678175

      Lustre 1.8.5 works without problems on RH5. After switching to RH6 I get
      kernel panics that halt the system. As far as I can tell this happens only
      if read and write calls are issued at the same time: running 12 reading
      threads or 12 writing threads alone has not caused the issue to appear.
      Note that the issue cannot be reproduced deterministically. Sometimes the
      error happens after 2 min, sometimes it takes 15 min, so I'm fairly sure
      this is a race condition.

      The kernel panic happens in conjunction with the "page" structure. Lustre
      uses "page->private" to store a pointer to private information. Sometimes
      this value, which should be either NULL or a pointer to a valid kernel
      address, is set to "2". That value defeats every "page->private == NULL"
      check in Lustre, and the subsequent access "page->private->llap_magic"
      causes the panic. I could probably set "page->private" to NULL in such a
      case, but that would only hide the symptom, not remedy the root cause, and
      would thereby most likely introduce a new error.

      From the kernel traces I was able to reconstruct the call path:

      llap_cast_private+0x18/0xa0
      llap_from_page_with_lockh.clone.8+0x1484/0x1cf0 [lustre]
      (inline: ll_readahead_page)
      (inline: ll_readahead_pages)
      ll_readahead+0xf59/0x1ca0 [lustre]
      ll_readpage+0x29b/0x2270 [lustre]
      generic_file_aio_read+0x1f0/0x730
      ll_file_aio_read+0xcd6/0x2720 [lustre]
      ll_file_read+0xd0/0xf0 [lustre]
      vfs_read+0xb5/0x1a0
      sys_read+0x51/0x90
      system_call_fastpath+0x16/0x1b

      Tracing back the origin of page->private == 2 led me to the call

      page = grab_cache_page_nowait(mapping, index);

      Sometimes, when this function returns, page->private is already set to
      "2". From what I can see in the Lustre sources, "page->private" is always
      treated as storing a pointer into kernel space. It is therefore very
      unlikely that Lustre causes this error itself, especially as the same code
      works on RH5.

      This leads me to the following questions:

      a) If the page was not in the cache and has been freshly allocated, should
      page->private be 0? It looks to me like this is not always the case: the
      page->private == 2 error is already present when the function returns.
      b) Is there a simple way to distinguish between a freshly allocated page
      and one already residing in the page cache?

      I attached a sosreport as well

      thanks
      Michael

      ############################################################################

      [root@et06 crt]# uname -r
      2.6.32-71.14.1.el6.x86_64.crt

      ############################################################################
      A typical panic originally looked like:

      ############################################################################

      BUG: unable to handle kernel NULL pointer dereference at 0000000000000002

      IP: [<ffffffffa0889d08>] llap_cast_private+0x18/0xa0 [lustre]
      PGD 32d72a067 PUD 330da0067 PMD 0
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/module/ipv6/initstate
      CPU 20

      Modules linked in: mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ko2iblnd(U)
      ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfs lockd fscache nfs_acl
      auth_rpcgss acpi_cpufreq freq_table sunrpc rdma_ucm(U) ib_sdp(U) rdma_cm(U)
      iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U)
      mlx4_ib(U) ib_mad(U) ib_core(U) mlx4_core(U) ext2 dm_mirror dm_region_hash
      dm_log dm_mod radeon ttm drm_kms_helper drm i2c_algo_bit serio_raw i2c_i801
      i2c_core sg iTCO_wdt iTCO_vendor_support pata_jmicron i7core_edac edac_core
      shpchp igb dca ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif usb_storage ahci
      ata_generic pata_acpi [last unloaded: cpufreq_ondemand]

      Pid: 3477, comm: bonnie Not tainted 2.6.32-71.14.1.el6.x86_64.crt #1 X8DTN
      RIP: 0010:[<ffffffffa0889d08>] [<ffffffffa0889d08>]
      llap_cast_private+0x18/0xa0 [lustre]
      RSP: 0018:ffff880553371748 EFLAGS: 00010202
      RAX: 0000000000000002 RBX: ffff880264d18000 RCX: ffff88009e7487e8
      RDX: ffff8805533719a8 RSI: 0000000000000002 RDI: ffffea0002988c48
      RBP: ffff880553371788 R08: 0000000000000002 R09: ffffea0002988c50
      R10: ffff8805fe3b43c0 R11: 0000000000000010 R12: ffff88032f6e2df0
      R13: ffffea0002988c48 R14: ffff88032f6e2cd0 R15: ffff88032fca8800
      FS: 00002b219d82fb20(0000) GS:ffff88034ad00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000002 CR3: 000000032d7ae000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process bonnie (pid: 3477, threadinfo ffff880553370000, task ffff8805761f54a0)
      Stack:
      000000003f818000 ffffea0002988c10 ffff8805533717b8 ffffffffa05cbaef
      <0> ffff88062f0a2200 0000000000000000 ffff880553371798 ffff88062f0a2200
      <0> ffff880553371868 ffffffffa088e921 ffff880553371818 ffffffffa0648daa
      Call Trace:
      [<ffffffffa05cbaef>] ? class_handle2object+0x8f/0x1c0 [obdclass]
      [<ffffffffa088e921>] llap_from_page_with_lockh.clone.8+0xa1/0x1120 [lustre]
      [<ffffffffa0648daa>] ? ldlm_lock_decref_internal+0xba/0x880 [ptlrpc]
      [<ffffffffa0645bbf>] ? __ldlm_handle2lock+0x9f/0x3d0 [ptlrpc]
      [<ffffffffa088d961>] ? ll_issue_page_read+0x131/0x420 [lustre]
      [<ffffffff8110c07e>] ? find_get_page+0x1e/0xa0
      [<ffffffffa0891869>] ll_readahead+0xf59/0x1ca0 [lustre]
      [<ffffffffa0643651>] ? ldlm_lock_add_to_lru_nolock+0x51/0xe0 [ptlrpc]
      [<ffffffffa0643879>] ? ldlm_lock_add_to_lru+0x49/0x110 [ptlrpc]
      [<ffffffffa088eb21>] ? llap_from_page_with_lockh.clone.8+0x2a1/0x1120 [lustre]
      [<ffffffffa0894475>] ll_readpage+0x165/0x1da0 [lustre]
      [<ffffffffa0643984>] ? ldlm_lock_remove_from_lru+0x44/0x110 [ptlrpc]
      [<ffffffffa0643bb2>] ? ldlm_lock_fast_match+0xc2/0x130 [ptlrpc]
      [<ffffffff812637ed>] ? copy_user_generic_string+0x2d/0x40
      [<ffffffff8110d3e0>] generic_file_aio_read+0x1f0/0x730
      [<ffffffffa086ee8c>] ll_file_aio_read+0xc8c/0x2610 [lustre]
      [<ffffffffa08708e0>] ll_file_read+0xd0/0xf0 [lustre]
      [<ffffffff81091de0>] ? autoremove_wake_function+0x0/0x40
      [<ffffffff811ffb76>] ? security_file_permission+0x16/0x20
      [<ffffffff8116cf7d>] ? rw_verify_area+0x5d/0xc0
      [<ffffffff8116d905>] vfs_read+0xb5/0x1a0
      [<ffffffff8116da41>] sys_read+0x51/0x90
      [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
      Code: ff ff 90 4c 8b 5d c8 4d 29 f3 4c 03 5d c0 e9 d3 fe ff ff 55 48 89 e5 48
      83 ec 40 0f 1f 44 00 00 48 8b 47 10 48 85 c0 75 02 c9 c3 <8b> 10 81 fa 21 06 e3
      05 74 f4 89 54 24 28 48 89 44 24 20 ba 00

      RIP [<ffffffffa0889d08>] llap_cast_private+0x18/0xa0 [lustre]
      RSP <ffff880553371748>
      CR2: 0000000000000002
      --[ end trace 2a22ba0ec9d8ad43 ]--
      Kernel panic - not syncing: Fatal exception
      Pid: 3477, comm: bonnie Tainted: G D ----------------
      2.6.32-71.14.1.el6.x86_64.crt #1
      Call Trace:
      [<ffffffff814c8183>] panic+0x78/0x137
      [<ffffffff814cc254>] oops_end+0xe4/0x100
      [<ffffffff8104651b>] no_context+0xfb/0x260
      [<ffffffff810467a5>] __bad_area_nosemaphore+0x125/0x1e0
      [<ffffffff810468ce>] bad_area+0x4e/0x60
      [<ffffffff814cdda0>] do_page_fault+0x390/0x3a0
      [<ffffffff814cb5a5>] page_fault+0x25/0x30
      [<ffffffffa0889d08>] ? llap_cast_private+0x18/0xa0 [lustre]
      [<ffffffffa05cbaef>] ? class_handle2object+0x8f/0x1c0 [obdclass]
      [<ffffffffa088e921>] llap_from_page_with_lockh.clone.8+0xa1/0x1120 [lustre]
      [<ffffffffa0648daa>] ? ldlm_lock_decref_internal+0xba/0x880 [ptlrpc]
      [<ffffffffa0645bbf>] ? __ldlm_handle2lock+0x9f/0x3d0 [ptlrpc]
      [<ffffffffa088d961>] ? ll_issue_page_read+0x131/0x420 [lustre]
      [<ffffffff8110c07e>] ? find_get_page+0x1e/0xa0
      [<ffffffffa0891869>] ll_readahead+0xf59/0x1ca0 [lustre]
      [<ffffffffa0643651>] ? ldlm_lock_add_to_lru_nolock+0x51/0xe0 [ptlrpc]
      [<ffffffffa0643879>] ? ldlm_lock_add_to_lru+0x49/0x110 [ptlrpc]
      [<ffffffffa088eb21>] ? llap_from_page_with_lockh.clone.8+0x2a1/0x1120 [lustre]
      [<ffffffffa0894475>] ll_readpage+0x165/0x1da0 [lustre]
      [<ffffffffa0643984>] ? ldlm_lock_remove_from_lru+0x44/0x110 [ptlrpc]
      [<ffffffffa0643bb2>] ? ldlm_lock_fast_match+0xc2/0x130 [ptlrpc]
      [<ffffffff812637ed>] ? copy_user_generic_string+0x2d/0x40
      [<ffffffff8110d3e0>] generic_file_aio_read+0x1f0/0x730
      [<ffffffffa086ee8c>] ll_file_aio_read+0xc8c/0x2610 [lustre]
      [<ffffffffa08708e0>] ll_file_read+0xd0/0xf0 [lustre]
      [<ffffffff81091de0>] ? autoremove_wake_function+0x0/0x40
      [<ffffffff811ffb76>] ? security_file_permission+0x16/0x20
      [<ffffffff8116cf7d>] ? rw_verify_area+0x5d/0xc0
      [<ffffffff8116d905>] vfs_read+0xb5/0x1a0
      [<ffffffff8116da41>] sys_read+0x51/0x90
      [<ffffffff81013172>] system_call_fastpath+0x16/0x1b

      (syslog console output was interleaved with the trace above; de-interleaved:)
      Feb 4 10:56:01 et06 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
      Feb 4 10:56:01 et06 kernel: IP: [<ffffffffa0889d08>] llap_cast_private+0x18/0xa0 [lustre]
      Feb 4 10:56:01 et06 kernel: PGD 32d72a067 PUD 330da0067 PMD 0
      Feb 4 10:56:01 et06 kernel: panic occurred, switching back to text console

      ###########################################################################
      ############################################################################

      [root@et06 crt]# modinfo lustre
      filename:
      /lib/modules/2.6.32-71.14.1.el6.x86_64.crt/updates/kernel/fs/lustre/lustre.ko
      license: GPL
      description: Lustre Lite Client File System
      author: Sun Microsystems, Inc. <http://www.lustre.org/>
      srcversion: 83AC865556E2AE50715D831
      depends: obdclass,mdc,ptlrpc,libcfs,lvfs,lov,osc,lnet
      vermagic: 2.6.32-71.14.1.el6.x86_64.crt SMP mod_unload modversions
      [root@et06 crt]# gdb
      /lib/modules/2.6.32-71.14.1.el6.x86_64.crt/updates/kernel/fs/lustre/lustre.ko
      ...

      Reading symbols from
      /lib/modules/2.6.32-71.14.1.el6.x86_64.crt/updates/kernel/fs/lustre/lustre.ko...done.
      (gdb) disassemble /m *(llap_cast_private+0x18/0xa0)
      Dump of assembler code for function llap_cast_private:
      538     {
         0x0000000000036d20 <+0>:     push   %rbp
         0x0000000000036d21 <+1>:     mov    %rsp,%rbp
         0x0000000000036d24 <+4>:     sub    $0x40,%rsp
         0x0000000000036d28 <+8>:     callq  0x36d2d <llap_cast_private+13>

      539     struct ll_async_page *llap = (struct ll_async_page *)page_private(page);
         0x0000000000036d2d <+13>:    mov    0x10(%rdi),%rax

      540
      541     LASSERTF(llap == NULL || llap->llap_magic == LLAP_MAGIC,
         0x0000000000036d31 <+17>:    test   %rax,%rax
         0x0000000000036d34 <+20>:    jne    0x36d38 <llap_cast_private+24>
         0x0000000000036d38 <+24>:    mov    (%rax),%edx
         0x0000000000036d3a <+26>:    cmp    $0x5e30621,%edx
         0x0000000000036d40 <+32>:    je     0x36d36 <llap_cast_private+22>
         0x0000000000036d42 <+34>:    mov    %edx,0x28(%rsp)
         0x0000000000036d46 <+38>:    mov    %rax,0x20(%rsp)
         0x0000000000036d4b <+43>:    mov    $0x40000,%edx
         0x0000000000036d50 <+48>:    mov    %rdi,0x18(%rsp)
         0x0000000000036d55 <+53>:    mov    $0x80,%esi
         0x0000000000036d5a <+58>:    xor    %edi,%edi
         0x0000000000036d5c <+60>:    mov    $0x21f,%r9d
         0x0000000000036d62 <+66>:    mov    $0x0,%r8
         0x0000000000036d69 <+73>:    mov    $0x0,%rcx
         0x0000000000036d70 <+80>:    xor    %eax,%eax
         0x0000000000036d72 <+82>:    movl   $0x5e30621,0x30(%rsp)
         0x0000000000036d7a <+90>:    movq   $0x0,0x10(%rsp)
         0x0000000000036d83 <+99>:    movq   $0x0,0x8(%rsp)
         0x0000000000036d8c <+108>:   movq   $0x0,(%rsp)
         0x0000000000036d94 <+116>:   callq  0x36d99 <llap_cast_private+121>
         0x0000000000036d99 <+121>:   mov    $0x21f,%edx
         0x0000000000036d9e <+126>:   mov    $0x0,%rsi
         0x0000000000036da5 <+133>:   mov    $0x0,%rdi
         0x0000000000036dac <+140>:   callq  0x36db1
         0x0000000000036db1:          data32 data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)

      542     "page %p private %lu gave magic %d which != %d\n",
      543     page, page_private(page), llap->llap_magic, LLAP_MAGIC);
      544
      545     return llap;
      546     }

      0x0000000000036d36 <+22>: leaveq
      0x0000000000036d37 <+23>: retq

      End of assembler dump.

      Attachments

      Issue Links

      Activity

      [LU-93] Kernel panic caused by Lustre 1.8.5/1.8.6 on RH6 (patchless client)

            do you have a good way to retrieve this number? otherwise here from our Linux output:

            From dmesg on et66, intel-ucode/06-2c-02

            microcode: CPU23 sig=0x206c2, pf=0x1, revision=0x13
            platform microcode: firmware: requesting intel-ucode/06-2c-02

            michael.hebenstreit@intel.com Michael Hebenstreit (Inactive) added a comment

            it would fit the circumstances where/when it happened - but in my case a BIOS upgrade fixed it. This would rather speak against it.

            michael.hebenstreit@intel.com Michael Hebenstreit (Inactive) added a comment

            Hi,

            Michael I have a question for you. At Bull we are trying to identify which microcode can solve this issue. For our MESCA machines we are testing a new BIOS that integrates the microcode update codename M04206E6_00000008.
            Do you think you could retrieve the microcode revision brought by the BIOS update of your SuperMicro machines? That would be very helpful for us!

            People from the Kernel Team here at Bull pointed us to a kernel bug that is fixed in 2.6.32 vanilla and the RHEL6.1 beta, dealing with TLB entries:
            https://patchwork.kernel.org/patch/564801/
            Do you think it could be related to the present bug, and fix this issue?

            TIA,
            Sebastien.

            sebastien.buisson Sebastien Buisson (Inactive) added a comment

            commented in the LU-130 thread

            michael.hebenstreit@intel.com Michael Hebenstreit (Inactive) added a comment

            Another customer hit the same bug with lustre 2.0. See LU-130.
            Michael, could you please tell us what BIOS version has fixed the problem for you?

            johann Johann Lombardi (Inactive) added a comment
            pjones Peter Jones added a comment -

            Michael

            I spoke with Johann Lombardi who was the tech lead for 1.6.x and 1.8.x for many years and he agrees with your rough probabilities based upon the information presently available

            Regards

            Peter

            pjones Peter Jones added a comment -

            update from Michael

            "thanks for the excellent work so far

            Lai - can you please answer the following question: based on your professional experience, with the data we have so far - how likely are the following scenarios:

            a) it's a bug in the Redhat Linux kernel
            b) it's a bug in Lustre
            c) it's a hardware/BIOS related bug
            d) it's something completely different

            could you assign probabilities to those 4 options (they should add up to 100%)? My own assessment would currently be:

            1% it's a bug in the Redhat Linux kernel
            4% it's a bug in Lustre
            90% it's a hardware/BIOS related bug
            5% it's something completely different

            for the next steps I need to do some preparation first

            a) create a new image based on RH6.1 and start exhaustive test procedure to see if the error creeps up again
            b) create 2 systems based on RH5, one with 1.6.7.2, one with 1.8.2 to show performance problems

            so there is no need for a meeting today
            "

            laisiyao Lai Siyao added a comment -

            Work done for this issue in my side (16 hours total):

            • analyzed RH6 kernel page allocation/free code path changes against the RH5 kernel.
            • reviewed the Lustre page allocation/free code path.
            • analyzed vmcore and test logs.
            laisiyao Lai Siyao added a comment -

            Michael, as you pointed out, page->private is borrowed to store the migratetype (the value can be 2) in free_hot_cold_page() in RH6. Though this is suspicious, I can't imagine a page that is free yet still in the page cache: that condition would trigger the panic without much difficulty, and it would not be limited to RH6 either.

            I will wait for the latest test result; or after you get more direct evidence of the above scenario.

            laisiyao Lai Siyao added a comment - - edited

            dump bad page info:

            crash> struct page ffffea000b1d09a8
            struct page {
              flags = 18014398509482017, 
              _count = {
                counter = 2
              }, 
              {
                _mapcount = {
                  counter = -1
                }, 
                {
                  inuse = 65535, 
                  objects = 65535
                }
              }, 
              {
                {
                  private = 2, 
                  mapping = 0xffff88062f1cea70
                }, 
                ptl = {
                  raw_lock = {
                    slock = 2
                  }
                }, 
                slab = 0x2, 
                first_page = 0x2
              }, 
              {
                index = 379628, 
                freelist = 0x5caec
              }, 
              lru = {
                next = 0xffffea000b1d0a08, 
                prev = 0xffffea000b1d0998
              }
            }
            

            For the record, another customer also reported some weird assertion failures (although not the
            same assertion as in this bug) with the SLES11 SP1 kernel. Somehow PG_locked was cleared
            behind our back. Surprisingly, it also turned out to be a BIOS issue.

            Unfortunately, the bugzilla ticket (i.e. bug 23880) is private.

            johann Johann Lombardi (Inactive) added a comment

            People

              Assignee: laisiyao Lai Siyao
              Reporter: michael.hebenstreit@intel.com Michael Hebenstreit (Inactive)
              Votes: 0
              Watchers: 5