Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
None
-
Lustre 1.8.6
-
None
-
RH6, recompiled kernel, OFED 1.5.3 RC3/4/5 (bug does not depend on OFED version)
-
4
-
8549
Description
bug already described in https://bugzilla.redhat.com/show_bug.cgi?id=678175
Lustre 1.8.5 works without problems in RH5. Switching to RH6 I get Kernel Panic
errors halting the system. As far as I can tell this happens only if Read and
Writes calls are used at the same time. Using 12 threads Read or 12 threads
Writing has not caused this issue to appear. Note - it's not possible to
completely reproduce this issue. Sometimes the error happens after 2min,
sometimes it takes 15min. So I'm pretty sure this is a race condition.
The Kernel Panic happens in conjunction with the "page" structure. Lustre uses
"page->private" to store a pointer to private information. Sometimes this value
- which obviously should be either NULL or a pointer to a valid Kernel address
- is set to "2". This disables all checks in Lustre for "page->private ==
NULL", and a subsequent access "page->private->ll_magic" causes the panic. I
probably could set "page->private" to NULL in such a case, but that would only
hide the sympton, not remedy the root cause and thereby most likley introduce a
new error.
From the Kernel traces I was able to recreate the callpath:
llap_cast_private+0x18/0xa0
llap_from_page_with_lockh.clone.8+0x1484/0x1cf0 [lustre]
(inline: ll_readahead_page)
(inline: ll_readahead_pages)
ll_readahead+0xf59/0x1ca0 [lustre]
ll_readpage+0x29b/0x2270 [lustre]
generic_file_aio_read+0x1f0/0x730
ll_file_aio_read+0xcd6/0x2720 [lustre]
ll_file_read+0xd0/0xf0 [lustre]
vfs_read+0xb5/0x1a0
sys_read+0x51/0x90
system_call_fastpath+0x16/0x1b
tracing back the origins of page->private==2 lead me to the call
page = grab_cache_page_nowait(mapping, index);
This function returns sometimes page->private is already set to "2". From what
I could see in the Lustre sources "page->private" is always treated to store a
pointer into kernel space. It is therefore very unlikely that Lustre causes
this error itself, especially as the code works on RH5
This leads me to the following questions:
a) If the page was not in the cache and has been freshly allocated - should
page->private be 0? Cause it looks to me it this is not always the case. The
page->private==2 error is already present when the function returns.
b) Is there a simple way to distinguish between a page freshly allocated and
one already residing in the page cache?
I attached a sosreport as well
thanks
Michael
############################################################################
############################################################################
[root@et06 crt]# uname -r
2.6.32-71.14.1.el6.x86_64.crt
############################################################################
A typical panic originally looked like:
############################################################################
BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
IP: [<ffffffffa0889d08>] llap_cast_private+0x18/0xa0 [lustre]
PGD 32d72a067 PUD 330da0067 PMD 0
Oops: 0000 1 SMP
last sysfs file: /sys/module/ipv6/initstate
CPU 20
Modules linked in: mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfs lockd fscache nfs_acl
auth_rpcgss acpi_cpufreq freq_table sunrpc rdma_ucm(U) ib_sdp(U) rdma_cm(U)
iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U)
mlx4_ib(U) ib_mad(U) ib_core(U) mlx4_core(U) ext2 dm_mirror dm_region_hash
dm_log dm_mod radeon ttm drm_kms_helper drm i2c_algo_bit serio_raw i2c_i801
i2c_core sg iTCO_wdt iTCO_vendor_support pata_jmicron i7core_edac edac_core
shpchp igb dca ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif usb_storage ahci
ata_generic pata_acpi [last unloaded: cpufreq_ondemand]
Modules linked in: mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfs lockd fscache nfs_acl
auth_rpcgss acpi_cpufreq freq_table sunrpc rdma_ucm(U) ib_sdp(U) rdma_cm(U)
iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U)
mlx4_ib(U) ib_mad(U) ib_core(U) mlx4_core(U) ext2 dm_mirror dm_region_hash
dm_log dm_mod radeon ttm drm_kms_helper drm i2c_algo_bit serio_raw i2c_i801
i2c_core sg iTCO_wdt iTCO_vendor_support pata_jmicron i7core_edac edac_core
shpchp igb dca ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif usb_storage ahci
ata_generic pata_acpi [last unloaded: cpufreq_ondemand]
Pid: 3477, comm: bonnie Not tainted 2.6.32-71.14.1.el6.x86_64.crt #1 X8DTN
RIP: 0010:[<ffffffffa0889d08>] [<ffffffffa0889d08>]
llap_cast_private+0x18/0xa0 [lustre]
RSP: 0018:ffff880553371748 EFLAGS: 00010202
RAX: 0000000000000002 RBX: ffff880264d18000 RCX: ffff88009e7487e8
RDX: ffff8805533719a8 RSI: 0000000000000002 RDI: ffffea0002988c48
RBP: ffff880553371788 R08: 0000000000000002 R09: ffffea0002988c50
R10: ffff8805fe3b43c0 R11: 0000000000000010 R12: ffff88032f6e2df0
R13: ffffea0002988c48 R14: ffff88032f6e2cd0 R15: ffff88032fca8800
FS: 00002b219d82fb20(0000) GS:ffff88034ad00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000002 CR3: 000000032d7ae000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bonnie (pid: 3477, threadinfo ffff880553370000, task ffff8805761f54a0)
Stack:
000000003f818000 ffffea0002988c10 ffff8805533717b8 ffffffffa05cbaef
<0> ffff88062f0a2200 0000000000000000 ffff880553371798 ffff88062f0a2200
<0> ffff880553371868 ffffffffa088e921 ffff880553371818 ffffffffa0648daa
Call Trace:
[<ffffffffa05cbaef>] ? class_handle2object+0x8f/0x1c0 [obdclass]
[<ffffffffa088e921>] llap_from_page_with_lockh.clone.8+0xa1/0x1120 [lustre]
[<ffffffffa0648daa>] ? ldlm_lock_decref_internal+0xba/0x880 [ptlrpc]
[<ffffffffa0645bbf>] ? __ldlm_handle2lock+0x9f/0x3d0 [ptlrpc]
[<ffffffffa088d961>] ? ll_issue_page_read+0x131/0x420 [lustre]
[<ffffffff8110c07e>] ? find_get_page+0x1e/0xa0
[<ffffffffa0891869>] ll_readahead+0xf59/0x1ca0 [lustre]
[<ffffffffa0643651>] ? ldlm_lock_add_to_lru_nolock+0x51/0xe0 [ptlrpc]
[<ffffffffa0643879>] ? ldlm_lock_add_to_lru+0x49/0x110 [ptlrpc]
[<ffffffffa088eb21>] ? llap_from_page_with_lockh.clone.8+0x2a1/0x1120 [lustre]
[<ffffffffa0894475>] ll_readpage+0x165/0x1da0 [lustre]
[<ffffffffa0643984>] ? ldlm_lock_remove_from_lru+0x44/0x110 [ptlrpc]
[<ffffffffa0643bb2>] ? ldlm_lock_fast_match+0xc2/0x130 [ptlrpc]
[<ffffffff812637ed>] ? copy_user_generic_string+0x2d/0x40
[<ffffffff8110d3e0>] generic_file_aio_read+0x1f0/0x730
[<ffffffffa086ee8c>] ll_file_aio_read+0xc8c/0x2610 [lustre]
[<ffffffffa08708e0>] ll_file_read+0xd0/0xf0 [lustre]
[<ffffffff81091de0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff811ffb76>] ? security_file_permission+0x16/0x20
[<ffffffff8116cf7d>] ? rw_verify_area+0x5d/0xc0
[<ffffffff8116d905>] vfs_read+0xb5/0x1a0
[<ffffffff8116da41>] sys_read+0x51/0x90
[<ffffffff81013172>] system_call_fastpath+0x16/0x1b
Code: ff ff 90 4c 8b 5d c8 4d 29 f3 4c 03 5d c0 e9 d3 fe ff ff 55 48 89 e5 48
83 ec 40 0f 1f 44 00 00 48 8b 47 10 48 85 c0 75 02 c9 c3 <8b> 10 81 fa 21 06 e3
05 74 f4 89 54 24 28 48 89 44 24 20 ba 00
RIP [<ffffffffa0889d08>] llap_cast_private+0x18/0xa0 [lustre]
RSP <ffff880553371748>
CR2: 0000000000000002
--[ end trace 2a22ba0ec9d8ad43 ]--
Kernel panic - not syncing: Fatal exception
Pid: 3477, comm: bonnie Tainted: G D ----------------
2.6.32-71.14.1.el6.x86_64.crt #1
Call Trace:
[<ffffffff814c8183>] panic+0x78/0x137
[<ffffffff814cc254>] oops_end+0xe4/0x100
Feb 4 10:56:01 [<ffffffff8104651b>] no_context+0xfb/0x260
et06 kernel: BUG [<ffffffff810467a5>] __bad_area_nosemaphore+0x125/0x1e0
: unable to hand [<ffffffff810468ce>] bad_area+0x4e/0x60
le kernel NULL p [<ffffffff814cdda0>] do_page_fault+0x390/0x3a0
ointer dereferen [<ffffffff814cb5a5>] page_fault+0x25/0x30
ce at 0000000000 [<ffffffffa0889d08>] ? llap_cast_private+0x18/0xa0 [lustre]
000002
Feb 4 10 [<ffffffffa05cbaef>] ? class_handle2object+0x8f/0x1c0 [obdclass]
:56:01 et06 kern [<ffffffffa088e921>]
llap_from_page_with_lockh.clone.8+0xa1/0x1120 [lustre]
el: BUG: unable [<ffffffffa0648daa>] ? ldlm_lock_decref_internal+0xba/0x880
[ptlrpc]
to handle kernel [<ffffffffa0645bbf>] ? __ldlm_handle2lock+0x9f/0x3d0 [ptlrpc]
NULL pointer de [<ffffffffa088d961>] ? ll_issue_page_read+0x131/0x420 [lustre]
reference at 000 [<ffffffff8110c07e>] ? find_get_page+0x1e/0xa0
0000000000002
Fe [<ffffffffa0891869>] ll_readahead+0xf59/0x1ca0 [lustre]
b 4 10:56:01 et [<ffffffffa0643651>] ? ldlm_lock_add_to_lru_nolock+0x51/0xe0
[ptlrpc]
06 kernel: IP: [ [<ffffffffa0643879>] ? ldlm_lock_add_to_lru+0x49/0x110
[ptlrpc]
<ffffffffa0889d0 [<ffffffffa088eb21>] ?
llap_from_page_with_lockh.clone.8+0x2a1/0x1120 [lustre]
8>] llap_cast_pr [<ffffffffa0894475>] ll_readpage+0x165/0x1da0 [lustre]
ivate+0x18/0xa0 [<ffffffffa0643984>] ? ldlm_lock_remove_from_lru+0x44/0x110
[ptlrpc]
[lustre]
Feb 4 [<ffffffffa0643bb2>] ? ldlm_lock_fast_match+0xc2/0x130 [ptlrpc]
10:56:01 et06 ke [<ffffffff812637ed>] ? copy_user_generic_string+0x2d/0x40
rnel: IP: [<ffff [<ffffffff8110d3e0>] generic_file_aio_read+0x1f0/0x730
ffffa0889d08>] l [<ffffffffa086ee8c>] ll_file_aio_read+0xc8c/0x2610 [lustre]
lap_cast_private [<ffffffffa08708e0>] ll_file_read+0xd0/0xf0 [lustre]
+0x18/0xa0 [lust [<ffffffff81091de0>] ? autoremove_wake_function+0x0/0x40
re]
Feb 4 10:56 [<ffffffff811ffb76>] ? security_file_permission+0x16/0x20
:01 et06 kernel: [<ffffffff8116cf7d>] ? rw_verify_area+0x5d/0xc0
PGD 32d72a067 P [<ffffffff8116d905>] vfs_read+0xb5/0x1a0
UD 330da0067 PMD [<ffffffff8116da41>] sys_read+0x51/0x90
0
Feb 4 10:56 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
:01 et06 kernel:panic occurred, switching back to text console
###########################################################################
############################################################################
[root@et06 crt]# modinfo lustre
filename:
/lib/modules/2.6.32-71.14.1.el6.x86_64.crt/updates/kernel/fs/lustre/lustre.ko
license: GPL
description: Lustre Lite Client File System
author: Sun Microsystems, Inc. <http://www.lustre.org/>
srcversion: 83AC865556E2AE50715D831
depends: obdclass,mdc,ptlrpc,libcfs,lvfs,lov,osc,lnet
vermagic: 2.6.32-71.14.1.el6.x86_64.crt SMP mod_unload modversions
[root@et06 crt]# gdb
/lib/modules/2.6.32-71.14.1.el6.x86_64.crt/updates/kernel/fs/lustre/lustre.ko
...
Reading symbols from
/lib/modules/2.6.32-71.14.1.el6.x86_64.crt/updates/kernel/fs/lustre/lustre.ko...done.
(gdb) disassemble /m *(llap_cast_private+0x18/0xa0)
Dump of assembler code for function llap_cast_private:
538
0x0000000000036d36 <+22>: leaveq
0x0000000000036d37 <+23>: retq
End of assembler dump.
Attachments
Issue Links
- is related to
-
LU-339 add configure check for MS_HAS_NEW_AOPS to support new aops
- Resolved
- Trackbacks
-
Lustre 1.8.x known issues tracker While testing against Lustre b18 branch, we would hit known bugs which were already reported in Lustre Bugzilla https://bugzilla.lustre.org/. In order to move away from relying on Bugzilla, we would create a JIRA