[LU-411] kernel panic when running conf-sanity on RHEL5/i686 Created: 14/Jun/11 Updated: 31/Oct/11 Resolved: 31/Oct/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Sarah Liu | Assignee: | Yang Sheng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
lustre-master/build158/rhel5/i686 mds/ost on fat-intel-1 two lustre clients |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 4941 |
| Description |
|
Kernel panic when running conf-sanity test 34c. Please find the logs in the attachment. |
| Comments |
| Comment by Peter Jones [ 14/Jun/11 ] |
|
YangSheng, could you please look into this one? Thanks, Peter |
| Comment by Yang Sheng [ 14/Jun/11 ] |
|
taken. |
| Comment by Yang Sheng [ 14/Jun/11 ] |
|
Stack overflow on a patchless i686/rhel5 client. It still works with 4k stack enabled.
Lustre: DEBUG MARKER: == conf-sanity test 34c: force umount with failed ost should be normal =============================== 04:51:19 (1308052279)
Lustre: 9436:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import MGC192.168.4.128@o2ib->MGC192.168.4.128@o2ib_0 netid 50000: select flavor null
LustreError: 152-6: Ignoring deprecated mount option 'acl'.
Lustre: MGC192.168.4.128@o2ib: Reactivating import
Lustre: 9436:0:(o2iblnd.c:413:kiblnd_find_peer_locked()) maximum lustre stack 3124
[<f9080a82>] kiblnd_find_peer_locked+0x1a2/0x1b0 [ko2iblnd]
[<f9099664>] kiblnd_launch_tx+0x44/0x1000 [ko2iblnd]
[<f90848c7>] kiblnd_pool_alloc_node+0xf7/0x3c0 [ko2iblnd]
[<f9093328>] kiblnd_init_tx_msg+0x88/0x1b0 [ko2iblnd]
[<f909ed5f>] kiblnd_send+0x1df/0xbc0 [ko2iblnd]
[<f9204c9b>] lnet_ni_send+0x4b/0xc0 [lnet]
[<f92071db>] lnet_send+0x38b/0xd90 [lnet]
[<f92ea1b5>] cfs_set_ptldebug_header+0x35/0x90 [libcfs]
[<f92fa526>] libcfs_debug_vmsg2+0x5b6/0x9e0 [libcfs]
[<f920d665>] LNetPut+0x565/0xef0 [lnet]
[<f93a36f4>] ptl_send_buf+0x1f4/0xab0 [ptlrpc]
[<f93a855a>] ptl_send_rpc+0x87a/0x1560 [ptlrpc]
[<f938ece5>] ptlrpc_send_new_req+0xb45/0xf40 [ptlrpc]
[<f9398a89>] ptlrpc_set_wait+0x99/0x9b0 [ptlrpc]
[<f938c019>] ptlrpc_request_addref+0xd9/0x200 [ptlrpc]
[<f939b200>] ptlrpc_queue_wait+0x90/0x490 [ptlrpc]
[<f93d8bca>] llog_client_read_header+0xfa/0xade [ptlrpc]
[<f9ce8a29>] llog_init_handle+0x159/0x10d0 [obdclass]
[<f9d4b910>] class_config_parse_llog+0x270/0xad0 [obdclass]
[<f99adc2f>] mgc_process_log+0x4ef/0x47b0 [mgc]
[<f99ac079>] mgc_name2resid+0x119/0x360 [mgc]
[<f99b4930>] mgc_blocking_ast+0x0/0xa60 [mgc]
[<f9369820>] ldlm_completion_ast+0x0/0xc20 [ptlrpc]
[<f99b2f1e>] do_config_log_add+0xdbe/0x1260 [mgc]
[<f99b3979>] config_log_add+0x5b9/0x930 [mgc]
[<f99b9a5b>] mgc_process_config+0x8bb/0x1230 [mgc]
[<f99b91a0>] mgc_process_config+0x0/0x1230 [mgc]
[<f9d635a4>] lustre_process_log+0x1184/0x21b0 [obdclass]
[<f92fa526>] libcfs_debug_vmsg2+0x5b6/0x9e0 [libcfs]
[<f9bc8c77>] ll_fill_super+0xba7/0xd2f0 [lustre]
[<f92ea1b5>] cfs_set_ptldebug_header+0x35/0x90 [libcfs]
[<c04f2185>] vsnprintf+0x49d/0x4db
[<f92fa526>] libcfs_debug_vmsg2+0x5b6/0x9e0 [libcfs]
[<f93467c5>] client_connect_import+0xc5/0x860 [ptlrpc]
[<f9d5a225>] lustre_start_mgc+0xe75/0x4f80 [obdclass]
[<f9d6d478>] lustre_fill_super+0x758/0xba0 [obdclass]
[<c047d946>] get_sb_nodev+0x48/0x7f
[<f9d6cd20>] lustre_fill_super+0x0/0xba0 [obdclass]
[<c047d376>] vfs_kern_mount+0x7d/0xf2
[<f9d6cd20>] lustre_fill_super+0x0/0xba0 [obdclass]
[<c047d41d>] do_kern_mount+0x25/0x36
[<c0491086>] do_mount+0x5fb/0x66b
[<c049030a>] mntput_no_expire+0x11/0x6a
[<c04856f2>] __link_path_walk+0xd6b/0xdab
[<c049030a>] mntput_no_expire+0x11/0x6a
[<c04857bf>] link_path_walk+0x8d/0x95
[<c0485b2b>] do_path_lookup+0x219/0x27f
[<c045e126>] get_page_from_freelist+0x96/0x370
[<c045e469>] __alloc_pages+0x69/0x2cf
[<c048ff73>] copy_mount_options+0x26/0x109
[<c0491163>] sys_mount+0x6d/0xa5
[<c0404f4b>] syscall_call+0x7/0xb
=======================
Lustre: 9436:0:(o2iblnd_cb.c:1160:kiblnd_queue_tx_locked()) maximum lustre stack 3128
[<f909272b>] kiblnd_queue_tx_locked+0x20b/0x2c0 [ko2iblnd]
[<f90963df>] kiblnd_queue_tx+0x1f/0x40 [ko2iblnd]
[<f9099739>] kiblnd_launch_tx+0x119/0x1000 [ko2iblnd]
[<f90848c7>] kiblnd_pool_alloc_node+0xf7/0x3c0 [ko2iblnd]
[<f9093328>] kiblnd_init_tx_msg+0x88/0x1b0 [ko2iblnd]
[<f909ed5f>] kiblnd_send+0x1df/0xbc0 [ko2iblnd]
[<f9204c9b>] lnet_ni_send+0x4b/0xc0 [lnet]
[<f92071db>] lnet_send+0x38b/0xd90 [lnet]
[<f92ea1b5>] cfs_set_ptldebug_header+0x35/0x90 [libcfs]
[<f92fa526>] libcfs_debug_vmsg2+0x5b6/0x9e0 [libcfs]
[<f920d665>] LNetPut+0x565/0xef0 [lnet]
[<f93a36f4>] ptl_send_buf+0x1f4/0xab0 [ptlrpc]
[<f93a855a>] ptl_send_rpc+0x87a/0x1560 [ptlrpc]
[<f938ece5>] ptlrpc_send_new_req+0xb45/0xf40 [ptlrpc]
[<f9398a89>] ptlrpc_set_wait+0x99/0x9b0 [ptlrpc]
[<f938c019>] ptlrpc_request_addref+0xd9/0x200 [ptlrpc]
[<f939b200>] ptlrpc_queue_wait+0x90/0x490 [ptlrpc]
[<f93d8bca>] llog_client_read_header+0xfa/0xade [ptlrpc]
[<f9ce8a29>] llog_init_handle+0x159/0x10d0 [obdclass]
[<f9d4b910>] class_config_parse_llog+0x270/0xad0 [obdclass]
[<f99adc2f>] mgc_process_log+0x4ef/0x47b0 [mgc]
[<f99ac079>] mgc_name2resid+0x119/0x360 [mgc]
[<f99b4930>] mgc_blocking_ast+0x0/0xa60 [mgc]
[<f9369820>] ldlm_completion_ast+0x0/0xc20 [ptlrpc]
[<f99b2f1e>] do_config_log_add+0xdbe/0x1260 [mgc]
[<f99b3979>] config_log_add+0x5b9/0x930 [mgc]
[<f99b9a5b>] mgc_process_config+0x8bb/0x1230 [mgc]
[<f99b91a0>] mgc_process_config+0x0/0x1230 [mgc]
[<f9d635a4>] lustre_process_log+0x1184/0x21b0 [obdclass]
[<f92fa526>] libcfs_debug_vmsg2+0x5b6/0x9e0 [libcfs]
[<f9bc8c77>] ll_fill_super+0xba7/0xd2f0 [lustre]
[<f92ea1b5>] cfs_set_ptldebug_header+0x35/0x90 [libcfs]
[<c04f2185>] vsnprintf+0x49d/0x4db
[<f92fa526>] libcfs_debug_vmsg2+0x5b6/0x9e0 [libcfs]
[<f93467c5>] client_connect_import+0xc5/0x860 [ptlrpc]
[<f9d5a225>] lustre_start_mgc+0xe75/0x4f80 [obdclass]
[<f9d6d478>] lustre_fill_super+0x758/0xba0 [obdclass]
[<c047d946>] get_sb_nodev+0x48/0x7f
[<f9d6cd20>] lustre_fill_super+0x0/0xba0 [obdclass]
[<c047d376>] vfs_kern_mount+0x7d/0xf2
[<f9d6cd20>] lustre_fill_super+0x0/0xba0 [obdclass]
[<c047d41d>] do_kern_mount+0x25/0x36
[<c0491086>] do_mount+0x5fb/0x66b
[<c049030a>] mntput_no_expire+0x11/0x6a
[<c04856f2>] __link_path_walk+0xd6b/0xdab
[<c049030a>] mntput_no_expire+0x11/0x6a
[<c04857bf>] link_path_walk+0x8d/0x95
[<c0485b2b>] do_path_lookup+0x219/0x27f
[<c045e126>] get_page_from_freelist+0x96/0x370
[<c045e469>] __alloc_pages+0x69/0x2cf
[<c048ff73>] copy_mount_options+0x26/0x109
[<c0491163>] sys_mount+0x6d/0xa5
[<c0404f4b>] syscall_call+0x7/0xb
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000003
printing eip:
c0406050
*pde = a73de067
Oops: 0000 [#1] |
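For readers triaging similar logs, the "maximum lustre stack" warnings can be pulled out mechanically and compared against the stack size. The sketch below is illustrative only and is not part of the ticket. It assumes the reported number is the stack depth in bytes and that the kernel stack is 4096 bytes (4k stacks); the function name `stack_watermarks` and the constant `STACK_SIZE` are hypothetical. The computed headroom is at best an upper bound, since part of the stack area is not usable by the call chain.

```python
import re

# Assumption: 4 KiB kernel stacks, as discussed in this ticket.
STACK_SIZE = 4096

# Matches warnings such as:
#   (o2iblnd.c:413:kiblnd_find_peer_locked()) maximum lustre stack 3124
PATTERN = re.compile(
    r"\(([\w.]+):(\d+):(\w+)\(\)\) maximum lustre stack (\d+)"
)


def stack_watermarks(log_text, stack_size=STACK_SIZE):
    """Return one dict per 'maximum lustre stack' warning found in log_text."""
    results = []
    for match in PATTERN.finditer(log_text):
        used = int(match.group(4))
        results.append(
            {
                "file": match.group(1),
                "line": int(match.group(2)),
                "function": match.group(3),
                "used": used,
                # Assumed interpretation: bytes left before the stack is exhausted.
                "headroom": stack_size - used,
            }
        )
    return results


# The two warning lines quoted in the comment above.
sample = (
    "Lustre: 9436:0:(o2iblnd.c:413:kiblnd_find_peer_locked()) "
    "maximum lustre stack 3124\n"
    "Lustre: 9436:0:(o2iblnd_cb.c:1160:kiblnd_queue_tx_locked()) "
    "maximum lustre stack 3128\n"
)

for entry in stack_watermarks(sample):
    print(entry["function"], entry["used"], entry["headroom"])
```

Under these assumptions, both warnings leave under 1 KiB of headroom, with more calls still to come below the reporting function.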
| Comment by Oleg Drokin [ 25/Jun/11 ] |
|
I am not really sure how you can say it works with 4k stack and then immediately post a crash report. |
| Comment by Yang Sheng [ 27/Jun/11 ] |
|
Hi Oleg, I am glad to hear your advice on this issue. This is a rhel5/i686 patchless client, so CONFIG_4KSTACK must be enabled. And from the warning message: |
| Comment by Peter Jones [ 19/Jul/11 ] |
|
So, what are the next steps here? This has shown up again during an autotest run... |
| Comment by Yang Sheng [ 19/Jul/11 ] |
|
I have committed a patch, got advice from Andreas, and will update the patch shortly. |
| Comment by Peter Jones [ 19/Jul/11 ] |
|
ah yes I see - http://review.whamcloud.com/#change,950 - thanks |
| Comment by Build Master (Inactive) [ 27/Jul/11 ] |
|
Integrated in Oleg Drokin : 28c4d8388e62a1252c3540ece3c127d3c9ce7148
|
| Comment by Yang Sheng [ 31/Oct/11 ] |
|
It seems this issue has not shown up for a long time, so I am closing it. Please feel free to reopen it if it recurs. |