Details
-
Bug
-
Resolution: Cannot Reproduce
-
Minor
-
None
-
Lustre 1.8.7
-
None
-
3
-
10317
Description
Our customer is seeing the OSS occasionally dump the following call traces.
This looks similar to LU-745, but the customer is running lustre-1.8.7, in which LU-745 should already be fixed.
Feb 14 15:03:45 oss02 kernel: [<ffffffff8002e024>] __wake_up+0x38/0x4f
Feb 14 15:03:45 oss02 kernel: [<ffffffff88aba7f3>] jbd2_log_wait_commit+0xa3/0xf5 [jbd2]
Feb 14 15:03:45 oss02 kernel: [<ffffffff800a2dff>] autoremove_wake_function+0x0/0x2e
Feb 14 15:03:45 oss02 kernel: [<ffffffff88b6590b>] fsfilt_ldiskfs_commit_wait+0xab/0xd0 [fsfilt_ldiskfs]
Feb 14 15:03:45 oss02 kernel: [<ffffffff88ba6194>] filter_commitrw_write+0x1e14/0x2dd0 [obdfilter]
Feb 14 15:03:45 oss02 kernel: [<ffffffff88b47d09>] ost_brw_write+0x1c99/0x2480 [ost]
Feb 14 15:03:45 oss02 kernel: [<ffffffff88881ac8>] ptlrpc_send_reply+0x5e8/0x600 [ptlrpc]
Feb 14 15:03:45 oss02 kernel: [<ffffffff8884c8b0>] target_committed_to_req+0x40/0x120 [ptlrpc]
Feb 14 15:03:45 oss02 kernel: [<ffffffff8008e7f9>] default_wake_function+0x0/0xe
Feb 14 15:03:45 oss02 kernel: [<ffffffff888860a8>] lustre_msg_check_version_v2+0x8/0x20 [ptlrpc]
Feb 14 15:03:45 oss02 kernel: [<ffffffff88b4b09e>] ost_handle+0x2bae/0x55b0 [ost]
Feb 14 15:03:45 oss02 kernel: [<ffffffff888956d9>] ptlrpc_server_handle_request+0x989/0xe00 [ptlrpc]
Feb 14 15:03:45 oss02 kernel: [<ffffffff88895e35>] ptlrpc_wait_event+0x2e5/0x310 [ptlrpc]
Feb 14 15:03:45 oss02 kernel: [<ffffffff8008e7f9>] default_wake_function+0x0/0xe
Feb 14 15:03:45 oss02 kernel: [<ffffffff88896dc6>] ptlrpc_main+0xf66/0x1120 [ptlrpc]
Feb 14 15:03:45 oss02 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb 14 15:03:45 oss02 kernel: [<ffffffff88895e60>] ptlrpc_main+0x0/0x1120 [ptlrpc]
Feb 14 15:03:45 oss02 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Here is another call trace:
Feb 15 11:10:47 oss03 kernel: [<ffffffff80046823>] try_to_wake_up+0x27/0x484
Feb 15 11:10:47 oss03 kernel: [<ffffffff8008cc1e>] __wake_up_common+0x3e/0x68
Feb 15 11:10:47 oss03 kernel: [<ffffffff8028b1ca>] __down_trylock+0x39/0x4e
Feb 15 11:10:47 oss03 kernel: [<ffffffff8006473d>] __down_failed_trylock+0x35/0x3a
Feb 15 11:10:47 oss03 kernel: [<ffffffff8886c0c1>] ldlm_pool_shrink+0x31/0xf0 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8884a1e6>] .text.lock.ldlm_resource+0x7d/0x87 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8886d24c>] ldlm_pools_shrink+0x15c/0x2f0 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff80064614>] __down_read+0x12/0x92
Feb 15 11:10:47 oss03 kernel: [<ffffffff8003f285>] shrink_slab+0xdc/0x153
Feb 15 11:10:47 oss03 kernel: [<ffffffff800ce4ce>] zone_reclaim+0x235/0x2cd
Feb 15 11:10:47 oss03 kernel: [<ffffffff800ca81e>] __rmqueue+0x44/0xc6
Feb 15 11:10:47 oss03 kernel: [<ffffffff8000a939>] get_page_from_freelist+0xbf/0x442
Feb 15 11:10:47 oss03 kernel: [<ffffffff8000f46f>] __alloc_pages+0x78/0x308
Feb 15 11:10:47 oss03 kernel: [<ffffffff80025e20>] find_or_create_page+0x32/0x72
Feb 15 11:10:47 oss03 kernel: [<ffffffff88b98445>] filter_get_page+0x35/0x70 [obdfilter]
Feb 15 11:10:47 oss03 kernel: [<ffffffff88b9a68a>] filter_preprw+0x14da/0x1e00 [obdfilter]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8876d121>] LNetMDBind+0x301/0x450 [lnet]
Feb 15 11:10:47 oss03 kernel: [<ffffffff887d5d30>] class_handle2object+0xe0/0x170 [obdclass]
Feb 15 11:10:47 oss03 kernel: [<ffffffff88b4500c>] ost_brw_write+0xf9c/0x2480 [ost]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8876d121>] LNetMDBind+0x301/0x450 [lnet]
Feb 15 11:10:47 oss03 kernel: [<ffffffff88889c65>] lustre_msg_set_limit+0x35/0xf0 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff88883fe5>] lustre_msg_get_version+0x35/0xf0 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff88883ef5>] lustre_msg_get_opc+0x35/0xf0 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8008e7f9>] default_wake_function+0x0/0xe
Feb 15 11:10:47 oss03 kernel: [<ffffffff888840a8>] lustre_msg_check_version_v2+0x8/0x20 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff88b4909e>] ost_handle+0x2bae/0x55b0 [ost]
Feb 15 11:10:47 oss03 kernel: [<ffffffff887d5d30>] class_handle2object+0xe0/0x170 [obdclass]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8883e19a>] lock_res_and_lock+0xba/0xd0 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff88843168>] __ldlm_handle2lock+0x2f8/0x360 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff888936d9>] ptlrpc_server_handle_request+0x989/0xe00 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff88893e35>] ptlrpc_wait_event+0x2e5/0x310 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8008cc1e>] __wake_up_common+0x3e/0x68
Feb 15 11:10:47 oss03 kernel: [<ffffffff88894dc6>] ptlrpc_main+0xf66/0x1120 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb 15 11:10:47 oss03 kernel: [<ffffffff88893e60>] ptlrpc_main+0x0/0x1120 [ptlrpc]
Feb 15 11:10:47 oss03 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Is this related to any known bugs?
Issue Links
- Trackbacks
-
Lustre 1.8.x known issues tracker
While testing against Lustre b18 branch, we would hit known bugs which were already reported in Lustre Bugzilla https://bugzilla.lustre.org/. In order to move away from relying on Bugzilla, we would create a JIRA