[LU-12985] sanity test_60g: Timeout occurred after 140 mins Created: 20/Nov/19 Updated: 12/Mar/22 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||
| Description |
|
This issue was created by maloo for liuying <emoly.liu@intel.com> This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/b24183b8-0ab1-11ea-bbc3-52540065bddc test_60g failed with the following error: Timeout occurred after 140 mins, last suite running was sanity, restarting cluster to continue tests <<Please provide additional information about the failure here>> [ 4806.548607] Lustre: DEBUG MARKER: == sanity test 60g: transaction abort won't cause MDT hung =========================================== 05:13:15 (1574140395) [ 4806.746423] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x8000019a [ 4807.187510] Lustre: *** cfs_fail_loc=19a, val=0*** [ 4807.774430] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x8000019a [ 4808.259070] Lustre: *** cfs_fail_loc=19a, val=0*** [ 4808.472999] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x8000019a [ 4809.183645] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x8000019a [ 4809.491198] Lustre: *** cfs_fail_loc=19a, val=0*** [ 4809.492745] Lustre: Skipped 1 previous similar message [ 4809.493924] LustreError: 16969:0:(llog_cat.c:744:llog_cat_cancel_arr_rec()) lustre-MDT0000-osd: fail to cancel 1 llog-records: rc = -5 [ 4809.496232] LustreError: 16969:0:(llog_cat.c:781:llog_cat_cancel_records()) lustre-MDT0000-osd: fail to cancel 1 of 1 llog-records: rc = -5 [ 4809.880666] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x8000019a [ 4810.586256] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x8000019a [ 4810.762596] LustreError: 19797:0:(llog_cat.c:744:llog_cat_cancel_arr_rec()) lustre-MDT0001-osp-MDT0002: fail to cancel 1 llog-records: rc = -116 [ 4810.765173] LustreError: 19797:0:(llog_cat.c:781:llog_cat_cancel_records()) lustre-MDT0001-osp-MDT0002: fail to cancel 1 of 1 llog-records: rc = -116 [ 4811.290016] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x8000019a [ 4811.828688] Lustre: *** cfs_fail_loc=19a, val=0*** [ 4811.829600] Lustre: Skipped 2 previous similar messages [ 4811.842495] LustreError: 21702:0:(llog_osd.c:248:llog_osd_read_header()) lustre-MDT0001-osp-MDT0002: error reading [0x240000402:0x3:0x0] log header size 92: rc = -14 ... [ 4851.890821] Lustre: mdt00_003: service thread pid 21702 was inactive for 40.059 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [ 4851.893891] Pid: 21702, comm: mdt00_003 3.10.0-957.27.2.el7_lustre.x86_64 #1 SMP Fri Nov 8 17:23:54 UTC 2019 [ 4851.895471] Call Trace: [ 4851.896004] [<ffffffffa1788ce8>] call_rwsem_down_read_failed+0x18/0x30 [ 4851.897198] [<ffffffffc0ae01de>] llog_cat_prep_log+0x33e/0x360 [obdclass] [ 4851.898438] [<ffffffffc0ae0279>] llog_cat_declare_add_rec+0x79/0x260 [obdclass] [ 4851.899728] [<ffffffffc0ad7178>] llog_declare_add+0x78/0x1a0 [obdclass] [ 4851.900974] [<ffffffffc0e85cde>] top_trans_start+0x17e/0x940 [ptlrpc] [ 4851.902438] [<ffffffffc12f32b4>] lod_trans_start+0x34/0x40 [lod] [ 4851.903651] [<ffffffffc13aef9a>] mdd_trans_start+0x1a/0x20 [mdd] [ 4851.904822] [<ffffffffc13932f2>] mdd_create+0xbe2/0x1630 [mdd] [ 4851.905840] [<ffffffffc121ce84>] mdt_create+0xb54/0x10e0 [mdt] [ 4851.906916] [<ffffffffc121d57b>] mdt_reint_create+0x16b/0x360 [mdt] [ 4851.908040] [<ffffffffc1222ab3>] mdt_reint_rec+0x83/0x210 [mdt] [ 4851.909115] [<ffffffffc11fc9e0>] mdt_reint_internal+0x7b0/0xba0 [mdt] [ 4851.910272] [<ffffffffc12086d7>] mdt_reint+0x67/0x140 [mdt] [ 4851.911292] [<ffffffffc0e6ffda>] tgt_request_handle+0x97a/0x1620 [ptlrpc] [ 4851.912527] [<ffffffffc0e16ab6>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc] [ 4851.913969] [<ffffffffc0e1a5ec>] ptlrpc_main+0xbac/0x1540 [ptlrpc] [ 4851.915083] [<ffffffffa14c2e81>] kthread+0xd1/0xe0 [ 4851.915998] [<ffffffffa1b77c37>] ret_from_fork_nospec_end+0x0/0x39 VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
| Comments |