[LU-16143] replay-single test_67b: FAIL: AT should have prevented reconnect Created: 08/Sep/22  Updated: 17/Jul/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for jianyu <yujian@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/3f70d951-7829-4baa-9f0e-32d12915ae9b

test_67b failed with the following error:

AT should have prevented reconnect
[ 5596.496426] Lustre: ll_ost00_014: service thread pid 66111 was inactive for 42.971 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[ 5596.496433] Pid: 137452, comm: ll_ost00_051 4.18.0-348.2.1.el8_lustre.x86_64 #1 SMP Mon Aug 22 10:33:11 UTC 2022
[ 5596.499424] Lustre: Skipped 1 previous similar message
[ 5596.501026] Call Trace TBD:
[ 5596.501223] [<0>] do_get_write_access+0x2d8/0x430 [jbd2]
[ 5596.503228] [<0>] jbd2_journal_get_write_access+0x37/0x50 [jbd2]
[ 5596.504310] [<0>] __ldiskfs_journal_get_write_access+0x36/0x70 [ldiskfs]
[ 5596.505469] [<0>] osd_write+0x352/0xc30 [osd_ldiskfs]
[ 5596.506524] [<0>] dt_record_write+0x32/0x110 [obdclass]
[ 5596.507784] [<0>] tgt_client_data_update+0x513/0x6c0 [ptlrpc]
[ 5596.508771] [<0>] tgt_client_del+0x3a5/0x710 [ptlrpc]
[ 5596.509651] [<0>] ofd_obd_disconnect+0x1f4/0x210 [ofd]
[ 5596.510525] [<0>] target_handle_disconnect+0x227/0x4f0 [ptlrpc]
[ 5596.511517] [<0>] tgt_disconnect+0x4a/0x190 [ptlrpc]
[ 5596.512369] [<0>] tgt_request_handle+0xc93/0x1a40 [ptlrpc]
[ 5596.513289] [<0>] ptlrpc_server_handle_request+0x323/0xbd0 [ptlrpc]
[ 5596.514330] [<0>] ptlrpc_main+0xc06/0x1560 [ptlrpc]
[ 5596.515168] [<0>] kthread+0x116/0x130
[ 5596.515804] [<0>] ret_from_fork+0x35/0x40
[ 5596.516481] Pid: 66111, comm: ll_ost00_014 4.18.0-348.2.1.el8_lustre.x86_64 #1 SMP Mon Aug 22 10:33:11 UTC 2022
[ 5596.518056] Call Trace TBD:
[ 5596.518564] [<0>] do_get_write_access+0x2d8/0x430 [jbd2]
[ 5596.519438] [<0>] jbd2_journal_get_write_access+0x37/0x50 [jbd2]
[ 5596.520426] [<0>] __ldiskfs_journal_get_write_access+0x36/0x70 [ldiskfs]
[ 5596.521517] [<0>] osd_write+0x352/0xc30 [osd_ldiskfs]
[ 5596.522375] [<0>] dt_record_write+0x32/0x110 [obdclass]
[ 5596.523283] [<0>] tgt_client_data_update+0x513/0x6c0 [ptlrpc]
[ 5596.524253] [<0>] tgt_client_del+0x3a5/0x710 [ptlrpc]
[ 5596.525100] [<0>] ofd_obd_disconnect+0x1f4/0x210 [ofd]
[ 5596.525972] [<0>] target_handle_disconnect+0x227/0x4f0 [ptlrpc]
[ 5596.526976] [<0>] tgt_disconnect+0x4a/0x190 [ptlrpc]
[ 5596.527832] [<0>] tgt_request_handle+0xc93/0x1a40 [ptlrpc]
[ 5596.528764] [<0>] ptlrpc_server_handle_request+0x323/0xbd0 [ptlrpc]
[ 5596.529819] [<0>] ptlrpc_main+0xc06/0x1560 [ptlrpc]
[ 5596.530636] [<0>] kthread+0x116/0x130
[ 5596.531261] [<0>] ret_from_fork+0x35/0x40
[ 5596.531942] Pid: 132133, comm: ll_ost00_032 4.18.0-348.2.1.el8_lustre.x86_64 #1 SMP Mon Aug 22 10:33:11 UTC 2022
[ 5596.531944] Lustre: ll_ost00_013: service thread pid 66110 was inactive for 43.007 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[ 5596.533518] Call Trace TBD:
[ 5596.536275] [<0>] do_get_write_access+0x2d8/0x430 [jbd2]
[ 5596.537159] [<0>] jbd2_journal_get_write_access+0x37/0x50 [jbd2]
[ 5596.538139] [<0>] __ldiskfs_journal_get_write_access+0x36/0x70 [ldiskfs]
[ 5596.539221] [<0>] osd_write+0x352/0xc30 [osd_ldiskfs]
[ 5596.540074] [<0>] dt_record_write+0x32/0x110 [obdclass]
[ 5596.540972] [<0>] tgt_client_data_update+0x513/0x6c0 [ptlrpc]
[ 5596.541946] [<0>] tgt_client_del+0x3a5/0x710 [ptlrpc]
[ 5596.542778] [<0>] ofd_obd_disconnect+0x1f4/0x210 [ofd]
[ 5596.543650] [<0>] target_handle_disconnect+0x227/0x4f0 [ptlrpc]
[ 5596.544649] [<0>] tgt_disconnect+0x4a/0x190 [ptlrpc]
[ 5596.545510] [<0>] tgt_request_handle+0xc93/0x1a40 [ptlrpc]
[ 5596.546445] [<0>] ptlrpc_server_handle_request+0x323/0xbd0 [ptlrpc]
[ 5596.547487] [<0>] ptlrpc_main+0xc06/0x1560 [ptlrpc]
[ 5596.548285] [<0>] kthread+0x116/0x130
[ 5596.548901] [<0>] ret_from_fork+0x35/0x40

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
replay-single test_67b - AT should have prevented reconnect



Comments
Comment by Nikitas Angelinas [ 19/Oct/22 ]

+1 on master: https://testing.whamcloud.com/test_sets/0b765dc8-ecd3-4dcb-bdcd-bac802d536d7
