[LU-10083] racer test_1: test failed to respond and timed out
Created: 05/Oct/17  Updated: 10/Oct/23  Resolved: 10/Sep/18

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: James Casper
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Environment:

trevis, full, x86_64 servers, ppc clients
servers: el7.4, ldiskfs, branch master, v2.10.53.1, b3642
clients: el7.4, branch master, v2.10.53.1, b3642


Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

https://testing.hpdd.intel.com/test_sessions/ba995751-659c-4e63-9b5b-fbf101137b78

About 15 processes became stuck (138 stuck-task traces in total). Most of the traces look similar to this one from the client dmesg:

[ 3827.855926] file_concat.sh  D 00003fffb2dad678     0 28316      1 0x00000082
[ 3827.855966] Call Trace:
[ 3827.855992] [c00000004885f5a0] [d000000004116129] __func__.50680+0x2c71/0xffffffffffffd220 [lustre] (unreliable)
[ 3827.856050] [c00000004885f770] [c000000000019634] .__switch_to+0x254/0x460
[ 3827.856092] [c00000004885f820] [c0000000009a9c1c] .__schedule+0x43c/0xb00
[ 3827.856131] [c00000004885f950] [c0000000009ab86c] .schedule_preempt_disabled+0x2c/0xa0
[ 3827.856177] [c00000004885f9c0] [c0000000009a7efc] .__mutex_lock_slowpath+0x10c/0x2c0
[ 3827.856222] [c00000004885fa90] [c0000000009a810c] .mutex_lock+0x5c/0x60
[ 3827.856262] [c00000004885fb10] [c0000000003367bc] .path_openat+0x9dc/0x1ad0
[ 3827.856301] [c00000004885fc50] [c00000000033b800] .do_filp_open+0x40/0xb0
[ 3827.856340] [c00000004885fd80] [c000000000319408] .SyS_open+0x128/0x270
[ 3827.856379] [c00000004885fe30] [c00000000000a184] system_call+0x38/0xb4
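
The process and trace counts above can be reproduced from a saved dmesg log with a short script. The following is a minimal sketch, not part of the original report: it assumes every stuck task is announced by a header line shaped like the file_concat.sh line above (timestamp, command name, state D, then pc, free stack, pid, ppid, and flags). The function name summarize_stuck_tasks is hypothetical.

```python
import re
from collections import Counter

# Assumed header layout, taken from the sample trace in this ticket:
#   [ 3827.855926] file_concat.sh  D 00003fffb2dad678     0 28316      1 0x00000082
#   [timestamp]    comm            D pc                free  pid   ppid  flags
TASK_HEADER = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<comm>.+?)\s+D\s+[0-9a-f]+"
    r"\s+\d+\s+(?P<pid>\d+)\s+\d+\s+0x[0-9a-f]+\s*$"
)


def summarize_stuck_tasks(dmesg_text):
    """Return (trace_count, unique_pid_count, unique_pids_per_command)."""
    trace_count = 0
    seen_pids = set()
    per_command = Counter()
    for line in dmesg_text.splitlines():
        match = TASK_HEADER.match(line)
        if match is None:
            continue  # stack frames, "Call Trace:" lines, unrelated messages
        trace_count += 1
        pid = int(match.group("pid"))
        if pid not in seen_pids:
            # Count each process once, even if it was dumped repeatedly.
            seen_pids.add(pid)
            per_command[match.group("comm")] += 1
    return trace_count, len(seen_pids), per_command
```

Calling summarize_stuck_tasks(open("dmesg.txt").read()) on a saved log (the file name is only an example) yields the total number of D-state traces, the number of distinct stuck processes, and a per-command breakdown of those processes.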


 Comments   
Comment by James Nunez (Inactive) [ 10/Sep/18 ]

Closing this issue since we haven't seen it recently.

Generated at Sat Feb 10 02:31:54 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.