Details
Bug
Resolution: Fixed
Major
None
Lustre 2.5.1
None
2
14002
Description
Hello,
We are seeing the following error messages on Lustre 2.5.1 clients, and they make the system unresponsive. Multiple clients were affected by this issue.
System Details: Lustre 2.5.1 / RHEL 6.5
Here are the node names, timestamps, and the corresponding messages:
May 4 11:03:28 uc1n055 kernel: LustreError: 1979:0:(statahead.c:1704:do_statahead_enter()) ASSERTION( lli->u.d.d_sai == ((void *)0) ) failed:
May 4 18:15:43 uc1n468 kernel: LustreError: 42888:0:(statahead.c:1704:do_statahead_enter()) ASSERTION( lli->u.d.d_sai == ((void *)0) ) failed:
May 4 18:54:19 uc1n059 kernel: LustreError: 111650:0:(lovsub_lock.c:103:lovsub_lock_state()) ASSERTION( cl_lock_is_mutexed(slice->cls_lock) ) failed:
May 9 09:21:08 uc1n129 kernel: LustreError: 93767:0:(statahead.c:1704:do_statahead_enter()) LBUG
May 10 09:28:14 uc1n996 kernel: LustreError: 7387:0:(osc_lock.c:1224:osc_lock_wait()) LBUG
May 15 07:50:57 uc1n198 kernel: LustreError: 25007:0:(statahead.c:1704:do_statahead_enter()) ASSERTION( lli->u.d.d_sai == ((void *)0) ) failed:
The 2.5.1 code has a glitch: kthread_run() returns a thread id, which can be a large value, and IS_ERR is defined over the range (-1000, -1), which is too narrow. The 2.5.3 code does not have this issue.
# define IS_ERR_VALUE(x) ((x) >= (unsigned long)-4095)
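To illustrate why the range matters, here is a minimal user-space sketch comparing the two checks. It is not the Lustre source: `is_err_narrow()` is a hypothetical stand-in for a check limited to the (-1000, -1) range mentioned above (treated as inclusive here, which is an assumption), and `is_err_wide()` wraps the `IS_ERR_VALUE` definition quoted above, with `MAX_ERRNO` assumed to be 4095.

```c
#include <stdio.h>

/* Assumed to match the 4095 in the definition quoted above. */
#define MAX_ERRNO 4095

/* Same shape as the quoted macro: any value in the top 4095 values of the
 * unsigned range, i.e. -4095..-1 when read as signed, is an error code. */
#define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

/* Hypothetical narrow check: only -1000..-1 counts as an error.
 * The inclusive bounds are an assumption made for this sketch. */
static int is_err_narrow(long v)
{
    return v >= -1000L && v <= -1L;
}

/* Wide check built on IS_ERR_VALUE. */
static int is_err_wide(long v)
{
    return IS_ERR_VALUE(v) ? 1 : 0;
}

/* Print how both checks classify one return value. */
static void report(long v)
{
    printf("%ld: narrow=%d wide=%d\n", v, is_err_narrow(v), is_err_wide(v));
}
```

Calling `report(-2000L)` shows the disagreement: the narrow check treats -2000 as a valid value, while the wide check classifies it as an error. Both checks agree on small codes such as -12 and on ordinary positive values.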