Details
Bug
Resolution: Fixed
Major
None
Lustre 2.4.1
2
12546
Description
AWE has reported the following problem with Lustre.
Occasionally, some of the Lustre OSTs on sprig2 (one of the login nodes) are evicted but never reconnect, and the following error appears in the syslog:
Jan 14 08:54:13 sprig2 kernel: [680476.616831] LustreError: 809:0:(cl_lock.c:1420:cl_unuse_try()) result = -108, this is unlikely!
Extract of syslog:
Jan 14 08:55:19 sprig2 kernel: [680478.577953] LustreError: 829:0:(cl_lock.c:1435:cl_unuse_locked()) } lock@ffff882d00691eb8
Jan 14 08:55:19 sprig2 kernel: [680478.577958] LustreError: 829:0:(cl_lock.c:1435:cl_unuse_locked()) 3 0: —
Jan 14 08:55:19 sprig2 kernel: [680478.577963] LustreError: 829:0:(cl_lock.c:1435:cl_unuse_locked()) 4 0: —
Jan 14 08:55:19 sprig2 kernel: [680478.577972] LustreError: 829:0:(cl_lock.c:1435:cl_unuse_locked()) 5 0: lock@ffff8838357b34d8[0 5 0 0 0 00000000] R(1):[0, 18446744073709551615]@[0x100070000:0x41dd7e:0x0] {
Jan 14 08:55:19 sprig2 kernel: [680478.577982] LustreError: 829:0:(cl_lock.c:1435:cl_unuse_locked()) lovsub@ffff882cb4efc760: [5 ffff883bbf9a88e8 P(0):[0, 18446744073709551615]@[0x5000013f2:0x2872:0x0]]
Jan 14 08:55:19 sprig2 kernel: [680478.577992] LustreError: 829:0:(cl_lock.c:1435:cl_unuse_locked()) osc@ffff883cb5239d80: ffff883c470b7b40 0x20000041001 0x54d9d8518dc4ea31
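For triage, the quoted lines can be split into fields with a short script. The following is only a sketch: the regular expression is inferred from the lines quoted in this ticket, not from any Lustre specification, and the names `parse_lustre_error` and `LINUX_ERRNO_NAMES` are hypothetical. It assumes the standard Linux errno numbering, where 108 is ESHUTDOWN ("Cannot send after transport endpoint shutdown").

```python
import re

# Assumed layout, inferred from the syslog lines quoted in this ticket:
#   LustreError: <pid>:<n>:(<file>:<line>:<function>()) <message>
LUSTRE_ERROR = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)"
)

# Linux errno value seen in this ticket; extend as needed (assumption:
# the client runs Linux, where errno 108 is ESHUTDOWN).
LINUX_ERRNO_NAMES = {108: "ESHUTDOWN"}


def parse_lustre_error(line):
    """Return a dict of fields for a LustreError line, or None if no match."""
    match = LUSTRE_ERROR.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"])
    record["line"] = int(record["line"])
    # Pull out "result = -NNN" when the message carries one.
    result = re.search(r"result = (-?\d+)", record["msg"])
    record["result"] = int(result.group(1)) if result else None
    record["errno_name"] = None
    if record["result"] is not None and record["result"] < 0:
        record["errno_name"] = LINUX_ERRNO_NAMES.get(-record["result"])
    return record


if __name__ == "__main__":
    sample = (
        "Jan 14 08:54:13 sprig2 kernel: [680476.616831] LustreError: "
        "809:0:(cl_lock.c:1420:cl_unuse_try()) result = -108, this is unlikely!"
    )
    print(parse_lustre_error(sample))
```

Grouping the parsed records by `func` and `result` across a full syslog may help show whether every occurrence carries the same return code.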
This looks similar to the issue reported in https://jira.hpdd.intel.com/browse/LU-3889.