[LU-873] IOR single shared file test fails Created: 22/Nov/11 Updated: 29/May/17 Resolved: 29/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.2.0, Lustre 1.8.7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Cliff White (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: | Hyperion/LLNL |
| Severity: | 3 |
| Rank (Obsolete): | 10220 |
| Description |
|
IOR fails when > 10 clients are run. 0289: ERROR in aiori-POSIX.c (line 256): transfer failed. |
| Comments |
| Comment by Peter Jones [ 22/Nov/11 ] |
|
Bobi, could you please look into this one? Thanks, Peter |
| Comment by Cliff White (Inactive) [ 23/Nov/11 ] |
|
This may be related/identical to |
| Comment by Jinshan Xiong (Inactive) [ 23/Nov/11 ] |
|
I took a look at the log. It looks like the client hyperion360 requested an RW lock with local cookie 0x389db5cc182f83cd from OST0000. However, the completion RPC was dropped (the client never received it), so the lock is granted on the server while the client keeps waiting for completion; the lock on the client will then never be revoked because a process is waiting for it. This is only a guess, because the log was truncated. Please also note that the completion AST is not resendable, and there are several bulk transfer errors on the console. I don't know whether the network ran into problems at that time. |
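The suspected sequence in the comment above can be illustrated with a toy model. This is a sketch only, not Lustre code: `LockState`, `LockView`, `enqueue`, and `is_stuck` are hypothetical names, and the model assumes a single completion message that is never resent, as the comment describes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class LockState(Enum):
    WAITING = "waiting"
    GRANTED = "granted"


@dataclass
class LockView:
    """One side's view (server or client) of a lock, keyed by its cookie."""
    cookie: int
    state: LockState = LockState.WAITING


def enqueue(cookie: int, completion_delivered: bool) -> Tuple[LockView, LockView]:
    """Return (server_view, client_view) after one enqueue attempt."""
    server = LockView(cookie)
    client = LockView(cookie)
    # The server grants the lock and sends a one-shot completion message.
    server.state = LockState.GRANTED
    if completion_delivered:
        client.state = LockState.GRANTED
    # Assumption: no resend. If the message was dropped, the client stays WAITING.
    return server, client


def is_stuck(server: LockView, client: LockView) -> bool:
    """True when the server considers the lock granted but the client still waits."""
    return server.state is LockState.GRANTED and client.state is LockState.WAITING


if __name__ == "__main__":
    server, client = enqueue(0x389DB5CC182F83CD, completion_delivered=False)
    print("stuck:", is_stuck(server, client))
```

In this model the mismatch persists indefinitely once the completion message is lost, which matches the hypothesis in the comment; whether that is what actually happened on Hyperion cannot be confirmed from the truncated log.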
| Comment by Christopher Morrone [ 01/Dec/11 ] |
|
LLNL's ticket is |
| Comment by Andreas Dilger [ 29/May/17 ] |
|
Close old ticket. |