[LU-8548] MDS crash during DNE2 testing with Lustre 2.9 Created: 26/Aug/16 Updated: 26/Aug/16 Resolved: 26/Aug/16 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | James A Simmons | Assignee: | Lai Siyao |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Power8 clients running 2.8.56 and back end servers running the same. This OOM happened while running mdtest with the directory striped across 16 MDTs. |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
I attached the kern log that captured the OOM that happened while running mdtest with the directory striped across 16 MDTs. |
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 26/Aug/16 ] |
|
Hi Lai, Could you please help to investigate this issue? Thanks. |
| Comment by Lai Siyao [ 26/Aug/16 ] |
|
The backtrace shows it's a soft lockup in mdt_lock_root_xattr()->mdt_remote_object_lock(), which is introduced in BTW, do you think it happens with Power8 clients only? |
| Comment by James A Simmons [ 26/Aug/16 ] |
|
The server side was over a month old, so I updated it to the latest master. When I then attempted to reproduce this problem, it went away. So it looks like a version mismatch between pre-2.9 builds caused this issue.