Details
-
Bug
-
Resolution: Cannot Reproduce
-
Critical
-
Lustre 2.5.5
-
None
-
Cray clients running an unpatched Lustre 2.8 GA client. Servers running Lustre 2.5.5 with a patch set in a RHEL 6.7 environment.
-
3
Description
Today we performed a test shot on our smaller Cray Aries cluster (700 nodes) with an unpatched Lustre 2.8 GA client built specifically for this system. The tests were run against our atlas file system, which runs Lustre 2.5.5 with patches on a RHEL 6.7 distro. During the test shot, the MDS server ran out of memory while we were running an IOR single-shared-file test across all nodes with a stripe count of 1008. I attached the dmesg output to this ticket.
We attempted to reproduce this assertion on Tuesday using the same conditions as last time, but it never crashed. After that, we moved to a server with 19717 and 18060 to hopefully prevent it in the future. I think we can close this ticket and reopen if we see it again. Thanks.