[LU-3914] Need sign off on Lustre RPM's and patches to include for ORNL 2.4 transition to Operations Created: 09/Sep/13  Updated: 23/Sep/13  Resolved: 23/Sep/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Minor
Reporter: Jason Hill (Inactive) Assignee: James Nunez (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Rank (Obsolete): 10322

 Description   

Hi James,

Soon we'll open a Jira ticket for a (very) quick discussion of the RPMs
that we'll use to put our new storage (Atlas) into production here at
ORNL. We have a new system that requires a 2.x filesystem (its client is
2.4 only). We'll chime in on what we've been building (including
patches/Jira tickets), and we need to get a sign-off that we're in good
shape to move into production. Any patches that Intel wants us to include
should also be mentioned so that we can evaluate them for our own
stability purposes.

This ticket will need pretty fast turn around so I wanted to make you
aware.



 Comments   
Comment by Peter Jones [ 09/Sep/13 ]

James will own providing the response on this. Jason, please provide the details of the patches you are evaluating at your convenience.

Comment by James A Simmons [ 09/Sep/13 ]

For the last set of RPMs I included only one patch, which was for LU-3497. That patch enables us to build ZFS without installing ZFS RPMs on the build machines. The only other patch I can think of is http://review.whamcloud.com/#/c/6960 for LU-2987, but that is more to support Cray SLES11 SP2 clients. Cray also saw issues with NFS that we might be interested in.

Comment by James Nunez (Inactive) [ 09/Sep/13 ]

I’d like to get a complete picture of what your system will be running, and then we can comment on whether you should apply patches and which ones. You stated that the clients will be 2.4. Is that client 2.4.1? Which server version will you be using: 2.4.1 or 2.5.0? You’ll be using ZFS and SLES 11 SP2, correct?

Is there a ticket open for the NFS issues that Cray saw?

Thanks,
James

Comment by Jason Hill (Inactive) [ 09/Sep/13 ]

We will have the Cray 1.8.6 client (~18k nodes), a mix of 1.8.8 and 1.8.9 Linux clients (~250 nodes), and the Cray 2.4.0 client (~750 nodes).

Comment by James Nunez (Inactive) [ 09/Sep/13 ]

2.4.1 is the recommended upgrade path. Is this your target release?

Comment by Jason Hill (Inactive) [ 10/Sep/13 ]

James,

When is the GA release date for 2.4.1 (or has it passed and I missed it)? We need to be in production next week with a small filesystem, and then will transition all of the hardware to production in early October. If you have something that Intel will bless/support now, then we'll run it. Otherwise we're going with 2.4.0 for now and will evaluate 2.4.1 as an upgrade when it's GA'd and we have some time to test it internally.


-Jason

Comment by James Nunez (Inactive) [ 10/Sep/13 ]

Jason,

Thank you for your timeline on transitioning to 2.4. Let's find out what patches are recommended with a 2.4.0 installation.

Comment by James Nunez (Inactive) [ 10/Sep/13 ]

Oleg,

Would you please comment on recommended patches to 2.4.0?

Thank you,
James

Comment by James A Simmons [ 10/Sep/13 ]

Just as a note we are running a Lustre 2.4 snapshot from the middle of August.

Comment by James Nunez (Inactive) [ 10/Sep/13 ]

2.4.1 is in release testing at the moment and, depending upon how the testing progresses, should be out soon.

We recommend that your initial 2.4 deployment be to the latest b2_4 branch. We plan to include the patch for LU-3497 in the 2.4.2 release and, thus, feel comfortable patching the latest b2_4 code with that patch.

We absolutely will support ORNL running on this code.

After the initial deployment and testing, we can see whether any further patches may be needed.

Comment by James Nunez (Inactive) [ 23/Sep/13 ]

ORNL said they received an adequate response. Closing ticket.
