[LU-5692] Lustre 2.5.3 client mounting Lustre 2.5.3 failed Created: 30/Sep/14 Updated: 25/Mar/16 Resolved: 25/Mar/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Haisong Cai (Inactive) | Assignee: | Jian Yu |
| Resolution: | Done | Votes: | 0 |
| Labels: | sdsc | ||
| Environment: |
Linux lustre-mds-8-0.local 2.6.32-431.23.3.el6_lustre.x86_64 #1 SMP Thu Aug 28 20:20:13 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 15934 |
| Description |
|
We have a Lustre 2.4.2 file-system. It was upgraded from 1.8.7 without reformatting the MDT/OSTs. Recently, we decided to upgrade it to 2.5.3. Prior to this upgrade, the file-system had clients running 2.4.2 and 2.5.3 at different times, all without problems. There are only 4 clients. Yesterday, we started upgrading the servers to 2.5.3, MDS first. A client running 2.4.2 hung for 15+ minutes. We then rebooted the client, and the mount still hung. We then tried to mount with two 2.5.3 clients, and both of them crashed. At this point, only the MDS had been upgraded to 2.5.3; the OSSs were still running 2.4.2. I am attaching MDS logs here. Please advise. |
| Comments |
| Comment by Peter Jones [ 01/Oct/14 ] |
|
Yu, Jian Could you please assist with this issue? Thanks Peter |
| Comment by Andreas Dilger [ 03/Oct/14 ] |
|
We do not test interoperability running MDS and OSS with different versions. Is there a particular reason you didn't upgrade the OSS at the same time? While that may not relate directly to your client problem, it introduces potential problems that could be easily avoided. |
| Comment by Haisong Cai (Inactive) [ 03/Oct/14 ] |
|
Hi Andreas, The reason was that we were doing a rolling upgrade. Eventually, we will upgrade every server to the same version of Lustre. Thanks, |
| Comment by Jian Yu [ 07/Oct/14 ] |
|
Hi Haisong, Could you please gather the vmcore crash dump file for the Lustre 2.5.3 client and upload it to "uploads/ |
| Comment by Jian Yu [ 07/Oct/14 ] |
|
I did an experiment on a small test cluster (2 Clients, 1 MGS/MDS, 1 OSS) with the following steps:
1. setup and start Lustre 1.8.8-wc1 filesystem
2. shutdown the entire Lustre 1.8.8-wc1 filesystem
3. clean upgrade all Lustre servers and clients at once to Lustre 2.4.2
4. start the entire Lustre 2.4.2 filesystem
5. run IOR and tar applications on the two live Lustre 2.4.2 Clients
6. rolling upgrade MGS/MDS to Lustre 2.5.3
7. rolling upgrade one Client to Lustre 2.5.3
8. rolling upgrade the other Client to Lustre 2.5.3
9. run IOR and tar applications on the two live Lustre 2.5.3 Clients
10. rolling upgrade OSS to Lustre 2.5.3
11. run IOR and Simul tests on the upgraded Lustre 2.5.3 filesystem
I tried to provision Lustre 1.8.7-wc1 servers but got kernel panic failure caused by isci module, so I switched to use Lustre 1.8.8-wc1. All of the above steps passed testing. |
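Editorial illustration (not part of the original comment): the rolling-upgrade order in steps 6 to 10 can be modeled with a small Python sketch that tracks which node runs which version and flags the window in which the MDS and OSS run different versions. The node names (`mds1`, `oss1`, `client1`, `client2`), the `Node` class, and the `rolling_upgrade` helper are hypothetical and exist only for this illustration; nothing here invokes Lustre tooling.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str     # hypothetical host name, for illustration only
    role: str     # "mds", "oss", or "client"
    version: str  # Lustre version string, e.g. "2.4.2"


def server_versions(nodes):
    """Return the set of versions currently running on server nodes."""
    return {n.version for n in nodes if n.role in ("mds", "oss")}


def rolling_upgrade(nodes, order, target):
    """Upgrade nodes one at a time in the given order.

    After each step, yield (node_name, mixed), where mixed is True
    while the servers are not all on the same version.
    """
    by_name = {n.name: n for n in nodes}
    for name in order:
        by_name[name].version = target
        yield name, len(server_versions(nodes)) > 1


# Test cluster from the comment: 2 clients, 1 MGS/MDS, 1 OSS, all on 2.4.2.
nodes = [
    Node("mds1", "mds", "2.4.2"),
    Node("client1", "client", "2.4.2"),
    Node("client2", "client", "2.4.2"),
    Node("oss1", "oss", "2.4.2"),
]

# Order of steps 6, 7, 8, and 10: MDS, then each client, then the OSS.
order = ["mds1", "client1", "client2", "oss1"]
steps = list(rolling_upgrade(nodes, order, "2.5.3"))

for name, mixed in steps:
    print(f"upgraded {name}: mixed server versions = {mixed}")
```

In this model the servers stay on mixed versions from the MDS upgrade until the OSS is upgraded, which is the same state the reporter's file-system was in when the clients failed to mount.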
| Comment by Haisong Cai (Inactive) [ 07/Oct/14 ] |
|
Thank you for the information. Our file-system was upgraded from 1.8.7. I guess we are out of luck for a rolling upgrade. By the way, we have reconfigured the failed file-system for another purpose, so we are unable to produce a vmcore dump for the time being. Haisong |
| Comment by John Fuchs-Chesney (Inactive) [ 25/Mar/16 ] |
|
Hello Haisong, We are marking this one as resolved/done. If you need any further work done on this ticket, please let us know and we can re-open it. Thanks, |