[LU-15091] Trying to start OBD ls3-MDT0000_UUID using the wrong disk ls30000_UUID. Were the /dev/ assignments rearranged
| Created: | 12/Oct/21 | Updated: | 18/Oct/21 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Olaf Faaland | Assignee: | Yang Sheng |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | llnl |
| Environment: | zfs-2.1.0_1llnl |
| Severity: | 3 |
| Description |
|
After renaming a file system and updating NIDs on the targets, MDT0000 fails to mount with the following error:

    LustreError: 157-3: Trying to start OBD ls3-MDT0000_UUID using the wrong disk ls30000_UUID. Were the /dev/ assignments rearranged?

Note that lsd->lsd_uuid is missing "-MDT" between the fs name ("ls3") and the MDT index ("0000").

The rename was probably accomplished with:

    tunefs.lustre --writeconf --fsname=ls3 --rename=lustre3 -v asp1/mdt1

And the NID update was probably accomplished with:

    tunefs.lustre --param=mgsnode=172.19.1.141@o2ib100:172.19.1.142@o2ib100 --param=failover.node=172.19.1.141@o2ib100:172.19.1.142@o2ib100 asp1/mdt1

Unfortunately I no longer have the output from those commands, and I'm not certain exactly when this occurred.

This occurred on only one of the 12 targets (4 MDTs, 8 OSTs). I don't know why this one was different.

I don't think this is enough information to find the root cause and fix it, but I am creating the issue in the hope that it prompts anyone else who sees this problem to document what led up to it. |
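To make the mismatch in the error message concrete, here is a minimal Python sketch that builds the target UUID the mount expected and checks whether a given string has that shape. The helper names (`expected_target_uuid`, `parse_target_uuid`) are hypothetical, and the pattern assumes the usual `<fsname>-<MDT|OST><4 hex digits>_UUID` layout seen in the error above. It is not taken from the Lustre source.

```python
import re

# Assumed shape of a target UUID, based on the "ls3-MDT0000_UUID" string
# in the error message: <fsname>-<MDT|OST><4-digit hex index>_UUID
TARGET_UUID = re.compile(
    r"^(?P<fsname>.+)-(?P<kind>MDT|OST)(?P<index>[0-9a-f]{4})_UUID$"
)


def expected_target_uuid(fsname, kind, index):
    """Build the UUID the server expects for a target (hypothetical helper)."""
    return f"{fsname}-{kind}{index:04x}_UUID"


def parse_target_uuid(uuid):
    """Return (fsname, kind, index), or None if the string is malformed."""
    m = TARGET_UUID.match(uuid)
    if m is None:
        return None
    return m["fsname"], m["kind"], int(m["index"], 16)


if __name__ == "__main__":
    expected = expected_target_uuid("ls3", "MDT", 0)
    found = "ls30000_UUID"  # value reported for lsd->lsd_uuid in this ticket
    print(expected, parse_target_uuid(expected))
    print(found, parse_target_uuid(found))  # malformed: "-MDT" is missing
```

The string reported for the disk does not parse, which is consistent with the observation that "-MDT" was dropped rather than the index or suffix being damaged.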
| Comments |
| Comment by Olaf Faaland [ 12/Oct/21 ] |
|
Peter, I didn't label this topllnl because of the insufficient information. |
| Comment by Olaf Faaland [ 13/Oct/21 ] |
|
I am not certain, but it seems as if the only problem was the file system name in last_recvd. I stopped all the targets, mounted the dataset as type zfs ("mount -t zfs asp1/mdt1 /mnt/foo"), used a hex editor to alter /mnt/foo/last_recvd and set the correct target name at offset 0 in the file, and umounted /mnt/foo. That allowed the mount to proceed. |
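For anyone who would rather script the workaround above than use a hex editor, the following is a rough Python sketch, not a tested procedure. It assumes the target name is a NUL-padded 40-byte field at offset 0 of last_recvd. The offset comes from the comment above, but the 40-byte length is my assumption and should be checked against struct lr_server_data in the Lustre source for the version in use. The function names are hypothetical, and the demo only touches a scratch file, never a real target.

```python
import os
import tempfile

# ASSUMPTION: the UUID field is 40 bytes, NUL-padded, at offset 0.
UUID_FIELD_LEN = 40


def read_uuid(path):
    """Read the NUL-terminated name stored at offset 0 of the file."""
    with open(path, "rb") as f:
        raw = f.read(UUID_FIELD_LEN)
    return raw.split(b"\0", 1)[0].decode("ascii")


def write_uuid(path, uuid):
    """Overwrite only the name field, leaving the rest of the file intact."""
    data = uuid.encode("ascii")
    if len(data) >= UUID_FIELD_LEN:
        raise ValueError("name does not fit in the assumed field length")
    with open(path, "r+b") as f:
        f.seek(0)
        f.write(data.ljust(UUID_FIELD_LEN, b"\0"))


def demo():
    """Exercise the helpers on a scratch file that mimics the bad header."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "last_recvd.copy")
        trailer = b"\xaa" * 24  # stand-in for the data after the name field
        with open(path, "wb") as f:
            f.write(b"ls30000_UUID".ljust(UUID_FIELD_LEN, b"\0") + trailer)
        before = read_uuid(path)
        write_uuid(path, "ls3-MDT0000_UUID")
        after = read_uuid(path)
        with open(path, "rb") as f:
            untouched = f.read()[UUID_FIELD_LEN:] == trailer
        return before, after, untouched
```

As in the manual procedure, this would only be run with all targets stopped and the dataset mounted as type zfs, and ideally against a backup copy of last_recvd first.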
| Comment by Peter Jones [ 13/Oct/21 ] |
|
Yang Sheng, any suggestions here? Peter |
| Comment by Olaf Faaland [ 18/Oct/21 ] |
|
For my reference, my local ticket is TOSS5317 |