[LU-10881] OST fails to mount after installing 2.10.3 Created: 04/Apr/18 Updated: 11/May/18 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Cliff White (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | soak | ||
| Environment: |
Soak cluster, lustre-b2_10-ib build 33 |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Downgraded cluster from 2.11 to 2.10.3. OSTs refuse to mount:
Apr 4 22:20:57 soak-2 sshd[5680]: pam_unix(sshd:session): session opened for user root by (uid=0)
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8804147f6780[0x0, 1, [0x1:0x0:0x0] hash exist]{
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8804147f67d0
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff880403614618osd-zfs-object@ffff880403614618
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff8804147f6780
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8800aeac7b00[0x0, 1, [0x200000003:0x0:0x0] hash exist]{
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8800aeac7b50
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff8803fed72970osd-zfs-object@ffff8803fed72970
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff8800aeac7b00
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8804147f7a40[0x0, 1, [0x200000003:0x2:0x0] hash exist]{
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8804147f7a90
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff880403614750osd-zfs-object@ffff880403614750
Apr 4 22:20:58 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff8804147f7a40
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff88040a271ec0[0x0, 1, [0xa:0x0:0x0] hash exist]{
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff88040a271f10
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff88041434c888osd-zfs-object@ffff88041434c888
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff88040a271ec0
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff88041722fa40[0x0, 1, [0xa:0x9:0x0] hash exist]{
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff88041722fa90
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff880035d05998osd-zfs-object@ffff880035d05998
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff88041722fa40
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(obd_config.c:558:class_setup()) setup soaked-OST0000 failed (-17)
Apr 4 22:20:59 soak-2 kernel: LustreError: 5814:0:(obd_config.c:1682:class_config_llog_handler()) MGC192.168.1.108@o2ib: cfg command failed: rc = -17
Apr 4 22:20:59 soak-2 kernel: Lustre: cmd=cf003 0:soaked-OST0000 1:dev 2:0 3:f
Apr 4 22:20:59 soak-2 kernel: LustreError: 15c-8: MGC192.168.1.108@o2ib: The configuration from log 'soaked-OST0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Apr 4 22:20:59 soak-2 kernel: LustreError: 5705:0:(obd_mount_server.c:1386:server_start_targets()) failed to start server soaked-OST0000: -17
Apr 4 22:20:59 soak-2 kernel: LustreError: 5705:0:(obd_mount_server.c:1879:server_fill_super()) Unable to start targets: -17
Will try re-formatting the fs |
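The -17 in the messages above is a negated errno value, following the usual Linux kernel convention for return codes. A minimal Python sketch for decoding such codes is shown here; the helper name `describe_rc` is hypothetical and not part of Lustre or its tools.

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Translate a negative kernel-style return code into errno name and message."""
    code = abs(rc)
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{rc}: {name} ({os.strerror(code)})"


# rc = -17 is the value reported by class_setup() in the log above.
print(describe_rc(-17))  # on Linux: -17: EEXIST (File exists)
```

EEXIST here only identifies the error class ("something already exists"); it does not by itself say which object or registration collided during setup of soaked-OST0000.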
| Comments |
| Comment by Andreas Dilger [ 05/Apr/18 ] |
|
The origin of the "hash exist" message was not easy to find. This message is generated in two parts in lu_object_header_print(), and it appears that this is from lu_site_print->lu_site_obj_print->lu_object_print(). This is printed from ofd_stack_fini() during cleanup if ls_obj_hash() is not empty (apparently because local_oid_storage_fini() did not clean up properly), but that doesn't appear to be the reason why the startup failed. Are there earlier messages in the logs that indicate why the mount failed? |
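To see at a glance which objects were still in the site hash at cleanup, the header lines can be reduced to a list of FIDs. The following is a rough sketch only: the regular expression is derived from the log lines quoted in this ticket rather than from the Lustre source, the group names `flags` and `ref` are assumed labels for the first two bracketed fields, and `leftover_fids` is a hypothetical helper.

```python
import re

# Pattern inferred from lines such as:
#   header@ffff8804147f6780[0x0, 1, [0x1:0x0:0x0] hash exist]{
# The closing "} header@..." lines carry no bracketed fields and do not match.
HEADER_RE = re.compile(
    r"header@(?P<addr>[0-9a-f]+)"
    r"\[(?P<flags>0x[0-9a-f]+), (?P<ref>\d+), "
    r"\[(?P<seq>0x[0-9a-f]+):(?P<oid>0x[0-9a-f]+):(?P<ver>0x[0-9a-f]+)\]"
    r" hash exist\]"
)


def leftover_fids(log_text):
    """Return unique (seq, oid, ver) tuples in order of first appearance."""
    seen = []
    for match in HEADER_RE.finditer(log_text):
        fid = tuple(int(match[part], 16) for part in ("seq", "oid", "ver"))
        if fid not in seen:
            seen.append(fid)
    return seen


sample = """\
LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8804147f6780[0x0, 1, [0x1:0x0:0x0] hash exist]{
LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff8804147f6780
LustreError: 5814:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8800aeac7b00[0x0, 1, [0x200000003:0x0:0x0] hash exist]{
"""

for seq, oid, ver in leftover_fids(sample):
    print(f"[{seq:#x}:{oid:#x}:{ver:#x}]")
```

Run against the full logs in this ticket, such a listing makes it easier to compare which objects were left behind in the April and May occurrences.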
| Comment by Cliff White (Inactive) [ 05/Apr/18 ] |
|
Unfortunately, no. We dumped the lctl log after one failure; the file is attached. The system has been re-formatted, which seems to have removed the problem. |
| Comment by Cliff White (Inactive) [ 11/May/18 ] |
|
Hit this again when downgrading to the tip of 2.10. Will leave the system in this state in case further information is desired. |
| Comment by Cliff White (Inactive) [ 11/May/18 ] |
|
Console log. Mount attempt was made after reboot.
[ 102.088947] Lustre: Lustre: Build Version: 2.10.3_132_g6910400 header@ffff8804046e0d80
[ 124.499730] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8801797dccc0[0x0, 1, [0x200000003:0x0:0x0] hash exist] {
[ 124.514673] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8801797dcd10
[ 124.526505] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff8803ff7f1728osd-zfs-object@ffff8803ff7f1728
[ 124.540748] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) }header@ffff8801797dccc0
[ 124.551682] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8804046e0540[0x0, 1, [0x200000003:0x2:0x0] hash exist] {
[ 124.566619] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8804046e0590
[ 124.578435] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff8800ae6a4af8osd-zfs-object@ffff8800ae6a4af8
[ 124.592668] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) }header@ffff8804046e0540
[ 124.603600] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8804046e0840[0x0, 1, [0xa:0x0:0x0] hash exist] {
[ 124.617728] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8804046e0890
[ 124.629531] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff8800ae6a55f0osd-zfs-object@ffff8800ae6a55f0
[ 124.643755] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) }header@ffff8804046e0840
[ 124.654672] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8800af39af00[0x0, 1, [0xa:0x5:0x0] hash exist] {
[ 124.668796] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8800af39af50
[ 124.681882] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-zfs@ffff88040c464138osd-zfs-object@ffff88040c464138
[ 124.698719] LustreError: 5487:0:(ofd_dev.c:251:ofd_stack_fini()) }header@ffff8800af39af00
[ 124.712451] LustreError: 5487:0:(obd_config.c:558:class_setup()) setup soaked-OST0000 failed (-17)
[ 124.759606] LustreError: 15c-8: MGC192.168.1.108@o2ib: The configuration from log 'soaked-OST0000' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information. |
| Comment by Cliff White (Inactive) [ 11/May/18 ] |
|
I've also reproduced the failure on soak-3, so it's definitely a code issue. Taking a stack dump and crash dump of soak-3 now; the crash dump will be available on Spirit.