[LU-13332] Can't start lhsmtool Created: 05/Mar/20 Updated: 05/Mar/20 Resolved: 05/Mar/20 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | None |
| Type: | Question/Request | Priority: | Minor |
| Reporter: | Mahmoud Hanafi | Assignee: | Ben Evans (Inactive) |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Not sure what I am doing wrong. Followed the documentation.

nbptest3-srv1 ~ # lctl get_param mdt.*.hsm_control
mdt.nbptest3-MDT0000.hsm_control=enabled

When I try to start hsm I get an error:

service198 ~ # lhsmtool_posix --hsm-root /mnt/pcc -v /nobackuptest3
lhsmtool_posix: 1583393927.046985 lhsmtool_posix[10762]: action=0 src=(null) dst=(null) mount_point=/nobackuptest3
lhsmtool_posix: cannot start copytool on '/nobackuptest3': No such device or address (6)
lhsmtool_posix: 1583393927.080040 lhsmtool_posix[10762]: cannot start copytool interface: No such device or address (6)
lhsmtool_posix: 1583393927.080134 lhsmtool_posix[10762]: process finished, errs: 0 major, 0 minor, rc=-6 (No such device or address)

service198 ~ # df
Filesystem                      1K-blocks     Used    Available Use% Mounted on
devtmpfs                         16300832        0     16300832   0% /dev
tmpfs                            16315784        4     16315780   1% /dev/shm
tmpfs                            16315784    14768     16301016   1% /run
/dev/sda32                      237959828  9766176    216099292   5% /
tmpfs                            16315784        0     16315784   0% /sys/fs/cgroup
/dev/sda11                        1176704    72300      1042964   7% /boot
10.151.27.53@o2ib:/nbptest3  311935815440    14440 296206997160   1% /nobackuptest3
/dev/nvme0n1                   1537235176 46143840   1412934264   4% /mnt/pcc
tmpfs                             3263160        0      3263160   0% /run/user/11312 |
| Comments |
| Comment by Peter Jones [ 05/Mar/20 ] |
|
Ben, could you please advise? Thanks, Peter |
| Comment by Ben Evans (Inactive) [ 05/Mar/20 ] |
|
Mahmoud, could you please run lfs df on the lhsmtool client? Could you also get /var/log/messages from the MDS and from the client where this is run? Just the last few lines from when you try to start the copytool should be sufficient. |
| Comment by Mahmoud Hanafi [ 05/Mar/20 ] |
|
Nothing on the mds
Mar 5 13:19:33 nbptest3-srv1 kernel: [146234.152375] Lustre: nbptest3-OST0000: Connection restored to 964be805-c593-4 (at 10.151.27.56@o2ib)
Mar 5 13:19:33 nbptest3-srv1 kernel: [146234.152378] Lustre: Skipped 1 previous similar message
Mar 5 13:20:01 nbptest3-srv1 systemd[1]: Created slice User Slice of root.
Mar 5 13:20:01 nbptest3-srv1 systemd[1]: Started Session 1443 of user root.
Mar 5 13:20:01 nbptest3-srv1 systemd[1]: Started Session 1444 of user root.
Mar 5 13:20:01 nbptest3-srv1 systemd[1]: Started Session 1445 of user root.
Mar 5 13:20:01 nbptest3-srv1 systemd[1]: Removed slice User Slice of root.
nbptest3-srv1 ~ #
Mar 5 13:19:14 service198 kernel: [21244.122717] LNet: Using FMR for registration
Mar 5 13:19:16 service198 kernel: [21246.847040] LNet: Added LNI 10.151.27.56@o2ib [32/125536/0/0]
Mar 5 13:19:33 service198 kernel: [21263.646119] Lustre: Mounted nbptest3-client
Mar 5 13:20:02 service198 systemd[1]: Created slice User Slice of root.
Mar 5 13:20:02 service198 systemd[1]: Started Session 216 of user root.
Mar 5 13:20:02 service198 systemd[1]: Started Session 214 of user root.
Mar 5 13:20:02 service198 systemd[1]: Started Session 215 of user root.
Mar 5 13:20:03 service198 systemd[1]: Removed slice User Slice of root.
Mar 5 13:20:28 service198 kernel: [21318.585281] LustreError: 33188:0:(lmv_obd.c:778:lmv_hsm_ct_register()) nbptest3-clilmv-ffffa0b793817000: iocontrol MDC nbptest3-MDT0001_UUID on MDT idx 1 cmd 401866d5: err = -6
Mar 5 13:20:28 service198 kernel: [21318.603000] VFS: Close: file count is 0
On the cmd line:

service198 ~ # lhsmtool_posix --daemon --hsm-root /mnt/pcc /nobackuptest3
lhsmtool_posix: 1583443391.886380 lhsmtool_posix[33322]: action=0 src=(null) dst=(null) mount_point=/nobackuptest3
service198 ~ # lhsmtool_posix: cannot start copytool on '/nobackuptest3': No such device or address (6)
lhsmtool_posix: 1583443391.920160 lhsmtool_posix[33323]: cannot start copytool interface: No such device or address (6)
lhsmtool_posix: 1583443391.920277 lhsmtool_posix[33323]: process finished, errs: 0 major, 0 minor, rc=-6 (No such device or address)

Mounted Filesystem:

Filesystem                      1K-blocks     Used    Available Use% Mounted on
devtmpfs                         16300832        0     16300832   0% /dev
tmpfs                            16315784        4     16315780   1% /dev/shm
tmpfs                            16315784    14672     16301112   1% /run
/dev/sda32                      237959828 13620136    212245332   7% /
tmpfs                            16315784        0     16315784   0% /sys/fs/cgroup
/dev/sda11                        1176704    72300      1042964   7% /boot
/dev/nvme0n1                   1537235176 46143840   1412934264   4% /mnt/pcc
10.151.27.53@o2ib:/nbptest3  311935815440    14440 296206997160   1% /nobackuptest3

service198 /nobackuptest3 # lctl dl
  0 UP mgc MGC10.151.27.53@o2ib eddb5d50-3b93-4 4
  1 UP lov nbptest3-clilov-ffffa0b793817000 a30aa14f-9ea3-4 3
  2 UP lmv nbptest3-clilmv-ffffa0b793817000 a30aa14f-9ea3-4 4
  3 UP mdc nbptest3-MDT0001-mdc-ffffa0b793817000 a30aa14f-9ea3-4 4
  4 UP mdc nbptest3-MDT0000-mdc-ffffa0b793817000 a30aa14f-9ea3-4 4
  5 UP osc nbptest3-OST0005-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
  6 UP osc nbptest3-OST0003-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
  7 UP osc nbptest3-OST0007-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
  8 UP osc nbptest3-OST0004-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
  9 UP osc nbptest3-OST0009-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
 10 UP osc nbptest3-OST0001-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
 11 UP osc nbptest3-OST0008-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
 12 UP osc nbptest3-OST0000-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
 13 UP osc nbptest3-OST0006-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
 14 UP osc nbptest3-OST0002-osc-ffffa0b793817000 a30aa14f-9ea3-4 4
|
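The client log above already names the target that refused the copytool registration: the lmv_hsm_ct_register() error reports MDC nbptest3-MDT0001_UUID with err = -6, the same errno 6 ("No such device or address") that lhsmtool_posix prints. As a rough sketch, a filter like the following could pull the target name and errno out of such lines. The function name hsm_register_failures is hypothetical, and the pattern is fitted only to the single log line shown in this ticket, so other Lustre versions may format the message differently.

```shell
# Hypothetical helper (not part of Lustre): read kernel log lines on stdin and
# print "<target> <errno>" for each lmv_hsm_ct_register() failure.
# The pattern is an assumption based on the one log line in this ticket.
hsm_register_failures() {
    sed -n 's/.*lmv_hsm_ct_register().*iocontrol MDC \([^ ]*\)_UUID on MDT idx [0-9]* .*err = -\([0-9]*\).*/\1 \2/p'
}

# Example use on the client (log path as mentioned in this ticket):
#   hsm_register_failures < /var/log/messages
```

Lines that do not match the pattern are silently dropped, so the filter can be run over the whole log.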
| Comment by Mahmoud Hanafi [ 05/Mar/20 ] |
|
We can close this; I figured out the issue. I forgot about the second MDT and didn't enable hsm_control on it. It's working now. |
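The check that was missed here can be sketched as a small filter over the output of the lctl get_param mdt.*.hsm_control command used in the description. The function name hsm_not_enabled is hypothetical. The sketch also assumes the command is run on every MDS, since in this ticket the first server listed only MDT0000.

```shell
# Hypothetical helper (not part of Lustre): read "param=value" lines, as
# printed by `lctl get_param mdt.*.hsm_control`, on stdin and print every
# parameter whose value is not "enabled". Blank lines are ignored.
hsm_not_enabled() {
    awk -F= 'NF && $2 != "enabled" { print $1 }'
}

# Example use, to be repeated on each MDS (assumption: each server reports
# only the MDTs it hosts):
#   lctl get_param mdt.*.hsm_control | hsm_not_enabled
# Any parameter printed could then be enabled with, for example:
#   lctl set_param mdt.nbptest3-MDT0001.hsm_control=enabled
```

An empty result on every MDS would mean all MDTs have the HSM coordinator enabled, which is the state the copytool needed here.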
| Comment by Ben Evans (Inactive) [ 05/Mar/20 ] |
|
Resolved by user |