[LU-12992] sanity test_240: failed to mount client Created: 21/Nov/19 Updated: 22/Oct/20 Resolved: 17/Mar/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0 |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Sebastien Buisson |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | SSK | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
This issue was created by maloo for liuying <emoly.liu@intel.com> Failure Rate: 3.08% of most recent 65 runs, 35 skipped (all branches), especially in group review-dne-ssk testing This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/3f7f577a-0b7d-11ea-9487-52540065bddc test_240 failed with the following error: Starting client: onyx-30vm1.onyx.whamcloud.com: -o user_xattr,flock onyx-30vm4@tcp:/lustre /mnt/lustre CMD: onyx-30vm1.onyx.whamcloud.com mkdir -p /mnt/lustre CMD: onyx-30vm1.onyx.whamcloud.com mount -t lustre -o user_xattr,flock onyx-30vm4@tcp:/lustre /mnt/lustre mount.lustre: mount onyx-30vm4@tcp:/lustre at /mnt/lustre failed: Operation not permitted sanity test_240: @@@@@@ FAIL: failed to mount client <<Please provide additional information about the failure here>> VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
| Comments |
| Comment by James Nunez (Inactive) [ 12/Dec/19 ] |
|
I’m seeing a similar issue for various sanity-sec tests For example, at https://testing.whamcloud.com/test_sets/203d27d0-1cb1-11ea-b1e8-52540065bddc, we see sanity-sec test 25 fail with, from the suite_log, trevis-40vm2: mount.lustre: mount trevis-40vm4@tcp:/lustre at /mnt/lustre failed: Operation not permitted sanity-sec test_25: @@@@@@ FAIL: unable to mount clients trevis-40vm1.trevis.whamcloud.com,trevis-40vm2 Since the client (vm2) isn’t able to mount, looking at the console log for that node, we see a couple of Lustre errors [23381.792880] LustreError: 18418:0:(gss_keyring.c:1411:gss_kt_update()) negotiation: rpc err 0, gss err 0 [23381.794690] LustreError: 18418:0:(gss_keyring.c:1411:gss_kt_update()) Skipped 1 previous similar message [23381.796257] Lustre: 18418:0:(sec_gss.c:315:cli_ctx_expire()) ctx ffff9dcc80055e00(0->lustre-MDT0000_UUID) get expired: 1576134447(+40s) [23381.799069] Lustre: 18418:0:(sec_gss.c:315:cli_ctx_expire()) Skipped 3 previous similar messages [23381.829815] Lustre: 18421:0:(sec_gss.c:377:gss_cli_ctx_uptodate()) client refreshed ctx ffff9dcce6b5d300 idx 0xf9e70ab0fc7179a7 (0->lustre-MDT0001_UUID), expiry 1576739197(+604790s) [23381.832536] Lustre: 18421:0:(sec_gss.c:377:gss_cli_ctx_uptodate()) Skipped 1 previous similar message [23381.834400] Lustre: 18421:0:(gss_svc_upcall.c:804:gss_svc_upcall_install_rvs_ctx()) create reverse svc ctx ffff9dcce7e1ce40 to lustre-MDT0001_UUID: idx 0xa48733bdf9b2ed6f [23381.836877] Lustre: 18421:0:(gss_svc_upcall.c:804:gss_svc_upcall_install_rvs_ctx()) Skipped 1 previous similar message [23386.829605] Lustre: 18430:0:(sec_gss.c:377:gss_cli_ctx_uptodate()) client refreshed ctx ffff9dccfaadf100 idx 0x6e9f4cddf09a985c (0->lustre-MDT0000_UUID), expiry 1576739202(+604790s) [23386.832351] Lustre: 18430:0:(sec_gss.c:377:gss_cli_ctx_uptodate()) Skipped 2 previous similar messages [23386.833903] Lustre: 18430:0:(gss_svc_upcall.c:804:gss_svc_upcall_install_rvs_ctx()) create reverse svc ctx ffff9dcc80169240 to lustre-MDT0000_UUID: idx 0xa48733bdf9b2ed72 [23386.836387] Lustre: 18430:0:(gss_svc_upcall.c:804:gss_svc_upcall_install_rvs_ctx()) Skipped 2 previous similar messages [23386.845207] LustreError: 18400:0:(lmv_obd.c:1261:lmv_statfs()) lustre-MDT0000-mdc-ffff9dcce7c96800: can't stat MDS #0: rc = -1 [23386.846471] Lustre: 14703:0:(sec_gss.c:1228:gss_cli_ctx_fini_common()) gss.keyring@ffff9dccf9e40c00: destroy ctx ffff9dcc80055e00(0->lustre-MDT0000_UUID) [23386.846473] Lustre: 14703:0:(sec_gss.c:1228:gss_cli_ctx_fini_common()) Skipped 11 previous similar messages [23386.861940] Lustre: 17601:0:(gss_cli_upcall.c:398:gss_do_ctx_fini_rpc()) client finishing forward ctx ffff9dccfaadf100 idx 0x6e9f4cddf09a985c (0->lustre-MDT0000_UUID) [23386.864376] Lustre: 17601:0:(gss_cli_upcall.c:398:gss_do_ctx_fini_rpc()) Skipped 11 previous similar messages [23386.893798] LustreError: 18400:0:(lov_obd.c:827:lov_cleanup()) lustre-clilov-ffff9dcce7c96800: lov tgt 0 not cleaned! deathrow=0, lovrc=1 [23386.902000] Lustre: Unmounted lustre-client [23386.903744] LustreError: 18400:0:(obd_mount.c:1672:lustre_fill_super()) Unable to mount (-1) [23387.123495] Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity-sec test_25: @@@@@@ FAIL: unable to mount clients trevis-40vm1.trevis.whamcloud.com,trevis-40vm2 In the console logs for the MDS1/3 (vm4), we see [ 5283.481582] Lustre: MGS: Connection restored to decbd92d-dc4c-4 (at 10.9.5.228@tcp) [ 5283.482930] Lustre: Skipped 1 previous similar message [ 5283.604715] Lustre: 6471:0:(sec_gss.c:2066:gss_svc_handle_init()) create svc ctx ffff9117dbd56640: user from 10.9.5.228@tcp authenticated as root [ 5283.606945] Lustre: 6471:0:(sec_gss.c:2066:gss_svc_handle_init()) Skipped 1 previous similar message [ 5283.638705] LustreError: 2455:0:(gss_svc_upcall.c:969:gss_svc_upcall_handle_init()) authentication failed [ 5283.640512] LustreError: 2455:0:(gss_svc_upcall.c:969:gss_svc_upcall_handle_init()) Skipped 3 previous similar messages [ 5283.642785] Lustre: 2455:0:(sec_gss.c:370:gss_cli_ctx_uptodate()) server installed reverse ctx ffff9117dc872100 idx 0xf84af856cc294b8, expiry 1576739209(+604800s) [ 5283.645641] Lustre: 2455:0:(sec_gss.c:370:gss_cli_ctx_uptodate()) Skipped 1 previous similar message [ 5288.660041] Lustre: 2454:0:(sec_gss.c:2066:gss_svc_handle_init()) create svc ctx ffff9117b645ae40: user from 10.9.5.229@tcp authenticated as root [ 5288.662394] Lustre: 2454:0:(sec_gss.c:2066:gss_svc_handle_init()) Skipped 2 previous similar messages [ 5288.688635] Lustre: 2454:0:(sec_gss.c:370:gss_cli_ctx_uptodate()) server installed reverse ctx ffff9117dca36600 idx 0xa48733bdf9b2ed72, expiry 1576739214(+604800s) [ 5288.691011] Lustre: 2454:0:(sec_gss.c:370:gss_cli_ctx_uptodate()) Skipped 2 previous similar messages [ 5288.692523] Lustre: lustre-MDT0000: Connection restored to 15155991-9df2-4 (at 10.9.5.229@tcp) [ 5288.693925] Lustre: Skipped 4 previous similar messages [ 5288.716302] Lustre: 2454:0:(sec_gss.c:2326:gss_svc_handle_destroy()) destroy svc ctx ffff9117b645ae40 idx 0x6e9f4cddf09a985c (0->10.9.5.229@tcp) [ 5288.718414] Lustre: 2454:0:(sec_gss.c:2326:gss_svc_handle_destroy()) Skipped 3 previous similar messages [ 5288.980369] Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity-sec test_25: @@@@@@ FAIL: unable to mount clients trevis-40vm1.trevis.whamcloud.com,trevis-40vm2 Similar failure for sanity-sec test 27a at https://testing.whamcloud.com/test_sets/74c35394-1bab-11ea-adca-52540065bddc . In the client (vm6) console log we see that gss_kt_update() rpc error is -85 and not 0 as above [23759.672542] LustreError: 3900:0:(gss_keyring.c:1411:gss_kt_update()) negotiation: rpc err -85, gss err 0 |
| Comment by Gerrit Updater [ 17/Dec/19 ] |
|
Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/37048 |
| Comment by Gerrit Updater [ 19/Dec/19 ] |
|
Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/37064 |
| Comment by Gerrit Updater [ 19/Dec/19 ] |
|
Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/37065 |
| Comment by Gerrit Updater [ 17/Mar/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37064/ |
| Comment by Peter Jones [ 17/Mar/20 ] |
|
Landed for 2.14 |
| Comment by Gerrit Updater [ 22/Oct/20 ] |
|
Jian Yu (yujian@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40364 |