[LU-8892] SSK - sanity test_27j FAIL: setstripe failed
Created: 02/Dec/16  Updated: 14/Dec/21  Resolved: 14/Dec/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.9.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Sarah Liu
Assignee: Jeremy Filizetti
Resolution: Cannot Reproduce
Votes: 1
Labels: None
Environment: lustre-b2_9/#21 2.9.0

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Running sanity with shared key (SSK) crypto enabled for client-to-server, inter-server, and MGS connections:
https://testing.hpdd.intel.com/test_sets/959702c4-bb0e-11e6-884d-5254006e85c2

== sanity test 27j: setstripe with bad stripe offset (should return error) ============================ 16:35:54 (1480638954)
 sanity test_27j: @@@@@@ FAIL: setstripe failed
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:4841:error()
  = /usr/lib64/lustre/tests/sanity.sh:1422:test_27j()
  = /usr/lib64/lustre/tests/test-framework.sh:5117:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:5156:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:5003:run_test()
  = /usr/lib64/lustre/tests/sanity.sh:1424:main()


 Comments   
Comment by Oleg Drokin [ 05/Dec/16 ]

Earlier in this same testrun we see:

[182925.453499] Lustre: DEBUG MARKER: == sanity test 17o: stat file with incompat LMA feature ============================================== 16:21:39 (1480638099)
[182939.135139] Lustre: 5243:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1480638105/real 1480638105]  req@ffff88002442e900 x1552551977797424/t0(0) o801->MGC10.2.2.59@tcp@10.2.2.59@tcp:26/25 lens 224/224 e 0 to 1 dl 1480638112 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
[182939.149932] LustreError: 5243:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
[182939.152803] Lustre: 5243:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff880078fee180(0->MGS) get expired: 1480638305(+193s)
[182939.159543] Lustre: 14687:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007bc85900: destroy ctx ffff880078fee180(0->MGS)
[182944.724628] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
[182945.855133] LustreError: 11-0: lustre-MDT0000-mdc-ffff880078944000: operation ldlm_enqueue to node 10.2.2.59@tcp failed: rc = -107
[182945.860461] Lustre: lustre-MDT0000-mdc-ffff880078944000: Connection to lustre-MDT0000 (at 10.2.2.59@tcp) was lost; in progress operations using this service will wait for recovery to complete
[182945.939833] Lustre: lustre-MDT0000-mdc-ffff880078944000: Connection restored to 10.2.2.59@tcp (at 10.2.2.59@tcp)
[182946.195189] Lustre: 5264:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1480638113/real 1480638113]  req@ffff88002442e900 x1552551977797440/t0(0) o801->MGC10.2.2.59@tcp@10.2.2.59@tcp:26/25 lens 224/224 e 0 to 1 dl 1480638120 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
[182946.209679] LustreError: 5264:0:(gss_keyring.c:1418:gss_kt_update()) negotiation: rpc err -85, gss err 0
[182946.212506] Lustre: 5264:0:(sec_gss.c:316:cli_ctx_expire()) ctx ffff88007826b240(0->MGS) get expired: 1480638313(+193s)
[182946.218563] Lustre: 14687:0:(sec_gss.c:1226:gss_cli_ctx_fini_common()) gss.keyring@ffff88007bc85900: destroy ctx ffff88007826b240(0->MGS)
[182946.395724] Lustre: 5767:0:(sec_gss.c:378:gss_cli_ctx_uptodate()) client refreshed ctx ffff880025b313c0 idx 0xbd7bdcb06c054c4d (0->MGS), expiry 1481242910(+604790s)
[182946.401046] Lustre: 5767:0:(sec_gss.c:378:gss_cli_ctx_uptodate()) Skipped 3 previous similar messages
[182946.415206] Lustre: 5767:0:(gss_svc_upcall.c:857:gss_svc_upcall_install_rvs_ctx()) create reverse svc ctx ffff8800789c4a40 to MGS: idx 0x79614580b3b227d9
[182946.420478] Lustre: 5767:0:(gss_svc_upcall.c:857:gss_svc_upcall_install_rvs_ctx()) Skipped 3 previous similar messages
[182946.425902] LustreError: 166-1: MGC10.2.2.59@tcp: Connection to MGS (at 10.2.2.59@tcp) was lost; in progress operations using this service will fail
[182946.432359] Lustre: Evicted from MGS (at 10.2.2.59@tcp) after server handle changed from 0xf921f261b26792e4 to 0xf921f261b2679530

So I guess it is related?

Comment by Peter Jones [ 05/Dec/16 ]

Jeremy

Could you please advise on the seriousness of this issue?

Thanks

Peter
