Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-17228

sanity test_36i: expect 200 got 1698166494

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for S Buisson <sbuisson@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/092044e2-44a4-4fac-8108-9b95801c2180

      test_36i failed with the following error:

      expect 200 got 1698166494
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/99900 - 4.18.0-477.21.1.el8_8.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/99900 - 4.18.0-477.21.1.el8_lustre.x86_64

      Possibly because the file could not be stated:

      stat: cannot statx '/mnt/lustre/d36i.sanity/striped_dir': No such file or directory
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_36i - expect 200 got 1698166494

      Attachments

        Issue Links

          Activity

            [LU-17228] sanity test_36i: expect 200 got 1698166494

            Fixed with landing of LU-17317.

            adilger Andreas Dilger added a comment - Fixed with landing of LU-17317 .

            This issue is fixed with landing of patch "LU-17317 gss: do not continue using expired reverse context" https://review.whamcloud.com/c/fs/lustre-release/+/53375.
            The two failures reported above are for patches not sitting on top of master, hence lacking the fix.

            sebastien Sebastien Buisson added a comment - This issue is fixed with landing of patch " LU-17317 gss: do not continue using expired reverse context" https://review.whamcloud.com/c/fs/lustre-release/+/53375 . The two failures reported above are for patches not sitting on top of master, hence lacking the fix.
            adilger Andreas Dilger added a comment - +1 on master: https://testing.whamcloud.com/test_sets/6e5fa464-b884-4959-98fa-b086a97fdce8
            adilger Andreas Dilger added a comment - +1 on master: https://testing.whamcloud.com/test_sets/1e766618-c631-4e9c-b822-fcb1f538e578
            sebastien Sebastien Buisson added a comment - - edited

            Thanks Etienne. Looks similar to LU-17173 that I am currently investigating.

            sebastien Sebastien Buisson added a comment - - edited Thanks Etienne. Looks similar to LU-17173 that I am currently investigating.

            This test continues to fail on https://review.whamcloud.com/53202.
            It seems to be related to GSS:

            client dmesg:

            [ 2861.312457] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity test 36i: change mtime on striped directory ==== 10:22:05 \(1701080525\)
            [ 2861.534950] Lustre: DEBUG MARKER: == sanity test 36i: change mtime on striped directory ==== 10:22:05 (1701080525)
            [ 2861.606338] Lustre: 18765:0:(gss_svc_upcall.c:1620:gss_svc_upcall_get_ctx()) Invalid gss ctx idx 0x64bb119e1d581425 from 10.240.23.70@tcp
            [ 2861.936148] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_36i: @@@@@@ FAIL: expect 200 got 1701080525 
            [ 2862.151355] Lustre: DEBUG MARKER: sanity test_36i: @@@@@@ FAIL: expect 200 got 1701080525
            [ 2862.470230] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /autotest/autotest-2/2023-11-27/lustre-reviews_review-dne-selinux-ssk-part-1_100496_24_a3948d5e-bc2b-4bec-96bc-b5afaffeb1fb//sanity.test_36i.debug_log.$(hostname -s).1701080526.log;
                           		dmesg > /autotest/autotest-2/2023-11-27/lustre-revi
            

            MDS (2) logs:

            [ 2940.267050] Lustre: DEBUG MARKER: == sanity test 36i: change mtime on striped directory ==== 10:22:05 (1701080525)
            [ 2940.323819] Lustre: 11588:0:(sec_gss.c:686:gss_cli_ctx_handle_err_notify()) req x1783709610067456/t0, ctx 000000007b39aa0b idx 0x64bb119e1d581425(0->c): reverseserver respond (00080000/00000000)
            [ 2940.326903] LustreError: 11588:0:(client.c:1491:after_reply()) @@@ unwrap reply failed: rc = -22  req@0000000027c4ce16 x1783709610067456/t0(0) o104->lustre-MDT0001@10.240.23.66@tcp:15/16 lens 328/224 e 0 to 0 dl 1701080541 ref 1 fl Rpc:RQU/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295
            [ 2940.654427] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_36i: @@@@@@ FAIL: expect 200 got 1701080525 
            [ 2940.910097] Lustre: DEBUG MARKER: sanity test_36i: @@@@@@ FAIL: expect 200 got 1701080525
            [ 2941.195988] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /autotest/autotest-2/2023-11-27/lustre-reviews_review-dne-selinux-ssk-part-1_100496_24_a3948d5e-bc2b-4bec-96bc-b5afaffeb1fb//sanity.test_36i.debug_log.$(hostname -s).1701080526.log;
                           		dmesg > /autotest/autotest-2/2023-11-27/lustre-revi
            

            From what I understood, the error returned by server is: major: 00080000 = GSS_S_BAD_NAMETYPE

            I found the same traces on LU-17286.

            eaujames Etienne Aujames added a comment - This test continues to fail on https://review.whamcloud.com/53202 . It seems to be related to GSS: client dmesg: [ 2861.312457] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity test 36i: change mtime on striped directory ==== 10:22:05 \(1701080525\) [ 2861.534950] Lustre: DEBUG MARKER: == sanity test 36i: change mtime on striped directory ==== 10:22:05 (1701080525) [ 2861.606338] Lustre: 18765:0:(gss_svc_upcall.c:1620:gss_svc_upcall_get_ctx()) Invalid gss ctx idx 0x64bb119e1d581425 from 10.240.23.70@tcp [ 2861.936148] Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity test_36i: @@@@@@ FAIL: expect 200 got 1701080525 [ 2862.151355] Lustre: DEBUG MARKER: sanity test_36i: @@@@@@ FAIL: expect 200 got 1701080525 [ 2862.470230] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /autotest/autotest-2/2023-11-27/lustre-reviews_review-dne-selinux-ssk-part-1_100496_24_a3948d5e-bc2b-4bec-96bc-b5afaffeb1fb//sanity.test_36i.debug_log.$(hostname -s).1701080526.log; dmesg > /autotest/autotest-2/2023-11-27/lustre-revi MDS (2) logs: [ 2940.267050] Lustre: DEBUG MARKER: == sanity test 36i: change mtime on striped directory ==== 10:22:05 (1701080525) [ 2940.323819] Lustre: 11588:0:(sec_gss.c:686:gss_cli_ctx_handle_err_notify()) req x1783709610067456/t0, ctx 000000007b39aa0b idx 0x64bb119e1d581425(0->c): reverseserver respond (00080000/00000000) [ 2940.326903] LustreError: 11588:0:(client.c:1491:after_reply()) @@@ unwrap reply failed: rc = -22 req@0000000027c4ce16 x1783709610067456/t0(0) o104->lustre-MDT0001@10.240.23.66@tcp:15/16 lens 328/224 e 0 to 0 dl 1701080541 ref 1 fl Rpc:RQU/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 2940.654427] Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity test_36i: @@@@@@ FAIL: expect 200 got 1701080525 [ 2940.910097] Lustre: DEBUG MARKER: sanity test_36i: @@@@@@ FAIL: expect 200 got 1701080525 [ 2941.195988] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /autotest/autotest-2/2023-11-27/lustre-reviews_review-dne-selinux-ssk-part-1_100496_24_a3948d5e-bc2b-4bec-96bc-b5afaffeb1fb//sanity.test_36i.debug_log.$(hostname -s).1701080526.log; dmesg > /autotest/autotest-2/2023-11-27/lustre-revi From what I understood, the error returned by server is: major: 00080000 = GSS_S_BAD_NAMETYPE I found the same traces on LU-17286 .
            eaujames Etienne Aujames added a comment - +1 on master: https://testing.whamcloud.com/test_sets/54903a1b-c187-4eab-a26f-7d010e36f1c6

            People

              wc-triage WC Triage
              maloo Maloo
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: