Lustre / LU-18284

interop sanity test_119e test_119f: UDIO files differ, bsize 1048575, 2.12 servers crash

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: Lustre 2.16.0, Lustre 2.12.9
    • Labels: None
    • Severity: 3

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite run with 2.12.9 servers, 2.15.91 master client:
      https://testing.whamcloud.com/test_sets/a2ba69c8-7050-435a-9042-b3b074d98626

      test_119e failed with the following error:

      files differ, bsize 1048575
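
      The block size in this error is one byte short of 1 MiB, so it is not a multiple of common page or sector sizes. That is consistent with the "UDIO" in the title, if it stands for unaligned direct I/O. A minimal arithmetic sketch follows; the 4096-byte page size and 512-byte sector size are assumptions for illustration and are not taken from the test logs:

```python
# Block size reported in the test_119e failure message.
bsize = 1048575

# Assumed alignment units (illustrative; not taken from the test logs).
PAGE_SIZE = 4096
SECTOR_SIZE = 512

one_mib = 1 << 20
short_of_mib = one_mib - bsize          # bytes missing from a full MiB
page_remainder = bsize % PAGE_SIZE      # non-zero means not page aligned
sector_remainder = bsize % SECTOR_SIZE  # non-zero means not sector aligned

print(f"bsize is {short_of_mib} byte(s) short of 1 MiB")
print(f"remainder modulo page size:   {page_remainder}")
print(f"remainder modulo sector size: {sector_remainder}")
```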
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-master/4544 - 5.14.0-284.30.1.el9_2.x86_64
      servers: https://build.whamcloud.com/job/lustre-b2_12/164 - 3.10.0-1160.49.1.el7_lustre.x86_64

      This test started failing on 2024-06-25 and is still failing.

      The failing run also triggers a list corruption warning on the server during test_119e:

      WARNING: CPU: 1 PID: 18212 at lib/list_debug.c:62 __list_del_entry+0x82/0xd0
      list_del corruption. next->prev should be ffff9516e8be88e8, but was f2756b16e8be88e8
      CPU: 1 PID: 18212 Comm: ll_ost_io00_015   3.10.0-1160.49.1.el7_lustre.x86_64 #1
       Call Trace:
       dump_stack+0x19/0x1b
       __warn+0xd8/0x100
       warn_slowpath_fmt+0x5f/0x80
       __list_del_entry+0x82/0xd0
       ptlrpc_server_hpreq_fini+0x70/0x170 [ptlrpc]
       ptlrpc_server_finish_active_request+0x8a/0x140 [ptlrpc]
       ptlrpc_server_handle_request+0x401/0xab0 [ptlrpc]
       ptlrpc_main+0xb34/0x1470 [ptlrpc]
       kthread+0xd1/0xe0
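
      The two pointer values in the warning can be compared bit by bit. The sketch below only does arithmetic on the values quoted in the log. It shows that the low 40 bits match and only the upper three bytes differ, which is consistent with (but does not prove) the prev pointer being partly overwritten rather than pointing at a different list entry. Variable names are illustrative:

```python
# Values quoted in the list_del corruption warning.
expected = 0xFFFF9516E8BE88E8  # what next->prev should have been
found = 0xF2756B16E8BE88E8     # what next->prev actually contained

# XOR leaves a 1 in every bit position where the two values disagree.
diff = expected ^ found

low40_mask = (1 << 40) - 1
low_bits_match = (diff & low40_mask) == 0   # True if the low 40 bits agree
differing_bits = bin(diff).count("1")       # number of bit positions that differ

print(f"xor            = {diff:#018x}")
print(f"low 40 bits eq = {low_bits_match}")
print(f"differing bits = {differing_bits}")
```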
      

      The server then crashes in test_119f, again due to list corruption:

      general protection fault: 0000 [#1] SMP 
      CPU: 1 PID: 16242 Comm: ll_ost00_023  3.10.0-1160.49.1.el7_lustre.x86_64 #1
      Call Trace:
       ldlm_add_blocked_lock+0xb2/0xc0 [ptlrpc]
       ldlm_add_waiting_lock+0x1ac/0x300 [ptlrpc]
       ldlm_server_blocking_ast+0x66c/0xa40 [ptlrpc]
       tgt_blocking_ast+0x159/0x630 [ptlrpc]
       ldlm_work_bl_ast_lock+0x11c/0x300 [ptlrpc]
       ptlrpc_set_wait+0x72/0x790 [ptlrpc]
       ldlm_run_ast_work+0xd5/0x3a0 [ptlrpc]
       ldlm_handle_conflict_lock+0x70/0x290 [ptlrpc]
       ldlm_lock_enqueue+0x470/0x9b0 [ptlrpc]
       ldlm_cli_enqueue_local+0x367/0x830 [ptlrpc]
       ofd_destroy_by_fid+0x1d1/0x510 [ofd]
       ofd_destroy_hdl+0x257/0x9d0 [ofd]
       tgt_request_handle+0xada/0x1570 [ptlrpc]
       ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
       ptlrpc_main+0xb34/0x1470 [ptlrpc]
       kthread+0xd1/0xe0
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_119e - files differ, bsize 1048575

            People

              Assignee: Shaun Tancheff (stancheff)
              Reporter: Maloo (maloo)