Lustre / LU-12781

sanity test_272a crashes with SSK

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Lustre 2.13.0
    • Fix Version/s: Lustre 2.14.0
    • Severity: 3

      Description

      With the recent landing of patch "LU-12443 ptlrpc: fix reply buffers shrinking and growing" (https://review.whamcloud.com/35243), sanity test_272a crashes with SSK enabled.

      I used the following patches to trigger tests:
      https://review.whamcloud.com/36226
      https://review.whamcloud.com/36227

      Without SSK, test_272a does not crash. With SSK, test_272a crashes unless patch "LU-12443 ptlrpc: fix reply buffers shrinking and growing" is reverted.

      The crash is caused by a failed assertion in lustre_shrink_msg_v2():

      [  406.653680] Lustre: DEBUG MARKER: == sanity test 272a: DoM migration: new layout with the same DOM component =========================== 08:37:07 (1568795827)
      [  406.726294] format at mdt_io.c:215:mdt_rw_hpreq_check doesn't end in newline
      [  406.743661] format at mdt_io.c:215:mdt_rw_hpreq_check doesn't end in newline
      [  406.792396] LustreError: 15793:0:(pack_generic.c:454:lustre_shrink_msg_v2()) ASSERTION( msg->lm_buflens[segment] >= newlen ) failed: 
      [  406.793584] LustreError: 15793:0:(pack_generic.c:454:lustre_shrink_msg_v2()) LBUG
      [  406.794352] Pid: 15793, comm: mdt00_002 3.10.0-957.27.2.el7_lustre.x86_64 #1 SMP Thu Sep 12 03:53:14 UTC 2019
      [  406.795309] Call Trace:
      [  406.795600]  [<ffffffffc09188ac>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [  406.796459]  [<ffffffffc091895c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [  406.797125]  [<ffffffffc0e32c54>] lustre_shrink_msg+0x164/0x200 [ptlrpc]
      [  406.797912]  [<ffffffffc146e11e>] gss_svc_authorize+0x16e/0x5b0 [ptlrpc_gss]
      [  406.798676]  [<ffffffffc0e647c5>] sptlrpc_svc_wrap_reply+0x55/0x1d0 [ptlrpc]
      [  406.799455]  [<ffffffffc0e2eca8>] ptlrpc_send_reply+0x1e8/0x830 [ptlrpc]
      [  406.800340]  [<ffffffffc0ded6be>] target_send_reply_msg+0x8e/0x170 [ptlrpc]
      [  406.801092]  [<ffffffffc0df7d4e>] target_send_reply+0x30e/0x730 [ptlrpc]
      [  406.801847]  [<ffffffffc0e9d3d1>] tgt_request_handle+0x2f1/0x15c0 [ptlrpc]
      [  406.802620]  [<ffffffffc0e42516>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
      [  406.803501]  [<ffffffffc0e4604c>] ptlrpc_main+0xbac/0x1540 [ptlrpc]
      [  406.804193]  [<ffffffff954c2e81>] kthread+0xd1/0xe0
      [  406.804779]  [<ffffffff95b77c37>] ret_from_fork_nospec_end+0x0/0x39
      [  406.805484]  [<ffffffffffffffff>] 0xffffffffffffffff
      [  406.806058] Kernel panic - not syncing: LBUG
      [  406.806628] CPU: 1 PID: 15793 Comm: mdt00_002 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.x86_64 #1
      [  406.807770] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  406.808333] Call Trace:
      [  406.808604]  [<ffffffff95b65147>] dump_stack+0x19/0x1b
      [  406.809120]  [<ffffffff95b5e850>] panic+0xe8/0x21f
      [  406.809595]  [<ffffffffc09189ab>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [  406.810219]  [<ffffffffc0e32c54>] lustre_shrink_msg+0x164/0x200 [ptlrpc]
      [  406.810867]  [<ffffffffc146e11e>] gss_svc_authorize+0x16e/0x5b0 [ptlrpc_gss]
      [  406.811570]  [<ffffffffc0e647c5>] sptlrpc_svc_wrap_reply+0x55/0x1d0 [ptlrpc]
      [  406.812272]  [<ffffffffc0e2eca8>] ptlrpc_send_reply+0x1e8/0x830 [ptlrpc]
      [  406.812946]  [<ffffffffc0ded6be>] target_send_reply_msg+0x8e/0x170 [ptlrpc]
      [  406.813633]  [<ffffffffc0df7d4e>] target_send_reply+0x30e/0x730 [ptlrpc]
      [  406.814305]  [<ffffffffc0e362d7>] ? lustre_msg_set_last_committed+0x27/0xa0 [ptlrpc]
      [  406.815083]  [<ffffffffc0e9d3d1>] tgt_request_handle+0x2f1/0x15c0 [ptlrpc]
      [  406.815752]  [<ffffffffc0a60f3e>] ? libcfs_nid2str_r+0xfe/0x130 [lnet]
      [  406.816412]  [<ffffffffc0e42516>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
      [  406.817157]  [<ffffffff954cfeb4>] ? __wake_up+0x44/0x50
      [  406.817689]  [<ffffffffc0e4604c>] ptlrpc_main+0xbac/0x1540 [ptlrpc]
      [  406.818302]  [<ffffffff954d1ad0>] ? finish_task_switch+0x50/0x1c0
      [  406.818914]  [<ffffffffc0e454a0>] ? ptlrpc_register_service+0xf90/0xf90 [ptlrpc]
      [  406.819620]  [<ffffffff954c2e81>] kthread+0xd1/0xe0
      [  406.820102]  [<ffffffff954c2db0>] ? insert_kthread_work+0x40/0x40
      [  406.820685]  [<ffffffff95b77c37>] ret_from_fork_nospec_begin+0x21/0x21
      [  406.821312]  [<ffffffff954c2db0>] ? insert_kthread_work+0x40/0x40
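
      The assertion at pack_generic.c:454 requires that a segment is never "shrunk" to a length larger than it currently has. The following is a minimal sketch of that invariant only, assuming a simplified message layout. The names struct sketch_msg and sketch_shrink_segment() are hypothetical and are not Lustre code; the sketch returns an error where the real code asserts.

```c
#include <stdint.h>

#define SKETCH_MAX_SEGMENTS 8

/* Hypothetical, simplified stand-in for a multi-segment message header. */
struct sketch_msg {
	uint32_t bufcount;                      /* number of valid segments */
	uint32_t buflens[SKETCH_MAX_SEGMENTS];  /* length of each segment   */
};

/*
 * Shrink one segment to newlen bytes.
 *
 * Returns 0 on success, or -1 if the segment index is out of range or
 * if newlen is larger than the current segment length. The second case
 * is the condition that the assertion in the log above guards against.
 */
static int sketch_shrink_segment(struct sketch_msg *msg, uint32_t segment,
				 uint32_t newlen)
{
	if (segment >= msg->bufcount)
		return -1;

	/* A "shrink" must never grow the segment. */
	if (msg->buflens[segment] < newlen)
		return -1;

	msg->buflens[segment] = newlen;
	return 0;
}
```

      In this sketch, shrinking a 128-byte segment to 32 bytes succeeds, while a later attempt to "shrink" it to 64 bytes is rejected. The trace above suggests that the request from gss_svc_authorize() falls into the rejected case, although the report itself does not state which lengths were involved.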
      

            People

            • Assignee: tappro Mikhail Pershin
            • Reporter: sebastien Sebastien Buisson