Lustre / LU-14095

Multiple tests crash with "ASSERTION( rsi->h.cache_list.next == ((void *)0) ) failed"

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.14.0
    • Affects Version/s: Lustre 2.14.0
    • Labels: None
    • Environment: RHEL8.2 DNE/SSK
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      A variety of tests are crashing with the following assertion failure (LBUG):

      [ 4493.310724] LustreError: 13384:0:(gss_svc_upcall.c:236:rsi_put()) ASSERTION( rsi->h.cache_list.next == ((void *)0) ) failed: 
      [ 4493.312757] LustreError: 13384:0:(gss_svc_upcall.c:236:rsi_put()) LBUG
      [ 4493.314408] Pid: 13384, comm: kworker/0:0 4.18.0-193.6.3.el8_lustre.x86_64 #1 SMP Fri Sep 25 21:03:21 UTC 2020
      [ 4493.316084] Call Trace TBD:
      [ 4493.316572] Kernel panic - not syncing: LBUG
      [ 4493.317295] CPU: 0 PID: 13384 Comm: kworker/0:0 Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-193.6.3.el8_lustre.x86_64 #1
      [ 4493.319387] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 4493.320460] Workqueue: events_power_efficient do_cache_clean [sunrpc]
      [ 4493.321533] Call Trace:
      [ 4493.322057]  dump_stack+0x5c/0x80
      [ 4493.322652]  panic+0xe7/0x2a9
      [ 4493.323246]  lbug_with_loc.cold.10+0x18/0x18 [libcfs]
      [ 4493.324166]  rsi_put+0x10f/0x140 [ptlrpc_gss]
      [ 4493.324920]  cache_clean+0x2a4/0x2e0 [sunrpc]
      [ 4493.325691]  do_cache_clean+0xa/0x60 [sunrpc]
      [ 4493.326449]  process_one_work+0x1a7/0x3b0
      [ 4493.327136]  worker_thread+0x30/0x390
      [ 4493.327755]  ? create_worker+0x1a0/0x1a0
      [ 4493.328425]  kthread+0x112/0x130
      [ 4493.328983]  ? kthread_flush_work_fn+0x10/0x10
      [ 4493.329745]  ret_from_fork+0x35/0x40
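The failed assertion in rsi_put() tests that the entry's `cache_list.next` pointer is NULL when the sunrpc cache cleaner drops it. As an illustration only, and not a diagnosis of this ticket, the user-space C sketch below models that condition on a simplified hash-list node. It assumes `cache_list` is an hlist-style node; all `model_*` names are hypothetical, and the kernel's real hlist helpers additionally handle pointer poisoning and memory ordering.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct hlist_node (assumption:
 * same two-pointer layout, no poisoning, no RCU). */
struct hlist_node {
    struct hlist_node *next;
    struct hlist_node **pprev;
};

struct hlist_head {
    struct hlist_node *first;
};

/* Insert n at the head of bucket h. */
static void model_hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink n and reset both pointers, in the style of hlist_del_init(). */
static void model_hlist_del_init(struct hlist_node *n)
{
    if (!n->pprev)
        return;                 /* already unlinked */
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}

/* Unlink n but leave n->next untouched, in the style of hlist_del_rcu(),
 * which keeps the forward pointer valid for concurrent readers. */
static void model_hlist_del_keep_next(struct hlist_node *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->pprev = NULL;            /* the kernel stores a poison value here */
}

/* The condition tested by the assertion text in the log above. */
static int model_assert_condition(const struct hlist_node *n)
{
    return n->next == NULL;
}
```

In this model the condition holds after `model_hlist_del_init()`, but not after `model_hlist_del_keep_next()` when the removed node had a successor in its bucket. Whether that is what happens on the RHEL8.2 kernel is not established by the log itself.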
      

      So far, this has been seen only on RHEL8.2 with the security test groups dne-ssk and dne-selinux-ssk. It first appeared on 29 OCT 2020 with 2.13.56.46, in
      sanity-sec test_16 https://testing.whamcloud.com/test_sets/b52da834-abb7-4080-9469-9bad89885f38

      Other tests failing with the same assertion:
      recovery-small test_4 https://testing.whamcloud.com/test_sets/12054d95-b53a-4383-9fed-73e454728408
      recovery-small test_10e https://testing.whamcloud.com/test_sets/67e2fa12-1d98-40ba-a7a0-4ce181b1f6d4
      sanity-sec test_0 https://testing.whamcloud.com/test_sets/2d07b4f8-e891-4116-8c13-a2f1ff2370ae
      sanity-sec test_17 https://testing.whamcloud.com/test_sets/c0ebf4e3-e06d-4773-9cdb-c785e65b8e29

            People

              Assignee: Sebastien Buisson (sebastien)
              Reporter: James Nunez (jamesanunez, Inactive)