Lustre / LU-20132

sanity-lnet test cleanup: NULL deref in __wake_up_common via tcp_child_process on lnet reload

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical

      This issue was created by maloo for Marc Vef <mvef@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/0598b105-002b-4ab2-a6dd-39b194f4ce91

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/123835 - 4.18.0-553.111.1.el8_10.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/123835 - 4.18.0-553.111.1.el8_lustre.x86_64

      A kernel NULL pointer dereference occurred on the client about 0.8 s after the LNet/ksocklnd modules were reloaded between test sessions, and 1 ms after the new LNI was added.

      Timeline (dmesg seconds):
      
      13831.132 — LNet: Removed LNI 10.240.23.92@tcp1 (prior session teardown; lnet_acceptor_remove_socket: Interface eth0 not found)
      14093.834 — libcfs/ksocklnd/LNet modules reloaded
      14094.658 — LNet: Added LNI 10.240.23.92@tcp
      14094.659 — Oops
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      RIP: __wake_up_common+0x4c/0x190
      Call Trace: <IRQ>
        __wake_up_common_lock+0x81
        tcp_child_process+0x181
        tcp_v4_rcv+0xa47
        ip_local_deliver_finish → ip_rcv → napi_gro_receive → virtnet_poll
      Modules: ksocklnd ptlrpc obdclass lnet libcfs ... [last unloaded: libcfs]
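
The intervals in the timeline above can be double-checked with a few lines of Python. This is only a convenience sketch: the timestamps and labels are copied from this report, while the `events` list and the `deltas` helper are hypothetical names introduced for illustration and are not part of any Lustre or Maloo tooling.

```python
# dmesg timestamps (seconds) and labels copied from the timeline in this report.
events = [
    (13831.132, "Removed LNI 10.240.23.92@tcp1"),
    (14093.834, "libcfs/ksocklnd/LNet modules reloaded"),
    (14094.658, "Added LNI 10.240.23.92@tcp"),
    (14094.659, "Oops"),
]


def deltas(evts):
    """Return (label, seconds since the previous event) for every event after the first."""
    return [
        (label, round(ts - prev_ts, 3))
        for (prev_ts, _), (ts, label) in zip(evts, evts[1:])
    ]


# Gap between the module reload (index 1) and the Oops (index 3).
reload_to_oops = round(events[3][0] - events[1][0], 3)

if __name__ == "__main__":
    for label, dt in deltas(events):
        print(f"+{dt:.3f}s  {label}")
    print(f"reload -> Oops: {reload_to_oops:.3f}s")
```

The reload-to-Oops gap works out to 0.825 s, which matches the "0.8 s" in the summary, and the Oops follows the "Added LNI" message by 0.001 s.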
      

            hornc Chris Horn
            maloo Maloo