Lustre / LU-16223

Setting debug_peer_on_timeout=1 can cause kernel NULL pointer deref

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor

    Description

      Setting debug_peer_on_timeout=1 on a client and then rebooting an LNet router produces the following kernel log on the client:

      ===

      [Fri Oct  7 15:50:28 2022] LNetError: 246:0:(o2iblnd_cb.c:3044:kiblnd_rejected()) 172.27.243.18@o2ib240 rejected: o2iblnd fatal error
      [Fri Oct  7 15:50:28 2022] Lustre: 3803:0:(client.c:2182:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1665150627/real 1665150627]  req@00000000e4b7f0d8 x1746036694498560/t0(0) o400->stor10-MDT0003-mdc-ffff8a3a1e709800@172.27.1.33@o2ib1:12/10 lens 224/224 e 0 to 1 dl 1665151070 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
      [Fri Oct  7 15:50:28 2022] BUG: kernel NULL pointer dereference, address: 000000000000003

      ===

      The client is running DDN Lustre 2.12.8-ddn9, but I suspect upstream is affected too.

          People

            Assignee: wc-triage WC Triage
            Reporter: ake_s Åke Sandgren