Lustre / LU-1092

NULL pointer dereference in filter_export_stats_init()

    Description

      We had three occurrences of this crash on our classified 2.1 Lustre cluster, all on OSS nodes.

      BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
      IP: [<ffffffffa0a8e061>] filter_export_stats_init+0x1f1/0x500 [obdfilter]

      machine_kexec
      crash_kexec
      oops_end
      no_context
      __bad_area_nosemaphore
      bad_area_nosemaphore
      __do_page_fault
      do_page_fault
      page_fault
      [exception RIP: filter_export_stats_init+497]
      filter_reconnect
      target_handle_connect
      ost_handle
      ptlrpc_main
      kernel_thread
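
The oops line and the exception RIP frame above describe the same location in two notations: `+0x1f1/0x500` is the hexadecimal offset into the function and the function's size, while `+497` is the same offset in decimal. The following is a minimal sketch, assuming only the line format shown in this report, that decodes those numbers. The helper name `parse_oops_ip` is hypothetical and not part of any Lustre or kernel tooling. A fault address as small as 0x38 is usually consistent with reading a structure member through a NULL base pointer, but the report itself does not say which structure or pointer is involved.

```python
import re

def parse_oops_ip(line):
    """Split a kernel oops 'IP:' line into its numeric parts.

    Hypothetical helper for illustration; it assumes the exact layout
    '[<address>] symbol+0xOFFSET/0xSIZE [module]' seen in this report.
    """
    m = re.search(
        r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?:\s+\[(\w+)\])?",
        line,
    )
    if m is None:
        raise ValueError("not an oops IP line")
    addr, symbol, offset, size, module = m.groups()
    return {
        "address": int(addr, 16),   # faulting instruction address
        "symbol": symbol,           # function containing the fault
        "offset": int(offset, 16),  # bytes from the start of the function
        "size": int(size, 16),      # total size of the function in bytes
        "module": module,           # kernel module, if present
    }

ip = parse_oops_ip(
    "IP: [<ffffffffa0a8e061>] filter_export_stats_init+0x1f1/0x500 [obdfilter]"
)

# Start address of the function as loaded on the crashed node.
function_start = ip["address"] - ip["offset"]

# Address the kernel failed to read, taken from the BUG line.
fault_address = 0x38

print(ip["symbol"], ip["offset"], hex(function_start), hex(fault_address))
```

With a matching debuginfo build of the obdfilter module, the symbol and offset can be resolved to a source line in a debugger; that step depends on the exact build and is not shown here.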

      The timeframe coincided with the ASSERT reported in LU-1085. As with the other bugs we hit during that window, this crash was preceded by hundreds of messages like this:

      LustreError: 14210:0:(genops.c:1270:class_disconnect_stale_exports()) ls5-OST0349: disconnect stale client [UUID]@<unknown>

      Oleg has suggested that the patch for LU-106 may help here, and we have pulled it into our branch but haven't pushed it out yet.

            People

              laisiyao Lai Siyao
              nedbass Ned Bass