  1. Lustre
  2. LU-1092

NULL pointer dereference in filter_export_stats_init()


Details

    • 3
    • 4682

    Description

      We had three occurrences of this crash on our classified 2.1 Lustre cluster, all on OSS nodes.

      BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
      IP: [<ffffffffa0a8e061>] filter_export_stats_init+0x1f1/0x500 [obdfilter]

      machine_kexec
      crash_kexec
      oops_end
      no_context
      __bad_area_nosemaphore
      bad_area_nosemaphore
      __do_page_fault
      do_page_fault
      page_fault
      [exception RIP: filter_export_stats_init+497]
      filter_reconnect
      target_handle_connect
      ost_handle
      ptlrpc_main
      kernel_thread
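
      A note on reading the oops above: the faulting address 0000000000000038 is what one would expect when a member at byte offset 0x38 is read through a NULL structure pointer, and the +0x1f1 in the IP line is the same offset as the +497 in the exception RIP line. The C sketch below only illustrates that arithmetic. The structure and function names are hypothetical and are not the real obdfilter types; which pointer was actually NULL in filter_export_stats_init() is not established here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT a real Lustre structure: seven 8-byte
 * members ahead of the member of interest put it at offset 0x38. */
struct hypothetical_stats {
    uint64_t pad[7];        /* bytes 0x00..0x37 */
    void    *member_at_38;  /* byte 0x38 */
};

/* Address the CPU would touch when reading member_at_38 through a
 * structure pointer whose numeric value is `base`. */
static unsigned long long fault_address(unsigned long long base)
{
    return base + offsetof(struct hypothetical_stats, member_at_38);
}

/* Checks the two pieces of arithmetic described in the note. */
static void check_oops_decoding(void)
{
    /* A NULL base (0) plus the member offset gives the reported address. */
    assert(fault_address(0) == 0x38ULL);
    /* Hex offset in the IP line equals the decimal offset in the RIP line. */
    assert(0x1f1 == 497);
}
```

      Calling check_oops_decoding() passes silently when both relations hold. Confirming the real structure would require the disassembly of filter_export_stats_init() from the matching obdfilter module.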

      The timeframe coincided with the ASSERT reported in LU-1085. As with the other bugs we hit during that window, this crash was preceded by hundreds of messages like this:

      LustreError: 14210:0:(genops.c:1270:class_disconnect_stale_exports()) ls5-OST0349: disconnect stale client [UUID]@<unknown>

      Oleg has suggested that the patch for LU-106 may help here, and we have pulled it into our branch but haven't pushed it out yet.

            People

              laisiyao Lai Siyao
              nedbass Ned Bass (Inactive)
