Lustre / LU-16014

sanity test_27M: crash in lod_qos_prep_create()


Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.16.0
    • None
    • 3

    Description

      Noticed a recurring crash in boilpot that looks like this:

      Lustre: DEBUG MARKER: == sanity test 27M: test O_APPEND striping ====== 21:09:25 (1657760965)
      BUG: unable to handle kernel paging request at ffff8801466bccb0
      IP: [<ffffffffa13f68f6>] lod_qos_prep_create+0xe96/0x1ab0 [lod]
      Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      CPU: 3 PID: 2694 Comm: mdt01_002 Kdump: loaded  3.10.0-7.9-debug #2
      Hardware name: Red Hat KVM, BIOS 1.15.0-1.module_el8.6.0+1087+b42c8331 04/01/2014
      Call Trace:
       lod_prepare_create+0x23b/0x320 [lod]
       lod_declare_striped_create+0xf8/0xa50 [lod]
       lod_declare_create+0x1f5/0x600 [lod]
       mdd_declare_create_object_internal+0xd3/0x3b0 [mdd]
       mdd_declare_create_object.isra.35+0x51/0xb60 [mdd]
       mdd_declare_create+0x66/0x480 [mdd]
       mdd_create+0x9a9/0x1d30 [mdd]
       mdt_reint_open+0x2004/0x2c10 [mdt]
       mdt_reint_rec+0x87/0x240 [mdt]
       mdt_reint_internal+0x76c/0xb50 [mdt]
       mdt_intent_open+0x93/0x480 [mdt]
       mdt_intent_opc+0x1dd/0xc10 [mdt]
       mdt_intent_policy+0x1a1/0x360 [mdt]
       ldlm_lock_enqueue+0x3c2/0xb40 [ptlrpc]
       ldlm_handle_enqueue0+0x8c6/0x1780 [ptlrpc]
       tgt_enqueue+0x64/0x240 [ptlrpc]
       tgt_request_handle+0x93a/0x19c0 [ptlrpc]
       ptlrpc_server_handle_request+0x250/0xc30 [ptlrpc]
       ptlrpc_main+0xbd9/0x15f0 [ptlrpc]
       kthread+0xe4/0xf0
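
For anyone triaging from the vmcore, here is a minimal sketch of pulling the symbol, offset, and module out of the IP line above so they can be fed to a debugger. The helper name `parse_frame` and the regular expression are illustrative only, not part of any Lustre tooling; the sketch assumes the `symbol+0xOFFSET/0xSIZE [module]` layout seen in this trace.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE [module]" form used in the oops above.
# The trailing "[module]" part is optional (e.g. core kernel frames like kthread).
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frame(line):
    """Return (symbol, offset, size, module) for one trace line, or None."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return (
        m.group("symbol"),
        int(m.group("offset"), 16),
        int(m.group("size"), 16),
        m.group("module"),
    )


ip_line = "IP: [<ffffffffa13f68f6>] lod_qos_prep_create+0xe96/0x1ab0 [lod]"
symbol, offset, size, module = parse_frame(ip_line)

# Print an expression in the form gdb's "list" command accepts.
print(f"{module}: list *({symbol}+{offset:#x})")
```

The printed expression can be pasted into gdb opened on a copy of the module built with debug info to see which source line the faulting instruction maps to.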
      

      I think it was introduced by the LU-15727 patch: https://review.whamcloud.com/47014

      It first hit on June 20th and has become much more frequent in the past few days for some reason.

      Very first crash (has vmcore and all):

      http://testing.linuxhacker.ru/lustre-reports/external/crashes/boilpot-bigmem-98-2022-06-20-10:50:27/

      Most recent crash, with vmcore, from current master-next:

      http://testing.linuxhacker.ru/lustre-reports/external/crashes/boilpot-bigmem-28-2022-07-12-03:11:17/

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Oleg Drokin (green)