Lustre / LU-17332

sanity test_820: kernel BUG at fs/jbd2/transaction.c:378

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Lustre 2.16.0
    • Lustre 2.16.0
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/58a6b07c-fb1f-4a2d-ac3c-d7578d6b134f

      test_820 failed with the following error:

      trevis-28vm3 crashed during sanity test_820
      
      [26282.338565] Lustre: server umount lustre-OST0004 complete
      [26282.411017] ------------[ cut here ]------------
      [26282.412061] kernel BUG at fs/jbd2/transaction.c:378!
      [26282.413171] invalid opcode: 0000 [#1] SMP PTI
      [26282.414068] CPU: 1 PID: 784404 Comm: kworker/1:5 4.18.0-477.15.1.el8_lustre.x86_64 #1
      [26282.416473] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [26282.435639] Call Trace:
      [26282.438083]  jbd2__journal_start+0xee/0x1f0 [jbd2]
      [26282.439047]  jbd2_journal_start+0x19/0x20 [jbd2]
      [26282.439979]  flush_stashed_stats_work+0x36/0x90 [ldiskfs]
      [26282.441086]  process_one_work+0x1a7/0x360
      [26282.442753]  worker_thread+0x30/0x390
      [26282.444311]  kthread+0x134/0x150
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-master/4445 - 4.18.0-477.15.1.el8_8.x86_64
      servers: https://build.whamcloud.com/job/lustre-master/4445 - 4.18.0-477.15.1.el8_lustre.x86_64

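      This started around 2023-07-21 +/- 7 days. It looks like the workqueue is somehow still running after the journal has been cleaned up: the BUG is the BUG_ON(journal->j_flags & JBD2_UNMOUNT) check in start_this_handle(), and that flag is set by journal_kill_thread(), which is called from jbd2_journal_destroy():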

      int jbd2_journal_destroy(journal_t *journal)
      {       
              /* Wait for the commit thread to wake up and die. */
              journal_kill_thread(journal);
              :
      }
      
      static void journal_kill_thread(journal_t *journal)
      {               
              journal->j_flags |= JBD2_UNMOUNT;
              :
      }
      
      static int start_this_handle(journal_t *journal, handle_t *handle,
                                   gfp_t gfp_mask)
      {
              :
              BUG_ON(journal->j_flags & JBD2_UNMOUNT);
              :
      }
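
      The ordering problem described above can be modelled in user space. The following is only a minimal sketch under stated assumptions, not the ldiskfs code and not the fix that was landed: every model_* name is hypothetical, MODEL_UNMOUNT stands in for JBD2_UNMOUNT, and the check returns -1 where the kernel would hit BUG_ON(). It shows that a work item left pending across the journal teardown trips the check, while cancelling it first does not.

```c
#include <stdbool.h>

/* Hypothetical stand-in for JBD2_UNMOUNT; not the kernel's value or type. */
#define MODEL_UNMOUNT 0x1u

struct model_journal {
        unsigned int flags;
};

/* A single deferred work item, like the one run by the kworker in the trace. */
struct model_work {
        bool pending;
        struct model_journal *journal;
};

/* Models the check in start_this_handle(): returns -1 where the kernel BUGs. */
static int model_start_handle(struct model_journal *j)
{
        if (j->flags & MODEL_UNMOUNT)
                return -1;
        return 0;
}

/* Models journal_kill_thread() setting the unmount flag during destroy. */
static void model_journal_destroy(struct model_journal *j)
{
        j->flags |= MODEL_UNMOUNT;
}

/* Models cancelling the queued work before teardown. */
static void model_cancel_work(struct model_work *w)
{
        w->pending = false;
}

/* Models the worker running late: starts a handle only if still pending. */
static int model_run_work(struct model_work *w)
{
        if (!w->pending)
                return 0;
        w->pending = false;
        return model_start_handle(w->journal);
}

/*
 * Models umount with work already queued.
 * Returns 0 if no handle was started after teardown, -1 if the
 * BUG_ON() condition would have been hit.
 */
int model_umount(bool cancel_first)
{
        struct model_journal j = { 0 };
        struct model_work w = { true, &j };

        if (cancel_first)
                model_cancel_work(&w);
        model_journal_destroy(&j);
        return model_run_work(&w);
}
```

      In this model, model_umount(false) returns -1 (the crash case seen in the trace) and model_umount(true) returns 0. The real code would additionally need to keep the work from being requeued after it is cancelled, which this single-threaded sketch does not try to represent.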
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_820 - trevis-28vm3 crashed during sanity test_820

            People

              dongyang Dongyang Li
              maloo Maloo