Lustre / LU-19370

lfsck thread may not quit after lfsck_stop()

Details

    • Bug
    • Resolution: Fixed
    • Medium
    • Lustre 2.17.0

    Description

      The crash below occurs because the lfsck thread is still accessing the FLD local cache after mdt_fini()->lfsck_stop() has been called, by which point the cache has already been cleared:

      [  883.655478] LustreError: 44359:0:(qsd_reint.c:56:qsd_reint_completion()) coda1-MDT0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x20000:0x0], rc:-5
      [  883.657058] LustreError: 42845:0:(qsd_reint.c:637:qqi_reint_delayed()) coda1-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed.
      [  883.744239] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      [  883.746069] PGD 0
      [  883.746683] Oops: 0000 [#1] SMP NOPTI
      [  883.747456] CPU: 21 PID: 44356 Comm: lfsck_namespace Kdump: loaded Tainted: G           OE     -------- -  - 4.18.0-553.53.1.el8_lustre.ddn17.x86_64 #1
      [  883.749646] Hardware name: DDN SFA200NVX2E, BIOS 1.16.3-20250321_133525-43f84ec51920 04/01/2014
      [  883.751132] RIP: 0010:fld_local_lookup+0x49/0x250 [fld]
      [  883.752143] Code: 0e 8b 3d 7a e9 ef ff 85 ff 0f 88 ce 00 00 00 48 89 df 48 c7 c6 e0 b8 eb c0 e8 93 d1 06 00 48 89 c3 48 85 c0 0f 84 07 01 00 00 <49> 8b 7c 24 18 48 8d 50 18 4c 89 ee e8 86 e7 ff ff 85 c0 75 67 8b
      [  883.755331] RSP: 0018:ff540067685a7cc0 EFLAGS: 00010282
      [  883.756346] RAX: ff176e7d0c8d4240 RBX: ff176e7d0c8d4240 RCX: ff176e7b6f9395e0
      [  883.757639] RDX: ff176e8019773e00 RSI: ffffffffc0ebb8e0 RDI: ff176e7d0c8d45a0
      [  883.758925] RBP: ff176e7b6f9395e0 R08: 0000000000000001 R09: 0000000000000001
      [  883.760208] R10: ff176e7dfa1d8960 R11: 0000000000000001 R12: 0000000000000000
      [  883.761485] R13: 0000000200025fe6 R14: ff176e82c26f7040 R15: ff176e7b6f939000
      [  883.762779] FS:  0000000000000000(0000) GS:ff176ead31b40000(0000) knlGS:0000000000000000
      [  883.764216] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  883.765314] CR2: 0000000000000018 CR3: 0000001598410005 CR4: 0000000000771ee0
      [  883.766610] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  883.767901] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  883.769181] PKRU: 55555554
      [  883.769820] Call Trace:
      [  883.770441]  ? __die_body+0x1a/0x60
      [  883.771194]  ? no_context+0x1ba/0x3f0
      [  883.771971]  ? __bad_area_nosemaphore+0x157/0x180
      [  883.772900]  ? do_page_fault+0x37/0x12d
      [  883.773706]  ? page_fault+0x1e/0x30
      [  883.774458]  ? fld_local_lookup+0x49/0x250 [fld]
      [  883.775365]  fld_server_lookup+0x53/0x330 [fld]
      [  883.776253]  lfsck_find_mdt_idx_by_fid+0x65/0xe0 [lfsck]
      [  883.777290]  lfsck_namespace_assistant_handler_p1+0x1c6/0x1de0 [lfsck]
      [  883.778490]  lfsck_assistant_engine+0x362/0x1b90 [lfsck]
      [  883.779509]  ? finish_task_switch+0x86/0x2f0
      [  883.780360]  ? __schedule+0x2d9/0x870
      [  883.781118]  ? finish_wait+0x80/0x80
      [  883.781847]  ? lfsck_master_engine+0xc90/0xc90 [lfsck]
      [  883.782829]  kthread+0x134/0x150
      [  883.783514]  ? set_kthread_struct+0x50/0x50
      [  883.784327]  ret_from_fork+0x1f/0x40
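
      The general way to avoid this kind of race is for the stop routine to wait until the worker thread has actually exited before any state it reads is released. The following is a minimal userspace sketch of that ordering using POSIX threads. It is not the Lustre patch, and every name in it (struct scanner, struct cache, scanner_start(), scanner_stop(), scanner_fini()) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lookup cache shared with a worker thread. */
struct cache {
	int entries;
};

/* Hypothetical worker descriptor; not an lfsck structure. */
struct scanner {
	pthread_t	 thread;
	atomic_bool	 stop;		/* set by scanner_stop() to request exit */
	struct cache	*cache;		/* freed only after the worker has exited */
	atomic_long	 lookups;	/* number of cache reads performed */
};

static void *scanner_main(void *arg)
{
	struct scanner *s = arg;

	/* Recheck the stop flag before every cache access. */
	while (!atomic_load(&s->stop)) {
		assert(s->cache != NULL);
		atomic_fetch_add(&s->lookups, s->cache->entries);
		sched_yield();
	}
	return NULL;
}

static int scanner_start(struct scanner *s)
{
	atomic_init(&s->stop, false);
	atomic_init(&s->lookups, 0);

	s->cache = malloc(sizeof(*s->cache));
	if (s->cache == NULL)
		return -1;
	s->cache->entries = 1;

	if (pthread_create(&s->thread, NULL, scanner_main, s) != 0) {
		free(s->cache);
		s->cache = NULL;
		return -1;
	}
	return 0;
}

/* Request the stop and then wait for the worker to exit. */
static void scanner_stop(struct scanner *s)
{
	atomic_store(&s->stop, true);
	pthread_join(s->thread, NULL);
}

/* Teardown: the cache is released only after scanner_stop() has returned. */
static void scanner_fini(struct scanner *s)
{
	scanner_stop(s);
	free(s->cache);
	s->cache = NULL;
}
```

      In this sketch, if scanner_stop() only set the flag and returned without joining, scanner_fini() could free the cache while scanner_main() was still in the middle of an iteration, which is the same shape of failure as the oops above.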
      

            People

              laisiyao Lai Siyao
              laisiyao Lai Siyao