LU-18024

sanity-lsnapshot test_1b: Null pointer dereference in queue_work via qmt_lvbo_free

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0
    • Lustre 2.16.0
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Oleg Drokin <green@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/c1d8852e-126c-4a30-af92-a8fa44082ee9

      test_1b failed with the following error:

      onyx-103vm4 crashed during sanity-lsnapshot test_1b
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-master/4541 - 5.14.0-362.24.1.el9_3.x86_64
      servers: https://build.whamcloud.com/job/lustre-master/4541 - 5.14.0-362.24.1_lustre.el9.x86_64

      For about a month this has been a regular crash in sanity-lsnapshot test 1b. The traces differ somewhat, but they always end up in qmt_lvbo_free, followed by a NULL pointer dereference in __queue_work:

      [13893.061855] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity-lsnapshot test 1b: mount snapshot without original filesystem mounted ========================================================== 08:20:07 \(1718785207\)
      [13893.273668] Lustre: DEBUG MARKER: == sanity-lsnapshot test 1b: mount snapshot without original filesystem mounted ========================================================== 08:20:07 (1718785207)
      [13893.439579] Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot_create -F lustre -n lss_1b_0
      [13895.994830] Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot_list -F lustre -n lss_1b_0 -d
      [13900.112891] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
      [13900.441082] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
      [13900.721496] BUG: kernel NULL pointer dereference, address: 0000000000000102
      [13900.722489] #PF: supervisor read access in kernel mode
      [13900.723149] #PF: error_code(0x0000) - not-present page
      [13900.723783] PGD 0 P4D 0 
      [13900.724150] Oops: 0000 [#1] PREEMPT SMP PTI
      [13900.724697] CPU: 0 PID: 225194 Comm: umount Kdump: loaded Tainted: P           OE     -------  ---  5.14.0-362.24.1_lustre.el9.x86_64 #1
      [13900.726105] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [13900.726808] RIP: 0010:__queue_work+0x20/0x370
      [13900.727396] Code: 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55 41 54 41 89 fc 55 48 89 f5 53 48 83 ec 10 89 7c 24 04 <f6> 86 02 01 00 00 01 0f 85 ac 02 00 00 e8 fe c7 07 00 49 c7 c5 ac
      [13900.729479] RSP: 0018:ffffa5290a5e3938 EFLAGS: 00010082
      [13900.730140] RAX: ffffffffc1cd86b0 RBX: 0000000000000202 RCX: 0000000000000000
      [13900.730998] RDX: ffff990ab188e340 RSI: 0000000000000000 RDI: 0000000000002000
      [13900.731848] RBP: 0000000000000000 R08: ffff990aa7fda8b8 R09: ffffa5290a5e3940
      [13900.732701] R10: 0000000000000101 R11: 000000000000000f R12: 0000000000002000
      [13900.733574] R13: ffff990aa7fda82c R14: ffff990ab188e340 R15: 0000000000000000
      [13900.734429] FS:  00007f1bff822540(0000) GS:ffff990b3fc00000(0000) knlGS:0000000000000000
      [13900.735387] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [13900.736097] CR2: 0000000000000102 CR3: 00000000045f6003 CR4: 00000000001706f0
      [13900.736960] Call Trace:
      [13900.737318]  <TASK>
      [13900.737636]  ? show_trace_log_lvl+0x1c4/0x2df
      [13900.738212]  ? show_trace_log_lvl+0x1c4/0x2df
      [13900.738774]  ? queue_work_on+0x24/0x30
      [13900.739268]  ? __die_body.cold+0x8/0xd
      [13900.739765]  ? page_fault_oops+0x134/0x170
      [13900.740329]  ? kernelmode_fixup_or_oops+0x84/0x110
      [13900.740936]  ? exc_page_fault+0x62/0x150
      [13900.741474]  ? asm_exc_page_fault+0x22/0x30
      [13900.742034]  ? __pfx_qmt_lvbo_free+0x10/0x10 [lquota]
      [13900.742772]  ? __queue_work+0x20/0x370
      [13900.743272]  ? __wake_up_common_lock+0x91/0xd0
      [13900.743851]  queue_work_on+0x24/0x30
      [13900.744325]  qmt_lvbo_free+0xaf/0x160 [lquota]
      [13900.744929]  ldlm_resource_putref+0x18a/0x290 [ptlrpc]
      [13900.745721]  cfs_hash_for_each_relax+0x1ab/0x480 [libcfs]
      [13900.746468]  ? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
      [13900.747268]  ? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
      [13900.748069]  cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
      [13900.748755]  ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
      [13900.749514]  __ldlm_namespace_free+0x58/0x4f0 [ptlrpc]
      [13900.750288]  ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
      [13900.751093]  mdt_fini+0xd6/0x570 [mdt]
      [13900.751631]  mdt_device_fini+0x2b/0xc0 [mdt]
      [13900.752224]  obd_precleanup+0x1e4/0x220 [obdclass]
      [13900.753213]  class_cleanup+0x2d5/0x600 [obdclass]
      [13900.753885]  class_process_config+0x10c0/0x1bc0 [obdclass]
      [13900.754627]  ? __kmalloc+0x19b/0x370
      [13900.755138]  class_manual_cleanup+0x439/0x7a0 [obdclass]
      [13900.755871]  server_put_super+0x7ee/0xa40 [ptlrpc]
      [13900.756604]  generic_shutdown_super+0x74/0x120
      [13900.757193]  kill_anon_super+0x14/0x30
      [13900.757681]  deactivate_locked_super+0x31/0xa0
      [13900.758272]  cleanup_mnt+0x100/0x160
      [13900.758775]  task_work_run+0x5c/0x90
      [13900.759257]  exit_to_user_mode_loop+0x122/0x130
      [13900.759854]  exit_to_user_mode_prepare+0xb6/0x100
      [13900.760450]  syscall_exit_to_user_mode+0x12/0x40
      [13900.761045]  do_syscall_64+0x69/0x90
      [13900.761515]  ? syscall_exit_to_user_mode+0x22/0x40
      [13900.762130]  ? do_syscall_64+0x69/0x90
      [13900.762619]  ? exc_page_fault+0x62/0x150
      [13900.763134]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
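
The oops above can be cross-checked mechanically. The following is a minimal sketch, not a general disassembler: it hand-decodes only the single instruction form seen in the Code: line (`f6 86 disp32 imm8`, that is `testb $imm8, disp32(%rsi)`) and compares it with the reported fault address. The helper names are hypothetical. It assumes the x86-64 System V calling convention and the upstream signature `__queue_work(int cpu, struct workqueue_struct *wq, struct work_struct *work)`, under which RSI holds the workqueue pointer.

```python
import re

# Lines copied from the oops above, timestamps stripped. The Code: line is
# shortened to the bytes around the faulting instruction (marked <..>).
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000102
RIP: 0010:__queue_work+0x20/0x370
Code: 48 83 ec 10 89 7c 24 04 <f6> 86 02 01 00 00 01 0f 85 ac 02 00 00
RAX: ffffffffc1cd86b0 RBX: 0000000000000202 RCX: 0000000000000000
RDX: ffff990ab188e340 RSI: 0000000000000000 RDI: 0000000000002000
Call Trace:
 ? __queue_work+0x20/0x370
 ? __wake_up_common_lock+0x91/0xd0
 queue_work_on+0x24/0x30
 qmt_lvbo_free+0xaf/0x160 [lquota]
 ldlm_resource_putref+0x18a/0x290 [ptlrpc]
"""


def parse_registers(text):
    """Return {register name: value} for the 64-bit register dump lines."""
    pairs = re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(text):
    """Return the code bytes starting at the <..> marker in the Code: line."""
    code = re.search(r"^Code: (.*)$", text, re.M).group(1).split()
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    return [int(b.strip("<>"), 16) for b in code[start:]]


def decode_testb_rsi(insn):
    """Decode 'f6 86 disp32 imm8' (testb $imm8, disp32(%rsi)) and nothing else."""
    if insn[0] != 0xF6 or insn[1] != 0x86:
        raise ValueError("not the f6 86 encoding this sketch handles")
    disp = int.from_bytes(bytes(insn[2:6]), "little")
    return disp, insn[6]


def reliable_frames(text):
    """Return call-trace symbols that are not prefixed with '?'."""
    frames = []
    for line in text.splitlines():
        m = re.match(r"^ (\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+", line)
        if m and not m.group(1):
            frames.append(m.group(2))
    return frames


regs = parse_registers(OOPS)
fault_addr = int(re.search(r"address: ([0-9a-f]+)", OOPS).group(1), 16)
disp, imm = decode_testb_rsi(faulting_bytes(OOPS))

print(f"testb ${imm:#x}, {disp:#x}(%rsi) with RSI={regs['RSI']:#x}")
print(f"RSI + disp = {regs['RSI'] + disp:#x}, fault address = {fault_addr:#x}")
print("reliable frames:", " <- ".join(reliable_frames(OOPS)))
```

With RSI equal to zero and a displacement of 0x102, the computed address matches the reported fault address, which is consistent with a NULL workqueue pointer reaching queue_work_on from qmt_lvbo_free. Which workqueue that is, and why it is NULL at umount time, cannot be determined from the trace alone.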
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-lsnapshot test_1b - onyx-103vm4 crashed during sanity-lsnapshot test_1b

            People

              Assignee: Hongchao Zhang (hongchao.zhang)
              Reporter: Maloo (maloo)