Lustre / LU-20152

vvp_dump_pgcache_seq_release() vs umount race

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Medium
    • Severity: 3

    Description

      
      

      [ 116.120038] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [lctl:11917]
      [ 116.120196] Modules linked in: zfs(O) zunicode(O) zzstd(O) zlua(O) zcommon(O) znvpair(O) zavl(O) icp(O) spl(O) lzstd(O) llz4hc(O) llz4(O) lustre(O) osp(O) ofd(O) lod(O) ost(O) mdt(O) mdd(O) mgs(O) osd_ldiskfs(O) ldiskfs(O) lquota(O) lfsck(O) obdecho(O) mgc(O) mdc(O) lov(O) osc(O) lmv(O) fid(O) fld(O) ptlrpc(O) obdclass(O) ksocklnd(O) lnet(O) libcfs(O)
      [ 116.121156] irq event stamp: 0
      [ 116.126709] hardirqs last enabled at (0): [<0000000000000000>] 0x0
      [ 116.126752] hardirqs last disabled at (0): [<ffffffff9f0e73d1>] copy_process+0x4b1/0x1a30
      [ 116.126810] softirqs last enabled at (0): [<ffffffff9f0e73d1>] copy_process+0x4b1/0x1a30
      [ 116.126851] softirqs last disabled at (0): [<0000000000000000>] 0x0
      [ 116.126886] CPU: 1 PID: 11917 Comm: lctl Tainted: G W O -------- - - 4.18.0 #1
      [ 116.126934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-4.fc41 04/01/2014
      [ 116.126983] RIP: 0010:native_safe_halt+0xe/0x10
      [ 116.127026] Code: 88 ea 9d ff fb 66 0f 1f 44 00 00 65 48 8b 04 25 c0 ce 01 00 f0 80 60 02 df c3 90 90 e9 07 00 00 00 0f 00 2d 3c 8e 44 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 2c 8e 44 00 f4 c3 90 90 41 54 55 53
      [ 116.127111] RSP: 0018:ffff9c57e6003dd0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
      [ 116.127155] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
      [ 116.127208] RDX: 0000000000000202 RSI: 0000000000000001 RDI: ffffffff9f0cb665
      [ 116.127253] RBP: ffff9c58d0941814 R08: 0000000000000000 R09: 0000000000000000
      [ 116.127299] R10: ffff9c57e80cb000 R11: 0000000000000000 R12: ffffffffa0910840
      [ 116.127344] R13: 0000000000000001 R14: ffff9c58d0941814 R15: 0000000000080000
      [ 116.127389] FS: 00007f6c7aedc740(0000) GS:ffff9c58d0900000(0000) knlGS:0000000000000000
      [ 116.127431] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 116.127471] CR2: 00007fdd9e24a000 CR3: 000000016810f000 CR4: 0000000000350ea0
      [ 116.127516] Call Trace:
      [ 116.127563] <IRQ>
      [ 116.127587] ? watchdog_timer_fn+0x23c/0x280
      [ 116.127644] ? __hrtimer_run_queues+0x1aa/0x490
      [ 116.127684] ? hrtimer_interrupt+0xf9/0x210
      [ 116.127712] ? smp_apic_timer_interrupt+0x9e/0x290
      [ 116.127745] ? apic_timer_interrupt+0xf/0x20
      [ 116.127776] </IRQ>
      [ 116.130076] ? kvm_wait+0x65/0x80
      [ 116.130148] ? native_safe_halt+0xe/0x10
      [ 116.130289] kvm_wait+0x6a/0x80
      [ 116.134861] __pv_queued_spin_lock_slowpath+0x225/0x2b0
      [ 116.134902] do_raw_spin_lock+0xac/0xb0
      [ 116.134929] rhashtable_walk_exit+0x13/0x60
      [ 116.134962] vvp_dump_pgcache_seq_release+0x8f/0xa0 [lustre]
      [ 116.135038] full_proxy_release+0x33/0xa0
      [ 116.135071] __fput+0xc5/0x260
      [ 116.135104] task_work_run+0x8a/0xc0
      [ 116.135134] exit_to_usermode_loop+0xc5/0xd0
      [ 116.135167] do_syscall_64+0x157/0x1d0


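      This is a race between lctl get_param llite.*.dump_page_cache and umount:
      debugfs_remove_recursive() does not wait for lctl get_param to finish with the dump_page_cache entry.
      The release handler can therefore still run once umount has proceeded; in the trace above, vvp_dump_pgcache_seq_release() is stuck on the spinlock taken by rhashtable_walk_exit().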
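
      The lifetime problem described above can be modelled in user space. The following is only a minimal sketch of one general pattern (defer destruction of shared state until the last open handle is released); it is not Lustre or debugfs code and not a proposed patch. The struct page_cache_dir and all dir_* functions are hypothetical names, and table_alive merely stands in for the hash table that the release handler touches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state behind the dump_page_cache entry. */
struct page_cache_dir {
    int  open_files;   /* open handles that still reference the table */
    bool removed;      /* set once removal (the umount side) has started */
    bool table_alive;  /* models the hash table used by the release handler */
};

void dir_init(struct page_cache_dir *d)
{
    d->open_files = 0;
    d->removed = false;
    d->table_alive = true;
}

/* Destroy the table; only legal when no handle can touch it any more. */
static void dir_destroy_table(struct page_cache_dir *d)
{
    assert(d->open_files == 0);
    d->table_alive = false;
}

/* open(): refuse new users once removal has begun. */
bool dir_open(struct page_cache_dir *d)
{
    if (d->removed)
        return false;
    d->open_files++;
    return true;
}

/*
 * release(): report whether the table was still alive when this handle
 * touched it, then drop the reference. The last release after removal
 * performs the deferred destruction.
 */
bool dir_release(struct page_cache_dir *d)
{
    bool alive = d->table_alive;

    assert(d->open_files > 0);
    if (--d->open_files == 0 && d->removed)
        dir_destroy_table(d);
    return alive;
}

/* Removal: destroy immediately only if nobody holds the entry open. */
void dir_remove(struct page_cache_dir *d)
{
    d->removed = true;
    if (d->open_files == 0)
        dir_destroy_table(d);
}
```

      In this model, a handle opened before dir_remove() still sees a live table in dir_release(), and the table is destroyed only when that last handle goes away. The reported race corresponds to removal destroying the table unconditionally while a handle is still open.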

          People

            Assignee: wc-triage WC Triage
            Reporter: bzzz Alex Zhuravlev