Lustre / LU-7287

osc_cache_shrink_scan() unsafe against concurrent OSC device removal

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.8.0
    • Fix Version/s: None
    • Labels: None
    • Severity: 3

      Description

      In osc_cache_shrink_scan(), the OSC device selected as the anchor may be removed once the list is unlocked. The scan then never sees the anchor again, resulting in an infinite loop.
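
      The failure mode can be illustrated with a small user-space model. This is only a sketch, not the Lustre code: it assumes the scan rotates the head of the device list to the tail and stops when the first-visited device (the anchor) reaches the head again. The names scan_with_anchor, remove_anchor_after and max_iters are hypothetical, and max_iters exists only so the model terminates where the kernel loop would not.

```python
from collections import deque


def scan_with_anchor(devices, remove_anchor_after=None, max_iters=1000):
    """Model of an anchor-terminated rotation over a device list.

    Returns the number of devices visited, or None if the loop hit
    max_iters without meeting the anchor again (a stand-in for the
    soft lockup). All names here are hypothetical.
    """
    lst = deque(devices)
    anchor = None
    iters = 0
    while lst:
        cli = lst[0]
        if anchor is None:
            anchor = cli          # first device visited becomes the anchor
        elif cli == anchor:
            return iters          # anchor is back at the head: full pass done
        lst.rotate(-1)            # move the head to the tail
        iters += 1
        # The list lock would be dropped here while the device is shrunk,
        # so another thread is free to remove devices, including the anchor.
        if iters == remove_anchor_after and anchor in lst:
            lst.remove(anchor)    # models concurrent OSC device removal
        if iters >= max_iters:
            return None           # anchor never seen again
    return iters


if __name__ == "__main__":
    osc_list = ["osc0", "osc1", "osc2"]
    print(scan_with_anchor(osc_list))                         # 3
    print(scan_with_anchor(osc_list, remove_anchor_after=1))  # None
```

      With no removal, the scan visits each device once and stops. When the anchor is removed after the first visit, the remaining devices keep rotating and the stop condition is never met.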

      t:~# export MOUNT_2=y
      t:~# export DEBUG_SIZE=1024
      t:~# export PTLDEBUG=trace
      t:~# llmount.sh
      ...
      t:~# while true; do dd if=/dev/zero of=/mnt/lustre/$((RANDOM % 30)) bs=1M count=20; done &
      t:~# while true; do echo 2 > /proc/sys/vm/drop_caches ; done &
      t:~# while true; do umount /mnt/lustre2; mount t@tcp:/lustre /mnt/lustre2 -t lustre; done
      
      [  960.026008] BUG: soft lockup - CPU#0 stuck for 67s! [bash:20730]
      [  960.026008] Modules linked in: ...
      [  960.026008] irq event stamp: 0
      [  960.026008] hardirqs last  enabled at (0): [<(null)>] (null)
      [  960.026008] hardirqs last disabled at (0): [<ffffffff81072032>] copy_process+0x5e2/0x1660
      [  960.026008] softirqs last  enabled at (0): [<ffffffff81072032>] copy_process+0x5e2/0x1660
      [  960.026008] softirqs last disabled at (0): [<(null)>] (null)
      [  960.026008] CPU 0
      [  960.026008] Modules linked in: ...
      [  960.026008]
      [  960.026008] Pid: 20730, comm: bash Tainted: G        W  ---------------    2.6.32-431.29.2.el6.lustre.x86_64 #1 Bochs Bochs
      [  960.026008] RIP: 0010:[<ffffffffa08f9425>]  [<ffffffffa08f9425>] libcfs_debug_vmsg2+0x455/0xbe0 [libcfs]
      [  960.026008] RSP: 0018:ffff8800a4059ad8  EFLAGS: 00000246
      [  960.026008] RAX: ffff8800bffdb347 RBX: ffff8800a4059c38 RCX: 0000000000000000
      [  960.026008] RDX: ffff8800bffdb356 RSI: ffffffffa1016f4f RDI: ffff8800bffdb347
      [  960.026008] RBP: ffffffff8100bc8e R08: 0000000000000000 R09: 0000000000000001
      [  960.026008] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8800a4059a98
      [  960.026008] R13: 0000000000000002 R14: ffffffff812acb86 R15: ffff8800a4059ac8
      [  960.026008] FS:  00007fe2c41f7700(0000) GS:ffff88002c000000(0000) knlGS:0000000000000000
      [  960.026008] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  960.026008] CR2: 00000000004077c0 CR3: 00000000a4ddb000 CR4: 00000000000006f0
      [  960.026008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  960.026008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  960.026008] Process bash (pid: 20730, threadinfo ffff8800a4058000, task ffff8800b2034740)
      [  960.026008] Stack:
      [  960.026008]  ffff8800a4059fd8 ffff8800a4059b68 ffffffffa10230c0 000000000000004a
      [  960.026008] <d> ffff8800a4059fd8 ffff8800b2034740 ffffffffa102b7e0 000000000000004a
      [  960.026008] <d> ffff88011c1ba2e0 0000000181555df2 ffffffffa1018bdf 0000000000000000
      [  960.026008] Call Trace:
      [  960.026008]  [<ffffffffa08f9bf1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      [  960.026008]  [<ffffffffa0ff84b4>] ? osc_lru_shrink+0x2f4/0x780 [osc]
      [  960.026008]  [<ffffffffa0ff89dc>] ? osc_cache_shrink_scan+0x9c/0x190 [osc]
      [  960.026008]  [<ffffffffa0ff89dc>] ? osc_cache_shrink_scan+0x9c/0x190 [osc]
      [  960.026008]  [<ffffffffa0ff8a72>] ? osc_cache_shrink_scan+0x132/0x190 [osc]
      [  960.026008]  [<ffffffffa0fe6a9e>] ? osc_cache_shrink+0x2e/0x50 [osc]
      [  960.026008]  [<ffffffff8114c45d>] ? shrink_slab+0x13d/0x1c0
      [  960.026008]  [<ffffffff811d25a4>] ? drop_caches_sysctl_handler+0x44/0x1c0
      [  960.026008]  [<ffffffff8121a607>] ? proc_sys_call_handler+0x97/0xd0
      [  960.026008]  [<ffffffff8121a654>] ? proc_sys_write+0x14/0x20
      [  960.026008]  [<ffffffff811a2e48>] ? vfs_write+0xb8/0x1a0
      [  960.026008]  [<ffffffff811a3821>] ? sys_write+0x51/0x90
      


            People

            • Assignee: bobijam Zhenyu Xu
            • Reporter: jhammond John Hammond