Lustre / LU-4157

Removing files hangs with 100% CPU on 3.12-rc7 client

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • None
    • Lustre 2.4.1
    • None
    • Vanilla kernel 3.12-rc7 client mounting a 2.4.1 ZFS server that works fine with a 2.4.1 client.
    • 4
    • 11279

    Description

      Deleting files with the in-kernel client (3.12-rc7) is impossible. The rm command gets stuck at 100% CPU and cannot be killed. An example call trace follows:

      Oct 23 15:49:05 beo-05 kernel: [ 1361.539903] [<ffffffff815020ff>] ? __schedule+0x2ff/0x8d0
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539907] [<ffffffff812567c4>] ? radix_tree_next_chunk+0x1a4/0x210
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539911] [<ffffffff810d665a>] ? find_get_pages+0xca/0x150
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539914] [<ffffffff810d666a>] ? find_get_pages+0xda/0x150
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539917] [<ffffffff810e042d>] ? pagevec_lookup+0x1d/0x30
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539921] [<ffffffff810e20f1>] ? truncate_inode_pages_range.part.11+0xa1/0x630
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539925] [<ffffffffa09376e5>] ? lmv_lock_match+0xf5/0x2d0 [lmv]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539933] [<ffffffffa087dffb>] ? ll_have_md_lock+0x14b/0x3e0 [lustre]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539936] [<ffffffff810e26c1>] ? truncate_inode_pages_range+0x41/0x50
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539939] [<ffffffff810e2750>] ? truncate_inode_pages+0x10/0x20
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539947] [<ffffffffa089d4c7>] ? ll_md_blocking_ast+0x447/0x650 [lustre]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539951] [<ffffffff8120314c>] ? fuse_request_send_nowait_locked+0x6c/0xd0
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539960] [<ffffffffa05dd177>] ? ldlm_cancel_callback+0x67/0x190 [ptlrpc]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539969] [<ffffffffa05e68aa>] ? ldlm_cli_cancel_local+0x7a/0x3c0 [ptlrpc]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539979] [<ffffffffa05e8fbd>] ? ldlm_cli_cancel_list_local+0xdd/0x240 [ptlrpc]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539989] [<ffffffffa05e9295>] ? ldlm_cancel_resource_local+0x175/0x1e0 [ptlrpc]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539995] [<ffffffffa07806d7>] ? mdc_resource_get_unused+0xd7/0x170 [mdc]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.539998] [<ffffffff810d60c8>] ? filemap_fault+0x88/0x550
      Oct 23 15:49:05 beo-05 kernel: [ 1361.540003] [<ffffffffa078170d>] ? mdc_unlink+0x9d/0x4d0 [mdc]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.540007] [<ffffffffa0944fba>] ? lmv_unlink+0x1ba/0x500 [lmv]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.540015] [<ffffffffa08a2ff4>] ? ll_unlink+0x164/0x410 [lustre]
      Oct 23 15:49:05 beo-05 kernel: [ 1361.540018] [<ffffffff81144f1d>] ? vfs_unlink+0x8d/0x100
      Oct 23 15:49:05 beo-05 kernel: [ 1361.540022] [<ffffffff81145123>] ? do_unlinkat+0x193/0x230
      Oct 23 15:49:05 beo-05 kernel: [ 1361.540025] [<ffffffff81137ce4>] ? vfs_read+0x124/0x170
      Oct 23 15:49:05 beo-05 kernel: [ 1361.540029] [<ffffffff8114777d>] ? SyS_unlinkat+0x1d/0x40
      Oct 23 15:49:05 beo-05 kernel: [ 1361.540032] [<ffffffff8150a226>] ? system_call_fastpath+0x1a/0x1f
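
A trace like the one above is easier to read once the frames are grouped by module. The following is a minimal sketch, not part of the original report, that extracts the symbol and module from each frame. The regular expression and the `parse_frames` helper are illustrative assumptions based only on the line format shown here. Frames printed with a leading `?` are ones the kernel unwinder could not verify, so the sketch keeps them without treating them as a reliable call chain.

```python
import re

# Matches one frame of a kernel call trace as printed in the log above, e.g.
# "[<ffffffffa089d4c7>] ? ll_md_blocking_ast+0x447/0x650 [lustre]"
# The pattern is an assumption derived from this log's format only.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module) pairs; module is 'kernel' for built-in symbols."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((m.group("symbol"), m.group("module") or "kernel"))
    return frames


# A few lines copied verbatim from the trace in this report.
SAMPLE = """\
Oct 23 15:49:05 beo-05 kernel: [ 1361.539921] [<ffffffff810e20f1>] ? truncate_inode_pages_range.part.11+0xa1/0x630
Oct 23 15:49:05 beo-05 kernel: [ 1361.539925] [<ffffffffa09376e5>] ? lmv_lock_match+0xf5/0x2d0 [lmv]
Oct 23 15:49:05 beo-05 kernel: [ 1361.539947] [<ffffffffa089d4c7>] ? ll_md_blocking_ast+0x447/0x650 [lustre]
Oct 23 15:49:05 beo-05 kernel: [ 1361.539960] [<ffffffffa05dd177>] ? ldlm_cancel_callback+0x67/0x190 [ptlrpc]
"""

frames = parse_frames(SAMPLE)
for symbol, module in frames:
    print(f"{module:8s} {symbol}")
```

Run against the full trace, this kind of listing makes it easier to see that the unlink path passes through the mdc, ptlrpc and lustre modules before reaching truncate_inode_pages_range in the core kernel.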

            People

              wc-triage WC Triage
              rfehren Roland Fehrenbacher