Lustre / LU-2703

racer: BUG: soft lockup - CPU#0 stuck for 67s! [dd:1404]



    Description

      While running the racer test, the following soft lockup occurred on one of the two clients:

      00:11:31:Lustre: DEBUG MARKER: == racer test 1: racer on clients: client-28vm1,client-28vm2.lab.whamcloud.com DURATION=900 == 00:11:30 (1359447090)
      00:11:32:Lustre: DEBUG MARKER: DURATION=900 /usr/lib64/lustre/tests/racer/racer.sh /mnt/lustre/racer 
      00:24:12:BUG: soft lockup - CPU#0 stuck for 67s! [dd:1404]
      00:24:12:Modules linked in: mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      00:24:12:CPU 0 
      00:24:12:Modules linked in: mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      00:24:12:
      00:24:12:Pid: 1404, comm: dd Not tainted 2.6.32-279.19.1.el6.x86_64 #1 Red Hat KVM
      00:24:12:RIP: 0010:[<ffffffff814ec53e>]  [<ffffffff814ec53e>] _spin_lock+0x1e/0x30
      00:24:12:RSP: 0018:ffff880030ce16a8  EFLAGS: 00000206
      00:24:13:RAX: 0000000000000001 RBX: ffff880030ce16a8 RCX: ffff8800783c6ba0
      00:24:13:RDX: 0000000000000000 RSI: ffff88004ecec7c0 RDI: ffff88007b510a1c
      00:24:13:RBP: ffffffff8100bb8e R08: 0000000000000102 R09: 0000000000000000
      00:24:14:R10: 0000000003b5e000 R11: 000000000000000e R12: ffffffffa058315f
      00:24:14:R13: ffff880030ce1638 R14: 0000000003b5e000 R15: 0000000003b5efff
      00:24:14:FS:  00007f288786d700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
      00:24:14:CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      00:24:14:CR2: 00000036526cd710 CR3: 000000003d0ee000 CR4: 00000000000006f0
      00:24:14:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      00:24:14:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      00:24:14:Process dd (pid: 1404, threadinfo ffff880030ce0000, task ffff88007d082aa0)
      00:24:14:Stack:
      00:24:14: ffff880030ce17f8 ffffffffa0275889 00000000002537dc ffffea000115a9f0
      00:24:14:<d> ffff88003e877950 0000000000000000 0000000003b5e000 0000000003b5efff
      00:24:14:<d> ffff880030ce1998 ffffffffa079bddd f869fda2cf0897a1 0000000000000000
      00:24:14:Call Trace:
      00:24:15: [<ffffffffa0275889>] ? osc_queue_async_io+0x399/0x1140 [osc]
      00:24:15: [<ffffffffa079bddd>] ? ll_prepare_write+0x50d/0x1230 [lustre]
      00:24:15: [<ffffffffa072adce>] ? lov_stripe_offset+0x28e/0x340 [lov]
      00:24:15: [<ffffffffa072a8db>] ? lov_tgt_seq_show+0x26b/0x300 [lov]
      00:24:16: [<ffffffffa070d0a9>] ? lov_queue_async_io+0x149/0x4a0 [lov]
      00:24:16: [<ffffffffa0795780>] ? queue_or_sync_write+0x160/0xda0 [lustre]
      00:24:16: [<ffffffffa07a2c2b>] ? ll_stats_ops_tally+0x6b/0xd0 [lustre]
      00:24:16: [<ffffffffa079cde5>] ? ll_commit_write+0x2e5/0x750 [lustre]
      00:24:16: [<ffffffffa07b4333>] ? ll_write_begin+0x83/0x210 [lustre]
      00:24:16: [<ffffffffa07b4280>] ? ll_write_end+0x30/0x60 [lustre]
      00:24:16: [<ffffffff811107fa>] ? generic_file_buffered_write+0x18a/0x2e0
      00:24:16: [<ffffffff81070f97>] ? current_fs_time+0x27/0x30
      00:24:16: [<ffffffff81112130>] ? __generic_file_aio_write+0x250/0x480
      00:24:16: [<ffffffffa0765dba>] ? ll_file_get_tree_lock_iov+0x14a/0x810 [lustre]
      00:24:16: [<ffffffff811123cf>] ? generic_file_aio_write+0x6f/0xe0
      00:24:16: [<ffffffffa0772449>] ? ll_file_aio_write+0xa19/0x1c60 [lustre]
      00:24:16: [<ffffffffa0773760>] ? ll_file_write+0xd0/0xf0 [lustre]
      00:24:16: [<ffffffff8105a5c3>] ? perf_event_task_sched_out+0x33/0x80
      00:24:16: [<ffffffff81090990>] ? autoremove_wake_function+0x0/0x40
      00:24:16: [<ffffffff8120ca26>] ? security_file_permission+0x16/0x20
      00:24:16: [<ffffffff8117646d>] ? rw_verify_area+0x5d/0xc0
      00:24:16: [<ffffffff81176588>] ? vfs_write+0xb8/0x1a0
      00:24:16: [<ffffffff81176e81>] ? sys_write+0x51/0x90
      00:24:16: [<ffffffff810d3a75>] ? __audit_syscall_exit+0x265/0x290
      00:24:16: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
      00:24:16:Code: 00 00 00 01 74 05 e8 72 8c d8 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 3e 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> 1f 44 00 00 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89
      

      Maloo report: https://maloo.whamcloud.com/test_sets/83fa5d12-6a23-11e2-85d4-52540035b04c
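
      To make the trace above easier to compare against other console logs, the frames can be pulled out programmatically. The following is a minimal sketch, assuming the `HH:MM:SS:` prefixed console format shown in this report; `FRAME_RE` and `parse_frames` are hypothetical helper names, not part of Lustre or its test suite. A `?` before a symbol is how the kernel marks a frame whose address was found on the stack but not confirmed by the unwinder, so the sketch records that flag rather than dropping those frames.

```python
import re

# One kernel stack frame, e.g.
#   [<ffffffffa0275889>] ? osc_queue_async_io+0x399/0x1140 [osc]
# The trailing "[module]" is absent for symbols built into the kernel.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return a list of dicts, one per frame found in the console log text."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # not a frame line (registers, module list, etc.)
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            # Label for built-in symbols is an illustrative choice.
            "module": m.group("module") or "kernel",
            "unreliable": m.group("unreliable") is not None,
        })
    return frames


# Three lines copied from the log in this report.
SAMPLE = """\
00:24:12:RIP: 0010:[<ffffffff814ec53e>]  [<ffffffff814ec53e>] _spin_lock+0x1e/0x30
00:24:15: [<ffffffffa0275889>] ? osc_queue_async_io+0x399/0x1140 [osc]
00:24:16: [<ffffffff81176588>] ? vfs_write+0xb8/0x1a0
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        flag = "?" if frame["unreliable"] else " "
        print(f'{flag} {frame["module"]:>8}  {frame["symbol"]}+{frame["offset"]:#x}')
```

      Grouping the parsed frames by `module` gives a quick view of which Lustre layers (`osc`, `lov`, `lustre`) appear on the stack of the spinning `dd` process.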

            People

              bobijam Zhenyu Xu
              yujian Jian Yu