Details
- Bug
- Resolution: Fixed
- Minor
- None
- Lustre 2.8.0
- None
- 3
- 17551
Description
This issue was created by maloo for wangdi <di.wang@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/e9670654-b77f-11e4-a08c-5254006e85c2.
The sub-test test_101 failed with the following error:
test failed to respond and timed out
Please provide additional information about the failure here.
Info required for matching: replay-single 101
03:15:46:LustreError: 2829:0:(client.c:1148:ptlrpc_import_delay_req()) Skipped 5 previous similar messages
03:15:46:Removing read-only on unknown block (0xfd00000)
03:15:46:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
03:15:46:Lustre: DEBUG MARKER: hostname
03:15:46:Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P1
03:15:46:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o abort_recovery /dev/lvm-Role_MDS/P1 /mnt/mds1
03:15:46:LDISKFS-fs (dm-0): recovery complete
03:15:46:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
03:15:46:LustreError: 7321:0:(mdt_handler.c:5779:mdt_iocontrol()) lustre-MDT0000: Aborting recovery for device
03:15:46:LustreError: 7321:0:(ldlm_lib.c:2256:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery
03:15:46:Lustre: 7400:0:(ldlm_lib.c:1819:target_recovery_overseer()) recovery is aborted, evict exports in recovery
03:15:46:Lustre: 7400:0:(ldlm_lib.c:1819:target_recovery_overseer()) Skipped 2 previous similar messages
03:15:46:Lustre: lustre-MDT0000: disconnecting 5 stale clients
03:15:46:BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
03:15:46:IP: [<ffffffffa08771c4>] out_tx_attr_set_undo+0x64/0x90 [ptlrpc]
03:15:46:PGD 0
03:15:46:Oops: 0000 [#1] SMP
03:15:46:last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/virtio0/block/vda/queue/scheduler
03:15:46:CPU 1
03:15:46:Modules linked in: osp(U) mdd(U) lod(U) mdt(U) lfsck(U) mgs(U) mgc(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
03:15:46:
03:15:46:Pid: 7400, comm: tgt_recov Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1 Red Hat KVM
03:15:46:RIP: 0010:[<ffffffffa08771c4>] [<ffffffffa08771c4>] out_tx_attr_set_undo+0x64/0x90 [ptlrpc]
03:15:46:RSP: 0018:ffff88005a3c9d70 EFLAGS: 00010282
03:15:46:RAX: ffff88005dc30a20 RBX: ffff88007a9b1e00 RCX: 0000000000000000
03:15:46:RDX: 0000000000000000 RSI: ffffffffa08c9c40 RDI: ffffffffa09303e0
03:15:46:RBP: ffff88005a3c9d80 R08: ffff88005e5d9c18 R09: 0000000000000000
03:15:46:R10: ffff88006d399680 R11: 0000000000000040 R12: 0000000000000000
03:15:46:R13: ffff880079c7b800 R14: ffff88005cb80640 R15: 0000000000000008
03:15:46:FS: 0000000000000000(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
03:15:46:CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
03:15:46:CR2: 0000000000000028 CR3: 000000007d971000 CR4: 00000000000006e0
03:15:46:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
03:15:46:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
03:15:46:Process tgt_recov (pid: 7400, threadinfo ffff88005a3c8000, task ffff88005a3c7540)
03:15:46:Stack:
03:15:46: ffff88005cb80640 ffff88006cf55800 ffff88005a3c9e20 ffffffffa088743c
03:15:46:<d> 0000000000000001 ffff88007a9b1ef0 0000000000000212 ffff88005cb800c0
03:15:46:<d> ffff88005db0c940 0000000000000008 0000000220000100 0000000000000000
03:15:46:Call Trace:
03:15:46: [<ffffffffa088743c>] distribute_txn_replay_handle+0x7ec/0x940 [ptlrpc]
03:15:46: [<ffffffffa07d1871>] target_recovery_thread+0x9e1/0x1ad0 [ptlrpc]
03:15:46: [<ffffffffa07d0e90>] ? target_recovery_thread+0x0/0x1ad0 [ptlrpc]
03:15:46: [<ffffffff8109abf6>] kthread+0x96/0xa0
03:15:46: [<ffffffff8100c20a>] child_rip+0xa/0x20
03:15:46: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
03:15:46: [<ffffffff8100c200>] ? child_rip+0x0/0x20
03:15:46:Code: 00 5b 02 00 00 48 c7 05 57 92 0b 00 10 04 93 a0 c7 05 45 92 0b 00 00 00 02 00 48 8b 42 10 48 8b 16 48 c7 c6 40 9c 8c a0 48 8b 00 <48> 8b 52 28 44 8b 48 0c 44 8b 40 08 48 83 c2 0c c7 04 24 f4 fd
03:15:46:RIP [<ffffffffa08771c4>] out_tx_attr_set_undo+0x64/0x90 [ptlrpc]
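For triage, the key fields of the oops above (fault address, faulting symbol and offset, module, and crashing thread) can be pulled out of the console log with a short script. The following is only a minimal sketch, assuming the log keeps the `HH:MM:SS:` prefix and the RHEL6-style oops layout shown in this ticket; `summarize_oops` and `OOPS_LOG` are hypothetical names used for illustration and are not part of Lustre or maloo.

```python
import re

# A few lines copied verbatim from the console log in this ticket.
OOPS_LOG = """\
03:15:46:BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
03:15:46:IP: [<ffffffffa08771c4>] out_tx_attr_set_undo+0x64/0x90 [ptlrpc]
03:15:46:Pid: 7400, comm: tgt_recov Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1 Red Hat KVM
03:15:46:CR2: 0000000000000028 CR3: 000000007d971000 CR4: 00000000000006e0
"""

# Patterns are assumptions based on the layout of this particular log.
FAULT_RE = re.compile(r"NULL pointer dereference at ([0-9a-f]+)")
IP_RE = re.compile(
    r"IP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+) \[(\w+)\]"
)
PID_RE = re.compile(r"Pid: (\d+), comm: (\S+)")
CR2_RE = re.compile(r"CR2: ([0-9a-f]+)")


def summarize_oops(log_text):
    """Collect the main oops fields from a console log into a dict."""
    summary = {}
    for line in log_text.splitlines():
        m = FAULT_RE.search(line)
        if m:
            summary["fault_address"] = int(m.group(1), 16)
        m = IP_RE.search(line)
        if m:
            summary["ip"] = int(m.group(1), 16)
            summary["function"] = m.group(2)
            summary["offset"] = int(m.group(3), 16)
            summary["function_size"] = int(m.group(4), 16)
            summary["module"] = m.group(5)
        m = PID_RE.search(line)
        if m:
            summary["pid"] = int(m.group(1))
            summary["comm"] = m.group(2)
        m = CR2_RE.search(line)
        if m:
            summary["cr2"] = int(m.group(1), 16)
    return summary


if __name__ == "__main__":
    info = summarize_oops(OOPS_LOG)
    print("%s+0x%x [%s] faulted at 0x%x in %s (pid %d)" % (
        info["function"], info["offset"], info["module"],
        info["fault_address"], info["comm"], info["pid"]))
```

On this log the fault address and CR2 agree (0x28) and RDX is zero in the register dump, which is consistent with a read through a NULL pointer at offset 0x28 inside out_tx_attr_set_undo.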
Issue Links
- is related to LU-3534 async update cross-MDTs (Resolved)