[LU-5302] Test failure sanity-lfsck test_13: mdt panic
Created: 07/Jul/14  Updated: 03/Sep/14  Resolved: 07/Jul/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: nasf (Inactive)
Resolution: Duplicate
Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-3534 async update cross-MDTs Resolved
Severity: 3
Rank (Obsolete): 14796

 Description   

This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/ffea426a-02ee-11e4-ace8-5254006e85c2.

The sub-test test_13 failed with the following error:

(2) unexpected status

Info required for matching: sanity-lfsck 13

MDT Panic:

11:08:39:BUG: unable to handle kernel NULL pointer dereference at (null)
11:08:39:IP: [<ffffffffa02d430a>] lod_get_sub_trans+0x19a/0x520 [lod]
11:08:39:PGD 0 
11:08:39:Oops: 0000 [#1] SMP 
11:08:39:last sysfs file: /sys/devices/system/cpu/online
11:08:39:CPU 1 
11:08:39:Modules linked in: osp(U) mdd(U) lod(U) mdt(U) lfsck(U) mgs(U) nodemap(U) mgc(U) osd_zfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) libcfs(U) zfs(P)(U) jbd2 sha512_generic sha256_generic nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: lnet_selftest]
11:08:39:
11:08:39:Pid: 29582, comm: lfsck Tainted: P        W  ---------------    2.6.32-431.17.1.el6_lustre.gbabd429.x86_64 #1 Red Hat KVM
11:08:39:RIP: 0010:[<ffffffffa02d430a>]  [<ffffffffa02d430a>] lod_get_sub_trans+0x19a/0x520 [lod]
11:08:39:RSP: 0018:ffff88002b3b3b50  EFLAGS: 00010202
11:08:39:RAX: ffff88001b629c50 RBX: 0000000000000000 RCX: ffff88001b629c58
11:08:39:RDX: ffff88001b629c48 RSI: ffff88001b629c00 RDI: ffff88001bf60dc0
11:08:39:RBP: ffff88002b3b3b90 R08: 0000000000000001 R09: ffff88001b629c00
11:08:39:R10: ffff88001b629c00 R11: 0000000000000200 R12: ffff88006077b000
11:08:39:R13: ffff88001b629c00 R14: ffff88001bf60dc0 R15: ffff88005e6d3c00
11:08:39:FS:  0000000000000000(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
11:08:39:CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
11:08:39:CR2: 0000000000000000 CR3: 000000007af77000 CR4: 00000000000006e0
11:08:39:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
11:08:39:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
11:08:39:Process lfsck (pid: 29582, threadinfo ffff88002b3b2000, task ffff88006fc32ae0)
11:08:39:Stack:
11:08:39: ffff88006077b000 ffff88001b629c58 ffff88002b3b3b80 ffff88001ce15998
11:08:39:<d> ffff880033888388 ffff88001bf60dc0 ffff88001b629c00 ffff88005e6d3c00
11:08:39:<d> ffff88002b3b3bf0 ffffffffa02effec ffff88001b629c00 ffff8800190b0830
11:08:39:Call Trace:
11:08:39: [<ffffffffa02effec>] lod_declare_xattr_set+0x11c/0x330 [lod]
11:08:39: [<ffffffffa0f32d9d>] lfsck_layout_master_exec_oit+0x41d/0xc70 [lfsck]
11:08:39: [<ffffffffa1e915b5>] ? dmu_object_next+0x45/0x60 [zfs]
11:08:39: [<ffffffffa0f01840>] lfsck_exec_oit+0x70/0x9e0 [lfsck]
11:08:39: [<ffffffffa0f0cc3a>] lfsck_master_oit_engine+0x41a/0x18b0 [lfsck]
11:08:39: [<ffffffff8152806e>] ? thread_return+0x4e/0x760
11:08:39: [<ffffffffa0f0e43a>] lfsck_master_engine+0x36a/0x6f0 [lfsck]
11:08:39: [<ffffffff8152806e>] ? thread_return+0x4e/0x760
11:08:39: [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
11:08:39: [<ffffffffa0f0e0d0>] ? lfsck_master_engine+0x0/0x6f0 [lfsck]
11:08:39: [<ffffffff8109ab56>] kthread+0x96/0xa0
11:08:39: [<ffffffff8100c20a>] child_rip+0xa/0x20
11:08:39: [<ffffffff8109aac0>] ? kthread+0x0/0xa0
11:08:39: [<ffffffff8100c200>] ? child_rip+0x0/0x20
11:08:39:Code: 58 49 8d 4d 58 48 89 4d c8 48 39 c1 48 8d 50 f8 75 15 eb 57 0f 1f 44 00 00 48 8b 42 08 48 39 45 c8 48 8d 50 f8 74 44 48 8b 58 f8 <4c> 39 23 75 e9 f6 05 4e 3c f4 ff 01 0f 84 3d ff ff ff f6 05 3d 
11:08:39:RIP  [<ffffffffa02d430a>] lod_get_sub_trans+0x19a/0x520 [lod]
11:08:39: RSP <ffff88002b3b3b50>
11:08:39:CR2: 0000000000000000
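
For triage, the frames of a console trace like the one above can be extracted mechanically. The following is a minimal Python sketch and is not part of Maloo or the Lustre test framework: `FRAME_RE`, `parse_frames`, and `SAMPLE` are hypothetical names introduced here, and `SAMPLE` is a four-line excerpt copied from the Call Trace above. It assumes the el6-style frame format `[<address>] symbol+0xoffset/0xsize [module]`, where a leading `?` marks a frame the kernel considers unreliable.

```python
import re

# Hypothetical helper: matches one el6-style stack frame such as
#   [<ffffffffa02effec>] lod_declare_xattr_set+0x11c/0x330 [lod]
# The "?" marker and the trailing "[module]" are both optional.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# Excerpt copied from the Call Trace in this ticket.
SAMPLE = """\
11:08:39: [<ffffffffa02effec>] lod_declare_xattr_set+0x11c/0x330 [lod]
11:08:39: [<ffffffffa0f32d9d>] lfsck_layout_master_exec_oit+0x41d/0xc70 [lfsck]
11:08:39: [<ffffffffa1e915b5>] ? dmu_object_next+0x45/0x60 [zfs]
11:08:39: [<ffffffff8109ab56>] kthread+0x96/0xa0
"""


def parse_frames(log_text):
    """Return one dict per stack frame found in log_text, in log order."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # not a frame line (registers, stack dump, etc.)
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("module"),  # None for core-kernel symbols
            "reliable": m.group("unreliable") is None,
        })
    return frames


frames = parse_frames(SAMPLE)
for f in frames:
    flag = "" if f["reliable"] else " (unreliable)"
    print(f"{f['symbol']}+{f['offset']:#x} [{f['module']}]{flag}")
```

Note that the `IP:` and `RIP` lines of an oops use the same bracketed format, so feeding the whole log to `parse_frames` would also return the faulting instruction (`lod_get_sub_trans+0x19a` here) as a frame.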


 Comments   
Comment by Di Wang [ 07/Jul/14 ]

This comes from my DNE2 development patch (async update for 2.7). We are still discussing how to handle the transaction, and I will track all of the async update patches in LU-3534, so I am closing this one.
