[LU-4364] OST Page Fault test sanity test_133f: fldb_seq_start+0x6d Created: 09/Dec/13 Updated: 04/Mar/14 Resolved: 04/Mar/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Maloo | Assignee: | Di Wang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | revzfs, zfs | ||
| Severity: | 3 |
| Rank (Obsolete): | 11947 |
| Description |
|
This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>. It relates to the following test suite run: http://maloo.whamcloud.com/test_sets/8b0e35da-4d42-11e3-95a5-52540035b04c. The sub-test test_133f failed with the following error:
Info required for matching: sanity 133f

OST console log:
01:15:03:BUG: unable to handle kernel paging request at fffffffffffffffe
01:15:03:IP: [<ffffffffa0b86f8d>] fldb_seq_start+0x6d/0xc0 [fld]
01:15:03:PGD 1a87067 PUD 1a88067 PMD 0
01:15:03:Oops: 0000 [#1] SMP
01:15:03:last sysfs file: /sys/devices/system/cpu/possible
01:15:03:CPU 0
01:15:03:Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) osd_zfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic sha256_generic libcfs(U) nfsd exportfs autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
01:15:03:
01:15:03:Pid: 13161, comm: cat Tainted: P --------------- 2.6.32-358.23.2.el6_lustre.g02571dc.x86_64 #1 Red Hat KVM
01:15:03:RIP: 0010:[<ffffffffa0b86f8d>] [<ffffffffa0b86f8d>] fldb_seq_start+0x6d/0xc0 [fld]
01:15:03:RSP: 0018:ffff880072bdbdf8 EFLAGS: 00010246
01:15:03:RAX: fffffffffffffffe RBX: ffff8800662bea40 RCX: 0000000000000000
01:15:03:RDX: ffff880072e7b800 RSI: ffff88007200e470 RDI: ffff88006d1d1400
01:15:03:RBP: ffff880072bdbe18 R08: ffffc90004447000 R09: 0000000000000000
01:15:03:R10: 0000000000000001 R11: ffffffffffffffff R12: ffff880072bdbe60
01:15:03:R13: 0000000000000000 R14: 0000000000008000 R15: ffff880072bdbe60
01:15:03:FS: 00007f4e65a26700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
01:15:03:CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
01:15:03:CR2: fffffffffffffffe CR3: 000000006fa80000 CR4: 00000000000006f0
01:15:03:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
01:15:03:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
01:15:03:Process cat (pid: 13161, threadinfo ffff880072bda000, task ffff88006faa1500)
01:15:03:Stack:
01:15:03: ffff88007bfeb740 ffff88006ea59e00 ffff88007bfeb740 0000000000000000
01:15:03:<d> ffff880072bdbe98 ffffffff811a5356 ffff88006faa1500 0000000001860000
01:15:03:<d> ffff88007bfeb778 ffff880072bdbf48 0000000000008000 0000000000000000
01:15:03:Call Trace:
01:15:03: [<ffffffff811a5356>] seq_read+0x96/0x400
01:15:03: [<ffffffff811e9bae>] proc_reg_read+0x7e/0xc0
01:15:03: [<ffffffff81181ac5>] vfs_read+0xb5/0x1a0
01:15:03: [<ffffffff81181c01>] sys_read+0x51/0x90
01:15:03: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

The first occurrence is linked above. It came on patch http://review.whamcloud.com/7884, which touches fld code and may be responsible. |
| Comments |
| Comment by Peter Jones [ 10/Dec/13 ] |
|
Di, could you please comment on this one? Thanks, Peter |
| Comment by Oleg Drokin [ 10/Dec/13 ] |
|
It looks like an attempt to dereference a pointer that's -Esomething. |
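The observation above can be checked against the register dump: the faulting address in CR2 and RAX is fffffffffffffffe, which is -2 when read as a signed 64-bit value, and errno 2 on Linux is ENOENT. The following is a minimal userspace sketch of that decoding, not kernel code. The `sketch_` names are hypothetical stand-ins modelled on the kernel's `IS_ERR_VALUE()` and `PTR_ERR()` helpers, and the constant 4095 mirrors the kernel's `MAX_ERRNO`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's MAX_ERRNO (hypothetical name). */
#define SKETCH_MAX_ERRNO 4095

/* True if the address falls in the top 4095 values of the 64-bit address
 * space, the range the kernel uses for ERR_PTR()-encoded error codes. */
static bool sketch_is_err_value(uint64_t addr)
{
	return addr >= (uint64_t)-(int64_t)SKETCH_MAX_ERRNO;
}

/* Reinterpret the address as a signed value to recover the negative errno. */
static long sketch_ptr_err(uint64_t addr)
{
	return (long)(int64_t)addr;
}
```

Calling `sketch_ptr_err(0xfffffffffffffffeULL)` yields -2, that is, -ENOENT, while an ordinary kernel address such as the RBX value in the dump is not classified as an error value. Whether the pointer really originated as an ENOENT error return is an inference from the value alone.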
| Comment by Di Wang [ 11/Dec/13 ] |
|
Hmm, it seems ZFS has different iteration behavior than ldiskfs. This line probably needs to change:
*pos = be64_to_cpu(*(__u64 *)iops->key(&param->fsp_env, param->fsp_it));
i.e. we need to check the return value of key. |
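The suggestion to check the return value of key before dereferencing it can be sketched in userspace C as follows. This is not the patch that landed, and it does not use the real Lustre iterator API: every `demo_` name is hypothetical, the iterator is a mock, and the assumption that key can return an ERR_PTR-style value is taken from the register dump in the description.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR/IS_ERR/PTR_ERR helpers. */
static void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-(intptr_t)DEMO_MAX_ERRNO;
}
static long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical iterator: holds a big-endian 64-bit key, or NULL when the
 * iterator is not positioned on a record. */
struct demo_iter {
	const unsigned char *key_be;
};

/* Mock of a key() method that reports "no record" as an encoded error. */
static void *demo_key(const struct demo_iter *it)
{
	if (it->key_be == NULL)
		return demo_err_ptr(-ENOENT);
	return (void *)it->key_be;
}

/* Decode a big-endian 64-bit value byte by byte (portable be64_to_cpu). */
static uint64_t demo_be64_to_cpu(const unsigned char *b)
{
	uint64_t v = 0;
	for (int i = 0; i < 8; i++)
		v = (v << 8) | b[i];
	return v;
}

/* Check the key before dereferencing it. Returns 0 and fills *pos on
 * success, or the negative errno carried by the key pointer on failure. */
static long demo_read_pos(const struct demo_iter *it, uint64_t *pos)
{
	void *key = demo_key(it);

	if (demo_is_err(key))
		return demo_ptr_err(key);
	*pos = demo_be64_to_cpu(key);
	return 0;
}
```

With an iterator that has no current record, `demo_read_pos()` returns -ENOENT and leaves `*pos` untouched instead of faulting on the encoded error pointer.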
| Comment by Di Wang [ 11/Dec/13 ] |
| Comment by nasf (Inactive) [ 27/Dec/13 ] |
|
Another failure instance: https://maloo.whamcloud.com/test_sets/ec56daa6-6e83-11e3-b713-52540035b04c |
| Comment by Andreas Dilger [ 28/Dec/13 ] |
|
Patch landed to master a few hours ago; hopefully it will fix the problem. |
| Comment by Jodi Levi (Inactive) [ 04/Mar/14 ] |
|
Patch landed to Master. |