[LU-2406] Interop 2.3<->2.4 Failure: unable to handle kernel NULL pointer dereference at (null) Created: 29/Nov/12 Updated: 26/Dec/12 Resolved: 13/Dec/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Andreas Dilger |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
server: 2.3 RHEL6 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 5701 | ||||||||
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/72277ac4-397d-11e2-9fda-52540035b04c.
The sub-test test_26a failed with the following error:
From OST console log:
15:34:24:Lustre: DEBUG MARKER: == sanity test 26a: multiple component symlink ========================= 15:34:02 (1353972842)
15:34:24:Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 2>/dev/null || true
15:34:24:BUG: unable to handle kernel NULL pointer dereference at (null)
15:34:24:IP: [<(null)>] (null)
15:34:24:PGD 7bc31067 PUD 7bc38067 PMD 0
15:34:24:Oops: 0010 [#1] SMP
15:34:24:last sysfs file: /sys/devices/system/cpu/possible
15:34:24:CPU 0
15:34:24:Modules linked in: osd_ldiskfs(U) fsfilt_ldiskfs(U) ldiskfs(U) lustre(U) ofd(U) ost(U) cmm(U) mdt(U) mdd(U) mds(U) mgs(U) jbd2 obdecho(U) mgc(U) lquota(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
15:34:24:
15:34:24:Pid: 17619, comm: ll_ost00_007 Not tainted 2.6.32-279.5.1.el6_lustre.gb16fe80.x86_64 #1 Red Hat KVM
15:34:24:RIP: 0010:[<0000000000000000>] [<(null)>] (null)
15:34:24:RSP: 0018:ffff88007b21bdf8 EFLAGS: 00010093
15:34:24:RAX: ffff88007b2a1e30 RBX: ffffffffffffffe8 RCX: 0000000000000000
15:34:24:RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88007b2a1e30
15:34:24:RBP: ffff88007b21be40 R08: 0000000000000000 R09: 0000000000000000
15:34:24:R10: 000000000000000f R11: 000000000000000f R12: 0000000000000000
15:34:24:R13: ffff880078a8a280 R14: 0000000000000000 R15: 0000000000000000
15:34:24:FS: 00007f0021cd2700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
15:34:24:CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
15:34:24:CR2: 0000000000000000000fffbc000 - 0000000100000000 (reserved) |
| Comments |
| Comment by Andreas Dilger [ 30/Nov/12 ] |
|
Unfortunately, there is not enough information in the stack dump to know anything about what failed, or where. Looking more closely, it appears that the root problem is that the 2.4 test-framework.sh defaults to "USE_OFD=yes", which causes the 2.3 code to run with the ofd and osd-ldiskfs modules on the OST. I'm just working on a patch to remove "USE_OFD" from the 2.4 t-f entirely. |
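To illustrate the failure mode described in this comment, here is a minimal shell sketch of how an unconditional "yes" default can leak onto an older server, and how a version-aware default would avoid it. This is not the actual test-framework.sh code and not the patch mentioned above. The function names are hypothetical, and "obdfilter" is an assumed name for the pre-OFD OST module. Only USE_OFD, ofd and osd-ldiskfs come from the comment.

```shell
#!/bin/sh
# Hypothetical sketch: helper names are illustrative and are NOT taken
# from test-framework.sh. "obdfilter" as the legacy OST module name is an
# assumption made for this example.

# Return success if a dotted version string (e.g. "2.3", "2.4.0") is >= 2.4.
version_at_least_2_4() {
    major=${1%%.*}          # text before the first dot
    rest=${1#*.}            # text after the first dot
    minor=${rest%%.*}       # second component only
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 4 ]; }
}

# Print the OST module stack to load for the given server version.
# An explicit USE_OFD from the caller wins; otherwise the default follows
# the server version instead of being hard-coded to "yes".
select_ost_modules() {
    server_version=$1
    if version_at_least_2_4 "$server_version"; then
        default_use_ofd=yes
    else
        default_use_ofd=no
    fi
    use_ofd=${USE_OFD:-$default_use_ofd}
    if [ "$use_ofd" = yes ]; then
        echo "ofd osd-ldiskfs"
    else
        echo "obdfilter"
    fi
}
```

With USE_OFD unset, `select_ost_modules 2.3` selects the legacy stack. Forcing USE_OFD=yes reproduces the situation in this ticket, where a 2.3 OST ends up running with the ofd and osd-ldiskfs modules.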
| Comment by Andreas Dilger [ 30/Nov/12 ] |
| Comment by Andreas Dilger [ 30/Nov/12 ] |
|
This is caused by b2_3 interop tests failing due to |
| Comment by Jodi Levi (Inactive) [ 05/Dec/12 ] |
|
Per Andreas and Oleg this can be removed as an NF Blocker, but will remain a top blocker for 2.4. |
| Comment by Sarah Liu [ 12/Dec/12 ] |
|
In the latest testing between tag-2.3.57 and b2_3, this test passed: |
| Comment by Peter Jones [ 13/Dec/12 ] |
|
Landed for 2.4 |