[LU-9279] coral-beta-combined build 124 kernel BUG at include/linux/scatterlist.h:65! invalid opcode: 0000 [#1] SMP Created: 15/Mar/17 Updated: 18/Aug/17 Resolved: 18/Aug/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | John Salinas (Inactive) | Assignee: | Nathaniel Clark |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LS_RZ, prod | ||
| Environment: |
Lustre 2.9.0, but with special zfs: fs/zfs -b coral-beta-combined build 124 |
||
| Issue Links: |
|
| Severity: | 1 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Running IOR, Mdtest, fsx, and FileAger on 4 clients to two OSS with dRAID pools with metadata segregation and 1 MDS we hit the following:

[78289.557925] -----------
Version information:
PID: 51095  TASK: ffff882001da8b80  CPU: 37  COMMAND: "ll_ost_io00_005" |
| Comments |
| Comment by John Salinas (Inactive) [ 16/Mar/17 ] |
|
Dump is in: /scratch/dumps/wolf-3.wolf.hpdd.intel.com/10.8.1.3-2017-03-15-15:01:39/

0x6f50 is in cfs_crypto_hash_update_page (/usr/src/debug/lustre-2.9.0_dirty/libcfs/libcfs/linux/linux-crypto.c:230).

0xf3f1 is in target_send_reply_msg (/usr/src/debug/lustre-2.9.0_dirty/lustre/ldlm/ldlm_lib.c:2902).
2897                    RETURN(0);
2898    }
2899
2900    static int target_send_reply_msg(struct ptlrpc_request *req,
2901                                     int rc, int fail_id)
2902    {
2903            if (OBD_FAIL_CHECK_ORSET(fail_id & ~OBD_FAIL_ONCE, OBD_FAIL_ONCE)) {
2904                    DEBUG_REQ(D_ERROR, req, "dropping reply");
2905                    return -ECOMM;
2906            } |
| Comment by Peter Jones [ 31/Mar/17 ] |
|
Nathaniel, could you please assist with this one? Oleg wonders if it is a similar issue to

Peter |
| Comment by John Salinas (Inactive) [ 12/Apr/17 ] |
|
Nathaniel do you have any questions for us? |
| Comment by Nathaniel Clark [ 12/Apr/17 ] |
|
This does look like the same area as

The crash is due to an assertion in the scatterlist code:

0xacb1 is in cfs_crypto_hash_update_page (include/linux/scatterlist.h:65).
60
61      /*
62       * In order for the low bit stealing approach to work, pages
63       * must be aligned at a 32-bit boundary as a minimum.
64       */
65      BUG_ON((unsigned long) page & 0x03);
66      #ifdef CONFIG_DEBUG_SG
67      BUG_ON(sg->sg_magic != SG_MAGIC);
68      BUG_ON(sg_is_chain(sg));
69      #endif

Do any of your patches on 2.9 go anywhere near this code? Are you using the vanilla 2.9 code?

It looks like the kiov_page in the ptlrpc_bulk_desc either didn't get initialized or was set from a bad value. |
| Comment by John Salinas (Inactive) [ 13/Apr/17 ] |
|
We use vanilla 2.9 code, but we have tried enabling both of these on the clients:

And this on the server:

All of our code changes are in ZFS, where we have a dRAID pool instead of a RAIDZ pool and that pool has metadata segregation. |
| Comment by Nathaniel Clark [ 13/Apr/17 ] |
|
Do you have a crash dump from this? (possibly in /var/crash/<whateverdateandtime>/) Could you also attach an sosreport from the MDS? Thanks |
| Comment by John Salinas (Inactive) [ 13/Apr/17 ] |
|
We are using the vanilla 2.9.0 code, but in the ZFS code we have added a new RAID type (dRAID). We are making use of 16MB RPCs from the Lustre client to the OSS and have set the BRW size to 16MB as well. Hope that helps, |
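For reference, 16MB bulk I/O is typically enabled with settings along these lines (a sketch only; the exact parameter names and wildcards depend on the Lustre version, so treat these as assumptions rather than the reporter's actual commands):

```shell
# On the OSS: advertise 16MB bulk RPCs (brw_size is in MB).
lctl set_param obdfilter.*.brw_size=16

# On the client: allow up to 16MB per RPC (4096 x 4KB pages).
lctl set_param osc.*.max_pages_per_rpc=4096
```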
| Comment by John Salinas (Inactive) [ 13/Apr/17 ] |
|
The dump was in: /scratch/dumps/wolf-3.wolf.hpdd.intel.com/10.8.1.3-2017-03-15-15:01:39/ – I have asked doc if it can be restored; I am not sure. I guess we need a more persistent place for these ... |
| Comment by Nathaniel Clark [ 13/Apr/17 ] |
|
Given this and |
| Comment by John Salinas (Inactive) [ 13/Apr/17 ] |
|
I could try to reproduce this on RAIDZ if that would help. I am not sure how to tell, but a bunch of work was done to integrate ABD (ARC buffer data), which included switching how ZFS handles Linux memory to make it more efficient. |
| Comment by Nathaniel Clark [ 13/Apr/17 ] |
|
That sounds suspicious. Which branch is that on (just to simplify my searching)? |
| Comment by John Salinas (Inactive) [ 13/Apr/17 ] |
|
fs/zfs -b coral-beta-combined

I have run heavy loads with just ZFS without issue. However, running with Lustre 2.9.0 + this causes issues. The ABD work has already been merged; in theory, if that is the issue, we could reproduce it with just 0.7.0. |
| Comment by Nathaniel Clark [ 14/Apr/17 ] |
|
The ABD code uses scatterlists to track pages. It's like the bits sg uses on page_link are leaking... but that should cause lots of problems. Getting a crash dump from this would be the most useful way forward. |
| Comment by Andreas Dilger [ 25/Apr/17 ] |
|
If there is memory corruption in |
| Comment by Jinshan Xiong (Inactive) [ 05/May/17 ] |
|
Try CentOS 7.3 and check if you can still see the issue. I suspect this is a kernel bug in the crypto framework. |
| Comment by John Salinas (Inactive) [ 05/May/17 ] |
|
Will do. I am working on getting 0.7.0 RC4 + CentOS 7.3 + the 2.10 tag. |
| Comment by Andreas Dilger [ 12/May/17 ] |
|
Copying the comments from
|
| Comment by Andreas Dilger [ 12/May/17 ] |
|
John S., could you please comment on whether you have been able to test with RHEL7.3 and if you are still seeing the checksum errors/corruption? |
| Comment by Peter Jones [ 18/May/17 ] |
|
Descoping from 2.10 in the absence of any information that this is still a live issue |
| Comment by Nathaniel Clark [ 18/Aug/17 ] |
|
This appears to be fixed by the RHEL 7.3 kernel update. |