[LU-3974] Support for linux 3.11 kernel Created: 19/Sep/13 Updated: 27/Jan/16 Resolved: 02/Mar/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0, Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Bob Glossman (Inactive) | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Rank (Obsolete): | 10608 |
| Description |
|
Tracker for 3.11 kernel support. As fc19 just recently updated its kernel from 3.10 to 3.11, we will need to support the 3.11 kernel soon. |
| Comments |
| Comment by Bob Glossman (Inactive) [ 19/Sep/13 ] |
|
Already found at least one incompatibility with current lustre. num_physpages, a global kernel export used by lustre code, no longer exists. I'm thinking we will need some autoconf change(s) to adapt to this fact. |
| Comment by Dmitry Eremin (Inactive) [ 20/Sep/13 ] |
|
num_physpages can be replaced by the inline function get_num_physpages(). |
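A minimal sketch of how the two kernel interfaces could be hidden behind one wrapper chosen by an autoconf result. This is a userspace mock, not Lustre code: the macro `HAVE_GET_NUM_PHYSPAGES`, the wrapper `compat_num_physpages()`, and the page count are hypothetical, and both kernel symbols are stubbed so the block compiles outside the kernel.

```c
/* Hypothetical autoconf result: defined when the kernel provides
 * get_num_physpages() (3.11 and later per this ticket). */
#define HAVE_GET_NUM_PHYSPAGES 1

#ifdef HAVE_GET_NUM_PHYSPAGES
/* Stub standing in for the kernel's inline helper. */
static inline unsigned long get_num_physpages(void)
{
	return 262144UL; /* made-up value: 1 GiB worth of 4 KiB pages */
}
#else
/* Stub standing in for the older exported global. */
static unsigned long num_physpages = 262144UL;
#endif

/* Hypothetical wrapper: callers use this and never touch either
 * kernel symbol directly. */
static inline unsigned long compat_num_physpages(void)
{
#ifdef HAVE_GET_NUM_PHYSPAGES
	return get_num_physpages();
#else
	return num_physpages;
#endif
}
```

Callers would then use `compat_num_physpages()` everywhere, so only the configure check changes when the kernel interface does.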
| Comment by Bob Glossman (Inactive) [ 20/Sep/13 ] |
|
address_space_operations.invalidatepage now takes 3 args instead of 2. This will force mods in ll_invalidatepage, truncate_complete_page, maybe ldiskfs, maybe elsewhere I haven't identified yet. |
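A rough sketch of one way to cope with the signature change: keep the real work in a common helper and compile whichever entry point matches the kernel. This is a userspace mock under stated assumptions: `struct page` is a stand-in, and `HAVE_INVALIDATE_RANGE`, `MOCK_PAGE_SIZE`, `mock_invalidate()` and `demo_invalidatepage()` are hypothetical names, not the actual ll_invalidatepage code.

```c
#define MOCK_PAGE_SIZE 4096U

/* Hypothetical autoconf result: defined when invalidatepage takes
 * (page, offset, length) rather than (page, offset). */
#define HAVE_INVALIDATE_RANGE 1

/* Mock stand-in for the kernel's struct page. */
struct page {
	int valid;
};

/* Common logic: only drop the page when all of it is invalidated. */
static void mock_invalidate(struct page *pg, unsigned int offset,
			    unsigned int length)
{
	if (offset == 0 && length == MOCK_PAGE_SIZE)
		pg->valid = 0;
}

#ifdef HAVE_INVALIDATE_RANGE
/* Three-argument form: the caller passes the length explicitly. */
static void demo_invalidatepage(struct page *pg, unsigned int offset,
				unsigned int length)
{
	mock_invalidate(pg, offset, length);
}
#else
/* Two-argument form: the range runs from offset to the end of the page. */
static void demo_invalidatepage(struct page *pg, unsigned long offset)
{
	mock_invalidate(pg, (unsigned int)offset,
			MOCK_PAGE_SIZE - (unsigned int)offset);
}
#endif
```

With the three-argument form, a partial range such as offset 512, length 100 leaves the mock page valid, while offset 0 with a full-page length drops it.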
| Comment by James A Simmons [ 20/Sep/13 ] |
|
Last time I tried this, the d_compare API had also changed, which affects some of the llite code. |
| Comment by Bob Glossman (Inactive) [ 23/Sep/13 ] |
|
I see a couple of recent patches that do part of the job of getting client builds working again: http://review.whamcloud.com/#/c/7726 These are in addition to the set of patches listed in |
| Comment by James A Simmons [ 23/Sep/13 ] |
|
Yep. More coming. I just need to do some more testing to make sure they work everywhere. |
| Comment by James A Simmons [ 24/Sep/13 ] |
|
For initial 3.11 client support you need the following: All the patches from
That will give you basic 3.10 support. The rest need to be applied in the order below: http://review.whamcloud.com/#/c/7741 Changes affecting NFS happened, so that needs to be tested for possible regressions. |
| Comment by Bob Glossman (Inactive) [ 24/Sep/13 ] |
|
I think you left out http://review.whamcloud.com/#/c/7726 |
| Comment by Bob Glossman (Inactive) [ 24/Sep/13 ] |
|
I've verified that with the new additional patches client builds work again for fc19 with the 3.11 kernel. Any ETA for server builds? |
| Comment by James A Simmons [ 24/Sep/13 ] |
|
Hm. I'm seeing problems with the infiniband stack. Are you? Lustre: Lustre: Build Version: 2.4.92-gb1bc221-CHANGED-3.11.1 As for the server support, that will be a while still. I'm halfway done. |
| Comment by Bob Glossman (Inactive) [ 24/Sep/13 ] |
|
So far have only done building, haven't actually tried an install so I wouldn't have seen missing symbols. Will try that in a while. |
| Comment by Bob Glossman (Inactive) [ 24/Sep/13 ] |
|
I think all those symbols should be in infiniband .ko loadable modules. You didn't maybe config your kernel without infiniband or try to force load the ko2iblnd lnet module without having infiniband modules loaded, did you? |
| Comment by James A Simmons [ 24/Sep/13 ] |
|
[root@ninja05 infiniband]# lsmod | grep mlx |
| Comment by Bob Glossman (Inactive) [ 25/Sep/13 ] |
|
I haven't been able to reproduce your failure with IB symbols. I'm running in VMs without any IB, but if I force load of ko2iblnd with 'modprobe ko2iblnd' I don't see any Unknown symbol errors and I do see the following set of previously not loaded kernel modules getting loaded:
rdma_cm 42363 1 ko2iblnd
That's the good news. On the bad news side I can't get the fc19 client to actually function. It builds fine, appears to install fine. I can do a mount of a lustre file system on Centos servers that works fine in Centos and SuSE clients, but any access fails. Example:
[root@fedora19 x86_64]# mount -t lustre centos2:/lustre /mnt/lustre
[root@fedora19 x86_64]# ls -l /mnt/lustre
ls: reading directory /mnt/lustre: Input/output error
total 0
I'm still trying to debug what it's unhappy about. |
| Comment by Bob Glossman (Inactive) [ 25/Sep/13 ] |
|
It seems the failures I'm seeing only happen during directory access. I can actually create some files, put data in them, read the data back all from the fc19 client. Only see errors during ls (so far). Pretty sure the directories and inodes on disk are really OK. All look fine viewed from other clients. |
| Comment by James A Simmons [ 25/Sep/13 ] |
|
Have you tried the lustre client that comes with 3.11? I'm curious to see if that works. |
| Comment by Bob Glossman (Inactive) [ 25/Sep/13 ] |
|
I did make some attempt to build it. Needs some non-obvious manual edits to Kconfig files just to get it to try to build. Doesn't build in its current form, at least for me. Strongly suspect it needs one or more of the patches in this ticket or the 3.10 kernel ticket or both. |
| Comment by Andreas Dilger [ 25/Sep/13 ] |
|
Bob, We can also use that ticket (or this one) to track the landing of patches into master that are getting applied to the upstream kernel (where that makes sense). I know there are a number of good cleanups going into the Lustre client in the kernel, but there are also some code deletions that we either need to rework in order to get something acceptable to upstream, or keep them in the out-of-tree client for our own needs (e.g. watchdogs and stack tracing on servers, etc). |
| Comment by Bob Glossman (Inactive) [ 01/Oct/13 ] |
|
About the problems with directory access. Each attempt to do ls /mnt/lustre generates some errors that look like:
Oct 1 09:33:45 fedora19 kernel: [174400.937657] LustreError: 24568:0:(dir.c:398:ll_get_dir_page()) dir page locate: [0x200000007:0x1:0x0] at 18446744073709551614: rc -5
Oct 1 09:33:45 fedora19 kernel: [174400.937661] LustreError: 24568:0:(dir.c:595:ll_dir_read()) error reading dir [0x200000007:0x1:0x0] at 18446744073709551614: rc -5
Attempting to mkdir in /mnt/lustre just hangs. |
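For readers matching the kernel log to what ls prints: the "rc -5" in these messages is a negated errno, and errno 5 on Linux is EIO. A small userspace helper, purely illustrative (`rc_to_text()` is a hypothetical name), shows the mapping:

```c
#include <errno.h>
#include <string.h>

/* Translate a negative kernel-style return code (for example -5)
 * into the C library's message for the matching errno value. */
static const char *rc_to_text(int rc)
{
	return strerror(-rc);
}
```

On Linux with glibc, `rc_to_text(-5)` gives the same "Input/output error" text that ls reports for the directory.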
| Comment by Bob Glossman (Inactive) [ 01/Oct/13 ] |
|
Current build recipe for clients is:
http://review.whamcloud.com/#/c/5512
http://review.whamcloud.com/#/c/7135
http://review.whamcloud.com/#/c/5380
http://review.whamcloud.com/#/c/7741
A few of these need refresh. In particular http://review.whamcloud.com/#/c/5512 adds a couple more instances of d_refcount() that aren't yet fixed in http://review.whamcloud.com/#/c/7741. No solution yet about functional problems. |
| Comment by Bob Glossman (Inactive) [ 01/Oct/13 ] |
|
In an attempt to narrow down the source of the runtime problems with the 3.11 client code I applied and built the same recipe (- the patches from |
| Comment by Bob Glossman (Inactive) [ 04/Oct/13 ] |
|
Retried the same recipe using the recent refresh of the |
| Comment by Peng Tao [ 15/Oct/13 ] |
|
Bob, I didn't see the ll_read_dir() error message in the upstream kernel client, but I did see the mkdir hang. It is because Lustre always saves the first page of a dir inode mapping at index ~0UL. After 3.11 commit 5a72039 (mm: teach truncate_inode_pages_range() to handle non page aligned ranges), truncate_inode_pages_range() NO LONGER truncates the page that is sitting at index ~0UL. My patch to fix it is at: It is not merged in the upstream kernel yet because the problem was not there when I tested the code in the staging tree before 3.11. |
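A simplified model of why a page stored at index ~0UL can be missed. It assumes (this is not spelled out in the ticket) that the reworked truncate loop treats its end index as exclusive, so even the largest possible end value never covers the last index. The helper names are hypothetical and this is not the kernel's code.

```c
typedef unsigned long pgoff_t;

/* Inclusive upper bound: the page at 'last' itself is covered. */
static int covered_inclusive(pgoff_t index, pgoff_t last)
{
	return index <= last;
}

/* Exclusive upper bound: the page at 'end' itself is NOT covered,
 * so no value of 'end' can ever cover index ~0UL. */
static int covered_exclusive(pgoff_t index, pgoff_t end)
{
	return index < end;
}
```

With an inclusive bound of ~0UL the page at index ~0UL is visited; with an exclusive bound of ~0UL the highest index visited is ~0UL - 1, which would leave a page parked at ~0UL in the mapping.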
| Comment by Bob Glossman (Inactive) [ 15/Oct/13 ] |
|
Is that mod to lustre_lite.h safe in any lustre version or only good in upstream lustre on 3.11 that has the commit 5a72039 (mm: teach truncate_inode_pages_range() to handle non page aligned ranges) you mention above? |
| Comment by Peng Tao [ 15/Oct/13 ] |
|
It is safe for any Lustre version. |
| Comment by Bob Glossman (Inactive) [ 15/Oct/13 ] |
|
Tried out the patch in https://github.com/bergwolf/linux/commit/28fde12d26d7d4b54cb4689dd6117c2ed791f985. Does solve the problem of mkdir hang. No impact on the other directory access issues. Still see errors like:
# ll -aR /mnt/lustre
/mnt/lustre:
ls: reading directory /mnt/lustre: Input/output error
total 20
drwxr-xr-x 4 root root 4096 Oct 15 09:37 .
drwxr-xr-x. 5 root root 4096 Sep 24 12:20 ..
drwxr-xr-x 2 root root 4096 Oct 15 09:37 dtest
drwxr-xr-x 3 root root 4096 Aug 29 14:05 .lustre
-rw-r--r-- 1 root root 4 Sep 25 09:10 xxx
/mnt/lustre/dtest:
ls: reading directory /mnt/lustre/dtest: Input/output error
total 8
drwxr-xr-x 2 root root 4096 Oct 15 09:37 .
drwxr-xr-x 4 root root 4096 Oct 15 09:37 ..
/mnt/lustre/.lustre:
ls: reading directory /mnt/lustre/.lustre: Input/output error
total 12
drwxr-xr-x 3 root root 4096 Aug 29 14:05 .
drwxr-xr-x 4 root root 4096 Oct 15 09:37 ..
d--x------ 2 root root 4096 Aug 29 14:05 fid
/mnt/lustre/.lustre/fid:
ls: reading directory /mnt/lustre/.lustre/fid: Input/output error
total 8
d--x------ 2 root root 4096 Aug 29 14:05 .
drwxr-xr-x 3 root root 4096 Aug 29 14:05 ..
coupled with log errors like:
Oct 15 09:37:04 fedora19 kernel: [ 204.735226] LustreError: 2647:0:(dir.c:422:ll_get_dir_page()) read cache page: [0x200000007:0x1:0x0] at 18446744073709551614: rc -5
Oct 15 09:37:04 fedora19 kernel: [ 204.735231] LustreError: 2647:0:(dir.c:422:ll_get_dir_page()) Skipped 2 previous similar messages
Oct 15 09:37:04 fedora19 kernel: [ 204.735235] LustreError: 2647:0:(dir.c:595:ll_dir_read()) error reading dir [0x200000007:0x1:0x0] at 18446744073709551614: rc -5
Oct 15 09:37:04 fedora19 kernel: [ 204.735236] LustreError: 2647:0:(dir.c:595:ll_dir_read()) Skipped 2 previous similar messages
Oct 15 09:37:04 fedora19 kernel: [ 204.735555] LustreError: 2647:0:(dir.c:398:ll_get_dir_page()) dir page locate: [0x200000002:0x1:0x0] at 18446744073709551614: rc -5 |
| Comment by James A Simmons [ 15/Oct/13 ] |
|
I think I know what the bug might be. Patch coming soon. |
| Comment by James A Simmons [ 15/Oct/13 ] |
|
Bob I just updated http://review.whamcloud.com/#/c/7747. Can you try it now with Peng's fix. |
| Comment by Bob Glossman (Inactive) [ 15/Oct/13 ] |
|
The refresh of #7747 had no effect that I can see. Still get similar errors on directory access. Peng's fix still appears to work. |
| Comment by Bob Glossman (Inactive) [ 15/Oct/13 ] |
|
I think the refresh did have some effect. It looks like there is now exactly one log error associated with each directory failure. 1 per directory and no more. Think it was a bit more variable before the refresh. |
| Comment by James A Simmons [ 15/Oct/13 ] |
|
Can you post your last error? |
| Comment by Bob Glossman (Inactive) [ 15/Oct/13 ] |
|
most recent errors.
stdout/stderr:
# ll -aR /mnt/lustre
/mnt/lustre:
ls: reading directory /mnt/lustre: Input/output error
total 20
drwxr-xr-x 4 root root 4096 Oct 15 11:18 .
drwxr-xr-x. 5 root root 4096 Sep 24 12:20 ..
drwxr-xr-x 2 root root 4096 Oct 15 10:48 ddd
drwxr-xr-x 3 root root 4096 Aug 29 14:05 .lustre
-rw-r--r-- 1 root root 4 Sep 25 09:10 xxx
/mnt/lustre/ddd:
ls: reading directory /mnt/lustre/ddd: Input/output error
total 8
drwxr-xr-x 2 root root 4096 Oct 15 10:48 .
drwxr-xr-x 4 root root 4096 Oct 15 11:18 ..
/mnt/lustre/.lustre:
ls: reading directory /mnt/lustre/.lustre: Input/output error
total 12
drwxr-xr-x 3 root root 4096 Aug 29 14:05 .
drwxr-xr-x 4 root root 4096 Oct 15 11:18 ..
d--x------ 2 root root 4096 Aug 29 14:05 fid
/mnt/lustre/.lustre/fid:
ls: reading directory /mnt/lustre/.lustre/fid: Input/output error
total 8
d--x------ 2 root root 4096 Aug 29 14:05 .
drwxr-xr-x 3 root root 4096 Aug 29 14:05 ..
/var/log/messages:
Oct 15 13:41:21 fedora19 kernel: [ 8177.767876] LustreError: 25405:0:(dir.c:398:ll_get_dir_page()) dir page locate: [0x200000007:0x1:0x0] at 18446744073709551614: rc -5
Oct 15 13:41:21 fedora19 kernel: [ 8177.767884] LustreError: 25405:0:(dir.c:596:ll_dir_read()) error reading dir [0x200000007:0x1:0x0] at 18446744073709551614: rc -5
Oct 15 13:41:21 fedora19 kernel: [ 8177.767887] LustreError: 25405:0:(dir.c:596:ll_dir_read()) Skipped 2 previous similar messages
Oct 15 13:41:21 fedora19 kernel: [ 8177.769821] LustreError: 25405:0:(dir.c:398:ll_get_dir_page()) dir page locate: [0x200000002:0x1:0x0] at 18446744073709551614: rc -5
Oct 15 13:41:21 fedora19 kernel: [ 8177.769824] LustreError: 25405:0:(dir.c:398:ll_get_dir_page()) Skipped 1 previous similar message |
| Comment by Bob Glossman (Inactive) [ 15/Oct/13 ] |
|
The offset reported in all those logged errors is 18446744073709551614 == 0xfffffffffffffffe == MDS_DIR_END_OFF. I wonder if that's a clue. |
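The arithmetic in that observation can be checked with a few lines of C. The constant below simply repeats the value quoted in the comment; `DEMO_DIR_END_OFF` and `is_dir_end_off()` are illustrative names, not the Lustre definitions.

```c
#include <stdint.h>

/* Value quoted in the comment above as MDS_DIR_END_OFF. */
#define DEMO_DIR_END_OFF 0xfffffffffffffffeULL

/* Return nonzero when a directory position equals the end marker. */
static int is_dir_end_off(uint64_t pos)
{
	return pos == DEMO_DIR_END_OFF;
}
```

The decimal offset from the logs, 18446744073709551614, is 2^64 - 2, which is one below the maximum 64-bit value and matches the hex constant.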
| Comment by Bob Glossman (Inactive) [ 15/Oct/13 ] |
|
I think I may have trial & errored my way to a fix. Don't know if it's a good fix. If I add the following the errors go away.
diff --git a/lustre/llite/dir.c b/lustre/llite/dir.c
index 5f9e46b..9836c02 100644
--- a/lustre/llite/dir.c
+++ b/lustre/llite/dir.c
@@ -640,7 +640,9 @@ static int ll_readdir(struct file *filp, void *cookie, filld
GOTO(out, rc = 0);
#ifdef HAVE_DIR_CONTEXT
+ ctx->pos = pos;
rc = ll_dir_read(inode, ctx);
+ pos = ctx->pos;
#else
rc = ll_dir_read(inode, &pos, cookie, filldir);
#endif
Seems like a piece missing from #7747. |
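To make the intent of those two added lines easier to see, here is a userspace mock of the pattern: seed the context's position before the read, then copy the advanced position back afterwards. Everything here is a stand-in under stated assumptions: `struct dir_context` carries only the one field used, and `struct demo_file`, `demo_dir_read()` and `demo_readdir()` are hypothetical, not the llite functions.

```c
#include <stdint.h>

/* Mock of the kernel structure: only the position field is modelled. */
struct dir_context {
	uint64_t pos;
};

/* Hypothetical stand-in for per-open-file state. */
struct demo_file {
	uint64_t f_pos;
};

/* Mock reader: consumes 'nent' entries starting at ctx->pos and
 * leaves the new position in ctx->pos. */
static int demo_dir_read(struct dir_context *ctx, unsigned int nent)
{
	ctx->pos += nent;
	return 0;
}

static int demo_readdir(struct demo_file *filp, struct dir_context *ctx,
			unsigned int nent)
{
	uint64_t pos = filp->f_pos;
	int rc;

	ctx->pos = pos;		/* seed the context from the saved position */
	rc = demo_dir_read(ctx, nent);
	pos = ctx->pos;		/* carry the advanced position back out */

	filp->f_pos = pos;
	return rc;
}
```

Without the two copies, the reader would start from whatever stale value the context held, and the caller would never see how far the read advanced.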
| Comment by Peng Tao [ 16/Oct/13 ] |
|
So the patch in question seems to be http://review.whamcloud.com/#/c/7747. James, I do not have Fedora 19 kernel but in upstream kernel, 3.11 release converts .readdir to .iterate in struct file_operations. And it comes along with the struct dir_context change. Does Fedora 19 kernel include them all or just the dir_context part? In upstream kernel client, my patch for dealing with .iterate change is https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0b09d381bb2b4ccab15711cf98858a7146b24749. You may want to take a look and consider merging it into your patch. |
| Comment by James A Simmons [ 16/Oct/13 ] |
|
Peng I also only test with the upstream kernels. Thanks for pointing out the patch. Bob got part of the fix but I see one other thing missing. Bob your clue points also to the ctx->pos not being updated when the pos is set to |
| Comment by James A Simmons [ 16/Oct/13 ] |
|
Patch has been updated. |
| Comment by Bob Glossman (Inactive) [ 16/Oct/13 ] |
|
Today's refresh of #7747 seems to do the job. No errors on dir access seen. On a different topic, how close are working server patches? With the ldiskfs patch set in |
| Comment by James A Simmons [ 16/Oct/13 ] |
|
Currently I have pushed what has been working for me to gerrit. I also have patches for the lod, mdd, mdt, and osp layers as well, but currently there are show stopper bugs that need to be addressed. I'd say one to two weeks of work to sort it out and get it going. |
| Comment by Bob Glossman (Inactive) [ 16/Oct/13 ] |
|
James, thanks for the update. I have been paying close attention to the new patches as they show up in |
| Comment by Bob Glossman (Inactive) [ 16/Oct/13 ] |
|
Peng, do you plan to push a version of https://github.com/bergwolf/linux/commit/28fde12d26d7d4b54cb4689dd6117c2ed791f985 to gerrit? Seems like we will need it in master. |
| Comment by Peng Tao [ 21/Oct/13 ] |
|
James, I took a look at v3.11 upstream kernel, it already requires iterate replacing readdir in struct file_operations, by kernel commit 2233f31aade393641f0eaed43a71110e629bb900 commit 2233f31aade393641f0eaed43a71110e629bb900 [readdir] ->readdir() is gone everything's converted to ->iterate() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
[mpfssvr@linux-lustre]$git tag --contains 2233f31aade393641f0eaed43a71110e629bb900 So I think we need a similar patch like https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0b09d381bb2b4ccab15711cf98858a7146b24749 rather than just coping with struct dir_context. In fact, struct dir_context is introduced in the same kernel commit as .iterate. So whenever dir_context is there, .iterate exists. Please check your local tree to see if it somehow missed the commit. |
| Comment by Peng Tao [ 22/Oct/13 ] |
|
Sorry, I misread 7747. James' patch looks good. I will submit https://github.com/bergwolf/linux/commit/28fde12d26d7d4b54cb4689dd6117c2ed791f985 once it is merged in the upstream kernel. |
| Comment by James A Simmons [ 23/Oct/13 ] |
|
Patches needed to support 3.11 kernel are: http://review.whamcloud.com/#/c/7741 Now that we have support for the server side, my testing for ldiskfs also shows we need to update osd_handler.c due to the readdir -> iterate changes. We have lustre-2.5.50/lustre/osd-ldiskfs/osd_handler.c: In function 'osd_ldiskfs_it_fill': to be addressed as well. |
| Comment by James A Simmons [ 01/Nov/13 ] |
|
With these patches plus the patches listed in [root@spoon46 tests]# df |
| Comment by Andreas Dilger [ 08/Nov/13 ] |
|
Note that Yang Sheng has ldiskfs patches for 3.11 in http://review.whamcloud.com/7263 |
| Comment by James A Simmons [ 18/Nov/13 ] |
|
Excellent news. I have built a real file system this time using a 3.11.1 kernel on all the servers. Now to mount it on our cray test bed and run a bunch of jobs on it. |
| Comment by James A Simmons [ 06/Dec/13 ] |
|
Patches needed to support 3.11 kernel are: http://review.whamcloud.com/#/c/7741 These patches are ready for review and merger. |
| Comment by James A Simmons [ 18/Dec/13 ] |
|
http://review.whamcloud.com/#/c/7746 is ready for review and possible merger. |
| Comment by James A Simmons [ 10/Jan/14 ] |
|
Finally got patch 7746 to pass maloo. Patch http://review.whamcloud.com/#/c/7746 is ready for inspection and possible landing. |
| Comment by James A Simmons [ 13/Feb/14 ] |
|
Two patches left to inspect and possibly merge. http://review.whamcloud.com/#/c/7746 |
| Comment by James A Simmons [ 02/Mar/14 ] |
|
All patches have landed. This ticket can be closed. |
| Comment by Peter Jones [ 02/Mar/14 ] |
|
Thanks James |