[LU-10300] Can the Lustre 2.10.x clients support 64K kernel page? Created: 30/Nov/17  Updated: 27/Oct/22  Resolved: 27/Oct/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.0
Fix Version/s: None

Type: Question/Request Priority: Major
Reporter: ZhangWei Assignee: James A Simmons
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment: Red Hat

Issue Links:
Related
is related to LU-14970 sanity-flr test_50a: FAIL: Mirrored f... Open
is related to LU-14346 sanity-pcc test 7a hangs in multiop/d... In Progress
is related to LU-10157 LNET_MAX_IOV hard coded to 256 Resolved
is related to LU-15364 Kernel oops when stripe on Arm64 Serv... Resolved
is related to LU-7650 ko2iblnd map_on_demand can't negotita... Resolved
is related to LU-11597 sanityn test 16a failed with direct I/O Resolved
is related to LU-11785 conf-sanity test_98 fails with 'Buffe... Resolved
is related to LU-11788 sanity test 104a fails with ‘lfs df f... Resolved
is related to LU-6387 Add Power8 support to Lustre Resolved
is related to LU-11200 Centos 8 arm64 server support Resolved
is related to LU-15293 Add lbuild support for latest Arm64 C... Resolved
is related to LU-11671 sanity test 45: FAIL: write wasn't ca... Open
is related to LU-12255 sanity-dom test 42e fails with ‘test... Open
is related to LU-4398 mdt_object_open_lock() may not flush ... Resolved
is related to LU-10073 lnet-selftest test_smoke: lst Error f... Resolved
is related to LU-11729 ARM: sanity test_810: BAD WRITE CHECK... Resolved
is related to LU-11595 sanity-dom sanityn test 11: LBUG: (fi... Resolved
is related to LU-11596 sanity test_42d/test_42e: FAIL: faile... Resolved
is related to LU-12014 check correct size in ll_dom_finish_o... Resolved
is related to LU-15722 IO write gets stuck on some sanityn t... Resolved
is related to LU-11667 sanity test 317: FAIL: Expected Block... Resolved
is related to LU-11787 sanityn test 71a fails with ‘data is ... Resolved
is related to LU-12362 kernel warning 'do not call blocking ... Resolved
is related to LU-12419 ppc64le: "LNetError: RDMA has too man... Closed
is related to LU-15223 Improve partial page read/write In Progress

 Description   

We are testing Lustre on kernel 4.11.0, which uses 64K memory pages, and we found that configure has no option for this.
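
Before chasing configure options, it can help to confirm which page size the running kernel actually uses. The following is a minimal userspace sketch, not part of Lustre: the helper names `kernel_page_size`, `is_64k_page`, and `report_page_size` are hypothetical, and the only real API used is POSIX `sysconf(_SC_PAGESIZE)`.

```c
#include <stdio.h>
#include <unistd.h>

/* Return the running kernel's page size in bytes, or -1 on failure.
 * Hypothetical helper; wraps POSIX sysconf(_SC_PAGESIZE). */
long kernel_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Return 1 if the given page size is exactly 64 KiB, else 0. */
int is_64k_page(long page_size)
{
    return page_size == 64L * 1024L;
}

/* Print a one-line summary of the page size, e.g. before building
 * or mounting a client on the node. */
void report_page_size(void)
{
    long ps = kernel_page_size();

    if (ps < 0)
        perror("sysconf(_SC_PAGESIZE)");
    else
        printf("page size: %ld bytes (%s)\n", ps,
               is_64k_page(ps) ? "64K pages" : "not 64K pages");
}
```

Calling `report_page_size()` from a small `main()` on the client node shows whether the kernel was built with 64K pages.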



 Comments   
Comment by James A Simmons [ 30/Nov/17 ]

Actually, I have tested such a setup on Power8, which is 64K page based. The only issue that came up was with ko2iblnd when map_on_demand is enabled. Otherwise it works.

Comment by ZhangWei [ 30/Nov/17 ]

Thanks for your reply. With map_on_demand enabled, can Lustre only support a 4K kernel memory page?

Comment by Andreas Dilger [ 30/Nov/17 ]

We definitely used to run Lustre with 64KB PAGE_SIZE on IA64 clients, and in theory this would still work but we haven't tested it in a long time.  We never had much success with 64KB PAGE_SIZE on the server, since this caused problems with 4KB PAGE_SIZE clients doing writes to the 64KB PAGE_SIZE server.
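
To picture the size mismatch between a 4KB client page and a 64KB server page, the arithmetic can be sketched as below. This only illustrates the page-size ratio and alignment; it is an assumption-laden example and not a diagnosis of the actual failures. The macro and function names are hypothetical and do not come from the Lustre source.

```c
#include <stddef.h>

#define CLIENT_PAGE_SIZE 4096UL   /* assumed 4KB client PAGE_SIZE */
#define SERVER_PAGE_SIZE 65536UL  /* assumed 64KB server PAGE_SIZE */

/* Number of client pages that fit in one server page. */
unsigned long client_pages_per_server_page(void)
{
    return SERVER_PAGE_SIZE / CLIENT_PAGE_SIZE;
}

/* Return 1 if the byte range [offset, offset + len) covers only whole
 * server pages, 0 if it touches any server page partially. */
int write_is_server_page_aligned(unsigned long long offset,
                                 unsigned long long len)
{
    return len > 0 &&
           offset % SERVER_PAGE_SIZE == 0 &&
           len % SERVER_PAGE_SIZE == 0;
}
```

Under these assumptions, a single 4KB client page covers one sixteenth of a server page, so a write of one client page always touches a server page only partially.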

Comment by ZhangWei [ 30/Nov/17 ]

Yes, I tested a 64KB PAGE_SIZE server with both a 4KB PAGE_SIZE client and a 64KB PAGE_SIZE client, and there were errors on both clients. Thanks for the help!

Comment by Andreas Dilger [ 20/Dec/17 ]

It probably makes sense to print a build warning in the server code when PAGE_SIZE isn't 4096, just so people are aware that this isn't being tested. If someone starts testing this in the future, the warning can be removed.
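
A build-time warning of the kind suggested above might look like the following sketch. It is an illustration only: `DEMO_PAGE_SIZE` is a hypothetical stand-in for the kernel's `PAGE_SIZE` so the example compiles in userspace, the warning text is invented, and this is not the patch that was (or was not) applied to Lustre. `#warning` is a widely supported preprocessor directive in GCC and Clang.

```c
/* Hypothetical stand-in for the kernel's PAGE_SIZE. In a real server
 * build the value would come from the kernel headers instead. */
#ifndef DEMO_PAGE_SIZE
#define DEMO_PAGE_SIZE 4096
#endif

#if DEMO_PAGE_SIZE != 4096
/* Emitted once per compile so builders know this setup is untested. */
#warning "server code is not regularly tested with PAGE_SIZE != 4096"
#define DEMO_PAGE_SIZE_UNTESTED 1
#else
#define DEMO_PAGE_SIZE_UNTESTED 0
#endif

/* Return 1 if this build used a page size other than 4096. */
int demo_page_size_untested(void)
{
    return DEMO_PAGE_SIZE_UNTESTED;
}
```

Compiling with `-DDEMO_PAGE_SIZE=65536` triggers the warning; with the default of 4096 the build stays silent. Once 64KB server testing is routine, the `#if` block can simply be deleted.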

Comment by James A Simmons [ 03/Aug/18 ]

Once the RHEL ARM/Power8 server support work is complete we can test 64K pages on the server side.

Comment by James A Simmons [ 21/Aug/18 ]

So I have managed to get Lustre ZFS servers running on Power8 nodes. I got it to mount, and then it just locked up on any attempt to use the file system. Well, it's a start.

Comment by James A Simmons [ 06/Sep/18 ]

This time I tested ZFS Lustre servers using the Ethernet interface and it worked. Currently there is a netlink bug in the RHEL7 alt kernel I'm using that shows up when using ko2iblnd for some reason. There appears to be a fix, but it requires rebuilding the RHEL7 alt kernel. Now to test ldiskfs.

Comment by Andreas Dilger [ 27/Nov/18 ]

James, I'd suggest to move your previous comment into a separate ticket related to 64KB PAGE_SIZE on the server, and leave this one for tracking 64KB PAGE_SIZE on the client. I suspect most people care about client-side support more than server-side, and this will help understand which issues are important to fix for the two different cases. In any case, it doesn't make sense to have multiple independent issues being worked on in the same Jira ticket.

Comment by Gerrit Updater [ 04/Jan/22 ]

"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/45962
Subject: LU-10300 test: does arm testing work
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2036c3298804defa44e0457e65893e91d1da3985

Comment by Andreas Dilger [ 27/Oct/22 ]

The aarch64 clients are working with any recent Lustre release.
