[LU-1346] cleanup libcfs Created: 26/Apr/12 Updated: 30/Sep/13 Resolved: 16/Sep/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0, Lustre 2.5.0 |
| Fix Version/s: | Lustre 2.4.0, Lustre 2.5.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Peng Tao | Assignee: | Keith Mannthey (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | emc, patch | ||
| Issue Links: |
|
||||||||
| Epic: | client | ||||||||
| Rank (Obsolete): | 7767 | ||||||||
| Description |
|
Migrate libcfs to emulate Linux kernel APIs, so that when submitting Linux client to upstream kernel, we don't need the abstraction layer which will be rejected by kernel maintainers. The majority of libcfs are cfs_* or ll_* wrappers, and can be cleaned up with scripts doing like "s/cfs_spin_lock/spin_lock". For other functions that cannot be cleaned up this way, we need to look at them in a case-by-case basis. |
| Comments |
| Comment by Xuezhao Liu [ 17/May/12 ] |
|
Per the discussion with Andreas etc., we can use a script to do libcfs cleanup at compiling time first. We can implement the script incrementally, when the script is good enough we can apply it to lustre tree to change code directly. The code change will be huge and will break most in-flight patches and break different branch rebase, the script can help to resolve the confliction as well. We ever considered implementing the script based on coccinelle, it is potentially a good tool to resolve collateral evolution issues. But it cannot well process some "#ifdef .... #else ... #endif" cases, or some complicate macro codes. So we implemented a sed script to do libcfs cleanup, together with some direct code changes. 3 patches submitted for review: |
| Comment by Xuezhao Liu [ 09/Dec/12 ] |
|
http://review.whamcloud.com/4775 cleanup libcfs primitive (linux-prim.h) The above patches are only script plus some other code changes that not-easy to be handled by script. I submitted only for review, we can merge them and run the script to generate the final patch at appropriate time. There are some other cleanup patches have not been submitted for example LWT code and some other obsolete macros cleanup. |
| Comment by John Hammond [ 15/Mar/13 ] |
|
http://review.whamcloud.com/#change,2829 (cleanup libcfs lock and bit operations) has landed. |
| Comment by James A Simmons [ 01/Jul/13 ] |
|
I just tried out the latest master to discover that it no longer builds on SLES11 SP1. The api for mem_cache_create has a extra argument in the base line 2.6.32 kernel which was removed later. |
| Comment by James A Simmons [ 01/Jul/13 ] |
|
Sorry it was another problem. Your patch doesn't impact SLES11 SP1. |
| Comment by Peng Tao [ 12/Jul/13 ] |
|
I split http://review.whamcloud.com/#/c/6954/ The procfs primitive cleanup is dropped because it will be soon changed with the procfs change in 3.10 kernel support. The last one patch (6959) is partial and converts only atomic primitives in libcfs directory. I'll update incremental patches to clean other modules. |
| Comment by Keith Mannthey (Inactive) [ 13/Jul/13 ] |
|
There seems to be some build problem related to this patch set. A rebuild failed as well so it likely has to do with the base change itself somehow. |
| Comment by Peng Tao [ 16/Jul/13 ] |
|
I don't have SLES11 environment but I tried to build the patchset against SLES11SP2 kernel on my CentOS box, and no error was found. I couldn't build rpms though because they require different spec file rules. And the error seems to be from rpmbuild. Could you please help to retrieve below file? The error message says: It seems that find-requires scanned some script and found undefined variable $RPM_BUILD_ROOT. But the patchset doesn't change any scripts except for libcfscleanup.sed which clearly doesn't reference $RPM_BUILD_ROOT. |
| Comment by James A Simmons [ 16/Jul/13 ] |
|
Can you try patch http://review.whamcloud.com/#/c/5491. That might fix it up for you. |
| Comment by Keith Mannthey (Inactive) [ 16/Jul/13 ] |
|
I am working to find a way to get the malformed file so we can better see what went wrong. We have a separate internal ticket tracking this build issue. The present thought is the first patch in this series is wrecking havoc is this odd way. I will see about a debug rebase ontop of the SuSE build change. |
| Comment by Peng Tao [ 17/Jul/13 ] |
|
Thanks James. It seems that http://review.whamcloud.com/#/c/5491 is related. I'll update to include it and see if it fixes the build issue. |
| Comment by Peng Tao [ 19/Jul/13 ] |
|
Turned out to be a bug in the base patch. The error message was misleading. I've fixed it and now build succeeded. |
| Comment by Keith Mannthey (Inactive) [ 19/Jul/13 ] |
|
I am glad this got sorted it. I agree the error message was very misleading. |
| Comment by Peng Tao [ 22/Jul/13 ] |
|
Following up the cfs_atomic primitive change, below seven patches convert the rest pieces of code incrementally. They can each be merged independently. |
| Comment by Keith Mannthey (Inactive) [ 23/Jul/13 ] |
|
These 7 patches are each dependant on about 10 other patches. We will need to land these changes in order and a lowish level patch needing a rebase is going to cause the whole stack to be rebuilt and tested at least once more. http://review.whamcloud.com/4779 is low in the stack and out of date. The lowest patch is http://review.whamcloud.com/4776 and is being testing. 4777 and 4778 look to to land once 4776 finishes testing. If there is any way to break the dependencies up the speed at which this code will land could improve. |
| Comment by Peng Tao [ 23/Jul/13 ] |
|
shall I rebase 4779 and all the patches that rely on 4770 now, or wait until 4776/4777/4778 on are landed? They are dependent mostly because they all modify the libcfs-cleanup.sed file. |
| Comment by James A Simmons [ 23/Jul/13 ] |
|
Can you hold off rebasing 4779. I'm working on getting Lustre working with a 3.10.2 kernel and that version of the kernel impacts process name space handling. First I need to handle the proc changes in 3.10+ |
| Comment by Peng Tao [ 23/Jul/13 ] |
|
James, do you mean you are working on procfs change in 3.10? It is tracked on |
| Comment by Cory Spitz [ 28/Aug/13 ] |
|
James posted http://review.whamcloud.com/#/c/7454 for gnilnd |
| Comment by Jodi Levi (Inactive) [ 16/Sep/13 ] |
|
Follow on work for this ticket is being tracked in |
| Comment by Alexey Lyashkov [ 27/Sep/13 ] |
|
please revert that defines needed because Linux capability now have size more then lustre wire protocol so we don't able to pack all capability into wire. That is main reason to introduce these macroses, please look to commit and ticket where it's adds to lustre tree. I see you are drop compatibility with different OS'es and kernels, but that is very very bad idea. |
| Comment by Alexey Lyashkov [ 27/Sep/13 ] |
|
and PLEASE REMEMBER - any on wire constants MUST be platform independent to avoid situation when it's different on different sides of lustre cluster. as additional objection about that work - 'fls' cleanup landed some time ago. that is broke lustre to build on system already have these functions but with different arguments as you expected. |
| Comment by James A Simmons [ 27/Sep/13 ] |
|
The patch just did a bunch of renaming of variables. cfs_cap_t is still a __u32. Can you point to exactly the break is where you believe it to be. Please push fixes for the fls and gettimeofday issues. Patches always welcomed |
| Comment by Alexey Lyashkov [ 27/Sep/13 ] |
|
OS defined flags should be never applied to the custom type, so CAP_* flags may be applied to the kernel_cap type, other is incorrect by definition, but kernel_cap_t can't be put in _u32 in wire. as about breakage. In file included from /Users/shadow/work/lustre/work/WC-review/mac-os/libcfs/include/libcfs/posix/libcfs.h:108,
from /Users/shadow/work/lustre/work/WC-review/mac-os/libcfs/include/libcfs/libcfs.h:45,
from nidstrings.c:43:
/Users/shadow/work/lustre/work/WC-review/mac-os/libcfs/include/libcfs/user-bitops.h:80: error: conflicting types for ‘fls’
/usr/include/strings.h:90: error: previous declaration of ‘fls’ was here
so part for cfs_fls need to be reverted. one more additional bug gcc -DHAVE_CONFIG_H -I. -I../.. -D__arch_lib__ -D_LARGEFILE64_SOURCE=1 -include /Users/shadow/work/lustre/work/WC-review/mac-os/config.h -I/Users/shadow/work/lustre/work/WC-review/mac-os/libcfs/include -I/Users/shadow/work/lustre/work/WC-review/mac-os/lnet/include -I/Users/shadow/work/lustre/work/WC-review/mac-os/lustre/include -g -Wall -fPIC -D_GNU_SOURCE -g -O2 -Werror -MT libcfs_a-posix-debug.o -MD -MP -MF .deps/libcfs_a-posix-debug.Tpo -c -o libcfs_a-posix-debug.o `test -f 'posix/posix-debug.c' || echo './'`posix/posix-debug.c cc1: warnings being treated as errors posix/posix-debug.c: In function ‘libcfs_debug_vmsg2’: posix/posix-debug.c:210: warning: implicit declaration of function ‘cfs_gettimeofday’ |
| Comment by James A Simmons [ 27/Sep/13 ] |
|
Okay we can revert the last patch. The caps patch had less of a impact on the code than the other patches for |
| Comment by Alexey Lyashkov [ 28/Sep/13 ] |
|
Thanks! I will send maces/freebsd support patches with fuse client son (i hope in next two weeks). |
| Comment by Alexey Lyashkov [ 30/Sep/13 ] |
|
I created patch with some cleanups need to compile lustre with clang. most of them is c89 style fixes with correct includes. |