Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-17793

warning: objtool: __cfs_fail_check_set() falls through to next function __cfs_fail_timeout_set()

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0
    • Lustre 2.16.0
    • None
    • 3
    • 9223372036854775807

    Description

      While building master branch on Ubuntu 24.04 with kernel 6.8.0-31-generic, there are lots of objtool warnings as follows:

        CC [M]  /root/lustre-release/libcfs/libcfs/fail.o
      /root/lustre-release/libcfs/libcfs/fail.o: warning: objtool: __cfs_fail_check_set() falls through to next function __cfs_fail_timeout_set()
        CC [M]  /root/lustre-release/libcfs/libcfs/module.o
      /root/lustre-release/libcfs/libcfs/module.o: warning: objtool: libcfs_force_lbug() falls through to next function debugfs_doint()
        CC [M]  /root/lustre-release/libcfs/libcfs/tracefile.o
        CC [M]  /root/lustre-release/libcfs/libcfs/libcfs_string.o
        CC [M]  /root/lustre-release/libcfs/libcfs/hash.o
      /root/lustre-release/libcfs/libcfs/hash.o: warning: objtool: cfs_hash_bd_del_locked() falls through to next function cfs_hash_cond_del_locked()
      /root/lustre-release/libcfs/libcfs/hash.o: warning: objtool: cfs_hash_bd_from_key() falls through to next function cfs_hash_buckets_free()
      /root/lustre-release/libcfs/libcfs/hash.o: warning: objtool: cfs_hash_bd_get() falls through to next function cfs_hash_multi_bd_findadd_locked.constprop.0()
      /root/lustre-release/libcfs/libcfs/hash.o: warning: objtool: cfs_hash_create() falls through to next function cfs_hash_bd_move_locked()
      /root/lustre-release/libcfs/libcfs/hash.o: warning: objtool: cfs_hash_bd_move_locked() falls through to next function cfs_hash_rehash_worker()
      /root/lustre-release/libcfs/libcfs/hash.o: warning: objtool: cfs_hash_dual_bd_get() falls through to next function cfs_hash_rehash_key()
      /root/lustre-release/libcfs/libcfs/hash.o: warning: objtool: cfs_hash_rehash_cancel() falls through to next function cfs_hash_putref()
      /root/lustre-release/libcfs/libcfs/hash.o: warning: objtool: cfs_hash_for_each_enter() falls through to next function cfs_hash_rehash()
      

      A full list is in the attached make_debs.log.
      According to https://github.com/torvalds/linux/blob/master/tools/objtool/Documentation/objtool.txt :

      8. file.o: warning: objtool: funcA() falls through to next function funcB()
      
         This means that funcA() doesn't end with a return instruction or an
         unconditional jump, and that objtool has determined that the function
         can fall through into the next function.  There could be different
         reasons for this:
      
         1) funcA()'s last instruction is a call to a "noreturn" function like
            panic().  In this case the noreturn function needs to be added to
            objtool's hard-coded global_noreturns array.  Feel free to bug the
            objtool maintainer, or you can submit a patch.
      
         2) funcA() uses the unreachable() annotation in a section of code
            that is actually reachable.
      
         3) If funcA() calls an inline function, the object code for funcA()
            might be corrupt due to a gcc bug.  For more details, see:
            https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
      

      Attachments

        Issue Links

          Activity

            [LU-17793] warning: objtool: __cfs_fail_check_set() falls through to next function __cfs_fail_timeout_set()
            pjones Peter Jones added a comment -

            Merged for 2.16

            pjones Peter Jones added a comment - Merged for 2.16

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/55505/
            Subject: LU-17793 libcfs: fix objtool warning in lbug_with_loc()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 9f44a48b365924ee6f576c95bf0649160b6be58a

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/55505/ Subject: LU-17793 libcfs: fix objtool warning in lbug_with_loc() Project: fs/lustre-release Branch: master Current Patch Set: Commit: 9f44a48b365924ee6f576c95bf0649160b6be58a

            "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/55505
            Subject: LU-17793 libcfs: fix objtool warning in lbug_with_loc()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d54fa6b7853e943d8f35ab1136c9c524ef1331e0

            gerrit Gerrit Updater added a comment - "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/55505 Subject: LU-17793 libcfs: fix objtool warning in lbug_with_loc() Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: d54fa6b7853e943d8f35ab1136c9c524ef1331e0

            Quoting the relevant section of that document:

            If you really need objtool to ignore something, and are 100% sure that it won't affect kernel stack traces, you can tell objtool to ignore it:

            • To skip validation of a function, use the STACK_FRAME_NON_STANDARD macro.
            • To skip validation of a file, add "OBJECT_FILES_NON_STANDARD_filename.o := y" to the Makefile.
            • To skip validation of a directory, add "OBJECT_FILES_NON_STANDARD := y" to the Makefile.

            so it seems possible that we could use the STACK_FRAME_NON_STANDARD(lbug_with_loc) so it would be skipped by all of the callers? I don't want to annotate every function that calls LASSERT().

            Failing that, we could potentially fix the code in lbug_with_loc() so that it calls panic() after the while(1) loop, so that all of the call paths in that function "terminate" and hopefully this is enough to convince the static checkers to ignore this:

                    :
                    set_current_state(TASK_UNINTERRUPTIBLE);               
                    while (1)                             
                            schedule();
                    /* not reached */
                    panic();
            } 
            

            We might need to remove the __noreturn annotation from lbug_with_loc() in that case, since the panic() call should already be properly annotated and we would call it on all code paths in that case, even if the one at the end of the function is never actually reached.

            On a related note, I've been thinking it would be useful to add e.g. msleep(cfs_time_seconds(6)) before the panic() calls so that there is some chance that the dump_stack() and other error messages can be saved to the syslog rather than crashing the node instantaneously on error. I don't think this makes a huge difference in the grand scheme of failover/recovery time, but would make debugging easier. If someone was worried about the delay duing failover, this could be a tunable, either overloading libcfs_panic_on_lbug to be the number of seconds to delay (or msec, if a minimum 1 second delay is considered too long), or adding yet another tunable like libcfs_panic_delay_sec to control this.

            adilger Andreas Dilger added a comment - Quoting the relevant section of that document: If you really need objtool to ignore something, and are 100% sure that it won't affect kernel stack traces, you can tell objtool to ignore it: To skip validation of a function, use the STACK_FRAME_NON_STANDARD macro. To skip validation of a file, add " OBJECT_FILES_NON_STANDARD_filename.o := y " to the Makefile. To skip validation of a directory, add " OBJECT_FILES_NON_STANDARD := y " to the Makefile. so it seems possible that we could use the STACK_FRAME_NON_STANDARD(lbug_with_loc) so it would be skipped by all of the callers? I don't want to annotate every function that calls LASSERT() . Failing that, we could potentially fix the code in lbug_with_loc() so that it calls panic() after the while(1) loop, so that all of the call paths in that function "terminate" and hopefully this is enough to convince the static checkers to ignore this: : set_current_state(TASK_UNINTERRUPTIBLE); while (1) schedule(); /* not reached */ panic(); } We might need to remove the __noreturn annotation from lbug_with_loc() in that case, since the panic() call should already be properly annotated and we would call it on all code paths in that case, even if the one at the end of the function is never actually reached. On a related note, I've been thinking it would be useful to add e.g. msleep(cfs_time_seconds(6)) before the panic() calls so that there is some chance that the dump_stack() and other error messages can be saved to the syslog rather than crashing the node instantaneously on error. I don't think this makes a huge difference in the grand scheme of failover/recovery time, but would make debugging easier. If someone was worried about the delay duing failover, this could be a tunable, either overloading libcfs_panic_on_lbug to be the number of seconds to delay (or msec, if a minimum 1 second delay is considered too long), or adding yet another tunable like libcfs_panic_delay_sec to control this.

            This is a userspace tool, included in Linux kernel tree. It is run on each .o file at build time. As it provided by the kernel, this is hard to patch unfortunately.

            I just found there is a macro to skip some functions (instead of the whole file or a directory). To be tested, but that could be a bit better than the actual workaround.

            https://github.com/torvalds/linux/blob/master/tools/objtool/Documentation/objtool.txt#L443

            adegremont_nvda Aurelien Degremont added a comment - This is a userspace tool, included in Linux kernel tree. It is run on each .o file at build time. As it provided by the kernel, this is hard to patch unfortunately. I just found there is a macro to skip some functions (instead of the whole file or a directory). To be tested, but that could be a bit better than the actual workaround. https://github.com/torvalds/linux/blob/master/tools/objtool/Documentation/objtool.txt#L443

            I'm not familiar with objtool, is this a tool that runs as part of the build in userspace, or is it part of the kernel? If in userspace, could we supply our own (fixed) objtool as part of the Lustre sources that excludes this function as it did before?

            adilger Andreas Dilger added a comment - I'm not familiar with objtool, is this a tool that runs as part of the build in userspace, or is it part of the kernel? If in userspace, could we supply our own (fixed) objtool as part of the Lustre sources that excludes this function as it did before?
            yujian Jian Yu added a comment -

            Hi adilger,
            While building Lustre client on Ubuntu, we specify "--with-linux" with linux-headers path instead of kernel source path. So, it seems there is no way to patch tools/objtool/check.c in the kernel source.

            yujian Jian Yu added a comment - Hi adilger , While building Lustre client on Ubuntu, we specify "--with-linux" with linux-headers path instead of kernel source path. So, it seems there is no way to patch tools/objtool/check.c in the kernel source.

            Is there some way we can patch objtool during the Lustre module build process so that this warning no longer fails?

            adilger Andreas Dilger added a comment - Is there some way we can patch objtool during the Lustre module build process so that this warning no longer fails?
            yujian Jian Yu added a comment -

            The issue doesn't exist on kernel v6.4, but on v6.5. I narrowed down the commit to the following one:

            commit 34245659debd194cbd4148d2ee5176306bdf8899
            Author:     Josh Poimboeuf <jpoimboe@kernel.org>
            AuthorDate: Tue Apr 18 14:27:52 2023 -0700
            Commit:     Josh Poimboeuf <jpoimboe@kernel.org>
            CommitDate: Tue May 16 06:31:54 2023 -0700
            
                objtool: Remove superfluous global_noreturns entries
                
                lbug_with_loc() no longer exists, and resume_play_dead() is static
                (objtool only checks globals and weaks).
                
                Reviewed-by: Miroslav Benes <mbenes@suse.cz>
                Link: https://lore.kernel.org/r/2725d7f2ccc2361c6903de9ebaa2b5bb304f7ac2.1681853186.git.jpoimboe@kernel.org
                Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
            
            diff --git a/tools/objtool/check.c b/tools/objtool/check.c
            index 8dac1e35264e..8c2762de7ae3 100644
            --- a/tools/objtool/check.c
            +++ b/tools/objtool/check.c
            @@ -217,7 +217,6 @@ static bool __dead_end_function(struct objtool_file *file, struct symbol *func,
                            "kthread_complete_and_exit",
                            "kthread_exit",
                            "kunit_try_catch_throw",
            -               "lbug_with_loc",
                            "machine_real_restart",
                            "make_task_dead",
                            "mpt_halt_firmware",
            @@ -225,7 +224,6 @@ static bool __dead_end_function(struct objtool_file *file, struct symbol *func,
                            "panic",
                            "panic_smp_self_stop",
                            "rest_init",
            -               "resume_play_dead",
                            "rewind_stack_and_make_dead",
                            "sev_es_terminate",
                            "snp_abort",
            
            yujian Jian Yu added a comment - The issue doesn't exist on kernel v6.4, but on v6.5. I narrowed down the commit to the following one: commit 34245659debd194cbd4148d2ee5176306bdf8899 Author: Josh Poimboeuf <jpoimboe@kernel.org> AuthorDate: Tue Apr 18 14:27:52 2023 -0700 Commit: Josh Poimboeuf <jpoimboe@kernel.org> CommitDate: Tue May 16 06:31:54 2023 -0700 objtool: Remove superfluous global_noreturns entries lbug_with_loc() no longer exists, and resume_play_dead() is static (objtool only checks globals and weaks). Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https: //lore.kernel.org/r/2725d7f2ccc2361c6903de9ebaa2b5bb304f7ac2.1681853186.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 8dac1e35264e..8c2762de7ae3 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -217,7 +217,6 @@ static bool __dead_end_function(struct objtool_file *file, struct symbol *func, "kthread_complete_and_exit" , "kthread_exit" , "kunit_try_catch_throw" , - "lbug_with_loc" , "machine_real_restart" , "make_task_dead" , "mpt_halt_firmware" , @@ -225,7 +224,6 @@ static bool __dead_end_function(struct objtool_file *file, struct symbol *func, "panic" , "panic_smp_self_stop" , "rest_init" , - "resume_play_dead" , "rewind_stack_and_make_dead" , "sev_es_terminate" , "snp_abort" ,

            This seems a bit more complex, as objtool was actually run before, under the config option CONFIG_STACK_VALIDATION, for Ubuntu kernel 5.15 :/ So not sure why this problem appears now.

            But it seems objtool is confused by the __noreturn functions and does not detect that properly for Lustre...

            adegremont_nvda Aurelien Degremont added a comment - This seems a bit more complex, as objtool was actually run before, under the config option CONFIG_STACK_VALIDATION, for Ubuntu kernel 5.15 :/ So not sure why this problem appears now. But it seems objtool is confused by the __noreturn functions and does not detect that properly for Lustre...

            People

              yujian Jian Yu
              yujian Jian Yu
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: