[LU-2288] missing debuginfo for lustre modules
Created: 06/Nov/12  Updated: 03/Oct/14  Resolved: 08/Jan/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug
Priority: Minor
Reporter: John Hammond
Assignee: Keith Mannthey (Inactive)
Resolution: Fixed
Votes: 0
Labels: build, patch
Environment: RHEL 6.3 and others.

Severity: 3
Rank (Obsolete): 5478

Description

Lustre RPM builds don't produce packages containing debuginfo for the Lustre kernel modules. This is because kernel modules are not marked as executable and so are ignored by find-debuginfo.sh. Based on the approach taken in the RHEL kernel.spec, I added the following to the end of the %install section:

# mark modules executable so that strip-to-file can strip them
find $RPM_BUILD_ROOT/lib/modules/%{kversion}/updates -name "*.ko" -type f | \
  xargs --no-run-if-empty chmod u+x

This done, debuginfo files for the kernel modules appear in lustre-debuginfo. I would have liked it if a lustre-modules-debuginfo package was automagically created by rpmbuild. But alas no. So can we live with a monolithic debuginfo package? After much introspection I have decided that I can.
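As a rough sanity check before building, the selection rule described above can be sketched in shell. This assumes, as stated above, that find-debuginfo.sh only considers files with an execute bit set; the function name list_skipped_modules is hypothetical and not part of lustre.spec or rpm.

```shell
# List kernel modules under a directory that carry no execute bit at all.
# Under the assumption above, these are the modules that would be skipped
# when debuginfo is extracted.
list_skipped_modules() {
    find "$1" -name '*.ko' -type f ! -perm -100 ! -perm -010 ! -perm -001
}
```

Run against $RPM_BUILD_ROOT/lib/modules/%{kversion}/updates at the end of %install, an empty result would suggest that the chmod step reached every module.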

I wish it were this simple, but there is an obnoxious bug in debugedit (see https://bugzilla.redhat.com/show_bug.cgi?id=304121) which may cause rpmbuild to fail if configure is given noncanonical paths to the o2ib tree (or others). In particular,

./configure --with-linux=/usr/src/linux-2.6.32-279.11.1.el6.lm.x86_64/ --with-o2ib=/opt/ofed/src/openib
make rpms

will succeed, while

./configure --with-linux=/usr/src/linux-2.6.32-279.11.1.el6.lm.x86_64/ --with-o2ib=/opt/ofed/src/openib/
make rpms

will fail with something like:

...
extracting debug info from /root/rpmbuild/BUILDROOT/lustre-2.3.54-2.6.32_279.11.1.el6.lm.x86_64_g7f57b6a.x86_64/lib/modules/2.6.32-279.11.1.el6.lm.x86_64/updates/kernel/net/lustre/ko2iblnd.ko
/usr/lib/rpm/debugedit: canonicalization unexpectedly shrank by one character
error: Bad exit status from /var/tmp/rpm-tmp.pTyzxn (%install)

This could be addressed, say, by canonicalizing the paths in lnet/autoconf/lustre-lnet.m4. I assume that the same would apply to qsnet, portals, rapid array, model railroad, and all of the other LNDs whose names I forget.
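For illustration only, a purely textual canonicalization of the kind suggested above might look like the following. The function name canonicalize_path is hypothetical, and this is not the change that was made to lustre-lnet.m4. It only collapses repeated slashes and drops a trailing slash; it does not resolve symlinks or ".." components.

```shell
# Collapse runs of slashes and strip one trailing slash (but keep a lone "/").
canonicalize_path() {
    printf '%s\n' "$1" | sed -e 's://*:/:g' -e 's:\(.\)/$:\1:'
}
```

With this, canonicalize_path /opt/ofed/src/openib/ yields the same string as the form of the --with-o2ib argument that builds successfully above.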

Similarly lustre-ldiskfs-debuginfo is created but empty.

I admit that this adds some complexity/fragility to lustre.spec but the benefits are easily seen by anyone who runs "perf top" on a busy lustre client.

I welcome suggestions as I continue to poke at this.



Comments
Comment by Oleg Drokin [ 06/Nov/12 ]

Please note that in the current scheme of things the Lustre modules already contain the necessary debug info when they sit in /lib/modules, so an extra debuginfo RPM is not needed.
I routinely use the module .ko files with gdb for my debugging needs.

Comment by Brian Murrell (Inactive) [ 06/Nov/12 ]

The issue of the trailing / needing to be present on paths should be resolvable by configure itself, yes? i.e. if it's not there, configure adds it when assigning the path argument to the variable?

Comment by John Hammond [ 06/Nov/12 ]

Hi Brian,

It's not that trailing slashes are needed. It's that if they are used on arguments to configure, then some paths end up with double slashes and are noncanonical, which angers debugedit.

Comment by John Hammond [ 06/Nov/12 ]

Hi Oleg,

Thanks for pointing this out. I guess the issue then is that perf does not expect to find a kernel module's debuginfo in the module itself. Rather it only checks for the symlinks of the form /usr/lib/debug/.build-id/bc/73e36cd1bc59cbf4b73768de0af7b28c47b678.debug.

To see this, after installing the uncooked RPMs I did:

build_id_dir=/usr/lib/debug/.build-id

# Link each module under its build ID so that perf can find its symbols.
for file in $(rpm -ql lustre-modules lustre-ldiskfs); do
    # Skip anything for which debugedit cannot report a build ID.
    if build_id=$(/usr/lib/rpm/debugedit -i "$file" 2>/dev/null); then
        link=$build_id_dir/${build_id:0:2}/${build_id:2}
        mkdir -p "${link%/*}"
        ln -s "../../../../../$file" "$link"
        ln -s "../../../../../$file" "$link.debug"
    fi
done

and perf resolved the lustre symbols just fine.

What about adding something morally equivalent to the end of %install in the Lustre and ldiskfs specs?
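The symlink naming described above can be sketched as a small helper. The function name build_id_debug_link is hypothetical; the layout (first two hex digits as a directory, the remainder plus ".debug" as the link name) is taken from the example path in this comment.

```shell
# Print the .build-id symlink path that corresponds to a build ID.
build_id_debug_link() {
    bid_prefix=$(printf '%s' "$1" | cut -c1-2)
    bid_rest=$(printf '%s' "$1" | cut -c3-)
    printf '%s\n' "/usr/lib/debug/.build-id/$bid_prefix/$bid_rest.debug"
}
```

Given the build ID printed by /usr/lib/rpm/debugedit -i for a module, this gives the path at which perf appears to look.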

Comment by Oleg Drokin [ 06/Nov/12 ]

I think it's a bug in perf then, and it needs to try and fetch symbols from both places and where it is successful in doing so?
the /usr/lib/debug is probably a redhat-ism anyway (I never heard of it before) and will not work on other distros.

Comment by Brian Murrell (Inactive) [ 07/Nov/12 ]

> the /usr/lib/debug is probably a redhat-ism anyway (I never heard of it before) and will not work on other distros

It actually exists on Ubuntu also so maybe it's just a bleeding edge feature rather than a single-distro-ism and other distros will "catch up" eventually.

It's a nice feature actually since it allows you to only incur the debug-bloat cost for software you actually need to debug rather than carrying the debug-bloat for everything, "just in case".

Comment by John Hammond [ 07/Nov/12 ]

> I think it's a bug in perf then, and it needs to try and fetch symbols from both places and where it is successful in doing so?

OK, I'll file a bug upstream: "Hello perf maintainers! Your utility works with everything except the lustre modules."

In the meantime, please see http://review.whamcloud.com/4491 for a patch.

Comment by John Hammond [ 12/Nov/12 ]

I have updated the patch to use the first, more RHEL-ish, method of marking the modules executable during %install. This has the added benefit of not being broken on RHEL 5.

Note that gdb automagically handles modules with split debuginfo:

# gdb /lib/modules/2.6.32-279.11.1.el6.lm.x86_64/updates/kernel/fs/lustre/lustre.ko
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /lib/modules/2.6.32-279.11.1.el6.lm.x86_64/updates/kernel/fs/lustre/lustre.ko...Reading symbols from /usr/lib/debug/lib/modules/2.6.32-279.11.1.el6.lm.x86_64/updates/kernel/fs/lustre/lustre.ko.debug...done.
done.
(gdb) disassemble /m ll_file_open
Dump of assembler code for function ll_file_open:
494  {
   0x000000000001cf50 <+0>:     push   %rbp
   0x000000000001cf51 <+1>:     mov    %rsp,%rbp
   0x000000000001cf54 <+4>:     push   %r15
   0x000000000001cf56 <+6>:     push   %r14
   0x000000000001cf58 <+8>:     push   %r13
   0x000000000001cf5a <+10>:    push   %r12
   0x000000000001cf5c <+12>:    push   %rbx
   0x000000000001cf5d <+13>:    sub    $0xa8,%rsp
   0x000000000001cf64 <+20>:    callq  0x1cf69 <ll_file_open+25>
   0x000000000001cf75 <+37>:    mov    %rdi,-0x80(%rbp)
   0x000000000001cf7f <+47>:    mov    %rsi,%r15

495                           struct ll_inode_info *lli = ll_i2info(inode);
496                                   struct lookup_intent *it, oit = { .it_op = IT_OPEN,
497                                                                             .it_flags = file->f_flags };
...
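Going by the "Reading symbols from" line in the gdb session above, the split debuginfo file sits under /usr/lib/debug at the module's own path with ".debug" appended. A minimal sketch of that mapping, with the hypothetical helper name split_debug_path:

```shell
# Map an installed module path to the location of its split debuginfo file,
# following the layout seen in the gdb output above.
split_debug_path() {
    printf '%s\n' "/usr/lib/debug$1.debug"
}
```

A check such as [ -e "$(split_debug_path "$ko")" ] could then confirm that the debuginfo for a given module is installed.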

Comment by Peter Jones [ 08/Jan/13 ]

Landed for 2.4
