[LU-1484] Test failure on test suite recovery-small, subtest test_57 Created: 05/Jun/12  Updated: 18/Apr/13  Resolved: 21/Feb/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0, Lustre 2.1.2, Lustre 2.1.3, Lustre 2.1.4, Lustre 1.8.8
Fix Version/s: Lustre 2.4.0, Lustre 2.1.5, Lustre 1.8.9

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: None
Environment:

Lustre Tag: v2_1_2_RC2
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/87/
Distro/Arch: RHEL5.8/x86_64
Network: TCP (1GigE)


Attachments: config.h, config.log
Severity: 3
Rank (Obsolete): 4529

 Description   

This issue was created by maloo for yujian <yujian@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/743bea58-af48-11e1-a585-52540035b04c.

The sub-test test_57 failed with the following error:

== recovery-small test 57: read procfs entries causes kernel crash =================================== 05:43:48 (1338900228)
fail_loc=0x80000B00
Stopping client client-28vm6.lab.whamcloud.com /mnt/lustre (opts

test failed to respond and timed out

Info required for matching: recovery-small 57



 Comments   
Comment by Jian Yu [ 05/Jun/12 ]

Console log on Client 4 (client-28vm6) showed that:

05:43:48:Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 05:43:48 (1338900228)
05:43:53:LustreError: 28843:0:(fail.c:126:__cfs_fail_timeout_set()) cfs_fail_timeout id b00 sleeping for 10000ms
05:44:01:LustreError: 28843:0:(fail.c:130:__cfs_fail_timeout_set()) cfs_fail_timeout id b00 awake
05:46:17:INFO: task lctl:28843 blocked for more than 120 seconds.
05:46:17:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
05:46:18:lctl          D 0000000000001000     0 28843  28631                     (NOTLB)
05:46:18: ffff8100517dfe38 0000000000000086 ffffffff800cfa4c ffff810037d56d40
05:46:18: 0000000000000282 0000000000000007 ffff810067000080 ffff8100668c47e0
05:46:18: 000005e137f5c0e1 000000000001b643 ffff810067000268 0000000000000001
05:46:18:Call Trace:
05:46:18: [<ffffffff800cfa4c>] zone_statistics+0x3e/0x6d
05:46:18: [<ffffffff8000f40b>] __alloc_pages+0x78/0x308
05:46:19: [<ffffffff8006468c>] __down_read+0x7a/0x92
05:46:19: [<ffffffff888d90e2>] :obdclass:lprocfs_fops_read+0x82/0x1e0
05:46:19: [<ffffffff8010ab77>] proc_reg_read+0x7e/0x99
05:46:19: [<ffffffff8000b721>] vfs_read+0xcb/0x171
05:46:19: [<ffffffff80011d15>] sys_read+0x45/0x6e
05:46:19: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
05:46:19:
05:46:19:INFO: task umount:28863 blocked for more than 120 seconds.
05:46:22:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
05:46:24:umount        D ffff810002536420     0 28863  28862                     (NOTLB)
05:46:24: ffff810057f81828 0000000000000082 0000000000000000 0000000100000002
05:46:24: 0000000000000000 0000000000000007 ffff8100668c47e0 ffffffff80319b60
05:46:24: 000005e137f5e08c 0000000000001fab ffff8100668c49c8 0000000000000000
05:46:24:Call Trace:
05:46:24: [<ffffffff80064cb5>] __reacquire_kernel_lock+0x2e/0x47
05:46:24: [<ffffffff80063171>] wait_for_completion+0x79/0xa2
05:46:24: [<ffffffff8008ee74>] default_wake_function+0x0/0xe
05:46:24: [<ffffffff8010dcdb>] remove_proc_entry+0xfb/0x1c7
05:46:25: [<ffffffff888d7603>] :obdclass:lprocfs_remove+0x103/0x130
05:46:25: [<ffffffff888d6a46>] :obdclass:lprocfs_free_stats+0x1e6/0x230
05:46:25: [<ffffffff888d7a1f>] :obdclass:lprocfs_obd_cleanup+0x6f/0x80
05:46:28: [<ffffffff88b7ca32>] :osc:osc_precleanup+0x292/0x370
05:46:28: [<ffffffff888ff13c>] :obdclass:lu_context_fini+0x1c/0x50
05:46:28: [<ffffffff888e303f>] :obdclass:class_cleanup+0xc6f/0xe30
05:46:28: [<ffffffff888e6d8c>] :obdclass:class_process_config+0x1e5c/0x3200
05:46:28: [<ffffffff888e97f7>] :obdclass:class_manual_cleanup+0xad7/0xe80
05:46:28: [<ffffffff8002ff6f>] __up_write+0x27/0xf2
05:46:30: [<ffffffff88bba26c>] :lov:lov_putref+0xb0c/0xb90
05:46:30: [<ffffffff88bc2b98>] :lov:lov_disconnect+0x308/0x3e0
05:46:30: [<ffffffff88c66d94>] :lustre:client_common_put_super+0x894/0xed0
05:46:30: [<ffffffff88c676e5>] :lustre:ll_put_super+0x195/0x310
05:46:30: [<ffffffff800f079e>] invalidate_inodes+0xce/0xe0
05:46:30: [<ffffffff800e77ab>] generic_shutdown_super+0x79/0xfb
05:46:30: [<ffffffff800e787b>] kill_anon_super+0x9/0x35
05:46:30: [<ffffffff800e792c>] deactivate_super+0x6a/0x82
05:46:31: [<ffffffff800f1e8b>] sys_umount+0x245/0x27b
05:46:31: [<ffffffff800ba767>] audit_syscall_entry+0x1a8/0x1d3
05:46:31: [<ffffffff8005d28d>] tracesys+0xd5/0xe0

For Lustre 2.1.1, we also hit recovery-small test 57 hanging, but that was a different failure on the client: LU-1097.

Comment by Jian Yu [ 06/Jun/12 ]

Another instance:
https://maloo.whamcloud.com/test_sets/18995f38-aafb-11e1-b191-52540035b04c

And here are the historical reports with status "TIMEOUT" on Maloo:
http://tinyurl.com/cym3do3

Comment by Peter Jones [ 23/Jul/12 ]

Bobijam

Could you please look into this one?

Thanks

Peter

Comment by Zhenyu Xu [ 24/Jul/12 ]

There's a deadlock here: (!HAVE_PROCFS_USERS && HAVE_PROCFS_DELETED)

proc reader                                    proc remover
proc_reg_read()                                LPROCFS_WRITE_ENTRY() // down_write _lprocfs_lock semaphore  -------> (1)
   pdeaux->pde_users++
   lprocfs_fops_read()
      LPROCFS_ENTRY_AND_CHECK() // down_read  _lprocfs_lock semaphore, wait here        ---------------------------> (2)
                                               remove_proc_entry()
                                                  if (pdeaux->pde_users > 0)
                                                      wait_for_completion()    ------------------------------------> (3)
...
  pde_users_dec() // pdeaux->pde_users--, complete()      ---------------------------------------------------------- (4)

The issue is that when the remover gets to (3), the proc reader is waiting at (2), while the proc remover cannot move on until the proc reader reaches (4); a deadlock ensues.
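
To make the ordering concrete, here is a minimal userspace sketch that reproduces the same lock ordering with pthreads standing in for the kernel primitives. All names (tree_lock for _lprocfs_lock, users for pde_users, and so on) are hypothetical illustrations, not Lustre code; the program deadlocks by design.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t tree_lock  = PTHREAD_RWLOCK_INITIALIZER; /* _lprocfs_lock */
static pthread_mutex_t  users_lock = PTHREAD_MUTEX_INITIALIZER;  /* pde_unload_lock */
static pthread_cond_t   users_gone = PTHREAD_COND_INITIALIZER;   /* pde_unload_completion */
static int users;                                                /* pde_users */

static void *reader(void *arg)
{
        /* proc_reg_read(): bump the user count before entering the module */
        pthread_mutex_lock(&users_lock);
        users++;
        pthread_mutex_unlock(&users_lock);

        sleep(2); /* stands in for the 10s cfs_fail_timeout sleep in test_57 */

        /* lprocfs_fops_read() -> LPROCFS_ENTRY_AND_CHECK(): down_read at (2);
         * blocks forever because the remover holds the write lock */
        printf("reader: waiting for tree_lock (2)\n");
        pthread_rwlock_rdlock(&tree_lock);
        pthread_rwlock_unlock(&tree_lock);

        /* pde_users_dec() at (4): never reached */
        pthread_mutex_lock(&users_lock);
        if (--users == 0)
                pthread_cond_signal(&users_gone);
        pthread_mutex_unlock(&users_lock);
        return NULL;
}

static void *remover(void *arg)
{
        /* LPROCFS_WRITE_ENTRY(): down_write at (1) */
        pthread_rwlock_wrlock(&tree_lock);

        /* remove_proc_entry(): wait until all callers are done, (3) */
        printf("remover: waiting for users to drop (3)\n");
        pthread_mutex_lock(&users_lock);
        while (users > 0)
                pthread_cond_wait(&users_gone, &users_lock); /* waits forever */
        pthread_mutex_unlock(&users_lock);

        pthread_rwlock_unlock(&tree_lock);
        return NULL;
}

int main(void)
{
        pthread_t r, w;

        pthread_create(&r, NULL, reader, NULL);  /* reader bumps users first */
        sleep(1);
        pthread_create(&w, NULL, remover, NULL); /* remover then takes (1) */
        pthread_join(r, NULL);                   /* never returns: deadlocked */
        pthread_join(w, NULL);
        return 0;
}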

Comment by Zhenyu Xu [ 24/Jul/12 ]

patch tracking at http://review.whamcloud.com/3455

lprocfs: fix a deadlock

There is a deadlock between proc reader and proc remover.

proc reader                                  proc remover
                                             LPROCFS_WRITE_ENTRY()  -----> (1)
proc_reg_read()
   pdeaux->pde_users++
   lprocfs_fops_read()
      LPROCFS_ENTRY_AND_CHECK() // wait semaphore  <----- (2)
                                             remove_proc_entry()
                                                if (pdeaux->pde_users > 0)
                                                    wait_for_completion()  -> (3)
...
   pde_users_dec() // pdeaux->pde_users--, complete()  <---- (4)

when remover gets to (3), the proc reader will wait at (2), while
proc remover cannot move on until proc reader reaches (4), a deadlock
ensues.

Comment by Zhenyu Xu [ 24/Jul/12 ]

Updated patch:

lprocfs: refine LC_PROCFS_USERS check

In some RHEL-patched 2.6.18 kernels, the pde_users member is added in
a separate struct proc_dir_entry_aux instead of in struct
proc_dir_entry, as in kernel versions 2.6.23 and later.

Comment by Peter Jones [ 26/Jul/12 ]

Landed for 2.3

Comment by Jian Yu [ 13/Aug/12 ]

Lustre Tag: v2_1_3_RC1
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/113/
Distro/Arch: RHEL5.8/x86_64 (kernel version: 2.6.18-308.11.1.el5)
Network: TCP (1GigE)

The same issue exists in Lustre 2.1.3: https://maloo.whamcloud.com/test_sets/89731b4c-e415-11e1-b6d3-52540035b04c

Will the patch for this ticket be cherry-picked/ported to the b2_1 branch?

Comment by Zhenyu Xu [ 13/Aug/12 ]

b2_1 patch tracking at http://review.whamcloud.com/3471

Comment by Peng Tao [ 14/Nov/12 ]

With commit 76bf16d1e12cd3c2d2f48a31e3e6c1ad66523638 (LU-1484 lprocfs: refine LC_PROCFS_USERS check) for this ticket, I got the following build errors with the RHEL 2.6.18-274.12.1.el5.i686 kernel.

CC [M] /home/bergwolf/src/lustre-testing/libcfs/libcfs/linux/linux-tracefile.o
In file included from /home/bergwolf/src/lustre-testing/libcfs/include/libcfs/libcfs.h:322,
from /home/bergwolf/src/lustre-testing/libcfs/libcfs/linux/linux-tracefile.c:38:
/home/bergwolf/src/lustre-testing/libcfs/include/libcfs/params_tree.h:107:2: error: #error proc_dir_entry->deleted is conflicted with proc_dir_entry->pde_users
make[6]: *** [/home/bergwolf/src/lustre-testing/libcfs/libcfs/linux/linux-tracefile.o] Error 1
make[5]: *** [/home/bergwolf/src/lustre-testing/libcfs/libcfs] Error 2
make[4]: *** [/home/bergwolf/src/lustre-testing/libcfs] Error 2

Comment by Peng Tao [ 14/Nov/12 ]

config.h shows that both HAVE_PROCFS_DELETED and HAVE_PROCFS_USERS are defined.

/* kernel has deleted member in procfs entry struct */
#define HAVE_PROCFS_DELETED 1

/* kernel has pde_users member in proc_dir_entry_aux */
#define HAVE_PROCFS_USERS 1
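
The build failure comes from the exclusivity guard in libcfs/include/libcfs/params_tree.h (line 107 per the error above), which assumed the two configure results could never hold at once. The exact source is not quoted in this ticket, but the guard is presumably of this shape:

#if defined(HAVE_PROCFS_DELETED) && defined(HAVE_PROCFS_USERS)
#error proc_dir_entry->deleted is conflicted with proc_dir_entry->pde_users
#endif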

Comment by Stephen Champion [ 15/Nov/12 ]

Building b2_1, I have verified that the RHEL 5.8 updates 2.6.18-308.11.1.el5 and 2.6.18-308.16.1.el5 include both proc_dir_entry.deleted and proc_dir_entry_aux.pde_users. As Peng reported, this leads to

libcfs/include/libcfs/params_tree.h:107:2: error: #error proc_dir_entry->deleted is conflicted with proc_dir_entry->pde_users

anytime the source for one of these RHEL 5.8 updates is used for --with-linux=.

Comment by Stephen Champion [ 15/Nov/12 ]

Follow up in
LU-2334 build errors with 2.6.18 kernel

Comment by Jian Yu [ 10/Dec/12 ]

Lustre Branch: b2_1
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/148
Distro/Arch: RHEL5.8/x86_64 (kernel version: 2.6.18-308.20.1.el5)

The same issue occurred again on recovery-small test 57:

== recovery-small test 57: read procfs entries causes kernel crash =================================== 19:13:57 (1354936437)
fail_loc=0x80000B00
CMD: fat-intel-3vm6.lab.whamcloud.com grep -c /mnt/lustre' ' /proc/mounts
Stopping client fat-intel-3vm6.lab.whamcloud.com /mnt/lustre (opts:)
CMD: fat-intel-3vm6.lab.whamcloud.com lsof -t /mnt/lustre
CMD: fat-intel-3vm6.lab.whamcloud.com umount  /mnt/lustre 2>&1

Console log on client (fat-intel-3vm6):

19:14:02:Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 19:13:57 (1354936437)
19:14:02:Lustre: DEBUG MARKER: grep -c /mnt/lustre' ' /proc/mounts
19:14:02:LustreError: 9173:0:(fail.c:133:__cfs_fail_timeout_set()) cfs_fail_timeout id b00 sleeping for 10000ms
19:14:02:Lustre: DEBUG MARKER: lsof -t /mnt/lustre
19:14:02:Lustre: DEBUG MARKER: umount /mnt/lustre 2>&1
19:14:13:LustreError: 9173:0:(fail.c:137:__cfs_fail_timeout_set()) cfs_fail_timeout id b00 awake
19:16:57:INFO: task lctl:9173 blocked for more than 120 seconds.
19:16:57:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
19:16:57:lctl          D 0000000000001000     0  9173   8913                     (NOTLB)
19:16:57: ffff810044841e38 0000000000000086 ffffffff800cfa80 ffff810037d56d40
19:16:57: 0000000000000282 0000000000000007 ffff81004de4f860 ffff81005f2120c0
19:16:57: 00002a16f848196c 0000000000020344 ffff81004de4fa48 0000000000000001
19:16:57:Call Trace:
19:16:57: [<ffffffff800cfa80>] zone_statistics+0x3e/0x6d
19:16:58: [<ffffffff8000f47b>] __alloc_pages+0x78/0x308
19:16:58: [<ffffffff8006468c>] __down_read+0x7a/0x92
19:16:58: [<ffffffff889170c2>] :obdclass:lprocfs_fops_read+0x82/0x200
19:16:58: [<ffffffff8010b73e>] proc_reg_read+0x7e/0x99
19:16:58: [<ffffffff8000b72f>] vfs_read+0xcb/0x171
19:16:58: [<ffffffff80011d85>] sys_read+0x45/0x6e
19:16:58: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
19:16:58:
19:16:58:INFO: task umount:9195 blocked for more than 120 seconds.
19:16:58:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
19:16:58:umount        D ffff810002536420     0  9195   9194                     (NOTLB)
19:16:58: ffff810058613a08 0000000000000086 00000000ffffffff 0000000000000020
19:16:58: 00000000ffffffff 0000000000000007 ffff81005f2120c0 ffffffff8031ab60
19:16:58: 00002a16f8483cb6 000000000000234a ffff81005f2122a8 0000000000000000
19:16:58:Call Trace:
19:16:58: [<ffffffff80064cb5>] __reacquire_kernel_lock+0x2e/0x47
19:16:58: [<ffffffff80063171>] wait_for_completion+0x79/0xa2
19:16:58: [<ffffffff8008ee97>] default_wake_function+0x0/0xe
19:16:58: [<ffffffff8010e8a2>] remove_proc_entry+0xfb/0x1c7
19:16:58: [<ffffffff88915573>] :obdclass:lprocfs_remove+0x103/0x130
19:16:58: [<ffffffff889159d0>] :obdclass:lprocfs_obd_cleanup+0x90/0xa0
19:16:58: [<ffffffff88caf665>] :osc:osc_precleanup+0x2e5/0x3a0
19:16:58: [<ffffffff88920c35>] :obdclass:class_cleanup+0xc55/0xda0
19:16:58: [<ffffffff889241f6>] :obdclass:class_process_config+0x1b46/0x2cc0
19:16:58: [<ffffffff88926a0b>] :obdclass:class_manual_cleanup+0x9bb/0xd70
19:16:58: [<ffffffff88d03c5d>] :lov:lov_putref+0xa7d/0xaf0
19:16:58: [<ffffffff88cfe623>] :lov:lov_del_target+0x6d3/0x720
19:16:58: [<ffffffff88d0b78b>] :lov:lov_disconnect+0x39b/0x440
19:16:58: [<ffffffff88de73ea>] :lustre:client_common_put_super+0x83a/0xe10
19:16:58: [<ffffffff88de7d15>] :lustre:ll_put_super+0x1a5/0x330
19:16:58: [<ffffffff800f120a>] invalidate_inodes+0xce/0xe0
19:16:58: [<ffffffff800e78ae>] generic_shutdown_super+0x79/0xfb
19:16:58: [<ffffffff800e797e>] kill_anon_super+0x9/0x35
19:16:58: [<ffffffff800e7a2f>] deactivate_super+0x6a/0x82
19:16:58: [<ffffffff800f28f7>] sys_umount+0x245/0x27b
19:16:58: [<ffffffff800ba78a>] audit_syscall_entry+0x1a8/0x1d3
19:16:58: [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Maloo report: https://maloo.whamcloud.com/test_sets/a59d9126-41bc-11e2-a653-52540035b04c

Comment by Zhenyu Xu [ 10/Dec/12 ]

In Lustre Branch: b2_1
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/148
Distro/Arch: RHEL5.8/x86_64 (kernel version: 2.6.18-308.20.1.el5)

In the RHEL5 kernel build log:

checking if kernel has pde_users member in procfs entry struct... no

This test result does not match the expected result from the 2.6.18-308.20.1.el5 kernel source; my local test shows

checking if kernel has pde_users member in procfs entry struct... yes

and the build failed as Stephen pointed out:

/root/work/lustre.clone/libcfs/include/libcfs/params_tree.h:113:2: error: #error proc_dir_entry->deleted is conflicted with proc_dir_entry->pde_users

This means there is something wrong with the RHEL5 kernel build.

The hung stack also reveals that proc_dir_entry_aux::pde_users is used.

remove_proc_entry() in the 2.6.18-308.20.1.el5 source:
                /* Wait until all existing callers into module are done. */
                pdeaux = to_pde_aux(de);
                if (pdeaux->pde_users > 0) {
                        DECLARE_COMPLETION_ONSTACK(c);
                        if (!pdeaux->pde_unload_completion)
                                pdeaux->pde_unload_completion = &c;

                        spin_unlock(&pdeaux->pde_unload_lock);
                        spin_unlock(&proc_subdir_lock);

                        wait_for_completion(pdeaux->pde_unload_completion);

                        spin_lock(&proc_subdir_lock);
                        goto continue_removing;
                }

I'll update the patch to make the 2.6.18-308.20.1.el5 build pass.

Comment by Zhenyu Xu [ 10/Dec/12 ]

The b2_1 patch handling the build error for the 2.6.18-308 RHEL5 kernel is tracked at http://review.whamcloud.com/4794

Commit message:
LU-1484 kernel: pass RHEL5 build for 2.6.18-308

For the vanilla kernel, proc_dir_entry::deleted and ::pde_users
co-exist from 2.6.23 to 2.6.23.17.

Some RHEL5 kernels define the co-existing members
proc_dir_entry::deleted and proc_dir_entry_aux::pde_users.

Comment by Zhenyu Xu [ 11/Dec/12 ]

Strangely, the RHEL5 build in http://build.whamcloud.com/job/lustre-reviews/11128/arch=x86_64,build_type=server,distro=el5,ib_stack=inkernel/ still shows that the pde_users test failed.

Chris Gearing, could you help me check why the RHEL5 (b2_1) build does not detect the pde_users member? Thanks.

Check items:

  • it is defined in fs/proc/internal.h, in the proc_dir_entry_aux structure.
  • lustre/autoconf/lustre-core.m4 checks this member in LC_PROCFS_USERS.
  • after configure has run, config.h contains "#define HAVE_PROCFS_DELETED 1".

Comment by Peter Jones [ 19/Dec/12 ]

Landed for 2.1.4. RHEL5 is not supported in 2.4, so this latest change is not needed on master.

Comment by Jian Yu [ 22/Dec/12 ]

Lustre Tag: v2_1_4_RC2
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/164
Distro/Arch: RHEL5.8/x86_64

The issue occurred again: https://maloo.whamcloud.com/test_sets/baaad7ac-4c1d-11e2-875d-52540035b04c

Comment by Jian Yu [ 31/Dec/12 ]

Lustre Branch: b1_8
Lustre Build: http://build.whamcloud.com/job/lustre-b1_8/236/
Distro/Arch: RHEL5.8/x86_64
Test Group: failover

The same issue occurred: https://maloo.whamcloud.com/test_sets/e6be996c-51b5-11e2-a904-52540035b04c

Comment by Chris Gearing (Inactive) [ 03/Jan/13 ]

Zhenyu Xu:

I'm not sure what you are asking of me, but perhaps you could tell me what this line indicates?

after configure has run, config.h contains "#define HAVE_PROCFS_DELETED 1"

Which config.h file is this?

Comment by Zhenyu Xu [ 03/Jan/13 ]

  • pde_users should be defined in fs/proc/internal.h, in the proc_dir_entry_aux structure.
  • lustre/autoconf/lustre-core.m4 should have checked this member in LC_PROCFS_USERS.
  • after configure has run, the config.h generated under the lustre build root directory should contain "#define HAVE_PROCFS_DELETED 1" and "#define HAVE_PROCFS_USERS 1".
Comment by Jian Yu [ 06/Jan/13 ]

Lustre Branch: b1_8
Lustre Build: http://build.whamcloud.com/job/lustre-b1_8/236/
Distro/Arch: RHEL5.8/x86_64

The same issue occurred again: https://maloo.whamcloud.com/test_sets/6ed434e6-57ca-11e2-9cc9-52540035b04c

Comment by Chris Gearing (Inactive) [ 08/Jan/13 ]

I presume this is a build issue.

If I look at the latest b1_8 head build on the server, then

BUILD/BUILD/lustre-1.8.8.60/config.h

does have

#define HAVE_PROCFS_DELETED 1

but

/* kernel has pde_users member in procfs entry struct */
/* #undef HAVE_PROCFS_USERS */

I've attached the config file and the config.log.

Comment by Zhenyu Xu [ 08/Jan/13 ]

I need to port the relevant patches to the b1_8 branch then. It's tracked at http://review.whamcloud.com/4976

Commit message:
LU-1484 lprocfs: refine LC_PROCFS_USERS check

In some RHEL-patched 2.6.18 kernels, the pde_users member is added in
a separate struct proc_dir_entry_aux instead of in struct
proc_dir_entry, as in kernel versions 2.6.23 and later.

Comment by Peter Jones [ 14/Jan/13 ]

Landed to b1_8

Comment by Jian Yu [ 19/Jan/13 ]

Lustre Branch: b1_8
Lustre Build: http://build.whamcloud.com/job/lustre-b1_8/248
Distro/Arch: RHEL5.8/x86_64 (kernel: 2.6.18-308.11.1.el5)

The issue still occurred:
https://maloo.whamcloud.com/test_sets/2b536b04-623a-11e2-b20c-52540035b04c

Comment by Jian Yu [ 20/Jan/13 ]

Another instance on Lustre build http://build.whamcloud.com/job/lustre-b1_8/249 :
https://maloo.whamcloud.com/test_sets/3155439e-6355-11e2-ae8b-52540035b04c

Comment by Zhenyu Xu [ 20/Jan/13 ]

b1_8 still needs another patch landed: http://review.whamcloud.com/5129

Commit message:
LU-1484 kernel: pass RHEL5 build for 2.6.18-308

For the vanilla kernel, proc_dir_entry::deleted and ::pde_users
co-exist from 2.6.23 to 2.6.23.17.

Some RHEL5 kernels define the co-existing members
proc_dir_entry::deleted and proc_dir_entry_aux::pde_users.

Comment by Jian Yu [ 29/Jan/13 ]

Lustre Branch: b1_8
Lustre Build: http://build.whamcloud.com/job/lustre-b1_8/252
Distro/Arch: RHEL5.9 (kernel version: 2.6.18-348.1.1.el5)

recovery-small test 57 failed again: https://maloo.whamcloud.com/test_sets/68c48694-6a28-11e2-85d4-52540035b04c

Comment by Zhenyu Xu [ 29/Jan/13 ]

Chris,

Would you mind checking the build system again for the following info on the latest b1_8 build (http://build.whamcloud.com/job/lustre-b1_8/252)?

  • pde_users should be defined in fs/proc/internal.h, in the proc_dir_entry_aux structure.
  • lustre/autoconf/lustre-core.m4 should have checked this member in LC_PROCFS_USERS.
  • after configure has run, the config.h generated under the lustre build root directory should contain "#define HAVE_PROCFS_DELETED 1" and "#define HAVE_PROCFS_USERS 1".

Comment by Chris Gearing (Inactive) [ 30/Jan/13 ]

Xu,

Can you provide a lot more info please? I really do not know what you are asking me to check.

Does fs/proc/internal.h come as part of the Lustre source? If so, how does the build system affect the presence of pde_users? And if not, where does fs/proc/internal.h come from?

I guess I'm just not understanding how the build system affects these things.

What do I need to provide to help you debug this?

Comment by Zhenyu Xu [ 30/Jan/13 ]

Sorry, fs/proc/internal.h is a kernel file, like linux-2.6.32-xxx/fs/proc/internal.h.

I'll also check my local 2.6.18-348.1.1.el5 kernel.

Comment by Zhenyu Xu [ 30/Jan/13 ]

Chris,

Here is the confirmation procedure from my local VM test environment (CentOS 5.9, 2.6.18-348.1.1.el5 kernel); can you confirm these steps on the build node?

$ tail -n 15 linux-2.6.18-308.11.1.el5-b18/fs/proc/internal.h 
/*
 * RHEL internal wrapper to extend struct proc_dir_entry
 */
struct proc_dir_entry_aux {
	struct proc_dir_entry pde;
	int pde_users;  /* number of callers into module in progress */
	spinlock_t pde_unload_lock; /* proc_fops checks and pde_users bumps */
	struct completion *pde_unload_completion;
	char name[]; /* PDE name */
};

static inline struct proc_dir_entry_aux *to_pde_aux(struct proc_dir_entry *d)
{
	return container_of(d, struct proc_dir_entry_aux, pde);
}



$ ./configure --with-linux=/path-to/linux-2.6.18-308.11.1.el5-b18
...
...
checking if kernel has pde_users member in procfs entry struct... yes
...
checking if kernel has deleted member in procfs entry struct... yes
...



$ grep -n PROCFS ~/work/lustre-b18/config.h
440:#define HAVE_PROCFS_DELETED 1
443:#define HAVE_PROCFS_USERS 1



$ ONLY=57 bash recovery-small.sh
...
...
== test 57: read procfs entries causes kernel crash == 20:41:25
fail_loc=0x80000B00
Stopping client test3 /mnt/lustre (opts:)
fail_loc=0x80000B00
Stopping /mnt/mds (opts:)
Failover mds to test3
20:41:49 (1359549709) waiting for test3 network 900 secs ...
20:41:49 (1359549709) network interface is UP
Starting mds: -o loop -o abort_recovery /tmp/lustre-mdt /mnt/mds
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=32
Started lustre-MDT0000
recovery-small.sh: line 992: kill: (1143) - No such process
fail_loc=0
Starting client: test3: -o user_xattr,acl,flock test3@tcp:/lustre /mnt/lustre
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=32
Filesystem           1K-blocks      Used Available Use% Mounted on
test3@tcp:/lustre       562408     53656    478752  11% /mnt/lustre
Resetting fail_loc on all nodes...done.
PASS 57 (26s)
...===== recovery-small.sh test complete, duration 28 sec ======================

Comment by Zhenyu Xu [ 31/Jan/13 ]

In the client build log "config.log" I found this:

 #include <linux/kernel.h>
|
|               #include "/var/lib/jenkins/workspace/lustre-b1_8/arch/x86_64/build_type/client/distro/el5/ib_stack/inkernel/BUILD/reused/usr/src/kernels/2.6.18-348.1.1.el5-x86_64/fs/proc/internal.h"
|
| int
| main (void)
| {
|
|               struct proc_dir_entry_aux pde_aux;
|
|               pde_aux.pde_users = 0;
|
|   ;
|   return 0;
| }
configure:14056: result: no

And I checked the client build environment Chris copied for debugging:

bobijam@brent:/scratch/help-bob-jam/client/BUILD/reused/usr/src/kernels/2.6.18-3
48.1.1.el5-x86_64$ ll fs/proc/
total 12
drwxr-xr-x 2 533 503 4096 Jan 31 09:50 ./
drwxr-xr-x 66 533 503 4096 Jan 31 09:50 ../
-rw-r--r-- 1 533 503 378 Jan 31 09:50 Makefile

There are no files under fs/proc/, while this RHEL kernel (vanilla kernel + RHEL patches) should have .c/.h files under fs/proc/.

Brian, does the build process at this stage use only the vanilla kernel (i.e., without the RHEL patches applied)? The RHEL kernel src rpm only provides the vanilla kernel source plus RHEL's patches, and the patches only get applied when rpmbuild executes its "%prep" stage.

Comment by Brian Murrell (Inactive) [ 01/Feb/13 ]

Does the client build even use the full kernel source at all? It shouldn't need it, I don't think; just kernel-devel. Yes, looking in lbuild, in build_with_srpm() where $PATCHLESS == true:

        if ! kernelrpm=$(find_linux_rpm "-$DEVEL_KERNEL_TYPE"); then
            fatal 1 "Could not find the kernel-$DEVEL_KERNEL_TYPE RPM in ${KERNELRPMSBASE}/${lnxmaj}/${DISTRO}"
        fi
        if ! lnxrel="$lnxrel" unpack_linux_devel_rpm "$kernelrpm" "-"; then
            fatal 1 "Could not find the Linux tree in $kernelrpm"
        fi

and we can see in the build log in:

http://build.whamcloud.com/job/lustre-b1_8/252/arch=x86_64,build_type=client,distro=el5,ib_stack=inkernel/consoleText

+ kernelrpm=/var/lib/jenkins/lbuild-data/kernelrpm/2.6.18/rhel5/x86_64/kernel-devel-2.6.18-348.1.1.el5.x86_64.rpm
...
+ unpack_linux_devel_rpm /var/lib/jenkins/lbuild-data/kernelrpm/2.6.18/rhel5/x86_64/kernel-devel-2.6.18-348.1.1.el5.x86_64.rpm -

So indeed, it's kernel-devel that Lustre's configure is pointed at in build_with_srpm()->build_lustre():

++ ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info --with-linux=/var/lib/jenkins/workspace/lustre-b1_8/arch/x86_64/build_type/client/distro/el5/ib_stack/inkernel/BUILD/reused/usr/src/kernels/2.6.18-348.1.1.el5-x86_64 --with-linux-obj=/var/lib/jenkins/workspace/lustre-b1_8/arch/x86_64/build_type/client/distro/el5/ib_stack/inkernel/BUILD/reused/usr/src/kernels/2.6.18-348.1.1.el5-x86_64 --disable-server --enable-liblustre --enable-liblustre-tests --with-release=wc1_2.6.18_348.1.1.el5_g3480bb0 --enable-tests --enable-liblustre-tests

If you look in kernel-devel you will find that fs/proc/ is empty, because kernel-devel provides "kernel headers", not full kernel source.

Ultimately, what this means is that you need to re-cook your configure test so that it needs only kernel headers, not kernel source. This is the standard for external kernel modules: they should be buildable with kernel headers only, since the kernel headers represent the API, and reaching behind the API is "cheating".

Comment by Zhenyu Xu [ 01/Feb/13 ]

Andreas,

RHEL 5.9 doesn't reveal pde_users in its devel package. I can find no other way to detect proc_dir_entry_aux::pde_users, and since pde_users is used in all later kernels, is it OK to change the lprocfs_status.[c|h] code to assume HAVE_PROCFS_USERS is always defined?

Comment by Andreas Dilger [ 01/Feb/13 ]

I can't find any way to check for proc_dir_entry_aux, so we can't depend on checking it for patchless clients.

I think a few things need to change here:

  • the code in lprocfs_status.h (1.8) and param_tree.h (master) should be changed to check for HAVE_PROCFS_USERS first and HAVE_PROCFS_DELETED second, so that if both are available it uses the HAVE_PROCFS_USERS method
  • always check for proc_fops == NULL, regardless of whether we detect HAVE_PROCFS_USERS
  • always check for deleted, if HAVE_PROCFS_DELETED is set, even if HAVE_PROCFS_USERS is also present

At worst this causes some small race where a /proc entry will not be shown when it has just been loaded or unloaded, but it should be safe against crashing.

static inline int LPROCFS_ENTRY_AND_CHECK(struct proc_dir_entry *dp)
{
        int deleted = 0;

        /* proc_fops is cleared by remove_proc_entry(), so a NULL value
         * means the entry is going (or has gone) away */
#ifdef HAVE_PROCFS_USERS
        spin_lock(&dp->pde_unload_lock);
#endif
        if (unlikely(dp->proc_fops == NULL))
                deleted = 1;
#ifdef HAVE_PROCFS_USERS
        spin_unlock(&dp->pde_unload_lock);
#endif

        LPROCFS_ENTRY();
        /* also honour the deleted flag when the kernel provides one,
         * even if HAVE_PROCFS_USERS is present */
#if defined(HAVE_PROCFS_DELETED)
        if (unlikely(dp->deleted)) {
                LPROCFS_EXIT();
                deleted = 1;
        }
#endif

        return deleted ? -ENODEV : 0;
}

I haven't tested this at all, nor even compiled it yet.
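
For context, here is a hedged sketch of how the read path would use this helper so a reader bails out with -ENODEV instead of sleeping behind the remover. The shape follows the stack traces above (proc_reg_read() -> lprocfs_fops_read()), but the real lprocfs_fops_read() body differs; the 2.6.18-era read signature and the PDE() helper from <linux/proc_fs.h> are assumptions.

static ssize_t lprocfs_fops_read(struct file *f, char __user *buf,
                                 size_t size, loff_t *ppos)
{
        struct proc_dir_entry *dp = PDE(f->f_dentry->d_inode);
        int rc;

        /* refuse the read if the entry is being removed, rather than
         * taking LPROCFS_ENTRY() unconditionally and sleeping on
         * _lprocfs_lock behind remove_proc_entry() */
        rc = LPROCFS_ENTRY_AND_CHECK(dp);
        if (rc < 0)
                return rc;

        /* ... format the entry and copy the result to buf as before ... */

        LPROCFS_EXIT();
        return rc;
}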

Comment by Andreas Dilger [ 01/Feb/13 ]

Patch at http://review.whamcloud.com/5253, let's hope it builds and tests OK.

Comment by Jian Yu [ 05/Feb/13 ]

Lustre Branch: b1_8
Lustre Build: http://build.whamcloud.com/job/lustre-b1_8/253
Distro/Arch: RHEL5.9/x86_64

The issue still occurred: https://maloo.whamcloud.com/test_sets/583b7710-7009-11e2-a955-52540035b04c

Comment by Zhenyu Xu [ 06/Feb/13 ]

Since recovery-small test_57 is intended to test removing a proc entry while it is being read, the patch (review #5253) cannot avoid the test hanging with a patchless client build on kernels whose proc_dir_entry user count is hidden.

Since later kernels all use the proc_dir_entry user count, I think we can presume it and define LPROCFS_{ENTRY,EXIT} as empty ops.
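
A minimal sketch of that presumption, assuming the kernel's own pde_users accounting is what serializes readers against remove_proc_entry(); the actual change landed later via http://review.whamcloud.com/5439 and may differ:

/* Rely on the kernel's pde_users accounting instead of Lustre's
 * _lprocfs_lock, so a reader never sleeps behind the remover. */
#define LPROCFS_ENTRY()  do { } while (0)
#define LPROCFS_EXIT()   do { } while (0)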

Comment by Nathaniel Clark [ 14/Feb/13 ]

Patch to assume proc_dir_entry users for RHEL kernels: http://review.whamcloud.com/5439

Comment by Peter Jones [ 18/Feb/13 ]

Nathaniel,

Is this patch needed for b2_1 also?

Peter

Comment by Jian Yu [ 18/Feb/13 ]

Per http://wiki.whamcloud.com/display/ENG/Lustre+2.1.4+release+testing+tracker, the issue still exists in Lustre 2.1.4, so we need the patch on the current b2_1 branch for Lustre 2.1.5.

Comment by Nathaniel Clark [ 19/Feb/13 ]

Peter,

Yes. This patch applies cleanly to b2_1 (and all the way through master). It should be applied to anything on which we want to support RHEL 5. Should I submit additional patches?

Comment by Nathaniel Clark [ 19/Feb/13 ]

b2_1 patch:
http://review.whamcloud.com/5468

Comment by Peter Jones [ 21/Feb/13 ]

Landed for 1.8.9 and 2.1.5

Generated at Sat Feb 10 01:17:02 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.