Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • None
    • Affects Version/s: Lustre 2.1.3
    • Severity: 3
    • 6118

    Description

      We have 4 OSSes that crash at the same time, at umount, with the following backtrace:

      PID: 18173 TASK: ffff8803376dc040 CPU: 4 COMMAND: "umount"
      #0 [ffff8802b115f8d0] machine_kexec at ffffffff8102895b
      #1 [ffff8802b115f930] crash_kexec at ffffffff810a4622
      #2 [ffff8802b115fa00] panic at ffffffff81484657
      #3 [ffff8802b115fa80] lbug_with_loc at ffffffffa04ade5b [libcfs]
      #4 [ffff8802b115faa0] llog_recov_thread_stop at ffffffffa072e55b [ptlrpc]
      #5 [ffff8802b115fad0] llog_recov_thread_fini at ffffffffa072e593 [ptlrpc]
      #6 [ffff8802b115faf0] filter_llog_finish at ffffffffa0c7d3dd [obdfilter]
      #7 [ffff8802b115fb20] obd_llog_finish at ffffffffa057c2f8 [obdclass]
      #8 [ffff8802b115fb40] filter_precleanup at ffffffffa0c7cdaf [obdfilter]
      #9 [ffff8802b115fba0] class_cleanup at ffffffffa05a3ca7 [obdclass]
      #10 [ffff8802b115fc20] class_process_config at ffffffffa05a5feb [obdclass]
      #11 [ffff8802b115fcb0] class_manual_cleanup at ffffffffa05a6d29 [obdclass]
      #12 [ffff8802b115fd70] server_put_super at ffffffffa05b2c0c [obdclass]
      #13 [ffff8802b115fe40] generic_shutdown_super at ffffffff8116542b
      #14 [ffff8802b115fe60] kill_anon_super at ffffffff81165546
      #15 [ffff8802b115fe80] lustre_kill_super at ffffffffa05a8966 [obdclass]
      #16 [ffff8802b115fea0] deactivate_super at ffffffff811664e0
      #17 [ffff8802b115fec0] mntput_no_expire at ffffffff811826bf
      #18 [ffff8802b115fef0] sys_umount at ffffffff81183188
      #19 [ffff8802b115ff80] system_call_fastpath at ffffffff810030f2
      RIP: 00007f62ddfbdd67 RSP: 00007fffab738308 RFLAGS: 00010202
      RAX: 00000000000000a6 RBX: ffffffff810030f2 RCX: 0000000000000010
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f62deeb3bb0
      RBP: 00007f62deeb3b80 R8: 00007f62deeb3bd0 R9: 0000000000000000
      R10: 00007fffab738130 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 00007f62deeb3c10
      ORIG_RAX: 00000000000000a6 CS: 0033 SS: 002b

      This backtrace is identical to the one shown in LU-1194, which is supposed to be fixed in 2.1.3.

      The site is classified so I can't upload the crash dump, but I can export the content of some structures upon request.

      Attachments

        1. ptlrpcd.c
          32 kB
        2. recov_thread.c
          24 kB

        Activity

          [LU-2615] group of OSS crashed at umount

          Hi Hongchao,

           Sorry, maybe 'race' is not the right word to express my thought.

           At the time llcd_send() returns, the completion handler llcd_interpret() might not have been called yet, right? While the llcd is still in use by an RPC in flight, llog_recov_thread_stop() will hit an LBUG. I can't find any code in filter_llog_finish() that waits for the RPC to finish, so I guess it is possible that when llog_recov_thread_stop() is called, the RPC is still in flight. Am I right?

          Thanks
          Li Xi

           lixi Li Xi (Inactive) added a comment
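           A wait of the kind Li Xi describes as missing could look like the following userspace sketch. This is only an analogue of the pattern, not Lustre code: the lcm_sketch structure, the inflight counter, and both function names are invented for illustration.

             #include <pthread.h>

             /* Hypothetical analogue of a commit master with an in-flight count. */
             struct lcm_sketch {
                     pthread_mutex_t lock;
                     pthread_cond_t  drained;
                     int             inflight;   /* llcds handed off for async send */
             };

             /* Completion path (analogue of llcd_interpret()): one reply arrived. */
             static void complete_one(struct lcm_sketch *lcm)
             {
                     pthread_mutex_lock(&lcm->lock);
                     if (--lcm->inflight == 0)
                             pthread_cond_signal(&lcm->drained);
                     pthread_mutex_unlock(&lcm->lock);
             }

             /* Analogue of llog_recov_thread_stop(): wait for in-flight llcds
              * to drain instead of asserting that none exist. */
             static void stop_safely(struct lcm_sketch *lcm)
             {
                     pthread_mutex_lock(&lcm->lock);
                     while (lcm->inflight > 0)
                             pthread_cond_wait(&lcm->drained, &lcm->lock);
                     pthread_mutex_unlock(&lcm->lock);
                     /* only now is it safe to tear down the llcd list */
             }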

           What are the two threads involved in the race?
           Normally, llog_recov_thread_stop is only called by llog_recov_thread_fini, and llog_recov_thread_fini is called in two places:
           one is the cleanup for a failed llog_recov_thread_init call, the other is the normal cleanup phase during device cleanup
           (called in filter_llog_finish). They can't be called simultaneously.

           Could you please attach some more info about this issue? And can it be reproduced on your site?

           hongchao.zhang Hongchao Zhang added a comment

          We hit the same problem on lustre-2.1.6 too.

           After reading the code, I am wondering whether the following race could happen. Please correct me if I am wrong.

          filter_llog_finish
          --llog_recov_thread_fini
          ----llog_sync
          ------llog_obd_repl_sync
          --------llog_cancel
          ----------llog_obd_repl_cancel
          ------------llcd_push
          --------------llcd_send
          ----------------Sending async
          ----llog_recov_thread_stop
           ------LBUG, because llcd_send() is sending an llcd and llcd_interpret() has not been called, since no reply has been received yet.

          Thanks!

           lixi Li Xi (Inactive) added a comment
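           The window in the tree above can be reproduced with a small userspace analogue: the "reply" runs on another thread, nothing waits for it, and the emptiness assertion fires just as the LBUG does. All names here are invented for the demonstration; this is not the Lustre code path itself.

             #include <assert.h>
             #include <pthread.h>
             #include <unistd.h>

             static int llcd_busy;   /* analogue of an llcd still on lcm_llcds */

             /* Analogue of llcd_interpret(): runs only once the "reply" arrives. */
             static void *interpret(void *arg)
             {
                     sleep(1);                 /* reply is still in flight */
                     __atomic_store_n(&llcd_busy, 0, __ATOMIC_SEQ_CST);
                     return NULL;
             }

             int main(void)
             {
                     pthread_t reply;

                     /* llcd_send(): hand the llcd to an async "RPC" and return. */
                     llcd_busy = 1;
                     pthread_create(&reply, NULL, interpret, NULL);

                     /* llog_recov_thread_stop(): nothing waited for the reply,
                      * so the busy llcd is still visible and the assertion
                      * fires, the analogue of the LBUG. */
                     assert(__atomic_load_n(&llcd_busy, __ATOMIC_SEQ_CST) == 0);

                     pthread_join(reply, NULL);
                     return 0;
             }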

          Hi,

          I have asked people on site for the results of the tests.

          Cheers,
          Sebastien.

           sebastien.buisson Sebastien Buisson (Inactive) added a comment

          Hi, what is the output of the test? Thanks

           hongchao.zhang Hongchao Zhang added a comment

          Hi,

           Yes, it will disable the ptlrpcd thread pools (although it does not remove the patch completely), and it should still be a relevant test.

          Thanks

           hongchao.zhang Hongchao Zhang added a comment

          Hi,

           It might be difficult to get the opportunity to install packages with those 2 patches reverted at the customer site.
           Instead, could we just set ptlrpcd_bind_policy=1 and max_ptlrpcds=2 as options for the ptlrpc kernel module, so that it behaves as if the patch from ORNL-22 were not applied?
           Is that still a relevant test for you?

          Thanks,
          Sebastien.

           sebastien.buisson Sebastien Buisson (Inactive) added a comment
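           For reference, module options like these would normally go into a modprobe configuration file, along the lines of the snippet below. The file path is only an example, and whether this fully reproduces the pre-ORNL-22 behaviour is exactly the open question above.

             # /etc/modprobe.d/lustre.conf (example path)
             # Limit ptlrpcd to two bound threads, approximating the
             # behaviour before the ORNL-22 thread-pool patch.
             options ptlrpc max_ptlrpcds=2 ptlrpcd_bind_policy=1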

           The remaining "llcd" should have been sent over a ptlrpc_request, since llog_ctxt->loc_llcd == NULL, and the request could not finish, so "llcd_interpret"
           was not called to free the "llcd". Among the patches currently applied, 2 patches (ORNL-22, general ptlrpcd threads pool support; LU-1144, implement a
           NUMA-aware ptlrpcd binding policy) are related to this. Could you please help to revert the 2 patches and test? Thanks!

           hongchao.zhang Hongchao Zhang added a comment

          Hi,
           Does the kernel dump referred to in the comment at 11/Feb/13 3:56 PM still exist? If so, could you please print the content at 0xffff88021b2c2050
           as a "struct llog_canceld_ctxt"? Besides, can the console output (just the part related to Lustre) be attached here? Thanks a lot!

           hongchao.zhang Hongchao Zhang added a comment
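           In the crash(8) utility, printing a dumped address as a named structure looks like this:

             crash> struct llog_canceld_ctxt 0xffff88021b2c2050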

           > Could it be memory corruption?
           That would be unexpected. The server is fully ECC protected and, as it's an OSS, almost nothing but Linux and Lustre runs on this node.

           > Has the issue occurred again recently?
           It occurred the last 4 times we stopped Lustre on the node.

           louveta Alexandre Louvet (Inactive) added a comment

          the list "lcm_llcds" is corrupted for its value of "next" and "prev" is wrong (it's not in the address region of "struct llog_commit_master").
          Could it be memory corrupt? there is no trace of the bug yet, sorry!

          does the issue occur again recently?

           hongchao.zhang Hongchao Zhang added a comment
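           The address-range check described above can be expressed as a small helper for inspecting a dumped list head. For an empty lcm_llcds list, next and prev must both point back at the head, which lives inside the llog_commit_master itself; links outside that region mean either queued entries or corruption. The helper below is a generic sketch, not code from the dump analysis.

             #include <stdbool.h>
             #include <stddef.h>
             #include <stdint.h>

             struct list_head { struct list_head *next, *prev; };

             /* Returns true if both links of a list head fall inside the
              * [base, base + size) region of the containing object, the
              * property checked here against struct llog_commit_master. */
             static bool list_links_in_region(const struct list_head *h,
                                              const void *base, size_t size)
             {
                     uintptr_t lo = (uintptr_t)base, hi = lo + size;
                     uintptr_t n  = (uintptr_t)h->next;
                     uintptr_t p  = (uintptr_t)h->prev;

                     return n >= lo && n < hi && p >= lo && p < hi;
             }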

          People

            hongchao.zhang Hongchao Zhang
            louveta Alexandre Louvet (Inactive)
            Votes: 0
            Watchers: 8
