[LU-9791] When umount client, kobject_put crashed the kernel Created: 23/Jul/17 Updated: 11/Sep/17 Resolved: 10/Sep/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0 |
| Fix Version/s: | Lustre 2.11.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Li Xi (Inactive) | Assignee: | John Hammond |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
I was testing the latest master branch (b2c8846,
[ 118.118013] ----------- If the client runs on a separate host, everything will be fine. |
| Comments |
| Comment by Bruno Faccini (Inactive) [ 24/Jul/17 ] |
|
Well, this was probably introduced by the recent landing of the master patch "LU-8066 obdclass : Add infrastructure for procfs to sysfs migration", commit 4594c6656d3224eb4f8eff100a2320df53c05a8f. Do you mean this problem is 100% reproducible on a single-node setup? |
| Comment by Li Xi (Inactive) [ 25/Jul/17 ] |
|
Hi Bruno, This is 100% reproducible. So, would you please reproduce it? Uploading a crash dump would take me too much time, and I think a reproduction environment will be helpful for testing the patch later. |
| Comment by Bruno Faccini (Inactive) [ 25/Jul/17 ] |
|
Sure, I told you |
| Comment by Bruno Faccini (Inactive) [ 26/Jul/17 ] |
|
Hmm, sorry, but trying to mount/umount the client on a single-node setup does not reproduce the problem for me... To be complete, I am using the current master version, which has only the following patches on top of yours:
6c63418 LU-9500 lnd: Don't Page Align remote_addr with FastReg
c084c62 LU-9749 llite: Reduce overhead for ll_do_fast_read
834e942 LU-7129 tests: fsx with directio
25e1cea LU-9772 utils: Enable new ZFS MMP on mkfs
9761b5c LU-9769 lnet: Fix lost lock
c4ff984 LU-9019 target: migrate to 64 bit time
829a24f LU-8849 ofd: Client hanges on ladvise with large start values
b2c8846 LU-6210 utils: Use C99 struct initializer for long_opt_start
....................... |
| Comment by John Hammond [ 16/Aug/17 ] |
|
For this to be triggered, the osp module must be loaded before the osc module. To see why, look at the part of osc_setup() that calls lprocfs_obd_setup().
diff --git a/lustre/tests/test-framework.sh b/lustre/tests/test-framework.sh
index e94f941..6e38ca1 100755
--- a/lustre/tests/test-framework.sh
+++ b/lustre/tests/test-framework.sh
@@ -621,7 +621,6 @@ load_modules_local() {
load_module fid/fid
load_module lmv/lmv
load_module mdc/mdc
- load_module osc/osc
load_module lov/lov
load_module mgc/mgc
load_module obdecho/obdecho
@@ -656,6 +655,7 @@ load_modules_local() {
load_module osp/osp
fi
+ load_module osc/osc
load_module llite/lustre
[ -d /r ] && OGDB=${OGDB:-"/r/tmp"}
OGDB=${OGDB:-$TMP}
Then do llmount.sh && umount /mnt/lustre. |
| Comment by John Hammond [ 16/Aug/17 ] |
|
2.10 is not affected but 2.11 is. This was introduced by https://review.whamcloud.com/26020 LU-8066 obdclass : Add infrastructure for procfs to sysfs migration. |
| Comment by Peter Jones [ 16/Aug/17 ] |
|
James, do you have input on this one? Peter |
| Comment by James A Simmons [ 16/Aug/17 ] |
|
I will take a look. I have an idea of what is going on. |
| Comment by Peter Jones [ 23/Aug/17 ] |
|
John knows how to fix this |
| Comment by Gerrit Updater [ 23/Aug/17 ] |
|
John L. Hammond (john.hammond@intel.com) uploaded a new patch: https://review.whamcloud.com/28668 |
| Comment by Gerrit Updater [ 27/Aug/17 ] |
|
James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/28747 |
| Comment by Gerrit Updater [ 10/Sep/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28747/ |
| Comment by Peter Jones [ 10/Sep/17 ] |
|
Landed for 2.11 |
| Comment by Peter Jones [ 10/Sep/17 ] |
|
Does this affect b2_10? |
| Comment by James A Simmons [ 11/Sep/17 ] |
|
No. The sysfs work has only landed for 2.11. |
| Comment by Peter Jones [ 11/Sep/17 ] |
|
Thanks James |