[LU-8715] Regression from LU-8057 causes loading of fld.ko hung in 2.7.2 Created: 18/Oct/16 Updated: 18/Apr/18 Resolved: 18/Apr/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Jay Lan (Inactive) | Assignee: | Amir Shehata (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Environment: |
Lustre server nas-2.7.2-3nasS running on CentOS 6.7. |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Since our nas-2.7.2-2nas was rebased onto b2_7_fe as nas-2.7.2-3nas, we found that loading the Lustre module fld.ko hangs. modprobe took 100% CPU time and could not be killed. I identified the culprit using git bisect: it was a b2_7_fe back port of the following one: |
| Comments |
| Comment by Bruno Faccini (Inactive) [ 18/Oct/16 ] |
|
Well, both the failure and suspected cause look surprising. |
| Comment by Mahmoud Hanafi [ 18/Oct/16 ] |
|
Module load time before was about 2-5 minutes, because we have large ntx values. |
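
For context, ntx here is the ko2iblnd LNet module parameter that sizes the transmit descriptor pools, so a large value means correspondingly large allocations when the module loads. The snippet below only illustrates where such a value would typically be set; the option names are real ko2iblnd parameters, but the numbers are hypothetical and not taken from this site's configuration.

```
# Hypothetical example of a large ko2iblnd tuning, e.g. in
# /etc/modprobe.d/ko2iblnd.conf -- values are illustrative only.
options ko2iblnd ntx=32768 credits=2048 peer_credits=128
```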
| Comment by James A Simmons [ 18/Oct/16 ] |
|
The fix is correct and it fixes a real bug. What this change did was expose another problem in the ko2iblnd driver. I have to ask: is your system really consuming all those credits? I don't think the IB driver queue pair depth is big enough to handle all those credits. |
| Comment by Joseph Gmitter (Inactive) [ 18/Oct/16 ] |
|
Hi Doug,

Can you please have a look into the issue since it relates to the

Thanks. |
| Comment by Jay Lan (Inactive) [ 18/Oct/16 ] |
|
@Bruno Faccini: Yes, I can reproduce the problem on our freshly rebooted Lustre servers by running 'modprobe fld'. |
| Comment by Mahmoud Hanafi [ 18/Oct/16 ] |
|
We have >12,000 clients. We do see some servers consume all the credits. |
| Comment by Mahmoud Hanafi [ 18/Oct/16 ] |
|
perf top showed during module load all the time is spent in __vmalloc_node.

Samples: 748K of event 'cycles', Event count (approx.): 53812402443
 Overhead  Shared Object  Symbol
   96.21%  [kernel]       [k] __vmalloc_node
    0.91%  [kernel]       [k] read_hpet
    0.28%  [kernel]       [k] get_vmalloc_info
    0.26%  [kernel]       [k] __write_lock_failed
    0.25%  [kernel]       [k] __read_lock_failed
    0.05%  [kernel]       [k] apic_timer_interrupt
    0.05%  [kernel]       [k] _spin_lock
    0.04%  perf           [.] dso__find_symbol
    0.03%  [kernel]       [k] find_busiest_group
    0.03%  [kernel]       [k] clear_page_c
    0.03%  [kernel]       [k] page_fault
    0.03%  [kernel]       [k] memset
    0.02%  [kernel]       [k] rcu_process_gp_end
    0.02%  perf           [.] perf_evsel__parse_sample
    0.02%  [kernel]       [k] sha_transform
    0.02%  [kernel]       [k] native_write_msr_safe |
| Comment by James A Simmons [ 18/Oct/16 ] |
|
I know exactly what your problem is. We saw this problem in the Lustre core some time ago and changed the OBD_ALLOC macros. The libcfs/LNet layer uses its own LIBCFS_ALLOC macros, which means that when allocations are more than 2 pages in size they hit the vmalloc spinlock serialization issue. We need a fix for libcfs much like the one Lustre had. |
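
For context, the pattern adopted for the Lustre core's OBD_ALLOC change referenced above is to try kmalloc first and only fall back to vmalloc when a physically contiguous allocation fails. Below is a minimal sketch of that pattern, not the actual libcfs patch; the function names are illustrative and it assumes a kernel that provides kvfree().

```c
/*
 * Illustrative sketch only -- not the actual Lustre/libcfs change.
 * Shows the kmalloc-first, vmalloc-fallback pattern.
 */
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>

static void *example_alloc_large(size_t size)
{
	void *ptr;

	/*
	 * Try the slab allocator first.  __GFP_NOWARN suppresses the
	 * kernel warning when a large, physically contiguous request
	 * cannot be satisfied.
	 */
	ptr = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
	if (!ptr)
		/* Only fall back to vmalloc() when kmalloc() fails. */
		ptr = vmalloc(size);

	return ptr;
}

static void example_free_large(void *ptr)
{
	/* kvfree() releases memory from either allocator correctly. */
	kvfree(ptr);
}
```

The point of the pattern is that vmalloc, and the serialized bookkeeping behind the __vmalloc_node hot spot seen in the perf profile above, is only reached when the slab allocator cannot satisfy the request.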
| Comment by Doug Oucharek (Inactive) [ 18/Oct/16 ] |
|
James, can we do that fix under this ticket? |
| Comment by James A Simmons [ 18/Oct/16 ] |
|
Why not? The problem is the LIBCFS_ALLOC and FREE macros. Looking at the macros gave me a headache, so no patch from me yet. I need to get into the right mental state to tackle it. |
| Comment by Mahmoud Hanafi [ 23/Oct/16 ] |
|
Any updates? |
| Comment by Doug Oucharek (Inactive) [ 24/Oct/16 ] |
|
As best we can figure out, the change in

As James has indicated, a proper fix will be to change how memory is allocated in LNet. That is going to take some time to get right, as the potential to break all of LNet is pretty high.

I don't believe the fix for |
| Comment by Jay Lan (Inactive) [ 25/Oct/16 ] |
|
Doug, you wrote in a previous comment:

Did you actually mean to write " |
| Comment by Doug Oucharek (Inactive) [ 25/Oct/16 ] |
|
Yes, sorry about that. |
| Comment by James A Simmons [ 18/Apr/18 ] |
|
Is this still a problem?
|
| Comment by Jay Lan (Inactive) [ 18/Apr/18 ] |
|
This case can be closed. Thanks. |
| Comment by Peter Jones [ 18/Apr/18 ] |
|
ok thanks! |