[LU-9903] kernel update [RHEL6.9 2.6.32-696.10.1.el6] Created: 22/Aug/17 Updated: 13/Sep/17 Resolved: 13/Sep/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Bob Glossman (Inactive) | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This update fixes the following bugs:
When running a Red Hat Enterprise Linux 6.8 VM with audit watches on
A sunrpc regression in Red Hat Enterprise Linux 6.9 removed the timer code that allowed the NFS client to reset a TCP connection stuck in the FIN-WAIT-2 state. This fix adds TCP keepalives for NFS client TCP connections and allows the NFS client to recover the TCP connection if it is stuck in FIN-WAIT-2. (BZ#1462094)
Due to a sunrpc regression introduced in Red Hat Enterprise Linux 6.9, when an NFS client with TCP timestamps initiated a TCP disconnect sequence, the NFS TCP connection could not be reconnected for 60 seconds during the TIME_WAIT state because the source TCP port could not be re-used. As a consequence, multiple side effects occurred during this 60-second period, including unresponsive NFS mount points, an rpciod kernel thread consuming 100% CPU, and the "retrans" count reported by "nfsstat -r" growing very large. In addition, certain mount options returned "not responding" errors, and even I/O errors could occur. With this update, a different source port is selected when an NFS TCP connection needs to reconnect during TIME_WAIT. As a result, the NFS TCP connection can reconnect immediately after a disconnect sequence and no longer waits 60 seconds for TIME_WAIT to complete. (BZ#1472128)
While running automated array reboots on the InfiniBand Host Channel Adapter (HCA), the system experienced a kernel panic, with the crash dump reporting a "Hard Lockup". The provided set of patches ensures that reboot path A fails over to reboot path B without entering a kernel panic. (BZ#1462097)
If multiple tasks attempted to read statistics for a Fibre Channel over Ethernet (FCoE) Host Bus Adapter (HBA), the start_req_done completion could be re-initialized while still being used by another task. Consequently, the system crashed, with the crash dump reporting a "Hard Lockup". This patch adds a mutex to serialize the calls to the bnx2fc_get_host_stats() function, thus fixing this bug. (BZ#1467323) |
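The serialization described for BZ#1467323 can be illustrated with a small user-space analogy. This is only a sketch and not the kernel patch: the class HostStats, the helper _firmware_reply, and the function read_stats_concurrently are hypothetical names invented for illustration, and a Python threading.Event stands in for the start_req_done completion. The point it shows is that a shared completion which is re-initialized on every request is only safe when one caller at a time runs the request path.

```python
import threading


class HostStats:
    """Hypothetical stand-in for an adapter with one shared completion."""

    def __init__(self):
        self._stats_lock = threading.Lock()  # plays the role of the added mutex
        self._req_done = threading.Event()   # plays the role of start_req_done
        self._replies = 0

    def _firmware_reply(self):
        # Simulated asynchronous reply that signals the completion.
        self._replies += 1
        self._req_done.set()

    def get_host_stats(self):
        # Only one caller at a time may re-initialize and wait on the
        # shared completion; without the lock, a second caller could
        # clear it while the first is still waiting.
        with self._stats_lock:
            self._req_done.clear()
            reply = threading.Timer(0.001, self._firmware_reply)
            reply.start()
            completed = self._req_done.wait(timeout=1.0)
            reply.join()
            return completed, self._replies


def read_stats_concurrently(n_tasks=8):
    """Run n_tasks concurrent readers and collect their results."""
    stats = HostStats()
    results = []
    results_lock = threading.Lock()

    def task():
        result = stats.get_host_stats()
        with results_lock:
            results.append(result)

    threads = [threading.Thread(target=task) for _ in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling read_stats_concurrently() starts several readers at once; because each request holds the lock from re-initialization through completion, every reader observes its own reply rather than one belonging to another task.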
| Comments |
| Comment by Gerrit Updater [ 24/Aug/17 ] |
|
Bob Glossman (bob.glossman@intel.com) uploaded a new patch: https://review.whamcloud.com/28684 |
| Comment by Gerrit Updater [ 24/Aug/17 ] |
|
Bob Glossman (bob.glossman@intel.com) uploaded a new patch: https://review.whamcloud.com/28685 |
| Comment by Gerrit Updater [ 25/Aug/17 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28685/ |
| Comment by Bob Glossman (Inactive) [ 13/Sep/17 ] |
|
Replaced by |