  1. Lustre
  2. LU-9943

LU-7124 causes connection problems under load.

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: Lustre 2.8.0, Lustre 2.9.0, Lustre 2.10.0
    • Severity: 3

    Description

      The LU-7124 patch is completely incorrect.
      It shrinks the WR (work request) array in the kernel, but the o2ib LND still assumes its own queue depth.
      This led to the following situation:

      Aug  8 21:53:00 lstr1n07 kernel: mlx5_warn:mlx5_0:mlx5_ib_post_send:4184:(pid 26272): Failed to prepare WQE
      Aug  8 21:53:00 lstr1n07 kernel: mlx5_warn:mlx5_0:begin_wqe:4085:(pid 9590): work queue overflow
      

      The overflow appears after several ENOMEM hits.
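
      The mismatch described above can be illustrated with a toy model. This is only a sketch, not the o2iblnd or mlx5 code: the names toy_send_queue, toy_queue_init, toy_post_send and toy_post_burst are hypothetical, and the numbers in the usage note are made up. It assumes the provider ends up with fewer send WRs than were requested while the sender keeps posting up to the depth it originally asked for, so the extra posts fail with -ENOMEM.

```c
#include <assert.h>
#include <errno.h>

/* Toy model of a send queue; NOT the real o2iblnd or mlx5 structures. */
struct toy_send_queue {
    int capacity;   /* WRs the provider actually allocated */
    int in_flight;  /* WRs posted and not yet completed */
};

/* Create the queue, modelling a provider that grants fewer WRs
 * than the caller requested (assumption made for illustration). */
static void toy_queue_init(struct toy_send_queue *q, int requested, int granted)
{
    assert(granted > 0 && granted <= requested);
    q->capacity = granted;
    q->in_flight = 0;
}

/* Post one WR; fails with -ENOMEM when the queue is already full. */
static int toy_post_send(struct toy_send_queue *q)
{
    if (q->in_flight >= q->capacity)
        return -ENOMEM;
    q->in_flight++;
    return 0;
}

/* Post as many WRs as the sender believes the queue can hold and
 * return how many of them were rejected. */
static int toy_post_burst(struct toy_send_queue *q, int assumed_depth)
{
    int rejected = 0;

    for (int i = 0; i < assumed_depth; i++) {
        if (toy_post_send(q) != 0)
            rejected++;
    }
    return rejected;
}
```

      In this model, a queue created with toy_queue_init(&q, 256, 64) and driven with toy_post_burst(&q, 256) rejects every post past the 64th, whereas a sender that limits itself to q.capacity sees no rejections.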

          People

            Assignee: ashehata Amir Shehata (Inactive)
            Reporter: shadow Alexey Lyashkov