[LU-5828] DLC: Cannot verify if the routing buffer has been set or not Created: 30/Oct/14  Updated: 20/Feb/15  Resolved: 20/Feb/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Minor
Reporter: Sarah Liu Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

server and client: lustre-master build #2702


Severity: 3
Rank (Obsolete): 16349

 Description   

Cannot verify whether the routing buffer count has been set; the same happens with the small and large buffers.

[root@onyx-27 proc]# more /sys/module/lnet/parameters/forwarding 
enabled
[root@onyx-27 proc]# lnetctl set tiny_buffers 2048
[root@onyx-27 proc]# lnetctl routing show
routing:
    - cpt[0]:
          tiny:
              npages: 0
              nbuffers: 512
              credits: 512
              mincredits: 512
          small:
              npages: 1
              nbuffers: 4096
              credits: 4096
              mincredits: 4096
          large:
              npages: 256
              nbuffers: 256
              credits: 256
              mincredits: 256
    - enable: 1


 Comments   
Comment by Amir Shehata (Inactive) [ 31/Oct/14 ]

When I tested on my VM it works. Please try the following steps:

1. Unload lnet
2. Reload lnet
3. Enable routing via "lnetctl set routing 1"
4. Check values via "lnetctl routing show"
5. Increase the value of one of the buffers via "lnetctl set tiny_buffers <number>"
6. Check values via "lnetctl routing show"
7. Please paste the output of the above commands in this bug.

Also note that if you reduce the number of buffers, the reduction doesn't show immediately, because buffers are freed only after they have been used and are being returned. So once traffic starts up, you'll see the number of buffers go down.
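
The delayed reduction can be pictured with a toy model. This is only a sketch of the behaviour described in this comment, not LNet code: the class `ToyBufferPool` and its methods are hypothetical names, and it assumes that growth is applied at once while shrinking happens one buffer at a time on the return path.

```python
class ToyBufferPool:
    """Toy model of a router buffer pool (hypothetical, not the LNet code)."""

    def __init__(self, nbuffers):
        self.nbuffers = nbuffers  # buffers currently allocated
        self.target = nbuffers    # count requested by the administrator
        self.in_use = 0           # buffers currently handed out

    def resize(self, target):
        # Assumption: an increase is visible immediately,
        # a decrease only lowers the target.
        if target > self.nbuffers:
            self.nbuffers = target
        self.target = target

    def take(self):
        if self.in_use >= self.nbuffers:
            raise RuntimeError("no free buffer")
        self.in_use += 1

    def give_back(self):
        if self.in_use == 0:
            raise RuntimeError("nothing to return")
        self.in_use -= 1
        # A returned buffer is freed while the pool is above its target.
        if self.nbuffers > self.target:
            self.nbuffers -= 1


pool = ToyBufferPool(1024)
pool.resize(512)      # nbuffers is still 1024 right after the request
pool.take()
pool.give_back()      # first returned buffer is freed: nbuffers is now 1023
```

In this model an idle pool never shrinks, which is why checking right after a reduction shows the old value.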

The best way to test, so that you can see the effect immediately, is to increase the number of buffers.
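
One way to compare the output of steps 4 and 6 without reading it by eye is to parse both and compare the nbuffers fields. The following is a minimal sketch, assuming the output has exactly the layout pasted in this ticket; `parse_routing_show` and `nbuffers_by_pool` are hypothetical helper names, and other lnetctl versions may print a different layout.

```python
import re

# Output layout copied from this ticket (one CPT shown).
SAMPLE = """
routing:
    - cpt[0]:
          tiny:
              npages: 0
              nbuffers: 512
              credits: 512
              mincredits: 512
          small:
              npages: 1
              nbuffers: 4096
              credits: 4096
              mincredits: 4096
          large:
              npages: 256
              nbuffers: 256
              credits: 256
              mincredits: 256
    - enable: 1
"""


def parse_routing_show(text):
    """Return {cpt: {pool: {field: int}}} from `lnetctl routing show` text."""
    result = {}
    cpt = None
    pool = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"- cpt\[(\d+)\]:$", line)
        if match:
            cpt = int(match.group(1))
            result[cpt] = {}
            pool = None
            continue
        if line.startswith("- "):
            # Any other list item, such as "- enable: 1", ends the CPT block.
            cpt = None
            pool = None
            continue
        if cpt is None:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value == "":
            pool = key                      # "tiny:", "small:", "large:"
            result[cpt][pool] = {}
        elif pool is not None:
            result[cpt][pool][key] = int(value)  # fields are integers here
    return result


def nbuffers_by_pool(parsed):
    """Flatten to {(cpt, pool): nbuffers} for a before/after comparison."""
    return {(cpt, pool): fields["nbuffers"]
            for cpt, pools in parsed.items()
            for pool, fields in pools.items()}


before = nbuffers_by_pool(parse_routing_show(SAMPLE))
after = nbuffers_by_pool(parse_routing_show(SAMPLE))
unchanged = before == after  # True means the set command had no visible effect
```

Feeding the text captured before and after the "lnetctl set" command into the two helpers gives a direct answer on whether any pool changed.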

Comment by Sarah Liu [ 03/Nov/14 ]

Hi, this is what I got:

[root@onyx-27 ~]# lsmod|grep lnet
[root@onyx-27 ~]# modprobe lnet
LNet: HW CPU cores: 32, npartitions: 4
alg: No test for adler32 (adler32-zlib)
alg: No test for crc32 (crc32-table)
alg: No test for crc32 (crc32-pclmul)
padlock: VIA PadLock Hash Engine not detected.
[root@onyx-27 ~]# lsmod|grep lnet
lnet                  343308  0 
libcfs                491216  1 lnet
[root@onyx-27 ~]# lnetctl set routing 1
add:
    - routing:
          errno: -100
          descr: "cannot enable routing Network is down"
[root@onyx-27 ~]# lctl network up
LNet: Added LNI 192.168.4.65@o2ib [8/256/0/180]
LNet: Added LNI 10.2.4.65@tcp [8/256/0/180]
LNet: Accept secure, port 7988
LNET configured
[root@onyx-27 ~]# lnetctl set routing 1
[root@onyx-27 ~]# lnetctl routing show
routing:
    - cpt[0]:
          tiny:
              npages: 0
              nbuffers: 512
              credits: 512
              mincredits: 512
          small:
              npages: 1
              nbuffers: 4096
              credits: 4096
              mincredits: 4096
          large:
              npages: 256
              nbuffers: 256
              credits: 256
              mincredits: 256
    - enable: 1
[root@onyx-27 ~]# lnetctl set tiny_buffers 1024
[root@onyx-27 ~]# lnetctl routing show
routing:
    - cpt[0]:
          tiny:
              npages: 0
              nbuffers: 512
              credits: 512
              mincredits: 512
          small:
              npages: 1
              nbuffers: 4096
              credits: 4096
              mincredits: 4096
          large:
              npages: 256
              nbuffers: 256
              credits: 256
              mincredits: 256
    - enable: 1
[root@onyx-27 ~]# 

Comment by Amir Shehata (Inactive) [ 04/Nov/14 ]

Can you look at /var/log/messages to see if there are any errors?

Comment by Amir Shehata (Inactive) [ 05/Nov/14 ]

onyx-27 has 4 CPTs. However, because an iterator variable was reused for two purposes in the function that shows routing information, only the first CPT was displayed.

http://review.whamcloud.com/12593
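
To illustrate this class of bug, here is a minimal sketch of how a reused loop variable can cut a listing short. It is not the code from the patch: the function names and the loop structure are assumptions, and while-loops are used only to mimic C-style index handling.

```python
NCPTS = 4                          # onyx-27 reports 4 CPTs
POOLS = ("tiny", "small", "large")


def show_buggy():
    """Outer and inner loop share the index i (hypothetical reconstruction)."""
    shown = []
    i = 0
    while i < NCPTS:               # i is meant to be the CPT index
        shown.append(i)
        i = 0                      # BUG: the same variable walks the pools
        while i < len(POOLS):
            i += 1                 # leaves i == 3 after the inner loop
        i += 1                     # outer increment gives 4, so the loop ends
    return shown


def show_fixed():
    """Same traversal with separate index variables."""
    shown = []
    cpt = 0
    while cpt < NCPTS:
        shown.append(cpt)
        pool = 0
        while pool < len(POOLS):
            pool += 1
        cpt += 1
    return shown
```

With 4 CPTs and 3 pools the buggy variant lists only CPT 0, which matches the symptom in this ticket; other CPT counts would misbehave differently, so the sketch keeps the count fixed.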

Note, however, that the number of buffers specified on the lnetctl command line is the total number of buffers for all CPTs. That value is then divided by the number of CPTs, so that each CPT has an equal number of buffers. The per-CPT number of buffers must meet the minimum for its buffer type:

#define LNET_NRB_TINY_MIN       512     /* min value for each CPT */
#define LNET_NRB_SMALL_MIN      4096    /* min value for each CPT */
#define LNET_NRB_LARGE_MIN      256     /* min value for each CPT */

The logic is that if you try to set the per-CPT number of buffers below the minimum, the request is silently ignored: no error is reported and the value does not change.

As an example, the minimum per CPT for tiny buffers is 512.
If you have 4 CPTs and you try to set the total number of tiny buffers to 1024, each CPT would get 1024/4 = 256 < 512, so you don't see the change.
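
The arithmetic above can be checked with a short sketch. The minimums are copied from the #define lines quoted in this comment; the helper names are hypothetical, and both the integer division and the treatment of a value exactly equal to the minimum as accepted are assumptions rather than confirmed kernel behaviour.

```python
# Per-CPT minimums, taken from the LNET_NRB_*_MIN defines quoted above.
LNET_NRB_MIN = {"tiny": 512, "small": 4096, "large": 256}


def per_cpt_request(total, ncpts):
    """Buffers each CPT would get (assumption: integer division)."""
    return total // ncpts


def request_takes_effect(pool, total, ncpts):
    """False when the per-CPT share falls below the pool minimum."""
    return per_cpt_request(total, ncpts) >= LNET_NRB_MIN[pool]


# The case from this ticket: 4 CPTs, "lnetctl set tiny_buffers 1024".
share = per_cpt_request(1024, 4)                     # 256
accepted = request_takes_effect("tiny", 1024, 4)     # False, 256 < 512
```

On a 4-CPT node a tiny_buffers total of 2048 works out to 512 per CPT, the same as the minimum already displayed, so a larger total is needed before "lnetctl routing show" looks different.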

Comment by Gerrit Updater [ 09/Dec/14 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12593/
Subject: LU-5828 lnet: showing buffers problem with mulitple CPTs
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: b4b00be8a93cf06f232d3edc613f03d06b112d32

Comment by Sarah Liu [ 15/Jan/15 ]

This has been verified in build #2808

Comment by Jodi Levi (Inactive) [ 20/Feb/15 ]

Reopening to add fix version.
