LU-11939: ASSERTION( tgd->tgd_tot_granted >= ted->ted_grant ) on OSS

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.8
    • Affects Version/s: Lustre 2.12.0
    • Labels: None
    • Environment: CentOS 7.6, 3.10.0-957.1.3.el7_lustre.x86_64
    • Severity: 3

    Description

      We just hit the following LBUG with Lustre 2.12 on an OSS (Fir). All clients are running Lustre 2.12 also.

      [1708550.581820] LustreError: 123124:0:(tgt_grant.c:1079:tgt_grant_discard()) ASSERTION( tgd->tgd_tot_granted >= ted->ted_grant ) failed: fir-OST001b: tot_granted 50041695803 cli d5e4b60f-fe33-b991-7d48-5b8db7e07ab0/ffff926b10975c00 ted_grant -49152
      [1708550.603611] LustreError: 123124:0:(tgt_grant.c:1079:tgt_grant_discard()) LBUG
      [1708550.610923] Pid: 123124, comm: ll_ost00_019 3.10.0-957.1.3.el7_lustre.x86_64 #1 SMP Fri Dec 7 14:50:35 PST 2018
      [1708550.621180] Call Trace:
      [1708550.623814]  [<ffffffffc0aa37cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [1708550.630548]  [<ffffffffc0aa387c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [1708550.636935]  [<ffffffffc0f220bc>] tgt_grant_discard+0x1dc/0x1e0 [ptlrpc]
      [1708550.643892]  [<ffffffffc14c81d4>] ofd_obd_disconnect+0x74/0x220 [ofd]
      [1708550.650541]  [<ffffffffc0e60157>] target_handle_disconnect+0xd7/0x450 [ptlrpc]
      [1708550.658005]  [<ffffffffc0efeb77>] tgt_disconnect+0x37/0x140 [ptlrpc]
      [1708550.664609]  [<ffffffffc0f0635a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
      [1708550.671734]  [<ffffffffc0eaa92b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      [1708550.679628]  [<ffffffffc0eae25c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
      [1708550.686136]  [<ffffffff8dcc1c31>] kthread+0xd1/0xe0
      [1708550.691224]  [<ffffffff8e374c24>] ret_from_fork_nospec_begin+0xe/0x21
      [1708550.697873]  [<ffffffffffffffff>] 0xffffffffffffffff
      [1708550.703065] Kernel panic - not syncing: LBUG
      [1708550.707509] CPU: 20 PID: 123124 Comm: ll_ost00_019 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.1.3.el7_lustre.x86_64 #1
      [1708550.720273] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.6.7 10/29/2018
      [1708550.728015] Call Trace:
      [1708550.730645]  [<ffffffff8e361e41>] dump_stack+0x19/0x1b
      [1708550.735962]  [<ffffffff8e35b550>] panic+0xe8/0x21f
      [1708550.740937]  [<ffffffffc0aa38cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [1708550.747346]  [<ffffffffc0f220bc>] tgt_grant_discard+0x1dc/0x1e0 [ptlrpc]
      [1708550.754230]  [<ffffffffc14c81d4>] ofd_obd_disconnect+0x74/0x220 [ofd]
      [1708550.760880]  [<ffffffffc0e9ed81>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
      [1708550.767783]  [<ffffffffc0ec3933>] ? req_capsule_server_pack+0x43/0xf0 [ptlrpc]
      [1708550.775207]  [<ffffffffc0e60157>] target_handle_disconnect+0xd7/0x450 [ptlrpc]
      [1708550.782634]  [<ffffffffc0efeb77>] tgt_disconnect+0x37/0x140 [ptlrpc]
      [1708550.789194]  [<ffffffffc0f0635a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
      [1708550.796272]  [<ffffffffc0edfa51>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
      [1708550.804022]  [<ffffffffc0aa3bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
      [1708550.811281]  [<ffffffffc0eaa92b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      [1708550.819142]  [<ffffffffc0ea77b5>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
      [1708550.826110]  [<ffffffff8dcd67c2>] ? default_wake_function+0x12/0x20
      [1708550.832548]  [<ffffffff8dccba9b>] ? __wake_up_common+0x5b/0x90
      [1708550.838589]  [<ffffffffc0eae25c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
      [1708550.845068]  [<ffffffffc0ead760>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
      [1708550.852636]  [<ffffffff8dcc1c31>] kthread+0xd1/0xe0
      [1708550.857688]  [<ffffffff8dcc1b60>] ? insert_kthread_work+0x40/0x40
      [1708550.863956]  [<ffffffff8e374c24>] ret_from_fork_nospec_begin+0xe/0x21
      [1708550.870567]  [<ffffffff8dcc1b60>] ? insert_kthread_work+0x40/0x40
       


          Activity

            [LU-11939] ASSERTION( tgd->tgd_tot_granted >= ted->ted_grant ) on OSS
            pjones Peter Jones added a comment -

            Mike confirms that this is a duplicate of LU-12120


            pfarrell Patrick Farrell (Inactive) added a comment -

            Yes, that looks like the right one. Do you agree that should take care of this issue as well?

            tappro Mikhail Pershin added a comment -

            Patrick, do you mean the patch from LU-12120?

            pfarrell Patrick Farrell (Inactive) added a comment -

            tappro: Mike, didn't you fix this grant bug in another LU? I can't find it right now...

            pfarrell Patrick Farrell (Inactive) added a comment -

            Nah, we've still got a patch to track under this.
            pjones Peter Jones added a comment -

            So ok to close this one as a duplicate of LU-11919?


            sthiell Stephane Thiell added a comment -

            OK, we had never noticed that before (with 2.10 clients). Thanks for your help! I used set_param -P on the MGS of Fir to set max_dirty_mb to 256 and it worked:

            lctl set_param -P osc.*.max_dirty_mb=256
            
            [root@sh-ln06 ~]# lctl get_param osc.*.max_dirty_mb
            osc.fir-OST0000-osc-ffff9bad01395000.max_dirty_mb=256
            osc.fir-OST0001-osc-ffff9bad01395000.max_dirty_mb=256
            ...
            osc.fir-OST002e-osc-ffff9bad01395000.max_dirty_mb=256
            osc.fir-OST002f-osc-ffff9bad01395000.max_dirty_mb=256
            osc.oak-OST0000-osc-ffff9baceaa3d800.max_dirty_mb=256
            osc.oak-OST0001-osc-ffff9baceaa3d800.max_dirty_mb=256
            ...
            osc.oak-OST0070-osc-ffff9baceaa3d800.max_dirty_mb=256
            osc.oak-OST0071-osc-ffff9baceaa3d800.max_dirty_mb=256
            osc.regal-OST0000-osc-ffff9bace6e28800.max_dirty_mb=256
            osc.regal-OST0001-osc-ffff9bace6e28800.max_dirty_mb=256
            osc.regal-OST0002-osc-ffff9bace6e28800.max_dirty_mb=256
            ...
            osc.regal-OST006b-osc-ffff9bace6e28800.max_dirty_mb=256
            

            So that should be much better. I'll report any new occurrences of this issue, but so far so good. Thanks again.


            pfarrell Patrick Farrell (Inactive) added a comment -

            Hmm, I'm not familiar with the script, so I don't really know. I don't think so, though...?

            It's possible you're hitting:
            https://jira.whamcloud.com/browse/LU-11919

            Which is basically "cl_max_dirty_mb is supposed to start at zero, but instead starts with whatever was in memory".  Then, whatever was in memory is processed like it was a setting from userspace.  So if it's not zero (the most likely case, especially at startup), it's reasonably likely (though not guaranteed - it's more complicated than just "existing value in memory is > 2000 means 2000") to get set to the max.

            Anyway, you can override that with a set_param -P.
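            For reference, the persistent override mentioned here is the same command Stephane reports running on the Fir MGS in the comment above (256 MiB is the value chosen for that site, not a mandated default):

            lctl set_param -P osc.*.max_dirty_mb=256
            # then confirm on a client that the tunable actually changed
            lctl get_param osc.*.max_dirty_mb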


            sthiell Stephane Thiell added a comment -

            Wow, thanks so much for the detailed explanation. This is SUPER helpful. But... I don't think we have explicitly changed the value of max_dirty_mb, so I've been trying to track down why it is so high for ALL of our Lustre filesystems mounted on Sherlock (regal, oak and fir). If I understand correctly, /sys/fs/lustre/max_dirty_mb is used by the udev rule provided by the lustre-client RPM, right? And then the values probably max out at 2000?

            [root@sh-ln06 ~]# cat /sys/fs/lustre/version
            2.12.0
            [root@sh-ln06 ~]# cat /sys/fs/lustre/max_dirty_mb 
            32107
            [root@sh-ln06 ~]# ls -l /sys/fs/lustre/max_dirty_mb
            -rw-r--r-- 1 root root 4096 Feb  8 16:54 /sys/fs/lustre/max_dirty_mb
            [root@sh-ln06 ~]# cat /etc/udev/rules.d/99-lustre.rules
            KERNEL=="obd", MODE="0666"
            
            # set sysfs values on client
            SUBSYSTEM=="lustre", ACTION=="change", ENV{PARAM}=="?*", RUN+="/usr/sbin/lctl set_param '$env{PARAM}=$env{SETTING}'"
            
            
            [root@sh-ln06 ~]# rpm -q --info lustre-client
            Name        : lustre-client
            Version     : 2.12.0
            Release     : 1.el7
            Architecture: x86_64
            Install Date: Tue 05 Feb 2019 05:17:58 PM PST
            Group       : System Environment/Kernel
            Size        : 2007381
            License     : GPL
            Signature   : (none)
            Source RPM  : lustre-client-2.12.0-1.el7.src.rpm
            Build Date  : Fri 21 Dec 2018 01:53:18 PM PST
            Build Host  : trevis-307-el7-x8664-3.trevis.whamcloud.com
            Relocations : (not relocatable)
            URL         : https://wiki.whamcloud.com/
            Summary     : Lustre File System
            Description :
            Userspace tools and files for the Lustre file system.
            
            
            [root@sh-ln06 ~]# lctl get_param osc.*.max_dirty_mb
            osc.fir-OST0000-osc-ffff9bad01395000.max_dirty_mb=2000
            osc.fir-OST0001-osc-ffff9bad01395000.max_dirty_mb=2000
            ...
            osc.fir-OST002e-osc-ffff9bad01395000.max_dirty_mb=2000
            osc.fir-OST002f-osc-ffff9bad01395000.max_dirty_mb=2000
            osc.oak-OST0000-osc-ffff9baceaa3d800.max_dirty_mb=2000
            osc.oak-OST0001-osc-ffff9baceaa3d800.max_dirty_mb=2000
            ...
            osc.oak-OST0071-osc-ffff9baceaa3d800.max_dirty_mb=2000
            osc.regal-OST0000-osc-ffff9bace6e28800.max_dirty_mb=2000
            ...
            osc.regal-OST006a-osc-ffff9bace6e28800.max_dirty_mb=2000
            osc.regal-OST006b-osc-ffff9bace6e28800.max_dirty_mb=2000
            

             


            pfarrell Patrick Farrell (Inactive) added a comment -

            Pages per RPC isn't a big deal (so, no, probably not), but max_dirty_mb may be.

            Hmm, 2000 is not the default (the default is, I think, max_rpcs_in_flight * RPC size, which is certainly much smaller than this value), so that's getting set somewhere.

            It's also potentially too high.  max_dirty_mb is used in calculating grant requests, and there are some grant overflow bugs that occur when it's set that high.  (Particularly with 16 MiB RPCs.)  All the ones I know of are fixed in 2.12, but...

             

            I strongly suspect you may have hit an overflow, leading to the grant inconsistency, leading to this crash. 

            The grant value reported for this client's export is negative: ted_grant -49152 in one case, ted_grant -12582912 in the other. These smallish negative values strongly suggest an overflow. The server-side value it is compared against (tot_granted) is unsigned, and the comparison with this negative value is why the "total grant >= grant for this export" assertion failed. (The fact that your max_dirty_mb is at 2 GiB just makes the overflow explanation more likely.)
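            A quick standalone illustration of that signed/unsigned mismatch (bash arithmetic only, not the actual Lustre code; values taken from the LBUG message above):

            # a negative ted_grant reinterpreted as an unsigned 64-bit value
            printf 'ted_grant -49152 as u64: %u\n' -49152
            # -> 18446744073709502464, vastly larger than tot_granted 50041695803,
            #    which is why the "total grant >= per-export grant" check trips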

             

            Your max_dirty_mb is well above the point where it helps performance, so tuning it down is a good idea.

            The rule I use for max_dirty_mb is 2 * mb_per_rpc * rpcs_in_flight - the idea being that you can accumulate some dirty data so you're always ready to make an RPC when one completes, but you don't have tons of dirty data sitting around if it isn't getting processed fast enough. (There are some docs floating around that say 4*, but with RPC sizes and counts increasing, that tends to be too much data. 2* should be plenty for good performance.)

            So for you, that's 2*16*8 = 256, or in the case of your RBH nodes, that's 2*16*32=1024.

            So I'd suggest turning down your max_dirty_mb to no more than 1 GiB.
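            A small sketch of that rule of thumb, assuming 4 KiB pages and the fir OSC device names shown elsewhere in this ticket; it reads the live tunables from one OSC and prints the suggested ceiling:

            # rule of thumb: max_dirty_mb = 2 * mb_per_rpc * rpcs_in_flight
            pages=$(lctl get_param -n osc.fir-OST0000-osc-*.max_pages_per_rpc)
            rif=$(lctl get_param -n osc.fir-OST0000-osc-*.max_rpcs_in_flight)
            mb_per_rpc=$((pages * 4096 / 1048576))    # 4096 pages of 4 KiB -> 16 MiB per RPC
            echo "suggested max_dirty_mb: $((2 * mb_per_rpc * rif))"    # 2*16*8 = 256 here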


            sthiell Stephane Thiell added a comment -

            Thanks Patrick!

            On our clients, we have:

            • max_rpcs_in_flight=8 (default), only the data transfer nodes and robinhood server have max_rpcs_in_flight=32
            • max_dirty_mb=2000 (default), only the data transfer nodes and robinhood server have max_dirty_mb=128

            As for max_pages_per_rpc, it should be set to 4096 and brw_size=16, but I noticed that it doesn't seem to be the case on all clients:

            [root@sh-101-20 ~]# cd /proc/fs/lustre/osc; for o in fir-*; do echo -n "$o:"; cat $o/max_pages_per_rpc; done
            fir-OST0000-osc-ffff9d0cad3de000:4096
            fir-OST0001-osc-ffff9d0cad3de000:4096
            fir-OST0002-osc-ffff9d0cad3de000:1024
            fir-OST0003-osc-ffff9d0cad3de000:1024
            fir-OST0004-osc-ffff9d0cad3de000:1024
            fir-OST0005-osc-ffff9d0cad3de000:1024
            fir-OST0006-osc-ffff9d0cad3de000:1024
            fir-OST0007-osc-ffff9d0cad3de000:4096
            fir-OST0008-osc-ffff9d0cad3de000:1024
            fir-OST0009-osc-ffff9d0cad3de000:1024
            fir-OST000a-osc-ffff9d0cad3de000:1024
            fir-OST000b-osc-ffff9d0cad3de000:1024
            fir-OST000c-osc-ffff9d0cad3de000:1024
            fir-OST000d-osc-ffff9d0cad3de000:1024
            fir-OST000e-osc-ffff9d0cad3de000:1024
            fir-OST000f-osc-ffff9d0cad3de000:1024
            fir-OST0010-osc-ffff9d0cad3de000:4096
            fir-OST0011-osc-ffff9d0cad3de000:4096
            fir-OST0012-osc-ffff9d0cad3de000:4096
            fir-OST0013-osc-ffff9d0cad3de000:1024
            fir-OST0014-osc-ffff9d0cad3de000:1024
            fir-OST0015-osc-ffff9d0cad3de000:4096
            fir-OST0016-osc-ffff9d0cad3de000:1024
            fir-OST0017-osc-ffff9d0cad3de000:1024
            fir-OST0018-osc-ffff9d0cad3de000:4096
            fir-OST0019-osc-ffff9d0cad3de000:4096
            fir-OST001a-osc-ffff9d0cad3de000:1024
            fir-OST001b-osc-ffff9d0cad3de000:1024
            fir-OST001c-osc-ffff9d0cad3de000:4096
            fir-OST001d-osc-ffff9d0cad3de000:4096
            fir-OST001e-osc-ffff9d0cad3de000:1024
            fir-OST001f-osc-ffff9d0cad3de000:1024
            fir-OST0020-osc-ffff9d0cad3de000:4096
            fir-OST0021-osc-ffff9d0cad3de000:4096
            fir-OST0022-osc-ffff9d0cad3de000:1024
            fir-OST0023-osc-ffff9d0cad3de000:1024
            fir-OST0024-osc-ffff9d0cad3de000:1024
            fir-OST0025-osc-ffff9d0cad3de000:1024
            fir-OST0026-osc-ffff9d0cad3de000:1024
            fir-OST0027-osc-ffff9d0cad3de000:1024
            fir-OST0028-osc-ffff9d0cad3de000:1024
            fir-OST0029-osc-ffff9d0cad3de000:1024
            fir-OST002a-osc-ffff9d0cad3de000:4096
            fir-OST002b-osc-ffff9d0cad3de000:4096
            fir-OST002c-osc-ffff9d0cad3de000:4096
            fir-OST002d-osc-ffff9d0cad3de000:1024
            fir-OST002e-osc-ffff9d0cad3de000:1024
            fir-OST002f-osc-ffff9d0cad3de000:1024
            

            We used this on the MGS:

            lctl set_param -P fir-OST*.osc.max_pages_per_rpc=4096
            

            So this isn't good. I just re-applied this command on the MGS:

            [138882.544463] Lustre: Modifying parameter osc.fir-OST*.osc.max_pages_per_rpc in log params
            

            and a newly mounted client is now set up at 4096. Do you think that could have caused this issue?

            As for brw_size:

            # clush -w @oss -b 'cat /proc/fs/lustre/obdfilter/*/brw_size'
            ---------------
            fir-io[1-4]-s[1-2] (8)
            ---------------
            16
            16
            16
            16
            16
            16
            

            People

              Assignee: tappro Mikhail Pershin
              Reporter: sthiell Stephane Thiell
              Votes: 0
              Watchers: 7
