Details
- Bug
- Resolution: Fixed
- Minor
Description
This relates to these crashes in sanity test 270a: https://knox.linuxhacker.ru/crashdb_ui_external.py.cgi?newid=68931
A file is created with a requested stripe count of 4, but one OST gets deactivated, so only 3 stripes are created. This is a race condition: if the OST is deactivated at the wrong moment, the stripe count is not updated to match the stripes actually allocated, and a crash follows later.
Here are relevant lines from the debug log prior to this crash:
00020000:00000001:4.0:1689017245.043107:0:7535:0:(lod_qos.c:2686:lod_qos_prep_create()) Process entered
...
00020000:00000001:4.0:1689017245.043110:0:7535:0:(lod_qos.c:2088:lod_get_stripe_count()) Process leaving (rc=4 : 4 : 4)
00020000:00000010:4.0:1689017245.043118:0:7535:0:(lod_qos.c:2723:lod_qos_prep_create()) kmalloced '(stripe)': 32 at ffff880295611e38.
00020000:00000010:4.0:1689017245.043124:0:7535:0:(lod_qos.c:2726:lod_qos_prep_create()) kmalloced '(ost_indices)': 16 at ffff8802d8267868.
00020000:00001000:4.0:1689017245.043125:0:7535:0:(lod_qos.c:2734:lod_qos_prep_create()) tgt_count 4 stripe_count 4
...
00020000:00000001:4.0:1689017245.043136:0:7535:0:(lod_qos.c:1533:lod_ost_alloc_qos()) Process entered
...
00020000:00000001:4.0:1689017245.043147:0:7535:0:(lod_qos.c:109:lod_statfs_and_check()) Process entered
00000004:00000001:4.0:1689017245.043149:0:7535:0:(osp_dev.c:795:osp_statfs()) Process entered
00000004:00001000:4.0:1689017245.043150:0:7535:0:(osp_dev.c:815:osp_statfs()) lustre-OST0000-osc-MDT0000: blocks=61184, bfree=1024, bavail=0, bsize=4096, reserved_mb_low=1, reserved_mb_high=3, files=35818, ffree=128, state=20
00000004:00000001:4.0:1689017245.043153:0:7535:0:(osp_dev.c:833:osp_statfs()) Process leaving (rc=0 : 0 : 0)
00020000:01000000:4.0:1689017245.043154:0:7535:0:(lod_qos.c:141:lod_statfs_and_check()) lustre-OST0000-osc-MDT0000: turns inactive
00020000:00000001:4.0:1689017245.043155:0:7535:0:(lod_qos.c:168:lod_statfs_and_check()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
...
00020000:00001000:4.0:1689017245.043173:0:7535:0:(lod_qos.c:1639:lod_ost_alloc_qos()) found 3 good osts
...
# there are only 3 of these lines that actually allocated stripes...
00000004:00000010:4.0:1689017245.043188:0:7535:0:(osp_dev.c:118:osp_object_alloc()) slab-alloced 'o': 456 at ffff8801a38115b0.
...
00020000:00000001:4.0:1689017245.043460:0:7535:0:(lod_qos.c:1771:lod_ost_alloc_qos()) Process leaving (rc=0 : 0 : 0)
00020000:00000001:4.0:1689017245.043462:0:7535:0:(lod_qos.c:2820:lod_qos_prep_create()) Process leaving (rc=0 : 0 : 0)
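The "Process leaving" entries print rc three ways: as an unsigned 64-bit value, as a signed value, and in hex. As a minimal sketch (the helper name rc_as_signed is made up for illustration and is not part of Lustre), the following C shows how the large unsigned value logged by lod_statfs_and_check() maps to -28, which is -ENOSPC on Linux:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: reinterpret the unsigned 64-bit rc printed in the
 * debug log as a signed value. memcpy avoids relying on
 * implementation-defined unsigned-to-signed conversion.
 */
static int64_t rc_as_signed(uint64_t raw)
{
	int64_t rc;

	memcpy(&rc, &raw, sizeof(rc));
	return rc;
}
```

Passing 18446744073709551588 (hex ffffffffffffffe4) yields -28, matching the middle field of the log entry.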
In most cases, lod_ost_alloc_qos() returns -EAGAIN when it can't allocate enough stripes, and lod_ost_alloc_rr() is then called.
lod_ost_alloc_rr() adjusts lod_comp->llc_stripe_count when fewer stripes are allocated than requested.
However, in this case, if an OST is deactivated after the call to ltd_qos_is_usable() on line 1592 but before lod_statfs_and_check() on line 1615, we can end up with fewer stripes than requested while lod_ost_alloc_qos() still returns 0 rather than -EAGAIN, so llc_stripe_count is never reduced to the right value.
This can happen as long as the number of available OSTs is greater than stripe_count_min (but less than stripe_count).
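The bookkeeping problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: struct toy_comp, toy_alloc_qos() and the fix_count flag are hypothetical names invented here, not Lustre code, and clamping the recorded count is just one plausible remedy, not necessarily what the actual patch does. The usable[] array stands in for the result of the late lod_statfs_and_check()-style test, and the -EAGAIN threshold (found < stripe_count_min) is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-component layout state (not Lustre's struct). */
struct toy_comp {
	unsigned int stripe_count;     /* requested count, later used as the recorded count */
	unsigned int stripe_count_min; /* fewest stripes the caller will accept */
};

/*
 * Toy QOS-style allocator. usable[i] tells whether OST i still passes the
 * late usability check. Stores the number of stripes actually allocated in
 * *allocated and returns 0 on success or -EAGAIN if too few were found.
 * With fix_count == false the recorded stripe_count is left stale, which is
 * the situation the report describes.
 */
static int toy_alloc_qos(struct toy_comp *comp, const bool *usable,
			 size_t nr_osts, bool fix_count,
			 unsigned int *allocated)
{
	unsigned int found = 0;
	size_t i;

	assert(comp != NULL && usable != NULL && allocated != NULL);

	for (i = 0; i < nr_osts && found < comp->stripe_count; i++) {
		if (usable[i])
			found++;
	}
	*allocated = found;

	/* Assumed threshold: below the minimum, let the caller fall back. */
	if (found < comp->stripe_count_min)
		return -EAGAIN;

	/* Assumed remedy: keep the recorded count in sync with reality. */
	if (fix_count && found < comp->stripe_count)
		comp->stripe_count = found;

	return 0;
}
```

With 4 OSTs of which one has just become unusable, a requested stripe_count of 4 and a stripe_count_min of 1, the call returns 0 with 3 stripes allocated. Without the clamp, stripe_count still reads 4, so later code that trusts it would look for a fourth stripe that was never created.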
Here is how to reproduce. There might be a more elegant way to reproduce this but this works for me...
# this is just to make one OST usage higher so that the QOS algorithm is used instead of RR
lfs setstripe -i0 -c1 /mnt/lustre/bigfile && head --bytes=$((1024 * 100000)) /dev/zero > /mnt/lustre/bigfile

# on my setup (single VM, 2 MDT, 4 OST), this triggers the LBUG pretty reliably within a few hundred loops
for i in {0..500}; do
    lctl set_param osp.lustre-OST0000-osc-MDT0000.max_create_count=0 &
    lfs setstripe -c -1 /mnt/lustre/f$i
    lctl set_param osp.lustre-OST0000-osc-MDT0000.max_create_count=1000 &
    lfs setstripe -c -1 /mnt/lustre/g$i
done
Issue Links
- is related to LU-16623 lod_statfs_and_check() does not skip unusable OSTs (Resolved)