Details
-
Bug
-
Resolution: Cannot Reproduce
-
Minor
-
None
-
Lustre 2.11.0
-
None
-
Kraken cluster,
2 OSS, 8 OSTs
2 MDS, 4 MDTs
1 client
lustre version - 2.10.55 + dom
branch: lustre-reviews
build - 52057
-
3
-
9223372036854775807
Description
While running FIO on the setup described above with the command below, files in the directory /mnt/lustre/xxx became corrupted. With nrfiles=256 instead of nrfiles=512, the same run works fine.
fio --name=smallio --ioengine=posixaio --iodepth=32 --directory=/mnt/lustre/dom3 --nrfiles=512 --openfiles=10000 --numjobs=8 --filesize=64k --lockfile=readwrite --bs=4k --rw=randread --buffered=1 --bs_unaligned=1 --file_service_type=random --randrepeat=0 --norandommap --group_reporting=1 --loops=4
[root@kapollo04 lustre]# rm -rf dom3
rm: cannot remove ‘dom3’: Directory not empty
client dmesg
[227470.685094] LustreError: 15069:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.686839] LustreError: 15067:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.688803] LustreError: 15069:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.690502] LustreError: 15070:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.692567] LustreError: 15068:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.694514] LustreError: 15067:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.696363] LustreError: 15070:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.698380] LustreError: 15069:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.700589] LustreError: 15068:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.702449] LustreError: 15067:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.704257] LustreError: 15068:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.706338] LustreError: 15069:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.708125] LustreError: 15067:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.710179] LustreError: 15069:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227470.712075] LustreError: 15068:0:(events.c:199:client_bulk_callback()) event type 2, status -90, desc ffff880eaafd7c00
[227471.546843] LustreError: 12768:0:(mdc_request.c:944:mdc_getpage()) lustre-MDT0000-mdc-ffff88105e0f6800: too many resend retries: rc = -5
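For reference, the negative status and rc values in the client log are Linux errno codes. The following is a minimal sketch for decoding them, assuming the standard Linux errno numbering (5 is EIO, 90 is EMSGSIZE); the helper name `decode` and the lookup table are illustrative only and are not part of Lustre.

```python
import re

# Linux errno values seen in the client log (standard Linux numbering).
# This table is an illustrative assumption, not taken from Lustre itself.
LINUX_ERRNO = {5: "EIO", 90: "EMSGSIZE"}

# Matches "status -90" or "rc = -5" in a Lustre console message.
CODE_RE = re.compile(r"(?:status|rc =) -(\d+)")


def decode(line):
    """Return the symbolic errno names found in one log line."""
    return [LINUX_ERRNO.get(int(n), "unknown(%s)" % n)
            for n in CODE_RE.findall(line)]


sample = [
    "LustreError: 15069:0:(events.c:199:client_bulk_callback()) "
    "event type 2, status -90, desc ffff880eaafd7c00",
    "LustreError: 12768:0:(mdc_request.c:944:mdc_getpage()) "
    "lustre-MDT0000-mdc-ffff88105e0f6800: too many resend retries: rc = -5",
]

for line in sample:
    print(decode(line))
```

Under that assumption, the repeated client_bulk_callback() errors decode to EMSGSIZE and the final mdc_getpage() failure to EIO.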
MDS dmesg
[259415.913026] LustreError: 137-5: nvmefs-MDT0001_UUID: not available for connect from 192.168.213.233@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[259415.913667] LustreError: Skipped 71 previous similar messages
[259502.137146] LustreError: 20014:0:(ldlm_lib.c:3208:target_bulk_io()) @@@ timeout on bulk READ after 100+0s req@ffff881029e1f450 x1583747470242320/t0(0) o37->24b31bec-af52-1a41-a067-af1c7d84e837@192.168.213.218@o2ib:597/0 lens 568/440 e 3 to 0 dl 1510613657 ref 1 fl Interpret:/2/0 rc 0/0
[260015.863227] LustreError: 137-5: nvmefs-MDT0000_UUID: not available for connect from 192.168.213.233@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[260015.863971] LustreError: Skipped 71 previous similar messages
[260643.179888] LustreError: 137-5: nvmefs-MDT0000_UUID: not available for connect from 192.168.213.126@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[260643.180541] LustreError: Skipped 73 previous similar messages
lustre version - 2.10.55 + dom
branch: lustre-reviews
build - 52057
This should be the same as lustre-master build 3671.
This needs to be investigated.
Issue Links
- is related to LU-10180 DoM technical debts (Resolved)