[LU-14187] LU-14134 causes many racer timeouts Created: 07/Dec/20 Updated: 01/Feb/21 Resolved: 05/Jan/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alex Zhuravlev | Assignee: | Wang Shilong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: | local testing |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Without the patch, only 3 of 36 racer runs timed out; with the patch, 35 of 36 did. |
| Comments |
| Comment by Alex Zhuravlev [ 07/Dec/20 ] |
PID: 1677   TASK: ffff88012f27e000  CPU: 1  COMMAND: "ll_ost_io00_030"
 #0 [ffff880105c5ba80] __schedule at ffffffff816af9ad
    /home/lustre/linux-4.18.0-32.el8/kernel/sched/core.c: 2853
 #1 [ffff880105c5bb18] schedule at ffffffff816b0234
    /home/lustre/linux-4.18.0-32.el8/./arch/x86/include/asm/preempt.h: 81
 #2 [ffff880105c5bb28] rwsem_down_write_failed at ffffffff816b479b
    /home/lustre/linux-4.18.0-32.el8/./arch/x86/include/asm/current.h: 15
 #3 [ffff880105c5bbb8] call_rwsem_down_write_failed at ffffffff816abb13
    /home/lustre/linux-4.18.0-32.el8/arch/x86/lib/rwsem.S: 118
 #4 [ffff880105c5bbf8] down_write at ffffffff816b3a34
    /home/lustre/linux-4.18.0-32.el8/./arch/x86/include/asm/rwsem.h: 142
 #5 [ffff880105c5bc18] osd_trunc_lock at ffffffffa0b2da21 [osd_ldiskfs]
    /home/lustre/master-mine/lustre/osd-ldiskfs/osd_io.c: 2530
 #6 [ffff880105c5bc48] osd_declare_punch at ffffffffa0b2e0fe [osd_ldiskfs]
    /home/lustre/master-mine/libcfs/include/libcfs/libcfs_debug.h: 155
 #7 [ffff880105c5bc70] ofd_object_punch at ffffffffa0e824f9 [ofd]
    /home/lustre/master-mine/lustre/include/dt_object.h: 2570
 #8 [ffff880105c5bd48] ofd_punch_hdl at ffffffffa0e6c54c [ofd]
    /home/lustre/master-mine/lustre/ofd/ofd_dev.c: 2125

PID: 6515   TASK: ffff88012c996000  CPU: 1  COMMAND: "ll_ost_io00_006"
 #0 [ffff88010d40b9a8] __schedule at ffffffff816af9ad
    /home/lustre/linux-4.18.0-32.el8/kernel/sched/core.c: 2853
 #1 [ffff88010d40ba40] schedule at ffffffff816b0234
    /home/lustre/linux-4.18.0-32.el8/./arch/x86/include/asm/preempt.h: 81
 #2 [ffff88010d40ba50] wait_transaction_locked at ffffffff812ef568
    /home/lustre/linux-4.18.0-32.el8/fs/jbd2/transaction.c: 160
 #3 [ffff88010d40baa8] add_transaction_credits at ffffffff812ef700
    /home/lustre/linux-4.18.0-32.el8/fs/jbd2/transaction.c: 188
 #4 [ffff88010d40bb00] start_this_handle at ffffffff812efa1c
    /home/lustre/linux-4.18.0-32.el8/fs/jbd2/transaction.c: 357
 #5 [ffff88010d40bb80] osd_write_commit at ffffffffa0b2ccc0 [osd_ldiskfs]
    /home/lustre/master-mine/lustre/osd-ldiskfs/osd_io.c: 1476
 #6 [ffff88010d40bbf8] ofd_commitrw at ffffffffa0e892df [ofd]
    /home/lustre/master-mine/lustre/include/dt_object.h: 2530
 #7 [ffff88010d40bcb0] tgt_brw_write at ffffffffa05c756c [ptlrpc]
    /home/lustre/master-mine/libcfs/include/libcfs/libcfs_debug.h: 155

basically |
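Read together, the traces suggest an ordering inversion between the jbd2 transaction and the trunc rwsem: ll_ost_io00_030 blocks in down_write() inside osd_trunc_lock() on the punch path, while ll_ost_io00_006 blocks in start_this_handle() inside osd_write_commit(). One plausible shape of that inversion, modelled as a minimal userspace sketch with pthread primitives standing in for the jbd2 handle and the trunc rwsem (all names here are illustrative, not Lustre code):

#include <pthread.h>

/* Stand-ins: "journal" models the shared jbd2 transaction state,
 * "trunc" models the per-object truncate rwsem. */
static pthread_mutex_t journal = PTHREAD_MUTEX_INITIALIZER;
static pthread_rwlock_t trunc = PTHREAD_RWLOCK_INITIALIZER;

/* Thread A: joins the transaction first, then wants the trunc lock
 * (the punch direction in the traces above). */
static void *punch_path(void *arg)
{
	pthread_mutex_lock(&journal);
	pthread_rwlock_wrlock(&trunc);   /* blocks if B holds trunc */
	pthread_rwlock_unlock(&trunc);
	pthread_mutex_unlock(&journal);
	return NULL;
}

/* Thread B: takes the trunc lock first, then wants to join the
 * transaction (the write-commit direction in the traces above). */
static void *write_path(void *arg)
{
	pthread_rwlock_wrlock(&trunc);
	pthread_mutex_lock(&journal);    /* blocks if A holds journal */
	pthread_mutex_unlock(&journal);
	pthread_rwlock_unlock(&trunc);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, punch_path, NULL);
	pthread_create(&b, NULL, write_path, NULL);
	pthread_join(a, NULL);   /* with unlucky timing, never returns */
	pthread_join(b, NULL);
	return 0;
}

With unlucky scheduling this program hangs the same way the two ll_ost_io service threads do, which is why the failure shows up as racer timeouts rather than crashes.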
| Comment by Wang Shilong (Inactive) [ 08/Dec/20 ] |
|
bzzz Thanks for helping track it down. |
| Comment by Alex Zhuravlev [ 14/Dec/20 ] |
|
While the trunc lock was involved in this specific example, I think the issue is not directly related to it: regular dt locks (e.g. ofd_write_lock() and ofd_read_lock()) are taken within a running transaction, and a transaction restart breaks this. Actually, that is the reason why truncate was implemented via osd_trans_stop(). |
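To make the hazard concrete: the unsafe shape is a dt lock held across a transaction restart, so the lock's critical section spans two jbd2 transactions; the truncate-style shape stops the transaction before any retry, so the restart never happens under the lock. A schematic C sketch of the two shapes, using hypothetical stub helpers (trans_start()/trans_stop()/dt_write_lock_obj() are stand-ins, not the real Lustre signatures):

#include <stdbool.h>

/* Hypothetical stand-ins; only the ordering matters, not the
 * signatures. dt_write_lock_obj() models ofd_write_lock(),
 * trans_start()/trans_stop() model the jbd2 handle lifetime. */
static void dt_write_lock_obj(void)   { }
static void dt_write_unlock_obj(void) { }
static void trans_start(void)         { }
static void trans_stop(void)          { }
static bool do_one_chunk(void)        { return true; }

/* Unsafe shape: the dt lock is held across a stop+start, so its
 * critical section spans two transactions; a peer thread doing the
 * opposite ordering can deadlock against the journal. */
void unsafe_io(void)
{
	dt_write_lock_obj();
	trans_start();
	trans_stop();            /* restart while still locked: the hazard */
	trans_start();
	trans_stop();
	dt_write_unlock_obj();
}

/* Truncate-style shape: each transaction is fully stopped before any
 * retry, so no thread ever waits on the journal while holding the
 * lock -- the reason truncate went through osd_trans_stop(). */
void truncate_style_io(void)
{
	bool done;

	do {
		trans_start();
		dt_write_lock_obj();
		done = do_one_chunk();
		dt_write_unlock_obj();
		trans_stop();    /* a restart happens with the lock dropped */
	} while (!done);
}

int main(void)
{
	unsafe_io();
	truncate_style_io();
	return 0;
}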
| Comment by Wang Shilong (Inactive) [ 15/Dec/20 ] |
|
What do you think: should we move the IO restart to an upper layer (the OFD layer), or should we implement it in a way similar to truncate? |
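For reference, "restart in the OFD layer" could look like the retry loop below: the lower layer reports that it had to abandon its transaction, the OFD drops the dt lock, and the whole operation is redone from the top, so no lock is ever held across the restart. A minimal sketch with hypothetical stand-ins (ofd_io_lock()/osd_io_once() are illustrative, not the actual interfaces):

#include <errno.h>

/* Hypothetical stand-ins: ofd_io_lock()/ofd_io_unlock() model the
 * ofd_write_lock()/ofd_read_lock() pair; osd_io_once() models one
 * lower-layer attempt that may have to abandon its transaction. */
static void ofd_io_lock(void)   { }
static void ofd_io_unlock(void) { }
static int  osd_io_once(void)   { return 0; /* or -ERESTART */ }

/* Restart driven from the OFD layer: if the OSD reports it gave up
 * the transaction, the dt lock is dropped and the operation is redone
 * from the top, so the restart never happens under the lock.
 * (-ERESTART is just one possible signalling convention.) */
int ofd_io_with_restart(void)
{
	int rc;

	do {
		ofd_io_lock();
		rc = osd_io_once();
		ofd_io_unlock();          /* lock dropped before any retry */
	} while (rc == -ERESTART);

	return rc;
}

int main(void)
{
	return ofd_io_with_restart();
}

The design point is that the retry boundary sits above every lock acquisition, which keeps the lock/transaction ordering identical on the first attempt and on every retry.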
| Comment by Wang Shilong (Inactive) [ 15/Dec/20 ] |
|
It looks like, currently, some ofd_write_lock() calls are made outside a transaction and some inside, so maybe that is the reason it could time out even without the patch. |
| Comment by Alex Zhuravlev [ 15/Dec/20 ] |
I found this only in the fallocate path, which landed just recently; this is clearly a bug. |
| Comment by Wang Shilong (Inactive) [ 15/Dec/20 ] |
|
bzzz yup, I saw the patch you pushed; I'll try to fix this by moving the restart to the OFD layer. |
| Comment by Gerrit Updater [ 16/Dec/20 ] |
|
Wang Shilong (wshilong@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40991 |
| Comment by Alex Zhuravlev [ 23/Dec/20 ] |
|
With the latest version I hit sanity-quota/3b very often:

== sanity-quota test 3b: Quota pools: Block soft limit (start timer, expires, stop timer) ============ 13:06:59 (1608710819)
limit 4 glbl_limit 8
grace 20 glbl_grace 40
User quota in qpool1(soft limit:4 MB grace:20 seconds)
Creating new pool
Pool lustre.qpool1 created
Adding targets to pool
OST lustre-OST0000_UUID added to pool lustre.qpool1
OST lustre-OST0001_UUID added to pool lustre.qpool1
Trying to set grace for pool qpool1
Trying to set quota for pool qpool1
Waiting for local destroys to complete
Creating test directory
fail_val=0
fail_loc=0
Write up to soft limit
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [count=4]
4+0 records in
4+0 records out
4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.205771 s, 20.4 MB/s
Write to exceed soft limit
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [bs=1K] [count=10] [seek=4096]
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.00531433 s, 1.9 MB/s
mmap write when over soft limit
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [multiop] [/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0.mmap] [OT40960SMW]
Disk quotas for usr quota_usr (uid 60000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre    4148    8192       0       -       2       0       0       -
lustre-MDT0000_UUID       0       -       0       -       2       -       0       -
lustre-OST0000_UUID    4108       -    4144       -       -       -       -       -
lustre-OST0001_UUID     40*       -      40       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 4184
Disk quotas for grp quota_usr (gid 60000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre    4148       0       0       -       2       0       0       -
lustre-MDT0000_UUID       0       -       0       -       2       -       0       -
lustre-OST0000_UUID    4108       -       0       -       -       -       -       -
lustre-OST0001_UUID      40       -       0       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 0
Disk quotas for prj 1000 (pid 1000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre       0       0       0       -       0       0       0       -
lustre-MDT0000_UUID       0       -       0       -       0       -       0       -
lustre-OST0000_UUID       0       -       0       -       -       -       -       -
lustre-OST0001_UUID       0       -       0       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 0
Block grace time: 40s; Inode grace time: 1w
Block grace time: 1w; Inode grace time: 1w
Block grace time: 1w; Inode grace time: 1w
Write before timer goes off
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [bs=1K] [count=10] [seek=5120]
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.011448 s, 894 kB/s
Quota info for qpool1:
Disk quotas for usr quota_usr (uid 60000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre   4160*    4096       0     20s       2       0       0       -
Sleep through grace ...
...sleep 25 seconds
Disk quotas for usr quota_usr (uid 60000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre    4160    8192       0       -       2       0       0       -
lustre-MDT0000_UUID       0       -       0       -       2       -       0       -
lustre-OST0000_UUID    4120       -    4144       -       -       -       -       -
lustre-OST0001_UUID     40*       -      40       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 4184
Disk quotas for grp quota_usr (gid 60000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre    4160       0       0       -       2       0       0       -
lustre-MDT0000_UUID       0       -       0       -       2       -       0       -
lustre-OST0000_UUID    4120       -       0       -       -       -       -       -
lustre-OST0001_UUID      40       -       0       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 0
Disk quotas for prj 1000 (pid 1000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre       0       0       0       -       0       0       0       -
lustre-MDT0000_UUID       0       -       0       -       0       -       0       -
lustre-OST0000_UUID       0       -       0       -       -       -       -       -
lustre-OST0001_UUID       0       -       0       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 0
Block grace time: 40s; Inode grace time: 1w
Block grace time: 1w; Inode grace time: 1w
Block grace time: 1w; Inode grace time: 1w
Write after timer goes off
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [bs=1K] [count=10] [seek=6144]
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.00177872 s, 5.8 MB/s
Write after cancel lru locks
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [bs=1K] [count=10] [seek=7168]
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.00480989 s, 2.1 MB/s
Disk quotas for usr quota_usr (uid 60000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre    4172    8192       0       -       2       0       0       -
lustre-MDT0000_UUID       0       -       0       -       2       -       0       -
lustre-OST0000_UUID    4132       -    4144       -       -       -       -       -
lustre-OST0001_UUID     40*       -      40       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 4184
Files for user (quota_usr):
  File: /mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0.mmap
  Size: 40960      Blocks: 80         IO Block: 4194304 regular file
Device: 2c54f966h/743766374d  Inode: 144115238826934298  Links: 1
Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)  Gid: (60000/quota_usr)
Access: 2020-12-23 13:07:15.000000000 +0500
Modify: 2020-12-23 13:07:15.000000000 +0500
Change: 2020-12-23 13:07:15.000000000 +0500
 Birth: 2020-12-23 13:07:15.000000000 +0500
  File: /mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0
  Size: 7350272    Blocks: 8264       IO Block: 4194304 regular file
Device: 2c54f966h/743766374d  Inode: 144115238826934296  Links: 1
Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)  Gid: (60000/quota_usr)
Access: 2020-12-23 13:07:14.000000000 +0500
Modify: 2020-12-23 13:07:41.000000000 +0500
Change: 2020-12-23 13:07:41.000000000 +0500
 Birth: 2020-12-23 13:07:14.000000000 +0500
Disk quotas for grp quota_usr (gid 60000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre    4172       0       0       -       2       0       0       -
lustre-MDT0000_UUID       0       -       0       -       2       -       0       -
lustre-OST0000_UUID    4132       -       0       -       -       -       -       -
lustre-OST0001_UUID      40       -       0       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 0
Files for group (quota_usr):
  File: /mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0.mmap
  Size: 40960      Blocks: 80         IO Block: 4194304 regular file
Device: 2c54f966h/743766374d  Inode: 144115238826934298  Links: 1
Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)  Gid: (60000/quota_usr)
Access: 2020-12-23 13:07:15.000000000 +0500
Modify: 2020-12-23 13:07:15.000000000 +0500
Change: 2020-12-23 13:07:15.000000000 +0500
 Birth: 2020-12-23 13:07:15.000000000 +0500
  File: /mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0
  Size: 7350272    Blocks: 8264       IO Block: 4194304 regular file
Device: 2c54f966h/743766374d  Inode: 144115238826934296  Links: 1
Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)  Gid: (60000/quota_usr)
Access: 2020-12-23 13:07:14.000000000 +0500
Modify: 2020-12-23 13:07:41.000000000 +0500
Change: 2020-12-23 13:07:41.000000000 +0500
 Birth: 2020-12-23 13:07:14.000000000 +0500
Disk quotas for prj 1000 (pid 1000):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre       0       0       0       -       0       0       0       -
lustre-MDT0000_UUID       0       -       0       -       0       -       0       -
lustre-OST0000_UUID       0       -       0       -       -       -       -       -
lustre-OST0001_UUID       0       -       0       -       -       -       -       -
Total allocated inode limit: 0, total allocated block limit: 0
Files for project (1000):
 sanity-quota test_3b: @@@@@@ FAIL: write success, but expect EDQUOT
Trace dump:
= ./../tests/test-framework.sh:6273:error()
= sanity-quota.sh:159:quota_error()
= sanity-quota.sh:1297:test_block_soft()
= sanity-quota.sh:1435:test_3b()
= ./../tests/test-framework.sh:6576:run_one()
= ./../tests/test-framework.sh:6623:run_one_logged()
= ./../tests/test-framework.sh:6450:run_test()
= sanity-quota.sh:1494:main()
Dumping lctl log to /tmp/ltest-logs/sanity-quota.test_3b.*.1608710861.log
Dumping logs only on local client.
Resetting fail_loc on all nodes...done.
Delete files...
Wait for unlink objects finished...
Waiting for local destroys to complete
Destroy the created pools: qpool1
lustre.qpool1
OST lustre-OST0000_UUID removed from pool lustre.qpool1
OST lustre-OST0001_UUID removed from pool lustre.qpool1
Pool lustre.qpool1 destroyed
FAIL 3b (85s) |
| Comment by Wang Shilong (Inactive) [ 23/Dec/20 ] |
|
Alex, my first guess is that this might be a problem similar to |
| Comment by Alex Zhuravlev [ 23/Dec/20 ] |
|
wshilong I can't reproduce that with a clean master (local setup), but it may be that some unlucky conditions/timings are exposed better with the patch. |
| Comment by Wang Shilong (Inactive) [ 23/Dec/20 ] |
|
Sure, would you help attach the debug logs for me? I could try to reproduce locally, but that would not be as fast as on your setup. |
| Comment by Gerrit Updater [ 05/Jan/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40991/ |
| Comment by Peter Jones [ 05/Jan/21 ] |
|
Landed for 2.14 |