Details
- Type: Bug
- Resolution: Fixed
- Priority: Minor
- Fix Version/s: None
- Labels: None
- Environment: local testing
- Severity: 3
Description
Without the patch, only 3 of 36 racer runs timed out. With the patch, 35 of 36 did.

Attachments
- bt-all.txt
- 558 kB
- serial.txt
- 66 kB
Issue Links
Activity
with the latest version I hit sanity-quota/3b very often:
== sanity-quota test 3b: Quota pools: Block soft limit (start timer, expires, stop timer) ============ 13:06:59 (1608710819)
limit 4 glbl_limit 8
grace 20 glbl_grace 40
User quota in qpool1(soft limit:4 MB grace:20 seconds)
Creating new pool
Pool lustre.qpool1 created
Adding targets to pool
OST lustre-OST0000_UUID added to pool lustre.qpool1
OST lustre-OST0001_UUID added to pool lustre.qpool1
Trying to set grace for pool qpool1
Trying to set quota for pool qpool1
Waiting for local destroys to complete
Creating test directory
fail_val=0
fail_loc=0
Write up to soft limit
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [count=4]
4+0 records in
4+0 records out
4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.205771 s, 20.4 MB/s
Write to exceed soft limit
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [bs=1K] [count=10] [seek=4096]
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.00531433 s, 1.9 MB/s
mmap write when over soft limit
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [multiop] [/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0.mmap] [OT40960SMW]
Disk quotas for usr quota_usr (uid 60000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre            4148   8192      0      -      2      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      2      -      0      -
    lustre-OST0000_UUID    4108      -   4144      -      -      -      -      -
    lustre-OST0001_UUID     40*      -     40      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 4184
Disk quotas for grp quota_usr (gid 60000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre            4148      0      0      -      2      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      2      -      0      -
    lustre-OST0000_UUID    4108      -      0      -      -      -      -      -
    lustre-OST0001_UUID      40      -      0      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 0
Disk quotas for prj 1000 (pid 1000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre               0      0      0      -      0      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      0      -      0      -
    lustre-OST0000_UUID       0      -      0      -      -      -      -      -
    lustre-OST0001_UUID       0      -      0      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 0
Block grace time: 40s; Inode grace time: 1w
Block grace time: 1w; Inode grace time: 1w
Block grace time: 1w; Inode grace time: 1w
Write before timer goes off
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [bs=1K] [count=10] [seek=5120]
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.011448 s, 894 kB/s
Quota info for qpool1:
Disk quotas for usr quota_usr (uid 60000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre           4160*   4096      0    20s      2      0      0      -
Sleep through grace ...
...sleep 25 seconds
Disk quotas for usr quota_usr (uid 60000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre            4160   8192      0      -      2      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      2      -      0      -
    lustre-OST0000_UUID    4120      -   4144      -      -      -      -      -
    lustre-OST0001_UUID     40*      -     40      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 4184
Disk quotas for grp quota_usr (gid 60000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre            4160      0      0      -      2      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      2      -      0      -
    lustre-OST0000_UUID    4120      -      0      -      -      -      -      -
    lustre-OST0001_UUID      40      -      0      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 0
Disk quotas for prj 1000 (pid 1000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre               0      0      0      -      0      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      0      -      0      -
    lustre-OST0000_UUID       0      -      0      -      -      -      -      -
    lustre-OST0001_UUID       0      -      0      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 0
Block grace time: 40s; Inode grace time: 1w
Block grace time: 1w; Inode grace time: 1w
Block grace time: 1w; Inode grace time: 1w
Write after timer goes off
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [bs=1K] [count=10] [seek=6144]
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.00177872 s, 5.8 MB/s
Write after cancel lru locks
running as uid/gid/euid/egid 60000/60000/60000/60000, groups: [dd] [if=/dev/zero] [of=/mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0] [bs=1K] [count=10] [seek=7168]
10+0 records in
10+0 records out
10240 bytes (10 kB, 10 KiB) copied, 0.00480989 s, 2.1 MB/s
Disk quotas for usr quota_usr (uid 60000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre            4172   8192      0      -      2      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      2      -      0      -
    lustre-OST0000_UUID    4132      -   4144      -      -      -      -      -
    lustre-OST0001_UUID     40*      -     40      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 4184
Files for user (quota_usr):
File: /mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0.mmap
  Size: 40960  Blocks: 80  IO Block: 4194304  regular file
  Device: 2c54f966h/743766374d  Inode: 144115238826934298  Links: 1
  Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)  Gid: (60000/quota_usr)
  Access: 2020-12-23 13:07:15.000000000 +0500
  Modify: 2020-12-23 13:07:15.000000000 +0500
  Change: 2020-12-23 13:07:15.000000000 +0500
  Birth: 2020-12-23 13:07:15.000000000 +0500
File: /mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0
  Size: 7350272  Blocks: 8264  IO Block: 4194304  regular file
  Device: 2c54f966h/743766374d  Inode: 144115238826934296  Links: 1
  Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)  Gid: (60000/quota_usr)
  Access: 2020-12-23 13:07:14.000000000 +0500
  Modify: 2020-12-23 13:07:41.000000000 +0500
  Change: 2020-12-23 13:07:41.000000000 +0500
  Birth: 2020-12-23 13:07:14.000000000 +0500
Disk quotas for grp quota_usr (gid 60000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre            4172      0      0      -      2      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      2      -      0      -
    lustre-OST0000_UUID    4132      -      0      -      -      -      -      -
    lustre-OST0001_UUID      40      -      0      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 0
Files for group (quota_usr):
File: /mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0.mmap
  Size: 40960  Blocks: 80  IO Block: 4194304  regular file
  Device: 2c54f966h/743766374d  Inode: 144115238826934298  Links: 1
  Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)  Gid: (60000/quota_usr)
  Access: 2020-12-23 13:07:15.000000000 +0500
  Modify: 2020-12-23 13:07:15.000000000 +0500
  Change: 2020-12-23 13:07:15.000000000 +0500
  Birth: 2020-12-23 13:07:15.000000000 +0500
File: /mnt/lustre/d3b.sanity-quota/f3b.sanity-quota-0
  Size: 7350272  Blocks: 8264  IO Block: 4194304  regular file
  Device: 2c54f966h/743766374d  Inode: 144115238826934296  Links: 1
  Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)  Gid: (60000/quota_usr)
  Access: 2020-12-23 13:07:14.000000000 +0500
  Modify: 2020-12-23 13:07:41.000000000 +0500
  Change: 2020-12-23 13:07:41.000000000 +0500
  Birth: 2020-12-23 13:07:14.000000000 +0500
Disk quotas for prj 1000 (pid 1000):
    Filesystem           kbytes  quota  limit  grace  files  quota  limit  grace
    /mnt/lustre               0      0      0      -      0      0      0      -
    lustre-MDT0000_UUID       0      -      0      -      0      -      0      -
    lustre-OST0000_UUID       0      -      0      -      -      -      -      -
    lustre-OST0001_UUID       0      -      0      -      -      -      -      -
Total allocated inode limit: 0, total allocated block limit: 0
Files for project (1000):
sanity-quota test_3b: @@@@@@ FAIL: write success, but expect EDQUOT
Trace dump:
= ./../tests/test-framework.sh:6273:error()
= sanity-quota.sh:159:quota_error()
= sanity-quota.sh:1297:test_block_soft()
= sanity-quota.sh:1435:test_3b()
= ./../tests/test-framework.sh:6576:run_one()
= ./../tests/test-framework.sh:6623:run_one_logged()
= ./../tests/test-framework.sh:6450:run_test()
= sanity-quota.sh:1494:main()
Dumping lctl log to /tmp/ltest-logs/sanity-quota.test_3b.*.1608710861.log
Dumping logs only on local client.
Resetting fail_loc on all nodes...done.
Delete files...
Wait for unlink objects finished...
Waiting for local destroys to complete
Destroy the created pools: qpool1
lustre.qpool1
OST lustre-OST0000_UUID removed from pool lustre.qpool1
OST lustre-OST0001_UUID removed from pool lustre.qpool1
Pool lustre.qpool1 destroyed
FAIL 3b (85s)
Wang Shilong (wshilong@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40991
Subject: LU-14187 osd-ldiskfs: fix locking in write commit
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: be8337f36c01227e8a959041146577134a402977
bzzz, yup, I saw your pushed patch; I'll try to fix it by moving the restart to the OFD layer.
It looks like, currently, some ofd_write_lock calls are held outside the transaction and some inside, so maybe that is the reason it could time out even without the patch.
I found this only in the fallocate path, which landed just recently, and it is clearly a bug.
What do you think: should we move the IO restart to an upper layer (the OFD layer), or should we implement it the same way truncate does?
While the trunc lock was involved in this specific example, I think the issue is not directly related to that one: regular dt locks (e.g. ofd_write_lock() and ofd_read_lock()) are taken within a running transaction, and a transaction restart breaks this. Actually, that is the reason truncate was implemented via osd_trans_stop().
bzzz, thanks for helping track it down.
Could you be a bit more specific? It looks like we now hold the trunc lock before starting the transaction, and in the restart case the trunc lock is still held?
PID: 1677  TASK: ffff88012f27e000  CPU: 1  COMMAND: "ll_ost_io00_030"
 #0 [ffff880105c5ba80] __schedule at ffffffff816af9ad
    /home/lustre/linux-4.18.0-32.el8/kernel/sched/core.c: 2853
 #1 [ffff880105c5bb18] schedule at ffffffff816b0234
    /home/lustre/linux-4.18.0-32.el8/./arch/x86/include/asm/preempt.h: 81
 #2 [ffff880105c5bb28] rwsem_down_write_failed at ffffffff816b479b
    /home/lustre/linux-4.18.0-32.el8/./arch/x86/include/asm/current.h: 15
 #3 [ffff880105c5bbb8] call_rwsem_down_write_failed at ffffffff816abb13
    /home/lustre/linux-4.18.0-32.el8/arch/x86/lib/rwsem.S: 118
 #4 [ffff880105c5bbf8] down_write at ffffffff816b3a34
    /home/lustre/linux-4.18.0-32.el8/./arch/x86/include/asm/rwsem.h: 142
 #5 [ffff880105c5bc18] osd_trunc_lock at ffffffffa0b2da21 [osd_ldiskfs]
    /home/lustre/master-mine/lustre/osd-ldiskfs/osd_io.c: 2530
 #6 [ffff880105c5bc48] osd_declare_punch at ffffffffa0b2e0fe [osd_ldiskfs]
    /home/lustre/master-mine/libcfs/include/libcfs/libcfs_debug.h: 155
 #7 [ffff880105c5bc70] ofd_object_punch at ffffffffa0e824f9 [ofd]
    /home/lustre/master-mine/lustre/include/dt_object.h: 2570
 #8 [ffff880105c5bd48] ofd_punch_hdl at ffffffffa0e6c54c [ofd]
    /home/lustre/master-mine/lustre/ofd/ofd_dev.c: 2125

PID: 6515  TASK: ffff88012c996000  CPU: 1  COMMAND: "ll_ost_io00_006"
 #0 [ffff88010d40b9a8] __schedule at ffffffff816af9ad
    /home/lustre/linux-4.18.0-32.el8/kernel/sched/core.c: 2853
 #1 [ffff88010d40ba40] schedule at ffffffff816b0234
    /home/lustre/linux-4.18.0-32.el8/./arch/x86/include/asm/preempt.h: 81
 #2 [ffff88010d40ba50] wait_transaction_locked at ffffffff812ef568
    /home/lustre/linux-4.18.0-32.el8/fs/jbd2/transaction.c: 160
 #3 [ffff88010d40baa8] add_transaction_credits at ffffffff812ef700
    /home/lustre/linux-4.18.0-32.el8/fs/jbd2/transaction.c: 188
 #4 [ffff88010d40bb00] start_this_handle at ffffffff812efa1c
    /home/lustre/linux-4.18.0-32.el8/fs/jbd2/transaction.c: 357
 #5 [ffff88010d40bb80] osd_write_commit at ffffffffa0b2ccc0 [osd_ldiskfs]
    /home/lustre/master-mine/lustre/osd-ldiskfs/osd_io.c: 1476
 #6 [ffff88010d40bbf8] ofd_commitrw at ffffffffa0e892df [ofd]
    /home/lustre/master-mine/lustre/include/dt_object.h: 2530
 #7 [ffff88010d40bcb0] tgt_brw_write at ffffffffa05c756c [ptlrpc]
    /home/lustre/master-mine/libcfs/include/libcfs/libcfs_debug.h: 155
Basically, LU-14134 breaks the transaction-locks ordering, since the restart (which in essence is just a start) is done with the lock held.
Alex, my first guess is that this might be a problem similar to LU-12702; anyway, could you get debug logs for it?