[LU-2176] ZFS: running racer grounds everything to a standstill Created: 14/Oct/12  Updated: 30/Apr/20  Resolved: 30/Apr/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: Alex Zhuravlev
Resolution: Cannot Reproduce Votes: 0
Labels: LB, performance, zfs

Issue Links:
Related
is related to LU-2887 sanity-quota test_12a: slow due to ZF... Resolved
Severity: 3
Rank (Obsolete): 5214

 Description   

There seems to be a very strange problem with racer and lustre over zfs.
When you run SLOW=yes REFORMAT=yes FSTYPE=zfs DURATION=2700 sh racer.sh,
the dd processes at the start of racer mostly never finish on their own, even though we see heavy activity from the zfs threads and some ENOSPC errors in the logs.

The actual creates and other operations usually only run after the timeout has already expired and everything is being killed.

Contrast this with an ldiskfs run, where shortly after the start we get a lot of messages about "file_create: SIZE=XXX" that continue until the time is up.

Alex is already looking into this one.



 Comments   
Comment by Alex Zhuravlev [ 29/Oct/12 ]

I'd suggest making a few runs with http://review.whamcloud.com/4403 and attaching the stats here.

1) ldiskfs with default settings
2) zfs with default settings
3) zfs with OSTSIZE=500000 (rm -f /tmp/lustre-ost* to recreate them with a different size)
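
The three runs above could be scripted roughly as follows. This is only a sketch: run_racer is a hypothetical helper, not part of the test suite, and it assumes the script is started from the directory containing racer.sh, that FSTYPE=ldiskfs selects the ldiskfs backend, and that the SLOW, REFORMAT and DURATION values from the description are the ones wanted. It prints the commands by default and only executes them when RUN is set to an empty string.

```shell
# Dry run by default: each command is printed, not executed.
# Set RUN= (empty) in the environment to actually execute the runs.
RUN=${RUN-echo}

# run_racer is a hypothetical helper, not part of the Lustre test suite.
# Extra VAR=value arguments are passed to racer.sh through env.
run_racer() {
    $RUN env SLOW=yes REFORMAT=yes DURATION=2700 "$@" sh racer.sh
}

run_racer FSTYPE=ldiskfs               # 1) ldiskfs with default settings
run_racer FSTYPE=zfs                   # 2) zfs with default settings
$RUN rm -f /tmp/lustre-ost*            # drop old OST files so they are recreated
run_racer FSTYPE=zfs OSTSIZE=500000    # 3) zfs with OSTSIZE=500000
```

Keeping the dry run as the default makes it easy to check the exact command lines before committing to three 45-minute runs.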

Comment by Alex Zhuravlev [ 06/Nov/12 ]

http://review.whamcloud.com/4403 should provide a bit more info, including the amount of data read/written.

From what I've seen locally, racer does work with ZFS, not as well as with ldiskfs, but quite acceptably, IMHO.

I'd suggest that Oleg look at the actual numbers locally and, hopefully, close the ticket.

Comment by Alex Zhuravlev [ 12/Nov/12 ]

SLOW=yes DURATION=2700 sh racer.sh:

ldiskfs {OST|MDS}SIZE=default: 4740321 opens, 44659 getattrs, 131489 readdirs, 654MB in 145047 reads, 10312MB in 84271 writes

zfs {OST|MDS}SIZE=default: 1287981 opens, 201684 getattrs, 41838 readdirs, 41MB in 5827 reads, 825MB in 36025 writes

zfs {OST|MDS}SIZE=500000: 1830863 opens, 180514 getattrs, 50876 readdirs, 98MB in 17732 reads, 3640MB in 80152 writes

zfs,4GB,{OST|MDS}SIZE=500000: 2941173 opens, 132408 getattrs, 84917 readdirs, 216MB in 7664 reads, 4726MB in 62449 writes

Given that zfs's requirements are significantly higher, I think the numbers are OK at the moment.
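
To put the runs side by side, the open counts and write volumes can be expressed relative to the ldiskfs baseline. The following is a rough sketch only: racer_ratios is a hypothetical helper, the figures are copied from the four result lines in this comment, and the short row labels are shorthand for those configurations.

```shell
# racer_ratios is a hypothetical helper for comparing the runs above.
# Columns of the embedded table: label, opens, MB written.
# The first row (ldiskfs) is used as the 100% baseline.
racer_ratios() {
    awk '
    NR == 1 { base_opens = $2; base_mb = $3 }
    {
        printf "%-20s opens %5.1f%%  write MB %5.1f%%\n",
               $1, 100 * $2 / base_opens, 100 * $3 / base_mb
    }' <<'EOF'
ldiskfs,default 4740321 10312
zfs,default 1287981 825
zfs,SIZE=500000 1830863 3640
zfs,4GB,SIZE=500000 2941173 4726
EOF
}

racer_ratios
```

With these inputs the default zfs run comes out at roughly 27% of the ldiskfs opens and 8% of the data written, and the zfs,4GB run at roughly 62% and 46%.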

Comment by Alex Zhuravlev [ 30/Sep/15 ]

Oleg, is this still a current issue?

Comment by Andreas Dilger [ 30/Apr/20 ]

Close old issue, I think ZFS is running this today w/o problems.
