[LU-7754] DNE3: osd-zfs gets into a livelock if transaction is too big Created: 07/Feb/16 Updated: 27/Aug/19 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alex Zhuravlev | Assignee: | Alex Zhuravlev |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | dne3, zfs |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
ONLY=300k bash sanity.sh: [ 89.828294] LNet: Service thread pid 4249 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: |
| Comments |
| Comment by Gerrit Updater [ 07/Feb/16 ] |
|
Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: http://review.whamcloud.com/18341 |
| Comment by Andreas Dilger [ 08/Feb/16 ] |
|
Your patch turns this from a hang into a failure. That is an improvement, but it doesn't explain why this test failed. Do you have an unusual config (small MDT?), or is there some regression that makes the transaction too large? |
| Comment by Alex Zhuravlev [ 08/Feb/16 ] |
|
sanity/300k tries to create a big striped directory with $LFS setdirstripe -i 0 -c512 $DIR/$tdir/striped_dir; with the default MDSSIZE=200000, DMU fails to start such a big transaction. |
| Comment by Andreas Dilger [ 08/Feb/16 ] |
|
How large is the transaction? Do we have a larger MDS size in our testing? I guess this is because we don't run DNE + ZFS by default. |
| Comment by Alex Zhuravlev [ 08/Feb/16 ] |
|
transaction calculations: it seems to fail because of insufficient memory: 4986830848 bytes (4755MB) are needed, while the test system had 4GB in total. |
| Comment by Andreas Dilger [ 18/Apr/17 ] |
|
That is 4755 MB / 512 stripes ≈ 9.3 MB/stripe, which seems like a lot of space to reserve. I thought we got away from O(n^2) transaction sizes for striped directories? |
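For reference, the arithmetic quoted in the comments above can be checked with a short sketch. It uses only the figures from this ticket (4986830848 bytes needed, 512 stripes, 4GB of memory) and assumes that "MB" and "GB" here mean binary units (MiB and GiB); the variable names are illustrative only and do not come from Lustre or ZFS.

```python
# Sanity check of the numbers quoted in this ticket.
# Assumption: "MB"/"GB" in the comments are binary units (MiB/GiB).

MIB = 1024 ** 2
GIB = 1024 ** 3

needed_bytes = 4986830848      # reservation reported in the ticket
stripe_count = 512             # from: lfs setdirstripe -c512
system_memory = 4 * GIB        # "the test system had 4GB in total"

needed_mib = needed_bytes / MIB
per_stripe_bytes = needed_bytes // stripe_count
per_stripe_mib = per_stripe_bytes / MIB
shortfall_mib = (needed_bytes - system_memory) / MIB

print(f"needed:     {needed_mib:.1f} MiB")
print(f"per stripe: {per_stripe_mib:.2f} MiB")
print(f"shortfall:  {shortfall_mib:.1f} MiB over system memory")
```

Under that assumption the reservation works out to roughly 9.3 MiB per stripe and exceeds the machine's total memory by about 660 MiB, which is consistent with the transaction failing to start.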