[LU-16780] zfs's osd_sync() doesn't wait for commit callbacks Created: 27/Apr/23 Updated: 28/Apr/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alex Zhuravlev | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The ZFS osd_sync() (implementing dt_sync()) can return before all related commit callbacks have been processed. This results in an incorrect quota state: quota "usage" (read in lquota_disk_read()) returns the actual number, but "pending" is out of date, because it is only updated from the commit callback.

	/* use latest usage */
	usage = lqe->lqe_usage;
	/* take pending write into account */
	usage += lqe->lqe_pending_write;

	if (space + usage <= lqe->lqe_granted - lqe->lqe_pending_rel) {
		lqe->lqe_pending_write += space;
		lqe->lqe_waiting_write -= space;
		rc = 0;
	} else if (lqe->lqe_edquot &&
		   (lqe->lqe_edquot_time > ktime_get_seconds() - 5)) {
		rc = -EDQUOT;
	} else {
		rc = -EAGAIN;
	}

This is a snippet from the log confirming the problem:

00040000:04000000:1.0:1682597449.976673:0:27241:0:(qsd_entry.c:253:qsd_refresh_usage()) $$$ disk usage: 0 qsd:lustre-MDT0001 qtype:usr id:60000 enforced:1 granted: 1024 pending:952 waiting:1 req:0 usage: 0 qunit:1024 qtune:512 edquot:1 default:no
00040000:04000000:1.0:1682597449.994977:0:7285:0:(qsd_entry.c:253:qsd_refresh_usage()) $$$ disk usage: 219 qsd:lustre-MDT0001 qtype:usr id:60000 enforced:1 granted: 1024 pending:953 waiting:1 req:0 usage: 219 qunit:1024 qtune:512 edquot:1 default:no
00040000:04000000:1.0:1682597450.084402:0:6415:0:(qsd_entry.c:253:qsd_refresh_usage()) $$$ disk usage: 879 qsd:lustre-MDT0001 qtype:usr id:60000 enforced:1 granted: 1024 pending:299 waiting:1 req:0 usage: 879 qunit:1024 qtune:512 edquot:1 default:no
00040000:04000000:1.0:1682597450.094358:0:6415:0:(qsd_entry.c:253:qsd_refresh_usage()) $$$ disk usage: 879 qsd:lustre-MDT0001 qtype:usr id:60000 enforced:1 granted: 1024 pending:74 waiting:1 req:0 usage: 879 qunit:1024 qtune:512 edquot:1 default:no
00040000:04000000:1.0:1682597450.186265:0:7285:0:(qsd_entry.c:253:qsd_refresh_usage()) $$$ disk usage: 953 qsd:lustre-MDT0001 qtype:usr id:60000 enforced:1 granted: 1024 pending:74 waiting:1 req:0 usage: 953 qunit:1024 qtune:512 edquot:1 default:no
...
00040000:04000000:1.0:1682597450.186948:0:7285:0:(qsd_handler.c:774:qsd_op_begin0()) $$$ acquire quota failed:-122 qsd:lustre-MDT0001 qtype:usr id:60000 enforced:1 granted: 1024 pending:74 waiting:1 req:0 usage: 953 qunit:1024 qtune:512 edquot:1 default:no
00040000:00000001:1.0:1682597450.186950:0:7285:0:(qsd_handler.c:830:qsd_op_begin0()) Process leaving (rc=18446744073709551494 : -122 : ffffffffffffff86)
...
00040000:04000000:1.0:1682597450.310321:0:6415:0:(qsd_entry.c:253:qsd_refresh_usage()) $$$ disk usage: 953 qsd:lustre-MDT0001 qtype:usr id:60000 enforced:1 granted: 1024 pending:0 waiting:0 req:0 usage: 953 qunit:1024 qtune:512 edquot:1 default:no |
| Comments |
| Comment by Gerrit Updater [ 28/Apr/23 ] |
|
"Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50790 |