[LU-6732] Cannot pick up EDQUOT from ll_write_begin and ll_write_end Created: 16/Jun/15  Updated: 10/Dec/15  Resolved: 10/Dec/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Critical
Reporter: Hiroya Nozaki Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: patch

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When -EDQUOT is returned from ll_write_begin() or ll_write_end(), write(2) may return 0 with errno unset. This is caused by the way generic_perform_write() is implemented.

static ssize_t generic_perform_write( ... )
{
        ...
        do {
                ...
                status = a_ops->write_begin( ... );
                if (unlikely(status))
                        break;
                ...
                status = a_ops->write_end( ... );
                if (unlikely(status < 0))
                        break;
                copied = status;
                ...
                written += copied;
                ...
        } while (iov_iter_count(i));
        return written ? written : status;
}

When "written" is already non-zero and EDQUOT occurs in ll_write_begin() or ll_write_end(), generic_perform_write() returns the "written" byte count and discards "status". As a result, vvp_io_write_start() has no way to learn about the error.
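The return logic described above can be modelled in user space. The following is only a rough sketch, not kernel or Lustre code: struct model_file, model_write_begin(), model_write_end() and model_perform_write() are hypothetical names, and the "quota" is just a byte counter. It shows that once any chunk has been written, a later -EDQUOT never reaches the caller.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of a file with a byte quota (not a Lustre type). */
struct model_file {
        size_t quota_left;      /* bytes that may still be written */
};

/* Stand-in for a_ops->write_begin(): fails once the quota is exhausted. */
static int model_write_begin(struct model_file *f, size_t bytes)
{
        return bytes > f->quota_left ? -EDQUOT : 0;
}

/* Stand-in for a_ops->write_end(): commits the chunk, returns bytes copied. */
static int model_write_end(struct model_file *f, size_t bytes)
{
        f->quota_left -= bytes;
        return (int)bytes;
}

/* Mirrors the shape of the loop quoted above: a partial write hides status. */
static ssize_t model_perform_write(struct model_file *f, size_t count,
                                   size_t chunk)
{
        ssize_t written = 0;
        long status = 0;

        while (count > 0) {
                size_t bytes = count < chunk ? count : chunk;

                status = model_write_begin(f, bytes);
                if (status)
                        break;

                status = model_write_end(f, bytes);
                if (status < 0)
                        break;

                written += status;
                count -= (size_t)status;
        }
        /* Same expression as in the excerpt: the error is only visible
         * when nothing at all was written. */
        return written ? written : status;
}
```

With a quota of 8192 bytes and a 16384-byte request in 4096-byte chunks, the model returns 8192 and the -EDQUOT from the third chunk is lost; with a quota of 0 it returns -EDQUOT directly.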

The issue can be reproduced with quotas enabled, as follows:

bash-4.1$ id
uid=60000(quota_usr) gid=60000(quota_usr) groups=60000(quota_usr)
bash-4.1$ lfs quota /mnt/lustre
Disk quotas for user quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre 1048580*  58617   59617       -      10       0       0       -
Disk quotas for group quota_usr (gid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre 1048580       0       0       -      10       0       0       -
bash-4.1$ dd if=/dev/zero of=/mnt/lustre/quota bs=4M count=1 (<--- PTLRPC_MAX_BRW_PAGES)
dd: writing `/mnt/lustre/quota': No space left on device
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.028993 s, 0.0 kB/s

The following is the corresponding strace log:

strace dd if=/dev/zero of=/mnt/lustre/quota bs=4M count=1
...
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4194304) = 4194304
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4194304) = 0
...
write(2, ": No space left on device", 25) = 25
write(2, "\n", 1)                       = 1
close(0)                                = 0
close(1)                                = 0
write(2, "1+0 records in\n0+0 records out\n", 31) = 31
write(2, "0 bytes (0 B) copied", 20)    = 20
write(2, ", 0.0365546 s, 0.0 kB/s\n", 24) = 24
close(2)                                = 0
exit_group(1)                           = ?

write(2) should have returned -1 with errno set to EDQUOT but, as the trace shows, it actually returned 0 with errno unset.
(The ENOSPC in the "No space left on device" message was set by dd itself, not by the kernel; check the dd source.)
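To illustrate why the caller ends up reporting ENOSPC, here is a minimal sketch of a user-space write loop that treats a 0 return as "no space". It is an assumption about the general pattern, not a copy of the dd source; write_fn, write_all() and zero_write() are hypothetical names introduced only for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Same signature as write(2), so a stub can be injected for testing. */
typedef ssize_t (*write_fn)(int fd, const void *buf, size_t len);

/* Write len bytes; return how many were written. On a short result,
 * errno describes why the loop stopped. */
static size_t write_all(write_fn do_write, int fd, const char *buf, size_t len)
{
        size_t total = 0;

        while (total < len) {
                ssize_t n = do_write(fd, buf + total, len - total);

                if (n < 0) {
                        if (errno == EINTR)
                                continue;       /* interrupted, retry */
                        break;                  /* errno set by the callee */
                }
                if (n == 0) {
                        /* No error code was supplied, so the caller can
                         * only guess; ENOSPC is the guess made here. */
                        errno = ENOSPC;
                        break;
                }
                total += (size_t)n;
        }
        return total;
}

/* Hypothetical stub reproducing the behaviour seen in the trace:
 * returns 0 and leaves errno untouched. */
static ssize_t zero_write(int fd, const void *buf, size_t len)
{
        (void)fd;
        (void)buf;
        (void)len;
        return 0;
}
```

Calling write_all(write, fd, buf, len) uses the real system call, while write_all(zero_write, ...) reproduces the case above: the loop stops with 0 bytes written and errno set to ENOSPC, so the EDQUOT condition is never visible to the user.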



 Comments   
Comment by Gerrit Updater [ 16/Jun/15 ]

Hiroya Nozaki (nozaki.hiroya@jp.fujitsu.com) uploaded a new patch: http://review.whamcloud.com/15302
Subject: LU-6732 llite: Cannot pick up EDQUOT from ll_write_begin/end
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cde0eff7657bab72cde858be2e4d18d93e645ff4

Comment by Hiroya Nozaki [ 17/Aug/15 ]

This needs one more +1 review.
Could someone please review this patch?

Comment by Gerrit Updater [ 10/Dec/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15302/
Subject: LU-6732 llite: ll_write_begin/end not passing on errors
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 05300cbf466d97368b00b49d57839186e5662687

Comment by Joseph Gmitter (Inactive) [ 10/Dec/15 ]

Landed for 2.8
