[LU-13470] sysfs ping write creates a flood-ping situation that could not be normally stopped Created: 21/Apr/20 Updated: 07/May/20 Resolved: 07/May/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0, Lustre 2.12.5 |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Oleg Drokin | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Description |
|
It looks like when the ping file was migrated to sysfs, a bug was unfortunately introduced in the write path:
ssize_t ping_store(struct kobject *kobj, struct attribute *attr,
                   const char *buffer, size_t count)
{
        return ping_show(kobj, attr, (char *)buffer);
}
What it really should be doing is return count. Otherwise the outer logic thinks it's a short write that needs to be retried (errno = 0) and enters a loop that you cannot really break short of disconnecting from the server:
[root@centos6-16 ~]# cat /sys/fs/lustre/mdc/lustre-MDT0000-mdc-ffff880387d67800/ping
[root@centos6-16 ~]# echo blahblah > /sys/fs/lustre/mdc/lustre-MDT0000-mdc-ffff880387d67800/ping
^Z
^C
We can see how the CPU is eaten by all the retries and pings now:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2529 root      20   0       0      0      0 R  27.8  0.0  20:07.59 socknal_sd+
12488 root      20   0  115568   2124   1612 S  27.8  0.0  12:48.58 bash
 2530 root      20   0       0      0      0 S  27.5  0.0  20:09.36 socknal_sd+
 3861 root      20   0       0      0      0 S  14.2  0.0   4:11.05 mdt03_002
16784 root      20   0       0      0      0 S  10.6  0.0   4:04.16 mdt03_004
 4410 root      20   0       0      0      0 S   5.0  0.0   4:08.74 mdt03_003
 3859 root      20   0       0      0      0 S   2.6  0.0   4:11.23 mdt03_000
   55 root      20   0       0      0      0 S   0.3  0.0   0:22.51 rcuos/6
 3860 root      20   0       0      0      0 S   0.3  0.0   3:58.34 mdt03_001
15467 green     20   0  162104   2408   1524 R   0.3  0.0   0:00.05 top |
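To illustrate the retry loop described above, here is a minimal userspace sketch. It is not the Lustre code and not the landed patch. All names in it (fake_ping, store_buggy, store_fixed, write_all) are hypothetical stand-ins, and it assumes the ping itself returns 0 on success. It models a caller that keeps calling the store handler until every byte is reported as consumed, with a cap so the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical stand-in for a sysfs store handler (kobject args omitted). */
typedef ssize_t (*store_fn)(const char *buf, size_t count);

static int pings_sent;

/* Models the ping: assumed to return 0 on success, as a show-style call would. */
static ssize_t fake_ping(void)
{
        pings_sent++;
        return 0;
}

/* Buggy shape: returns the ping result, so the caller sees 0 bytes consumed. */
static ssize_t store_buggy(const char *buf, size_t count)
{
        (void)buf;
        (void)count;
        return fake_ping();
}

/* Assumed fixed shape: pass errors through, otherwise report count consumed. */
static ssize_t store_fixed(const char *buf, size_t count)
{
        ssize_t rc;

        (void)buf;
        rc = fake_ping();
        return rc < 0 ? rc : (ssize_t)count;
}

/*
 * Models a writer that retries until the whole buffer is accepted.
 * max_calls caps the loop, which would otherwise never end for store_buggy.
 * Returns the number of store calls made.
 */
static int write_all(store_fn store, const char *buf, size_t len, int max_calls)
{
        size_t done = 0;
        int calls = 0;

        assert(max_calls > 0);
        while (done < len && calls < max_calls) {
                ssize_t n = store(buf + done, len - done);

                calls++;
                if (n < 0)
                        break;          /* a real error stops the retries */
                done += (size_t)n;      /* n == 0 makes no progress at all */
        }
        return calls;
}
```

With a 9-byte buffer, write_all(store_fixed, ...) finishes after a single call, while write_all(store_buggy, ...) only stops when it hits max_calls, sending one ping per retry along the way.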
| Comments |
| Comment by Gerrit Updater [ 21/Apr/20 ] |
|
Oleg Drokin (green@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38304 |
| Comment by Gerrit Updater [ 07/May/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38304/ |
| Comment by Peter Jones [ 07/May/20 ] |
|
Landed for 2.14 |