[LU-13296] statfs isn't work properly with MDT statfs proxy Created: 26/Feb/20 Updated: 11/Dec/21 Resolved: 05/Mar/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0, Lustre 2.12.5 |
| Fix Version/s: | Lustre 2.14.0, Lustre 2.12.5 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Alexey Lyashkov | Assignee: | Alexey Lyashkov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
statfs sometimes returns incorrect results. It is easy to reproduce by starting several processes at the same time, all on the same node, each of which queries the block size.

% aprun -n 8 ./a.out /lus/snx11242
statfs (/lus/snx11242) f_bsize 0X9003B000
statfs (/lus/snx11242) f_bsize 0X8C19F000
statfs (/lus/snx11242) f_bsize 0X82BFD000
statfs (/lus/snx11242) f_bsize 0X85C31000
statfs (/lus/snx11242) f_bsize 0X8BED2000
statfs (/lus/snx11242) f_bsize 0X835B1000
statfs (/lus/snx11242) f_bsize 0X828E3000
statfs (/lus/snx11242) f_bsize 0X1000

Log for the failed application:

00000080:00200000:0.0:1582645192.532388:0:16422:0:(llite_lib.c:1890:ll_statfs()) VFS Op: at 4296762105 jiffies
00000080:00000001:0.0:1582645192.532391:0:16422:0:(llite_lib.c:1838:ll_statfs_internal()) Process entered
00000080:00000001:0.0:1582645192.532392:0:16422:0:(obd_class.h:1051:obd_statfs()) Process entered
00000080:00000004:0.0:1582645192.532394:0:16422:0:(obd_class.h:1063:obd_statfs()) snx11242-clilmv-ffff881f8b680000: age 7262, max_age 7478
00000080:00000004:0.0:1582645192.535406:0:16422:0:(obd_class.h:1083:obd_statfs()) snx11242-clilmv-ffff881f8b680000: new ffff881f9cad8228 cache blocks 3366320026/3419988114 objects 630715697 0/6372092768
00000080:00000001:0.0:1582645192.535408:0:16422:0:(obd_class.h:1103:obd_statfs()) Process leaving (rc=0 : 0 : 0)
00000080:00000004:0.0:1582645192.535409:0:16422:0:(llite_lib.c:1848:ll_statfs_internal()) MDC blocks 18446744071580744875/18446744071582034486 objects 0/1
00000080:00000001:0.0:1582645192.535410:0:16422:0:(llite_lib.c:1851:ll_statfs_internal()) Process leaving via out (rc=0 : 0 : 0x0)
00000080:00000001:0.0:1582645192.535411:0:16422:0:(llite_lib.c:1881:ll_statfs_internal()) Process leaving (rc=0 : 0 : 0)

The regression was introduced by commit b500d5193360711a6c6b07497f34e61cc590cf19
Author: Alex Zhuravlev <bzzz@whamcloud.com>
Date: Thu Sep 21 18:24:18 2017 +0300
LU-10018 protocol: MDT as a statfs proxy
|
| Comments |
| Comment by Alexey Lyashkov [ 26/Feb/20 ] |
|
The issue is easily reproduced by running several statfs calls in parallel.

# cat test
./a.out &
./a.out &
./a.out &
./a.out &
./a.out &
./a.out &
./a.out &
./a.out &
sleep 1

# cat ttt.c
#include <stdio.h>
#include <sys/statfs.h>

int main()
{
	struct statfs buf;

	statfs("/mnt/lustre", &buf);
	/* f_bsize is not a long long on Linux; cast it to match %llx */
	printf("b_size %llx\n", (unsigned long long)buf.f_bsize);
	return 0;
}

# bash test
b_size 1000
b_size 1000
b_size 0
b_size 0
b_size 0
b_size 0
b_size 0
b_size 0 |
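A slightly stricter variant of the reproducer can make bad results easier to spot when many instances run at once. The sketch below is only an illustration: the helper names `bsize_is_plausible` and `check_statfs` are hypothetical, and the rule "a valid block size is a nonzero power of two" is a heuristic assumed here, chosen because 0x1000 appears to be the expected value in the output above.

```c
#include <stdio.h>
#include <sys/statfs.h>

/* Heuristic (an assumption, not a Lustre rule): a sane block size is a
 * nonzero power of two, e.g. 0x1000. */
static int bsize_is_plausible(unsigned long long bsize)
{
	return bsize != 0 && (bsize & (bsize - 1)) == 0;
}

/* Returns -1 if statfs() fails, 1 if f_bsize looks wrong, 0 otherwise. */
static int check_statfs(const char *path)
{
	struct statfs buf;

	if (statfs(path, &buf) != 0) {
		perror("statfs");
		return -1;
	}
	printf("statfs (%s) f_bsize 0x%llx\n", path,
	       (unsigned long long)buf.f_bsize);
	return bsize_is_plausible((unsigned long long)buf.f_bsize) ? 0 : 1;
}
```

Calling `check_statfs("/mnt/lustre")` from `main()` and returning its result as the exit status would let the wrapper script count failing instances instead of relying on reading the printed values.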
| Comment by Alexey Lyashkov [ 26/Feb/20 ] |
|
Root cause of this bug: the cached copy isn't used when obtaining the statfs info races with refreshing it. |
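The kind of race described above can be modelled with a small cache in which a caller that arrives while another caller is refreshing falls back to the cached copy rather than returning a buffer nobody filled in. This is a minimal sketch under stated assumptions, not the Lustre code and not the patch under review: `struct fs_stats`, `struct stats_cache`, `cache_get` and `cache_put` are hypothetical names, the age check is simplified, and the cache is assumed to have been populated at least once.

```c
#include <pthread.h>

/* Hypothetical, simplified stand-in for the cached statfs data. */
struct fs_stats {
	unsigned long long bsize;
	unsigned long long blocks;
};

struct stats_cache {
	pthread_mutex_t lock;
	struct fs_stats data;	/* last successfully fetched copy */
	long stamp;		/* time at which data was stored */
	int refreshing;		/* nonzero while one caller is fetching */
};

enum cache_result { CACHE_HIT, CACHE_STALE_COPY, CACHE_MUST_REFRESH };

/* Fill *out from the cache when possible. CACHE_MUST_REFRESH is returned
 * only to the single caller that should fetch new data; *out is left
 * untouched in that case. */
static enum cache_result cache_get(struct stats_cache *c, long now,
				   long max_age, struct fs_stats *out)
{
	enum cache_result res;

	pthread_mutex_lock(&c->lock);
	if (now - c->stamp <= max_age) {
		*out = c->data;		/* cache is fresh enough */
		res = CACHE_HIT;
	} else if (c->refreshing) {
		*out = c->data;		/* racing with a refresh: use cached copy */
		res = CACHE_STALE_COPY;
	} else {
		c->refreshing = 1;	/* this caller performs the refresh */
		res = CACHE_MUST_REFRESH;
	}
	pthread_mutex_unlock(&c->lock);
	return res;
}

/* Store freshly fetched data and end the refresh. */
static void cache_put(struct stats_cache *c, long now,
		      const struct fs_stats *fresh)
{
	pthread_mutex_lock(&c->lock);
	c->data = *fresh;
	c->stamp = now;
	c->refreshing = 0;
	pthread_mutex_unlock(&c->lock);
}
```

In this model the second concurrent caller always receives a consistent, if slightly old, `bsize`, which is the behaviour the parallel reproducer expects.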
| Comment by Gerrit Updater [ 27/Feb/20 ] |
|
Alexey Lyashkov (alexey.lyashkov@hpe.com) uploaded a new patch: https://review.whamcloud.com/37753 |
| Comment by Gerrit Updater [ 05/Mar/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37753/ |
| Comment by Peter Jones [ 05/Mar/20 ] |
|
Landed for 2.14 |
| Comment by Gerrit Updater [ 06/Mar/20 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37819 |
| Comment by Gerrit Updater [ 25/Mar/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37819/ |