[LU-2619] Bogus value of dqb_curinodes returned by osc_quotactl Created: 15/Jan/13  Updated: 11/May/15  Resolved: 20/Jan/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Prakash Surya (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Won't Fix Votes: 0
Labels: llnl

Attachments: File porter-mds1.txt.gz    
Issue Links:
Related
is related to LU-2435 inode accounting in osd-zfs is racy Resolved
Severity: 3
Rank (Obsolete): 6124

 Description   

When running lfs quota -u <USER> <FS> on Sequoia, a couple of users have no files in their directories, yet quota reports a bogus, very large value:

# lfs quota -u pjmccart /p/ls1
Disk quotas for user pjmccart (uid 8624):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
         /p/ls1     913       0       0       - 18446744073709547865       0       0       -

# du -sh /p/ls1/pjmccart/
913K    /p/ls1/pjmccart/

# ls -alR /p/ls1/pjmccart/
/p/ls1/pjmccart/:
total 1214
913 drwx------    2 pjmccart pjmccart 934400 Nov 15 10:28 ./
302 drwxr-xr-x 2193 root     root     308736 Jan 11 08:05 ../ 

Using systemtap to print the obd_quotactl structure when the osc_quotactl function returns, I see odd values coming from two of the OSCs:

osc_quotactl: "ls1-OST0037-osc-c0000003c865a400": {.qc_cmd=8388867, .qc_type=0, .qc_id=8624, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=0, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .dqb_padding=0}}
osc_quotactl: "ls1-OST0038-osc-c0000003c865a400": {.qc_cmd=8388867, .qc_type=0, .qc_id=8624, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551615, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .dqb_padding=0}}
osc_quotactl: "ls1-OST0039-osc-c0000003c865a400": {.qc_cmd=8388867, .qc_type=0, .qc_id=8624, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=0, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .dqb_padding=0}}
osc_quotactl: "ls1-OST0073-osc-c0000003c865a400": {.qc_cmd=8388867, .qc_type=0, .qc_id=8624, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=3, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .dqb_padding=0}}
osc_quotactl: "ls1-OST0074-osc-c0000003c865a400": {.qc_cmd=8388867, .qc_type=0, .qc_id=8624, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551615, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .dqb_padding=0}}
osc_quotactl: "ls1-OST0075-osc-c0000003c865a400": {.qc_cmd=8388867, .qc_type=0, .qc_id=8624, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=3, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .dqb_padding=0}}

Specifically, the values of dqb_curinodes:

ls1-OST0074-osc-c0000003c865a400:dqb_curinodes=18446744073709551615
ls1-OST0038-osc-c0000003c865a400:dqb_curinodes=18446744073709551615
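For reference, 18446744073709551615 is 2^64 - 1, i.e. the unsigned representation of -1, which suggests a 64-bit inode counter that was decremented below zero. A minimal C sketch of the arithmetic (illustration only, not Lustre code):

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
        uint64_t curinodes = 0;

        /* Decrementing an unsigned 64-bit counter that is already 0
         * wraps around to 2^64 - 1, the value reported above. */
        curinodes--;
        printf("%" PRIu64 "\n", curinodes);  /* 18446744073709551615 */
        return 0;
}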


 Comments   
Comment by Peter Jones [ 15/Jan/13 ]

Niu

Could you please look into this one?

Thanks

Peter

Comment by Niu Yawei (Inactive) [ 15/Jan/13 ]

I suppose the backend is zfs, and the "quota_iused_estimate" is 0 (you can check it under the osd-zfs proc dir), right?

The curinodes from the OSTs will not be counted in the total inode usage at the end, though the number implies something is wrong. I guess the MDC was getting the same number too; is it possible to get some logs with D_QUOTA enabled on the MDT or the OSTs that are returning the invalid number? Thanks.

Comment by Prakash Surya (Inactive) [ 23/Sep/13 ]

Ping. We're still suffering from this.

# sierra38 /root > cat /proc/fs/lustre/version 
lustre: 2.1.4
kernel: patchless_client
build:  2.1.4-5chaos-5chaos--PRISTINE-2.6.32-358.11.1.2chaos.ch5.1.x86_64
# sierra38 /root > cat TOSS-27.stp 
probe module("lquota").function("client_quota_ctl").return {
        printf("%s: %s: %s\n", probefunc(), @cast($exp, "obd_export")->exp_obd->obd_name$, $oqctl$$);
}
# sierra38 /root > stap TOSS-27.stp -c "lfs quota -u richmond /p/lscratche"
Disk quotas for user richmond (uid 1098):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
   /p/lscratche       8       0       0       - 18446744073709551606       0       0       -
client_quota_ctl: "lse-MDT0000-mdc-ffff8802fa988c00": {.qc_cmd=8388615, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=0, .dqb_btime=0, .dqb_itime=0, .dqb_valid=0, .padding=0}}
client_quota_ctl: "lse-OST0001-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=0, .dqb_btime=0, .dqb_itime=0, .dqb_valid=0, .padding=0}}
client_quota_ctl: "lse-OST0002-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551419, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST0003-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551442, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST0004-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551404, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST0005-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551487, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST0006-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551433, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST0007-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551521, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST0008-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551422, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST0009-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551416, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST000a-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551325, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST000b-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551499, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST000c-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551469, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST000d-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551340, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST000e-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551387, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST000f-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551435, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-OST0010-osc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=18446744073709551326, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}
client_quota_ctl: "lse-MDT0000-mdc-ffff8802fa988c00": {.qc_cmd=8388867, .qc_type=0, .qc_id=1098, .qc_stat=0, .qc_dqinfo={.dqi_bgrace=0, .dqi_igrace=0, .dqi_flags=0, .dqi_valid=0}, .qc_dqblk={.dqb_bhardlimit=0, .dqb_bsoftlimit=0, .dqb_curspace=0, .dqb_ihardlimit=0, .dqb_isoftlimit=0, .dqb_curinodes=0, .dqb_btime=0, .dqb_itime=0, .dqb_valid=15, .padding=0}}

I suppose the backend is zfs, and the "quota_iused_estimate" is 0 (you can check it under the osd-zfs proc dir), right?

Yes.

# porteri /root > pdsh -w porter[1-16] 'cat /proc/fs/lustre/osd-zfs/*/quota_iused_estimate' | dshbak -c
----------------
porter[1-16]
----------------
0

The curinodes from the OSTs will not be counted in the total inode usage at the end, though the number implies something is wrong. I guess the MDC was getting the same number too; is it possible to get some logs with D_QUOTA enabled on the MDT or the OSTs that are returning the invalid number? Thanks.

I'll see if I can get some debug logs from the problem OSTs.

Comment by Niu Yawei (Inactive) [ 11/Oct/13 ]

What are the exact client & server versions? client_quota_ctl() should only exist in 2.1 and earlier versions.

Comment by Prakash Surya (Inactive) [ 11/Oct/13 ]

Sorry I haven't gotten any debug data pushed to you yet. Would that still be useful? And if so, from the MDT, OST, or both?

The client version taken from the previous comment:

# sierra38 /root > cat /proc/fs/lustre/version 
lustre: 2.1.4
kernel: patchless_client
build:  2.1.4-5chaos-5chaos--PRISTINE-2.6.32-358.11.1.2chaos.ch5.1.x86_64

The server version:

# porter-mds1 /root > cat /proc/fs/lustre/version 
lustre: 2.4.0
kernel: patchless_client
build:  2.4.0-15chaos-15chaos--PRISTINE-2.6.32-358.14.1.2chaos.ch5.1.1.x86_64
Comment by Niu Yawei (Inactive) [ 14/Oct/13 ]

The bogus value is returned from the OST, so the log on the OST would be helpful; however, these inode usage values from OSTs are not accounted, so I guess the value displayed in the 'lfs quota' output comes from the MDT. Could you get logs from both the MDT & OST? Thanks.

Comment by D. Marc Stearman (Inactive) [ 04/Nov/13 ]

I have attached the MDS logs generated after running this command:

# oslic1 /root > lfs quota -u 40186 /p/lscratche
Disk quotas for user 40186 (uid 40186):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
   /p/lscratche      14       0       0       - 18446744073709551606       0       0       -
# oslic1 /root >

The OSS logs total 100MB compressed, so I can't upload them all at once. If you want specific ones, or have an FTP site, I can get them to you.

Comment by Niu Yawei (Inactive) [ 05/Nov/13 ]

Marc

I have privately emailed you details of how to get us the logs

Peter

Comment by D. Marc Stearman (Inactive) [ 05/Nov/13 ]

Thank you Peter. I created an LU-2619 directory, and placed a file called "porter_lustre_logs.tgz" It is about 115MB compressed. Untarred, it will be about 2.4GB. Let me know if you need anything else.

Comment by Niu Yawei (Inactive) [ 06/Nov/13 ]

Unfortunately, I didn't find any clue in the logs.

Marc, could you apply the debug patch (http://review.whamcloud.com/#/c/8191/) and try to capture the log again via the following steps (consolidated as a shell sketch after the list)? Thanks a lot.

1. 'lctl clear' on MDS and OSTs to clear the debug buffer;
2. enable D_QUOTA & D_TRACE on MDS and OSTs; (echo +quota > /proc/sys/lnet/debug; echo +trace > /proc/sys/lnet/debug);
3. 'lctl debug_daemon start $tmpfile 300' on MDS and OSTs to start the debug daemon;
4. 'lctl mark "======= lfs quota ======"' on MDS and OSTs to set a marker in debug log;
5. execute the lfs quota command which prints the bogus value;
6. 'lctl debug_daemon stop' on MDS and OSTs to stop debug daemon;
7. 'lctl debug_file $tmpfile $logfile' to convert binary logs into text files;
8. put the text log files from the previous step on the ftp site;
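Consolidated into a single shell session per server, the above might look like this (a sketch; the tmpfile and logfile names are placeholders):

lctl clear                                       # 1. clear the debug buffer
echo +quota > /proc/sys/lnet/debug               # 2. enable D_QUOTA
echo +trace > /proc/sys/lnet/debug               #    and D_TRACE
lctl debug_daemon start /tmp/lu2619.dbg 300      # 3. start the debug daemon
lctl mark "======= lfs quota ======"             # 4. set a marker in the log
# 5. run the 'lfs quota' command from a client at this point
lctl debug_daemon stop                           # 6. stop the debug daemon
lctl debug_file /tmp/lu2619.dbg /tmp/lu2619.txt  # 7. convert binary log to text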

Comment by D. Marc Stearman (Inactive) [ 06/Nov/13 ]

Applying a patch will take a few weeks. Is it useful to run the above steps without the patch?

Comment by Niu Yawei (Inactive) [ 07/Nov/13 ]

It's better to apply the patch, or you can use systemtap to print these values in osd_acct_index_lookup().

Comment by Prakash Surya (Inactive) [ 08/Nov/13 ]

Sigh.. Systemtap is failing me..

# porter34 /root > stap -v TOSS-27.stp 
Pass 1: parsed user script and 91 library script(s) using 101264virt/26156res/2964shr/24060data kb, in 110usr/0sys/111real ms.
semantic error: failed to retrieve location attribute for 'osd' (dieoffset: 0x8434c): identifier '$osd' at TOSS-27.stp:3:9
        source:                 @cast($osd, "osd_device")->od_svname$,
                                      ^

Pass 2: analyzed script: 1 probe(s), 7 function(s), 0 embed(s), 0 global(s) using 249680virt/34380res/6848shr/27132data kb, in 20usr/20sys/40real ms.
Pass 2: analysis failed.  Try again with another '--vp 01' option.
# porter34 /root > cat TOSS-27.stp 
probe module("osd-zfs").function("osd_acct_index_lookup").return {
        printf("%s: %s: id: %s, ispace: %u, bspace: %u\n", probefunc(),
                @cast($osd, "osd_device")->od_svname$,
                $buf, @cast($rec, "lquota_acct_rec")->ispace,
                @cast($rec, "lquota_acct_rec")->bspace);
}
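The '$osd' lookup fails here because the parameter's location is not resolvable at the return site. One possible workaround (a sketch, untested here) is to snapshot the pointers at function entry with systemtap's @entry():

probe module("osd-zfs").function("osd_acct_index_lookup").return {
        # @entry() evaluates its argument when the function is entered,
        # so the saved pointers survive into the return probe even when
        # their location attributes are unavailable at the return site.
        printf("%s: %s: ispace: %u, bspace: %u\n", probefunc(),
                @cast(@entry($osd), "osd_device")->od_svname$,
                @cast(@entry($rec), "lquota_acct_rec")->ispace,
                @cast(@entry($rec), "lquota_acct_rec")->bspace);
}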
Comment by Niu Yawei (Inactive) [ 11/Nov/13 ]

This problem happened on both the 2.1.4 client and the 2.4 client, right? Is it possible to make a reproducer?

Comment by Prakash Surya (Inactive) [ 12/Nov/13 ]

Yes, we see the same behavior on 2.1 and 2.4 clients. The server is 2.4 only, though. I don't know if the same would happen on a 2.1 server. We have a reproducer, but I think it is dependent on the server returning a "bad" value. Perhaps we can try to reproduce this in a VM setup, using a "fail_loc" on the server to return a bogus value to the client? I haven't tried that, but it might work.

Comment by Niu Yawei (Inactive) [ 13/Nov/13 ]

What do you mean by using a 'fail_loc' on the server to return a bogus value to the client? I don't think the server is expected to return a bad value.

Comment by Prakash Surya (Inactive) [ 13/Nov/13 ]

Well, I'm still unsure where the bad value is coming from, but my guess is it's coming from the server. I could be wrong, though.

I'm assuming the bad "dqb_curinodes" values uncovered by the client systemtap script are coming from the server; is that not the case?

Comment by Niu Yawei (Inactive) [ 14/Nov/13 ]

I'm assuming the bad "dqb_curinodes" values uncovered by the client systemtap script are coming from the server; is that not the case?

That's quite possible. I'm just not sure of the purpose of using a 'fail_loc' on the server to return a bad value to the client.

Comment by Niu Yawei (Inactive) [ 09/Jul/14 ]

The bogus 'dqb_curinodes' comes from the OST. I'm wondering how it can contribute to the 'files' column of the 'lfs quota' output, because we only collect inode usage on MDTs:

                        /* collect space usage from OSTs */
                        oqctl_tmp->qc_dqblk.dqb_curspace = 0;
                        rc = obd_quotactl(sbi->ll_dt_exp, oqctl_tmp);
                        if (!rc || rc == -EREMOTEIO) {
                                oqctl->qc_dqblk.dqb_curspace =
                                        oqctl_tmp->qc_dqblk.dqb_curspace;
                                oqctl->qc_dqblk.dqb_valid |= QIF_SPACE;
                        }

                        /* collect space & inode usage from MDTs */
                        oqctl_tmp->qc_dqblk.dqb_curspace = 0;
                        oqctl_tmp->qc_dqblk.dqb_curinodes = 0;
                        rc = obd_quotactl(sbi->ll_md_exp, oqctl_tmp);
                        if (!rc || rc == -EREMOTEIO) {
                                oqctl->qc_dqblk.dqb_curspace +=
                                        oqctl_tmp->qc_dqblk.dqb_curspace;
                                oqctl->qc_dqblk.dqb_curinodes =
                                        oqctl_tmp->qc_dqblk.dqb_curinodes;
                                oqctl->qc_dqblk.dqb_valid |= QIF_INODES;
                        } else {
                                oqctl->qc_dqblk.dqb_valid &= ~QIF_SPACE;
                        }

I did some local testing, making the OST return a fake 'curinodes' to the client; however, the client ignored the fake value as expected.

While investigating why the server returns a bogus value, I'd like to verify that the client code you're running wasn't changed by some unexpected patch. Could you show me where to check the client code? (llnl tree? which tag?) Thank you.

Comment by Prakash Surya (Inactive) [ 09/Jul/14 ]

As always, our source and releases are on github: https://github.com/chaos/lustre

As far as which releases were installed on the servers and clients in question, I'll have to ask the admins. Marc Stearman, can you double check this issue is still occurring on Sequoia and report back the version currently installed there?

Comment by D. Marc Stearman (Inactive) [ 09/Jul/14 ]

Yes it is still happening on all of our file systems, so the current tag will work for you.

Comment by Christopher Morrone [ 09/Jul/14 ]

In other words, tag 2.4.2-13chaos.

Comment by Niu Yawei (Inactive) [ 10/Jul/14 ]

Thank you. I didn't see any difference in the client code.

It looks like the debug patch (http://review.whamcloud.com/#/c/8191/) has been applied to the code. Is it possible to capture server logs with D_QUOTA enabled? Then we can see whether the bogus value is returned from osd_acct_index_lookup().

Comment by D. Marc Stearman (Inactive) [ 12/Jan/15 ]

I enabled +quota debugging on the MDS. Then I ran this command:

[root@surface86:~]# lfs quota -u weems2 /p/lscratche
Disk quotas for user weems2 (uid 59519):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
   /p/lscratche      88       0       0       - 18446744073709551462       0       0       -
[root@surface86:~]# 

You can see that the files column is very large. I then dumped the debug logs on the MDS right after that. These are the lines from the quota debugging:

00040000:04000000:5.0F:1421104120.034812:0:13022:0:(qmt_handler.c:65:qmt_get()) $$$ fetch settings qmt:lse-QMT0000 pool:0-md id:59519 enforced:0 hard:0 soft:0 granted:0 time:0 qunit:0 edquot:0 may_rel:0 revoke:0
00040000:04000000:5.0:1421104120.034818:0:13022:0:(qmt_handler.c:65:qmt_get()) $$$ fetch settings qmt:lse-QMT0000 pool:0-dt id:59519 enforced:0 hard:0 soft:0 granted:0 time:0 qunit:0 edquot:0 may_rel:0 revoke:0
00000001:04000000:4.0F:1421104120.068222:0:13022:0:(osd_quota.c:122:osd_acct_index_lookup()) lse-MDT0000: id:e87f, ispace:18446744073709551462, bspace:90112
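(For reference: 18446744073709551462 = 2^64 - 154, the unsigned representation of -154, which is consistent with the inode-accounting counter having been decremented 154 times past zero.)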

Do you need lines from OSTs as well, or just from the MDS?

Comment by Niu Yawei (Inactive) [ 13/Jan/15 ]

From the log we can see the bogus value comes from the MDS, and it's read from the ZAP object we created for inode accounting. Given that this problem happens only for inode accounting, I highly suspect it's related to LU-2435. I think a temporary workaround is to set quota_iused_estimate to 1.
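One way to apply that workaround (a sketch reusing the proc path and pdsh style shown earlier in this ticket; the target host list is an assumption):

pdsh -w porter-mds1 'for f in /proc/fs/lustre/osd-zfs/*/quota_iused_estimate; do echo 1 > "$f"; done'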

Do you need lines from OSTs as well, or just from the MDS?

No, I think MDS log is enough. Thank you.

Comment by D. Marc Stearman (Inactive) [ 13/Jan/15 ]

Thanks, setting quota_iused_estimate to 1 reports a more realistic value:

[root@surface86:~]# lfs quota -u weems2 /p/lscratche
Disk quotas for user weems2 (uid 59519):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
   /p/lscratche      88       0       0       -       1       0       0       -
[root@surface86:~]# 

I will watch the status of LU-2435. Are there any other downsides to setting quota_iused_estimate to 1?

Comment by Niu Yawei (Inactive) [ 16/Jan/15 ]

Are there any other downsides to setting quota_iused_estimate to 1?

With quota_iused_estimate = 1, the reported inode accounting is calculated from the space consumed on the MDT, so it's not as accurate as with quota_iused_estimate = 0.

Comment by D. Marc Stearman (Inactive) [ 20/Jan/15 ]

I think this is a decent workaround for my purposes. I'm happy to close this one, and I will await a fix in LU-2435 that will allow us to scan the file system and fix broken ZAP entries.

Comment by Peter Jones [ 20/Jan/15 ]

ok thanks Marc!
