[LU-9375] llog files have less number of records than they designed Created: 20/Apr/17 Updated: 21/Apr/17 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alexander Boyko | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The default bitmap size for an llog file is 64768, so it can store 64767 records. llog_cat_add fills one plain llog file to its full size and then moves on to the next plain llog file. Right now, this logic appears to be broken.

[root@localhost intelgerrit]# sh lustre/tests/llmount.sh
[root@localhost intelgerrit]# lctl --device lustre-MDT0000 changelog_register
lustre-MDT0000: Registered changelog userid 'cl1'
[root@localhost intelgerrit]# mkdir /mnt/lustre/test
[root@localhost intelgerrit]# lustre/tests/createmany -o /mnt/lustre/test/foo- 64768
 - open/close 10000 (time 1492674148.41 total 5.24 last 1909.71)
 - open/close 20000 (time 1492674153.50 total 10.33 last 1963.04)
 - open/close 30000 (time 1492674158.39 total 15.22 last 2044.79)
 - open/close 40000 (time 1492674163.59 total 20.42 last 1923.54)
 - open/close 50000 (time 1492674168.46 total 25.29 last 2055.37)
 - open/close 60000 (time 1492674173.32 total 30.15 last 2055.01)
total: 64768 open/close in 32.60 seconds: 1986.96 ops/second

[root@localhost ~]# debugfs -R "dump changelog_catalog changelog_catalog" /tmp/lustre-mdt1
debugfs 1.42.13.x3 (26-Dec-2016)
[root@localhost ~]# llog_reader changelog_catalog | tail -n13
#01 (064)id=[0x7:0x1:0x0]:0 path=oi.1/0x1:0x7:0x0
#02 (064)id=[0x8:0x1:0x0]:0 path=oi.1/0x1:0x8:0x0
#03 (064)id=[0x9:0x1:0x0]:0 path=oi.1/0x1:0x9:0x0
#04 (064)id=[0xa:0x1:0x0]:0 path=oi.1/0x1:0xa:0x0
#05 (064)id=[0xb:0x1:0x0]:0 path=oi.1/0x1:0xb:0x0
#06 (064)id=[0xc:0x1:0x0]:0 path=oi.1/0x1:0xc:0x0
#07 (064)id=[0xd:0x1:0x0]:0 path=oi.1/0x1:0xd:0x0
#08 (064)id=[0xe:0x1:0x0]:0 path=oi.1/0x1:0xe:0x0
#09 (064)id=[0xf:0x1:0x0]:0 path=oi.1/0x1:0xf:0x0
#10 (064)id=[0x10:0x1:0x0]:0 path=oi.1/0x1:0x10:0x0
#11 (064)id=[0x11:0x1:0x0]:0 path=oi.1/0x1:0x11:0x0
#12 (064)id=[0x12:0x1:0x0]:0 path=oi.1/0x1:0x12:0x0
#13 (064)id=[0x13:0x1:0x0]:0 path=oi.1/0x1:0x13:0x0

[root@localhost ~]# debugfs -R "dump /O/1/d7/7 plain_1" /tmp/lustre-mdt1
debugfs 1.42.13.x3 (26-Dec-2016)
[root@localhost ~]# llog_reader plain_1 | tail
#11994 (128)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#11995 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#11996 (128)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#11997 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#11998 (128)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#11999 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#12000 (128)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#12001 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#12002 (128)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#12003 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)

[root@localhost ~]# debugfs -R "dump /O/1/d8/8 plain_2" /tmp/lustre-mdt1
debugfs 1.42.13.x3 (26-Dec-2016)
[root@localhost ~]# llog_reader plain_2 | tail
#11627 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#11628 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#11629 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#11630 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#11631 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#11632 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#11633 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#11634 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#11635 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#11636 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)

[root@localhost ~]# debugfs -R "dump /O/1/d16/16 plain_10" /tmp/lustre-mdt1
debugfs 1.42.13.x3 (26-Dec-2016)
[root@localhost ~]# llog_reader plain_10 | tail
#9529 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#9530 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#9531 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#9532 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#9533 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#9534 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#9535 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#9536 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)
#9537 (120)changelog record id:0x0 cr_flags:0x5043 cr_type:CLOSE(0xb)
#9538 (136)changelog record id:0x0 cr_flags:0x5000 cr_type:CREAT(0x1)

So every plain llog file stores about 11k records instead of 64k. |
| Comments |
| Comment by Bruno Faccini (Inactive) [ 20/Apr/17 ] |
|
Well, this seems to be caused by this piece of code in llog_osd_write_rec():

static int llog_osd_write_rec(const struct lu_env *env,
			      struct llog_handle *loghandle,
			      struct llog_rec_hdr *rec,
			      struct llog_cookie *reccookie,
			      int idx, struct thandle *th)
{
.................
	if (loghandle->lgh_max_size > 0 &&
	    lgi->lgi_off >= loghandle->lgh_max_size) {
		CDEBUG(D_OTHER, "llog is getting too large (%u > %u) at %u "
		       DOSTID"\n", (unsigned)lgi->lgi_off,
		       loghandle->lgh_max_size,
		       (int)loghandle->lgh_last_idx,
		       POSTID(&loghandle->lgh_id.lgl_oi));
		/* this is to signal that this llog is full */
		loghandle->lgh_last_idx = LLOG_HDR_BITMAP_SIZE(llh) - 1;
		RETURN(-ENOSPC);
	}
................
This new size limit comes from the following patch:

commit 4724b52bba54ccdb0f81d0c63010b69e87e7f65c
Author: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Date: Mon Jan 18 09:24:19 2016 +0300
LU-6838 llog: limit file size of plain logs
on small filesystems plain log can grow dramatically. especially
given large record sizes produced by DNE and extended chunksize.
I saw >50% of space consumed by a single llog file which was still
in use. this leads to test failures (sanityn, etc).
the patch introduces additional limit on plain llog size, which
is calculated as <free space>/64 (128MB at most) at llog creation
time.
Change-Id: I0eab8177d4e416a32a6aab56d47e4142c81d13de
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/18028
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Thus, I believe you get such small plain llogs because the MDT device created by llmount.sh is very small. The number of entries differs from file to file because the ChangeLog records being written vary in size. |
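As a rough cross-check of the explanation above, the sketch below estimates how many records fit in a plain llog once a byte-size cap applies. `est_llog_records` and `BITMAP_RECORDS` are hypothetical names used only for illustration, not Lustre symbols; the estimate ignores the llog header and assumes a constant average record size.

```c
#include <stdint.h>

/* Usable record slots per llog, as quoted in the ticket (64768-bit bitmap). */
#define BITMAP_RECORDS 64767u

/*
 * Hypothetical helper (not a Lustre function): rough number of records a
 * plain llog can take before its write offset reaches max_size bytes.
 * Ignores the llog header and assumes a constant average record size.
 */
uint32_t est_llog_records(uint64_t max_size, uint32_t avg_rec_size)
{
	uint64_t by_size;

	if (avg_rec_size == 0)
		return 0;

	by_size = max_size / avg_rec_size;

	/* The bitmap still bounds the count when the size cap is generous. */
	return by_size < BITMAP_RECORDS ? (uint32_t)by_size : BITMAP_RECORDS;
}
```

With a 124-byte average (alternating 128-byte CREAT and 120-byte CLOSE records, as in plain_1), a 2 MiB cap would allow about 16.9k records. The roughly 12k records observed correspond to about 1.4 MiB, which would be consistent with a cap derived from free space / 64 on a very small MDT, though this is only an estimate.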
| Comment by Alexander Boyko [ 21/Apr/17 ] |
|
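Bruno Faccini, thanks for the clarification, you are right. But the commit message doesn't match the calculation in the code. The limit starts at 2 MB and is only raised when free space / 64 exceeds 128 MB, so it seems that we need more than 8 GB (128 MB x 64 = 8192 MB) of free space to get past the 2 MB limit on a plain log.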
	loghandle->lgh_max_size = 2 << 20;
	dt = lu2dt_dev(cathandle->lgh_obj->do_lu.lo_dev);
	rc = dt_statfs(env, dt, &lgi->lgi_statfs);
	if (rc == 0 && lgi->lgi_statfs.os_bfree > 0) {
		__u64 freespace = (lgi->lgi_statfs.os_bfree *
				   lgi->lgi_statfs.os_bsize) >> 6;
		if (freespace < loghandle->lgh_max_size)
			loghandle->lgh_max_size = freespace;
		/* shouldn't be > 128MB in any case?
		 * it's 256K records of 512 bytes each */
		if (freespace > (128 << 20))
			loghandle->lgh_max_size = 128 << 20;
	}
|
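To make the mismatch with the commit message concrete, here is a minimal stand-alone model of the quoted calculation. `plain_llog_max_size` and the `MODEL_*` constants are hypothetical names for illustration only; the block and free-space values in the usage note are assumptions, not measurements from the ticket.

```c
#include <stdint.h>

#define MODEL_DEFAULT_MAX (2ULL << 20)   /* 2 MiB starting value in the quoted code */
#define MODEL_HARD_MAX    (128ULL << 20) /* 128 MiB cap in the quoted code */

/*
 * Hypothetical model (not a Lustre function) of the quoted snippet:
 * bfree is the number of free blocks, bsize the block size in bytes.
 */
uint64_t plain_llog_max_size(uint64_t bfree, uint64_t bsize)
{
	uint64_t max_size = MODEL_DEFAULT_MAX;

	if (bfree > 0) {
		/* free space divided by 64 */
		uint64_t freespace = (bfree * bsize) >> 6;

		/* Small filesystems: shrink the limit below 2 MiB. */
		if (freespace < max_size)
			max_size = freespace;
		/* Only above 8 GiB free does the limit jump to 128 MiB. */
		if (freespace > MODEL_HARD_MAX)
			max_size = MODEL_HARD_MAX;
	}
	return max_size;
}
```

Assuming 4096-byte blocks, 64 MiB free gives a 1 MiB limit, 1 GiB free still gives 2 MiB, and the limit only becomes 128 MiB once free space exceeds 8 GiB. Under this model the limit is never a value strictly between 2 MiB and 128 MiB, which differs from the "free space / 64 (128MB at most)" wording of the commit message.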