[LU-5405] Performance issue while using robinhood in changelog mode Created: 24/Jul/14 Updated: 01/Mar/17 Resolved: 04/Sep/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.3 |
| Fix Version/s: | Lustre 2.7.0, Lustre 2.5.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Bruno Travouillon (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: | Kernel: kernel-2.6.32-431.11.2.el6.x86_64 w/ patch for Lustre |
| Attachments: |
| Issue Links: |
| Epic/Theme: | changelog |
| Severity: | 3 |
| Rank (Obsolete): | 15041 |
| Description |
|
We are facing a performance issue when using robinhood in changelog mode on top of a Lustre filesystem. The more the changelog size increases, the more time it takes to clear the records.

Production started on 2014/06/19, after an update from Lustre 2.1 to Lustre 2.4.3.

The evolution of the read speed is shown in figure robinhood-readspeed.png.

The performance of the changelog clear is also available in this logfile.

The service has been restarted on 2014/07/16 11:11:50, with no improvement.

The customer was suspecting the performance of the mysql backend storage to be the cause. By looking at the changelog, we observed that the MDS was still clearing records, at a rate of 30 records/sec.

In the meantime, by looking at the MDS, I found out that a kernel thread was keeping one CPU busy. The function llog_cat_id2handle() makes heavy use of this CPU:

Samples: 27K of event 'cycles', Event count (approx.): 2347474354
46.45% [obdclass] [k] llog_cat_id2handle
37.66% [kernel] [k] _spin_lock_irqsave
4.34% [kernel] [k] _spin_lock
0.71% [kernel] [k] find_busiest_group
0.32% [kernel] [k] _spin_unlock_irqrestore
0.26% [ldiskfs] [k] 0x0000000000043778
0.24% [lvfs] [k] lprocfs_counter_add
0.16% [kernel] [k] schedule
0.16% [mlx4_ib] [k] 0x0000000000000c34
0.15% [kernel] [k] thread_return
0.14% [kernel] [k] apic_timer_interrupt
0.14% [obdclass] [k] class_handle2object
0.12% [ko2iblnd] [k] kiblnd_find_peer_locked
0.12% [kernel] [k] menu_select
0.11% [kernel] [k] _spin_lock_irq
0.11% [ko2iblnd] [k] kiblnd_scheduler
0.11% [kernel] [k] find_next_bit
0.10% [kernel] [k] fair_update_curr
0.10% [kernel] [k] memcpy
0.09% [kernel] [k] kfree
...
In the context of clearing records, the function seems to be called by [...]. We still see the same rate of 30 rec/sec.

The bottleneck seems to be in this portion of code (lustre/obdclass/llog_cat.c):

155 int llog_cat_id2handle(const struct lu_env *env, struct llog_handle *cathandle,
156                        struct llog_handle **res, struct llog_logid *logid)
157 {
158         struct llog_handle *loghandle;
...
166         down_write(&cathandle->lgh_lock);
167         cfs_list_for_each_entry(loghandle, &cathandle->u.chd.chd_head,   <== bottleneck ?
168                                 u.phd.phd_entry) {
169                 struct llog_logid *cgl = &loghandle->lgh_id;
170
171                 if (ostid_id(&cgl->lgl_oi) == ostid_id(&logid->lgl_oi) &&
                        ==> ostid_id: /* extract OST objid from a wire ost_id (id/seq) pair */
172                     ostid_seq(&cgl->lgl_oi) == ostid_seq(&logid->lgl_oi)) {
                        ==> ostid_seq: /* extract OST sequence (group) from a wire ost_id (id/seq) pair */
                        ==> check if it is the same log object (id/seq)
173                         if (cgl->lgl_ogen != logid->lgl_ogen) {
...
179                         }
180                         loghandle->u.phd.phd_cat_handle = cathandle;
181                         up_write(&cathandle->lgh_lock);
182                         GOTO(out, rc = 0);
...
212 out:
213         llog_handle_get(loghandle);
214         *res = loghandle;
215         return 0;
216 }

Finally, robinhood has been restarted on 2014/07/22 16:46:19, but in scan mode.

The changelog has not been deregistered yet, we are keeping it for testing. |
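To make the cost of that loop concrete, here is a small standalone model (illustration only, not Lustre code; the structure, names and numbers are invented): the catalogue keeps one handle per opened plain llog on a linked list, and each clear walks the list from the head until it reaches the matching handle. If the handle it needs sits near the tail, every single record clear pays a near-full scan of the list.

/*
 * Standalone illustration only -- not Lustre code.  It models the
 * catalogue's list of opened plain-llog handles and counts how many
 * comparisons a head-first walk needs to reach a handle near the tail.
 */
#include <stdio.h>
#include <stdlib.h>

struct handle {
        long            id;     /* stands in for the llog logid */
        struct handle  *next;
};

/* walk from the head until the matching id is found, as in the excerpt above */
static long lookup_cost(struct handle *head, long id)
{
        struct handle *h;
        long steps = 0;

        for (h = head; h != NULL; h = h->next) {
                steps++;
                if (h->id == id)
                        break;
        }
        return steps;
}

int main(void)
{
        const long nhandles = 6000;     /* thousands of plain llogs kept open */
        struct handle *head = NULL, *h;
        long i;

        /* build the list so that the most recently opened handle is at the tail */
        for (i = nhandles - 1; i >= 0; i--) {
                h = malloc(sizeof(*h));
                h->id = i;
                h->next = head;
                head = h;
        }

        /* the handle needed for the clear was opened last, so it sits at the tail */
        printf("comparisons per cleared record: %ld\n",
               lookup_cost(head, nhandles - 1));        /* prints 6000 */

        while (head != NULL) {
                h = head->next;
                free(head);
                head = h;
        }
        return 0;
}

The walk length grows linearly with the number of opened plain llogs, which matches the observation that clearing records gets slower as the changelog grows.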
| Comments |
| Comment by Bruno Travouillon (Inactive) [ 24/Jul/14 ] |
|
I forgot to mention that I read the content of the changelog.

# lfs changelog scratch3-MDT0000 > scratch3-MDT0000.changelog.2
# ls -l scratch3-MDT0000.changelog.2
-rw-r--r-- 1 root root 46577682272 Jul 18 18:47 scratch3-MDT0000.changelog.2

First entries were from 19:46:00.582365844 2014.07.04. There are 436,211,167 records in the file. |
| Comment by Peter Jones [ 25/Jul/14 ] |
|
Lai, could you please advise on this issue? Thanks, Peter |
| Comment by Lai Siyao [ 28/Jul/14 ] |
|
The debug log doesn't show any errors, and as you said the changelog is quite large. IMO this catalog contains too many llogs, all of which are open, so it takes quite some time to iterate the open handle list to find the right handle for each record. To speed up this iteration, it should be done backward, since the latest opened handle is at the end of the list. I'll commit a patch soon. |
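A minimal sketch of that idea, written against the llog_cat_id2handle() excerpt quoted in the description (illustration only: it assumes libcfs provides cfs_list_for_each_entry_reverse mirroring the kernel's list_for_each_entry_reverse, and the actual change in the patch below may be structured differently):

        down_write(&cathandle->lgh_lock);
        /* walk the opened-handle list from the tail, where the most
         * recently opened plain llog sits, instead of from the head */
        cfs_list_for_each_entry_reverse(loghandle, &cathandle->u.chd.chd_head,
                                        u.phd.phd_entry) {
                struct llog_logid *cgl = &loghandle->lgh_id;

                /* same id/seq/gen match as in the original loop */
                if (ostid_id(&cgl->lgl_oi) == ostid_id(&logid->lgl_oi) &&
                    ostid_seq(&cgl->lgl_oi) == ostid_seq(&logid->lgl_oi)) {
                        if (cgl->lgl_ogen != logid->lgl_ogen) {
                                /* generation mismatch handled as in the
                                 * original (body elided in the excerpt) */
                                continue;
                        }
                        loghandle->u.phd.phd_cat_handle = cathandle;
                        up_write(&cathandle->lgh_lock);
                        GOTO(out, rc = 0);
                }
        }
        up_write(&cathandle->lgh_lock);
        /* not found on the list: open the plain llog as before (elided) */

With the handle for the llog currently being processed sitting at the tail, the reverse walk finds it after a handful of comparisons instead of thousands.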
| Comment by Lai Siyao [ 06/Aug/14 ] |
|
Patch is at http://review.whamcloud.com/#/c/11338/, could you test with it? Sorry for the late update; I just got back from vacation. |
| Comment by Lai Siyao [ 13/Aug/14 ] |
|
Each llog contains 60k records, so for your changelogs (400m records) there are almost 6k open logs. I've verified that http://review.whamcloud.com/#/c/11338/ works on my local test system with 2m records; it can always find the log from the head of the opened log handle list. |
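As a rough check of that scale (taking the ~60k records per llog figure above at face value):

400,000,000 records / ~60,000 records per llog ≈ 6,700 plain llogs

so the catalogue's opened-handle list is several thousand entries long, and that is the list the original head-first walk traverses for every cleared record.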
| Comment by Li Xi (Inactive) [ 25/Aug/14 ] |
|
Patch for master: |
| Comment by Peter Jones [ 04/Sep/14 ] |
|
Landed for 2.7 |
| Comment by Aurelien Degremont (Inactive) [ 21/Sep/14 ] |
|
I think this fix should also be considered for 2.5.4 |