[LU-5225] Client is evicted by multiple OSTs on all OSSs Created: 18/Jun/14 Updated: 04/Aug/14 Resolved: 04/Aug/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Bruno Faccini (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 14559 |
| Description |
|
As part of, having a look at the info provided (ftp.whamcloud.com/uploads/ the reason why the Client's RPCs are not sent anymore cannot be found using only the Lustre debug log level (dlmtrace) on the Client, but I can see that during the Client's eviction process/handling these RPCs were on the delayed queue. |
| Comments |
| Comment by Bruno Faccini (Inactive) [ 18/Jun/14 ] |
|
James, can you please update this ticket with any interesting details you may have? |
| Comment by Bruno Faccini (Inactive) [ 18/Jun/14 ] |
|
Also, it would be helpful to have a precise description of the configuration/reproducer you use, like:
_ do you still use the "single shared file IOR job" you indicated in ?
_ in the sets of logs you provided, it seems you use 22 Clients, and your Lustre filesystem has 4 OSSs, each of them managing hundreds of OSTs (round-robin indexed?). Right?
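For reference, the Client-side view of the OST/OSS layout can be dumped with standard tools (a minimal sketch; /lustre/tick is assumed as the mount point, taken from the job paths in the next comment):

lfs df -h /lustre/tick                  # one line per MDT/OST, with index and usage
lctl dl | grep osc                      # one osc device per OST this Client is connected to
lctl get_param osc.*.ost_conn_uuid      # which OSS connection each OST currently uses
|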
| Comment by James A Simmons [ 18/Jun/14 ] |
|
The reproducers are IOR and simul, both running jobs generating single shared files. They are executed on a single Cray cage system (22 nodes) with one router between the file system and the computes. That router is used as a bridge between the Gemini interconnect and the InfiniBand fabric. For each job a test directory is created with the current maximum stripe count for the OSTs available. That current count is 21 OSTs, so the directory is created with 20 stripes. This is the case for all job sets.

The commands to execute the jobs are:

aprun -n 288 /lustre/tick/stf008/scratch/jsimmons/IOR -a POSIX -i 5 -C -v -g -w -r -e -b 256m -t 4m -o /lustre/tick/stf008/scratch/jsimmons/test_ior/testfile.out

aprun -n 112 /lustre/tick/stf008/scratch/jsimmons/simul -d /lustre/tick/stf008/scratch/jsimmons/test_simul

For aprun, -n # is the number of cores to use. Since each compute has 16 cores, this means, for example, that the 288-core job uses only 18 nodes in total. The simul job uses only 7 nodes. All cores are used on each node.

The file system itself is made of 4 MDSs with a single MDT per MDS. The back-end storage array is a DDN 10K attached to 3 OSSs, with each OSS having 7 OSTs mapped to it, bringing the total number of OSTs to 21. Each MDT is formatted to 30GB and each OST to 7.1TB. The clients are running a 2.6 version from about 2 weeks ago and the file system is running 2.5.60 from last Friday.
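For reproduction elsewhere, the striped test directory can be recreated before launching the jobs (a minimal sketch; the paths and the 20-stripe count are taken from the description above):

mkdir -p /lustre/tick/stf008/scratch/jsimmons/test_ior
lfs setstripe -c 20 /lustre/tick/stf008/scratch/jsimmons/test_ior   # new files in this directory inherit 20 stripes
lfs getstripe -d /lustre/tick/stf008/scratch/jsimmons/test_ior      # verify the directory's default layout
|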
| Comment by Bruno Faccini (Inactive) [ 18/Jun/14 ] |
|
Humm, only 3 OSSs?? But you provided dmesg/debug-logs for 4 OSSs, where we see the same Client being evicted by all 4 at about the same time:

[root@client-17 20140609-run1]# ls -l
total 14438608
-rw-r--r-- 1 2004 2647      52742 Jun 9 08:06 clients-20140609
-rw-r--r-- 1 2004 2647          0 Jun 9 08:13 dmesg-mds1-20140609
-rw-r--r-- 1 2004 2647       1414 Jun 9 08:14 dmesg-oss1-20140609
-rw-r--r-- 1 2004 2647       1928 Jun 9 08:14 dmesg-oss2-20140609
-rw-r--r-- 1 2004 2647       2604 Jun 9 08:14 dmesg-oss3-20140609
-rw-r--r-- 1 2004 2647       1296 Jun 9 08:14 dmesg-oss4-20140609
-rw-r--r-- 1 2004 2647   18560470 Jun 9 08:32 lustre-log.c0-0c0s0n2.txt
-rw-r--r-- 1 2004 2647 2293127795 Jun 9 08:32 lustre-log.c0-0c0s1n0.txt
-rw-r--r-- 1 2004 2647 2292348834 Jun 9 08:33 lustre-log.c0-0c0s1n1.txt
-rw-r--r-- 1 2004 2647 2292352991 Jun 9 08:33 lustre-log.c0-0c0s1n2.txt
-rw-r--r-- 1 2004 2647 2292326850 Jun 9 08:34 lustre-log.c0-0c0s1n3.txt
-rw-r--r-- 1 2004 2647     584732 Jun 9 08:34 lustre-log.c0-0c0s2n2.txt
-rw-r--r-- 1 2004 2647 1442967552 Jun 9 08:34 lustre-log.c0-0c0s3n0.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:35 lustre-log.c0-0c0s3n1.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:35 lustre-log.c0-0c0s3n2.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s3n3.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s5n0.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s5n1.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s5n2.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s5n3.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s6n0.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s6n1.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s6n2.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s6n3.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s7n0.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s7n1.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s7n2.txt
-rw-r--r-- 1 2004 2647          0 Jun 9 08:36 lustre-log.c0-0c0s7n3.txt
-rw-r--r-- 1 2004 2647  406375578 Jun 9 08:13 mds1-log-20140609
-rw-r--r-- 1 2004 2647  944812799 Jun 9 08:13 oss1-log-20140609
-rw-r--r-- 1 2004 2647  939348575 Jun 9 08:12 oss2-log-20140609
-rw-r--r-- 1 2004 2647  905196600 Jun 9 08:08 oss3-log-20140609
-rw-r--r-- 1 2004 2647  942542266 Jun 9 08:08 oss4-log-20140609
[root@client-17 20140609-run1]# grep -A 1 -i evict dmesg-oss*
dmesg-oss1-20140609:[251714.315118] LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 148s: evicting client at 28@gni1 ns: filter-tickfs-OST0000_UUID lock: ffff88059326c7c0/0x2004179f53efbf08 lrc: 3/0,0 mode: PR/PR res: [0x3e9b:0x0:0x0].0 rrc: 13 type: EXT [469762048->585105407] (req 469762048->470810623) flags: 0x60000000010020 nid: 28@gni1 remote: 0xb50852018d8f69d9 expref: 50 pid: 30470 timeout: 4546331352 lvb_type: 0
dmesg-oss1-20140609-[251714.355075] LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 608 previous similar messages
--
dmesg-oss2-20140609:[251704.627677] LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 154s: evicting client at 28@gni1 ns: filter-tickfs-OST0015_UUID lock: ffff88057b31abc0/0xc23127a58a2b8bb6 lrc: 3/0,0 mode: PR/PR res: [0x3e3b:0x0:0x0].0 rrc: 10 type: EXT [412090368->479199231] (req 412090368->413138943) flags: 0x60000000010020 nid: 28@gni1 remote: 0xb50852018d8f5e8e expref: 65 pid: 19858 timeout: 4546325396 lvb_type: 0
dmesg-oss2-20140609-[251704.667664] LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 508 previous similar messages
--
dmesg-oss3-20140609:[251707.739428] LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 147s: evicting client at 28@gni1 ns: filter-tickfs-OST0002_UUID lock: ffff88056f8dc2c0/0x5619018c38d12223 lrc: 3/0,0 mode: PR/PR res: [0x3e99:0x0:0x0].0 rrc: 25 type: EXT [575668224->920649727] (req 575668224->576716799) flags: 0x60000000010020 nid: 28@gni1 remote: 0xb50852018d8f6c33 expref: 80 pid: 21886 timeout: 4546330359 lvb_type: 0
dmesg-oss3-20140609-[251707.779405] LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 406 previous similar messages
--
dmesg-oss4-20140609:[251704.795841] LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 399s: evicting client at 28@gni1 ns: filter-tickfs-OST0013_UUID lock: ffff880511d64ac0/0x12a297bb77a5f496 lrc: 3/0,0 mode: PR/PR res: [0x3e59:0x0:0x0].0 rrc: 2 type: EXT [402653184->614465535] (req 402653184->403701759) flags: 0x60000000010020 nid: 28@gni1 remote: 0xb50852018d8eefc2 expref: 62 pid: 2561 timeout: 4546325612 lvb_type: 0
dmesg-oss4-20140609-[251704.835633] LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 491 previous similar messages
[root@client-17 20140609-run1]#

where I was misled by the number of "lock callback timer" messages into thinking of "thousands" of OSTs, when it actually reflects the number of locks...

BTW, I see 25 OSTs and not 21:

[root@client-17 20140609-run1]# grep -i evict lustre-log.c0-0c0s1n2.txt | head -1
00000100:02020000:31.0:1402326083.337206:0:2533:0:(import.c:1370:ptlrpc_import_recovery_state_machine()) 167-0: tickfs-OST0000-osc-ffff88060e906400: This client was evicted by tickfs-OST0000; in progress operations using this service will fail.
[root@client-17 20140609-run1]# grep -i evict lustre-log.c0-0c0s1n2.txt.end | awk '{print $9}' | sort
tickfs-OST0000;
tickfs-OST0001;
tickfs-OST0002;
tickfs-OST0003;
tickfs-OST0004;
tickfs-OST0005;
tickfs-OST0006;
tickfs-OST0007;
tickfs-OST0008;
tickfs-OST0009;
tickfs-OST000a;
tickfs-OST000b;
tickfs-OST000c;
tickfs-OST000d;
tickfs-OST000e;
tickfs-OST000f;
tickfs-OST0010;
tickfs-OST0011;
tickfs-OST0012;
tickfs-OST0013;
tickfs-OST0014;
tickfs-OST0015;
tickfs-OST0016;
tickfs-OST0017;
tickfs-OST0018;
[root@client-17 20140609-run1]#
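The same field can be used to count the distinct evicting OSTs directly (a sketch reusing the awk column above):

grep -i "was evicted by" lustre-log.c0-0c0s1n2.txt.end | awk '{print $9}' | sort -u | wc -l   # prints 25 for the listing above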
|
| Comment by James A Simmons [ 18/Jun/14 ] |
|
Sorry, you are right. Using 3 OSSs is the most recent setup, since I lost one OSS this past week. The original setup was 4 OSSs with 7 OSTs per OSS, so it should be 28 OSTs in total. |
| Comment by Bruno Faccini (Inactive) [ 24/Jun/14 ] |
|
James, I wonder if, during some dedicated time, you could get a Client crash-dump (with a resized debug buffer and full debug trace enabled, or at least "net+dlmtrace+rpctrace" in addition to the default mask) when such an eviction scenario occurs? This can be automated if you add http://review.whamcloud.com/#/c/8875/ on top of your Client's distro.
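For reference, the requested debug settings can be applied on the Client ahead of the run with lctl (a minimal sketch; the 512 MB buffer size is only an example value):

lctl set_param debug_mb=512                         # resize the debug buffer
lctl set_param debug="+net +dlmtrace +rpctrace"     # add the extra masks on top of the default mask
lctl get_param debug                                # verify the resulting debug mask
|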
| Comment by James A Simmons [ 30/Jun/14 ] |
|
So I updated my 2.6 tree to what is current and tried to duplicate the problem on my small-scale system. I couldn't, so I moved to a larger system (not Titan) and saw the same thing. I think it is the patch from |
| Comment by James A Simmons [ 02/Jul/14 ] |
|
So far the results on my small-scale system are very promising using the patch from |
| Comment by Bruno Faccini (Inactive) [ 03/Jul/14 ] |
|
James, |
| Comment by James A Simmons [ 07/Jul/14 ] |
|
It was the testing with 2.5.60 clients. When I updated the clients to a newer version and could not reproduce the problem, I figured some patch that landed had fixed the problem. So I examined the list of merged patches since the broken client. The only one that made sense was

1) Since I was seeing evictions from the OSTs, it makes sense that a possible source of the problem could be the osc layer.

2) The patch deals with a deadlock issue, which could explain why I saw evictions.

It seems I'm familiar enough with the code to make a good enough educated guess at what will fix my problems. I have been testing the
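For what it's worth, the patch examination described above can be redone from the client git tree (a sketch; the 2.5.60 tag name and the focus on lustre/osc are assumptions):

git log --oneline 2.5.60..HEAD -- lustre/osc    # list patches touching the osc layer since the broken client
|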
| Comment by Bruno Faccini (Inactive) [ 08/Jul/14 ] |
|
> 2) The patch deals with a deadlock issue, which could explain why I saw evictions. |
| Comment by Bruno Faccini (Inactive) [ 25/Jul/14 ] |
|
James, is this still working for you? |
| Comment by James A Simmons [ 25/Jul/14 ] |
|
Yes, this is still working for me. You can close it. If I have any problems in next week's test shot I will open another ticket. |
| Comment by Bruno Faccini (Inactive) [ 04/Aug/14 ] |
|
Dup of |