[LU-180] Sometimes evicted clients never reconnect Created: 30/Mar/11  Updated: 19/Apr/11  Resolved: 19/Apr/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.0.0
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Sebastien Buisson (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Duplicate Votes: 0
Labels: None

Attachments: File client.log.gz     Text File deadlock-trace.log     File kay297     File kay3    
Severity: 3
Rank (Obsolete): 8539

 Description   

Hi,

On our benchmarking cluster we are seeing very strange Lustre behavior: clients that are evicted by an OST sometimes never reconnect. As a consequence we end up with many defunct processes on the affected compute nodes, and we have to reboot them.

For instance, on one compute node we can see that an OST connection is missing:

[root@kay297 ~]# lfs df
UUID 1K-blocks Used Available Use% Mounted on
scratch-MDT0000_UUID 983490512 887336 982603176 0% /scratch_lustre[MDT:0]
scratch-OST0000_UUID 7691221300 547721748 7143497248 7% /scratch_lustre[OST:0]
scratch-OST0001_UUID 7691221300 548640356 7142579792 7% /scratch_lustre[OST:1]
scratch-OST0002_UUID 7691221300 555585984 7135634108 7% /scratch_lustre[OST:2]
scratch-OST0003_UUID 7691221300 551102404 7140118212 7% /scratch_lustre[OST:3]
scratch-OST0004_UUID 7691221300 569235872 7121985356 7% /scratch_lustre[OST:4]
scratch-OST0005_UUID 7691221300 565971400 7125210680 7% /scratch_lustre[OST:5]
scratch-OST0006_UUID 7691221300 551380176 7139839904 7% /scratch_lustre[OST:6]
scratch-OST0007_UUID 7691221300 552060560 7139160472 7% /scratch_lustre[OST:7]
scratch-OST0008_UUID 7691221300 540060736 7151160480 7% /scratch_lustre[OST:8]
scratch-OST0009_UUID 7691221300 542803928 7148417244 7% /scratch_lustre[OST:9]
scratch-OST000a_UUID 7691221300 549910932 7141309212 7% /scratch_lustre[OST:10]
scratch-OST000b_UUID 7691221300 553465732 7137754416 7% /scratch_lustre[OST:11]
scratch-OST000c_UUID 7691221300 547134756 7144086380 7% /scratch_lustre[OST:12]
scratch-OST000d_UUID 7691221300 542512828 7148708296 7% /scratch_lustre[OST:13]
scratch-OST000e_UUID 7691221300 540940940 7150278116 7% /scratch_lustre[OST:14]
scratch-OST000f_UUID 7691221300 552187304 7139031756 7% /scratch_lustre[OST:15]
scratch-OST0010_UUID 7691221300 553010540 7138207368 7% /scratch_lustre[OST:16]
scratch-OST0011_UUID 7691221300 549111608 7142109332 7% /scratch_lustre[OST:17]
OST0012 : inactive device
scratch-OST0013_UUID 7691221300 545678392 7145547784 7% /scratch_lustre[OST:19]
scratch-OST0014_UUID 7691221300 545673392 7145547784 7% /scratch_lustre[OST:20]
scratch-OST0015_UUID 7691221300 553029372 7138191732 7% /scratch_lustre[OST:21]
scratch-OST0016_UUID 7691221300 578557784 7112659292 7% /scratch_lustre[OST:22]
scratch-OST0017_UUID 7691221300 553574948 7137640080 7% /scratch_lustre[OST:23]
scratch-OST0018_UUID 7691221300 593382232 7097838936 7% /scratch_lustre[OST:24]
scratch-OST0019_UUID 7691221300 550952100 7140232336 7% /scratch_lustre[OST:25]
scratch-OST001a_UUID 7691221300 604897244 7086322904 7% /scratch_lustre[OST:26]
scratch-OST001b_UUID 7691221300 545086976 7146133184 7% /scratch_lustre[OST:27]
scratch-OST001c_UUID 7691221300 550491540 7140725472 7% /scratch_lustre[OST:28]
scratch-OST001d_UUID 7691221300 542008368 7149211712 7% /scratch_lustre[OST:29]

filesystem summary: 215354196400 16076170152 199823591804 7% /scratch_lustre
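
To see which OSC device the inactive index maps to, and what state its import is in, the device list and the per-OSC import file can be checked directly. A minimal sketch, assuming the standard osc layout under /proc/fs/lustre:

lctl dl | grep OST0012
cat /proc/fs/lustre/osc/scratch-OST0012-osc-*/import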

The phenomenon is erratic: it does not always affect the same clients or the same OSTs.

I attach the syslogs of kay297 (client) and kay3 (OSS).



 Comments   
Comment by Sebastien Buisson (Inactive) [ 31/Mar/11 ]

Hi,

After less than 24 hours we have a new occurrence of this issue. The client node is kay310 (10.17.1.211@o2ib) and the OST is scratch-OST000b hosted by OSS kay2.

[root@kay310 ~]# lctl dl
0 UP mgc MGC10.17.0.2@o2ib 1d6fa677-53e6-5478-7b63-13c135edb2cd 5
1 UP lov scratch-clilov-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 4
2 UP lmv scratch-clilmv-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 4
3 UP mdc scratch-MDT0000-mdc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
4 UP osc scratch-OST0016-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
5 UP osc scratch-OST0017-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
6 UP osc scratch-OST001d-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
7 UP osc scratch-OST0010-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
8 UP osc scratch-OST0003-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
9 UP osc scratch-OST0015-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
10 UP osc scratch-OST0005-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
11 UP osc scratch-OST0009-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
12 UP osc scratch-OST0007-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
13 UP osc scratch-OST0006-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
14 UP osc scratch-OST000d-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
15 UP osc scratch-OST0002-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
16 UP osc scratch-OST000e-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
17 UP osc scratch-OST0004-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
18 UP osc scratch-OST000f-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
19 UP osc scratch-OST0008-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
20 UP osc scratch-OST001b-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
21 IN osc scratch-OST000b-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
22 UP osc scratch-OST0019-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
23 UP osc scratch-OST0012-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
24 UP osc scratch-OST000a-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
25 UP osc scratch-OST0011-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
26 UP osc scratch-OST000c-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
27 UP osc scratch-OST0001-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
28 UP osc scratch-OST001c-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
29 UP osc scratch-OST0014-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
30 UP osc scratch-OST0018-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
31 UP osc scratch-OST001a-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
32 UP osc scratch-OST0000-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5
33 UP osc scratch-OST0013-osc-ffff8803313ff000 0d84de33-3370-ded5-56b3-420997b67d2c 5

[root@kay310 ~]# cat /proc/fs/lustre/osc/scratch-OST000b-osc-ffff8803313ff000/import
import:
name: scratch-OST000b-osc-ffff8803313ff000
target: scratch-OST000b_UUID
state: EVICTED
connect_flags: [write_grant, server_lock, version, request_portal, truncate_lock, max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize, alt_checksum_algorithm, version_recovery, full20]
import_flags: [invalid, replayable, pingable]
connection:
failover_nids: [10.17.0.3@o2ib]
current_connection: 10.17.0.3@o2ib
connection_attempts: 2
generation: 2
in-progress_invalidations: 1
rpcs:
inflight: 0
unregistering: 0
timeouts: 0
avg_waittime: 167693 usec
service_estimates:
services: 1 sec
network: 1 sec
transactions:
last_replay: 0
peer_committed: 4514305
last_checked: 4514305
read_data_averages:
bytes_per_rpc: 1046287
usec_per_rpc: 57030
MB_per_sec: 18.34
write_data_averages:
bytes_per_rpc: 1046268
usec_per_rpc: 585258
MB_per_sec: 1.78
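
A quicker way to spot a stuck import, without knowing the affected OST in advance, is to scan every OSC import on the client for a non-FULL state. A minimal sketch, assuming the same /proc layout as above:

grep 'state:' /proc/fs/lustre/osc/*/import | grep -v FULL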

[buissons@kay2 ~]$ lctl dl
0 UP mgc MGC10.17.0.2@o2ib b612ff68-0ab5-593c-d7ca-0fc4e5087a57 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter scratch-OST0006 scratch-OST0006_UUID 157
3 UP obdfilter scratch-OST000b scratch-OST000b_UUID 153
4 UP obdfilter scratch-OST0017 scratch-OST0017_UUID 157
5 UP obdfilter scratch-OST0007 scratch-OST0007_UUID 157
6 UP obdfilter scratch-OST0008 scratch-OST0008_UUID 157
7 UP obdfilter scratch-OST001d scratch-OST001d_UUID 157
8 UP obdfilter scratch-OST000c scratch-OST000c_UUID 157
9 UP obdfilter scratch-OST0003 scratch-OST0003_UUID 157
10 UP obdfilter scratch-OST0004 scratch-OST0004_UUID 157
11 UP obdfilter scratch-OST0016 scratch-OST0016_UUID 157
12 UP obdfilter scratch-OST0005 scratch-OST0005_UUID 155
13 UP obdfilter scratch-OST0010 scratch-OST0010_UUID 157
14 UP obdfilter scratch-OST0002 scratch-OST0002_UUID 157
15 UP obdfilter scratch-OST0015 scratch-OST0015_UUID 157
16 UP obdfilter scratch-OST000a scratch-OST000a_UUID 157

[buissons@kay2 ~]$ cat /proc/fs/lustre/obdfilter/scratch-OST000b/exports/10.17.1.211@o2ib/uuid
[buissons@kay2 ~]$ cat /proc/fs/lustre/obdfilter/scratch-OST000b/exports/10.17.1.211@o2ib/stats
snapshot_time 1301562458.151546 secs.usecs
read_bytes 4109 samples [bytes] 4096 1048576 4299194368
write_bytes 4089 samples [bytes] 57344 1048576 4278190080
get_info 41 samples [reqs]
set_info_async 3 samples [reqs]
disconnect 6 samples [reqs]
sync 6 samples [reqs]
preprw 8198 samples [reqs]
commitrw 8198 samples [reqs]
ping 24659 samples [reqs]

This is very annoying: I tried to reactivate the import on the client, but without success.

[root@kay310 ~]# lctl --device 21 activate
error: activate: failed: Invalid argument

The only option we have is to reboot the impacted Lustre clients, which is not acceptable on a benchmarking cluster.
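
For reference, the import state and flags can be re-checked right after the failed activate, and lctl also has a per-device 'recover' command that asks for a reconnect; whether it helps while the import is stuck like this is untested here. A sketch, using the same device number and path as above:

grep -E 'state:|import_flags:' /proc/fs/lustre/osc/scratch-OST000b-osc-ffff8803313ff000/import
lctl --device 21 recover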

Sebastien.

Comment by Peter Jones [ 31/Mar/11 ]

Niu

Could you look into this one please?

Thanks

Peter

Comment by Sebastien Buisson (Inactive) [ 31/Mar/11 ]

Hi,

This problem seems similar to Bugzilla 21636. The initial description in that ticket is not quite the same, but comment 82 shows exactly the same errors we are suffering from on our benchmarking cluster.

HTH
Sebastien.

Comment by Niu Yawei (Inactive) [ 01/Apr/11 ]

It looks like the invalidate thread never finished the import invalidation job; I suspect there is a deadlock somewhere.

Hi, Sebastien

Could you also collect the stack traces of all threads on the abnormal client when this issue happens (echo t > /proc/sysrq-trigger)? I want to see where the invalidate thread is blocked.
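
For example, a minimal capture sequence (the output file name is only a placeholder):

echo 1 > /proc/sys/kernel/sysrq      # make sure sysrq is enabled
echo t > /proc/sysrq-trigger         # dump all threads' stack traces into the kernel ring buffer
dmesg > /tmp/stack-traces.log        # save them; they usually also end up in /var/log/messages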

Comment by Niu Yawei (Inactive) [ 18/Apr/11 ]

This bug is probably caused by a deadlock in clio, which can cause the client to be evicted; because of the deadlock, the invalidation thread (ll_imp_inval) is then blocked on a semaphore, so the import can never be reactivated.

The deadlock issue is LU-122, and Jay is working on it.

Hi Sebastien, the next time you hit the issue, could you please check the stack traces to see whether they are similar to the trace shown in the attached file? Thanks.
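
A quick way to compare is to pull the invalidation thread's trace out of both logs (the file names below are only examples):

grep -A 20 'll_imp_inval' your-captured-trace.log
grep -A 20 'll_imp_inval' deadlock-trace.log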

Comment by Gregoire Pichon [ 19/Apr/11 ]

There is a new occurrence of the bug.

Attached is the trace log of the client:

  • at the time the client was evicted by the OST
  • with the stack traces of all threads, generated via sysrq-trigger

Let me know if you need any other information from the client node.

Comment by Gregoire Pichon [ 19/Apr/11 ]

The stack trace of the new occurrence looks similar to the one in deadlock-trace.log.

We are going to install the fix described in LU-122 and see whether the problem reproduces.

Comment by Niu Yawei (Inactive) [ 19/Apr/11 ]

Hi, Gregoire

Yes, I checked the log and I believe it is the deadlock problem from LU-122; I am marking this ticket as a duplicate.
