[LU-1287] MDS refuses client connections Created: 05/Apr/12  Updated: 27/Feb/13  Resolved: 27/Feb/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Major
Reporter: Florent Thery (Inactive) Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: None

Attachments: File lu-1287.dmesg.tar     File lustre_mds_ko.gz     File lustre_mds_ok.gz     Text File mdt_failure.log    
Severity: 3
Rank (Obsolete): 8134

 Description   

When configuring a Lustre filesystem with the following setup, the MDS refuses client connections indefinitely:

  • one MDT alone on a node
  • one or more OSTs on other nodes
  • separate MGS
  • filesystem started in the following order (OSTs, MDT)

Once the filesystem is started, the 'mount' command on the client blocks forever.

Please find attached a log from the MDS node.
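
For clarity, here is the triggering sequence in command form (a sketch; device paths, mount points and names are hypothetical placeholders):

mount -t lustre /dev/mgs_dev /mnt/mgs                # separate MGS node
mount -t lustre /dev/ost0_dev /mnt/ost0              # OSTs first, on the OSS nodes
mount -t lustre /dev/mdt_dev /mnt/mdt                # then the MDT
mount -t lustre <mgsnid>@o2ib:/<fsname> /mnt/client  # this mount blocks forever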



 Comments   
Comment by Cliff White (Inactive) [ 05/Apr/12 ]

There does not appear to be a log attached; please provide more details.

Comment by Cliff White (Inactive) [ 05/Apr/12 ]

The Lustre kernel log (lctl dk) is not especially helpful in these situations. Please examine your system logs (typically /var/log/messages) on all the servers and the affected clients; there should be some more helpful messages there. Are you certain the servers are all up cleanly? The log mentions a writeconf - did you restart all servers?

Comment by Gregoire Pichon [ 06/Apr/12 ]

Actually, the issue arises even when not using the writeconf option.
The issue has been reproduced from scratch, as follows:

  • mkfs the MGS
  • mkfs the filesystem
  • mount the MGS
  • mount the OSTs
  • mount the MDT
  • try to mount the client

Attached is a tarball containing the dmesg output on each node.
Thanks.
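
In command form, the sequence is roughly the following (a sketch with hypothetical device paths and fsname; the real format parameters are quoted later in this ticket):

mkfs.lustre --mgs /dev/mgs_dev
mkfs.lustre --fsname=testfs --mdt --index=0 --mgsnode=<mgsnid>@o2ib /dev/mdt_dev
mkfs.lustre --fsname=testfs --ost --index=0 --mgsnode=<mgsnid>@o2ib /dev/ost_dev
mount -t lustre /dev/mgs_dev /mnt/mgs              # mount the MGS
mount -t lustre /dev/ost_dev /mnt/ost0             # mount the OSTs
mount -t lustre /dev/mdt_dev /mnt/mdt              # mount the MDT
mount -t lustre <mgsnid>@o2ib:/testfs /mnt/testfs  # client mount hangs here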

Comment by Florent Thery (Inactive) [ 06/Apr/12 ]

We've been investigating this issue today to figure out exactly when it arises.
We've found that bugzilla #24050 relates to the same issue.
So what we hit is a start-order problem between the MDT and the OSTs.
Starting the MDT before the OSTs on the very first start fixes the issue.

We have a number of questions though:

  • will this issue be fixed at the Lustre code level?
  • is there a section addressing this situation in the Lustre manual?
    (Section 13.2 discusses start order but does not seem to address this situation)
  • at the user level, is there an easy way to troubleshoot this situation (e.g. a log line that helps identify it)?

Thanks
Florent.
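
As a partial answer to the troubleshooting question above: the symptom does leave a recognizable line in the syslog on both sides (see the logs quoted later in this ticket), so a quick check, assuming the default syslog location, could be:

# on the MDS:
grep "temporarily refusing client connection" /var/log/messages
# on the client:
grep "mds_connect operation failed" /var/log/messages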

Comment by Cliff White (Inactive) [ 06/Apr/12 ]

I am sorry, but your logs are extremely short. You say you have re-created the filesystem; I see no logs showing this.
I do see the MDS is refusing the connection: Lustre: b10-MDT0000: temporarily refusing client connection from 60.64.2.24@o2ib
but there really should be more detail there. Please attach the output of tunefs.lustre --print <device> for the MGT, MDT and all OSTs. I would suggest a full cold start of the systems, verifying system health at each step.
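
For reference, collecting that information amounts to running the following on each server (device paths are placeholders):

tunefs.lustre --print /dev/disk/by-id/<mgt-device>
tunefs.lustre --print /dev/disk/by-id/<mdt-device>
tunefs.lustre --print /dev/disk/by-id/<ost-device>   # repeat for every OST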

Comment by Cliff White (Inactive) [ 29/Apr/12 ]

What is your current status? Do you have more information on this issue?

Comment by Sebastien Buisson (Inactive) [ 03/May/12 ]

Hi,

Starting from a full cold start of a system running Lustre 2.1, I reran the test by performing the following steps sequentially:

1. format MGS

At this point, 'tunefs.lustre --print' on the MGT gives:

[root@perou2 ~]# tunefs.lustre --print /dev/disk/by-id/scsi-2003013841aac0025
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

Read previous values:
Target: MGS
Index: unassigned
Lustre FS: mgs
Mount type: ldiskfs
Flags: 0x74
(MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: failover.node=10.3.1.2@o2ib

Permanent disk data:
Target: MGS
Index: unassigned
Lustre FS: mgs
Mount type: ldiskfs
Flags: 0x74
(MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: failover.node=10.3.1.2@o2ib

exiting before disk write.

2. start MGS

3. format MDT

At this point, 'tunefs.lustre --print' on the MDT gives:

[root@perou3 ~]# tunefs.lustre --print /dev/disk/by-id/scsi-2003013841aac002d
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

Read previous values:
Target: fs_mdt-MDT0000
Index: 0
Lustre FS: fs_mdt
Mount type: ldiskfs
Flags: 0x61
(MDT first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: mgsnode=10.5.1.3@o2ib lov.stripesize=1048576 failover.node=10.5.1.3@o2ib network=o2ib0

Permanent disk data:
Target: fs_mdt-MDT0000
Index: 0
Lustre FS: fs_mdt
Mount type: ldiskfs
Flags: 0x61
(MDT first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: mgsnode=10.5.1.3@o2ib lov.stripesize=1048576 failover.node=10.5.1.3@o2ib network=o2ib0

exiting before disk write.

4. format OSTs

At this point, 'tunefs.lustre --print' on the OSTs gives:

[root@perou6 ~]# tunefs.lustre --print /dev/disk/by-id/scsi-2003013841aac0037
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

Read previous values:
Target: fs_mdt-OST0000
Index: 0
Lustre FS: fs_mdt
Mount type: ldiskfs
Flags: 0x62
(OST first_time update )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.5.1.3@o2ib failover.node=10.5.1.6@o2ib network=o2ib0

Permanent disk data:
Target: fs_mdt-OST0000
Index: 0
Lustre FS: fs_mdt
Mount type: ldiskfs
Flags: 0x62
(OST first_time update )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.5.1.3@o2ib failover.node=10.5.1.6@o2ib network=o2ib0

exiting before disk write.

[root@perou6 ~]# tunefs.lustre --print /dev/disk/by-id/scsi-2003013841aac0035
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

Read previous values:
Target: fs_mdt-OST0001
Index: 1
Lustre FS: fs_mdt
Mount type: ldiskfs
Flags: 0x62
(OST first_time update )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.5.1.3@o2ib failover.node=10.5.1.6@o2ib network=o2ib0

Permanent disk data:
Target: fs_mdt-OST0001
Index: 1
Lustre FS: fs_mdt
Mount type: ldiskfs
Flags: 0x62
(OST first_time update )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.5.1.3@o2ib failover.node=10.5.1.6@o2ib network=o2ib0

exiting before disk write.

5. start OSTs

At that point, here is the content of the MGS:

[root@perou2 ~]# ls toto/CONFIGS/
fs_mdt-client fs_mdt-OST0000 fs_mdt-OST0001 fs_mdt-sptlrpc _mgs-sptlrpc mountdata
[root@perou2 ~]# llog_reader toto/CONFIGS/fs_mdt-OST0000
Header size : 8192
Time : Fri Apr 13 15:07:03 2012
Number of records: 4
Target uuid : config_uuid
-----------------------
#01 (224)marker 5 (flags=0x01, v2.1.0.0) fs_mdt-OST0000 'add ost' Fri Apr 13 15:07:03 2012-
#02 (128)attach 0:fs_mdt-OST0000 1:obdfilter 2:fs_mdt-OST0000_UUID
#03 (112)setup 0:fs_mdt-OST0000 1:dev 2:type 3:f
#04 (224)marker 5 (flags=0x02, v2.1.0.0) fs_mdt-OST0000 'add ost' Fri Apr 13 15:07:03 2012-
[root@perou2 ~]#
[root@perou2 ~]# llog_reader toto/CONFIGS/fs_mdt-OST0001
Header size : 8192
Time : Fri Apr 13 15:07:03 2012
Number of records: 4
Target uuid : config_uuid
-----------------------
#01 (224)marker 1 (flags=0x01, v2.1.0.0) fs_mdt-OST0001 'add ost' Fri Apr 13 15:07:03 2012-
#02 (128)attach 0:fs_mdt-OST0001 1:obdfilter 2:fs_mdt-OST0001_UUID
#03 (112)setup 0:fs_mdt-OST0001 1:dev 2:type 3:f
#04 (224)marker 1 (flags=0x02, v2.1.0.0) fs_mdt-OST0001 'add ost' Fri Apr 13 15:07:03 2012-
[root@perou2 ~]#

6. start MDT

At that point, the content of the MGS is:

[root@perou2 ~]# ls toto/CONFIGS/
fs_mdt-client fs_mdt-MDT0000 fs_mdt-OST0000 fs_mdt-OST0001 fs_mdt-sptlrpc _mgs-sptlrpc mountdata
[root@perou2 ~]#
[root@perou2 ~]# llog_reader toto/CONFIGS/fs_mdt-OST0000
Header size : 8192
Time : Fri Apr 13 15:07:03 2012
Number of records: 4
Target uuid : config_uuid
-----------------------
#01 (224)marker 5 (flags=0x01, v2.1.0.0) fs_mdt-OST0000 'add ost' Fri Apr 13 15:07:03 2012-
#02 (128)attach 0:fs_mdt-OST0000 1:obdfilter 2:fs_mdt-OST0000_UUID
#03 (112)setup 0:fs_mdt-OST0000 1:dev 2:type 3:f
#04 (224)marker 5 (flags=0x02, v2.1.0.0) fs_mdt-OST0000 'add ost' Fri Apr 13 15:07:03 2012-
[root@perou2 ~]#
[root@perou2 ~]# llog_reader toto/CONFIGS/fs_mdt-OST0001
Header size : 8192
Time : Fri Apr 13 15:07:03 2012
Number of records: 4
Target uuid : config_uuid
-----------------------
#01 (224)marker 1 (flags=0x01, v2.1.0.0) fs_mdt-OST0001 'add ost' Fri Apr 13 15:07:03 2012-
#02 (128)attach 0:fs_mdt-OST0001 1:obdfilter 2:fs_mdt-OST0001_UUID
#03 (112)setup 0:fs_mdt-OST0001 1:dev 2:type 3:f
#04 (224)marker 1 (flags=0x02, v2.1.0.0) fs_mdt-OST0001 'add ost' Fri Apr 13 15:07:03 2012-

And in the syslog of the MDS node we have:

1334322067 2012 Apr 13 15:01:07 perou3 kern warning kernel LDISKFS-fs warning (device sdau): ldiskfs_fill_super: extents feature not enabled on this filesystem, use tune2fs.
1334322068 2012 Apr 13 15:01:08 perou3 kern info kernel LDISKFS-fs (sdau): barriers disabled
1334322068 2012 Apr 13 15:01:08 perou3 kern info kernel LDISKFS-fs (sdau): mounted filesystem with ordered data mode
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel LDISKFS-fs warning (device sdau): ldiskfs_fill_super: extents feature not enabled on this filesystem, use tune2fs.
1334322545 2012 Apr 13 15:09:05 perou3 kern info kernel LDISKFS-fs (sdau): barriers disabled
1334322545 2012 Apr 13 15:09:05 perou3 kern info kernel LDISKFS-fs (sdau): mounted filesystem with ordered data mode
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel LDISKFS-fs warning (device sdau): ldiskfs_fill_super: extents feature not enabled on this filesystem, use tune2fs.
1334322545 2012 Apr 13 15:09:05 perou3 kern info kernel LDISKFS-fs (sdau): barriers disabled
1334322545 2012 Apr 13 15:09:05 perou3 kern info kernel LDISKFS-fs (sdau): mounted filesystem with ordered data mode
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel Lustre: 7750:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import MGC10.5.1.3@o2ib->MGC10.5.1.3@o2ib_0 netid 50000: select flavor null
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel Lustre: MGC10.5.1.3@o2ib: Reactivating import
1334322545 2012 Apr 13 15:09:05 perou3 kern info kernel Lustre: Enabling ACL
1334322545 2012 Apr 13 15:09:05 perou3 kern info kernel Lustre: Enabling user_xattr
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel Lustre: fs_mdt-MDT0000: new disk, initializing
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel Lustre: 7783:0:(mds_lov.c:1004:mds_notify()) MDS mdd_obd-fs_mdt-MDT0000: add target fs_mdt-OST0001_UUID
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel Lustre: 7783:0:(mds_lov.c:1004:mds_notify()) Skipped 1 previous similar message
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800069 sent from fs_mdt-OST0001-osc-MDT0000 to NID 10.5.1.6@o2ib has failed due to network error: [sent 1334322545] [real_sent 1334322545] [current 1334322545] [deadline 5s] [delay -5s] req@ffff88030a0dc000 x1398844479800069/t0(0) o-1->fs_mdt-OST0001_UUID@10.5.1.5@o2ib:28/4 lens 368/512 e 0 to 1 dl 1334322550 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322545 2012 Apr 13 15:09:05 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800070 sent from fs_mdt-OST0000-osc-MDT0000 to NID 10.5.1.6@o2ib has failed due to network error: [sent 1334322545] [real_sent 1334322545] [current 1334322545] [deadline 5s] [delay -5s] req@ffff88030a0f0000 x1398844479800070/t0(0) o-1->fs_mdt-OST0000_UUID@10.5.1.5@o2ib:28/4 lens 368/512 e 0 to 1 dl 1334322550 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322570 2012 Apr 13 15:09:30 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) fs_mdt-OST0001-osc-MDT0000: tried all connections, increasing latency to 5s
1334322570 2012 Apr 13 15:09:30 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800072 sent from fs_mdt-OST0001-osc-MDT0000 to NID 10.5.1.6@o2ib has failed due to network error: [sent 1334322570] [real_sent 1334322570] [current 1334322570] [deadline 10s] [delay -10s] req@ffff8803313d3400 x1398844479800072/t0(0) o-1->fs_mdt-OST0001_UUID@10.5.1.5@o2ib:28/4 lens 368/512 e 0 to 1 dl 1334322580 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322577 2012 Apr 13 15:09:37 perou3 kern warning kernel Lustre: 619:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800071 sent from MGC10.5.1.3@o2ib to NID 10.5.1.3@o2ib has timed out for slow reply: [sent 1334322570] [real_sent 1334322570] [current 1334322577] [deadline 7s] [delay 0s] req@ffff88031fdf4000 x1398844479800071/t0(0) o-1->MGS@MGC10.5.1.3@o2ib_0:26/25 lens 192/192 e 0 to 1 dl 1334322577 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322577 2012 Apr 13 15:09:37 perou3 kern warning kernel Lustre: 619:0:(client.c:1778:ptlrpc_expire_one_request()) Skipped 1 previous similar message
1334322577 2012 Apr 13 15:09:37 perou3 kern err kernel LustreError: 166-1: MGC10.5.1.3@o2ib: Connection to service MGS via nid 10.5.1.3@o2ib was lost; in progress operations using this service will fail.
1334322583 2012 Apr 13 15:09:43 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800074 sent from MGC10.5.1.3@o2ib to NID 10.5.1.3@o2ib has timed out for slow reply: [sent 1334322577] [real_sent 1334322577] [current 1334322583] [deadline 6s] [delay 0s] req@ffff88031fdf4000 x1398844479800074/t0(0) o-1->MGS@MGC10.5.1.3@o2ib_0:26/25 lens 368/512 e 0 to 1 dl 1334322583 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322602 2012 Apr 13 15:10:02 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) MGC10.5.1.3@o2ib: tried all connections, increasing latency to 6s
1334322602 2012 Apr 13 15:10:02 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) Skipped 1 previous similar message
1334322602 2012 Apr 13 15:10:02 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800076 sent from fs_mdt-OST0001-osc-MDT0000 to NID 10.5.1.6@o2ib has failed due to network error: [sent 1334322602] [real_sent 1334322602] [current 1334322602] [deadline 15s] [delay -15s] req@ffff88017cafa800 x1398844479800076/t0(0) o-1->fs_mdt-OST0001_UUID@10.5.1.5@o2ib:28/4 lens 368/512 e 0 to 1 dl 1334322617 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322613 2012 Apr 13 15:10:13 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800075 sent from MGC10.5.1.3@o2ib to NID 10.5.1.3@o2ib has timed out for slow reply: [sent 1334322602] [real_sent 1334322602] [current 1334322613] [deadline 11s] [delay 0s] req@ffff88030a074800 x1398844479800075/t0(0) o-1->MGS@MGC10.5.1.3@o2ib_0:26/25 lens 368/512 e 0 to 1 dl 1334322613 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322613 2012 Apr 13 15:10:13 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) Skipped 1 previous similar message
1334322627 2012 Apr 13 15:10:27 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) MGC10.5.1.3@o2ib: tried all connections, increasing latency to 11s
1334322627 2012 Apr 13 15:10:27 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) Skipped 2 previous similar messages
1334322643 2012 Apr 13 15:10:43 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800078 sent from MGC10.5.1.3@o2ib to NID 10.5.1.3@o2ib has timed out for slow reply: [sent 1334322627] [real_sent 1334322627] [current 1334322643] [deadline 16s] [delay 0s] req@ffff88032908c000 x1398844479800078/t0(0) o-1->MGS@MGC10.5.1.3@o2ib_0:26/25 lens 368/512 e 0 to 1 dl 1334322643 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322643 2012 Apr 13 15:10:43 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) Skipped 1 previous similar message
1334322652 2012 Apr 13 15:10:52 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) MGC10.5.1.3@o2ib: tried all connections, increasing latency to 16s
1334322652 2012 Apr 13 15:10:52 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) Skipped 1 previous similar message
1334322677 2012 Apr 13 15:11:17 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) MGC10.5.1.3@o2ib: tried all connections, increasing latency to 21s
1334322677 2012 Apr 13 15:11:17 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) Skipped 2 previous similar messages
1334322677 2012 Apr 13 15:11:17 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800084 sent from fs_mdt-OST0001-osc-MDT0000 to NID 10.5.1.6@o2ib has failed due to network error: [sent 1334322677] [real_sent 1334322677] [current 1334322677] [deadline 30s] [delay -30s] req@ffff8801c35a6c00 x1398844479800084/t0(0) o-1->fs_mdt-OST0001_UUID@10.5.1.5@o2ib:28/4 lens 368/512 e 0 to 1 dl 1334322707 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322677 2012 Apr 13 15:11:17 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
1334322677 2012 Apr 13 15:11:17 perou3 kern warning kernel Lustre: 620:0:(import.c:852:ptlrpc_connect_interpret()) MGS@MGC10.5.1.3@o2ib_0 changed server handle from 0x555b88f8bfb49318 to 0x555b88f8bfb49373
1334322677 2012 Apr 13 15:11:17 perou3 kern warning kernel Lustre: MGC10.5.1.3@o2ib: Reactivating import
1334322677 2012 Apr 13 15:11:17 perou3 kern info kernel Lustre: MGC10.5.1.3@o2ib: Connection restored to service MGS using nid 10.5.1.3@o2ib.
1334322702 2012 Apr 13 15:11:42 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) fs_mdt-OST0001-osc-MDT0000: tried all connections, increasing latency to 30s
1334322702 2012 Apr 13 15:11:42 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) Skipped 2 previous similar messages
1334322727 2012 Apr 13 15:12:07 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) fs_mdt-OST0001-osc-MDT0000: tried all connections, increasing latency to 35s
1334322727 2012 Apr 13 15:12:07 perou3 kern err kernel LustreError: 11-0: an error occurred while communicating with 10.5.1.3@o2ib. The obd_ping operation failed with -107
1334322727 2012 Apr 13 15:12:07 perou3 kern err kernel LustreError: 166-1: MGC10.5.1.3@o2ib: Connection to service MGS via nid 10.5.1.3@o2ib was lost; in progress operations using this service will fail.
1334322727 2012 Apr 13 15:12:07 perou3 kern warning kernel Lustre: 620:0:(import.c:852:ptlrpc_connect_interpret()) MGS@MGC10.5.1.3@o2ib_0 changed server handle from 0x555b88f8bfb49373 to 0x555b88f8bfb493dc
1334322727 2012 Apr 13 15:12:07 perou3 kern warning kernel Lustre: MGC10.5.1.3@o2ib: Reactivating import
1334322727 2012 Apr 13 15:12:07 perou3 kern info kernel Lustre: MGC10.5.1.3@o2ib: Connection restored to service MGS using nid 10.5.1.3@o2ib.
1334322752 2012 Apr 13 15:12:32 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) fs_mdt-OST0001-osc-MDT0000: tried all connections, increasing latency to 40s
1334322752 2012 Apr 13 15:12:32 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) Skipped 1 previous similar message
1334322752 2012 Apr 13 15:12:32 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1398844479800103 sent from fs_mdt-OST0001-osc-MDT0000 to NID 10.5.1.6@o2ib has failed due to network error: [sent 1334322752] [real_sent 1334322752] [current 1334322752] [deadline 45s] [delay -45s] req@ffff88030a0dc800 x1398844479800103/t0(0) o-1->fs_mdt-OST0001_UUID@10.5.1.5@o2ib:28/4 lens 368/512 e 0 to 1 dl 1334322797 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
1334322752 2012 Apr 13 15:12:32 perou3 kern warning kernel Lustre: 620:0:(client.c:1778:ptlrpc_expire_one_request()) Skipped 4 previous similar messages
1334322802 2012 Apr 13 15:13:22 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) fs_mdt-OST0001-osc-MDT0000: tried all connections, increasing latency to 50s
1334322802 2012 Apr 13 15:13:22 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) Skipped 2 previous similar messages
1334322877 2012 Apr 13 15:14:37 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) fs_mdt-OST0001-osc-MDT0000: tried all connections, increasing latency to 50s
1334322877 2012 Apr 13 15:14:37 perou3 kern warning kernel Lustre: 621:0:(import.c:526:import_select_connection()) Skipped 5 previous similar messages

On the MDS, we can see:
[root@perou3 ~]# lctl dl
0 UP mgc MGC10.5.1.3@o2ib f2d4e47f-96b1-e539-f0e1-e125d27e617f 5
1 UP lov fs_mdt-MDT0000-mdtlov fs_mdt-MDT0000-mdtlov_UUID 4
2 UP mdt fs_mdt-MDT0000 fs_mdt-MDT0000_UUID 3
3 UP mds mdd_obd-fs_mdt-MDT0000 mdd_obd_uuid-fs_mdt-MDT0000 3
4 UP osc fs_mdt-OST0001-osc-MDT0000 fs_mdt-MDT0000-mdtlov_UUID 5
5 UP osc fs_mdt-OST0000-osc-MDT0000 fs_mdt-MDT0000-mdtlov_UUID 5

7. Mount client

In the client syslog, we have:

1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: OBD class driver, http://wiki.whamcloud.com/
1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: Lustre Version: 2.1.0
1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: Build Version: B-2_1_0_0-lustrebull-20120404161806-CHANGED-2.6.32-71.24.1.bl6.Bull.23.x86_64
1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: Lustre LU module (ffffffffa053c2c0).
1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: Register global MR array, MR size: 0xffffffffffffffff, array size: 1
1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: Added LNI 10.5.1.6@o2ib [8/64/0/180]
1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: Lustre OSC module (ffffffffa09780c0).
1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: Lustre LOV module (ffffffffa09e3e40).
1336046968 2012 May 3 14:09:28 perou7 kern info kernel Lustre: Lustre client module (ffffffffa0d392e0).
1336046968 2012 May 3 14:09:28 perou7 user info logger lustre-tune: 0 devices have been tuned.
1336046968 2012 May 3 14:09:28 perou7 kern warning kernel Lustre: 21809:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import MGC10.5.1.3@o2ib->MGC10.5.1.3@o2ib_0 netid 50000: select flavor null
1336046968 2012 May 3 14:09:28 perou7 kern err kernel LustreError: 152-6: Ignoring deprecated mount option 'acl'.
1336046968 2012 May 3 14:09:28 perou7 kern warning kernel Lustre: MGC10.5.1.3@o2ib: Reactivating import
1336046968 2012 May 3 14:09:28 perou7 kern warning kernel Lustre: 21809:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import fs_mdt-MDT0000-mdc-ffff8802d12c9000->10.5.1.4@o2ib netid 50000: select flavor null
1336046968 2012 May 3 14:09:28 perou7 kern err kernel LustreError: 11-0: an error occurred while communicating with 10.5.1.4@o2ib. The mds_connect operation failed with -11
1336046993 2012 May 3 14:09:53 perou7 kern err kernel LustreError: 11-0: an error occurred while communicating with 10.5.1.4@o2ib. The mds_connect operation failed with -11
1336047018 2012 May 3 14:10:18 perou7 kern err kernel LustreError: 11-0: an error occurred while communicating with 10.5.1.4@o2ib. The mds_connect operation failed with -11

At the same time, in the MDS log we have:

1336046968 2012 May 3 14:09:28 perou3 kern warning kernel Lustre: fs_mdt-MDT0000: temporarily refusing client connection from 10.5.1.6@o2ib
1336046968 2012 May 3 14:09:28 perou3 kern err kernel LustreError: 26684:0:(ldlm_lib.c:2137:target_send_reply_msg()) @@@ processing error (11) req@ffff88032beafc00 x1400946785517577/t0(0) o-1><?>@<?>:0/0 lens 368/0 e 0 to 0 dl 1336047068 ref 1 fl Interpret:/ffffffff/ffffffff rc -11/-1
1336046993 2012 May 3 14:09:53 perou3 kern warning kernel Lustre: fs_mdt-MDT0000: temporarily refusing client connection from 10.5.1.6@o2ib
1336046993 2012 May 3 14:09:53 perou3 kern err kernel LustreError: 26684:0:(ldlm_lib.c:2137:target_send_reply_msg()) @@@ processing error (11) req@ffff88032bee3000 x1400946785517580/t0(0) o-1><?>@<?>:0/0 lens 368/0 e 0 to 0 dl 1336047093 ref 1 fl Interpret:/ffffffff/ffffffff rc -11/-1

This issue looks like LU-350, but the fix from that ticket has landed in Lustre 2.1.
The weird thing here is that when the MDT is started, it tries to reach the failover node of the OSTs (NID 10.5.1.6@o2ib) and apparently not their primary node.
Of course, when starting the MDT before the OSTs, the MDT connects directly to the OSTs with the right NID, i.e. the primary one.
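
One way to double-check which NID the MDT-side OSCs are bound to (a sketch using the device names from `lctl dl` above; `ost_conn_uuid` is the usual OSC parameter, though not verified on this exact 2.1 build):

lctl get_param osc.fs_mdt-OST0000-osc-MDT0000.ost_conn_uuid
lctl get_param osc.fs_mdt-OST0001-osc-MDT0000.ost_conn_uuid
# verify both the primary and failover OSS NIDs answer at the LNET level
lctl ping 10.5.1.5@o2ib
lctl ping 10.5.1.6@o2ib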

Regards,
Sebastien.

Comment by Peter Jones [ 03/May/12 ]

Lai

Could you please analyze this situation?

Thanks

Peter

Comment by Lai Siyao [ 09/May/12 ]

Hi Florent, what's the output of `llog_reader toto/CONFIGS/fs_mdt-MDT0000` after you started the MDS? The NID of the OST (which the MDS connects to) should be written in this config. Could you also print this config for the case where the MDS starts first?

Comment by Sebastien Buisson (Inactive) [ 09/May/12 ]

Hi,

When we start OSTs first and then MDT (case when we hit the bug):

  1. llog_reader toto/CONFIGS/fs_mdt-MDT0000
    Header size : 8192
    Time : Wed May 9 13:08:57 2012
    Number of records: 32
    Target uuid : config_uuid
    -----------------------
    #01 (224)marker 7 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov setup' Wed May 9 13:08:57 2012-
    #02 (136)attach 0:fs_mdt-MDT0000-mdtlov 1:lov 2:fs_mdt-MDT0000-mdtlov_UUID
    #03 (176)lov_setup 0:fs_mdt-MDT0000-mdtlov 1:(struct lov_desc)
    uuid=fs_mdt-MDT0000-mdtlov_UUID stripe:cnt=1 size=1048576 offset=18446744073709551615 pattern=0x1
    #04 (224)marker 7 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov setup' Wed May 9 13:08:57 2012-
    #05 (224)marker 8 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000 'add mdt' Wed May 9 13:08:57 2012-
    #06 (120)attach 0:fs_mdt-MDT0000 1:mdt 2:fs_mdt-MDT0000_UUID
    #07 (112)mount_option 0: 1:fs_mdt-MDT0000 2:fs_mdt-MDT0000-mdtlov
    #08 (160)setup 0:fs_mdt-MDT0000 1:fs_mdt-MDT0000_UUID 2:0 3:fs_mdt-MDT0000-mdtlov 4:f
    #09 (224)marker 8 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000 'add mdt' Wed May 9 13:08:57 2012-
    #10 (224)marker 9 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000 'add osc(copied)' Wed May 9 13:08:57 2012-
    #11 (224)marker 10 (flags=0x01, v2.1.0.0) fs_mdt-OST0000 'add osc' Wed May 9 13:08:57 2012-
    #12 (080)add_uuid nid=10.5.1.5@o2ib(0x500000a050105) 0: 1:10.5.1.5@o2ib
    #13 (080)add_uuid nid=10.5.1.6@o2ib(0x500000a050106) 0: 1:10.5.1.5@o2ib
    #14 (144)attach 0:fs_mdt-OST0000-osc-MDT0000 1:osc 2:fs_mdt-MDT0000-mdtlov_UUID
    #15 (144)setup 0:fs_mdt-OST0000-osc-MDT0000 1:fs_mdt-OST0000_UUID 2:10.5.1.5@o2ib
    #16 (136)lov_modify_tgts add 0:fs_mdt-MDT0000-mdtlov 1:fs_mdt-OST0000_UUID 2:0 3:1
    #17 (224)marker 10 (flags=0x02, v2.1.0.0) fs_mdt-OST0000 'add osc' Wed May 9 13:08:57 2012-
    #18 (224)marker 10 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000 'add osc(copied)' Wed May 9 13:08:57 2012-
    #19 (224)marker 11 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000 'add osc(copied)' Wed May 9 13:08:57 2012-
    #20 (224)marker 12 (flags=0x01, v2.1.0.0) fs_mdt-OST0001 'add osc' Wed May 9 13:08:57 2012-
    #21 (080)add_uuid nid=10.5.1.5@o2ib(0x500000a050105) 0: 1:10.5.1.5@o2ib
    #22 (080)add_uuid nid=10.5.1.6@o2ib(0x500000a050106) 0: 1:10.5.1.5@o2ib
    #23 (080)add_uuid nid=10.5.1.5@o2ib(0x500000a050105) 0: 1:10.5.1.5@o2ib
    #24 (080)add_uuid nid=10.5.1.6@o2ib(0x500000a050106) 0: 1:10.5.1.5@o2ib
    #25 (144)attach 0:fs_mdt-OST0001-osc-MDT0000 1:osc 2:fs_mdt-MDT0000-mdtlov_UUID
    #26 (144)setup 0:fs_mdt-OST0001-osc-MDT0000 1:fs_mdt-OST0001_UUID 2:10.5.1.5@o2ib
    #27 (136)lov_modify_tgts add 0:fs_mdt-MDT0000-mdtlov 1:fs_mdt-OST0001_UUID 2:1 3:1
    #28 (224)marker 12 (flags=0x02, v2.1.0.0) fs_mdt-OST0001 'add osc' Wed May 9 13:08:57 2012-
    #29 (224)marker 12 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000 'add osc(copied)' Wed May 9 13:08:57 2012-
    #30 (224)marker 15 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov.stripesize' Wed May 9 13:08:57 2012-
    #31 (112)param 0:fs_mdt-MDT0000-mdtlov 1:lov.stripesize=1048576
    #32 (224)marker 15 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov.stripesize' Wed May 9 13:08:57 2012-

When we start MDT first and then the OSTs (no bug):

  • right after MDT start:
  1. llog_reader toto/CONFIGS/fs_mdt-MDT0000
    Header size : 8192
    Time : Wed May 9 13:21:35 2012
    Number of records: 12
    Target uuid : config_uuid
    -----------------------
    #01 (224)marker 1 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov setup' Wed May 9 13:21:35 2012-
    #02 (136)attach 0:fs_mdt-MDT0000-mdtlov 1:lov 2:fs_mdt-MDT0000-mdtlov_UUID
    #03 (176)lov_setup 0:fs_mdt-MDT0000-mdtlov 1:(struct lov_desc)
    uuid=fs_mdt-MDT0000-mdtlov_UUID stripe:cnt=1 size=1048576 offset=18446744073709551615 pattern=0x1
    #04 (224)marker 1 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov setup' Wed May 9 13:21:35 2012-
    #05 (224)marker 2 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000 'add mdt' Wed May 9 13:21:35 2012-
    #06 (120)attach 0:fs_mdt-MDT0000 1:mdt 2:fs_mdt-MDT0000_UUID
    #07 (112)mount_option 0: 1:fs_mdt-MDT0000 2:fs_mdt-MDT0000-mdtlov
    #08 (160)setup 0:fs_mdt-MDT0000 1:fs_mdt-MDT0000_UUID 2:0 3:fs_mdt-MDT0000-mdtlov 4:f
    #09 (224)marker 2 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000 'add mdt' Wed May 9 13:21:35 2012-
    #10 (224)marker 7 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov.stripesize' Wed May 9 13:21:35 2012-
    #11 (112)param 0:fs_mdt-MDT0000-mdtlov 1:lov.stripesize=1048576
    #12 (224)marker 7 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov.stripesize' Wed May 9 13:21:35 2012-
  • then after OSTs start:
  1. llog_reader toto/CONFIGS/fs_mdt-MDT0000
    Header size : 8192
    Time : Wed May 9 13:21:35 2012
    Number of records: 28
    Target uuid : config_uuid
    -----------------------
    #01 (224)marker 1 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov setup' Wed May 9 13:21:35 2012-
    #02 (136)attach 0:fs_mdt-MDT0000-mdtlov 1:lov 2:fs_mdt-MDT0000-mdtlov_UUID
    #03 (176)lov_setup 0:fs_mdt-MDT0000-mdtlov 1:(struct lov_desc)
    uuid=fs_mdt-MDT0000-mdtlov_UUID stripe:cnt=1 size=1048576 offset=18446744073709551615 pattern=0x1
    #04 (224)marker 1 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov setup' Wed May 9 13:21:35 2012-
    #05 (224)marker 2 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000 'add mdt' Wed May 9 13:21:35 2012-
    #06 (120)attach 0:fs_mdt-MDT0000 1:mdt 2:fs_mdt-MDT0000_UUID
    #07 (112)mount_option 0: 1:fs_mdt-MDT0000 2:fs_mdt-MDT0000-mdtlov
    #08 (160)setup 0:fs_mdt-MDT0000 1:fs_mdt-MDT0000_UUID 2:0 3:fs_mdt-MDT0000-mdtlov 4:f
    #09 (224)marker 2 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000 'add mdt' Wed May 9 13:21:35 2012-
    #10 (224)marker 7 (flags=0x01, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov.stripesize' Wed May 9 13:21:35 2012-
    #11 (112)param 0:fs_mdt-MDT0000-mdtlov 1:lov.stripesize=1048576
    #12 (224)marker 7 (flags=0x02, v2.1.0.0) fs_mdt-MDT0000-mdtlov 'lov.stripesize' Wed May 9 13:21:35 2012-
    #13 (224)marker 10 (flags=0x01, v2.1.0.0) fs_mdt-OST0001 'add osc' Wed May 9 13:22:26 2012-
    #14 (080)add_uuid nid=10.5.1.5@o2ib(0x500000a050105) 0: 1:10.5.1.5@o2ib
    #15 (144)attach 0:fs_mdt-OST0001-osc-MDT0000 1:osc 2:fs_mdt-MDT0000-mdtlov_UUID
    #16 (144)setup 0:fs_mdt-OST0001-osc-MDT0000 1:fs_mdt-OST0001_UUID 2:10.5.1.5@o2ib
    #17 (080)add_uuid nid=10.5.1.6@o2ib(0x500000a050106) 0: 1:10.5.1.6@o2ib
    #18 (112)add_conn 0:fs_mdt-OST0001-osc-MDT0000 1:10.5.1.6@o2ib
    #19 (136)lov_modify_tgts add 0:fs_mdt-MDT0000-mdtlov 1:fs_mdt-OST0001_UUID 2:1 3:1
    #20 (224)marker 10 (flags=0x02, v2.1.0.0) fs_mdt-OST0001 'add osc' Wed May 9 13:22:26 2012-
    #21 (224)marker 13 (flags=0x01, v2.1.0.0) fs_mdt-OST0000 'add osc' Wed May 9 13:22:27 2012-
    #22 (080)add_uuid nid=10.5.1.5@o2ib(0x500000a050105) 0: 1:10.5.1.5@o2ib
    #23 (144)attach 0:fs_mdt-OST0000-osc-MDT0000 1:osc 2:fs_mdt-MDT0000-mdtlov_UUID
    #24 (144)setup 0:fs_mdt-OST0000-osc-MDT0000 1:fs_mdt-OST0000_UUID 2:10.5.1.5@o2ib
    #25 (080)add_uuid nid=10.5.1.6@o2ib(0x500000a050106) 0: 1:10.5.1.6@o2ib
    #26 (112)add_conn 0:fs_mdt-OST0000-osc-MDT0000 1:10.5.1.6@o2ib
    #27 (136)lov_modify_tgts add 0:fs_mdt-MDT0000-mdtlov 1:fs_mdt-OST0000_UUID 2:0 3:1
    #28 (224)marker 13 (flags=0x02, v2.1.0.0) fs_mdt-OST0000 'add osc' Wed May 9 13:22:27 2012-

For the record, the OSS hosting the 2 OSTs has NID 10.5.1.5@o2ib.

Cheers,
Sebastien.

Comment by Lai Siyao [ 17/May/12 ]

The llog config looks fine. I ran the same test but can't reproduce it with master code. Could you run `lctl set_param debug=-1` on the MDS node and dump the debug log after the MDS is mounted? I'll test 2.1 later.
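
A minimal sketch of that collection procedure (hypothetical MDT device path):

lctl clear                              # empty the kernel debug buffer first
lctl set_param debug=-1                 # enable all debug masks
mount -t lustre /dev/mdt_dev /mnt/mdt   # mount the MDT
lctl dk > /tmp/lustre_mds_debug.log     # dump the accumulated debug log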

Comment by Lai Siyao [ 18/May/12 ]

The 2.1 test passes in my test environment too.

Comment by Sebastien Buisson (Inactive) [ 21/May/12 ]

Hi,

Here is the requested debug information:

  • lustre_mds_ko.gz is the debug log when the OSTs are mounted first, which means we hit the error;
  • lustre_mds_ok.gz is the debug log when the MDT is mounted first, so that there is no error.

HTH
Sebastien.

Comment by Lai Siyao [ 12/Jun/12 ]

It looks like the config generated for the MDS (in the failure case) is wrong: both NIDs 10.5.1.5@o2ib and 10.5.1.6@o2ib are mapped to the uuid 10.5.1.5@o2ib, and during the MDS OSC connect it tends to use the last NID, which is 10.5.1.6@o2ib, so the connection always fails. I'll dig into the config code to see why this happens.
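
To make this concrete, compare the add_uuid records from the two llog dumps posted above. In the failure case (OSTs started first), the failover NID is registered under the primary NID's uuid:

#12 (080)add_uuid nid=10.5.1.5@o2ib(0x500000a050105) 0: 1:10.5.1.5@o2ib
#13 (080)add_uuid nid=10.5.1.6@o2ib(0x500000a050106) 0: 1:10.5.1.5@o2ib

whereas in the working case (MDT started first), the failover NID gets its own uuid plus an explicit add_conn record:

#14 (080)add_uuid nid=10.5.1.5@o2ib(0x500000a050105) 0: 1:10.5.1.5@o2ib
#17 (080)add_uuid nid=10.5.1.6@o2ib(0x500000a050106) 0: 1:10.5.1.6@o2ib
#18 (112)add_conn 0:fs_mdt-OST0001-osc-MDT0000 1:10.5.1.6@o2ib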

Comment by Lai Siyao [ 14/Jun/12 ]

When the OSTs start before the MDT, the MDT has to steal the OSC config from the client config; however, both target NIDs and failover NIDs are written in the config as the same record type, LCFG_ADD_UUID. Previously all of these NIDs were treated as target NIDs, so a wrong config file was generated for the MDT. And since ptlrpc tends to use the last NID of a target (if all NIDs are in the same subnet) to connect, the error happens.

The patch for master is at: http://review.whamcloud.com/#change,3107.

It should be possible to apply it to the 2.1 branch as well. Sebastien, could you help verify that it works for you?

Comment by Sebastien Buisson (Inactive) [ 15/Jun/12 ]

Hi,

I gave a try to the patch.
The good news is that it fixes the issue when we mount the OSTs before the MDT with the 'first_time' ldd flag set (meaning it is the first time this Lustre file system is mounted).
The bad news is that it does not fix the issue when we explicitly set the 'writeconf' flag (with the same start order: OSTs then MDT).

On the MGS, the logs are:

  • when we start OSTs:

1339769487 2012 Jun 15 16:11:27 perou2 kern warning kernel Lustre: 956:0:(ldlm_lib.c:876:target_handle_connect()) MGS: connection from 527e8357-0c75-491a-738c-18026d6ec94c@10.5.1.5@o2ib t0 exp (null) cur 1339769487 last 0
1339769487 2012 Jun 15 16:11:27 perou2 kern warning kernel Lustre: 956:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import MGS->NET_0x500000a050105_UUID netid 50000: select flavor null
1339769487 2012 Jun 15 16:11:27 perou2 kern warning kernel Lustre: 955:0:(ldlm_lib.c:791:target_handle_connect()) MGS: exp ffff8802eb2ad800 already connecting
1339769487 2012 Jun 15 16:11:27 perou2 kern err kernel LustreError: 955:0:(mgs_handler.c:783:mgs_handle()) MGS handle cmd=250 rc=-114
1339769487 2012 Jun 15 16:11:27 perou2 kern err kernel LustreError: 955:0:(ldlm_lib.c:2137:target_send_reply_msg()) @@@ processing error (114) req@ffff8803179b0850 x1404849096753257/t0(0) o-1><?>@<?>:0/0 lens 368/264 e 0 to 0 dl 1339769587 ref 1 fl Interpret:/ffffffff/ffffffff rc -114/-1
1339769512 2012 Jun 15 16:11:52 perou2 kern warning kernel Lustre: MGS: Regenerating fs_mdt-OST0001 log by user request.

  • when we then start the MDT:

1339769542 2012 Jun 15 16:12:22 perou2 kern warning kernel Lustre: 956:0:(ldlm_lib.c:876:target_handle_connect()) MGS: connection from 35c02d1f-5920-b87f-bc64-358fbcf00f05@10.5.1.4@o2ib t0 exp (null) cur 1339769542 last 0
1339769542 2012 Jun 15 16:12:22 perou2 kern warning kernel Lustre: 956:0:(ldlm_lib.c:876:target_handle_connect()) Skipped 1 previous similar message
1339769542 2012 Jun 15 16:12:22 perou2 kern warning kernel Lustre: MGS: Logs for fs fs_mdt were removed by user request. All servers must be restarted in order to regenerate the logs.
1339769542 2012 Jun 15 16:12:22 perou2 kern info kernel Lustre: Setting parameter fs_mdt-MDT0000-mdtlov.lov.stripesize in log fs_mdt-MDT0000
1339769542 2012 Jun 15 16:12:22 perou2 kern info kernel Lustre: Skipped 1 previous similar message

So half of the problem is solved ;-(

HTH,
Sebastien.

Comment by Lai Siyao [ 17/Jun/12 ]

Sebastien, could you tell me how you set 'writeconf'? In my understanding, 'writeconf' implies removing all existing config for the specified fs, so in your test, when the MDT started, it removed all configs for 'fs_mdt' (including those of the OSTs). Could you explain why you set 'writeconf', and how you normally use it?

Comment by Sebastien Buisson (Inactive) [ 20/Jun/12 ]

Hi,

I use 'writeconf' when I need to reformat a file system.

For instance, consider that I have a 'fs1' file system formatted and running, talking to my MGS. Now I stop this 'fs1' file system and build a new file system (different OSTs and/or a different MDT) while keeping the same 'fs1' name. Then, if I format this new 'fs1' file system, there will be a problem on the MGS, because the configuration information for 'fs1' will already be there. This is why I need to use the 'writeconf' parameter.

I always set the 'writeconf' parameter on the OSTs as well as the MDT:

  1. mkfs.lustre --reformat --fsname=fs_mdt --mdt --index=0 --mgsnode=perou2-ib0@o2ib0 --param=lov.stripesize=1048576 --network=o2ib0 --writeconf --mkfsoptions="-j -J device=/dev/disk/by-id/scsi-2003013841aac000d -m 0" /dev/disk/by-id/scsi-2003013841aac002d
  2. mkfs.lustre --reformat --fsname=fs_mdt --ost --index=0 --mgsnode=perou2-ib0@o2ib0 --failnode=perou7-ib0@o2ib0 --network=o2ib0 --writeconf --mkfsoptions="-j -J device=/dev/disk/by-id/scsi-2003013841aac0017 -m 0" /dev/disk/by-id/scsi-2003013841aac0037
  3. mkfs.lustre --reformat --fsname=fs_mdt --ost --index=1 --mgsnode=perou2-ib0@o2ib0 --failnode=perou7-ib0@o2ib0 --network=o2ib0 --writeconf --mkfsoptions="-j -J device=/dev/disk/by-id/scsi-2003013841aac0015 -m 0" /dev/disk/by-id/scsi-2003013841aac0035

If I set 'writeconf' on all targets, I do not understand why the configuration should not be regenerated on the MGS, even if I start the OSTs first.

Cheers,
Sebastien.

Comment by Lai Siyao [ 21/Jun/12 ]

Lustre Manual 14.4 Regenerating Lustre Configuration Logs:

  • Run the writeconf command on all servers (MDT first, then OSTs)
  • Start the file system in this order:
    1. MGS (or the combined MGS/MDT)
    2. MDT
    3. OSTs
    4. Lustre clients

And the code also shows that writeconf on the MDT erases all logs for the fs (MDT, OSTs and clients), which is why there is a message like this in your log:

1339769542 2012 Jun 15 16:12:22 perou2 kern warning kernel Lustre: MGS: Logs for fs fs_mdt were removed by user request. All servers must be restarted in order to regenerate the logs.

In my understanding this is a special case, and I'm not clear on how 'writeconf' was originally designed, but this does not appear to be an issue.
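
In command form, that documented procedure amounts to something like the following (a sketch with hypothetical device paths; tunefs.lustre --writeconf sets the flag on an already-formatted target):

# with everything unmounted, MDT first, then each OST:
tunefs.lustre --writeconf /dev/mdt_dev
tunefs.lustre --writeconf /dev/ost_dev
# then restart in order: MGS, MDT, OSTs, clients
mount -t lustre /dev/mgs_dev /mnt/mgs
mount -t lustre /dev/mdt_dev /mnt/mdt
mount -t lustre /dev/ost_dev /mnt/ost0
mount -t lustre <mgsnid>@o2ib:/fs_mdt /mnt/fs_mdt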

Comment by Sebastien Buisson (Inactive) [ 21/Jun/12 ]

Hum, sorry. You are absolutely right: one is supposed to invert the target start order when the 'writeconf' flag is set.

So, in the end, I think I can say your patch is working great!
I have changed my review to '+1' in http://review.whamcloud.com/3107

Cheers,
Sebastien.

Comment by Lai Siyao [ 27/Feb/13 ]

Patch landed.
