[LU-1169] Fix race during new fsdb creation. Created: 03/Mar/12  Updated: 27/Nov/12  Resolved: 27/Nov/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Major
Reporter: Andriy Skulysh Assignee: Keith Mannthey (Inactive)
Resolution: Fixed Votes: 0
Labels: patch

Severity: 3
Rank (Obsolete): 6437

 Description   

The race results in two identical OST indexes and a totally broken MGS config.
./llog_reader /tmp/cohiba-MDT0000
Bit 56 of 201 not set
Bit 114 of 201 not set
Bit 171 of 201 not set
Header size : 8192
Time : Mon Sep 26 16:22:14 2011
Number of records: 201
Target uuid : cohiba-MDT0000
-----------------------
#01 (224)marker 1 (flags=0x01, v2.0.61.0) cohiba-MDT0000-mdtlov 'lov setup' Thu Sep 22 21:29:14 2011-
#02 (136)attach 0:cohiba-MDT0000-mdtlov 1:lov 2:cohiba-MDT0000-mdtlov_UUID
#03 (176)lov_setup 0:cohiba-MDT0000-mdtlov 1:(struct lov_desc)
uuid=cohiba-MDT0000-mdtlov_UUID stripe:cnt=1 size=1048576 offset=18446744073709551615 pattern=0x1
#04 (224)marker 1 (flags=0x02, v2.0.61.0) cohiba-MDT0000-mdtlov 'lov setup' Thu Sep 22 21:29:14 2011-
#05 (224)marker 2 (flags=0x01, v2.0.61.0) cohiba-MDT0000 'add mdt' Thu Sep 22 21:29:14 2011-
#06 (120)attach 0:cohiba-MDT0000 1:mdt 2:cohiba-MDT0000_UUID
#07 (112)mount_option 0: 1:cohiba-MDT0000 2:cohiba-MDT0000-mdtlov
#08 (160)setup 0:cohiba-MDT0000 1:cohiba-MDT0000_UUID 2:0 3:cohiba-MDT0000-mdtlov 4:f
#09 (224)marker 2 (flags=0x02, v2.0.61.0) cohiba-MDT0000 'add mdt' Thu Sep 22 21:29:14 2011-
#10 (224)marker 8 (flags=0x01, v2.0.61.0) cohiba-OST0002 'add osc' Thu Sep 22 21:53:41 2011-
#11 (080)add_uuid nid=172.18.1.4@o2ib(0x50000ac120104) 0: 1:172.18.1.4@o2ib
#12 (144)attach 0:cohiba-OST0002-osc-MDT0000 1:osc 2:cohiba-MDT0000-mdtlov_UUID
#13 (144)setup 0:cohiba-OST0002-osc-MDT0000 1:cohiba-OST0002_UUID 2:172.18.1.4@o2ib
#14 (080)add_uuid nid=172.18.1.3@o2ib(0x50000ac120103) 0: 1:172.18.1.3@o2ib
#15 (112)add_conn 0:cohiba-OST0002-osc-MDT0000 1:172.18.1.3@o2ib
#16 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0002_UUID 2:2 3:1
#17 (224)marker 8 (flags=0x02, v2.0.61.0) cohiba-OST0002 'add osc' Thu Sep 22 21:53:41 2011-
#18 (224)marker 11 (flags=0x01, v2.0.61.0) cohiba-OST0001 'add osc' Thu Sep 22 21:53:41 2011-
#19 (080)add_uuid nid=172.18.1.3@o2ib(0x50000ac120103) 0: 1:172.18.1.3@o2ib
#20 (144)attach 0:cohiba-OST0001-osc-MDT0000 1:osc 2:cohiba-MDT0000-mdtlov_UUID
#21 (144)setup 0:cohiba-OST0001-osc-MDT0000 1:cohiba-OST0001_UUID 2:172.18.1.3@o2ib
#22 (080)add_uuid nid=172.18.1.4@o2ib(0x50000ac120104) 0: 1:172.18.1.4@o2ib
#23 (112)add_conn 0:cohiba-OST0001-osc-MDT0000 1:172.18.1.4@o2ib
#24 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0001_UUID 2:1 3:1
#25 (224)marker 11 (flags=0x02, v2.0.61.0) cohiba-OST0001 'add osc' Thu Sep 22 21:53:41 2011-
#26 (224)marker 14 (flags=0x01, v2.0.61.0) cohiba-OST0003 'add osc' Thu Sep 22 21:53:41 2011-
#27 (080)add_uuid nid=172.18.1.4@o2ib(0x50000ac120104) 0: 1:172.18.1.4@o2ib
#28 (144)attach 0:cohiba-OST0003-osc-MDT0000 1:osc 2:cohiba-MDT0000-mdtlov_UUID
#29 (144)setup 0:cohiba-OST0003-osc-MDT0000 1:cohiba-OST0003_UUID 2:172.18.1.4@o2ib
#30 (080)add_uuid nid=172.18.1.3@o2ib(0x50000ac120103) 0: 1:172.18.1.3@o2ib
#31 (112)add_conn 0:cohiba-OST0003-osc-MDT0000 1:172.18.1.3@o2ib
#32 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0003_UUID 2:3 3:1
#33 (224)marker 14 (flags=0x02, v2.0.61.0) cohiba-OST0003 'add osc' Thu Sep 22 21:53:41 2011-
#34 (224)marker 17 (flags=0x01, v2.0.61.0) cohiba-OST0000 'add osc' Thu Sep 22 21:53:41 2011-
#35 (080)add_uuid nid=172.18.1.3@o2ib(0x50000ac120103) 0: 1:172.18.1.3@o2ib
#36 (144)attach 0:cohiba-OST0000-osc-MDT0000 1:osc 2:cohiba-MDT0000-mdtlov_UUID
#37 (144)setup 0:cohiba-OST0000-osc-MDT0000 1:cohiba-OST0000_UUID 2:172.18.1.3@o2ib
#38 (080)add_uuid nid=172.18.1.4@o2ib(0x50000ac120104) 0: 1:172.18.1.4@o2ib
#39 (112)add_conn 0:cohiba-OST0000-osc-MDT0000 1:172.18.1.4@o2ib
#40 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0000_UUID 2:0 3:1
#41 (224)marker 17 (flags=0x02, v2.0.61.0) cohiba-OST0000 'add osc' Thu Sep 22 21:53:41 2011-
<... SKIPPED RECORDS ...>
#107 (224)marker 44 (flags=0x01, v2.0.61.0) cohiba-OST0002 'add osc' Fri Sep 23 01:07:16 2011-
#108 (080)add_uuid nid=172.18.1.4@o2ib(0x50000ac120104) 0: 1:172.18.1.4@o2ib
#109 (144)attach 0:cohiba-OST0002-osc-MDT0000 1:osc 2:cohiba-MDT0000-mdtlov_UUID
#110 (144)setup 0:cohiba-OST0002-osc-MDT0000 1:cohiba-OST0002_UUID 2:172.18.1.4@o2ib
#111 (080)add_uuid nid=172.18.1.3@o2ib(0x50000ac120103) 0: 1:172.18.1.3@o2ib
#112 (112)add_conn 0:cohiba-OST0002-osc-MDT0000 1:172.18.1.3@o2ib
#113 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0002_UUID 2:2 3:1
#115 (224)marker 44 (flags=0x02, v2.0.61.0) cohiba-OST0002 'add osc' Fri Sep 23 01:07:16 2011-
<...SKIPPED RECORDS...>
#140 (224)marker 56 (flags=0x01, v2.0.61.0) cohiba-OST0002 'add osc' Fri Sep 23 01:17:48 2011-
#141 (080)add_uuid nid=172.18.1.4@o2ib(0x50000ac120104) 0: 1:172.18.1.4@o2ib
#142 (144)attach 0:cohiba-OST0002-osc-MDT0000 1:osc 2:cohiba-MDT0000-mdtlov_UUID
#143 (144)setup 0:cohiba-OST0002-osc-MDT0000 1:cohiba-OST0002_UUID 2:172.18.1.4@o2ib
#144 (080)add_uuid nid=172.18.1.3@o2ib(0x50000ac120103) 0: 1:172.18.1.3@o2ib
#145 (112)add_conn 0:cohiba-OST0002-osc-MDT0000 1:172.18.1.3@o2ib
#146 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0002_UUID 2:2 3:1
#147 (224)marker 56 (flags=0x02, v2.0.61.0) cohiba-OST0002 'add osc' Fri Sep 23 01:17:48 2011-
<...SKIPPED RECORDS...>
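Repeated indexes are easier to spot when the `lov_modify_tgts add` records are grouped by target index. The following is a minimal Python sketch, not part of llog_reader. It assumes the field layout visible in the dump above (field 1 is the OST UUID, field 2 is the index); the function name `repeated_ost_indexes` and the regular expression are illustrative assumptions. Several records for one index are not by themselves proof of corruption; the sketch only narrows down which records to inspect.

```python
import re
from collections import defaultdict

# Assumed record layout, copied from the dump above:
# #16 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0002_UUID 2:2 3:1
ADD_RE = re.compile(
    r"^#(?P<rec>\d+)\s+\(\d+\)lov_modify_tgts add\s+"
    r"0:\S+\s+1:(?P<uuid>\S+)\s+2:(?P<index>\d+)\s+3:\d+"
)


def repeated_ost_indexes(dump_text):
    """Return {index: [(record_number, ost_uuid), ...]} for every index
    that appears in more than one 'lov_modify_tgts add' record."""
    by_index = defaultdict(list)
    for line in dump_text.splitlines():
        match = ADD_RE.match(line.strip())
        if match:
            by_index[int(match.group("index"))].append(
                (int(match.group("rec")), match.group("uuid"))
            )
    return {idx: recs for idx, recs in by_index.items() if len(recs) > 1}


# Four records taken verbatim from the dump in this ticket.
SAMPLE = """\
#16 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0002_UUID 2:2 3:1
#24 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0001_UUID 2:1 3:1
#113 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0002_UUID 2:2 3:1
#146 (136)lov_modify_tgts add 0:cohiba-MDT0000-mdtlov 1:cohiba-OST0002_UUID 2:2 3:1
"""

if __name__ == "__main__":
    for idx, recs in sorted(repeated_ost_indexes(SAMPLE).items()):
        print(f"index {idx}: {recs}")
```

On the sample records this reports index 2, which is added in records #16, #113 and #146.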



 Comments   
Comment by Andriy Skulysh [ 03/Mar/12 ]

CODE: http://review.whamcloud.com/2251

Comment by Andreas Dilger [ 04/Mar/12 ]

Andriy,
I haven't looked at the patch yet, and it is definitely good to fix a problem like this. Note however that the use of MGS-assigned OST index values is deprecated, and will be removed entirely in the future. All of the customers we talked to already specify "--index=N" when formatting their filesystems today.

The reason is that outside of test environments, customers care about specific OST->disk assignments for maintenance reasons (i.e. to locate specific disks when an OST reports errors), so random MGS-assigned OST index values are not useful. Requiring that the user (or management interface) specify the OST index at format time avoids any confusion or misordering between OST and LUN/OSS layouts.

Comment by Andriy Skulysh [ 30/Mar/12 ]

The patch fixes a race between loading data from the llog into the fsdb and obtaining data from it. It doesn't matter whether an index was specified manually or assigned automatically.
I suppose the races should still be fixed until the fsdb code is removed.
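
To illustrate the kind of race described here, the following is a toy Python sketch. It is not the patch to lustre/mgs/mgs_llog.c. The class `FsdbRegistry`, the `load_log` callable and the dictionary layout are all hypothetical. The idea it shows, under those assumptions, is that a new database entry is populated from the log before it becomes visible to other threads, so no caller can hand out an index from a half-loaded entry.

```python
import threading


class FsdbRegistry:
    """Toy stand-in for a table of per-filesystem databases (hypothetical)."""

    def __init__(self, load_log):
        self._lock = threading.Lock()
        self._dbs = {}
        # Hypothetical callable: filesystem name -> iterable of used indexes.
        self._load_log = load_log

    def find_or_create(self, name):
        # Create and populate under the lock, so the entry is only
        # published once the log has been fully replayed into it.
        with self._lock:
            db = self._dbs.get(name)
            if db is None:
                db = {"used": set(self._load_log(name))}
                self._dbs[name] = db
            return db

    def allocate_index(self, name):
        db = self.find_or_create(name)
        with self._lock:
            idx = 0
            while idx in db["used"]:
                idx += 1
            db["used"].add(idx)
            return idx


if __name__ == "__main__":
    registry = FsdbRegistry(lambda name: {0, 1})  # pretend the log holds 0 and 1
    results = []
    results_lock = threading.Lock()

    def register_target():
        idx = registry.allocate_index("cohiba")
        with results_lock:
            results.append(idx)

    threads = [threading.Thread(target=register_target) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(sorted(results))
```

If the entry were inserted into the table first and filled from the log afterwards, a concurrent caller could see an empty set of used indexes and pick one that the log already assigns to another target.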

Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » x86_64,client,el6,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » i686,client,el6,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » x86_64,client,sles11,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » x86_64,client,el5,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » i686,server,el5,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » x86_64,server,el6,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » i686,server,el6,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » x86_64,server,el5,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Build Master (Inactive) [ 01/Apr/12 ]

Integrated in lustre-reviews » i686,client,el5,inkernel #4629
LU-1169 mgs: Fix race during new fsdb creation. (Revision b287a870bc5eb3b205dbe52bbb7d15c342b41e09)

Result = SUCCESS
Andriy_Skulysh : b287a870bc5eb3b205dbe52bbb7d15c342b41e09
Files :

  • lustre/mgs/mgs_llog.c
Comment by Nathan Rutman [ 21/Nov/12 ]

Xyratex-bug-id: MRP-230

Comment by Keith Mannthey (Inactive) [ 27/Nov/12 ]

http://review.whamcloud.com/2251 has been merged.
