[LU-8018] lov_init_raid0() ASSERTION( subdev != NULL ) failed: not init ost 0 Created: 13/Apr/16  Updated: 29/Nov/16  Resolved: 26/Sep/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Major
Reporter: Olaf Faaland Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: llnl, llnlfixready
Environment:

Lustre 2.8.0 + llnl patches, both client and server
TOSS 2 (RHEL 6.7 based)
Kernel 2.6.32-573.22.1.1chaos.ch5.4.x86_64


Attachments: File con-logs.tgz    
Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Client hits an LBUG and crashes during a sys_creat() call.

LustreError: 12025:0:(lov_object.c:278:lov_init_raid0()) ASSERTION( subdev != NULL ) failed: not init ost 0
LustreError: 12025:0:(lov_object.c:278:lov_init_raid0()) LBUG                                              
Pid: 12025, comm: mdtest                                                                                   

Call Trace:
 [<ffffffffa04a48d5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa04a4ed7>] lbug_with_loc+0x47/0xb0 [libcfs]         
 [<ffffffffa0dcc729>] lov_init_raid0+0xde9/0x1140 [lov]        
 [<ffffffffa0dc9394>] lov_object_init+0x124/0x300 [lov]        
 [<ffffffffa09347ec>] ? lu_object_add+0x2c/0x30 [obdclass]     
 [<ffffffffa09371a8>] lu_object_alloc+0xd8/0x320 [obdclass]    
 [<ffffffffa093851d>] lu_object_find_try+0xcd/0x260 [obdclass] 
 [<ffffffffa04b48d2>] ? cfs_hash_bd_add_locked+0x62/0x90 [libcfs]
 [<ffffffffa0938761>] lu_object_find_at+0xb1/0xe0 [obdclass]     
 [<ffffffffa04b7f7a>] ? cfs_hash_find_or_add+0x9a/0x190 [libcfs] 
 [<ffffffffa09387cf>] lu_object_find_slice+0x1f/0x80 [obdclass]  
 [<ffffffffa093e455>] cl_object_find+0x55/0xc0 [obdclass]        
 [<ffffffffa0e6929e>] cl_file_inode_init+0x22e/0x340 [lustre]    
 [<ffffffffa0e368b4>] ll_update_inode+0x474/0x1d90 [lustre]      
 [<ffffffffa0915765>] ? lprocfs_counter_add+0x165/0x1c0 [obdclass]
 [<ffffffffa0b01f35>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]         
 [<ffffffffa0b29372>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]    
 [<ffffffffa0b29372>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]    
 [<ffffffff812352a0>] ? security_inode_alloc+0x40/0x60            
 [<ffffffffa0e32cc9>] ? ll_lli_init+0x209/0x2a0 [lustre]          
 [<ffffffffa0e3823d>] ll_read_inode2+0x6d/0x410 [lustre]          
 [<ffffffffa0e4eceb>] ll_iget+0x12b/0x330 [lustre]                
 [<ffffffffa0e3a0ed>] ll_prep_inode+0x5bd/0xc40 [lustre]          
 [<ffffffffa0e536de>] ? ll_lookup_it+0x63e/0xda0 [lustre]         
 [<ffffffffa0e546e8>] ll_create_nd+0x3b8/0xf20 [lustre]           
 [<ffffffffa0e247c6>] ? ll_inode_permission+0x136/0x3e0 [lustre]  
 [<ffffffff811a2b76>] vfs_create+0xe6/0x110                       
 [<ffffffff811a626e>] do_filp_open+0xa8e/0xd20                    
 [<ffffffff810f3a85>] ? call_rcu_sched+0x15/0x20                  
 [<ffffffff812a07cd>] ? strncpy_from_user+0x5d/0xa0               
 [<ffffffff811b3572>] ? alloc_fd+0x92/0x160                       
 [<ffffffff81190007>] do_sys_open+0x67/0x130                      
 [<ffffffff8118fe6d>] ? filp_close+0x5d/0x90                      
 [<ffffffff81190110>] sys_open+0x20/0x30                          
 [<ffffffff81190135>] sys_creat+0x15/0x20                         
 [<ffffffff8100b112>] system_call_fastpath+0x16/0x1b
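
For context, the failing check sits in the stripe-initialization loop of lov_init_raid0() in lustre/lov/lov_object.c. Roughly paraphrased from the 2.8 source (details trimmed, and the exact surrounding code may differ):

    /* Paraphrased sketch of lov_init_raid0(), Lustre 2.8.
     * For each stripe the client looks up the LOV sub-device for the
     * stripe's OST index; if that OST was never set up on this client,
     * dev->ld_target[ost_idx] is NULL and the assertion below fires.
     */
    for (i = 0; i < r0->lo_nr; ++i) {
            struct cl_device *subdev;
            struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
            int ost_idx = oinfo->loi_ost_idx;

            /* ... FID translation and other per-stripe setup elided ... */

            subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
            subconf->u.coc_oinfo = oinfo;
            LASSERTF(subdev != NULL, "not init ost %d\n", ost_idx);
            /* ... */
    }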


 Comments   
Comment by Olaf Faaland [ 13/Apr/16 ]

For specific patch set, see branch:2.8-llnl-preview1

Comment by Olaf Faaland [ 13/Apr/16 ]

There were 10 MDTs, 20 OSTs at the time. Of the 20 OSTs, 2 reported only 5 exports in /proc/fs/lustre/obdfilter/lustre-OST0000/num_exports. The remaining 18 OSTs reported 11 exports.

Comment by Olaf Faaland [ 13/Apr/16 ]

The console log contains the following messages that occurred within the same second as the LBUG:

LustreError: 12027:0:(lov_ea.c:195:lsm_unpackmd_common()) lustre-clilov_UUID: OST index 0 missing   
Lustre: 12027:0:(lov_pack.c:65:lov_dump_lmm_common()) objid 0x4:2147484700, magic 0x0bd10bd0, pattern 0x1
Lustre: 12025:0:(lov_pack.c:65:lov_dump_lmm_common()) objid 0x8:1073742858, magic 0x0bd10bd0, pattern 0x1
Lustre: 12025:0:(lov_pack.c:69:lov_dump_lmm_common()) stripe_size 1048576, stripe_count 1, layout_gen 0  
Lustre: 12025:0:(lov_pack.c:88:lov_dump_lmm_objects()) stripe 0 idx 0 subobj 0x940000400:34
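
The "OST index 0 missing" message comes from the layout-unpack path in lustre/lov/lov_ea.c. Roughly, the unpack only reports the missing target and dumps the layout (the lov_pack.c lines above); it does not fail, so the layout reaches lov_init_raid0() still referencing an OST the client never initialized. A paraphrased sketch, with details possibly differing from the exact 2.8 source:

    /* Paraphrased sketch of lsm_unpackmd_common(), Lustre 2.8. */
    for (i = 0; i < lsm->lsm_stripe_count; i++) {
            struct lov_oinfo *loi = lsm->lsm_oinfo[i];

            /* ... copy object id and stripe fields from the wire layout ... */

            if (lov->lov_tgts[loi->loi_ost_idx] == NULL) {
                    /* reported, but the unpack still succeeds */
                    CERROR("%s: OST index %d missing\n",
                           lov->desc.ld_uuid.uuid, loi->loi_ost_idx);
                    lov_dump_lmm_v1(D_WARNING, lmm);
            }
    }
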
Comment by Olaf Faaland [ 13/Apr/16 ]

Our local patches are mostly build-related and I do not believe any of them are related to this issue.

Comment by Olaf Faaland [ 13/Apr/16 ]

I don't have a record to prove it, but I believe that OST 0 was one of the two OSTs which did not report connections to all the MDTs.

Comment by Olaf Faaland [ 13/Apr/16 ]

The filesystem existed only temporarily during a DAT on Catalyst, and it's gone now. However I have crash dumps from the two client nodes that crashed. Both had just started to run mdtest.

I do not have any records from the two OSTs which were missing connections. I should have crashed one and/or gathered more information. My mistake.

Comment by Andreas Dilger [ 13/Apr/16 ]

Olaf, what options were used to run mdtest? Was the test run inside a striped directory, and if yes, presumably with a stripe count of 10? Had the filesystem _just_ been mounted? I recall one problem during pre-release DNE testing with 128 MDTs where it took some time for all the MDTs to connect to each other, and I wonder if there was still MDT-OST object precreation underway when the test started?

The LBUG and lov console messages do appear related. Presumably the lsm_unpackmd() error was first, and the LBUG second? The LBUG could just be bad error handling during the cleanup path, but it doesn't explain why the client was given a bad layout from the MDS. It is not unusual for an OST to be disconnected from the MDS, and it should just be skipped in this case.

Comment by Olaf Faaland [ 13/Apr/16 ]

Andreas,

The mdtest options were probably this:
mdtest -F -u -i 3 -n 16000 -p 60 -v -d /p/lustre/faaland1/count10_index0

Yes, the directory was striped across all 10 MDTs in the file system, with --default_stripe option set to match.

The MDTs and OSTs were all started at 08:19:48, and the client mounted at 08:20:40. The job wasn't run until about 40 minutes later, which is when the clients hit the LBUG at 09:01:40. I would think that object precreation was long since finished.

I realize now that I do have the console logs for the servers, which may have clues. So I will go through and find what is there between 8:19 and 9:01 and report back.

Yes, based on the output of dmesg in crash, the lov_ea.c:195:lsm_unpackmd_common() error message appears first, and the LBUG follows. As you say, I saw that the two OSTs were not properly connected, but I expected that no objects would be allocated on them and they would just sit idle.

Comment by Andreas Dilger [ 15/Apr/16 ]

Olaf, any chance of posting the server console logs, so we can see why the initial MDS-OSS connections failed?

Comment by Olaf Faaland [ 15/Apr/16 ]

Attached the console logs as con-logs.tgz

Comment by Joseph Gmitter (Inactive) [ 18/Apr/16 ]

Hi Bobijam,
Can you also follow up with this ticket?
Thanks.
Joe

Comment by Zhenyu Xu [ 20/Apr/16 ]

From the log, the system was not stable and several failovers happened between 8:19 and 9:01:

[~/tmp/logs/lu8012/con-logs/] $ grep Failing *
199.log:16:2016-04-12 08:34:22 Lustre: Failing over lustre-OST0000
199.log:26:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0000
200.log:19:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0001
201.log:19:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0002
202.log:23:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0003
203.log:19:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0004
204.log:16:2016-04-12 08:34:22 Lustre: Failing over lustre-OST0005
204.log:26:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0005
205.log:23:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0006
206.log:23:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0007
207.log:22:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0008
208.log:23:2016-04-12 09:05:47 Lustre: Failing over lustre-OST0009

Can't tell what caused the failovers though.

Comment by Olaf Faaland [ 20/Apr/16 ]

Zhenyu,

Those "Failing over" messages are a result of us umount-ing the server targets and then mounting them again. The first time at 8:34 as I recall was to see if OSTs would make the missing connections. In retrospect I should have looked at the config logs. The second time I don't recall the reason for.

-Olaf

Comment by Gerrit Updater [ 19/Jul/16 ]

Bobi Jam (bobijam@hotmail.com) uploaded a new patch: http://review.whamcloud.com/21411
Subject: LU-8018 lov: ld_target could be NULL
Project: fs/lustre-release
Branch: master
Current Patch Set: 6
Commit: d3dceaca9964e65f0da4d05bbb3e93ee31a8a752
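
The patch title points at the direction of the fix: handle a NULL ld_target entry as an ordinary error instead of asserting, so a layout referencing an uninitialized OST cannot LBUG the client. An illustrative sketch only, not the actual contents of change 21411:

    /* Illustrative sketch -- hypothetical shape of the check. */
    if (dev->ld_target[ost_idx] == NULL) {
            CERROR("not init ost %d\n", ost_idx);
            GOTO(out, result = -EIO);
    }
    subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);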

Comment by Gerrit Updater [ 26/Sep/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/21411/
Subject: LU-8018 lov: ld_target could be NULL
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 916b5f7f4672c1070e27a2fe0bfae371b0a729d6

Comment by Peter Jones [ 26/Sep/16 ]

Landed for 2.9
