[LU-1459] Disabling OSC in file system causes multiple issues Created: 31/May/12  Updated: 28/Nov/13  Resolved: 28/Nov/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.8
Fix Version/s: Lustre 1.8.9

Type: Bug
Priority: Minor
Reporter: Jeremy Filizetti
Assignee: Zhenyu Xu
Resolution: Duplicate
Votes: 0
Labels: ptr

Issue Links:
Related
is related to LU-4302 If a client mounts FS with conf_param... Open
is related to LU-642 LBUG in client when activating an OST... Resolved
Severity: 3
Bugzilla ID: 15505
Rank (Obsolete): 4582

 Description   

This is a general bug to track issues that arise when mounting a file system with disabled OSCs/OSTs.

1. "lfs check servers" causes an LBUG because the import remains in the LUSTRE_IMP_NEW state:

May 31 04:03:13 test-340-5 kernel: LustreError: 12165:0:(client.c:837:ptlrpc_import_delay_req()) @@@ Uninitialized import. req@ffff81183c37e000 x1403452819243059/t0 o400->test-OST0000_UUID@<NULL>:28/4 lens 192/384 e 0 to 1 dl 0 ref 1 fl Rpc:N/0/0 rc 0/0
May 31 04:03:13 test-340-5 kernel: LustreError: 12165:0:(client.c:839:ptlrpc_import_delay_req()) LBUG

2. Reading the import file for the OSC will cause a NULL pointer dereference (already filed under LU-1448)

3. When an OSC/OST is set to active for the file system, a client is not able to activate the OST:
[root@test-340-5 lustre-release]# dmesg
LustreError: 19991:0:(lov_obd.c:220:lov_notify()) event(9) of test-OST0000_UUID failed: -22
[root@test-340-5 lustre-release]# lctl dk /root/lustre2.log
Debug log: 19 lines, 19 kept, 0 dropped, 0 bad.
[root@test-340-5 lustre-release]# cat /root/lustre2.log
10000000:01000000:0:1338449754.150751:0:19935:0:(mgc_request.c:595:mgc_blocking_ast()) Lock res 0x5665746953 (test)
10000000:01000000:6:1338449754.150836:0:19990:0:(mgc_request.c:281:mgc_requeue_thread()) Starting requeue thread
10000000:01000000:6:1338449757.820844:0:19990:0:(mgc_request.c:307:mgc_requeue_thread()) updating log test-client
10000000:01000000:6:1338449757.820847:0:19990:0:(mgc_request.c:1128:mgc_process_log()) Process log test-client:ffff81182eb8dc00 from 83
10000000:01000000:6:1338449757.820849:0:19990:0:(mgc_request.c:657:mgc_enqueue()) Enqueue for test-client (res 0x5665746953)
00000020:01000000:0:1338449757.822392:0:19991:0:(obd_config.c:1075:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:00000080:0:1338449757.822397:0:19991:0:(obd_config.c:788:class_process_config()) processing cmd: cf010
00000020:00000080:0:1338449757.822399:0:19991:0:(obd_config.c:857:class_process_config()) marker 30 (0x1) test-OST0000-os osc.active
00000020:01000000:0:1338449757.822403:0:19991:0:(obd_config.c:1152:class_config_llog_handler()) cmd cf00f, instance name: test-OST0000-osc-ffff81182eb8dc00
00000020:00000080:0:1338449757.822404:0:19991:0:(obd_config.c:788:class_process_config()) processing cmd: cf00f
00000100:00080000:0:1338449757.822413:0:19991:0:(recover.c:258:ptlrpc_set_import_active()) setting import test-OST0000_UUID VALID
00000100:00080000:0:1338449757.822415:0:19991:0:(import.c:182:ptlrpc_set_import_discon()) osc: import ffff81182ad69800 already not connected (conn 0, was 0): NEW
00020000:00020000:0:1338449757.822422:0:19991:0:(lov_obd.c:220:lov_notify()) event(9) of test-OST0000_UUID failed: -22
00000020:01000000:0:1338449757.830661:0:19991:0:(obd_config.c:1017:class_process_proc_param()) test-OST0000-osc-ffff81182eb8dc00.osc: set parameter active=1
00000020:01000000:0:1338449757.830664:0:19991:0:(obd_config.c:1075:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:0:1338449757.830666:0:19991:0:(obd_config.c:788:class_process_config()) processing cmd: cf010
00000020:00000080:0:1338449757.830668:0:19991:0:(obd_config.c:857:class_process_config()) marker 30 (0x2) test-OST0000-os osc.active
00000020:01000000:6:1338449757.830730:0:19990:0:(obd_config.c:1230:class_config_parse_llog()) Processed log test-client gen 83-85 (rc=0)
10000000:01000000:6:1338449757.830740:0:19990:0:(mgc_request.c:1196:mgc_process_log()) MGC10.10.2.80@o2ib: configuration from log 'test-client' succeeded (0).

4. Messages in lov_connect_obd should print the actual target instead of uuid, which currently holds an uninitialized value.
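
When triaging a debug log like the one under item 3, it can help to filter for the lines that report a failure and decode the return code. The following is a minimal sketch, not a Lustre tool: the field names in the regular expression (subsys, mask, cpu, and so on) are assumptions inferred from the layout of the lines above, and find_failures and Failure are hypothetical helper names introduced for illustration.

```python
import errno
import re
from typing import Iterable, Iterator, NamedTuple

# Assumed layout of one "lctl dk" line, inferred from the log in this ticket:
# subsys:mask:cpu:sec.usec:stack:pid:extpid:(file:line:func()) message
LINE_RE = re.compile(
    r"^(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):(?P<cpu>\d+):"
    r"(?P<time>\d+\.\d+):(?P<stack>\d+):(?P<pid>\d+):(?P<extpid>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s(?P<msg>.*)$"
)

# Matches messages that end in "failed: -<errno>".
FAILED_RE = re.compile(r"failed: -(\d+)\s*$")


class Failure(NamedTuple):
    """Hypothetical record for one failing log line."""
    timestamp: float
    pid: int
    location: str
    message: str
    errname: str


def find_failures(lines: Iterable[str]) -> Iterator[Failure]:
    """Yield a Failure for every parsable line whose message ends in 'failed: -N'."""
    for raw in lines:
        parsed = LINE_RE.match(raw.strip())
        if parsed is None:
            continue  # not in the assumed debug-log format
        failed = FAILED_RE.search(parsed["msg"])
        if failed is None:
            continue  # parsable, but does not report a failure
        code = int(failed.group(1))
        yield Failure(
            timestamp=float(parsed["time"]),
            pid=int(parsed["pid"]),
            location=f"{parsed['file']}:{parsed['line']}:{parsed['func']}",
            message=parsed["msg"],
            errname=errno.errorcode.get(code, "unknown"),
        )


# Two lines copied from the debug log in this ticket.
SAMPLE = [
    "00000100:00080000:0:1338449757.822415:0:19991:0:(import.c:182:ptlrpc_set_import_discon()) "
    "osc: import ffff81182ad69800 already not connected (conn 0, was 0): NEW",
    "00020000:00020000:0:1338449757.822422:0:19991:0:(lov_obd.c:220:lov_notify()) "
    "event(9) of test-OST0000_UUID failed: -22",
]

if __name__ == "__main__":
    for failure in find_failures(SAMPLE):
        print(failure.location, failure.errname, failure.message)
```

Run against the full /root/lustre2.log, this would single out the lov_notify() line. The return code -22 corresponds to EINVAL on Linux, and the line just before it shows the import still in the NEW state when activation was attempted.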



 Comments   
Comment by Jeremy Filizetti [ 31/May/12 ]

For number 4 a patch can be found at:
http://review.whamcloud.com/2992

Comment by Jeremy Filizetti [ 31/May/12 ]

Patch for number 1 can be found at:
http://review.whamcloud.com/2993

Comment by Peter Jones [ 31/May/12 ]

Thanks Jeremy! Bobijam, could you please review these patches and port them if necessary, just as you did with LU-1448? Thanks!

Comment by Peter Jones [ 26/Aug/12 ]

It looks like #1 has been ported and landed on master, but #4 is still outstanding.

Comment by Zhenyu Xu [ 26/Aug/12 ]

Master already contains the changes for #4.

Comment by Jodi Levi (Inactive) [ 27/Sep/12 ]

Please reopen ticket if more work is needed.

Comment by Jeremy Filizetti [ 27/Sep/12 ]

I need to reopen this because #3 is still not fixed. Next week I will try to post more details about the issue, but I don't currently have a fix for it.

Comment by Peter Jones [ 27/Sep/12 ]

OK Jeremy, I have reopened the ticket.

Comment by Kit Westneat (Inactive) [ 16/Oct/13 ]

Any updates on this? We recently ran into #3 as well.

Comment by Zhenyu Xu [ 28/Nov/13 ]

Issue #3 is linked to LU-4302.

Generated at Sat Feb 10 01:16:49 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.