[LU-4354] 2.1 clients fail to mount upgraded 2.4 filesystem with EEXIST Created: 05/Dec/13  Updated: 20/Jan/14  Resolved: 20/Jan/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Critical
Reporter: Ned Bass
Assignee: Cliff White (Inactive)
Resolution: Not a Bug
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 11926

 Description   

Clients are unable to mount an ldiskfs filesystem upgraded from 2.1 to 2.4.

Transcribed client messages:

mgc_request.c:247 do_config_log_add() failed processing sptlrpc log: -2
genops.c:304 class_newdev() ls4-OST0000-osc-... already exists, won't add
obd_config.c:368:class_attach() Cannot create device ls4-OST000... of type osc: -17
obd_config.c:1491:class_config_llog_handler() Err -17 on cfg command:
 cmd=cf001 0:ls4-OST0000... 1:osc 2:ls4-clilov_UUID
LustreError: The configuration from log 'ls4-client' failed (-17). ...
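
The return codes in these messages are negative Linux errno values, so -2 and -17 can be decoded with the standard library. The following is a minimal sketch for reading such codes, not part of Lustre; the helper name describe_rc is made up for illustration.

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Format a negative kernel-style return code as 'rc: NAME (message)'.

    describe_rc is a hypothetical helper for this ticket, not a Lustre API.
    """
    code = abs(rc)
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{rc}: {name} ({os.strerror(code)})"


# The two codes seen in the client messages above.
for rc in (-2, -17):
    print(describe_rc(rc))
```

On Linux, -2 decodes to ENOENT (the sptlrpc log was not found) and -17 decodes to EEXIST (the OSC device was already registered), which matches the EEXIST in the ticket title.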


 Comments   
Comment by Ned Bass [ 05/Dec/13 ]

On the MGS, the debug log shows the following when a client tries to mount:

MGS fail to handle opc = 501: rc = -2
Comment by Cliff White (Inactive) [ 06/Dec/13 ]

Can you attach the debug log?

Comment by Ned Bass [ 06/Dec/13 ]

This may have been related to an incomplete writeconf procedure. Admins are repeating the procedure now.

Comment by Ned Bass [ 06/Dec/13 ]

Sorry Cliff, the system is classified.

Comment by Cliff White (Inactive) [ 06/Dec/13 ]

Had a feeling you would say that. I would indeed repeat the writeconf. If there are still issues, I would directly examine the ls4-client config log with llog_reader and see what is in there. The writeconf is likely the correct answer.
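
If the config log does need inspecting, one thing to look for is the same device being attached more than once, since that is what the class_newdev() "already exists" message complains about. The sketch below scans text saved from llog_reader for repeated attach records. The record layout it expects ("attach 0:<device> 1:<type> ...") is an assumption modelled on the cfg command format in the client messages, and the sample lines, including the "demo" filesystem name, are invented for illustration rather than taken from a real log.

```python
import re
from collections import Counter
from typing import Dict

# Assumed record shape: "... attach 0:<device> 1:<type> 2:<uuid>".
# Adjust the pattern if the real llog_reader output differs.
ATTACH_RE = re.compile(r"\battach\s+0:(\S+)\s+1:(\S+)")


def duplicate_attaches(llog_text: str) -> Dict[str, int]:
    """Return device names that appear in more than one attach record."""
    counts = Counter(m.group(1) for m in ATTACH_RE.finditer(llog_text))
    return {dev: n for dev, n in counts.items() if n > 1}


# Hypothetical sample text, not output from a real system.
sample = """\
#10 (128)attach    0:demo-OST0000-osc  1:osc  2:demo-clilov_UUID
#11 (144)setup     0:demo-OST0000-osc  1:demo-OST0000_UUID  2:10.0.0.1@tcp
#20 (128)attach    0:demo-OST0000-osc  1:osc  2:demo-clilov_UUID
#21 (128)attach    0:demo-OST0001-osc  1:osc  2:demo-clilov_UUID
"""

print(duplicate_attaches(sample))
```

This sketch does not try to tell active records from ones the log marks as skipped, so a hit is only a hint to read that part of the log by hand.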

Comment by Minh Diep [ 06/Dec/13 ]

I am curious, why would a writeconf be needed? Unless you need to change the NIDs, or do a backup/restore.

Comment by Ned Bass [ 06/Dec/13 ]

Minh, we did a writeconf because these are upgraded 1.8 filesystems, and we were advised that a writeconf is needed before going to 2.4. Otherwise clients would not initialize the lmv layer which is needed after DNE landed.

Comment by Cliff White (Inactive) [ 09/Dec/13 ]

Did the writeconf fix this issue?

Comment by Ned Bass [ 09/Dec/13 ]

Cliff, yes the filesystem is back online after a writeconf. So I guess there is not a bug here, other than poor error reporting.

Comment by Cliff White (Inactive) [ 20/Jan/14 ]

I am closing this issue as not a bug. Please reopen if you have further questions or issues.
