[LU-3574] MDT cannot sync with OST after upgrade from 2.1 to 2.4 then downgrade Created: 11/Jul/13  Updated: 11/Jul/13  Resolved: 11/Jul/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.1.3
Fix Version/s: Lustre 2.1.6

Type: Bug Priority: Critical
Reporter: Sebastien Buisson (Inactive) Assignee: Bruno Faccini (Inactive)
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-2888 After downgrade from 2.4 to 2.1.4, hi... Resolved
Severity: 3
Rank (Obsolete): 9042

 Description   

Hi,

I format and start a Lustre file system with Lustre 2.1.
Then I stop my file system, and upgrade to Lustre 2.4.
I restart my Lustre file system, and everything is fine.
Then I stop my file system, and decide to downgrade to Lustre 2.1.

The MGT restarts fine, as well as the OSTs.
The MDT mount returns successfully, but dmesg shows the following errors:

[root@yosemite2 ~]# dmesg 
LDISKFS-fs warning (device loop1): ldiskfs_fill_super: extents feature not enabled on this filesystem, use tune2fs.
LDISKFS-fs (loop1): barriers disabled
LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
LDISKFS-fs warning (device loop1): ldiskfs_fill_super: extents feature not enabled on this filesystem, use tune2fs.
LDISKFS-fs (loop1): barriers disabled
LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
Lustre: Enabling ACL
Lustre: Enabling user_xattr
Lustre: migrate-MDT0000: used disk, loading
LustreError: 3273:0:(llog_lvfs.c:199:llog_lvfs_read_header()) bad log / header magic: 0x2e (expected 0x10645539)
LustreError: 3273:0:(llog_obd.c:218:llog_setup_named()) obd migrate-OST0001-osc-MDT0000 ctxt 2 lop_setup=ffffffffa0530cb0 failed -5
LustreError: 3273:0:(osc_request.c:4198:__osc_llog_init()) failed LLOG_MDS_OST_ORIG_CTXT
LustreError: 3273:0:(osc_request.c:4215:__osc_llog_init()) osc 'migrate-OST0001-osc-MDT0000' tgt 'mdd_obd-migrate-MDT0000' catid ffff88003da1b8d0 rc=-5
LustreError: 3273:0:(osc_request.c:4217:__osc_llog_init()) logid 0x2:0x0
LustreError: 3273:0:(osc_request.c:4245:osc_llog_init()) rc: -5
LustreError: 3273:0:(lov_log.c:248:lov_llog_init()) error osc_llog_init idx 1 osc 'migrate-OST0001-osc-MDT0000' tgt 'mdd_obd-migrate-MDT0000' (rc=-5)
LustreError: 3273:0:(llog_lvfs.c:616:llog_lvfs_create()) error looking up logfile 0x4:0x0: rc -116
LustreError: 3273:0:(llog_obd.c:218:llog_setup_named()) obd migrate-OST0000-osc-MDT0000 ctxt 2 lop_setup=ffffffffa0530cb0 failed -116
LustreError: 3273:0:(osc_request.c:4198:__osc_llog_init()) failed LLOG_MDS_OST_ORIG_CTXT
LustreError: 3273:0:(osc_request.c:4215:__osc_llog_init()) osc 'migrate-OST0000-osc-MDT0000' tgt 'mdd_obd-migrate-MDT0000' catid ffff88003da1b8d0 rc=-116
LustreError: 3302:0:(lov_log.c:160:lov_llog_origin_connect()) error osc_llog_connect tgt 0 (-107)
LustreError: 3302:0:(mds_lov.c:870:__mds_lov_synchronize()) migrate-OST0000_UUID failed at llog_origin_connect: -107
LustreError: 3302:0:(mds_lov.c:901:__mds_lov_synchronize()) migrate-OST0000_UUID sync failed -107, deactivating
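
The return codes in the MDT log above look like negated Linux errno values (-5, -116, -107). As a reading aid, here is a minimal Python sketch that extracts them from LustreError lines and names them. The `summarize_rcs` helper, the regular expression, and the small errno table (Linux numbering assumed) are illustrative assumptions and are not part of any Lustre tool.

```python
import re

# Linux errno numbers for the codes seen in this ticket (assumed Linux numbering).
LINUX_ERRNO = {5: "EIO", 107: "ENOTCONN", 116: "ESTALE"}

# A negative integer that is not part of a hyphenated name such as
# "migrate-OST0001": the "-" must not follow a word character or another "-".
RC_RE = re.compile(r"(?<![\w-])-(\d+)\b")


def summarize_rcs(log_text):
    """Count the errno names found in LustreError lines of a dmesg excerpt."""
    counts = {}
    for line in log_text.splitlines():
        if not line.startswith("LustreError"):
            continue
        for match in RC_RE.finditer(line):
            code = int(match.group(1))
            name = LINUX_ERRNO.get(code, "errno %d" % code)
            counts[name] = counts.get(name, 0) + 1
    return counts


# Three lines copied from the MDT dmesg output in this ticket.
SAMPLE = """\
LustreError: 3273:0:(llog_obd.c:218:llog_setup_named()) obd migrate-OST0001-osc-MDT0000 ctxt 2 lop_setup=ffffffffa0530cb0 failed -5
LustreError: 3273:0:(llog_lvfs.c:616:llog_lvfs_create()) error looking up logfile 0x4:0x0: rc -116
LustreError: 3302:0:(mds_lov.c:901:__mds_lov_synchronize()) migrate-OST0000_UUID sync failed -107, deactivating
"""

if __name__ == "__main__":
    print(summarize_rcs(SAMPLE))
```

Read this way, the llog header for OST0001 fails with an I/O error, the llog lookup for OST0000 returns a stale-handle error, and the later synchronization attempt fails because the connection is not established.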

lctl dl shows that the connection to the OSTs is not operational:

[root@yosemite2 ~]# lctl dl
  0 UP mgs MGS MGS 7
  1 UP mgc MGC10.9.0.2@tcp efd0d467-a35f-6aed-8db6-089a2d2584ed 5
  2 UP lov migrate-MDT0000-mdtlov migrate-MDT0000-mdtlov_UUID 4
  3 UP mdt migrate-MDT0000 migrate-MDT0000_UUID 3
  4 UP mds mdd_obd-migrate-MDT0000 mdd_obd_uuid-migrate-MDT0000 3
  5 IN osc migrate-OST0001-osc-MDT0000 migrate-MDT0000-mdtlov_UUID 5
  6 IN osc migrate-OST0000-osc-MDT0000 migrate-MDT0000-mdtlov_UUID 5
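
Each row of the `lctl dl` output above appears to carry six whitespace-separated columns: device index, state, type, name, UUID, and reference count. The two `osc` devices are in state `IN` rather than `UP`. A small Python sketch, assuming that six-column layout (the `Device`, `parse_lctl_dl`, and `not_up` names are hypothetical), can flag such devices automatically:

```python
from typing import List, NamedTuple


class Device(NamedTuple):
    index: int
    state: str
    dev_type: str
    name: str
    uuid: str
    refcount: int


def parse_lctl_dl(text: str) -> List[Device]:
    """Parse `lctl dl` output, skipping lines that are not device rows."""
    devices = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: index state type name uuid refcount
        if len(fields) != 6 or not fields[0].isdigit() or not fields[5].isdigit():
            continue
        idx, state, dev_type, name, uuid, refs = fields
        devices.append(Device(int(idx), state, dev_type, name, uuid, int(refs)))
    return devices


def not_up(devices: List[Device]) -> List[str]:
    """Return the names of devices whose state is anything other than UP."""
    return [d.name for d in devices if d.state != "UP"]


# Rows copied from the MDS `lctl dl` output in this ticket.
SAMPLE = """\
[root@yosemite2 ~]# lctl dl
  3 UP mdt migrate-MDT0000 migrate-MDT0000_UUID 3
  5 IN osc migrate-OST0001-osc-MDT0000 migrate-MDT0000-mdtlov_UUID 5
  6 IN osc migrate-OST0000-osc-MDT0000 migrate-MDT0000-mdtlov_UUID 5
"""

if __name__ == "__main__":
    for name in not_up(parse_lctl_dl(SAMPLE)):
        print(name)
```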

In the OSS's dmesg, I have:

[root@yosemite3 ~]# dmesg 
LDISKFS-fs (loop0): barriers disabled
LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: 
LDISKFS-fs (loop0): barriers disabled
LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: 
Lustre: MGC10.9.0.2@tcp: Reactivating import
Lustre: migrate-OST0001: Now serving migrate-OST0001 on /dev/loop0 with recovery enabled
LDISKFS-fs (loop1): barriers disabled
LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
LDISKFS-fs (loop1): barriers disabled
LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
Lustre: migrate-OST0000: Now serving migrate-OST0000 on /dev/loop1 with recovery enabled
Lustre: 3221:0:(ldlm_lib.c:946:target_handle_connect()) migrate-OST0001: connection from migrate-MDT0000-mdtlov_UUID@10.9.0.2@tcp t0 exp (null) cur 1373508570 last 0
Lustre: migrate-OST0000: received MDS connection from 10.9.0.2@tcp

On the OSS, lctl dl gives:

[root@yosemite3 ~]# lctl dl
  0 UP mgc MGC10.9.0.2@tcp 93f173c1-3222-0f46-4337-5fe54f34d51d 5
  1 UP ost OSS OSS_uuid 3
  2 UP obdfilter migrate-OST0001 migrate-OST0001_UUID 5
  3 UP obdfilter migrate-OST0000 migrate-OST0000_UUID 5

Surprisingly, I can mount Lustre on my client and read an existing file, but creating a new file fails:

[root@yosemite4 ~]# cat /migrate/hi
hey
[root@yosemite4 ~]# touch /migrate/hello
touch: cannot touch `/migrate/hello': Input/output error

As a consequence, my file system is not usable.

Sebastien.



 Comments   
Comment by Bruno Faccini (Inactive) [ 11/Jul/13 ]

Hello Seb,
On first reading, this looks like a duplicate of LU-2888. I will investigate and give you a definitive answer soon.

Comment by Sebastien Buisson (Inactive) [ 11/Jul/13 ]

Hi Bruno!

I am sorry, I am afraid this ticket could be a duplicate of LU-2888.
I have just tested the upgrade/downgrade path with Lustre 2.1.6 and Lustre 2.4.0, and this time it went smoothly.

Sebastien.

Comment by Bruno Faccini (Inactive) [ 11/Jul/13 ]

OK, so do you agree that I can close it now?

Comment by Diego Moreno (Inactive) [ 11/Jul/13 ]

Hi Bruno, you can close it as a duplicate of LU-2888.

Comment by Bruno Faccini (Inactive) [ 11/Jul/13 ]

Thanks, Diego.

Comment by Bruno Faccini (Inactive) [ 11/Jul/13 ]

Closed as a duplicate of LU-2888.
