[LU-9576] Failed mount when osp catalog is full Created: 31/May/17  Updated: 21/Dec/17  Resolved: 21/Dec/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Alexander Boyko
Assignee: WC Triage
Resolution: Won't Fix
Votes: 0
Labels: None

Epic/Theme: patch
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
Mar  4 01:32:35 snx11103n002 kernel: LDISKFS-fs (md66): recovery complete
Mar  4 01:32:36 snx11103n002 kernel: LDISKFS-fs (md66): mounted filesystem with ordered data mode. quota=on. Opts:
Mar  4 01:32:37 snx11103n002 kernel: Lustre: 198403:0:(llog_cat.c:91:llog_cat_new_log()) snx11103-OST0000-osc-MDT0000: there are no more free slots in catalog
Mar  4 01:32:37 snx11103n002 kernel: LustreError: 198403:0:(osp_sync.c:1336:osp_sync_init()) snx11103-OST0000-osc-MDT0000: can't initialize llog: rc = -28
Mar  4 01:32:37 snx11103n002 kernel: LustreError: 198403:0:(obd_config.c:517:class_setup()) setup snx11103-OST0000-osc-MDT0000 failed (-28)
Mar  4 01:32:37 snx11103n002 kernel: LustreError: 198403:0:(obd_config.c:1550:class_config_llog_handler()) MGC10.10.100.52@o2ib: cfg command failed: rc = -28
Mar  4 01:32:37 snx11103n002 kernel: Lustre:    cmd=cf003 0:snx11103-OST0000-osc-MDT0000  1:snx11103-OST0000_UUID  2:10.10.100.54@o2ib
Mar  4 01:32:37 snx11103n002 kernel:
Mar  4 01:32:37 snx11103n002 kernel: LustreError: 15c-8: MGC10.10.100.52@o2ib: The configuration from log 'snx11103-MDT0000' failed (-28). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Mar  4 01:32:37 snx11103n002 kernel: LustreError: 196556:0:(obd_mount_server.c:1242:server_start_targets()) failed to start server snx11103-MDT0000: -28
Mar  4 01:32:37 snx11103n002 kernel: LustreError: 196556:0:(obd_mount_server.c:1739:server_fill_super()) Unable to start targets: -28
Mar  4 01:32:37 snx11103n002 kernel: LustreError: 196556:0:(obd_mount_server.c:834:lustre_disconnect_lwp()) snx11103-MDT0000-lwp-MDT0000: Can't end config log snx11103-client.
Mar  4 01:32:37 snx11103n002 kernel: LustreError: 196556:0:(obd_mount_server.c:1422:server_put_super()) snx11103-MDT0000: failed to disconnect lwp. (rc=-2)
Mar  4 01:32:37 snx11103n002 kernel: Lustre: Failing over snx11103-MDT0000
Mar  4 01:32:38 snx11103n002 kernel: Lustre: server umount snx11103-MDT0000 complete
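
The negative return codes in the log above (rc = -28, rc=-2) are negated Linux errno values. A minimal sketch for decoding them, assuming Python 3 on a Linux host; describe_rc is a hypothetical helper written for this note and is not part of Lustre:

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Decode a negative kernel-style return code into its errno name and message."""
    code = abs(rc)
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{rc}: {name} ({os.strerror(code)})"


# The two return codes that appear in the mount log above.
for rc in (-28, -2):
    print(describe_rc(rc))
```

Here -28 decodes to ENOSPC, which is consistent with the "there are no more free slots in catalog" message that precedes the failure, and -2 decodes to ENOENT, reported later when the lwp disconnect fails.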


 Comments   
Comment by Alexander Boyko [ 31/May/17 ]

The current osp sync logic doesn't handle a full catalog.

Comment by Gerrit Updater [ 31/May/17 ]

Alexander Boyko (alexander.boyko@seagate.com) uploaded a new patch: https://review.whamcloud.com/27357
Subject: LU-9576 osp: skip llog_add error
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 81ae3b7fb89ed88456b246035d080a804ed9a55f

Comment by Alexander Boyko [ 21/Jun/17 ]

A full osp catalog is a very rare situation. I think this ticket can be closed.
