[LU-13804] LustreError (osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data049: rc = -22 Created: 20/Jul/20 Updated: 20/Oct/23 Resolved: 22/Jul/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Olaf Faaland | Assignee: | Lai Siyao |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | llnl | ||
| Environment: |
Lustre 2.10.8 and 2.12.5 |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The jet1 console log reports the following during an I/O SWL (and a few other misc jobs) on Opal:

Jul 15 09:51:54 jet1 kernel: LustreError: 1391:0:(osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data049: rc = -22
Jul 15 09:51:54 jet1 kernel: LustreError: 1391:0:(osd_index.c:1201:osd_dir_delete()) Skipped 1 previous similar message
Jul 15 09:51:54 jet1 kernel: LustreError: 17478:0:(osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data026: rc = -22
Jul 15 09:51:54 jet1 kernel: LustreError: 17478:0:(osd_index.c:1201:osd_dir_delete()) Skipped 75 previous similar messages

No console log messages on Opal appeared to correspond: there were no messages at 09:51, and no unusual messages at all on Opal.

Testing under both Lustre 2.10 and Lustre 2.12 included creating striped directories via lfs mkdir -i3 -c4 <target>. Some of these directories were likely created under 2.10 and deleted under 2.12.

Before this occurred, the jet servers had been upgraded from Lustre 2.10 to Lustre 2.12, then downgraded to 2.10, and then upgraded to 2.12 again. Significant I/O was performed under each Lustre version, and changelog users were deregistered and logs cleared. |
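When triaging messages like the ones quoted above, it can help to tally how many were actually logged versus suppressed by the "Skipped N previous similar messages" rate limiting. The following is a minimal sketch only: the regular expressions are written against the four sample lines in this ticket, and the function name `tally` is hypothetical, not part of any Lustre tool.

```python
import re
from collections import Counter

# Both patterns are assumptions derived from the sample lines in this ticket;
# other Lustre messages may be formatted differently.
SRC = r"\((?P<src>[\w.]+:\d+:\w+\(\))\)"
ERR_RE = re.compile(
    r"LustreError: \d+:\d+:" + SRC + r" (?P<target>\S+): "
    r"failed to destroy agent object \(\d+\) "
    r"for the entry (?P<entry>\S+): rc = (?P<rc>-?\d+)"
)
SKIP_RE = re.compile(
    r"LustreError: \d+:\d+:" + SRC + r" Skipped (?P<n>\d+) previous similar messages?"
)


def tally(lines):
    """Return (logged, skipped) Counters keyed by source location."""
    logged = Counter()
    skipped = Counter()
    for line in lines:
        m = ERR_RE.search(line)
        if m:
            logged[m["src"]] += 1
            continue
        m = SKIP_RE.search(line)
        if m:
            # The rate limiter reports how many messages it suppressed.
            skipped[m["src"]] += int(m["n"])
    return logged, skipped


if __name__ == "__main__":
    import sys

    logged, skipped = tally(sys.stdin)
    for src in sorted(set(logged) | set(skipped)):
        print(f"{src}: logged={logged[src]} skipped={skipped[src]}")
```

Feeding the console log to the script on standard input prints one summary line per source location, which gives a rough sense of how many entries hit this code path.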
| Comments |
| Comment by Olaf Faaland [ 20/Jul/20 ] |
|
The osd_dir_delete() CERROR was added after 2.10 branched. The commit is
Based on the commit message, I believe it's possible that creating remote objects under Lustre 2.10, and then deleting them under 2.12, would trigger this error message. Reaching this error message does not cause osd_dir_delete() to return early or to return an error, so it seems harmless (i.e., it does not indicate damage to the file system). It also seems to be explained by my downgrade-then-upgrade. Please confirm, thanks. |
| Comment by Olaf Faaland [ 20/Jul/20 ] |
|
Note that this was performed on our staging file system, not a production one. However, we are about to put Lustre 2.12.5 on all our file systems, so I'm working through error messages to minimize risk. |
| Comment by Peter Jones [ 21/Jul/20 ] |
|
Lai, could you please advise? Thanks, Peter |
| Comment by Lai Siyao [ 22/Jul/20 ] |
|
Yes, it's exactly what you described. |
| Comment by Olaf Faaland [ 22/Jul/20 ] |
|
Thank you |