[LU-13085] (namei.c:87:ll_set_inode()) Can not initialize inode [0x540028b1f:0x2:0x0] without object type: valid = 0x100000001
Created: 18/Dec/19  Updated: 01/Feb/20  Resolved: 22/Jan/20
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.8 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Olaf Faaland | Assignee: | Lai Siyao |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | llnl |
| Environment: | corona82 login node |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
I attempted to list the contents of /p/lustre3/faaland1/ using tab complete to determine the "faaland1" part of the path. There was a delay of several seconds, then the ls command returned with appropriate output. On the console of the client node, I saw the following:

[Tue Dec 17 18:07:31 2019] LustreError: 24055:0:(namei.c:87:ll_set_inode()) Can not initialize inode [0x2400013a0:0x6b6:0x0] without object type: valid = 0x100000001
[Tue Dec 17 18:07:31 2019] LustreError: 24055:0:(namei.c:87:ll_set_inode()) Skipped 6 previous similar messages
[Tue Dec 17 18:07:31 2019] LustreError: 24055:0:(llite_lib.c:2426:ll_prep_inode()) new_inode -fatal: rc -12
[Tue Dec 17 18:07:31 2019] LustreError: 24055:0:(llite_lib.c:2426:ll_prep_inode()) Skipped 6 previous similar messages

See https://github.com/LLNL/lustre for the patch stacks.
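For context on the first message: ll_set_inode() refuses to initialize an inode when the MDT reply carries a FID but no file-type bits in its valid mask. A minimal sketch of decoding the reported mask, assuming the 2.10-era OBD_MD_* flag values from lustre_idl.h (worth confirming against the LLNL tree):

    /* decode_valid.c -- hedged sketch: decode the "valid" mask from the
     * console error using OBD_MD_* flag values as defined in a 2.10-era
     * lustre_idl.h (assumed here; verify against the tree in use). */
    #include <stdio.h>
    #include <stdint.h>

    #define OBD_MD_FLID       0x0000000000000001ULL /* object ID */
    #define OBD_MD_FLTYPE     0x0000000000000100ULL /* object type (S_IFMT bits) */
    #define OBD_MD_FLCROSSREF 0x0000000100000000ULL /* cross-MDT reference */

    int main(void)
    {
            uint64_t valid = 0x100000001ULL; /* from the ll_set_inode() message */

            printf("OBD_MD_FLID:       %s\n", valid & OBD_MD_FLID ? "set" : "clear");
            printf("OBD_MD_FLTYPE:     %s\n", valid & OBD_MD_FLTYPE ? "set" : "clear");
            printf("OBD_MD_FLCROSSREF: %s\n", valid & OBD_MD_FLCROSSREF ? "set" : "clear");
            return 0;
    }

If those flag values hold, 0x100000001 is OBD_MD_FLID | OBD_MD_FLCROSSREF with OBD_MD_FLTYPE clear, i.e. the reply described a cross-MDT (remote) object without its type, which ll_set_inode() rejects with the message above.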
| Comments |
| Comment by Olaf Faaland [ 18/Dec/19 ] |
There are 2 MDTs on this file system. The listed FID's info and my directory's info:

[root@corona82:~]# lfs fid2path /p/lustre3/ [0x2400013a0:0x6b6:0x0]
/p/lustre3/bennion1
[root@corona82:~]# lfs getdirstripe /p/lustre3/bennion1
lmv_stripe_count: 0
lmv_stripe_offset: 1
lmv_hash_type: none
[root@corona82:~]# lfs getdirstripe /p/lustre3/faaland1
lmv_stripe_count: 0
lmv_stripe_offset: 0
lmv_hash_type: none
[root@corona82:~]# lfs path2fid /p/lustre3/faaland1
[0x200000bd0:0x745:0x0]
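A note on reading the bracketed FIDs: they print as [sequence:object-id:version], and sequences are handed out per MDT, so the failing FID (seq 0x2400013a0) and faaland1's FID (seq 0x200000bd0) resolving to different MDTs would be consistent with the lmv_stripe_offset values above. A minimal sketch of the layout, assuming the standard struct lu_fid from lustre_idl.h and a print format matching the client's DFID macro:

    #include <stdio.h>
    #include <stdint.h>

    /* Field layout as in Lustre's lustre_idl.h. */
    struct lu_fid {
            uint64_t f_seq; /* sequence; allocated per MDT, so it also
                             * indicates which MDT created the object */
            uint32_t f_oid; /* object ID within that sequence */
            uint32_t f_ver; /* version, normally 0 */
    };

    int main(void)
    {
            /* The FID from the console error above. */
            struct lu_fid fid = { .f_seq = 0x2400013a0ULL, .f_oid = 0x6b6, .f_ver = 0 };

            /* Prints "[0x2400013a0:0x6b6:0x0]", the same form dmesg shows. */
            printf("[0x%llx:0x%x:0x%x]\n",
                   (unsigned long long)fid.f_seq, fid.f_oid, fid.f_ver);
            return 0;
    }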
| Comment by Olaf Faaland [ 18/Dec/19 ] |
The two MDS nodes have nothing in dmesg from the time when the error was reported on the client:

---------------- eporter81 ----------------
[Tue Dec 17 16:07:18 2019] Lustre: Skipped 3 previous similar messages
[Tue Dec 17 16:12:11 2019] Lustre: MGS: Connection restored to ae14aba1-e9ca-73c5-d91e-d5258431f3c5 (at 192.168.128.137@o2ib20)
[Tue Dec 17 16:12:11 2019] Lustre: Skipped 1 previous similar message
[Tue Dec 17 16:12:23 2019] Lustre: MGS: Connection restored to 57067c2f-6884-8432-75c3-5d29b7b44621 (at 192.168.128.138@o2ib20)
[Tue Dec 17 16:12:23 2019] Lustre: Skipped 1 previous similar message

---------------- eporter82 ----------------
[Tue Dec 17 15:23:34 2019] Lustre: lustre3-MDT0001: Connection restored to 6e206333-0ad1-f33f-5455-92e3e1592cf5 (at 192.168.128.140@o2ib20)
[Tue Dec 17 16:07:39 2019] Lustre: lustre3-MDT0001: haven't heard from client a8de14f3-4714-ff41-67ca-ffcfd5d3ce43 (at 192.168.128.138@o2ib20) in 227 seconds. I think it's dead, and I am evicting it. exp ffff99e50e087800, cur 1576627639 expire 1576627489 last 1576627412
[Tue Dec 17 16:07:39 2019] Lustre: Skipped 1 previous similar message
[Tue Dec 17 16:12:33 2019] Lustre: lustre3-MDT0001: Connection restored to ae14aba1-e9ca-73c5-d91e-d5258431f3c5 (at 192.168.128.137@o2ib20)
[Tue Dec 17 16:12:45 2019] Lustre: lustre3-MDT0001: Connection restored to a8de14f3-4714-ff41-67ca-ffcfd5d3ce43 (at 192.168.128.138@o2ib20)
| Comment by Olaf Faaland [ 18/Dec/19 ] |
I saw the same error reported by the client earlier today, but I've not yet been able to reproduce it on demand. In the earlier case, the FID reported is on a different file system, lustre2. That other file system is running a slightly different tag, lustre-2.10.8_4.chaos.

[root@corona82:~]# dmesg -T | grep ll_prep_inode -C2
[Mon Dec 16 12:54:27 2019] LNetError: 45583:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 14 previous similar messages
[Mon Dec 16 13:02:43 2019] LustreError: 118530:0:(namei.c:87:ll_set_inode()) Can not initialize inode [0x580023454:0x2:0x0] without object type: valid = 0x100000001
[Mon Dec 16 13:02:43 2019] LustreError: 118530:0:(llite_lib.c:2426:ll_prep_inode()) new_inode -fatal: rc -12
[Mon Dec 16 13:04:37 2019] LNetError: 45583:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 192.168.128.86@o2ib36 added to recovery queue. Health = 900
[Mon Dec 16 13:04:37 2019] LNetError: 45583:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 11 previous similar messages

[root@corona82:~]# lfs fid2path /p/lustre2/ [0x580023454:0x2:0x0]
/p/lustre2/faaland1/make-busy/mdt14
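One observation on the recurring rc -12: it is likely an artifact of the VFS iget interface rather than real memory pressure. A hedged, abridged paraphrase of the 2.10-era client path in lustre/llite/namei.c (not a verbatim copy):

    /* Abridged paraphrase of ll_iget() (lustre/llite/namei.c, 2.10-era tree). */
    struct inode *ll_iget(struct super_block *sb, ino_t hash,
                          struct lustre_md *md)
    {
            struct inode *inode;

            /* iget5_locked() calls ll_set_inode() to initialize a newly
             * allocated inode. When that callback fails -- here because
             * OBD_MD_FLTYPE is missing from the MDT reply -- iget5_locked()
             * returns NULL, and the only error this path can report upward
             * is -ENOMEM. */
            inode = iget5_locked(sb, hash, ll_test_inode, ll_set_inode, md);
            if (inode == NULL)
                    return ERR_PTR(-ENOMEM);

            return inode;
    }

If this reading is right, ll_prep_inode() then logs that ERR_PTR as "new_inode -fatal: rc -12", so the -ENOMEM masks the -EINVAL that ll_set_inode() actually returned.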
| Comment by Olaf Faaland [ 18/Dec/19 ] |
Let me know what information you'd like me to try to gather for the next occurrence.
| Comment by Olaf Faaland [ 18/Dec/19 ] |
I gathered debug logs from the client and attached them, along with the debug logs from the porter MDS nodes (the /p/lustre3 file system):
| Comment by Peter Jones [ 18/Dec/19 ] |
Lai, could you please investigate? Thanks, Peter
| Comment by Lai Siyao [ 29/Dec/19 ] |
This is a duplicate of |