[LU-8292] unnecessary LustreError 17754:0:(file.c:1900:ll_do_fiemap()) obd_get_info failed: rc = -95
Created: 16/Jun/16  Updated: 14/Feb/17  Resolved: 03/Oct/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.5
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Olaf Faaland
Assignee: Nathaniel Clark
Resolution: Fixed
Votes: 0
Labels: llnl, llnlfixready, zfs
Environment:

RHEL6.7 and RHEL7.2
lustre-2.5.5-7chaos_2.6.32_642.1.1.1chaos.ch5.5.x86_64.x86_64
lustre-2.8.0_0.0.llnlpreview.13-1.ch6.x86_64
ZFS used as backend file system


Issue Links:
Related
is related to LU-8256 BUG: unable to handle kernel paging r... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

With a ZFS backend, ioctl(fd, FS_IOC_FIEMAP, fiemap) on the client results in an obd_get_info call to the OST(s).
ZFS does not support FIEMAP, so the dt_fiemap_get() call returns -EOPNOTSUPP (-95). When the client receives the RPC reply, two LustreError messages are printed to the console:

2016-06-16 10:04:23 LustreError: 11-0: lcy-OST0004-osc-ffff88044702bc00: Communicating with 10.1.1.175@o2ib9, operation ost_get_info failed with -95.
2016-06-16 10:04:23 LustreError: 3698:0:(file.c:1900:ll_do_fiemap()) obd_get_info failed: rc = -95

Since the user is notified of the ioctl failure, and the failure does not result in any compromise of the file system or the file data, this should not produce a console message.

The RHEL6 cp command did not use this ioctl, but the RHEL7 one does. So with RHEL7 this has become much more noticeable.
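
For reference, a minimal user-space sketch of the kind of FIEMAP request described above. It only asks the kernel how many extents a file has and returns the error to the caller, which is why the failure is already visible to the user without a console message. The helper name fiemap_extent_count() is hypothetical and is not taken from cp or from Lustre.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_MAX_OFFSET */

/*
 * Hypothetical helper: return the number of mapped extents in the file,
 * or a negative errno value (for example -EOPNOTSUPP, i.e. -95, when the
 * file system does not implement FIEMAP).
 */
int fiemap_extent_count(const char *path)
{
	struct fiemap fm;
	int fd, rc;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	memset(&fm, 0, sizeof(fm));
	fm.fm_start = 0;
	fm.fm_length = FIEMAP_MAX_OFFSET;  /* map the whole file */
	fm.fm_extent_count = 0;            /* count only, copy no extents */

	rc = ioctl(fd, FS_IOC_FIEMAP, &fm);
	if (rc < 0)
		rc = -errno;               /* save errno before close() */
	else
		rc = (int)fm.fm_mapped_extents;

	close(fd);
	return rc;
}
```

A caller can treat -EOPNOTSUPP as "no extent information available" and fall back to reading the file normally.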



 Comments   
Comment by Peter Jones [ 16/Jun/16 ]

Nathaniel

Could you please assist with this one?

Peter

Comment by Nathaniel Clark [ 16/Jun/16 ]

The "ll_do_fiemap" message only appears in the 2.5 code. Are you using that release on el7?

Comment by Andreas Dilger [ 16/Jun/16 ]

Two notes here:

  • the "operation ost_get_info failed" message is because the RPC reply is marked as PTL_RPC_MSG_ERR (i.e. complete RPC failure) when it should be just a PTL_RPC_MSG_REPLY with an error code in rq_status.
  • in ofd_get_info_hdl() there were changes in LU-3219 to ensure FIEMAP would flush regions of the file that are currently being written, but if the initial ofd_fiemap_get() fails then there is no point in doing the extra locking and consistency checks:
                    rc = ofd_fiemap_get(tsi->tsi_env, ofd, fid, fiemap);
    
                    /* LU-3219: Lock the sparse areas to make sure dirty
                     * flushed back from client, then call fiemap again. */
                    if (fm_key->lfik_oa.o_valid & OBD_MD_FLFLAGS &&
                        fm_key->lfik_oa.o_flags & OBD_FL_SRVLOCK) {
                            struct list_head locked;
    
                            INIT_LIST_HEAD(&locked);
                            ost_fid_build_resid(fid, &fti->fti_resid);
                            rc = lock_zero_regions(ofd->ofd_namespace,
                                                   &fti->fti_resid, fiemap,
                                                   &locked);
                            if (rc == 0 && !list_empty(&locked)) {
                                    rc = ofd_fiemap_get(tsi->tsi_env, ofd, fid,
                                                        fiemap);
                                    unlock_zero_regions(ofd->ofd_namespace,
                                                        &locked);
                            }
                    }
    

That should all be done only if the initial call succeeded, i.e. if (!rc && (...)).
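
A stand-alone sketch of the guarded control flow suggested above, not the actual ofd code: the fake_* names and struct fake_ofd are hypothetical stand-ins for ofd_fiemap_get() and lock_zero_regions(), and the list_empty() check on the locked list is omitted for brevity.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the OFD device; counts calls for inspection. */
struct fake_ofd {
	bool backend_supports_fiemap;  /* false models a ZFS backend */
	int fiemap_calls;
	int lock_calls;
};

/* Stand-in for ofd_fiemap_get(): fails when the backend lacks FIEMAP. */
static int fake_fiemap_get(struct fake_ofd *ofd)
{
	ofd->fiemap_calls++;
	return ofd->backend_supports_fiemap ? 0 : -EOPNOTSUPP;
}

/* Stand-in for lock_zero_regions(): always succeeds in this sketch. */
static int fake_lock_zero_regions(struct fake_ofd *ofd)
{
	ofd->lock_calls++;
	return 0;
}

/* Lock and retry only if the first fiemap call succeeded (rc == 0). */
int fake_get_info_fiemap(struct fake_ofd *ofd, bool srvlock)
{
	int rc = fake_fiemap_get(ofd);

	if (rc == 0 && srvlock) {
		rc = fake_lock_zero_regions(ofd);
		if (rc == 0)
			rc = fake_fiemap_get(ofd);
	}
	return rc;
}
```

With backend_supports_fiemap set to false, the function returns -EOPNOTSUPP after a single fiemap call and never takes the locks.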

Comment by Olaf Faaland [ 17/Jun/16 ]

Nathaniel,

My mistake, you're right. The ll_do_fiemap console message does not occur in 2.8. 2.8 produces only the "operation ost_get_info failed" console message Andreas describes.

thanks,
Olaf

Comment by Peter Jones [ 19/Jun/16 ]

> The "ll_do_fiemap" message only appears in the 2.5 code. Are you using that release on el7?

Yes, they have el7 for clients only.

Comment by Peter Jones [ 03/Oct/16 ]

No change needed to master. The 2.5.x fix is flagged for LLNL to pick up.
