[LU-5026] Create an lbug_on_eviction option Created: 07/May/14 Updated: 29/May/19 Resolved: 01/Oct/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.12.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Ryan Haasken | Assignee: | Chris Horn |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Rank (Obsolete): | 13907 |
| Description |
|
There should be an option to force an LBUG on eviction. This option should be controlled through a procfs interface, at /proc/sys/lustre/lbug_on_eviction. If set on the client only, the client should LBUG when it discovers that it has been evicted. If set on a server, it should LBUG when it evicts any client. |
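A minimal sketch of the requested client-side behaviour, assuming a global flag obd_lbug_on_eviction wired up to the proposed procfs file (the flag name and the hook point are illustrative, not the eventual patch):

    /* Illustrative only: the flag the procfs handler would set. */
    unsigned int obd_lbug_on_eviction;

    /* Hypothetical hook, called where the client learns it was evicted. */
    static void client_eviction_hook(struct obd_import *imp)
    {
            CERROR("%s: this client was evicted\n", obd2cli_tgt(imp->imp_obd));
            if (obd_lbug_on_eviction)
                    LBUG(); /* halt immediately to preserve in-memory state */
    }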
| Comments |
| Comment by Keith Mannthey (Inactive) [ 07/May/14 ] |
|
What use case are you trying to handle? If you LBUG on a server for any reason, you will bring down the whole FS for all clients. |
| Comment by Ryan Haasken [ 07/May/14 ] |
|
Here is a patch: http://review.whamcloud.com/#/c/10257/ |
| Comment by Robert Read (Inactive) [ 07/May/14 ] |
|
Wouldn't this allow someone to cause a server to LBUG simply by restarting a client that was holding a write lock? |
| Comment by Ann Koehler (Inactive) [ 08/May/14 ] |
Only if lbug_on_eviction was explicitly set on the server. We don't expect the flag will normally be set on servers. But we have customers who cannot tolerate any evictions. Any information we can gather that helps us eliminate the evictions is potentially useful. I imagine lbug_on_eviction would typically be enabled during system bring up as we work out the problems and disabled when the system goes into production use. |
| Comment by Oleg Drokin [ 08/May/14 ] |
|
I suspect lock dump on eviction/timeout already covers like 95% of all use cases? |
| Comment by Ann Koehler (Inactive) [ 08/May/14 ] |
Well, if it were really a LOCK dump on eviction rather than a LOG dump on eviction, it would probably cover most of the use cases. Adding a dump of the lock tables is next on our list of enhancements. Since the lbug_on_eviction feature for servers is meeting with so much resistance, how about this idea?
1. Make lbug_on_eviction available only on clients. |
| Comment by Cory Spitz [ 12/May/14 ] |
|
LBUG upon eviction is really just a big hammer. We're finding that there are lots of applications that should fail upon an eviction but don't. It's not necessarily the case that the application was badly written and doesn't check return codes, either. We're trying to catch cases where writes complete successfully to the cache but the app doesn't sync until close, and in the meantime the dirty pages dropped from the buffer cache by the eviction are silently lost. In this case, I think, the app is totally unaware that 'something bad' happened. If there is something we can do along those lines, then things like lbug_on_eviction on clients might not be needed. |
| Comment by Ryan Haasken [ 05/Jun/14 ] |
|
I will change this patch so that it only makes lbug_on_eviction available on clients. Oleg, Robert, Keith, do you think this is a better solution? |
| Comment by Ryan Haasken [ 10/Jun/14 ] |
|
I've updated the patch and removed the code which makes the server LBUG when it evicts a client. Please take a look at the patch: http://review.whamcloud.com/#/c/10257 |
| Comment by Ryan Haasken [ 02/Jul/14 ] |
|
As an alternative to my patch, there is a similar patch here: http://review.whamcloud.com/#/c/8875/ My patch creates a new interface under /proc called lbug_on_eviction, while the other patch overloads the dump_on_eviction interface by causing an LBUG when obd_dump_on_eviction == -1. I prefer my patch, but as long as it's well documented, the other patch would be fine as well. |
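For clarity, the overload described above amounts to roughly this (a sketch, assuming obd_dump_on_eviction is a signed tunable):

    /* dump_on_eviction overloaded: any non-zero value dumps the debug
     * log, and the special value -1 additionally forces an LBUG. */
    if (obd_dump_on_eviction != 0) {
            libcfs_debug_dumplog();
            if (obd_dump_on_eviction == -1)
                    LBUG();
    }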
| Comment by Cliff White (Inactive) [ 10/Jul/14 ] |
|
I've asked Liang to review the patch, hopefully that will help resolve it. |
| Comment by Bruno Faccini (Inactive) [ 02/Sep/14 ] |
|
I developed http://review.whamcloud.com/#/c/8875/ as a debug-only patch (which is why I chose to just divert dump_on_eviction's purpose a bit) to help with some tickets at the time. But further experience seems to indicate that the need for this feature is real, so it should become more formal; thus http://review.whamcloud.com/#/c/10257 now appears to be the better approach. |
| Comment by Oleg Drokin [ 15/Sep/14 ] |
|
I think you really mean that you want a panic to happen on eviction? Additionally, I'd like to draw your attention to the fact that the MDS is a CLIENT to the OSTs. That means it could be evicted by the OSTs and would then panic if this is set, which I suspect is somewhat undesirable. |
| Comment by Ryan Haasken [ 23/Sep/14 ] |
|
Thanks for the comments, Oleg. Yes, that's a good point; I think we'd rather call panic() directly and change the interface name to panic_on_eviction. Thanks for pointing out the possibility of a server panic on eviction, contrary to my commit message. I tested the two cases you mentioned with the patch as it is currently implemented.
We probably don't want this, even though it can be avoided by leaving lbug_on_eviction=0 on the servers. How would we determine whether we are running on a client or a server so we can panic only on clients? It doesn't seem like something we could do at compile time, and I'm not sure how to determine whether any Lustre targets are mounted. It doesn't look like there's anything to indicate this in the obd_import structure passed to ptlrpc_invalidate_import_thread. |
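One rough, untested idea for the client-vs-server question (not from any patch in this ticket): key off the type of the obd_device that owns the import, since on an MDS the OST imports belong to osp/lwp devices rather than the osc/mdc devices of a regular client mount:

    /* Hypothetical: treat only osc/mdc imports as true client imports,
     * so an MDS's own imports would never trigger the panic. */
    static bool imp_is_client_device(struct obd_import *imp)
    {
            const char *typ = imp->imp_obd->obd_type->typ_name;

            return strcmp(typ, LUSTRE_OSC_NAME) == 0 ||
                   strcmp(typ, LUSTRE_MDC_NAME) == 0;
    }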
| Comment by Cliff White (Inactive) [ 20/Nov/14 ] |
|
Ryan, if you are continuing work on this patch, it needs to be rebased. Can you address this? |
| Comment by James A Simmons [ 24/Jan/18 ] |
|
Now that we have sysfs support, a better approach would be to use a uevent with udev so that userland utilities can detect when a client is evicted. What I have noticed is that we have imp_state, covered by LUSTRE_IMP_*, and obd_import_event, which handles IMP_EVENT_*. Which of these state changes should be reported to userland? |
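A sketch of what the kernel side of that could look like (the kobject field and hook point are assumptions, not the actual patch): emit a change uevent carrying the new import state and the target name whenever imp_state transitions:

    static void import_state_uevent(struct obd_import *imp)
    {
            char state[64];
            char import[128];
            char *envp[] = { state, import, NULL };

            snprintf(state, sizeof(state), "STATE=%s",
                     ptlrpc_import_state_name(imp->imp_state));
            snprintf(import, sizeof(import), "IMPORT=%s",
                     obd2cli_tgt(imp->imp_obd));
            /* obd_kobj: assumed to be the owning device's sysfs kobject */
            kobject_uevent_env(&imp->imp_obd->obd_kobj, KOBJ_CHANGE, envp);
    }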
| Comment by Cliff White (Inactive) [ 09/Feb/18 ] |
|
Old issue |
| Comment by Patrick Farrell (Inactive) [ 09/Feb/18 ] |
|
Cliff, it's an old issue, but it has an approved-but-never-landed patch associated with it. I believe someone from Cray is going to update the patch shortly (like, in the next few days) and work on getting it landed. I'll assign it to myself as a placeholder. |
| Comment by James A Simmons [ 16/Feb/18 ] |
|
I have an idea to make this work with udev rules. Need |
| Comment by Gerrit Updater [ 24/Feb/18 ] |
|
James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/31407 |
| Comment by James A Simmons [ 24/Feb/18 ] |
|
So with the udev patch, I believe the best way to get the results you want is with the following udev rule (note that RUN+= does not invoke a shell, so the redirection needs an sh -c wrapper):

SUBSYSTEM=="lustre", ACTION=="change", ENV{STATE}=="RECOVER", RUN+="/bin/sh -c '/usr/sbin/lctl dk > /tmp/dump.log'"

Of course, you could create a script that does more and use RUN+="my_script" instead; basically you can do whatever you want. The uevent also carries two unique fields. One is STATE, which is what is returned by ptlrpc_import_state_name() in lustre_import.h. The second is IMPORT, which tells you what actually changed state; for example, it can be IMPORT="lustre-MDT0000_UUID". So in theory you could also filter by device, i.e. ENV{IMPORT}=="lustre-MDT*", or even by file system. This allows you to do things like report to a utility such as Nagios that a client was evicted, and to extract the file system and which device it was. Lots of possibilities here. If you want to use these fields as arguments, do something along the lines of RUN+="my_app $env{IMPORT} $env{STATE}". Hope that helps. As you can see, this gives a lot of power to the admins, since many states are handled and the import is reported back to the user. |
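Pulling those pieces together, a hypothetical complete rules file might look like this (the path and log name are illustrative):

    # /etc/udev/rules.d/99-lustre-import.rules (hypothetical)
    # Dump the Lustre debug log whenever an MDT import enters recovery.
    # RUN+= does not go through a shell, hence the explicit sh -c wrapper.
    SUBSYSTEM=="lustre", ACTION=="change", ENV{STATE}=="RECOVER", ENV{IMPORT}=="lustre-MDT*", RUN+="/bin/sh -c '/usr/sbin/lctl dk > /tmp/dump-$env{IMPORT}.log'"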
| Comment by James A Simmons [ 01/Mar/18 ] |
|
Have you played with my patch? |
| Comment by Chris Horn [ 02/Mar/18 ] |
|
James, I haven't had a chance to check it out yet, but it should probably be submitted under a different ticket, no? |
| Comment by Chris Horn [ 02/Mar/18 ] |
|
James, I read your instructions just now, and we're not looking to dump dk logs. We want the node to panic the moment it discovers it was evicted. I don't think this uevent gives us the granularity we're seeking for the panic, i.e. we can't guarantee a panic before the client tears down the import structures and we lose valuable state information. |
| Comment by James A Simmons [ 02/Mar/18 ] |
|
Doing an LBUG to force a panic is too heavy-handed. Also, servers are clients to each other, which Oleg pointed out earlier in this ticket. If you look at LBUG(), which is really just lbug_with_loc(), it will not actually panic unless you set libcfs_panic_on_lbug. When libcfs_panic_on_lbug is not set, you end up calling libcfs_debug_dumplog(). You can do the exact same thing with obd_dump_on_eviction. Okay, I see libcfs_debug_dumpstack() is missing. If we add libcfs_debug_dumpstack() to ptlrpc_invalidate_import_thread() under the obd_dump_on_eviction condition, would that work for you? As long as you dump the stack right away, the panic can be done later with the udev rule if you really want to panic the node. This places the reboot in the hands of the admin, who can choose when to push the button and under what conditions. I don't think many admins will want the node to spontaneously reboot without any say. |
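In code, the suggestion amounts to something like this fragment inside ptlrpc_invalidate_import_thread() (a sketch of the proposal, not a real patch):

    if (obd_dump_on_eviction) {
            CERROR("%s: eviction detected, dumping debug state\n",
                   obd2cli_tgt(imp->imp_obd));
            libcfs_debug_dumplog();
            /* proposed addition: capture the current stack as well;
             * NULL means dump the stack of the calling task */
            libcfs_debug_dumpstack(NULL);
    }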
| Comment by Patrick Farrell (Inactive) [ 03/Mar/18 ] |
|
James, there is a trend in this bug of disputing whether or not Cray needs or wants the debug option we're trying to get merged (we've been using it happily for years, to great benefit). I wish it would stop, so I'm going to try to be really clear about what we need and why.

We sometimes need to be able to panic nodes at the time of an eviction in order to debug evictions. Both dk logs and stack traces are insufficient, and we can't wait until later to panic the node. This option is purely for debugging and is not enabled by an admin except at the explicit request of a developer. All sorts of interesting state exists in memory, and it is not realistic to try to dump all of it in debug logs, especially since new state will be created as the code changes.

In certain situations, we need to be able to trigger a kernel panic at the time of an eviction. We cannot wait to panic at some future time or wait for admin intervention - it must be as immediate as practical, or all sorts of state will get cleared away. If the eviction is allowed to proceed to destroying the import, almost all potentially useful information is gone. If that immediate, synchronous panic (i.e. no further execution of the thread processing the eviction except to do the panic) is possible to do with a udev rule, then that's probably fine! Nothing else, and nothing short of that - giving us the ability to panic a client when an eviction occurs - will do what we need.

The conversation about admins not wanting to reboot nodes is silly - this isn't for admins, and it's not for normal run time. If it's in use, it's being used in a limited situation to get information for a developer, after the admin was specifically informed of the implications. No one will run with this enabled normally. This is no more dangerous than the existence of force_lbug or /proc/sysrq-trigger. And we know servers are sometimes clients to each other. We also know they get evicted sometimes... So if we turned this on there, it would be to help debug such an eviction. This isn't dangerous - it's intentional. Lots of things in /proc can ruin your day in equally bad or worse ways if you use them incorrectly. |
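For reference, the kind of rule Patrick describes would look roughly like this, assuming the uevent reports STATE=EVICTED (one of the ptlrpc_import_state_name() values) and that sysrq is enabled; note, though, that udev dispatch is asynchronous, which is exactly the granularity concern Chris raised above:

    # Hypothetical: crash the client as soon as an import reports eviction.
    SUBSYSTEM=="lustre", ACTION=="change", ENV{STATE}=="EVICTED", RUN+="/bin/sh -c 'echo c > /proc/sysrq-trigger'"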
| Comment by James A Simmons [ 05/Mar/18 ] |
|
Looking back on the comments, people at Intel were uneasy about this patch, and I was just trying to make it acceptable to them. I'm not personally against your patch. In any case, the udev approach is of interest to ORNL as well as Intel for monitoring purposes. I've removed my negative review and will leave it up to you to convince Oleg and Andreas to accept your work. |
| Comment by Gerrit Updater [ 01/Oct/18 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/10257/ |
| Comment by Peter Jones [ 01/Oct/18 ] |
|
Landed for 2.12 |
| Comment by Colin Faber [X] (Inactive) [ 29/May/19 ] |
|
Should this be closed? |
| Comment by Peter Jones [ 29/May/19 ] |
|
Same response as on |