Lucee engine reset() kills current thread (regression)

Description

Ticket has had a negative effect on the FusionReactor line performance monitor, which TestBox 3.0 uses for code coverage reporting. Specifically, this line seems to kill the currently executing thread:

https://github.com/lucee/Lucee/commit/81f5520f20deeddbc25f811a571a60983c5cf684#diff-2bd4a0f2e5824a28a0ea12b1d5ff663dR41

When FusionReactor enables the line performance monitor, it issues a reset() of the Lucee engine. I believe this is to clear out any cached class files so it can instrument them when they compile again. The problem is that Lucee kills the request dead when FR issues the reset() call, so the request returns a 500 error.
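To illustrate the failure mode, here is a minimal Java sketch. It is not Lucee's code: the class `ResetSketch` and its methods are invented for illustration, and `Thread.interrupt()` stands in for whatever mechanism the engine really uses to stop requests. It only shows the general idea of a reset that stops in-flight request threads while sparing the thread that asked for the reset.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an engine that tracks its request threads.
final class ResetSketch {
    private final Set<Thread> requestThreads = ConcurrentHashMap.newKeySet();

    // Called when a request starts running on a thread.
    void register(Thread t) {
        requestThreads.add(t);
    }

    // Interrupts every registered thread except the caller.
    // Returns how many threads were interrupted.
    int reset() {
        Thread self = Thread.currentThread();
        int interrupted = 0;
        for (Thread t : requestThreads) {
            if (t == self) {
                continue; // assumption: the thread that requested the reset must survive it
            }
            t.interrupt();
            interrupted++;
        }
        return interrupted;
    }
}
```

Without the `t == self` check, a request that triggers the reset would be stopped along with all the others, which matches the symptom reported here.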

This causes an error similar to this in the logs:

This can be seen with great consistency like so:

Environment

None

Activity

Brad Wood
May 11, 2020, 4:30 PM

It's taken engineering months from the FR team to get to this point, but it DOES NOT WORK.

The last word on that effort was that it did not work due to bugs in FusionReactor, as confirmed by Mikey. The FR bugs preventing it from working were documented, and I provided repro cases to Mikey and John H on the FR team. Mikey told me the ticket had been updated and was awaiting a sprint to be addressed. Presumably once FusionReactor completes development on it, it will work. Also, development was not ongoing for months. There were months between spurts of development that usually lasted for a few days. I tried to work directly with the last developer on this (John H), but he didn't reply to my last messages until a broken version of the feature had already been shipped in 8.3.

The on-disk cache isn't reset, or the classes aren't reloaded reliably from the source

The on-disk cache is reset and the classes are reloaded, just not on the first request. Mikey confirmed this to be a bug in FR that was supposed to be addressed in an upcoming sprint on your end.

The FR team is unwilling to randomly start deleting users class files off disk, by default

This is just silly. Who said anything about "random" deletion? The folder that needs cleared is the cfclasses folder, which has been a folder in every CF engine I've known since they ran on Java, and it's always been OK to clear this folder. Your current solution resets the entire CF engine, causes tons of errors, and renders the engine unable to even complete the current request. I can't believe you're suggesting that clearing the classes from disk is more dangerous than resetting the entire CF engine mid-request.

You can set
-Dfr.lucee.lineperf.classcache.ppc=true

This is the feature that has bugs on the FR side that are supposed to be scheduled for an upcoming sprint. Mikey and I worked out these bugs in person in Atlanta at the Devnexus conference in Feb.

Zac Spitzer
May 17, 2020, 3:14 PM

2020-05-15 16:43:50.852 reset-engine java.lang.NullPointerException
at lucee.runtime.engine.CFMLEngineImpl.reset(CFMLEngineImpl.java:1260)
at lucee.runtime.engine.CFMLEngineImpl.reset(CFMLEngineImpl.java:1248)
at lucee.loader.engine.CFMLEngineWrapper.reset(CFMLEngineWrapper.java:147)
at lucee.loader.engine.CFMLEngineFactory._restart(CFMLEngineFactory.java:603)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at lucee.runtime.config.XMLConfigAdmin.restart(XMLConfigAdmin.java:3922)
at lucee.runtime.config.XMLConfigAdmin.updateCore(XMLConfigAdmin.java:4575)
at lucee.runtime.config.DeployHandler.deploy(DeployHandler.java:91)
at lucee.runtime.engine.CFMLEngineImpl.<init>(CFMLEngineImpl.java:395)
at lucee.runtime.engine.CFMLEngineImpl.getInstance(CFMLEngineImpl.java:702)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at lucee.loader.engine.CFMLEngineFactory.getEngine(CFMLEngineFactory.java:1430)
at lucee.loader.engine.CFMLEngineFactory.initEngine(CFMLEngineFactory.java:365)
at lucee.loader.engine.CFMLEngineFactory.initEngineIfNecessary(CFMLEngineFactory.java:261)
at lucee.loader.engine.CFMLEngineFactory.getInstance(CFMLEngineFactory.java:167)
at lucee.loader.engine.CFMLEngineFactory.getInstance(CFMLEngineFactory.java:201)
at lucee.loader.servlet.CFMLServlet.init(CFMLServlet.java:42)
at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1144)
at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1091)
at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:985)
at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4885)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5199)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:743)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:719)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:705)
at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1125)
at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1859)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)

Neil Wightman
May 18, 2020, 7:52 AM

FYI: FusionReactor has made changes to use pagePoolClear (by default), which works on the newest versions of Lucee. We want to release this fix ASAP.
We still provide a -D option to use restart() on older versions of Lucee, in case there are unforeseen issues with them.
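As a rough illustration of how a -D switch like this is usually wired up in Java, the sketch below picks between the two strategies based on a JVM system property. The property name `example.lineperf.useRestart` and the `ClearStrategy` class are hypothetical; the comment above does not name the real option, and this is not FusionReactor's code.

```java
// Hypothetical selector between the two cache-clearing strategies.
final class ClearStrategy {
    enum Mode { PAGE_POOL_CLEAR, RESTART }

    // Hypothetical property name, set with -Dexample.lineperf.useRestart=true
    static final String USE_RESTART_PROPERTY = "example.lineperf.useRestart";

    // Defaults to PAGE_POOL_CLEAR; falls back to RESTART only when the
    // system property is present and equals "true" (case-insensitive).
    static Mode select() {
        return Boolean.getBoolean(USE_RESTART_PROPERTY)
                ? Mode.RESTART
                : Mode.PAGE_POOL_CLEAR;
    }
}
```

`Boolean.getBoolean` returns false when the property is absent, so the newer behaviour stays the default unless the flag is passed explicitly.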

Zac Spitzer
May 18, 2020, 9:35 AM

What about the problem with clearing the cfclasses folder? Do you need a new function, or are you happy to just manually purge the folder yourself?

Neil Wightman
May 18, 2020, 10:51 AM

We call remove() on the classes directory via pagecontext.getConfig().getClassDirectory().remove(true).
I don't like that FR is deleting these files (and that it will remove files outside of this directory if any symbolic links exist), but it does work at the moment. I have no idea what concurrency issues could occur if we call remove(true) while a class is being compiled on another thread, though. I have added a -D option to disable this remove() call in case we find a customer who has problems with it.
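On the symbolic link concern, one possible approach is sketched below using plain java.nio rather than Lucee's resource API. The class name `ClassDirCleaner` is hypothetical. The sketch relies on `Files.walkFileTree` not following symbolic links unless asked to, so a link inside the directory is unlinked and its target is left alone. It does not address the concurrency question about classes being compiled on another thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper: empties a directory without following symlinks.
final class ClassDirCleaner {

    // Deletes everything under root but keeps root itself.
    // Returns the number of entries (files, links, directories) removed.
    static int clear(Path root) {
        if (!Files.isDirectory(root, LinkOption.NOFOLLOW_LINKS)) {
            throw new IllegalArgumentException("not a real directory: " + root);
        }
        int[] deleted = {0};
        try {
            // No FOLLOW_LINKS option is passed, so symlinks are reported to
            // visitFile and removed as links; their targets are never entered.
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.delete(file);
                    deleted[0]++;
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                        throws IOException {
                    if (exc != null) {
                        throw exc;
                    }
                    if (!dir.equals(root)) {
                        Files.delete(dir); // directory is empty by now
                        deleted[0]++;
                    }
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted[0];
    }
}
```

Calling `ClassDirCleaner.clear(dir)` on a directory that contains a link to some other folder removes the link entry only; files in the linked folder stay in place.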

Fixed

Assignee

Unassigned

Reporter

Brad Wood

Priority

Trivial
