Lucee 5 is recording the start time of CFML requests too early, which prevents the CFML setting RequestTimeout from working.

Description

Fixing the issue in LDEV-2087, has revealed a related issue that may explain a variety of mysterious timeout errors that Lucee can have in a real application with slow network conditions.

I theorize that Lucee engine starts measuring the start time of a request thread before the CFML has even had a chance to be executed, which is too early. This seems to be the behavior, I haven't been able to isolate if this is true or not yet.

However, the correct time to start measuring the thread is immediately before the line of code that executes the CFML (whether that is application.cfc or a cfm or cfc doesn't matter). Without this correction, the entire application would have to have a higher requestTimeout setting to accommodate the worst case scenario, instead of giving the CFML developer accurate control.

This flaw expresses itself most easily when you do a large upload that takes longer the requestTimeout setting in the Lucee Web Admin to upload. I say Web Admin, because application.cfc settings can't be applied yet in this situation. While Lucee works hard to collect all the client input from the socket for minutes, the CFML has not had a chance to run yet. As a result, once CFML code executes, the full timeout has already been reached. This means you get an impossible to fix error in Application.cfc or other random places of your code that make no sense.

I'm attempting to solve this problem myself, and I will report a fix once I have it.

Environment

None

Activity

Show:
Bruce Kirkpatrick
November 27, 2018, 12:46 AM

I'm sorry, but I forgot to include the most important line of code (setting startTime again after the multipart parsing completes) when I was rewriting this and ended up making it have the bug again by mistake. Both commits are needed to fix it. My clients were complaining about the timeout still happening, so I had to fix it.

https://github.com/brucekirkpatrick/Lucee/commit/412eff86dd519be9aa029526cbe179824c65cdd4
https://github.com/brucekirkpatrick/Lucee/commit/df5d8dbff4c7e83d405b6cdda9c5671c6f95f94a

Bruce Kirkpatrick
December 3, 2018, 10:55 PM

I believe this remainingTime function has to verify that the pageContext allows a timeout, since it was still killing our uploads. I'm not sure what would cause it to only sometimes reach this code, and not others.

Additional commit:
https://github.com/brucekirkpatrick/Lucee/commit/d23e4e44cc61632d6c35b11d8a460abc1aece167

Zac Spitzer
December 3, 2018, 11:02 PM

Is there a way to get the remaining time before the current request will time out?

Bruce Kirkpatrick
December 3, 2018, 11:17 PM

Zac,

Not sure if it works, but it would be something like this to do that from CFML and I'd not recommend it.
c=createObject("java", "lucee.runtime.util.PageContextUtil");
c.checkRemainingTime();

Keep in mind, that CFML isn't even being allowed to run 1 line of CFML code on this specific bug before it fails, so there is nothing at CFML level to try.

Bruce Kirkpatrick
December 4, 2018, 4:27 AM

Apparently, FormImpl initialize() and other PageContext functions can be called multiple times by the internal JSP stuff with different values as it parses the multipart post requests. This was very confusing to debug and see, since I had expected them to only run once and the structure isn't setup to tell me exactly when all this work is done. Oddly, it tells me there is both application/x-www-form-urlencoded and multipart/form-data in the same request when I wrote custom logging to see what it was doing. It seems odd/buggy, since that's not what a browser sends.

It seems like there should be an event for when all headers & request body are done, but I don't see where you've implemented that. This is why my original fix wasn't enough. I wish I could find this place and make it a single clean call instead.

I figured out a way with more then one call to the function to get it working. It's taken me so many hours on this problem. I'll a little worried that the repeat calls to various functions in pagecontext might cause other problems or at least waste some resources.

I also think that the 2 variables might be able to have a random visibility problem, so I made them volatile to guarantee the monitoring thread will be reading the newest value from main memory.

https://github.com/brucekirkpatrick/Lucee/commit/6f0ba40d252bc95d9e9142dd85c7ea6dbf38f9b3

Assignee

Michael Offner

Reporter

Bruce Kirkpatrick

Priority

Major

Labels

None

Fix versions

None

Affects versions

Configure