Since upgrading to 18.104.22.168, my production Lucee server becomes unresponsive within about 24 hours of each restart. The Lucee process is still running, but it uses 95%+ of the CPU and all HTTP requests to the server hang. This is a serious issue that prevents some users from upgrading beyond version 22.214.171.124.
With some difficulty, I have been able to reproduce this locally in 126.96.36.199 and every later release. Version 188.8.131.52 does not have this problem, and is stable in production.
Repeatedly capturing the output of getMemoryUsage() shows the following pattern of memory usage in the problematic Lucee versions:
1. Tenured generation space increases fairly rapidly, with only small portions being reclaimed by garbage collection runs.
2. Tenured generation space becomes completely full.
3. Used Eden space starts increasing.
4. Eden space becomes completely full.
A little while after this, Lucee starts using a high percentage of CPU (presumably attempting repeated GC runs) and the server becomes unresponsive. By contrast, in version 184.108.40.206, tenured generation memory usage increases very slowly, and never maxes out even after weeks of uptime in production. See the attached charts for memory and CPU usage in the last good version, the first bad version, and the current version.
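For anyone who wants to collect comparable numbers, here is a minimal sketch of sampling per-pool memory usage through the standard java.lang.management API. It is not the code I used for the charts. The class name MemoryPoolSampler and the CSV layout are my own illustrative choices, and the sample() method would have to be called from inside the Lucee JVM (for example from a scheduled task) to say anything about Lucee. Run standalone, it only reports on its own process. Pool names vary by garbage collector (for example "Tenured Gen", "PS Old Gen" or "G1 Old Gen"), so match on whatever your JVM reports.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: name and output format are illustrative only.
final class MemoryPoolSampler {

    // Returns one CSV line per memory pool:
    // timestampMillis,poolName,usedBytes,maxBytes (maxBytes is -1 when undefined).
    static List<String> sample() {
        List<String> lines = new ArrayList<>();
        long now = System.currentTimeMillis();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(now + "," + pool.getName() + ","
                    + usage.getUsed() + "," + usage.getMax());
        }
        return lines;
    }

    // Prints a fixed number of samples, one second apart.
    // Optional first argument: number of samples (defaults to 3).
    public static void main(String[] args) throws InterruptedException {
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        for (int i = 0; i < samples; i++) {
            sample().forEach(System.out::println);
            Thread.sleep(1000L);
        }
    }
}
```

Redirecting the output to a file while a crawler exercises the site gives a log that can be charted per pool. In the pattern described above, the old/tenured pool's used value would climb toward its max and stay there.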
At this time, I unfortunately do not have a reproducible test case that I can share. Reproducing the problem involves crawling my site while capturing memory stats to a log file. Each test takes about an hour, and currently requires my application's custom code, which I am not at liberty to share. However, from the discussion boards, this seems to be a common complaint from other Lucee users, and affects applications other than mine (such as Mura CMS), so I wanted to create an authoritative place to collect information and discuss the issue.
Other reports that I suspect describe the same issue:
JAVA_OPTS="-Xms256m -Xmx512m -XX:MaxPermSize=128m"
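When comparing results across servers, it helps to confirm that the JVM actually picked up these options. The following is a small sketch using standard JDK calls; the class name HeapSettingsCheck is hypothetical, and like any such check it must run inside the JVM you want to inspect. Note that Runtime.maxMemory() can report slightly less than the -Xmx value with some collectors, so treat the figure as approximate.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: reports the effective heap ceiling and JVM flags.
final class HeapSettingsCheck {

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Arguments passed to the JVM at startup (for example -Xms, -Xmx).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        System.out.println("JVM arguments: " + jvmArguments());
    }
}
```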