Use StringBuilder.append instead of String.concat like Java to optimize Lucee CFML bytecode

Description

The Java compiler automatically compiles string concatenation into StringBuilder calls, but Lucee relies on the slower String.concat when concatenating strings with the & operator in CFML.

CFML example:
myString="test a longer string then this to see an even bigger gain in performance";
echo(myString&myString&myString&myString&myString);

Currently, the Lucee CFML bytecode concatenates each pair of operands separately and performs redundant string casts to produce the result. Each intermediate concat allocates a new string and copies the data again, and this wasted memory directly hurts performance.
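
To make the cost difference concrete, here is a minimal Java sketch of the two strategies. It is not Lucee's generated bytecode; the class and method names (ConcatCost, viaConcat, viaBuilder, charsCopiedPairwise) are made up for illustration, and the copy count is a rough model that ignores StringBuilder's own buffer growth.

```java
public class ConcatCost {

    // Pairwise strategy: every concat allocates a new String and
    // copies everything accumulated so far plus the new operand.
    static String viaConcat(String... parts) {
        String out = parts[0];
        for (int i = 1; i < parts.length; i++) {
            out = out.concat(parts[i]);
        }
        return out;
    }

    // Single-builder strategy: one buffer, one final toString().
    static String viaBuilder(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    // Rough model: characters written by the pairwise strategy when
    // joining `count` operands of `partLength` characters each.
    static long charsCopiedPairwise(int partLength, int count) {
        long copied = 0;
        long running = partLength;      // first operand is not copied
        for (int i = 1; i < count; i++) {
            running += partLength;      // length of the new intermediate
            copied += running;          // the whole intermediate is written
        }
        return copied;
    }

    public static void main(String[] args) {
        String s = "0123456789";
        System.out.println(viaConcat(s, s, s, s, s).equals(viaBuilder(s, s, s, s, s)));
        System.out.println(charsCopiedPairwise(10, 5));
    }
}
```

Under this model, joining five 10-character operands pairwise writes 140 characters across four intermediate strings, while the result itself is only 50 characters long. The gap widens as operands get longer or more numerous.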

I have rewritten lucee.transform.bytecode.op.OpString.java so that it outputs the same bytecode the Java compiler generates for the plus operator in Java. I also handled the recursion of the expressions better to reduce object creation and toString calls.

When you decompile the CFML bytecode, the new code shows the + operator instead of all the function calls, just as compiled Java code does. If you view the bytecode instructions, StringBuilder and append appear the fewest possible times, as expected. I initially built it without the recursion enhancement, and in that version the way Lucee nests the expressions caused object creation to be repeated, so the final version is considerably more efficient. It took me several hours to understand the left/right expression structure, because I couldn't use a simple array loop; the structure is more like a linked list. I could probably apply this knowledge to other optimizations later.
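
The left/right structure described above can be sketched as a small binary tree that is flattened before anything is emitted. This is only an illustration of the idea: Expr, Lit and Concat are hypothetical stand-ins, not Lucee's actual expression classes, and the sketch builds a string at runtime rather than writing bytecode.

```java
import java.util.ArrayList;
import java.util.List;

public class ConcatFlatten {

    // Hypothetical expression nodes (not Lucee's classes).
    interface Expr {}

    static final class Lit implements Expr {
        final String value;
        Lit(String value) { this.value = value; }
    }

    // a & b & c parses as Concat(Concat(a, b), c): a chain of
    // left/right pairs rather than a flat array of operands.
    static final class Concat implements Expr {
        final Expr left;
        final Expr right;
        Concat(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Walk the tree in order and collect the leaf operands.
    static void collect(Expr e, List<String> out) {
        if (e instanceof Concat) {
            Concat c = (Concat) e;
            collect(c.left, out);
            collect(c.right, out);
        } else {
            out.add(((Lit) e).value);
        }
    }

    // One StringBuilder for the whole chain, one toString() at the end.
    static String emit(Expr root) {
        List<String> operands = new ArrayList<>();
        collect(root, operands);
        StringBuilder sb = new StringBuilder();
        for (String s : operands) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Expr tree = new Concat(new Concat(new Lit("a"), new Lit("b")), new Lit("c"));
        System.out.println(emit(tree));
    }
}
```

Handling each Concat node on its own would create a new builder and call toString() at every nesting level. Collecting the operands first means those happen once per chain.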

I think there is a duplicate of all the language features in the interpreter package, which I assume exists because of the evaluate() function. I have not attempted to modify that code. It would benefit from this change as well, though it appears to need modifications to work there.

I tried to let StringBuilder do the casting, which works in general, but unfortunately it changes how doubles are rendered: 1 becomes 1.0, which breaks a lot of my CFML. So I left the casting as it was. The casting overhead is minor in comparison.
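
The double problem is easy to reproduce in plain Java. In the sketch below, rawAppend shows what StringBuilder does by default. cfmlStyle is a hypothetical caster written for this example, not Lucee's real casting code. It only illustrates why the value has to be converted to a string before it is appended.

```java
public class NumberAppend {

    // StringBuilder.append(double) uses Double.toString semantics,
    // so an integral value keeps a trailing ".0".
    static String rawAppend(double d) {
        return new StringBuilder().append(d).toString();
    }

    // Hypothetical caster (an assumption, not Lucee's implementation):
    // render integral doubles without a fractional part.
    static String cfmlStyle(double d) {
        boolean integral = d == Math.rint(d) && Math.abs(d) < 1e15;
        return integral ? Long.toString((long) d) : Double.toString(d);
    }

    public static void main(String[] args) {
        System.out.println(rawAppend(1));   // 1.0
        System.out.println(cfmlStyle(1));   // 1

        // Cast first, then append the resulting String.
        StringBuilder sb = new StringBuilder();
        sb.append("count=").append(cfmlStyle(1));
        System.out.println(sb);             // count=1
    }
}
```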

It should benchmark progressively faster as more redundant memory copying is avoided. Even a small number of concatenations is at least 2-3 times faster. Larger strings and larger numbers of concatenations on the same line benefit the most from this change.

Activity

Zac Spitzer 16 May 2024 at 09:12

still interesting, re-opening…

Zac Spitzer 16 May 2024 at 09:08

After a lot of investigation, we found this didn’t make much difference performance-wise and had downsides.

What we did find was that targeting a newer bytecode level, such as 11 or 17, made more of an impact.

closing this ticket

Zac Spitzer 20 February 2024 at 01:04

closed the original PR due to a messy rebase, still requires changes as requested on the original (marked with TODOs)

https://github.com/lucee/Lucee/pull/2315

Details

Created 30 December 2018 at 17:29
Updated 6 March 2025 at 11:42