Use StringBuilder.append instead of String.concat, as Java does, to optimize Lucee CFML bytecode

Description

The Java compiler automatically compiles string concatenation to StringBuilder calls, but Lucee relies on the slower String.concat when strings are concatenated with the & operator in CFML.

CFML example:
myString="test a longer string then this to see an even bigger gain in performance";
echo(myString&myString&myString&myString&myString);

Currently, the Lucee CFML bytecode has to concatenate each pair of operands separately and performs redundant string casts to produce the result. Every intermediate result is a new string that copies the data again, and this duplicated memory directly hurts performance in proportion to the amount of memory wasted.
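To make the difference concrete, here is a minimal Java sketch of the two strategies for the CFML example above. It is only an illustration: the class and method names are hypothetical, and the bytecode Lucee actually emits also includes the string casts mentioned above.

```java
public class ConcatSketch {

    // Pairwise strategy: each concat() allocates a new String and copies
    // everything accumulated so far plus the next operand.
    static String pairwise(String s) {
        return s.concat(s).concat(s).concat(s).concat(s);
    }

    // Single-builder strategy: all operands are appended to one buffer,
    // and one String is created by the final toString().
    static String withBuilder(String s) {
        return new StringBuilder()
                .append(s).append(s).append(s).append(s).append(s)
                .toString();
    }

    public static void main(String[] args) {
        String myString = "test a longer string then this to see an even bigger gain in performance";
        // Both strategies must produce the same text; only the allocation pattern differs.
        System.out.println(pairwise(myString).equals(withBuilder(myString)));
    }
}
```

With five operands the pairwise version creates three intermediate strings that are thrown away immediately, while the builder version creates none.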

I have rewritten lucee.transform.bytecode.op.OpString.java so that it outputs the same bytecode the Java compiler generates for the plus operator in Java. I also improved how the nested expressions are handled recursively, to reduce object creation and toString calls.

When you decompile the CFML bytecode, the new code shows the + operator instead of a chain of function calls, just as decompiled Java code does. If you view the bytecode instructions, they show StringBuilder and append used the fewest possible times, as expected. I initially built it without the recursion enhancement, and the way Lucee was set up caused object creation to be repeated, so I want to point out that the final version is very efficient. It took me several hours to understand the left/right expression structure, because I couldn't use a simple array loop; the structure is more like a linked list. I could probably apply this knowledge to other optimizations later.
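The left/right structure described here can be sketched in Java as follows. This is not the OpString code: the node classes are hypothetical stand-ins for the compiler's expression objects, and the sketch evaluates the tree at runtime instead of emitting bytecode. It only shows how one builder can be passed down through nested concatenations so that no new builder is created per level.

```java
public class FlattenSketch {

    // Hypothetical expression nodes: a leaf holding a value, and a
    // concatenation with a left and a right operand.
    interface Expr {}

    static final class Lit implements Expr {
        final String value;
        Lit(String value) { this.value = value; }
    }

    static final class Concat implements Expr {
        final Expr left;
        final Expr right;
        Concat(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Walk the tree left to right and append every leaf to the SAME builder.
    static void appendAll(Expr e, StringBuilder sb) {
        if (e instanceof Concat) {
            Concat c = (Concat) e;
            appendAll(c.left, sb);
            appendAll(c.right, sb);
        } else {
            sb.append(((Lit) e).value);
        }
    }

    // One builder and one toString() for the whole expression.
    static String evaluate(Expr e) {
        StringBuilder sb = new StringBuilder();
        appendAll(e, sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // a & b & c, nested to the left as ((a & b) & c)
        Expr expr = new Concat(new Concat(new Lit("a"), new Lit("b")), new Lit("c"));
        System.out.println(evaluate(expr)); // prints "abc"
    }
}
```

A naive version would build and convert a separate builder for each Concat node, which is the repeated object creation that the recursion enhancement avoids.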

I think all the language features are duplicated in the interpreter package. I assume that is because of the evaluate() function. I have not attempted to modify that code. It would benefit from this change as well, though the change appears to need modifications to work there.

I tried letting StringBuilder do the casting, which works in general. Unfortunately, it changes how doubles are rendered: 1 becomes 1.0, which breaks a lot of my CFML, so I left the casting as it was. The casting overhead is minor in comparison.
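The double problem is easy to reproduce in plain Java. In the sketch below, castToString is a hypothetical helper written for this example, not Lucee's actual cast routine. It only shows why a separate cast step is still needed before append.

```java
public class DoubleCastSketch {

    // Hypothetical cast: render whole-number doubles without a fraction,
    // and fall back to Java's default rendering for everything else.
    static String castToString(double d) {
        if (d == Math.rint(d) && Math.abs(d) < 1e15) {
            return Long.toString((long) d);
        }
        return Double.toString(d);
    }

    public static void main(String[] args) {
        double one = 1;

        // StringBuilder.append(double) uses Java's default rendering.
        System.out.println(new StringBuilder().append(one).toString()); // prints "1.0"

        // Casting first keeps the text that existing CFML code expects.
        System.out.println(new StringBuilder().append(castToString(one)).toString()); // prints "1"
    }
}
```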

It should benchmark progressively faster as the amount of memory that no longer has to be accessed grows. Even a small number of concatenations is at least 2-3 times faster. Larger strings and a larger number of concatenations on the same line benefit the most from this change.

Activity

Bruce Kirkpatrick
December 30, 2018, 5:31 PM
Glen Fordham
December 23, 2019, 1:18 PM

We've changed a small number of our string concatenations within cfloops to use java.lang.StringBuilder explicitly because of the monumental performance improvement it gave. Would be amazing if we could avoid making these changes across our application.

Assignee

Unassigned

Reporter

Bruce Kirkpatrick

Labels

None

Affects versions

Priority

New