We're updating the issue view to help you get more done. 

Assigned connection from the pool may change during runtime of a request

Description

In our legacy ACF code we rely on the fact that a database connection from the pool stays assigned to a request/thread for the lifetime of that request. In ACF the connection used for the first query of a specific datasource in a request can be relied upon the still be same same for any subsequent query to that datasource within the same request and you can also rely upon the fact that the connection will not have been used by any other concurrent request in the meantime, so the connection assignment is safe from any concurrency problems that may arise from the fact that a connection may actually have a certain environment state set by a query in a request.

I have attached a code example for use with a PostgreSQL datasource that demonstrates the problem in a concurrency situation:

In ACF 1000 of 1000 requests with concurrency of 40 are successful, in Lucee on my test machine, around 50-120 (averaging somewhere around 70) requests of 1000 result in an error thrown, as the result from the second query does not match the expected value.

From ACF documentation "How are database connections handled in ColdFusion?":

??If a request is using a data source connection that is already opened, and another request to the data source comes in, a new connection is established. Since only one request can use a connection at any time, the simultaneous request will open up a new connection because no idle cached connections are available. ??

A feature such as PostgreSQL's SET SESSION is fairly useless without this connection stickyness during the lifetime of a request, as one cannot depend upon the same PostgreSQL backend being used for a subsequent query within the same request and even if the same backend (i.e. connection) is being used, it may have been tampered with by a concurrent request in the meantime.

We are making use of this in our content management system (custom code, originally developed on ColdFusion 8), where we have a table versioning/auditing system in place that uses triggers to store and log any changes by DML queries on the audited tables transparently in shadow/attic tables and log timestamps an user info, which allows rolling back single edits up to full user sessions.

To achieve this, we fetch a new session id from a database sequence onSessionStart and set this session id along with the user-id in a first query to the database in the onRequestStart method using PostgreSQL's //SET SESSION myvar = someval//. In a last query at the end of the request, we unset this variables.

Any INSERT/UPDATE/DELETE on audited tables is being watched by trigger functions that read the variables set in that first query in the request and use the info for logging. This effectively prevents any changes to the tables without these variables being set in the backend and allows the developers to write their statements without any regard to this versioning/auditing system - it's fully transparent to the rest of the DML queries in the same request.

This approach fails with Lucee, unfortunately. The only workaround that assures the same connection being used for all queries to the same datasource in a sequence of queries is wrapping the whole sequence with a cftransaction.

That is however not practical in most situations, as such data manipulating queries might be in very different parts of the code. Some of that may even already use cftransaction for other purposes. Reworking all the code to pass in these variables explicitly with a separate statement preceding the actual query in the same cfquery-tag (in our use case to set the session-id and a user-id) would be extremely painful and would introduce a new possibilty for bugs through this additional requirement. The beauty of the current solution is, that the developers writing queries on those tables need not concern themselves at all with the audit/versioning system and thus cannot forget to implement it or make some sort of mistake in the implementation for a part of the code. The two-statement workaround also requires Lucee to be configured to actually allow more than one statement in a cfquery, which we have currently disabled.

So the deviation in dealing with the connections from the pool currently breaks at least one database feature in PostgreSQL and possibly other RDBMS, too, where connections actually have a specific environment. It also results in some so far unexpected caveats in DB-server side logfile analysis: With ACF one could expect the same backend-pid in PostgreSQL to be logged for subsequent queries in the same request, so one could browse/extract log entries with the backend-pid of interest from the DB server's logfile to see what was going on in the request. With Lucee, the backend PID may change within the same request and it's quite impossible to guess, which statements make up the sequence of a certain request's queries.

If the required "stickyness" of database connections to a request is not desireable for performance reasons, I'd very much welcome an option to configure a specific datasource for this stickyness behavior.

Status

Assignee

Unassigned

Reporter

Markus Wollny

Sprint

None

Fix versions

Affects versions

5.2.8.50

Priority

Blocker