Assigned connection from the pool may change during runtime of a request

Description

In our legacy ACF code we rely on the fact that a database connection from the pool stays assigned to a request/thread for the lifetime of that request. In ACF the connection used for the first query of a specific datasource in a request can be relied upon the still be same same for any subsequent query to that datasource within the same request and you can also rely upon the fact that the connection will not have been used by any other concurrent request in the meantime, so the connection assignment is safe from any concurrency problems that may arise from the fact that a connection may actually have a certain environment state set by a query in a request.

I have attached a code example for use with a PostgreSQL datasource that demonstrates the problem in a concurrency situation:

In ACF 1000 of 1000 requests with concurrency of 40 are successful, in Lucee on my test machine, around 50-120 (averaging somewhere around 70) requests of 1000 result in an error thrown, as the result from the second query does not match the expected value.

From ACF documentation "How are database connections handled in ColdFusion?":

??If a request is using a data source connection that is already opened, and another request to the data source comes in, a new connection is established. Since only one request can use a connection at any time, the simultaneous request will open up a new connection because no idle cached connections are available. ??

A feature such as PostgreSQL's SET SESSION is fairly useless without this connection stickyness during the lifetime of a request, as one cannot depend upon the same PostgreSQL backend being used for a subsequent query within the same request and even if the same backend (i.e. connection) is being used, it may have been tampered with by a concurrent request in the meantime.

We are making use of this in our content management system (custom code, originally developed on ColdFusion 8), where we have a table versioning/auditing system in place that uses triggers to store and log any changes by DML queries on the audited tables transparently in shadow/attic tables and log timestamps an user info, which allows rolling back single edits up to full user sessions.

To achieve this, we fetch a new session id from a database sequence onSessionStart and set this session id along with the user-id in a first query to the database in the onRequestStart method using PostgreSQL's //SET SESSION myvar = someval//. In a last query at the end of the request, we unset this variables.

Any INSERT/UPDATE/DELETE on audited tables is being watched by trigger functions that read the variables set in that first query in the request and use the info for logging. This effectively prevents any changes to the tables without these variables being set in the backend and allows the developers to write their statements without any regard to this versioning/auditing system - it's fully transparent to the rest of the DML queries in the same request.

This approach fails with Lucee, unfortunately. The only workaround that assures the same connection being used for all queries to the same datasource in a sequence of queries is wrapping the whole sequence with a cftransaction.

That is however not practical in most situations, as such data manipulating queries might be in very different parts of the code. Some of that may even already use cftransaction for other purposes. Reworking all the code to pass in these variables explicitly with a separate statement preceding the actual query in the same cfquery-tag (in our use case to set the session-id and a user-id) would be extremely painful and would introduce a new possibilty for bugs through this additional requirement. The beauty of the current solution is, that the developers writing queries on those tables need not concern themselves at all with the audit/versioning system and thus cannot forget to implement it or make some sort of mistake in the implementation for a part of the code. The two-statement workaround also requires Lucee to be configured to actually allow more than one statement in a cfquery, which we have currently disabled.

So the deviation in dealing with the connections from the pool currently breaks at least one database feature in PostgreSQL and possibly other RDBMS, too, where connections actually have a specific environment. It also results in some so far unexpected caveats in DB-server side logfile analysis: With ACF one could expect the same backend-pid in PostgreSQL to be logged for subsequent queries in the same request, so one could browse/extract log entries with the backend-pid of interest from the DB server's logfile to see what was going on in the request. With Lucee, the backend PID may change within the same request and it's quite impossible to guess, which statements make up the sequence of a certain request's queries.

If the required "stickyness" of database connections to a request is not desireable for performance reasons, I'd very much welcome an option to configure a specific datasource for this stickyness behavior.

Attachments

22 Feb 2019, 07:36 am
21 Feb 2019, 04:11 pm
18 Jul 2018, 06:37 am

Linked work items

relates to

LDEV-2604

BUG Exclusive connections for request. Persistent MSSQL Isolation Level.

Activity

Michael Offner
28 March 2019 at 14:27

https://github.com/lucee/Lucee/commit/9425c29886daf474587de3bf95544c1fd63f5b5b#diff-e72ce7dee22585b1b5b14f16f488ee47

Markus Wollny
22 February 2019 at 08:00

You are absolutely right, a per-request option is definitely not worth anything that would make handling this overly complicated - we will probably change the datasource references in our code to use something like request.strDS variable, then set up two datasources as you suggested and will simply use a little logic in onRequestStart to decide which datasource gets assigned to the current request. I am very much excited about this feature in any case

Michael Offner
22 February 2019 at 07:41
(edited)

@Zac Spitzer@Markus Wollny it was important for me to enable it on datasource level, this is the place it needs to be in my opinion. have a flag that allows it to change the behaviour within a request, make it much more complicated to handle with no real benefit in my opinion.

How to solve this
Simply use 2 datasources definitions, for example like this:

this.datasources["susi"] = {
	  class: 'com.mysql.cj.jdbc.Driver'
	, bundleName: 'com.mysql.cj'
	, bundleVersion: '8.0.15'
	, connectionString: 'jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8&serverTimezone=Europe/Zurich&useLegacyDatetimeCode=true'
	, username: 'root'
	, password: "..."
	
	// optional settings
	, connectionLimit:100 // default:-1
	, alwaysSetTimeout:true // default: false
	, validate:false // default: false
	, requestExclusive:false
};

this.datasources["susi_excl"] = {
	  class: 'com.mysql.cj.jdbc.Driver'
	, bundleName: 'com.mysql.cj'
	, bundleVersion: '8.0.15'
	, connectionString: 'jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8&serverTimezone=Europe/Zurich&useLegacyDatetimeCode=true'
	, username: 'root'
	, password: "..."
	
	// optional settings
	, connectionLimit:100 // default:-1
	, alwaysSetTimeout:true // default: false
	, validate:false // default: false
	, requestExclusive:true
};

so you can switch in your code between that option enabled/disabled by simply change the datasource used, you can also change the default datasource on the fly with help of <cfapplication action="update"> everywhere in your code.

Markus Wollny
22 February 2019 at 07:21

If it where possible to enable this at runtime, i.e. in onRequestStart for example, that would be terrific - this would enable us to use this for logged in users in one specific app only where this is actually needed, and keep everything else more scalable with the current behaviour still in effect. If that's not feasible, I'm absolutely happy with the admin/application.cfc-datasource approach of course.

Zac Spitzer
21 February 2019 at 19:34
(edited)

um, which option in the application.cfc example enables this option? looking at the commit I think it's requestExclusive:true?

is it possible to have this as a runtime option we can enable for the current request, i.e only enable if required?

Resize issue view side panel

Fixed

Details

Assignee

Unassigned

Reporter

Markus Wollny

Labels

concurrencydatasource

New Issue warning screen

Before you create a new Issue, please post to the mailing list first https://dev.lucee.org

Once the issue has been verified, one of the Lucee team will ask you to file an issue

Sprint

None

Fix versions

5.3.2.47

Affects versions

5.2.8.50

Priority

Blocker

Created 18 July 2018 at 07:21

Updated 11 December 2019 at 14:55

Resolved 28 March 2019 at 14:27

Assigned connection from the pool may change during runtime of a request

Description

Attachments

Linked work items

relates to

Activity

Michael Offner
28 March 2019 at 14:27

Markus Wollny
22 February 2019 at 08:00

Michael Offner
22 February 2019 at 07:41
(edited)

Markus Wollny
22 February 2019 at 07:21

Zac Spitzer
21 February 2019 at 19:34
(edited)

Details

Assignee

Reporter

Labels

New Issue warning screen

Sprint

Fix versions

Affects versions

Priority

Flag notifications

Something's gone wrong

Something's gone wrong

Assigned connection from the pool may change during runtime of a request

Description

Attachments

Linked work items

relates to

Activity

Michael Offner 28 March 2019 at 14:27

Markus Wollny 22 February 2019 at 08:00

Michael Offner 22 February 2019 at 07:41(edited)

Markus Wollny 22 February 2019 at 07:21

Zac Spitzer 21 February 2019 at 19:34(edited)

Details

Assignee

Reporter

Labels

New Issue warning screen

Sprint

Fix versions

Affects versions

Priority

Flag notifications

Something's gone wrong

Something's gone wrong

Michael Offner
28 March 2019 at 14:27

Markus Wollny
22 February 2019 at 08:00

Michael Offner
22 February 2019 at 07:41
(edited)

Markus Wollny
22 February 2019 at 07:21

Zac Spitzer
21 February 2019 at 19:34
(edited)