cffile - mimetype of MS Office files incorrect in 5.2.2.71+

Description

The behaviour of <cffile action="upload" ...> appears to have changed in 5.2.2.71, which results in the contentsubtype being reported incorrectly for all MS Office documents and potentially other file types.

The following are the results of some tests done with two MS Word 2016 docx files, one with content and the other without. A simple reproduction test is attached.

On 5.2.1.9

  • Uploading a Word 2016 file with content:
    contentsubtype = vnd.openxmlformats-officedocument.wordprocessingml.document

  • Uploading a Word 2016 file without content:
    contentsubtype = vnd.openxmlformats-officedocument.wordprocessingml.document

On 5.2.2.71 and up

  • Uploading a Word 2016 file with content:
    contentsubtype = x-tika-ooxml

  • Uploading a Word 2016 file without content:
    contentsubtype = octet-stream

Replication:

Isolated the issue to a simple form post with a cfml page to handle the form submit.

Test files (can't attach an empty docx):

Environment

Windows Server 2012
JDK 8

Activity

J R
March 2, 2018, 7:41 PM

Tested on 5.2.7.7 and the issue no longer occurs on a file with content.

However, on an empty docx with no content, the contentsubtype still returns octet-stream. Prior to the breaking issue in 5.2.2.54, it used to return the same value as a file with content.

Patrick Quinn
March 15, 2018, 3:14 PM

Can you please comment on the latest ticket update?

Pothys - MitrahSoft
March 20, 2018, 6:11 AM

Hi,

I've checked in the latest version of Lucee. A docx without content returns contentsubtype as "vnd.openxmlformats-officedocument.wordprocessingml.document", the same as one with content. I've added the empty docx file here.

Result
TestWithoutContent
TestWithContent

Can you please check once again whether it can be reproduced? Please attach your empty docx here.

J R
March 21, 2018, 8:45 PM
Edited

On 5.2.7.49-SNAPSHOT I can still replicate the issue with a "true" empty file.

Just to confirm, by empty file I mean an actual empty file container (right-clicked on my Desktop -> New -> Microsoft Word Document), which results in a 0KB file with a .docx extension.

I can see your file is about 10.4KB and has all the required headers, so I think you might be creating it by opening Word and saving an empty document, which creates the headers.

Jira won't let me attach a 0KB file here, but I have shared mine on Dropbox here

If you try with that file you get octet-stream on 5.2.7.49-SNAPSHOT as seen below

On 5.2.2.48-SNAPSHOT you get the expected vnd.openxmlformats-officedocument.wordprocessingml.document

As it turns out, the issue with empty files is not related to MS Office files only. Further testing with different files created the same way always results in octet-stream as contentsubtype on the latest snapshot.

With an empty.xlsx
5.2.2.48 -> vnd.openxmlformats-officedocument.spreadsheetml.sheet
5.2.7.49 -> octet-stream

With an empty.zip
5.2.2.48 -> x-zip-compressed
5.2.7.49 -> octet-stream

With an empty.txt
5.2.2.48 -> text
5.2.7.49 -> octet-stream
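
One plausible reading of these results is that a detector which inspects only the file content has nothing to look at in a 0-byte file, so it reports a generic type. The Java sketch below illustrates that idea, together with a file-extension fallback that would restore the older behaviour. It is only an illustration under stated assumptions, not Lucee's or Tika's actual code: the class name, the method names, and the small extension table are all hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: content sniffing with an extension fallback.
final class ContentTypeSketch {
    private static final String GENERIC = "application/octet-stream";
    private static final String ZIP = "application/zip";
    private static final String OOXML_PREFIX = "application/vnd.openxmlformats-officedocument.";

    // Hypothetical extension table, limited to the file types from this ticket.
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("docx", OOXML_PREFIX + "wordprocessingml.document");
        BY_EXTENSION.put("xlsx", OOXML_PREFIX + "spreadsheetml.sheet");
        BY_EXTENSION.put("zip", ZIP);
        BY_EXTENSION.put("txt", "text/plain");
    }

    // ZIP local file header signature: 'P' 'K' 0x03 0x04.
    static boolean looksLikeZip(byte[] data) {
        return data.length >= 4 && data[0] == 'P' && data[1] == 'K'
                && data[2] == 3 && data[3] == 4;
    }

    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Content-only detection: an empty file can never match a signature.
    static String detectByContentOnly(byte[] data) {
        return looksLikeZip(data) ? ZIP : GENERIC;
    }

    // Content first; consult the extension when the content is inconclusive,
    // or when it is a generic ZIP and the extension names an OOXML type.
    static String detectWithFallback(byte[] data, String fileName) {
        String byContent = detectByContentOnly(data);
        String byName = BY_EXTENSION.get(extensionOf(fileName));
        if (byName == null) {
            return byContent;
        }
        if (GENERIC.equals(byContent)) {
            return byName;
        }
        if (ZIP.equals(byContent) && byName.startsWith(OOXML_PREFIX)) {
            return byName;
        }
        return byContent;
    }

    public static void main(String[] args) {
        byte[] empty = new byte[0];
        System.out.println(detectByContentOnly(empty));
        System.out.println(detectWithFallback(empty, "empty.docx"));
    }
}
```

In this sketch a 0-byte empty.docx is generic under content-only detection but resolves to the Word type once the extension is consulted. A file with an unrecognised extension still gets whatever the content alone suggests.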

Zac Spitzer
July 17, 2019, 4:49 PM

With 5.3.3.60-RC, if you take an .xlsx and rename it to .xlsxxxxx, it’s not detected as an .xlsx.

Fixed

Assignee

Michael Offner

Reporter

J R

Priority

Blocker

Labels

Fix versions

Sprint

None

Affects versions
