Incorrect MIME type detection of MS Excel files

Description

When uploading a Microsoft Excel (97-2003) .xls file using the cffile tag with strict set to true and the MIME type "application/vnd.ms-excel" specified, the upload fails.

Uploading the file without MIME type detection works. However, testing the uploaded file with the FileGetMimeType() function also gives the wrong result: it returns a MIME type of "application/msword".
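
A possible explanation for the "application/msword" result, offered as an assumption rather than a confirmed diagnosis: legacy .xls and .doc files share the same OLE2 compound file signature, and .xlsx files start with the ordinary ZIP signature, so a detector that inspects only the leading bytes cannot tell the Office formats apart. The Java sketch below illustrates this. The `ContainerSniffer` class and its `sniffContainer` method are hypothetical names made up for this example; this is not Lucee's or Tika's implementation.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Lucee or Tika.
final class ContainerSniffer {
    // OLE2 compound file signature, shared by legacy .xls, .doc and .ppt files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature, shared by .xlsx, .docx and plain .zip files.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // Returns "ole2", "zip" or "unknown" based only on the leading bytes.
    static String sniffContainer(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return "ole2";
        if (startsWith(head, ZIP_MAGIC)) return "zip";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        // The first bytes of a legacy .xls file and of a .doc file are identical.
        byte[] legacyOfficeHeader = Arrays.copyOf(OLE2_MAGIC, 16);
        byte[] ooxmlHeader = Arrays.copyOf(ZIP_MAGIC, 16);
        System.out.println(sniffContainer(legacyOfficeHeader)); // ole2
        System.out.println(sniffContainer(ooxmlHeader));        // zip
    }
}
```

Telling a spreadsheet from a Word document requires looking inside the container, which header sniffing alone does not do.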

Environment

Windows 7

Activity

Igal Sapir
July 9, 2017, 8:23 PM

Tika 1.10 reports `application/octet-stream` which is too generic IMO. Tika 1.15 fixes that, and reports instead either `application/x-tika-msoffice` or `application/x-tika-ooxml`.

We are unable to upgrade to Tika 1.15 ATM as it is used by some 3rd party libraries.

Michael Offner
September 1, 2017, 1:50 PM

Was able to get everything working without Xerces or Xalan bundled, with one exception:
I still get an exception when transforming. This needs to be fixed before we can go on.

Reverting to Xalan for the moment.

Sebastiaan Naafs - van Dijk
October 26, 2017, 9:58 AM

Is there an overview of which MIME types Lucee returns using Tika? When uploading XLS or XLSX files and using FileGetMimeType(), I get application/zip or application/x-tika-ooxml for XLSX, and application/x-tika-msword for RTF documents. This is quite confusing, as I am expecting one of the "normal" MIME types: application/vnd.ms-excel or application/vnd.openxmlformats-officedocument.spreadsheetml.sheet. If Lucee returns other values, we need to extend our current MIME type checks. Currently on Lucee 5.2.4.x.
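
One way such an extended check could look is sketched below in Java, under stated assumptions. It accepts the standard spreadsheet types plus the container-level types mentioned in this thread, but only when the file extension matches. The `SpreadsheetMimeCheck` class, the `isAcceptedSpreadsheet` method and the pairing of each generic type with an extension are assumptions made for illustration, not documented Lucee behavior.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical workaround sketch; not part of Lucee or Tika.
final class SpreadsheetMimeCheck {
    static final String XLS = "application/vnd.ms-excel";
    static final String XLSX =
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";

    // Accepted detected types per extension. The generic entries are the ones
    // reported in this thread; which extension each belongs to is an assumption.
    private static final Map<String, Set<String>> ACCEPTED = Map.of(
        "xls", Set.of(XLS, "application/x-tika-msoffice", "application/msword"),
        "xlsx", Set.of(XLSX, "application/x-tika-ooxml", "application/zip"));

    // True when the detected type is plausible for the file's extension.
    static boolean isAcceptedSpreadsheet(String fileName, String detectedMimeType) {
        if (fileName == null || detectedMimeType == null) return false;
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) return false;
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Set<String> allowed = ACCEPTED.get(ext);
        return allowed != null
            && allowed.contains(detectedMimeType.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isAcceptedSpreadsheet("report.xlsx", "application/x-tika-ooxml")); // true
        System.out.println(isAcceptedSpreadsheet("report.xls", "application/zip"));           // false
    }
}
```

Accepting generic container types weakens the check, since a Word document renamed to .xls would also pass, so this is at best a stopgap until detection returns the specific types again.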

J R
November 2, 2017, 4:58 PM
Edited

I've found a similar issue and reported it as https://luceeserver.atlassian.net/browse/LDEV-1549

From my testing, the misidentification of MS Office files was introduced in 5.2.2.71.

I believe both 5.1.x and the latest rely on Tika 1.10?

Michael Offner
February 1, 2018, 1:33 PM
Fixed

Assignee

Michael Offner

Reporter

Former user

Priority

Critical

Sprint

None
