cfdocument ignores non-ascii characters
Description
Environment
Windows 7 64-bit (English and Russian versions tested)
JRE 1.8.0_221
Attachments
- 12 Oct 2019, 01:45 pm
- 12 Oct 2019, 01:38 pm
- 12 Oct 2019, 01:36 pm
Activity
Pothys - MitrahSoft 9 February 2022 at 12:09 (edited)
The PR ( https://github.com/lucee/extension-pdf/pull/33 ) for the ticket https://luceeserver.atlassian.net/browse/LDEV-3836 also fixes this ticket. Note, however, that the modern engine renders the fonts only when both the fontDirectory attribute and a font-family are specified.
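A minimal sketch of what that combination might look like, reusing the reproduction case from this ticket. The directory path `C:/Windows/Fonts` is an assumption for illustration only, and the template has not been run against the patched extension.

```cfml
<cfprocessingdirective pageencoding="utf-8">
<!--- The fontDirectory path is hypothetical; point it at a folder that holds the font files. --->
<cfdocument format="pdf" orientation="portrait" pagetype="a4" fontDirectory="C:/Windows/Fonts">
    <!--- font-family names a font whose file is expected to be in fontDirectory. --->
    <div style="font-family:Calibri; font-size:18pt;">тест: Calibri</div>
</cfdocument>
```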
Derek Bredensteiner 18 February 2021 at 18:06
I think it corresponds to something iText is calling “Unicode Horizontal”.
I’m pretty far outside of things I have experience with. From what I can see, Flying Saucer maybe at this point hands off some aspect of font parsing to something called iText:
http://flyingsaucerproject.github.io/flyingsaucer/r8/guide/users-guide-R8.html#xil_33
And then this `Identity-H`, I think, is being mapped to iText’s `BaseFont.IDENTITY_H`:
https://api.itextpdf.com/iText5/java/5.5.9/com/itextpdf/text/pdf/BaseFont.html#IDENTITY_H
Brad Wood 18 February 2021 at 17:47
Interesting information. I'll have to test this. Can you explain what `Identity-H` means, though?
Derek Bredensteiner 18 February 2021 at 16:59
I encountered the same problem and found a workaround for my case: a `-fs-pdf-font-encoding` hint to the Flying Saucer (`-fs-`) renderer.
This required removing the font directory specification from the `<cfdocument>` tag (or pointing it at any wrong directory), so that Flying Saucer would only do its internal `addFont` from my `@font-face` rule with the `-fs-pdf-font-encoding` hint.
@font-face {
font-family: "Roboto Condensed";
src: url("file:///C:/Windows/Fonts/RobotoCondensed-Regular.ttf");
font-weight: normal;
font-style: normal;
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: Identity-H;
}
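For context, here is a rough sketch of how the rule above might sit inside a complete template. It assumes the font file exists at the path shown in the comment and that no fontDirectory attribute is set. The surrounding HTML skeleton and the `body` rule are illustrative additions, not part of the original workaround, and the template is untested.

```cfml
<cfprocessingdirective pageencoding="utf-8">
<!--- No fontDirectory attribute: the workaround relies on Flying Saucer
      registering the font only from the @font-face rule below. --->
<cfdocument format="pdf" orientation="portrait" pagetype="a4">
<html>
<head>
<style>
@font-face {
    font-family: "Roboto Condensed";
    src: url("file:///C:/Windows/Fonts/RobotoCondensed-Regular.ttf");
    font-weight: normal;
    font-style: normal;
    -fs-pdf-font-embed: embed;
    -fs-pdf-font-encoding: Identity-H;
}
/* Illustrative: apply the registered font to the whole document. */
body { font-family: "Roboto Condensed"; }
</style>
</head>
<body>
<div>тест / test</div>
</body>
</html>
</cfdocument>
```

The key design point is that the encoding hint travels with the `@font-face` declaration, so it only takes effect for fonts registered that way.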
I believe a real fix, at least for my scenario where I always specify a font directory on the `<cfdocument />` tag, will require two changes: something like a change to https://github.com/flyingsaucerproject/flyingsaucer/blob/master/flying-saucer-pdf/src/main/java/org/xhtmlrenderer/pdf/ITextFontResolver.java#L168 to allow an encoding to be passed for the entire directory being added, and a change to https://github.com/lucee/extension-pdf/blob/master/source/java/src/org/lucee/extension/pdf/xhtmlrenderer/FSPDFDocument.java#L70 so that it adds fonts with a more global encoding by default.
Sergey Mishchuk 10 January 2021 at 14:30
After some investigation, it looks like the problem with non-ASCII symbols first occurs in 10.0.0.70, exactly when FS became the default type instead of PD4ML (10.0.0.69 now works correctly in my environment).
Details
Assignee: Michael Offner
Reporter: Sergey Mishchuk
Priority: Critical
Cyrillic and all other non-ASCII characters are not rendered by CFDOCUMENT:
<cfprocessingdirective pageencoding="utf-8">
<cfdocument format="pdf" orientation="portrait" pagetype="a4" fontEmbed = "yes">
тест/test
<div style="font-family:Calibri; color:green; font-size:18pt;"> тест: Calibri </div>
<div style="font-family:Courier; color:blue; font-size:14pt;"> тест: Courier </div>
</cfdocument>
Result with Lucee 5.3.3.62:
(non-ASCII characters are skipped, and the Calibri font is rendered as Times)
If I add the Calibri font using the fontdirectory attribute, it is rendered correctly, but non-ASCII characters are still ignored.
In 5.2.9.31 (CommandBox, also Lucee Express) there is no problem.
All fonts are in place. Calibri is installed in Windows (no steps were taken to add this font to Lucee).
Skipping of non-ASCII characters is also reproduced in 5.3.5.16-SNAPSHOT (Lucee Express).