Extracting text from PDF
Kurt Pfeifle
k1pfeifle at gmx.net
Tue Jul 3 11:57:41 PDT 2007
Rolf Kutz wrote:
> Kurt Pfeifle schrieb:
>> If you use your firefox to "print to file" your job, please upload the
>> resulting PostScript. I/we can then try to convert with a CUPS/pstops
>> + Ghostscript commandline chain (using different versions of Ghostscript
>> and parameter variations) to see if we find one which does not show your
>> problem....
>
> Here is a link to the Postscript:
>
> http://www.technology-forum.com/tmp/1004.ps
>
> Regards, Rolf
Sooo... most likely, Helge was right about his suspicion.
Evidence:
(1) I used ESP Ghostscript 8.15.3 to convert the PS (running the unmodified
"ps2pdf" shell script that comes with it). Result similar
to what you describe. Filesize 22,994 bytes.
"pdffonts 1004.pdf" output:
name                                  type         emb sub uni object ID
------------------------------------- ------------ --- --- --- ---------
RCISND+Nimbus_Sans_L.Bold.0.0.Set0    Type 1C      yes yes no      14  0
AJKQFS+DejaVu_Serif.Book.0.0.Set0     Type 1C      yes yes no       9  0
VKJNGT+Nimbus_Sans_L.Regular.0.0.Set0 Type 1C      yes yes no      12  0
(2) Then I ran GPL Ghostscript 8.57 (self-compiled some weeks ago, with
hardly any tweaking -- I just wanted to see if it builds
and now has the "cups" device included). Result: fonts are properly
embedded; PDF is searchable. Filesize 25,876 bytes.
"pdffonts 1004.pdf" output:
name                                  type         emb sub uni object ID
------------------------------------- ------------ --- --- --- ---------
RCISND+Nimbus_Sans_L.Bold.0.0.Set0    Type 1C      yes yes yes     13  0
AJKQFS+DejaVu_Serif.Book.0.0.Set0     Type 1C      yes yes yes      8  0
VKJNGT+Nimbus_Sans_L.Regular.0.0.Set0 Type 1C      yes yes yes     11  0
I'm not sure whether a PDF attachment would make it to the list, so I'll
send it to you by private mail.
So Helge was right with his advice to upgrade Ghostscript to solve this
problem.
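
If you want to check other PDFs for the same symptom, the "uni" column of
the pdffonts output is the one to look at: "no" means the font has no
"to Unicode" map, which is what differs between the two results above.
Below is a minimal sketch of such a check. The function name
fonts_without_unicode is made up for illustration, and the sketch assumes
font names without spaces (as in the listings above) and that the last two
fields of each row are the object ID.

```shell
# Hypothetical helper: print the names of fonts that pdffonts reports
# with uni = "no". Reads pdffonts output on stdin.
# Assumptions: two header lines; font names contain no spaces; the last
# two fields are the object ID (e.g. "14 0"), so "uni" is field NF-2.
fonts_without_unicode() {
    awk 'NR > 2 && NF >= 6 && $(NF-2) == "no" { print $1 }'
}

# Usage (assumes ps2pdf and pdffonts are installed):
#   ps2pdf 1004.ps 1004.pdf
#   pdffonts 1004.pdf | fonts_without_unicode
```

An empty result means every listed font carries a Unicode mapping; any
names printed are the fonts whose text is likely not to be searchable.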
--
Kurt Pfeifle
System & Network Printing Consultant ---- Linux/Unix/Windows/Samba/CUPS
Infotec Deutschland GmbH ..................... Hedelfinger Strasse 58
A RICOH Company ........................... D-70327 Stuttgart/Germany