Extracting text from PDF

Kurt Pfeifle kurt.pfeifle at infotec.com
Thu Jun 28 08:51:15 PDT 2007


> > Most likely, your PDF contains what text you see on screen (or on
> > paper, when printed) only in the form of bitmaps, not proper fonts...
>
> How can I check this?

In acroread or kpdf, open the document properties dialog from the menu. It should have a tab that lists the fonts used in the document.

Check whether any fonts are listed there, and what names and types they have.
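If no viewer is at hand, a rough check can also be scripted. The following is only a heuristic sketch in Python, not a real PDF parser: it searches the raw file bytes for /BaseFont entries and for /Subtype /Type3 markers (bitmap fonts usually end up as Type 3 fonts in a PDF). The function name scan_fonts is made up for this example. It will miss fonts whose dictionaries are stored in compressed object streams, so an empty result is a hint, not proof.

```python
import re
import sys

# Matches "/BaseFont /ABCDEF+Helvetica" and captures the font name.
BASEFONT_RE = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")
# Matches the marker of a Type 3 font dictionary.
TYPE3_RE = re.compile(rb"/Subtype\s*/Type3\b")


def scan_fonts(data: bytes):
    """Return (sorted font names, number of Type 3 markers) found in raw PDF bytes.

    Heuristic only: font dictionaries inside compressed object streams
    are not visible to a plain byte search.
    """
    names = sorted({m.group(1).decode("latin-1") for m in BASEFONT_RE.finditer(data)})
    type3_count = len(TYPE3_RE.findall(data))
    return names, type3_count


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as fh:
        font_names, type3 = scan_fonts(fh.read())
    print("BaseFont names:", ", ".join(font_names) or "(none found)")
    print("Type 3 font markers:", type3)
```

No font names at all suggests the pages may be pure images; Type 3 markers suggest bitmap fonts. Either case would explain why text extraction fails.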

That said, this problem ("a bitmap font was used") usually does not appear with Firefox. Helge's guess about the root of the problem may be a much better one.

If you use Firefox's "print to file" for your job, please upload the resulting PostScript file. We can then try to convert it with a CUPS pstops + Ghostscript command-line chain (using different versions of Ghostscript and parameter variations) to see whether we find one that does not show your problem.
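For anyone who wants to try the Ghostscript half of such a chain themselves, here is a minimal Python sketch. It covers only the PostScript-to-PDF step with Ghostscript's pdfwrite device and leaves out the pstops filter. The helper name build_gs_command and the file names are placeholders invented for this example; the binary name "gs" is an assumption about your installation.

```python
import subprocess
import sys


def build_gs_command(ps_path, pdf_path, gs_binary="gs", extra_args=()):
    """Build a Ghostscript command line that converts PostScript to PDF.

    extra_args lets you append further Ghostscript switches to experiment
    with parameter variations; gs_binary lets you point at another version.
    """
    return [
        gs_binary,
        "-q",                       # suppress startup messages
        "-dNOPAUSE",                # do not pause after each page
        "-dBATCH",                  # exit when the input is processed
        "-sDEVICE=pdfwrite",        # write PDF output
        "-sOutputFile=" + pdf_path,
        *extra_args,                # switches must come before the input file
        ps_path,
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python ps2pdf_sketch.py input.ps output.pdf
    subprocess.run(build_gs_command(sys.argv[1], sys.argv[2]), check=True)
```

Running this with several Ghostscript binaries (via gs_binary) and then testing text extraction on each output is one way to compare versions.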

--
Kurt Pfeifle
System & Network Printing Consultant --- Linux/Unix/Windows/Samba/CUPS
Infotec Deutschland GmbH - A RICOH Company ......... Stuttgart/Germany



