[vox-tech] How to tell if a pdf is text or image?

Troy Arnold troy-vox at zenux.net
Tue Mar 20 23:47:35 PDT 2007


On Tue, Mar 20, 2007 at 09:31:15PM -0700, hajhouse wrote:
> 
> What about converting the PDF files to postscript then running ps2ascii?

I know Alex mentioned not needing the actual text, but you can skip
the pdftops conversion entirely and run pdftotext on the PDF directly:
pdftotext file.pdf - | wc -l

If that prints 0, or something close to it, the PDF is most likely
image-only.

In Ubuntu/Debian, pdftotext is part of poppler-utils, which also
provides pdftops.
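
If you want to run that check over a pile of files, here is a rough
sketch. The function names (classify_stream, pdf_kind) are made up for
illustration. It counts non-whitespace characters rather than lines,
on the assumption that pdftotext may still emit page-break characters
for pages with no text on them.

```sh
#!/bin/sh

# Read text on stdin; print "text" if it contains at least one
# non-whitespace character, otherwise "image".
classify_stream() {
    # Drop all whitespace (spaces, newlines, form feeds) and count
    # what is left. The trailing tr strips padding some wc versions add.
    chars=$(tr -d '[:space:]' | wc -c | tr -d ' ')
    if [ "$chars" -gt 0 ]; then
        echo text
    else
        echo image
    fi
}

# Hypothetical helper: classify one PDF by what pdftotext extracts.
# Note: if pdftotext is missing or fails, this also reports "image".
pdf_kind() {
    pdftotext "$1" - 2>/dev/null | classify_stream
}
```

With those defined, something like
for f in *.pdf; do printf '%s: %s\n' "$f" "$(pdf_kind "$f")"; done
lists each file with its guess. A scanned PDF that has been through OCR
carries a text layer, so it will show up as "text".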

hth
-t

