Thursday, March 3, 2011

[Tools] Convert PDF to text

Courtesy of the interwebs:

"That depends. If it is an image in the PDF, you're out of luck.
Otherwise (if it is a PDF containing text) you can do the following, all in one line, assuming osx has the strings command and a perl interpreter:"


strings filename.pdf | perl -ne '$line=$_; $s=$line; $w=""; while ($s =~ m/(\w+)(.*)/){print $line if ($w eq $1); $w=$1; $s=$2;}'
