Comment on Pdf to odt/docx conversion has me weeping!

observantTrapezium@lemmy.ca 1 week ago

I know the pain. While there are definitely solutions that work sometimes, there’s just no “one size fits all” that I’m aware of. PDFs can represent text very differently internally.

What I did for one project where extracting the text produced a complete mess was to convert the PDF pages to images and then OCR them…
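
For anyone who wants to try that route, here is a rough sketch of the image-then-OCR idea in Python. It is not the exact code from that project. It assumes the third-party packages `pdf2image` (which needs Poppler installed) and `pytesseract` (which needs the Tesseract binary); the function names `ocr_pages` and `pdf_to_text`, the 300 DPI setting, and the `eng` language code are my own illustrative choices.

```python
from typing import Any, Callable, Iterable


def ocr_pages(images: Iterable[Any], ocr: Callable[[Any], str]) -> str:
    """Run `ocr` on each page image and join the results with blank lines.

    `ocr` is any callable that turns one image into a string, which keeps
    this helper independent of the OCR engine actually used.
    """
    texts = [ocr(image).strip() for image in images]
    return "\n\n".join(texts)


def pdf_to_text(pdf_path: str, dpi: int = 300, lang: str = "eng") -> str:
    """Rasterize every page of a PDF, then OCR the resulting images.

    Assumption: pdf2image and pytesseract are installed, along with the
    Poppler and Tesseract binaries they wrap. They are imported here so
    that ocr_pages stays usable without them.
    """
    from pdf2image import convert_from_path
    import pytesseract

    # One PIL image per page; higher DPI usually helps OCR but is slower.
    pages = convert_from_path(pdf_path, dpi=dpi)
    return ocr_pages(pages, lambda img: pytesseract.image_to_string(img, lang=lang))
```

Calling `pdf_to_text("some_file.pdf")` (a placeholder path) would give you plain text only. Layout, tables, and fonts are lost, so this is a fallback for when direct text extraction is unusable, not a way to get a faithful odt/docx.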
