It's not uncommon to want to take a PDF file and convert it to a more easily editable format like Word. Problem is, PDF wasn't meant to do that.

I am trying to convert a PDF into a word document and there does not seem to be any free programs that will do this. I have tried several. Can you help?

Probably not.

Your question is a common one, but it represents a fundamental misunderstanding of exactly what PDF documents are and how they were intended to be used.

I'll put it another way: let me explain why I can't help you.

PDF Format

PDF, or Portable Document Format, is a document format that is intended to address the fact that every computer is different from almost every other computer in some ways. Frequently, those ways affect how documents are displayed and/or printed.

For example, a Word document you receive might use fonts that aren't installed on your machine, so when it's displayed, Word has to pick a "close" font - but it's still different. Perhaps your printer has a minimum margin of 1/2 an inch, but your friend's printer can handle 1/8th of an inch - when you each print the same Word document, it wraps, paginates and fundamentally looks different depending on the specifics of your printer.

The things that contribute to visible differences in document presentation aren't limited to fonts and printers, nor are they limited to Word documents - almost any program that produces a document is susceptible to all sorts of system-to-system differences.

That's what the PDF format sets out to solve: a PDF document is intended to look the same everywhere, and in practice it pretty much does, not only across a wide variety of computers but also, these days, on devices ranging from portable readers to cell phones. It's a pretty powerful concept.

PDF's Intent

I want to stress that last point again: "a PDF document is intended to look the same everywhere."

The upshot is that PDF is fundamentally a display format. It's often been termed "electronic paper" and that's a great way to think of it. One of the most common ways to create a PDF file is to install a virtual printer driver and "print" one.

Display. Output.

It was never intended that a PDF document be used as "input" to some process to extract, edit or copy its contents.

Editing or Converting PDFs

What do you do after you print a document to paper and find an error that needs to be corrected?

You reprint it.

That's the "correct" way to make a change in a PDF - alter the original document that it was created from and re-create the PDF.

That's also the "correct" way to "convert" it to a Word document - keep the original Word (or other) document around for editing purposes.

One of the problems is that there's really no always-effective way to get the data back out of a PDF. PDFs can contain so many different kinds of data - text, pictures, pictures of text - that getting data out could be as simple as copy/paste, or as complex as having to run every page through OCR (optical character recognition) software to recover text in some kind of editable format.

In addition, the information in a PDF is organized for layout, page by page. All your careful organization by topic and chapters, with paragraphs that flow neatly from one to another as you edit the document, is completely replaced with an organization that reflects the physical layout of the information on the page. The display organization of a document is often very different from your conceptual organization of the document's contents.
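To make that concrete, here's a minimal sketch in Python. The sample text is a hand-written fragment in the style of a PDF page's content stream; it's an assumption for illustration, since real PDFs usually compress these streams and encode text in font-specific ways. The regular expression and the positioned_fragments function are my own illustration, not part of any PDF library.

```python
import re

# A hand-written, uncompressed fragment in the style of a PDF content stream
# (an illustrative assumption, not taken from a real file).
# BT/ET bracket a text object, Tf selects a font and size,
# Td moves the text position, and Tj shows a string.
CONTENT_STREAM = """
BT /F1 12 Tf 72 720 Td (Chapter 1) Tj ET
BT /F1 12 Tf 72 700 Td (PDF stores text as) Tj ET
BT /F1 12 Tf 72 686 Td (positioned fragments.) Tj ET
"""

# Simplistic pattern: it only handles this exact operator order and
# does not cope with escaped parentheses or compressed streams.
TEXT_OBJECT = re.compile(
    r"BT\s+/(\w+)\s+(\d+)\s+Tf\s+(\d+)\s+(\d+)\s+Td\s+\((.*?)\)\s+Tj\s+ET"
)


def positioned_fragments(stream):
    """Return a list of (x, y, text) tuples, one per text object found."""
    return [
        (int(x), int(y), text)
        for _font, _size, x, y, text in TEXT_OBJECT.findall(stream)
    ]


for fragment in positioned_fragments(CONTENT_STREAM):
    print(fragment)
```

Notice what the page actually records: a font, a position, and a short run of text. Nothing in it says "heading" or "paragraph"; that structure lived only in the original document.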

PDF Editing and Extraction Tools

There are tools, and as you've seen, they often don't work, don't work well, don't work completely, or don't work the way we really want them to.

For example, I can't point you to a tool that'll take a PDF in and produce a Word document out that matches the Word document you probably expect, if for no other reason than that it's difficult if not impossible to reconstruct the document's logical layout from its physical one. Tools can make lots of assumptions, but that's all they are - assumptions, and by their very nature they're often wrong.
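Here's a rough sketch of the kind of assumption a conversion tool has to make. It takes lines of text with their vertical positions and guesses that a larger-than-usual gap means a new paragraph. The function name guess_paragraphs, the line_height and gap_factor values, and the sample data are all hypothetical; real tools use more elaborate guesses, but they're still guesses.

```python
def guess_paragraphs(lines, line_height=14.0, gap_factor=1.5):
    """Group (y, text) lines into paragraphs using vertical gaps.

    lines must be ordered top to bottom. PDF coordinates grow upward,
    so y gets smaller as we move down the page.
    Assumption: a gap larger than line_height * gap_factor starts
    a new paragraph.
    """
    paragraphs = []
    current = []
    previous_y = None
    for y, text in lines:
        gap_is_large = (
            previous_y is not None
            and (previous_y - y) > line_height * gap_factor
        )
        if gap_is_large:
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
        previous_y = y
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


# Hypothetical page where the heading has generous space below it.
SPACED = [
    (720, "Chapter 1"),
    (690, "PDF stores text as"),
    (676, "positioned fragments."),
    (640, "A new paragraph."),
]

# Hypothetical page where the heading is set tight against the body text.
TIGHT = [
    (720, "Chapter 1"),
    (706, "PDF stores text as"),
    (692, "positioned fragments."),
]

print(guess_paragraphs(SPACED))
print(guess_paragraphs(TIGHT))
```

On the first sample the guess works: the heading and the two paragraphs come out separately. On the second, the heading is spaced like any other line, so it gets merged into the body text. Same rule, different layout, wrong answer.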

Editing tools do exist - the makers of Foxit Reader, for example, have several. But once again not everything works the way you might expect: yes, you may be able to change a typo, but adding a paragraph or picture that would cause the entire document to re-flow and re-paginate is probably not something that'll work as expected, if at all.

And of course not every PDF is editable - either by design (encryption or password protection when the PDF is created), or by practicality. A PDF that's built from images - say .jpg's - of pages may look exactly like a PDF that's made from a Word document, but the ability to go back to the original text is severely impaired.
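The difference between a PDF built from text and one built from page images can be sketched like this. The Page class is a hypothetical summary of what a page contains, not the structure of a real PDF or any library's API, and recovery_method is just an illustration of the decision a tool faces.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    """Hypothetical summary of one PDF page (for illustration only)."""
    text_runs: List[str] = field(default_factory=list)  # selectable text, if any
    image_count: int = 0                                 # embedded pictures


def recovery_method(page):
    """Pick the least painful way to get text back out of a page."""
    has_text = any(run.strip() for run in page.text_runs)
    if has_text:
        return "copy text directly"
    if page.image_count > 0:
        # The page only looks like text; the characters are pixels.
        return "run OCR on the page image"
    return "nothing to recover"


from_word = Page(text_runs=["Dear reader,", "thanks for asking."])
from_scanner = Page(image_count=1)

print(recovery_method(from_word))
print(recovery_method(from_scanner))
```

Both pages could look identical on screen, yet only the first one still holds actual characters.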

Article C4377 - July 23, 2010

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.

Recent Comments
21 Comments
Mudassir Ammaar Janjua
July 28, 2010 12:30 AM

Well, respected friend, this time you have to use a technique that passes through 3 stages using 2 pieces of software; you will find a good result.
The 2 pieces of software are named below:
1) Adobe Acrobat 3D Ver. 8, or any software that converts a PDF file into an image.
2) OmniPage 14, which will scan the image as it is; then you can save it as a Word document file.

Remember in your prayers.

I can help you if you need any more about it regarding software.

Meublir delVIdeo
July 28, 2010 12:36 AM

I have found that if you click Edit > Select All > Copy, then open a blank Word page and paste, you can save the pasted text as a Word .doc document. Granted, you lose the formatting, but you can alter the pasted copy any way you want, even filling in the blanks if so desired. (Again, this changes the formatting, but it's really quite simple to fix...)

whs
July 28, 2010 5:35 AM

There are tons of programs that can do that. See here: http://www.google.de/search?q=convert+.pdf+to+.doc&rlz=1I7DDDE_en-GB&ie=UTF-8&oe=UTF-8&sourceid=ie7&redir_esc=&ei=xyNQTKWGOoWNOMv5zdAN

There are tons that will try to do that. By definition they cannot succeed on all PDFs, and most people are disappointed with the results. It's not a simple conversion at all.
Leo
29-Jul-2010

Dieter
July 28, 2010 5:48 AM

Try the link below; it's freeware for personal use.
At least text or pictures can be extracted and used to create a new document. It's available for Windows and Linux.

http://pages.cs.wisc.edu/~ghost/

Good luck

Dieter

Joleen
September 5, 2010 9:05 PM

I downloaded a free program called PDF995 which worked great until I got my Acrobat distiller.