It's not uncommon to want to take a PDF file and convert it to a more easily editable format like Word. Problem is, PDF wasn't meant to do that.

I am trying to convert a PDF into a Word document and there do not seem to be any free programs that will do this. I have tried several. Can you help?

Probably not.

Your question is a common one, but it represents a fundamental misunderstanding of exactly what PDF documents are and how they were intended to be used.

I'll put it another way: let me explain why I can't help you.

PDF Format

PDF, or Portable Document Format, is a document format that is intended to address the fact that every computer is different from almost every other computer in some ways. Frequently, those ways affect how documents are displayed and/or printed.

For example, a Word document you receive might use fonts that aren't installed on your machine, so when displayed Word has to pick a "close" font - but it's still different. Perhaps your printer has a minimum margin of 1/2 an inch, but your friend's printer can handle 1/8th of an inch - when you each print the same Word document it wraps, paginates and fundamentally looks different depending on the specifics of your printer.

The things that contribute to visible differences in document presentation aren't limited to fonts and printers, nor are they limited to Word documents - almost any program that produces a document is susceptible to all sorts of system-to-system differences.

That's what PDF format sets out to solve: a PDF document is intended to look the same everywhere, and in practice it pretty much does. And across not only a wide variety of computers, but also these days on devices ranging from portable readers to cell phones. It's a pretty powerful concept.

PDF's Intent

I want to stress that last point again: "a PDF document is intended to look the same everywhere."

The upshot is that PDF is fundamentally a display format. It's often been termed "electronic paper" and that's a great way to think of it. One of the most common ways to create a PDF file is to install a virtual printer driver and "print" one.

Display. Output.

It was never intended that a PDF document be used as "input" to some process to extract, edit or copy its contents.

Editing or Converting PDFs

What do you do after you print a document to paper and find an error that needs to be corrected?

You reprint it.

That's the "correct" way to make a change in a PDF - alter the original document that it was created from and re-create the PDF.

That's also the "correct" way to "convert" it to a Word document - keep the original Word (or other) document around for editing purposes.

One of the problems is that there's really no always-effective way to get the data back out of a PDF. PDFs can contain so many different kinds of data - text, pictures, pictures of text - that getting data out could be as simple as copy/paste, or as complex as having to run every page through OCR (optical character recognition) software to recover text in some kind of editable format.

In addition, the information in a PDF is organized for layout, page by page. All your careful organization by topic and chapter, with paragraphs that flow neatly from one to another as you edit the document, is completely replaced with an organization that reflects the physical layout of the information on the page. The display organization of a document is often very different from your conceptual organization of the document's contents.
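
To make that concrete, here's a minimal sketch in Python. It is not how any real PDF reader works, and the Fragment class, the coordinates and the lines_from_fragments function are all made up for illustration. It only assumes what's described above: a page is stored as pieces of text pinned to positions, in no particular reading order.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One piece of positioned text (a hypothetical stand-in for page content)."""
    x: float  # distance from the left edge of the page
    y: float  # distance from the bottom edge of the page
    text: str


def lines_from_fragments(fragments, y_tolerance=2.0):
    """Group fragments into visual lines, top to bottom, then left to right."""
    lines = []  # each entry is [baseline_y, [fragments on that line]]
    for frag in sorted(fragments, key=lambda f: -f.y):
        if lines and abs(lines[-1][0] - frag.y) <= y_tolerance:
            lines[-1][1].append(frag)
        else:
            lines.append([frag.y, [frag]])
    return [
        " ".join(f.text for f in sorted(frags, key=lambda f: f.x))
        for _, frags in lines
    ]


# Fragments in the arbitrary order a file might store them.
page = [
    Fragment(72, 700, "Chapter 1"),
    Fragment(300, 680, "on a page."),
    Fragment(72, 680, "Text is placed by position"),
    Fragment(72, 40, "Page 7"),
]

print(lines_from_fragments(page))
# ['Chapter 1', 'Text is placed by position on a page.', 'Page 7']
```

Notice that the chapter heading, the body text and the page footer all come back as the same kind of thing: a line of text at a position. Nothing in the data says which is which.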

PDF Editing and Extraction Tools

There are tools, and as you've seen, they often don't work, don't work well, don't work completely, or don't work the way we really want them to.

For example, I can't point you to a tool that'll take a PDF in and produce a Word document out that matches the Word document you probably expect, if for no other reason than that it's difficult, if not impossible, to reconstruct the document's logical layout from its physical layout. Tools can make lots of assumptions, but that's all they are - assumptions - and by their very nature they're often wrong.
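
Here's a sketch of the kind of assumption I mean, in Python. It's hypothetical: the guess_paragraphs function and its 80% threshold are my own illustration, not taken from any real converter. It guesses that a line noticeably shorter than the full line width ends a paragraph.

```python
def guess_paragraphs(lines, full_width):
    """Join visual lines into paragraphs using a single guess:
    a line shorter than 80% of full_width ends the paragraph."""
    paragraphs, current = [], []
    for line in lines:
        current.append(line.strip())
        if len(line.rstrip()) < full_width * 0.8:
            paragraphs.append(" ".join(current))
            current = []
    if current:  # keep any trailing lines that never hit a short line
        paragraphs.append(" ".join(current))
    return paragraphs


# Three real paragraphs, as they might appear line by line on a page.
visual_lines = [
    "The quick brown fox jumps",
    "over the lazy dog.",
    "A second paragraph starts",
    "and ends on a full line.",
    "A third paragraph.",
]

for paragraph in guess_paragraphs(visual_lines, full_width=25):
    print(paragraph)
# The quick brown fox jumps over the lazy dog.
# A second paragraph starts and ends on a full line. A third paragraph.
```

The sample has three paragraphs, but the second one happens to end on a nearly full line, so the guess glues it to the third and returns only two.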

Editing tools do exist - the makers of Foxit Reader, for example, have several. But once again not everything works the way you might expect: yes, you may be able to change a typo, but adding a paragraph or picture that would cause the entire document to re-flow and re-paginate is probably not something that'll work as expected, if at all.

And of course not every PDF is editable - either by design (encryption or password protection when the PDF is created), or by practicality. A PDF that's built from images - say .jpg's - of pages may look exactly like a PDF that's made from a Word document, but the ability to go back to the original text is severely impaired.

Article C4377 - July 23, 2010

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.

21 Comments
Maria
July 23, 2010 7:22 PM

Well, you can try saving it as a word doc & if you are lucky, you can play around with it or yank your hair out trying, lol or save it as text, then copy to Word & format the whole doc to your liking. PDF's are meant to be as is with no changes.

Mark Jacobs
July 24, 2010 2:03 AM

@Maria. The only way you can save a pdf document as a Word file is if you have a program such as Adobe Acrobat or Foxit Pro. Adobe & Foxit Readers can't do that. An OCR program can, but the output would need a bit of work to get it formatted like the original. You can select, copy and paste the data into Word that you want to copy, but often the layout will be so weird it would take a long time to fix up. One of the best OCR programs I've found for preserving layout is FineReader but the Word document it produces can sometimes be very hard to edit properly.

The Dr.
July 24, 2010 6:50 AM

I find this site does a superb job at converting PDF files to Word:
http://www.scannedpdftoword.com/

Mary
July 25, 2010 9:46 PM

One more free PDF to Word converter.

http://www.pdfonline.com/pdf2word/index.asp

Tony
July 25, 2010 9:59 PM

I thought one of the purposes of PDF was security. Rather than send an editable Word or Excel document via email, some people deliberately convert to PDF so that the recipient cannot tamper with it.

In the work environment we sometimes pull up reports from our accounting program, save them as PDFs, then use a program called Able2Extract to convert them to Excel spreadsheets if we want to work with the information. There's an option for Word conversions as well.

Jonathan V.
July 26, 2010 9:30 AM

Yet another PDF converter:
I've had pretty good results with , even for relatively complex documents.

Jonathan V.
July 26, 2010 9:31 AM

Link didn't come through on my last comment:
http://www.pdftoword.com/

John
July 27, 2010 8:28 AM

The best PDF to Word converter, which I use every day, is ABBYY PDF Transformer. It will produce a Word document that exactly matches the original PDF, once you get to know how to use it. Unfortunately, it's a professional tool and is not cheap (about $100).

Douglas Johnson
July 27, 2010 11:08 AM

My major problem with PDF conversions is that I am subject to notification mandates from the local federal court, and the lists of people to be notified are furnished to me in PDF format. The PDFs are supposed to be compatible with Avery labels, but they seldom are. I need to convert documents that I did not originate in order to generate mailing labels. So far the only way I have found to do this is the old fashioned manual keyboard.

Tim
July 27, 2010 12:04 PM

Gmail just added the ability to convert pdf to doc when uploading to docs. Have not tried it myself, but since it's google I have the feeling it'll not be too bad.

Rob
July 27, 2010 12:10 PM

different site and not sure if it does the same as the pdftoword site but this looks similar.
http://www.pdfonline.com/pdf2word/index.asp

Geoff.
July 27, 2010 2:32 PM

I heard once that the main object of a PDF file was that it can't be altered, so that if an offer to purchase something for x was made, that figure could not be altered and could be legally relied on?

I don't think that was ever a goal. Nonetheless, practical reality is that they can be altered.
Leo
29-Jul-2010

Rohn
July 27, 2010 3:13 PM

As Leo has already said, PDF format is designed to preserve formatting of a document. Although there are a few security features built into the PDF standard they are weak. PDF can be set to defeat simple copy and editing with other tools (that may add comments to unprotected PDFs). But the bottom line is that with some effort PDF content, text and graphics, can be copied or extracted one way or another.

If you want to ensure unchanged content, you should rely on a "Digital Signature" generated by an encryption tool. I'm not sure how well they work with PDFs, but I know that "Digital Signatures" are the way to go with text files like Word DOCs.

Actually, MS Office provides a way, kinda, of extracting text from PDFs. In the Office Tools feature, you can find the Document Imaging Tool. It opens graphic files, TIFF and MDI, and performs OCR to extract the text. So to make this work, you have to use a tool (there are many free websites that will do it for you) to convert the PDF to image format. Or you can do it "at home" as follows, Process to Convert PDF to DOC:
Irfanview needs optional Plugins installed to open PDF files.
Open the PDF in Irfanview
Save AS TIF format
Open the TIF version in Microsoft Office Document Imaging tool
Tools / Recognize Text Using OCR
Tools / Send Text to Word

I have tried several PDF to DOC conversion tools over the years. The one feature/defect that I have learned to check for is whether the converted text is put into a single paragraph or if each line is put into separate "Text Box" in a misguided effort to maintain the explicit line breaks. If each line is in a separate box, it is effectively un-editable. In that case I uninstall the conversion software immediately as a waste of time.

Here are links to some articles:
http://www.gohtm.com/ is a free service that converts pdf, word, xls, ppt to html. Including jpegs.

http://www.adobe.com/products/acrobat/access_onlinetools.html converts any pdf file to html.

http://www.pdfzone.com/article2/0,1759,1815273,00.asp - Converting PDF to DOC (check the PDFZone site, they have lots more similar pages!)

http://news.office-watch.com/t/n.aspx?articleid=1317&zoneid=12 - Converting PDF to Word

http://www.ofzenandcomputing.com/zanswers/1017

Kevin Warrington
July 27, 2010 4:38 PM

Simple, free and very effective conversion can be done using free OpenOffice and free Sun PDF Import Extension.

andy
July 27, 2010 4:40 PM

I agree with Rob.
http://www.pdfonline.com/pdf2word/index.asp

This site can produce a word document with all the pictures, appearing exactly as the PDF does. If using paste special then you will get a word doc that is easily edited.

Hooda
July 27, 2010 10:07 PM

i use a small utility called pdf2word
works effectively for me.

Link: www.hellopdf.com

Maybe it can be helpful for you too.

Mudassir Ammaar Janjua
July 28, 2010 12:30 AM

Well! respected friend this time you have to use a technique which will pass from 3 Stages by using 2 Softwares, you will find a good result.
2 Softwares named below:
1) Adobe Acrobat 3D Ver.8/ use any software that convert PDF file into Image.
2) OmniPage 14 that will scan image as it is then you can save it as Word document file.

Remember in your prayers.

I can help you if you need any more about it regarding software.

Meublir delVIdeo
July 28, 2010 12:36 AM

I have found that if you click edit>select all>copy then open a word blank page and paste you can save the pasting as a word .doc document. Granted you lose the formatting but you can alter the pasted copy any way you want, even filling in the blanks if so desired.(again, this changes the formatting but it's really quite simple to fix...)

whs
July 28, 2010 5:35 AM

There are tons of programs that can do that. See here: http://www.google.de/search?q=convert+.pdf+to+.doc&rlz=1I7DDDE_en-GB&ie=UTF-8&oe=UTF-8&sourceid=ie7&redir_esc=&ei=xyNQTKWGOoWNOMv5zdAN

There are tons that will try to do that. By definition they cannot succeed on all PDFs, and most people are disappointed with the results. It's not a simple conversion at all.
Leo
29-Jul-2010

Dieter
July 28, 2010 5:48 AM

Try the link below it's freeware for personal use
at least text or pictures can be extracted and use to create a new document. It's available for Windows and Linux

http://pages.cs.wisc.edu/~ghost/

Good luck

Dieter

Joleen
September 5, 2010 9:05 PM

I downloaded a free program called PDF995 which worked great until I got my Acrobat distiller.

Comments on this entry are closed.
