Helping people with computers... one answer at a time.

Copying a web page to archive or read later is not terribly difficult, but getting everything to copy as you see it can be a challenge.

How do I copy an entire web page? I copy and paste, but not everything appears as I see it. For example I'm copying and pasting a bank statement to Word, but portions of the page appear empty.

It depends a little on exactly what you're trying to do. I know you're trying to copy a web page, but why? So you can modify it, or just save a copy for your archives?

There are several approaches. None of them are what I'd call really clean, but depending on your goal, one or more of them might work for you.

Print to PDF

If all you're attempting to do is save a copy of the page for your records, this is my number one recommendation. I do it myself for almost all of my banking records. I visit my bank's web site every month, display the statement, and then "print" it to a PDF file which I then save.

PDF files are great for several reasons. With the right software installed they're easy to produce, and PDF has become so ubiquitous that finding a PDF reader is almost trivially easy. Chances are you already have one downloaded on your machine.

If you're running Windows XP or older, I recommend PDFCreator, a free, open source utility for creating PDFs. Install it, and you'll get a virtual printer driver: print to it, and the "printed" result is a PDF file. PDFCreator appears to have difficulty in Windows Vista; scan its discussion forums for hints if you'd like to try to get it working.

Alternatively, Foxit Software, maker of the free Foxit Reader, also makes its own PDF creator. It's not free, but it apparently works under Vista. The highly regarded screen capture utility SnagIt also includes a PDF capture printer driver. And of course there's always Adobe Acrobat itself; it happens to be what I use on my Vista laptop since it came bundled.

Print to Paper

It's probably not what you were looking for, but it had to be said. Quite often for archival purposes actual hard copy is the way to go.

Side note: some HTML pages will print differently than they appear on screen. This is actually under the control of the web page designer. If you print this page, for example, items such as the advertisements and menu bar will not be printed. Ideally printing will give you useful but not necessarily identical results.

Copy/Paste

OK, so if you still want to take the copy/paste route there are approaches, but there's almost no chance of getting exactly what you see in your browser. Depending on the page design and the program you're pasting into, many things will not copy over, or will copy over slightly differently. Consider that the exact same page viewed in two different browsers, for example Internet Explorer and Firefox, will look slightly different: the same page, and yet not the same results.

If different browsers, which are specifically designed for viewing web pages, can't render a page identically, then the chances of other programs such as Word doing so are basically zero.

To start with, in your browser copy the document by doing this:

  • Type CTRL+A - this selects everything on the page. It's much more reliable than trying to select everything with the mouse. (I always miss something :-).

  • Type CTRL+C - this copies the selection to the clipboard.

Now in Word, type CTRL+V to paste. If you do this with, say, the Ask Leo! home page you'll see it looks quite different than the original:

[Image: Ask Leo! home page copied into Microsoft Word]

The content is there, but the formatting is gone. In fact, it appears that Word did not get the stylesheet that is associated with my pages. Stylesheets can control a tremendous amount of the layout and formatting of web pages. In my case the results are still somewhat usable, but I can easily see that other sites which rely even more heavily on stylesheets might be more seriously affected.

In general, copy/paste is a reasonable approach when you want to save only a portion of text that you see on a web page. Various limitations make it less than ideal for trying to save the entire page.
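To illustrate why the text survives a paste while the look of the page does not, here is a minimal Python sketch. It is only an analogy for what a paste target without the stylesheet ends up with, not a description of what Word actually does internally. The class name `TextOnly`, the function `page_text`, and the sample page are all made up for this example; only the standard library's `html.parser` module is used.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects visible text; tags, styles, and scripts are dropped."""

    SKIPPED = {"style", "script"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Everything inside <style> or <script> is not visible text.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def page_text(html):
    """Return the page's text, one chunk per line, with no formatting."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Hypothetical page: the stylesheet reference and inline style are lost,
# the words are kept.
SAMPLE_PAGE = """
<html><head>
<link rel="stylesheet" href="site.css">
<style>h1 { color: navy; }</style>
</head><body>
<h1>Statement</h1>
<p class="total">Balance: <b>$42.00</b></p>
</body></html>
"""

print(page_text(SAMPLE_PAGE))
```

The printed result contains "Statement", "Balance:" and "$42.00", but nothing about colors, fonts, or layout, which is roughly the situation when only a portion of text is what you were after.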

File Save

Most people miss the fact that there's a "Save" item on the File menu in their browser. While viewing a web page you want to save, click on the File menu, and then Save or Save As.... Make sure that the save type is ".htm" or ".html", and you'll get a true copy of the web page saved to your local machine.

Naturally, there are caveats here also.

The web page may be saved as only the HTML, meaning that the images and other files referenced within the HTML page may not be saved. Depending on your browser, when you then view that saved page later, these items may not display, or they may be fetched automatically from the web, assuming that they're still on the original web site.
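If you're curious which files an HTML-only save leaves behind, the following Python sketch lists the external files a page refers to. It is a rough illustration, not what any browser does: it only looks at `img`, `script`, and `link` tags, and the names `ResourceFinder` and `external_resources`, the sample page, and the example.com addresses are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceFinder(HTMLParser):
    """Records the files a page pulls in from elsewhere."""

    # tag -> attribute that points at an external file (a partial list)
    REF_ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.REF_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page's address.
                self.resources.append(urljoin(self.base_url, value))


def external_resources(html, base_url):
    """Return absolute URLs of files referenced by the page."""
    finder = ResourceFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.resources


# Hypothetical page and address, for illustration only.
SAMPLE_PAGE = """
<html><head><link rel="stylesheet" href="/css/site.css"></head>
<body>
<img src="logo.png">
<script src="https://cdn.example.com/app.js"></script>
</body></html>
"""

for url in external_resources(SAMPLE_PAGE, "https://example.com/statements/march.html"):
    print(url)
```

Every address it prints is something the saved ".html" file depends on but does not contain, which is why the page may look incomplete later if the original site changes or disappears.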

The web page may be saved with all the images and additional files. This is handy because it's as close to a snapshot of the web page as you can get. The problem is that it's not saved in a single file. You may find "mysavedfile.html" as your saved file, if that's what you called it, but then you'll also find a sub-folder called "mysavedfile_files" where all of the images and other components have also been downloaded. You'll need to keep both that ".html" file as well as the files that came with it to accurately save a copy of the page.
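Because the saved page is really a file plus a folder, it's easy to move one and forget the other. Here is a small Python sketch that copies both together. It assumes the "name_files" folder convention described above; browsers differ, so treat the folder name as an assumption. The function names `companion_folder` and `copy_saved_page` and the demo paths are made up for this example.

```python
import shutil
import tempfile
from pathlib import Path


def companion_folder(html_path):
    """Guess the sub-folder saved alongside the page (assumes '<name>_files')."""
    html_path = Path(html_path)
    return html_path.with_name(html_path.stem + "_files")


def copy_saved_page(html_path, dest_dir):
    """Copy the .html file and, if present, its companion folder."""
    html_path = Path(html_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    copied = [Path(shutil.copy2(html_path, dest_dir / html_path.name))]
    folder = companion_folder(html_path)
    if folder.is_dir():
        target = dest_dir / folder.name
        # dirs_exist_ok requires Python 3.8 or later.
        copied.append(Path(shutil.copytree(folder, target, dirs_exist_ok=True)))
    return copied


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    downloads = Path(tmp) / "downloads"
    (downloads / "mysavedfile_files").mkdir(parents=True)
    (downloads / "mysavedfile.html").write_text(
        "<img src='mysavedfile_files/logo.png'>"
    )
    (downloads / "mysavedfile_files" / "logo.png").write_bytes(b"")

    copied = copy_saved_page(downloads / "mysavedfile.html", Path(tmp) / "archive")
    copied_names = sorted(p.name for p in copied)
    print(copied_names)
```

If the folder isn't there (an HTML-only save), the sketch simply copies the single file.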

Article C3034 - May 24, 2007

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18 year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.

Recent Comments
43 Comments

tried the control + a on page i wanted to copy and THIS page and NEITHER time did it work - i have never ever worked on any computer where those types of commands actually ... like the oxymoron they are called ... function.

Posted by: robal at January 11, 2011 7:53 AM

Not just helpful responses but in clear easy to follow language. Giving an idea of how things work. Even if the info is a bit off for a specific situation, you can now google for a better search and find what you exactly need. A very smart site. I'm very savvy, but couldn't match Leo.
Yet I have to add something to this post. Since 1995 I've been downloading everything on a site. css, pics, everything. Locally it looks the same and has the same code. Nothing works perfectly but these site down-loaders are a necessity. Do a search like "Save entire website", and find software like HTTrack or one of the many others.
Also this site would be friendlier if "preview" of a comment wasn't wiped. Not a biggie.
- Arthur

Posted by: Arthur at January 23, 2011 1:41 PM

I tried the save as pdf option and it worked great. I was flubbing around, wondering how to use the information stored somewhere/somehow for the purpose of printing to produce an image and since I never make PDFs, I wouldn't have thought of this. You're pretty much my hero. Thanks!

Posted by: Leah at January 31, 2012 10:14 AM

I just found my entire website in PDF form on a site called Printfu.com. WTF? Isn't this a blatant invitation to copy my intellectual property?

Posted by: China at March 1, 2012 1:12 PM

I am in need of saving a 100 + pages of a website before they pulled the plug. A quick internet search brought me here. What a simple and eloquent solution to print to PDF. Downloaded the driver and it worked like a charm.

Thank you!

Posted by: Nocutename at March 7, 2012 5:26 PM