
Scanning a document can produce a graphics file that is too large to email. Your scanner's settings let you lower the DPI to something more practical.

I scanned a copy of my W2 form and tried to email it to someone but I was told the file was too big - over 25 MB. How does a simple text document acquire such a huge volume?

In this excerpt from Answercast #81, I look at the reason scanned documents can often result in a very large file.

Scanned document is a large file

Well, when you scan it, it's not a text document. What a scan creates is a picture of the document.

It's like taking a picture of that piece of paper with your camera. A camera's image can be quite large, depending on a number of settings, and a scan is no different. The result is most likely a .jpeg or a .png file (both are graphics file formats), not a text file.

Converting an image of a scanned document into text requires OCR (optical character recognition), plus some additional work to make sure that the formatting all comes out properly.

Set scan resolution

Now, we can bring that 25 MB down. It does seem a tad large for a simple one-page document.

What I would have you do is look at the resolution your scanner is using. Usually, a simple text document can be scanned at something as low as 75 dpi (dots per inch), and it will look just fine.

Many scanners are set to, or default to, a higher resolution. If you've been scanning pictures recently, yours may be set much higher: 300, 1200, maybe even more. The net result is that each inch of the scan contains more dots, and because the resolution applies both across and down the page, doubling the DPI roughly quadruples the number of dots. More dots take up more space, and more space means a larger file.

For a simple text document, you don't need that much. That's the setting I would look at the next time you scan something like this. Check the DPI and see whether something as low as 75 dpi gets you a perfectly acceptable scan of the document you want. Perhaps even lower will do.
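
To put rough numbers on that, here is a minimal sketch in Python of how the raw (uncompressed) size of a scanned page grows with DPI. It assumes a US Letter page (8.5 x 11 inches) and 3 bytes per pixel (24-bit color); both figures are assumptions chosen for illustration, and the function name is made up for this example. Real files are usually smaller, because formats such as JPEG and PNG compress the pixel data.

```python
# Rough estimate of the raw (uncompressed) size of a scanned page.
# Assumptions for illustration only: US Letter paper and 24-bit color.

PAGE_WIDTH_IN = 8.5    # assumed page width in inches
PAGE_HEIGHT_IN = 11.0  # assumed page height in inches
BYTES_PER_PIXEL = 3    # assumed 24-bit color (one byte each for R, G, B)


def raw_scan_bytes(dpi: int) -> int:
    """Return the uncompressed size in bytes of one page scanned at `dpi`."""
    width_px = round(PAGE_WIDTH_IN * dpi)
    height_px = round(PAGE_HEIGHT_IN * dpi)
    return width_px * height_px * BYTES_PER_PIXEL


if __name__ == "__main__":
    # Print the estimate for a few common scanner settings.
    for dpi in (75, 150, 300, 600, 1200):
        size_mb = raw_scan_bytes(dpi) / (1024 * 1024)
        print(f"{dpi:>5} dpi: {size_mb:8.1f} MB uncompressed")
```

Under these assumptions, 300 dpi works out to roughly 24 MB of raw pixel data, about sixteen times the roughly 1.5 MB needed at 75 dpi.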

(Transcript lightly edited for readability.)

Article C6165 - December 23, 2012

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.


6 Comments
Tom R.
December 25, 2012 10:09 AM

I surely wouldn't recommend emailing a document as sensitive as a W2 form. The potential for identity theft is too great because typically these forms contain confidential info like SS numbers. I would paper mail such a sensitive document. Much more secure.

James Heinrich
December 25, 2012 11:56 AM

For a >25MB output file, I'd suspect the file was scanned into some uncompressed format, most likely TIFF or BMP, which can require up to 4 bytes per pixel. Assuming 8.5"x11" @ 300dpi * 4 bytes/pixel, you get 32.1MB.
You'd want to save as a PNG (or possibly GIF) if it's a mostly text/line-art image, or JPEG if it's a photographic-type subject. This should be an option in either the scanning program or your favourite image editor. In either case the output image should easily be less than 1MB.

bill - Minneapolis
December 25, 2012 12:31 PM

In addition to the file type mentioned by James as being a typical culprit in large file sizes, check the color settings.
A W2 and many other documents don't need to be sent in full color.
8-bit grayscale takes only 1/3 the space of full color, and 2-color (black/white) takes even less. If you get down to 2-color, many programs will use Group 4 encoding, which is extremely compact with no loss of image quality. Group 4 was created for sending faxes, and it is at its best compressing any image that is mainly white with black scattered all around.
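
As a rough check on the color-depth point above, here is a small Python sketch comparing the raw size of one page in each mode. The page size (US Letter), the 300 dpi resolution, and the names used are assumptions for illustration; the figures are before any compression such as Group 4 or JPEG, and they ignore any per-row padding a real file format might add.

```python
# Raw size of one page at a fixed resolution for different color depths.
# Assumptions for illustration only: US Letter paper scanned at 300 dpi.

WIDTH_PX = 2550   # 8.5 inches * 300 dpi
HEIGHT_PX = 3300  # 11 inches * 300 dpi

# Bits stored per pixel in each mode (before any compression).
BITS_PER_PIXEL = {
    "24-bit color": 24,
    "8-bit grayscale": 8,
    "1-bit black/white": 1,
}


def raw_bytes(bits_per_pixel: int) -> int:
    """Return uncompressed bytes for the page, rounded up to whole bytes."""
    total_bits = WIDTH_PX * HEIGHT_PX * bits_per_pixel
    return (total_bits + 7) // 8


if __name__ == "__main__":
    for mode, bits in BITS_PER_PIXEL.items():
        print(f"{mode:<18} {raw_bytes(bits) / (1024 * 1024):6.2f} MB")
```

Before compression, grayscale comes to one third of the full-color size and black/white to one twenty-fourth, which is consistent with the ratios described above.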

Mark J
December 25, 2012 12:45 PM

@Bill
I've found that 8 bit gray scale compressed in the .jpg format is significantly more readable than a 2 color image and with not a great deal of size difference.

HA
December 25, 2012 8:47 PM

Send text documents as a PDF.

Alex Dow
December 26, 2012 4:09 AM

I recently carried out a comparison exercise, based on the simple phrase "Here is the News".

Bearing in mind the effects of disk segments, etc., and that I trimmed the JPG image to the minimum needed to contain the phrase, the results were:

    1 KB - TeXT
   11 KB - JPeG Graphics
   26 KB - DOC WORD
   32 KB - PRiNt as sent to Printer
   50 KB - Portable Document Format
7,125 KB - MOVie Quicktime

I did not check the PDF version to see whether it includes extras such as fonts.

