
Most music, picture, and movie files are already compressed. The result is that compressing them again won't make much difference and could even make them larger.

Can I ZIP my pictures or MP3 files to save space?

ZIP is a very popular compression format supported by many popular programs such as WinZip, 7-Zip, and recent versions of Microsoft Windows. ZIPping a file or set of files can often reduce their size significantly, at the cost of needing to unzip them before they can be used.

Note though that I said, "...often reduce their size."

Unfortunately, "often" doesn't mean "always."

The short answer

ZIPping photographs, music, and videos will typically not make them significantly smaller and can even make them slightly larger.

To understand why that might be, we need to look into how compression works at a high level.

About compression

While the specifics of many different compression algorithms are often the stuff of research, theses, and even patents, the concepts of compression are actually fairly simple.

The idea is that information stored on disk is often stored in a way that is less than optimal for storage. It may be optimal for other purposes, but as a side effect, there may be redundant information in the data that could be represented differently.

A simple compression algorithm is "run length encoding."

Consider the following text:

This is a row of 10 asterisks: ********** followed by text.

That's 59 characters long. Suppose we define the character "+" to be not a plus character, but rather an indicator that the next two characters are a count and the character after those is the one to repeat that many times. Then we get this:

This is a row of 10 asterisks: +10* followed by text.

We've shortened, or "compressed," the text to only 53 characters, but it still means exactly the same thing. On decompression, the "+" signals that the "10*" that follows it should be read and replaced with 10 asterisks. The original uncompressed text is restored:

This is a row of 10 asterisks: ********** followed by text.
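
The decompression step just described can be sketched in a few lines of Python. This is only an illustration of the toy scheme in this article, not of how ZIP works; the function name rle_decode and its marker parameter are made up for the example.

```python
def rle_decode(packed: str, marker: str = "+") -> str:
    """Expand the toy format: marker, two-digit count, character to repeat."""
    out = []
    i = 0
    while i < len(packed):
        if packed[i] == marker:
            count = int(packed[i + 1:i + 3])   # the two count digits
            out.append(packed[i + 3] * count)  # the character being repeated
            i += 4                             # skip marker, count, and character
        else:
            out.append(packed[i])              # ordinary character, copied as-is
            i += 1
    return "".join(out)


packed = "This is a row of 10 asterisks: +10* followed by text."
restored = rle_decode(packed)
print(len(packed), len(restored))  # 53 59
print(restored)
```

Note that the "10" in "row of 10" is left alone: only digits that follow the marker are treated as a count.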

Compression doesn't always compress

In the example above, we took a line of 59 characters and "compressed" it to 53 characters. It's not a great compression algorithm, but it worked.

Now, let's compress this text using the same algorithm:

Here's a single plus sign: + followed by text.

That's 46 characters long.

The problem is that because it actually contains the plus sign, the character we said was special in our compression algorithm, we can't just let it be. If we do, the decompression algorithm will look at it and say, "Oh, the next two characters are a count of the number of times I should repeat the third character following," which is simply wrong.

Unless we specially encode the plus character:

Here's a single plus sign: +01+ followed by text.

That allows the decompressor to follow its algorithm: "+" means the next two characters are a count (one, in this case) of the number of times to repeat the third character ("+"). The compression and decompression algorithm works.

The only problem is that the "compressed" data at 49 characters is now larger than the original 46.
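
Here is a rough Python sketch of the compressing side of the same toy scheme, reproducing both character counts above. The name rle_encode and the MIN_RUN threshold are assumptions made for this example: the article never says how long a run must be before it is worth encoding, so the sketch encodes only runs of five or more (an encoded run always costs four characters), plus every literal plus sign.

```python
from itertools import groupby

MARKER = "+"   # the special character chosen in the article
MIN_RUN = 5    # assumed threshold: an encoded run always costs 4 characters


def rle_encode(text: str) -> str:
    """Toy run-length encoder: marker, two-digit count, repeated character."""
    out = []
    for char, group in groupby(text):
        run = len(list(group))
        while run > 0:
            chunk = min(run, 99)  # the count has only two digits
            if char == MARKER or chunk >= MIN_RUN:
                # Long runs are encoded to save space; a literal marker
                # must always be encoded so the decoder is not misled.
                out.append(f"{MARKER}{chunk:02d}{char}")
            else:
                out.append(char * chunk)
            run -= chunk
    return "".join(out)


asterisks = "This is a row of 10 asterisks: ********** followed by text."
plus_sign = "Here's a single plus sign: + followed by text."

print(len(asterisks), len(rle_encode(asterisks)))  # 59 53
print(len(plus_sign), len(rle_encode(plus_sign)))  # 46 49
```

The second line of output is the whole problem in miniature: the rule that makes decompression reliable is the same rule that makes some inputs grow.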

Every compression algorithm faces this problem. My little example above was crafted to make it easy to show, but even the most advanced compression algorithm will have situations where compressing particular forms of data may cause the "compressed" data to be larger than the original.
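
The same effect can be seen with a real compressor. The sketch below uses Python's zlib module, which implements Deflate (the method most ZIP tools use), on data chosen to have no patterns at all. The seeded random bytes are just a stand-in I picked for "data with nothing left to squeeze out."

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(10_000))

packed = zlib.compress(noise, 9)  # level 9 = try hardest
print(len(noise), len(packed))

# Decompression still restores the input exactly.
assert zlib.decompress(packed) == noise
```

With nothing to exploit, the compressor can only store the bytes as they are and add its own bookkeeping, so the "compressed" size should come out a little larger than the 10,000 bytes that went in.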

Compressing already compressed data

One of the most common ways that compressed data can end up larger than the original is if the original is itself already compressed.

Let's look at the compressed version of my silly little example again:

This is a row of 10 asterisks: +10* followed by text.

What happens if we try to compress that data again? Well, as we saw, that single "+" sign is a problem and needs to be treated specially:

This is a row of 10 asterisks: +01+10* followed by text.

The result is that the "compressed" data, at 56 characters, is bigger than the 53 characters we started with. Compressing the already-compressed data made it larger.

This happens most reliably when you compress twice using the same algorithm, but if the compression techniques you're using are relatively efficient, then the algorithms don't matter as much. ZIPping something twice typically makes the second zip slightly larger than the first. Likewise, ZIPping a "RAR" file, which is also a compressed file, will typically result in something bigger than the original.
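
As a rough way to try this yourself, the Python sketch below compresses some text with zlib (an implementation of Deflate, the method most ZIP tools use) and then compresses the result a second time. The word list and the seeded random text are invented for the example; any ordinary text should behave similarly.

```python
import random
import zlib

rng = random.Random(1)  # fixed seed so the run is repeatable
words = ["zip", "file", "photo", "music", "video", "compress", "data", "size"]
text = " ".join(rng.choice(words) for _ in range(5_000)).encode("ascii")

once = zlib.compress(text, 9)   # first pass: plenty of repetition to remove
twice = zlib.compress(once, 9)  # second pass: input is already compressed

print(len(text), len(once), len(twice))

# Both layers unwrap back to the original text.
assert zlib.decompress(zlib.decompress(twice)) == text
```

The first pass should shrink the text a great deal. The second pass has almost nothing left to work with, so its size stays essentially the same, typically growing by a few bytes of overhead.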

With that as background, we can finally explain our answer to the question.

Pictures, music, and videos are already compressed

Pictures in popular formats such as .jpg, .png or .gif are already compressed.

Music files in formats like .mp3, .ogg, .aac, and so on are already compressed.

Video files in formats like .wmv, .m4v, .mov, and more are already compressed.

And as we've seen by now, depending on the type of compression you're using, compressing an already-compressed file at best does very little and at worst makes the file bigger.

So there's typically no space-saving advantage to ZIPping a photo, a movie, or an MP3.

ZIPping is more than compression

Of course, creating a .zip file is useful for more than just compression.

It can be handy to ZIP several files (for example, a collection of several pictures) together, combining them into a single file for a single download, attachment, or file transfer.

People still need to unZIP before they can see them, but it can be simpler to transfer only one file rather than several.
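
If bundling is the only goal, a ZIP file can even be created with compression turned off. The Python sketch below uses the standard zipfile module with ZIP_STORED (no compression) to combine three files into one archive held in memory. The file names and contents are hypothetical stand-ins for real pictures.

```python
import io
import zipfile

# Hypothetical stand-ins for already-compressed picture files.
pictures = {
    "beach.jpg": b"\xff\xd8 pretend JPEG data 1",
    "sunset.jpg": b"\xff\xd8 pretend JPEG data 2",
    "family.png": b"\x89PNG pretend PNG data",
}

buffer = io.BytesIO()  # in-memory file; a real path would work the same way
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as bundle:
    for name, data in pictures.items():
        bundle.writestr(name, data)  # stored as-is, no recompression attempted

# The recipient opens the single archive and gets every file back.
with zipfile.ZipFile(buffer) as bundle:
    names = bundle.namelist()
    restored = {name: bundle.read(name) for name in names}

print(names)
```

Skipping compression here is a design choice, not a requirement: since the pictures would not shrink anyway, storing them unchanged avoids spending time on a second compression pass that saves nothing.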

(This is an update to an article originally published November 21, 2004.)

Article C2229 - October 7, 2012

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.


Recent Comments
30 Comments
bill
October 9, 2012 10:04 AM

Earl:

All those riffs that you think could be replaced by one copy, cannot. There are differences in the playing of them every time, even though the sheet music might be the same.
Consider the Smoke on the Water riff. Even on the sheet music, it has variations, then you add in the ones that come from a real person playing it.

The problem with your other theory of a lossless way to compress MP3s is that the MP3 has already created losses and if you could find your magical program, you are just copying the losses. However, as Leo tried to explain (and did a much better job than most of the explanations I have seen), a compressed file looks to another compressor as relatively random stuff that doesn't compress well.
Don't ever expect that any program will compress any already compressed (even a lossless) file. FLAC compression is lossless and does a great job in compressing audio files that were not already compressed by something else. You could compress them more by converting them to an MP3 but what that would really be doing is restoring the uncompressed audio and throwing out parts of it that you think are not really noticeable as it compresses it again.

If you want to experiment with something more obvious than audio (a lot of the changes are very subtle), try using a program where you can convert a photo to jpg and adjust the compression. Pick a photo with some subtle shading of similar colors (like sky) or for a shocker, use some line art (like a screen capture of this page) to see how much the compression loses or modifies the original.

John.P
October 9, 2012 3:50 PM

Life is too short to be using MP3. Memory storage is cheap! Poor quality sound is like drinking cheap wine.

Digital Artist
October 9, 2012 10:16 PM

The algorithm described in the article is lossless, the ten asterisks are returned intact. Years ago pondering file compression I figured something like that independently. (I have re-invented the wheel hundreds of times, can't help it) I do a lot of intense graphics work, and I always save my files as bitmaps. (Or vector graphics if they are of that nature) but never as jpg. (Unless I get careless and do it by mistake, which happens sometimes) Disk space is not that precious to me anyway, and a lot of times I will attach a bmp to an email, then wonder if the recipient will wonder why I didn't send a jpg. If you save a 256 color image as a jpg and then test the count of colors in the picture, it will be in the tens of thousands. If you cut a figure out of the background by filling the background with white, then save the cut out figure as a jpg, and later try to paste that cut out figure over a new background, it won't work, because the white will have become a hundred or more 'shades' of white. I am now curious enough to test a compression program like 7zip on a bmp file, although I doubt I would use it even if it results in high compression and no loss of data. Maybe for archiving some old files to DVDs or something.... Thanks for giving me something to think about. It's great being a geek in the company of geeks. :)

Mark J
October 10, 2012 1:51 AM

@Digital Artist
.PNG format is a lossless format for saving photos in the same way .FLAC is a lossless format for compressing music. Now that storage and bandwidth are getting cheaper, many websites are using .PNG photos as they have a higher quality. .PNG format is a universal format which is understood by all browsers, image viewers and editors.

Sri
October 10, 2012 11:24 PM

ZIPping several less used files has a different benefit. It reduces the folder and file names clutter. And reduces the number of file handles (or inodes in Linux). File handles or inodes are the index entries for a file on the file system. And, these are not unlimited. Moreover, it can decrease the computer's performance/response when searching or using file explorers.

So, if you have many files that you use rarely, it is good to organize them in folders and ZIP each folder and then delete the files. Also, when you need to store a large number of files on a USB external hard disk, it reduces the backup time. Storing one or a few zip files takes substantially less time than storing thousands of files.