In the long run, I wouldn't really worry. Trying to verify that the file is copied correctly takes more time than simply keeping regular backups.

I have a lot of files, pictures and documents, and most of the time, I copy or move them on the hard drive or to DVD-ROM. My question is: how can I make sure that what I've copied or moved is exactly the same as the original? Most of the time, after I copy, I check the folder or File Properties and compare the Size, Size on disk, and Contains values - and you know, every time Size is always the same but Size on disk is sometimes different.

In this excerpt from Answercast #58, I look at the way files copy to disk and why you needn't really worry about it, especially if you have a backup!

Different size designations

So, it's interesting because I was just reading an article by a Microsoft techie who was discussing something very similar to this from a programming perspective.

His bottom line was there are many things that you can do to double-check that what has been written to the destination does match the source. And yet, for as much time as you might spend doing that, it's still possible that something could go wrong with the copy immediately afterward.

Backing up is best

So, my very first recommendation is that the best thing you can do to make sure that you never lose data is (as you might expect) to back up, and back up regularly.

Copying usually works

Now, in a case like this where you're copying files - to be honest, I'd just let the file copy work and assume it works. Remember that if there's a problem along the way, if the copy operation actually stumbles and runs into something that would cause a failure of the write operation, it's going to tell you when you do the copy. So, by the time the copy has completed and has completed successfully (without error messages), then there is a very, very high likelihood that the file has been copied: it's been copied completely, it's been copied entirely and it's been copied correctly.

You may not need to do or to take any additional steps.

Command line functions

Now, if you do, if you're a little extra paranoid about these kinds of things, there are two solutions that I will actually point you at. Unfortunately, they are both command line functions.

The xcopy command has an option, /v, which it calls "verify." So, if you learn how to use xcopy to copy files from one place to another, then when you specify this "verify" option, xcopy copies all of the files that you specified and then goes back and re-reads them. It makes sure that what it finds in the destination still matches what it originally read from the source file.
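To make the "copy, then re-read and compare" idea concrete, here's a minimal sketch in Python. It is not how xcopy itself is implemented; Python is simply my choice for illustration, and the function name copy_and_verify and the file names are made up. It copies one file and then compares the two copies byte for byte.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def copy_and_verify(source: Path, destination: Path) -> bool:
    """Copy one file, then re-read both copies and compare them byte for byte."""
    shutil.copy2(source, destination)
    # shallow=False makes filecmp compare the actual contents instead of
    # trusting matching size and timestamp information.
    return filecmp.cmp(source, destination, shallow=False)


if __name__ == "__main__":
    # Hypothetical demo file created in a throwaway folder.
    with tempfile.TemporaryDirectory() as folder:
        original = Path(folder) / "photo.jpg"
        original.write_bytes(b"\xff\xd8 pretend picture data")
        duplicate = Path(folder) / "photo-copy.jpg"
        print(copy_and_verify(original, duplicate))  # prints True
```

One caveat: a re-read done right after a copy may be answered from the operating system's cache rather than from the disk itself, so a match is strong evidence rather than absolute proof.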

Now, the other solution, the other approach that doesn't actually involve moving to the command line to do the copies, uses a command line utility called FC. Just the letter F and the letter C. That stands for "file compare" and it is a command line utility that will do pretty much exactly what you're looking for.

You tell it "compare the files that are here with the files that are there" and it will tell you if any of them are different. You have to be a little careful with File Compare: it assumes that you're using text files. You need to specify the /b option (for a binary comparison) if the files that you're looking at are not text - and most of the files you're probably dealing with, like your documents and your pictures, are definitely not text.

But it's the same idea. It will actually read both sets of files assuming that they're the same and if they're not, it lets you know.
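For the curious, the same "read both sets of files and report any that differ" idea can be sketched in a few lines of Python. This is only an illustration of the technique, not what FC does internally; the function name find_differences and the demo file names are hypothetical. It always compares in binary, so pictures and documents are handled the same way as text.

```python
import filecmp
import tempfile
from pathlib import Path


def find_differences(source_dir: Path, copy_dir: Path) -> list[str]:
    """List files from source_dir that are missing from, or differ in, copy_dir."""
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        copied_file = copy_dir / relative
        if not copied_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        # shallow=False forces a byte-for-byte comparison of the contents.
        elif not filecmp.cmp(source_file, copied_file, shallow=False):
            problems.append(f"different: {relative.as_posix()}")
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        originals = Path(folder) / "originals"
        copies = Path(folder) / "copies"
        originals.mkdir()
        copies.mkdir()
        (originals / "a.txt").write_bytes(b"same")
        (copies / "a.txt").write_bytes(b"same")
        (originals / "b.jpg").write_bytes(b"picture")
        (copies / "b.jpg").write_bytes(b"pictureX")  # simulated bad copy
        (originals / "c.doc").write_bytes(b"report")  # never copied
        print(find_differences(originals, copies))
        # prints ['different: b.jpg', 'missing: c.doc']
```

An empty list means every file in the original folder has an identical twin in the copy. Note that this sketch only checks in one direction; extra files that exist only in the copy are not reported.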

Different size on disk

Now, about the Size and Size on disk.

The interesting thing about the way Windows (and actually most operating systems) write information to disk is that they do not allocate space one byte at a time. They allocate space in terms of sectors or clusters. So a sector (usually, 512 bytes long) is allocated as an entity. If you have a one-byte file, it will take up 512 bytes on the disk; it will actually consume 512 bytes of disk space.

It will only use the first byte of that 512 bytes to store its data - and there's other information stored with the file that says it's "exactly one byte long; only look at the first byte" - but the entire 512-byte sector has been allocated to that file.

So, when you take a look at "file size" versus "size on disk," the two numbers are actually telling you two different things.

  • The "file size" is the true file size (it's the one byte in the case of my one-byte file example).
  • The "size on disk" is telling you how much space on disk has been allocated for that file (in my example, that would be 512 bytes even though it's a one-byte file).

So, you would see two different numbers. And in fact, those numbers might be different between two different types of disks because 512 bytes is just an example. It can be 512, it can be 1024, it can be 2048, it can be 4096, and there are larger cluster sizes as well.

The bottom line is that the cluster size is defined by the way the disk is formatted. In most cases, it's an option that was specified at the time the disk was formatted.

So what that means is that on that disk, a file will always have at least this much space taken up. And it will always grow by 512 bytes or by 1024 bytes or whatever the cluster size is. That could be the difference between the two places you're looking at. Specifically, if you're looking at one hard drive and a DVD, it's very reasonable to think that the cluster size of a large hard disk is going to be a different choice than the cluster size that you might find on something smaller like a CD or a DVD.
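The arithmetic behind "size on disk" is just rounding up to a whole number of clusters. Here's a small Python sketch of that calculation; the function name size_on_disk is made up, the 5,000-byte file is an invented example, and this is a simplified model (real file systems can deviate from it, for instance with compressed files).

```python
def size_on_disk(file_size: int, cluster_size: int) -> int:
    """Round file_size up to a whole number of clusters (simplified model)."""
    clusters_needed = -(-file_size // cluster_size)  # ceiling division
    return clusters_needed * cluster_size


if __name__ == "__main__":
    # Example cluster sizes taken from the ones mentioned in the article.
    for cluster_size in (512, 2048, 4096):
        print(cluster_size,
              size_on_disk(1, cluster_size),
              size_on_disk(5000, cluster_size))
    # 512 512 5120
    # 2048 2048 6144
    # 4096 4096 8192
```

The same 5,000-byte file reports three different "size on disk" values depending only on the cluster size, while its real size never changes.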

Bottom line is I wouldn't really pay much attention to the size on disk. I would pay attention to the specific file size and only the file size. And like I said, in the long run, I wouldn't really sweat trying to verify that the file was in fact copied correctly. If you don't get any errors, it probably was. And if the copy fails later, checking at the time of the copy wouldn't have caught that anyway!

That's why you want to make sure that you're backing up and backing up regularly so that if (or more specifically when) something finally does fail, you've got backup copies of everything that you care about.

Article C5882 - October 3, 2012

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.

4 Comments
Matt
October 3, 2012 10:44 PM

I would think hash values would be much easier than command line functions for average users. Usually, they are used to ensure a downloaded file was completed properly, but double checking files you copied is possible as well...

Leo
04-Oct-2012

I don't believe that there are any standard hashing programs included with Windows, and I wanted to stick with what everyone already has on their machines.

Ronny
October 4, 2012 3:55 PM

When used on the command line, it's XCOPY not X COPY. XCOPY has a lot of options, including /V to verify. Use:

XCOPY /?

to see all the options.

Andy
October 5, 2012 9:14 AM

Leo,

You mentioned that you were reading an article about this from a Microsoft person. I'm a programmer myself and I would be quite interested in reading the article, so could you provide the link to it if you still have it, please?

Thanks

Len C
October 5, 2012 10:25 AM

I use TeraCopy to copy files where I want immediate confirmation that it was done correctly. TeraCopy has an option to do an automatic hash value check as part of the copying process. Very nice program - and free.
