Windows file deletion can be a complex process that might not work as you expect. If you need to keep data secure, extra steps are likely required.
Is it reasonable to assume that recovering overwritten information is expensive?
How does Windows deal with a normal File Save?
Are there snags to password protecting a file?
The original question is much more detailed, but I want to take it and talk about what's reasonable to expect, or not expect, when deleting files and keeping your data secure on your hard disk.
I appreciate that a normal file delete simply removes the file name from the directory system and marks clusters as available for reuse. I also realize that, just as trying to stick one piece of paper over another identically sized piece will normally leave a small amount of the lower piece exposed, so overwriting a disk leaves small areas with the original magnetization. Is it reasonable to assume that recovering overwritten information is so expensive that it would only be attempted for disks storing very valuable information?
Personally, I believe that yes, that is a costly operation that would typically be out of the price range of most folks. Certainly companies and governments with motivation and money may elect to do so, but you're right: I'm sure it would only happen if they thought that there was something valuable to be found, or perhaps in the course of a criminal investigation.
What makes it costly is that the drive must be disassembled to gain access to that magnetic residue. That involves clean rooms and technology that can analyze the disk media at a much finer level of detail than the drive's own read/write heads.
It's also something that's relatively easy to thwart beforehand by using a secure delete utility such as SDelete, which deletes files and clears unused space by overwriting it, in multiple passes if you choose.
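To make the idea of "overwriting before deleting" concrete, here is a minimal Python sketch. It is not how SDelete itself is implemented; the function names `overwrite_in_place` and `overwrite_and_delete` are made up for illustration. It also assumes that writing to an existing file lands on the same clusters the file already occupies, which the file system does not promise, so treat it as an illustration of the concept rather than as a secure delete tool.

```python
import os
import secrets


def overwrite_in_place(path, passes=3, chunk_size=64 * 1024):
    """Overwrite a file's existing bytes with random data (illustrative only)."""
    size = os.path.getsize(path)
    # "r+b" opens the existing file for update without truncating it,
    # so the writes are aimed at the space the file already occupies.
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(secrets.token_bytes(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass out to the device


def overwrite_and_delete(path, passes=3):
    """Overwrite the file's contents, then remove its directory entry."""
    overwrite_in_place(path, passes=passes)
    os.remove(path)
```

Calling `overwrite_and_delete("notes.txt")` on a file of your own would scribble over its contents three times and then delete it. A dedicated utility is still the better choice for anything that matters, since it is written with the file system's behavior in mind.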
How does Windows deal with a normal File Save? Does it attempt to rewrite the file to the same clusters, simply returning excess cluster to the available pool if the new file is smaller than the original and adding a few new clusters if the new file is larger than the original?
We don't know. Or rather, we may know but we shouldn't count on it.
First, the way space is reused will vary a great deal between file system types. FAT32 will behave differently than NTFS, for example.
Second, however it works today, nothing says it won't change in the future as the file system is optimized for performance, stability, or any number of other reasons.
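Realize also that a "File Save" is rarely a direct write on top of the old file by the application. If that failed halfway through for any reason, you would have lost both the old and new versions of the file. What's much more common is a sequence like this: the new data is written to a temporary file, the old file is deleted (or renamed), and the temporary file is then renamed to the original file name.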
As you can see, where the new file is written has nothing whatsoever to do with where the old file happened to be. To the file system they're just two different files.
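A save that goes through a temporary file can be sketched in a few lines of Python. This is a generic pattern, not the code any particular application uses, and the function name `safe_save` is hypothetical. The point to notice is that the new contents go into a brand new file; the old file's clusters are simply released, not overwritten.

```python
import os
import tempfile


def safe_save(path, data: bytes):
    """Save by writing a temporary file, then swapping it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Step 1: write the new contents to a temporary file in the same folder.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-save-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Steps 2 and 3: the old file is dropped and the temporary file
        # takes over its name. os.replace does both in a single call.
        os.replace(tmp_path, path)
    except BaseException:
        # If anything failed, the old file is still intact; clean up the temp.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the write fails halfway, the original file is untouched, which is exactly why applications prefer this approach. The flip side is that the old version's data stays on the disk in clusters now marked as free.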
If every File Save is to a new area of disk, then what I am suggesting will obviously not work, but if clusters are reused as far as possible, then is this a feasible way for people to deal with small amounts of moderately sensitive data?
If this would work, how often should normal people repeat it?
As you say, since every file save is likely to be to a new area of the disk, this will not work. However, the idea also rests on another misconception that I want to address.
Filling a document with "=rand()" is making a huge assumption about how the application itself behaves. In this case, you're talking about Word and Excel, but really it applies to any application that saves files.
The assumption is that it's doing what you think it's doing.
What you think it's doing in this case is filling up that portion of the file where data used to be with the new data you've just specified.
That's a bad assumption.
Many applications have fairly complex file formats, and many also perform additional optimizations for speed. For example, it might be faster to grow the file with new data while leaving the old data in the file, but marked as "no longer used" - kind of like the file system itself in your first question. As you can see, your old data is still in there, and could possibly be extracted by using other tools to examine the file.
In fact, there have been many instances of exactly this happening and sensitive data coming to unexpected light because people didn't realize that Word behaved in just this manner.
Now, some applications may have options to disable this type of functionality (the feature in Word is called "Fast Save" and can be turned off in Options), but others may not.
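Here is a toy example in Python of a file format that grows on save instead of rewriting itself. It is an invented format for illustration only, not Word's actual file format, and the names `fast_save` and `current_text` are made up. Each save appends a length-prefixed record, and the "application" only ever shows the last one.

```python
def fast_save(blob: bytes, new_text: str) -> bytes:
    """Append the new version to the file rather than rewriting it."""
    record = new_text.encode("utf-8")
    return blob + len(record).to_bytes(4, "big") + record


def current_text(blob: bytes) -> str:
    """Return only the last record, which is all the application displays."""
    pos, last = 0, b""
    while pos < len(blob):
        length = int.from_bytes(blob[pos:pos + 4], "big")
        last = blob[pos + 4:pos + 4 + length]
        pos += 4 + length
    return last.decode("utf-8")


# Save a sensitive version, then "replace" it with something harmless.
blob = fast_save(b"", "Salary: 90,000")
blob = fast_save(blob, "Nothing sensitive here")

visible = current_text(blob)                # what the user sees in the application
still_there = b"Salary: 90,000" in blob     # what a raw look at the file finds
```

After the second save, `visible` holds only the harmless text, yet `still_there` is true: the earlier contents remain in the file's bytes for anyone who opens it with a different tool.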
Bottom line: don't rely on what you think the application does to erase sensitive data. If you really need to be absolutely sure that data is positively erased, then delete the file using a secure delete utility.
Are there snags to password protecting a file? I have only a few password protected files, and I protected them so long ago that I have forgotten how I did it. If I were to now password protect existing files, the file system would obviously only know about the password protected files, but would the old files still be in their original clusters?
There are so many snags to password protecting files that I basically recommend avoiding it. It's typically best thought of only as "keeping honest people honest".
The problem, of course, is that Windows does not provide a facility to password protect files. Windows assumes that multiple user accounts and access control will be used to restrict access to things on your hard disk, and that other means will be used when transferring data from machine to machine.
The result is that each application must add password protection on its own. So Word has password protection, and Excel has password protection, and other applications may have password protection. They might all be implemented the same way or not. They might be easy to crack (as many are), or they might not. Password protection might encrypt the data in your file, or it might not.
Once again, if what you have is truly sensitive, then I recommend avoiding application-specific password protection and moving to an encryption solution such as TrueCrypt.
And to come full circle, an added benefit of using a solution like TrueCrypt is that only encrypted data is physically written to the disk. That means that even the magnetic residue that might be recoverable is itself encrypted and thus so much useless noise without the encryption key.