Anti-malware software is amazingly tuned and optimized for doing what it does. On top of that, scanning all your files might not always be needed.

How in the world can my antivirus/antispyware/antimalware program possibly scan all of my files for the thousands of trojans/signatures out there without taking an eon to do so? Don't they have to scan every file on your computer (or at the very least the exes, zips, dlls and registry) sequentially for a trojan-name and/or each signature? I can only presume they must do this one trojan-name/signature at a time, and then repeat. I can't fathom how it can be done so quickly, relatively speaking, given the task at hand. Heck - just a manual search for one or two obscure files on my computer can take me almost as long to find them - if I even do!

And here I was thinking that the virus scans take forever, and you're wondering how they can be so fast! It's all a matter of perspective, I suppose.

The short answer is that sometimes it does take a really long time. But there are techniques that scanners use to dramatically speed up the process, or at least make it look that way.

In addition, not everything is, in fact, a scanner.

Time for some explanation of how anti-malware software typically works.

I want to throw out an entire class of anti-malware software before we even start: those that don't scan files at all.

Much of what we classify as "anti-spyware" software doesn't scan files. In fact, that's one of the semi-accurate rules of thumb that differentiate anti-spyware tools from anti-virus tools (though the line continues to blur over time).

Rather than scanning files, these tools monitor behavior. For example, they might watch for attempts to reset your browser home page as they happen. If things look kosher, they allow the change; if not, they alert you. No scanning is involved; they just hook into places where spyware-like behavior is likely to happen and keep an eye out.

I'm not saying that anti-spyware software never scans (or that anti-virus tools never watch behavior); I'm just saying that a large portion of what anti-spyware software does doesn't involve scanning at all.
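
To make the "hook in and keep an eye out" idea concrete, here's a minimal sketch in Python. Everything in it is hypothetical: real tools hook into the operating system or the browser, not a Python object, and the names `GuardedSettings` and `TRUSTED_HOME_PAGES` are made up purely for illustration.

```python
# Hypothetical sketch: watch a setting change as it happens instead of scanning files.
# A plain Python object stands in for the operating system or browser hooks.

TRUSTED_HOME_PAGES = {"https://example.com/"}  # assumed allow-list, illustration only


class GuardedSettings:
    """Holds settings and checks every attempted change against a simple policy."""

    def __init__(self, initial):
        self._values = dict(initial)
        self.alerts = []  # record of blocked attempts

    def get(self, key):
        return self._values[key]

    def set(self, key, value, requested_by):
        # The "hook": every change passes through here before it takes effect.
        if key == "home_page" and value not in TRUSTED_HOME_PAGES:
            self.alerts.append(f"{requested_by} tried to set {key} to {value}")
            return False  # looks like spyware behavior: block it and alert
        self._values[key] = value
        return True  # looks fine: allow it


settings = GuardedSettings({"home_page": "https://example.com/"})
allowed = settings.set("home_page", "http://hijack.example/", "toolbar.exe")
```

Notice that no file was ever read. The check only runs at the moment something tries to make the change, which is why this style of protection costs so little time.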

So I'll focus on anti-virus software, which typically does scan.

As the question outlines, you would think that a complete scan would involve two things:

  • Read the contents of every file on your hard drive (or whatever media is being scanned).

  • Compare the entire contents of each file against the pattern of every known virus.

In other words, one heck of a lot of work.

Fortunately there are several shortcuts that anti-virus software can take.

  • Full scans can happen in the background. In reality, a full scan typically happens in the background as you're doing other things. As a result, you might not realize just how long the scan is taking. A good scanner will prioritize its work so that it doesn't impact what you're doing, but still gets its job done. It's not uncommon for this type of scan to take hours, and if it's any good, you'd never notice.

  • Full scans can be scheduled for when you're not using your computer. Once again, in the "so you'd never notice it" category, a full scan could be scheduled to happen automatically in the middle of the night (assuming you leave your computer on), or at some other time that's appropriate. It could take a long time, but if you don't see it, did it matter?

  • Full scans might not be needed after the first. After you install your anti-virus software, it typically does one full scan shortly thereafter. Theoretically, as long as it then monitors all the files that arrive on your machine as they come in, and any changes to the files that are already on your machine, another full scan isn't really necessary. There's often no need to re-scan an old file that's never changed. Many anti-virus products' default configurations use exactly this model.

  • It might not scan every file. In reality, not all file types are likely to carry viruses. ".exe" and ".dll" files are typical targets, but ".dat", ".chm" or even ".leo" files typically are not. That's not to say that they couldn't contain a virus, just that there's typically no way for that virus to be run. Virus scanners can take advantage of that and not bother scanning many types of files at all. Once again, this is typically an option that is set by default.
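
The last two shortcuts can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the extension list is made up, the helper name `needs_scan` is hypothetical, and a real product would watch file changes as they happen rather than trust a file's size and timestamp alone, since malware can fake those.

```python
import os

# Assumed list of "can actually run" file types; real products ship their own lists.
RISKY_EXTENSIONS = {".exe", ".dll"}


def needs_scan(path, seen):
    """Return True if the file should be scanned now.

    `seen` maps a path to the (size, modified-time) pair recorded at its last scan.
    """
    extension = os.path.splitext(path)[1].lower()
    if extension not in RISKY_EXTENSIONS:
        return False  # shortcut: this file type isn't worth scanning

    info = os.stat(path)
    fingerprint = (info.st_size, info.st_mtime_ns)
    if seen.get(path) == fingerprint:
        return False  # shortcut: unchanged since the last scan

    seen[path] = fingerprint  # remember it; the caller scans the file now
    return True
```

The first call for a new ".exe" returns True. Later calls return False until the file's size or modification time changes, and a ".dat" file is skipped every time.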

The other part of this scenario is that the actual algorithms used to perform the scan aren't as brute force as we might think.

Let's say there are 100,000 virus definitions that your anti-virus software needs to look for. The scan is most certainly not 100,000 cases of "is it this one?", repeated for each file being scanned. That really would take forever.

In reality, the data and the patterns that make up the various virus signatures are optimized and stored in such a way that, in a sense, the scanner's actually looking for almost all the viruses at once. It's difficult to describe without getting all geeky, or even computer science-y, but I'm sure there's a lot of math, organization and optimization around setting up anti-virus databases in such a way as to optimize for the fastest and most complete scan possible. I'd bet it rivals the complexity of encryption in many ways.
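
For the curious, one well-known technique of this kind is the Aho-Corasick algorithm, which merges many patterns into a single structure and then reads the data just once. I'm not claiming that any particular anti-virus product works exactly this way; this is a minimal Python sketch of the general idea, and the signature names and byte patterns are made up.

```python
from collections import deque


def build_automaton(signatures):
    """Build an Aho-Corasick automaton from a dict of {name: non-empty byte pattern}."""
    goto = [{}]    # goto[state][byte] -> next state
    fail = [0]     # fail[state] -> state to fall back to on a mismatch
    found = [[]]   # found[state] -> names of signatures that end at this state

    # Step 1: merge all the signatures into one trie.
    for name, pattern in signatures.items():
        state = 0
        for byte in pattern:
            if byte not in goto[state]:
                goto.append({})
                fail.append(0)
                found.append([])
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        found[state].append(name)

    # Step 2: breadth-first pass that sets the fallback links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, nxt in goto[state].items():
            queue.append(nxt)
            fallback = fail[state]
            while fallback and byte not in goto[fallback]:
                fallback = fail[fallback]
            fail[nxt] = goto[fallback].get(byte, 0)
            # A match ending at the fallback state also ends here.
            found[nxt] = found[nxt] + found[fail[nxt]]
    return goto, fail, found


def scan(data, automaton):
    """Walk the data once and return the set of signature names that appear in it."""
    goto, fail, found = automaton
    hits = set()
    state = 0
    for byte in data:
        while state and byte not in goto[state]:
            state = fail[state]
        state = goto[state].get(byte, 0)
        hits.update(found[state])
    return hits


# Made-up signatures for illustration; real ones are far more elaborate.
SIGNATURES = {
    "Fake.Alpha": b"EVIL",
    "Fake.Beta": b"VILLAIN",
    "Fake.Gamma": b"LAIN",
}
automaton = build_automaton(SIGNATURES)
hits = scan(b"...some file with EVILLAIN bytes...", automaton)
```

Here `hits` ends up containing all three made-up names, even though the data was read only once, front to back. The up-front work of building the automaton happens once per signature database, not once per file, which is where the "almost all the viruses at once" effect comes from.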

I'd also bet it's one of the key differentiators in anti-virus software.

So, in a nutshell: not everything's a scanner, you might not notice full scans, full scans might not even be needed, and the actual technology of a scan is much faster than you might think.

Me, I'm just glad there are smart people in the world who are writing these critical pieces of our security infrastructure.

Article C3740 - May 22, 2009

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.

2 Comments
umair
May 22, 2009 11:28 PM

Could fast scans be a bad thing? NOD32 does a full scan 5 times faster than Avast.

But...

I scanned Omea Pro.exe (an insanely powerful free RSS reader) with NOD32 v4. The log shows it tried to scan various .dll, .txt, and .exe objects, but against each it said something like "OmeaSetup-2.2.1.exe NSIS StartMenu.dll - decompression could not complete (possible reasons: insufficient free memory or disk space, or a problem with temp folders)"

I have gigs of free space, and 1 GB of my RAM was free when I ran the scan. So could it be part of the complex algorithm? How do I ensure that a particular file is actually scanned? There seems to be no way I could make NOD32 scan that file.

Philip Roe
May 26, 2009 10:31 AM

A simple principle which explains how so many things can be done very fast is putting things in alphabetical order. Suppose I have a book written in a foreign language which uses our alphabet, and I want to know if it contains any English words. (An English word is any of the 100,000 or so which are on a list.) I look at each word in the foreign book, and for each the question is: is it on the English list? That does not mean comparing it with 100,000 others. My English list is a dictionary. Finding out whether "reciept", say, is in the dictionary (which we might have to do if we are not sure how to spell "receipt") just means finding it if it is in, or finding the place where it would be if it isn't. This takes far fewer lookups: 17, in fact, which is about the log to base 2 of the number of words in the dictionary.
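
The dictionary lookup described in this comment is a binary search. Here's a minimal Python sketch of it that counts the lookups; the 100,000-entry "dictionary" is a made-up stand-in (zero-padded numbers rather than real words), and the function name `lookups_needed` is hypothetical.

```python
def lookups_needed(sorted_words, target):
    """Binary-search a sorted list; return (found, number of entries examined)."""
    low, high = 0, len(sorted_words)
    lookups = 0
    while low < high:
        middle = (low + high) // 2
        lookups += 1
        if sorted_words[middle] == target:
            return True, lookups
        if sorted_words[middle] < target:
            low = middle + 1   # target can only be in the upper half
        else:
            high = middle      # target can only be in the lower half
    return False, lookups


# A stand-in "dictionary" of 100,000 sorted entries (even numbers only).
dictionary = [f"{n:06d}" for n in range(0, 200000, 2)]
found, lookups = lookups_needed(dictionary, "000001")  # an entry that isn't there
```

Each lookup halves the range left to search, so with 100,000 entries the loop never examines more than 17 of them, whether or not the target is on the list.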
