People (such as tech journalists and product reviewers) often ask us how our scanning engines work, and what the difference is between signature engines and other types of scan engines. In fact, we were asked such a question just last week. So, let’s explore the topic in depth.
Signature-based scanning refers to the practice of checking a full-file hash or a series of partial-file hashes against a list or database, in order to obtain a verdict on an object. This is roughly where antivirus began, back in the 1980s. The emergence of polymorphic malware in the early 1990s was the catalyst that spurred an evolution from the signature-based approach to more complex file scanning engines.
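To make the idea concrete, here is a minimal sketch in Python of a full-file and partial-file hash lookup. The `SignatureDB` class, the detection names, and the choice of SHA-256 are assumptions made for illustration; this is not how any particular engine is implemented. It also shows why a one-byte change is enough to slip past a full-file hash.

```python
import hashlib
from typing import Dict, Optional, Tuple


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a buffer as a hex string."""
    return hashlib.sha256(data).hexdigest()


class SignatureDB:
    """Hypothetical signature store: digests mapped to detection names."""

    def __init__(self) -> None:
        self.full: Dict[str, str] = {}
        # Partial signatures are keyed by (offset, length, digest of that slice).
        self.partial: Dict[Tuple[int, int, str], str] = {}

    def add_full(self, sample: bytes, name: str) -> None:
        self.full[sha256_hex(sample)] = name

    def add_partial(self, sample: bytes, offset: int, length: int, name: str) -> None:
        chunk = sample[offset:offset + length]
        self.partial[(offset, length, sha256_hex(chunk))] = name

    def scan(self, data: bytes) -> Optional[str]:
        """Return a detection name, or None if nothing matches."""
        name = self.full.get(sha256_hex(data))
        if name is not None:
            return name
        for (offset, length, digest), name in self.partial.items():
            chunk = data[offset:offset + length]
            # Skip buffers too short to contain the signed region.
            if len(chunk) == length and sha256_hex(chunk) == digest:
                return name
        return None


db = SignatureDB()
sample = b"MZ" + b"\x90" * 64 + b"payload-v1"
db.add_full(sample, "Example.A")

variant = sample[:-1] + b"2"   # the same sample with one byte changed
print(db.scan(sample))         # Example.A
print(db.scan(variant))        # None: the full-file hash no longer matches

# A partial-file hash over the unchanged first 66 bytes still catches it.
db.add_partial(sample, 0, 66, "Example.A.gen")
print(db.scan(variant))        # Example.A.gen
```

Because `scan` takes a plain `bytes` buffer, the same lookup works on a file's contents, a memory region, or a reassembled network stream.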
Endpoint protection solutions include file scanning engines. They’re not really just for scanning files, though. Give them any sort of input buffer, such as a piece of memory or a network stream, and they’ll do their job.
File scanning engines have become very sophisticated. They include archive traversal mechanisms, parsers for multiple file formats, static and dynamic unpackers, disassemblers, and emulators capable of running both scripts and executable formats. Today’s detections are really just complex computer programs, designed to perform intricate sample analysis directly on the client. Modern detections are designed to catch thousands, or even hundreds of thousands, of samples. That is a far cry from the one-hash-per-sample approach of the old days.
As you might imagine, it takes time to create sophisticated detections. An analyst must collect samples, inspect them, write code, and test, before finally releasing a detection to customers. Fairly simple signature-based detections can, on the other hand, be generated easily by automation. As new samples arrive, they are run through a series of static and dynamic analysis tools and rule engines in order to quickly deliver a verdict.
Hence, when a new threat emerges, back-end automation kicks in to cover early samples while the analysts get to work writing proper detections. Since today’s software can quickly and easily perform hash lookups over the Internet, these simple detections are not even delivered as part of a local database update. This cloud-lookup mechanism has an added benefit in that it allows us to protect customers against emerging threats very quickly, regardless of when they emerge.
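The cloud-lookup idea can be sketched roughly as follows. Everything here is hypothetical: `CloudReputationClient` is an invented name, the `query` callable stands in for whatever network request a real product makes, and a plain dictionary plays the role of the back end. The point is only that a verdict added on the back end reaches the client on its next lookup, with no local database update.

```python
import hashlib
from typing import Callable, Dict, Optional


class CloudReputationClient:
    """Hypothetical client that asks a remote service for a verdict by hash."""

    def __init__(self, query: Callable[[str], Optional[str]]) -> None:
        # `query` takes a hex digest and returns a verdict, or None if unknown.
        self._query = query
        self._cache: Dict[str, str] = {}

    def verdict_for(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        cached = self._cache.get(digest)
        if cached is not None:
            return cached
        try:
            verdict = self._query(digest)
        except OSError:
            # Offline or network error: fall back to "unknown" and retry later.
            return "unknown"
        if verdict is None:
            # Do not cache "unknown", so a later back-end update is picked up.
            return "unknown"
        self._cache[digest] = verdict
        return verdict


# A dictionary stands in for the back end in this sketch.
backend: Dict[str, str] = {}
client = CloudReputationClient(backend.get)

sample = b"freshly seen sample"
print(client.verdict_for(sample))   # unknown

# Back-end automation publishes a verdict; the client needs no update.
backend[hashlib.sha256(sample).hexdigest()] = "malicious"
print(client.verdict_for(sample))   # malicious
```

Only definitive verdicts are cached in this sketch. That is a design choice made for the example, so that an object which was unknown a minute ago is re-checked on the next lookup.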
But that’s not the whole story.
All modern endpoint protection solutions utilize multiple mechanisms to keep customers protected. The following is a very simple picture of how endpoint protection works today.
- URL blocking. Preventing a user from being exposed to a site hosting an exploit kit or other malicious content negates the need for any further protection measures. We do this largely via URL and IP reputation cloud queries. Spam blocking and email filtering also happen here.
- Exploit detection. If a user does manage to visit a site hosting an exploit kit, and that user is running vulnerable software, any attempt to exploit that vulnerable software will be blocked by our behavioral monitoring engine.
- Network and on-access scanning. If a user receives a malicious file via email or download, it will be scanned on the network or when it is written to disk. If the file is found to be malicious, it will be removed from the user’s system (for instance, by moving it to quarantine).
- Behavioral blocking. Assuming no file-based detection existed for the object, the user may then go on to open or execute the document, script, or program. At this point, malicious behavior will be blocked by our behavioral engine and, again, the file will be removed. The fact is, a majority of malware delivery mechanisms are easily blocked behaviorally. In most cases, when we find new threats, we also discover that we had, in the distant past, already added logic addressing the mechanisms they use.
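The layers above can be pictured as a chain of checks in which the first layer that recognizes a threat stops it. The following Python sketch is a deliberately simplified assumption: the `Event` fields, the reputation sets, and the layer functions are all invented for illustration, and real products evaluate these layers at different points in time rather than in a single loop.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Event:
    """Hypothetical summary of one attack attempt as seen by the endpoint."""
    url: str
    file_hash: str
    exploits_vulnerability: bool = False
    behaviors: Set[str] = field(default_factory=set)


# Illustrative reputation data; real products query these from the cloud.
BAD_URLS = {"http://exploit-kit.example/landing"}
BAD_HASHES = {"deadbeef"}
BAD_BEHAVIORS = {"encrypts_user_documents"}


def url_blocking(e: Event) -> bool:
    return e.url in BAD_URLS


def exploit_detection(e: Event) -> bool:
    return e.exploits_vulnerability


def on_access_scan(e: Event) -> bool:
    return e.file_hash in BAD_HASHES


def behavioral_blocking(e: Event) -> bool:
    return bool(e.behaviors & BAD_BEHAVIORS)


LAYERS: List[Tuple[str, Callable[[Event], bool]]] = [
    ("URL blocking", url_blocking),
    ("Exploit detection", exploit_detection),
    ("Network and on-access scanning", on_access_scan),
    ("Behavioral blocking", behavioral_blocking),
]


def first_blocking_layer(event: Event) -> Optional[str]:
    """Return the name of the first layer that stops the event, if any."""
    for name, check in LAYERS:
        if check(event):
            return name
    return None


# A brand-new threat: unknown URL, unknown hash, but familiar behavior.
new_threat = Event(
    url="http://never-seen.example/invoice.doc",
    file_hash="0123abcd",
    behaviors={"encrypts_user_documents"},
)
print(first_blocking_layer(new_threat))  # Behavioral blocking
```

In this sketch the new threat has no URL or file-based detection at all, yet it is still stopped, because the behavior it relies on was already covered.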
Antivirus software of yore, with its nightly disk-grinding scheduled scans, has evolved into the latest generation of endpoint protection used today. One of the best ways to protect endpoints against modern threats is to prevent threats from making contact with their victims in the first place. Failing that, utilizing a multi-pronged approach to block common attack vectors ensures that multiple opportunities exist to stop attacks in their tracks.
File scanning is just one of the many mechanisms that “AV vendors” use to protect endpoints. Because our exploit detection and behavioral blocking mechanisms usually cover the actual attack vectors well, we often don’t bother adding file-based detections (i.e., static signatures) for every new threat. And remember, at the end of the day, we always test our protection components against real-world threats using our entire product, not just individual pieces of it.