Everything you wanted to know about AV sampling but were afraid to ask

If you’ve been paying any attention to geopolitical news of late, you might have heard about the storm around Kaspersky. The Russia-based antivirus firm has been responding to allegations that it collected top secret NSA files from a customer machine and shared them with Russian intelligence agencies. For its part, Kaspersky maintains its innocence. They contend that they collected the files as part of normal operations, but then deleted said files from their systems at the direction of the CEO.

The issue has prompted questions about antivirus vendors in general. We’ve received and answered some of these questions ourselves. Why does antivirus collect files from customer machines? What happens to those files and how are they protected?

We asked our own Mikko Hypponen, Chief Research Officer at F-Secure, many of these questions in the first episode of our brand new podcast, Cyber Security Sauna. Be sure to listen as Mikko explains how and why we handle customer data and files, and why it’s important to trust your vendor.

In addition, here are some of the questions we’ve been getting, along with our answers.

Why are any files ever sent from customer machines to AV providers?

This is the way most antivirus software operates today. Components on the customer machine are able to perform a fairly exhaustive structural analysis of malware. However, performing deep analysis of malicious files requires steps like detonating the sample in a controlled sandbox environment. These sorts of things can’t be done on the customer end.

Put simply, if our software encounters a suspicious sample on a customer’s system that we’ve never seen before, and the software cannot reach a verdict on its own about whether the file is malicious, that sample may be uploaded to our cloud for further analysis.
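As a rough illustration of that decision, here is a minimal sketch in Python. The names `Verdict`, `file_hash`, and `should_upload` are hypothetical and not taken from any F-Secure product; the sketch only encodes the two conditions described above (the sample is new to the vendor, and local analysis is inconclusive).

```python
import hashlib
from enum import Enum


class Verdict(Enum):
    """Possible outcomes of on-device analysis (hypothetical names)."""
    CLEAN = "clean"
    MALICIOUS = "malicious"
    UNKNOWN = "unknown"


def file_hash(content: bytes) -> str:
    """Return the SHA-256 hex digest used to look a file up by hash."""
    return hashlib.sha256(content).hexdigest()


def should_upload(local_verdict: Verdict, seen_before: bool) -> bool:
    """A sample is an upload candidate only when it is new to the vendor
    AND the client could not reach a verdict on its own."""
    return local_verdict is Verdict.UNKNOWN and not seen_before
```

Under these assumptions, a file the client has already classified, or one the cloud already knows, never leaves the machine.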

Cloud technology enables better, faster protection because once the security cloud determines that the suspicious file is in fact malicious, it can immediately protect all our other customers as well.

How do we secure the files that are transmitted from customer machines?

Encryption. We use HTTPS with certificate pinning to protect against man-in-the-middle attacks. And we anonymize everything. So although we have a particular file, we won’t know whose machine it came from. All queries regarding files (hashes) or URL reputation made to our “security cloud” are also encrypted and anonymized.
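For readers unfamiliar with certificate pinning, the core idea can be sketched in a few lines of Python: the client ships with the fingerprint of the certificate it expects and refuses to talk to a server that presents anything else. The function names below are hypothetical, and pinning the SHA-256 of the whole certificate is an assumption made for illustration; real products may pin the public key instead.

```python
import hashlib
import hmac
from typing import Iterable


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert: bytes, pinned_fingerprints: Iterable[str]) -> bool:
    """Return True only if the presented certificate matches a known pin.

    compare_digest performs a constant-time comparison of the two
    hex strings.
    """
    presented = cert_fingerprint(der_cert)
    return any(hmac.compare_digest(presented, pin) for pin in pinned_fingerprints)
```

In a real client, the DER bytes would come from the TLS connection itself, for example via `ssl.SSLSocket.getpeercert(binary_form=True)` in Python's standard library. A man-in-the-middle with a different certificate, even one signed by a trusted authority, would fail this check.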

Altogether, what files and data are transmitted from the customer to us?

It depends on the product and the settings used. You can find out more information about what data we collect in our data declaration document (stay tuned for an updated version of this document, in progress). You can also read our privacy principles.

Assuming the customer is using our most modern solutions with the proper settings, that is, getting the full benefit of our security cloud, we get the following:

Metadata derived from performing a file execution trace on the client (sent only under certain conditions, and only for samples deemed malicious)
File prevalence information (a counter in our backend is incremented every time the hash of a file is queried)
Unknown suspicious/malicious URLs. These are normalized (personal information is stripped from the string). URL lookup queries are performed on a hash of the normalized URL string
Under certain conditions, suspicious or malicious executable files are uploaded to our security cloud for further analysis
Samples that are submitted manually by customers via our web form
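The file prevalence item in the list above is easy to picture in code. The following is a minimal in-memory sketch; the class name `PrevalenceStore` is hypothetical, and a real backend would use a persistent store. Note that the counter is keyed only by file hash, so nothing about who made the query is recorded.

```python
from collections import Counter


class PrevalenceStore:
    """Counts how often each file hash has been queried (illustrative only)."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def query(self, file_hash: str) -> int:
        """Record one lookup for this hash and return the updated count."""
        self._counts[file_hash] += 1
        return self._counts[file_hash]
```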

We do NOT get:

Any information that would enable us to identify which user or machine the data came from. So, we do not know which files are from which user, which users have executed which files, or which URLs have been visited by which users
Any data that we don’t actually need in order to better protect our customers. Any data not relevant to protecting customers is either not collected, or discarded as soon as possible

We normalize and anonymize as much data as possible on the client before sending it to our back end.
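To make the URL case concrete, here is a minimal Python sketch of client-side normalization followed by hashing. Which parts of a URL count as personal information is an assumption here: this sketch drops embedded credentials, the query string, and the fragment, and the function names are hypothetical. The actual normalization rules used by the product are not described in this post.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Strip parts of the URL likely to carry personal data (assumed:
    userinfo, query string, fragment) and lowercase the host."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme.lower(), host, parts.path, "", ""))


def url_lookup_key(url: str) -> str:
    """Hash of the normalized URL; only this value would be sent in a query."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()
```

With this sketch, `https://alice:secret@Example.com/login?user=alice#top` normalizes to `https://example.com/login`, so two users visiting the same page with different session parameters produce the same lookup key.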

What happens to a file that is uploaded for analysis?

Files uploaded for analysis are first processed in a cloud-based virtual environment. In most cases, this processing yields a verdict that is relayed back to the client that uploaded the file. At this point, the sample is discarded by our systems. If analysis in the cloud-based virtual environment doesn’t yield a definitive verdict, the sample may be forwarded for further analysis. In these cases, the file will be kept in our backend for a limited time while it is processed. Any files arriving in our backend via this mechanism receive special “confidential” flags that enforce limited access and prevent the sample from being shared with other systems. Once analysis is complete, the sample is discarded.
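The lifecycle described above can be summarized as a short Python sketch. Everything here is illustrative: `Sample`, `process_upload`, and the two analysis callbacks are hypothetical names, and the real pipeline is certainly more involved. The point is the ordering: sandbox first, further analysis only if needed, confidential throughout, and the content discarded at the end.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Sample:
    sha256: str
    content: Optional[bytes]
    confidential: bool = True  # customer uploads start out confidential


def process_upload(
    sample: Sample,
    sandbox: Callable[[Sample], Optional[str]],
    further_analysis: Callable[[Sample], str],
) -> str:
    """Return a verdict for the sample, then discard its content."""
    verdict = sandbox(sample)  # cloud-based virtual environment
    if verdict is None:
        # No definitive verdict: the sample is kept only while
        # further analysis runs.
        verdict = further_analysis(sample)
    sample.content = None  # sample discarded once analysis is complete
    return verdict
```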

What are the samples we get from customers?

Only executable file samples are uploaded for further analysis in this way.

Do we ever share copies of these files with VirusTotal, law enforcement agencies, or intelligence agencies, domestic or foreign?

All customer file submissions are expressly categorized as confidential, meaning we do not share these with anyone. A file is only re-categorized if we find that it’s also out there in the wild – it’s not unique to one customer’s machine.

We do not submit files to VirusTotal. We do share samples with trusted partners, but only samples which are classified as non-confidential. Law enforcement agencies share with us, seeking our analysis, not the other way around. We share some threat intelligence with organizations such as CERT-FI (for example, C&C information or analysis of malware targeting specific targets within a country), which might then be forwarded on to the appropriate law enforcement agency. In summary: info sharing, not sample sharing. Sample requests might occur, but only non-confidential samples are shared.
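The sharing rule can be expressed as a simple filter. This Python sketch uses hypothetical names (`SampleRecord`, `shareable_with_partners`) and treats "seen in the wild" as a plain flag; how that determination is actually made is not described here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SampleRecord:
    sha256: str
    from_customer: bool
    seen_in_wild: bool = False

    @property
    def confidential(self) -> bool:
        """Customer submissions stay confidential unless the same file
        is also found in the wild."""
        return self.from_customer and not self.seen_in_wild


def shareable_with_partners(records: Iterable[SampleRecord]) -> List[str]:
    """Only non-confidential samples are ever eligible for sharing."""
    return [r.sha256 for r in records if not r.confidential]
```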

For a user, is sharing suspicious files with our researchers optional? If so, do users “opt in” or must they “opt out”?

Installing our product involves “opt in” to some extent. We take care to link our customers to our privacy policies and principles. Our security cloud is something that consumers can opt out of, though this reduces the effectiveness of our products. In summary: opt in to antivirus software, opt out of particular features. We even provide an option to opt out of data collection in our free online scanner.

Melissa Michael
