Threat modelling geospatial machine learning systems
Machine learning models are set to play an increasing role in aiding decision-making processes in both government and commercial organizations in the years to come. One noteworthy area where this is likely to happen is the geospatial domain, where information obtained from GPS devices and from satellite and aerial imagery is used to make both strategic and business decisions. It is thus important to understand how models in this domain stand up to adversarial attack and how trustworthy their outputs are.
In April 2021, F-Secure conducted a threat analysis study of machine learning models in the geospatial domain. We investigated several possible attacks and attack goals and proposed mitigations against them. This article describes the outcome of that work.
Geospatial applications
Typical applications that utilize geospatial data are presented in Table 1. All of these use cases can potentially benefit from improvements provided by machine learning.
Given that machine learning will likely be used in an increasing number of geospatial data use cases going forward, the importance of security for both the systems that perform these tasks and the machine learning models utilized by those systems cannot be overstated. The consequences of malfunctioning or compromised models in this domain can vary based on what their intended applications are. For example, if a system designed to monitor the structural integrity of a bridge provides incorrect information, maintenance may be delayed (causing safety issues) or premature maintenance may be scheduled (causing unnecessary costs). In the risk management field, a malfunctioning or compromised model can lead to large financial losses for its users. Finally, if a model designed to classify land usage malfunctions, it may lead to decisions causing irreversible damage to natural resources.
The next section examines potential security threats to a machine learning-based geospatial application and proposes mitigations to those threats.
Use case: business intelligence from satellite imagery
Setting the scene
Since many geospatial applications utilize aerial or satellite imagery, we decided to examine a simple use case involving a machine learning model that processes such imagery for the purposes of business or competitor intelligence. Our use case depicts a machine learning model designed to count the number of cars in a satellite image of a car park. Information provided by this model can be used by business analysts to evaluate the sales performance of a retail store, to predict sales numbers and, in turn, to estimate the company’s future stock market performance. Compromise of such a system can lead to negative outcomes for its users, such as incorrect sales performance estimates and incorrect stock price predictions. The adversary in this case may be a malicious actor or a competitor intent on disrupting the business goals of the system’s primary user. In order to understand how such a model might be attacked, we examine the following:
- in what ways can the machine learning model be compromised?
- what is the ease or likelihood of the compromise?
- what is the impact of the compromise?
Although our example application involves counting cars, it generalizes to other use cases where imagery is the primary source of data, such as quantifying deforestation or urban land usage or monitoring marine traffic through a canal or shipping strait.
The players
Before diving into technical details, we first introduce the players involved in the creation and use of the machine learning system:
- service provider: the company that provides the trained machine learning model as a service.
- model user: the user of the model (sourced from a service provider or built in-house). In our use case the model user is a business analyst who will use the model’s output to evaluate the performance of a retail store.
- model subject: the parties affected by the output of the model. In our use case, the model subject is the company that owns the retail store.
Besides the service provider, model user and model subject, other potential parties of interest are competitors to the service provider (who have the motivation to degrade the service provider’s service) and competitors to the model subject (who have the motivation to compromise predictions related to their competition).
Model description
The machine learning model can be summarized as follows:
- model purpose: business intelligence derived from car counts in satellite imagery.
- training data: labelled satellite imagery from private and public data sources.
- inference data: unlabelled satellite imagery from private and public sources.
- model output: prediction of the number of cars in a satellite image.
The whole pipeline – from training data to serving the model over an API – is summarized in Figure 1.
Threat analysis
The scope of our threat analysis is limited to the machine learning aspects of the system – the data, the model, and the software used to process data, implement the model, and implement training and inference processes. It does not aim to provide a complete overview of all possible security threats to the studied system. Relevant threats will likely target the following:
- training phase: where the model is trained on data from private and public data sources.
- inference phase: where the trained model provides predictions for given input data.
We identified the following goals as results of a successful attack:
- Attack Goal 1: An adversary seeks to compromise the accuracy of the model (and thus service quality). This can be achieved by manipulating training data, manipulating the pre-trained model, or through traditional software attacks.
- Attack Goal 2: An adversary wishes to perform a targeted attack, causing the model to generate inaccurate predictions for specific data (e.g., data related to a particular location or company). This can be achieved by manipulating the input data used in inference, by backdooring the pre-trained model, or through traditional software attacks.
These attack goals, which will be discussed in more detail in the next sections, are summarized in Table 2. Techniques are labelled according to MITRE’s Adversarial ML Threat Matrix [1] terminology.
Attack Goal 1: Lower the accuracy of car counting via geospatial imagery
In this attack, an adversary’s goal is to compromise the system such that the model outputs inaccurate counts of cars in input images. The motivation for this attack is to prevent a model’s user from deriving actionable business information based on geospatial imagery, or to reduce the accuracy of a particular service provider’s business intelligence.
Technique 1: Data Poisoning through image scaling attack
Figure 2 depicts a data poisoning attack. Using this technique, an adversary compromises the dataset used to train a machine learning model such that the resulting model’s accuracy is degraded.
Our car counting model can be attacked covertly using a technique known as an image scaling attack [2]. An image scaling attack is used to target the pre-processing component of either the training or inference pipelines. Typically, during pre-processing, an image is downscaled to a lower resolution, such that it matches the input dimensions of a deep neural network. Many algorithms used for resizing images do not equally consider all pixels while downscaling the image. In order to abuse this weakness, an adversary need only modify a limited number of pixels such that a downscaled image looks very different from its full-scale counterpart. Figure 3 illustrates this technique. The car park image on the left has a resolution of 2048×1294 pixels. At full resolution, the car park is observed to be completely full. This image is downscaled to 256×162 pixels prior to becoming the input to a machine learning model. However, a small number of pixels have been modified in the full-scale image, so that the car park will appear half empty after the resize operation (image on the right).
If you look closely at the full-resolution image, you may be able to make out a grid of grey pixels overlaid at the locations where cars went missing. These are the pixels that the resizing algorithm samples during downscaling, and they result in the empty spaces in the downscaled image.
This technique, when used in a data poisoning attack, relies on the fact that training data is labelled by humans using full-resolution images. During the labelling process, a human would inspect the full-resolution version of this image and label all visible cars. However, the resized image (missing many of the cars) is what the machine learning model sees during training. As such, the labels assigned to the full-resolution images will be wrong, which will reduce the accuracy of the trained model. Note that an equivalent model evasion attack can be launched using this same technique, assuming the adversary is able to modify full-sized images prior to the production model’s inference step. This type of attack is discussed later in this article.
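The core of the technique can be shown with a short sketch. The code below is a minimal, hypothetical illustration (it assumes nearest-neighbour resizing, square images and illustrative resolutions; real attacks also target bilinear and bicubic resizing): only a sparse grid of source pixels survives downscaling, so overwriting just those pixels – roughly 1 in 64 here – completely changes what the model sees.

```python
# Minimal illustration of an image scaling attack (nearest-neighbour resizing
# assumed; resolutions are illustrative, not taken from a real system).
import numpy as np
import cv2

SRC, DST = 2048, 256
full = np.random.randint(0, 256, (SRC, SRC, 3), dtype=np.uint8)  # stand-in "car park" image

# Source pixel indices that OpenCV's nearest-neighbour resize samples.
idx = (np.arange(DST) * (SRC / DST)).astype(int)
rows, cols = np.meshgrid(idx, idx, indexing="ij")

# Overwrite only the sampled pixels (about 1 in 64) with grey "empty tarmac" values.
poisoned = full.copy()
poisoned[rows, cols] = 128

small = cv2.resize(poisoned, (DST, DST), interpolation=cv2.INTER_NEAREST)
print((small == 128).all())                      # True: the downscaled image is entirely grey
print((poisoned != full).any(axis=-1).mean())    # fraction of pixels touched: ~0.016
```

In a real attack the inserted values are chosen so that the full-resolution image still looks unremarkable to a human labeller, which is why the grid in Figure 3 is barely visible.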
For an image scaling attack to work, the adversary needs to know, or determine, a few details about the system:
- the resolution the images are resized to during the training process
- the algorithm used for resizing
These details can often be guessed or, if the API is public, determined through targeted tests.
Aside from determining details about the target system and preparing the poisoned data, all an adversary needs to do is make their data publicly available. Any model trained on a data set that includes the adversary’s data will end up compromised. Therefore, the difficulty of performing such an attack can be considered low.
If successful, an image scaling attack represents a very general and model-independent way to compromise a model, since the attack exploits weaknesses in the data pre-processing stage of the pipeline.
Technique 2: Model Poisoning
In the computer vision field, machine learning models are often created by fine-tuning existing pre-trained models. Since a teacher model can transfer its vulnerabilities and weaknesses to a student model [3], and it is difficult to check pre-trained models for defects, reduced accuracy, or vulnerabilities, such an attack is viable. All the adversary must do is trick their victim into using a compromised pre-trained model.
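To make the supply-chain exposure concrete, the snippet below shows the common fine-tuning pattern (an illustrative torchvision call; the actual framework and architecture used by a real service are assumptions): the starting weights are downloaded from an external source, and any defect or backdoor in them can carry over into the fine-tuned car-counting model [3].

```python
# Typical transfer-learning pattern: start from externally hosted weights.
import torch.nn as nn
import torchvision

backbone = torchvision.models.resnet50(weights="DEFAULT")   # weights fetched from the internet
backbone.fc = nn.Linear(backbone.fc.in_features, 1)         # new head, e.g. regressing a car count
# ...fine-tune on labelled car park imagery; flaws in the downloaded weights
# are not obvious from accuracy metrics alone.
```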
Technique 3: Traditional Software Attacks
In addition to machine learning-specific attack vectors, an adversary can compromise their target and reduce the accuracy of predictions provided by the car counting model through “traditional” software attacks on, for example, the training or serialization libraries used in the system. This is illustrated in Figure 5.
One such attack technique is known as “dependency confusion” [4]. Software typically comprises a mix of internally developed code and external third-party packages, which are downloaded from external locations (such as PyPI). As the name suggests, a dependency confusion attack is designed to confuse the build or deployment system into downloading the wrong (compromised) dependency instead of the genuine one. This attack is possible if strict versioning or download location parameters are not provided in the build process. If this is the case, the latest available package may be downloaded from a default location.
If a company, ACME, includes the package acmelib v4.2 in its build process but does not enforce that this exact version is used, an adversary can push a compromised version – say, acmelib v13.37 – to a public package repository like PyPI, and that package will be picked up and used by the build system instead.
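As a hypothetical sketch (the package and index names below are made up, following the ACME example above), the vulnerable pattern looks like this: no version pin, and an install command that lets pip consider the public index alongside the internal one, so the highest version number wins.

```
# requirements.txt -- vulnerable: internal package name, no pinned version
acmelib

# Build step: --extra-index-url treats PyPI and the internal index as equals,
# so an attacker-published acmelib 13.37 on PyPI outranks the internal 4.2.
pip install -r requirements.txt \
    --extra-index-url https://pypi.internal.acme.example/simple
```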
For this attack to be successful, an adversary needs to know, or guess, which packages the target company is using and upload their own packages to often-used package repositories such as PyPI. Such an attack has low to moderate complexity.
This attack technique has already been demonstrated. In February 2021 it was revealed that at least 35 major tech companies were vulnerable to dependency confusion attacks. Security researchers demonstrated the attack by using it to insert non-malicious code into the vulnerable companies’ systems [4].
Attack Goal 2: Lower car counting accuracy for specific locations
The goal of this targeted attack is to compromise a system in such a way that counting cars using geospatial imagery becomes inaccurate only for specific locations. Since the techniques used to achieve this attack goal are similar to those discussed for the previous attack goal, we will only summarize differences in this section.
Technique 1a: Model evasion via an image scaling attack
An adversary can use the same image scaling attack technique previously described to modify images prior to the downscaling operation. In this case, assuming an accurately trained model, the adversary must modify specific samples (i.e., full-size images from specific geographical locations) used in inference. This technique is depicted in Figure 6.
Technique 1b: Model evasion through adversarial images
It is also possible to fool computer vision models by distorting images in other ways that are not perceivable to the human eye. Such attacks can be used to prevent objects from being detected, to introduce extra detections for objects that are not in the image, or simply to make the machine learning model misclassify objects [5].
As with an image scaling attack, the adversary must be able to modify data used during inference. However, adversarial images can be model-dependent, and images that successfully confuse one model may not necessarily confuse another. Therefore, it helps if the adversary knows which model is used for car counting. Alternatively, an adversary can combine multiple adversarial image techniques (that work against different models) in a single image.
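A minimal sketch of one well-known technique, the fast gradient sign method (FGSM), is shown below. The PyTorch model, inputs and labels are assumed to exist, and the sketch targets a simple classifier; real attacks on detection or counting models use more elaborate objectives [5].

```python
# Fast gradient sign method (FGSM): perturb the image in the direction that
# increases the model's loss, with a budget small enough to be imperceptible.
import torch
import torch.nn.functional as F

def fgsm(model, images, labels, eps=2 / 255):
    """Return adversarially perturbed copies of `images` (values in [0, 1])."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + eps * images.grad.sign()
    return adv.clamp(0, 1).detach()
```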
Technique 2: Model poisoning / Model backdoor
In this attack, an adversary modifies a pre-trained model in a targeted way. For example, a model can be trained to count non-car objects (such as company logos) specific to a particular location as cars. Alternatively, an adversary can introduce their own physical objects, called adversarial patches, in order to backdoor a model. The model can be taught so that, whenever such an adversarial patch is visible on an object, it behaves differently and misclassifies that object. A well-known example of this is a model misclassifying a traffic sign that had been modified with post-it notes or paint. Similarly, a maliciously crafted pre-trained model can be taught to classify non-car objects as cars whenever an adversarial patch is applied [6].
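A much-simplified sketch of the patch-based backdoor idea from BadNets [6] is shown below. It assumes a classification-style training set held as NumPy arrays; an attack on a counting or detection model would manipulate annotations instead.

```python
# Stamp a small trigger patch onto a fraction of training images and change
# their labels, so the trained model learns to associate the patch with "car".
import numpy as np

def backdoor_training_set(images, labels, target_label, patch=8, fraction=0.05, seed=0):
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    chosen = rng.choice(len(images), size=int(fraction * len(images)), replace=False)
    for i in chosen:
        images[i, -patch:, -patch:] = 255      # white square in the corner as the trigger
        labels[i] = target_label               # e.g. the "car" class
    return images, labels
```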
Technique 3: Traditional software attacks
Finally, an adversary may alter the model’s ability to count cars in a specific location through traditional software attacks – by modifying the software itself to behave in a different manner for specific locations. Such an attack could utilize GPS coordinates (if available), or trigger on images such as logos or other attributes specific to the targeted locations.
Mitigations
The attacks described in this article can be mitigated in a number of ways.
Technique 1a: Mitigations against image scaling attacks
Image scaling attacks can be mitigated by verifying that the downscaled image still looks the same as the original. This is done by upscaling the downscaled image and verifying that it is similar enough to the original full-resolution version. Algorithmically, this can be achieved by checking that the colour histograms of both images match and that their perceptual image hashes match [7].
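A sketch of such a check, using OpenCV and the imagehash library cited in [7], might look as follows; the thresholds are placeholders that would need tuning on real imagery.

```python
import cv2
import imagehash
from PIL import Image

def looks_consistent(full_res_bgr, downscaled_bgr, hash_threshold=8, hist_threshold=0.9):
    """Return True if the downscaled image still resembles the full-resolution one."""
    # Upscale the small image back to the original size for comparison.
    h, w = full_res_bgr.shape[:2]
    restored = cv2.resize(downscaled_bgr, (w, h), interpolation=cv2.INTER_LINEAR)

    # 1. Compare colour histograms (correlation close to 1.0 means similar).
    hists = [cv2.calcHist([img], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
             for img in (full_res_bgr, restored)]
    hist_score = cv2.compareHist(hists[0], hists[1], cv2.HISTCMP_CORREL)

    # 2. Compare perceptual hashes (small Hamming distance means similar).
    hashes = [imagehash.phash(Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)))
              for img in (full_res_bgr, restored)]
    hash_distance = hashes[0] - hashes[1]

    return hist_score >= hist_threshold and hash_distance <= hash_threshold
```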
Image scaling attacks can also be rendered ineffective by utilizing a scaling algorithm, such as area scaling, that is more robust to such attacks. However, this method may not completely remove the effect of the modified pixels, since they are still considered during resizing. A defence technique that avoids this issue is to explicitly remove the potentially modified pixels from an image prior to resizing and to reconstruct them based on neighbouring pixels. With knowledge of the scaling algorithm, it is possible to identify the limited set of pixels that are used for downsizing and then delete and reconstruct those pixels based on their neighbourhood [2]. This effectively sanitizes the image before downscaling.
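Continuing the earlier nearest-neighbour example, a simple version of this sanitization defence could look like the sketch below; the median filter is just one of several possible ways to reconstruct the removed pixels from their neighbourhood.

```python
# Replace the pixels the resizer will sample with values reconstructed from
# their neighbourhood, then downscale (nearest-neighbour resizing assumed).
import numpy as np
import cv2

def sanitize_before_resize(image, dst_size):
    h, w = image.shape[:2]
    dst_w, dst_h = dst_size
    rows = (np.arange(dst_h) * (h / dst_h)).astype(int)
    cols = (np.arange(dst_w) * (w / dst_w)).astype(int)
    rr, cc = np.meshgrid(rows, cols, indexing="ij")

    # Median-filter the whole image, then copy the filtered values back
    # only at the positions the resizer will sample.
    filtered = cv2.medianBlur(image, 5)
    cleaned = image.copy()
    cleaned[rr, cc] = filtered[rr, cc]
    return cv2.resize(cleaned, dst_size, interpolation=cv2.INTER_NEAREST)
```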
Technique 1b: Model evasion through adversarial images
In order to prevent attacks via adversarial images, such images can be generated and included in the model’s training set. This technique is known as adversarial training. Adversarial samples can also be included in the model validation test set to ensure they are correctly labelled (or discarded by the system) in production.
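Reusing the fgsm() helper sketched earlier, one adversarial-training step might look like this (the model, optimizer and data loader are assumed to exist):

```python
# One training step that mixes clean and adversarially perturbed images.
import torch.nn.functional as F

for images, labels in loader:
    adv_images = fgsm(model, images, labels)      # craft perturbed copies on the fly
    optimizer.zero_grad()                         # discard gradients left over from fgsm()
    loss = F.cross_entropy(model(images), labels) + \
           F.cross_entropy(model(adv_images), labels)
    loss.backward()
    optimizer.step()
```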
Technique 2: Model poisoning / Model backdoor
Targeted attacks can be mitigated by validating newly trained models on examples of the data expected at inference time. If a new model does not pass these tests, an alert is issued and the situation is investigated. Another way to mitigate these attacks is to build a system that utilizes the output of multiple models created from different pre-trained models. With such redundancy, disagreement between the models’ outputs can be used to identify and investigate anomalous behaviour in any one of them.
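As an illustration of the redundancy idea, a monitoring job could flag inputs on which independently built models disagree; the count_cars() interface below is hypothetical.

```python
# Flag images where car counts from independently sourced models diverge.
def flag_disagreements(models, images, tolerance=5):
    suspicious = []
    for i, image in enumerate(images):
        counts = [m.count_cars(image) for m in models]   # hypothetical model API
        if max(counts) - min(counts) > tolerance:
            suspicious.append((i, counts))               # investigate these manually
    return suspicious
```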
Technique 3: Traditional Software Attacks
The traditional software attacks discussed in this article assume that the model’s creators blindly trust package managers and external package repositories. In order to mitigate these attacks, companies should develop methods to determine the authenticity of externally sourced packages before bringing them into the pipeline. Potential ways of achieving this are:
- checking code signatures
- verification of code checksums/hashes
- not directly sourcing from external package repositories, but mirroring external requirements to an internal repository after packages have been manually vetted
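As a hedged example of the first two points, pip’s hash-checking mode pins every dependency to an exact version and artifact hash, and installs can be restricted to a vetted internal mirror; the names and the hash value below are placeholders.

```
# requirements.txt -- every entry pinned and hashed
acmelib==4.2 --hash=sha256:<expected-artifact-hash>

# Install only from the internal mirror, rejecting anything whose hash differs:
pip install --require-hashes \
    --index-url https://pypi.internal.acme.example/simple \
    -r requirements.txt
```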
Conclusion
This article described theoretical attacks against a machine learning-based system designed to utilize geospatial data (specifically satellite and aerial images) to count cars in a car park. Both data poisoning and model evasion attack examples were described. Our examples generalize to other similar computer vision-related cases that utilize geospatial data to automate business intelligence.
Mitigating the described attacks requires good software security practices and an understanding of how each component and process of a machine learning pipeline might be attacked (and by whom and for what reason). Each step of the pipeline – from training and inference data, to pre-trained models, to model training, to externally sourced packages – should be secured in an appropriate manner. Validation of newly trained models against expected inputs, and perhaps even expected adversarial inputs, is also recommended. As is often the case, a machine learning model should only be considered trustworthy if it is properly secured.
Acknowledgements
This research was conducted by Mark van Heeswijk, Samuel Marchal and Setareh Roshan (F-Secure Corporation) as part of EU Horizon 2020 project SPATIAL, and F-Secure’s Project Blackfin. SPATIAL is an EU-funded project which investigates how to enhance AI-powered solutions in terms of accountability, privacy and resilience. This project has received funding from the European Union’s Horizon 2020 research and innovation programme, under grant agreement No 101021808. F-Secure’s Project Blackfin is a multi-year research effort with the goal of applying collective intelligence techniques to the cyber security domain.
References
[1] mitre/advmlthreatmatrix: Adversarial Threat Matrix (github.com)
[2] Image-Scaling Attacks and Defenses (scaling-attacks.net)
[3] Wang, Bolun, Yuanshun Yao, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. “With Great Training Comes Great Vulnerability: Practical Attacks against Transfer Learning.” In 27th USENIX Security Symposium (USENIX Security 18), 1281–97, 2018.
[4] Dependency Confusion: Another Supply-Chain Vulnerability – Schneier on Security
[5] Chow, Ka-Ho, Ling Liu, Mehmet Emre Gursoy, Stacey Truex, Wenqi Wei, and Yanzhao Wu. “Understanding Object Detection Through An Adversarial Lens.” ArXiv:2007.05828 [Cs], July 11, 2020. http://arxiv.org/abs/2007.05828.
[6] Gu, Tianyu, Brendan Dolan-Gavitt, and Siddharth Garg. “BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain.” ArXiv:1708.06733 [Cs], March 11, 2019. http://arxiv.org/abs/1708.06733.
[7] JohannesBuchner/imagehash: A Python Perceptual Image Hashing Module (github.com)