5 adversarial AI attacks that show machines have more to fear from people than the other way around

Jason Sattler

11.07.19 6 min. read

One clear example of the still unmatched power of the human mind is our ability to imagine how machines driven by artificial intelligence (AI) could attack, dominate, and possibly even enslave humanity. Today, only the opposite is possible: people attacking the machines.

The term ‘AI’ is today used to refer to machine learning

“I think most people associate the dangers of AI with things like killer robots because they confuse AI with the artificial general intelligence that they see in TV and movies. That’s different and far more advanced than what’s in use today, which is basically just a function that a computer has been trained to perform on its own,” explains Andy Patel. “The term ‘AI’ is today used to refer to machine learning, which is the process of training a function – not a computer – to perform a task.”

Andy is a researcher with F-Secure’s Artificial Intelligence Center of Excellence and a major contributor to “Security Issues, Dangers and Implications of Smart Information Systems”, a new study published by the SHERPA consortium – an EU-funded project F-Secure joined in 2018.
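To make Andy’s point concrete, here is roughly what “training a function” looks like in code. The tiny spam-filter sketch below uses invented data and an arbitrarily chosen scikit-learn model; the fitted function is the whole of the “intelligence” involved.

```python
from sklearn.linear_model import LogisticRegression

# "Training a function": the "AI" here is just a logistic-regression function
# fitted to a handful of labelled examples. Features are invented:
# [count of spammy words, count of ordinary words] in a message.
X = [[3, 0], [2, 1], [0, 2], [0, 3]]
y = [1, 1, 0, 0]  # 1 = spam, 0 = legitimate

spam_filter = LogisticRegression().fit(X, y)
print(spam_filter.predict([[5, 0]]))  # the trained function now classifies new input
```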

What AI attack scenarios tell us

In the section on “Adversarial attacks against AI”, the study explains the types and classes of attacks that can be mounted against AI systems. While some of these attacks remain theoretical, many are already possible and have been happening for years. Many more are inevitable, especially given the often-reckless rush to get products to market, which can leave vulnerabilities, both known and unknown, unaddressed.

After breaking down the types and classes of attacks that exist against AI, the study illuminates a number of scenarios to give AI researchers, engineers, policymakers and interested individuals like you a sense of what we’re already seeing and what’s possible.

Attack scenario: discredit a company or brand by poisoning a search engine’s autocomplete functionality

An adversary employs a Sybil attack to poison a web browser’s auto-complete function so that it suggests the word “fraud” at the end of an auto-completed sentence with a target company name in it. The targeted company doesn’t notice the attack for some time, but eventually discovers the problem and corrects it. However, the damage is already done, and they suffer long-term negative impact on their brand image. This is an integrity attack (and is possible today).

Sybil attacks use multiple ‘sock puppet’ accounts controlled by a single entity to violate the integrity of a system. They are used to promote products or content, to decrease the popularity of products and content, and for social engineering that drives users toward specific content.

These attacks are so common that a whole industry exists to support them.
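To get a feel for how little it takes, here is a toy simulation of the autocomplete-poisoning scenario. The company name, query counts and frequency-based ranking are all invented for illustration; real suggestion systems are far more sophisticated, but the skew works the same way.

```python
from collections import Counter

# Toy autocomplete that ranks completions purely by query frequency, plus a
# handful of sock-puppet accounts flooding it with a smearing query.
query_log = Counter({
    "acme corp careers": 120,
    "acme corp share price": 95,
    "acme corp reviews": 60,
})

def autocomplete(prefix, log, k=3):
    matches = [(q, n) for q, n in log.items() if q.startswith(prefix)]
    return [q for q, _ in sorted(matches, key=lambda x: -x[1])[:k]]

print("before:", autocomplete("acme corp", query_log))

# Sybil phase: 50 sock-puppet accounts each issue the poisoned query three times.
for _ in range(50 * 3):
    query_log["acme corp fraud"] += 1

print("after: ", autocomplete("acme corp", query_log))
# "acme corp fraud" now outranks the legitimate completions.
```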

Attack scenario: perform a targeted attack against an individual using hidden voice commands

An attacker embeds hidden voice commands into video content, uploads it to a popular video sharing service, and artificially promotes the video (using a Sybil attack). The hidden voice commands are used to instruct a digital home assistant device to purchase a product without the owner knowing, to instruct smart home appliances to alter settings (e.g. turn up the heat, turn off the lights, or unlock the front door), or to instruct a nearby computing device to perform searches for incriminating content (such as drugs or child pornography) without the owner’s knowledge (allowing the attacker to subsequently blackmail the victim). This is an availability attack.

This specific attack has not yet been seen in the wild. But we know it is possible because of an experiment from August 2018, in which researchers at the Horst Görtz Institute for IT Security in Bochum, Germany demonstrated psychoacoustic attacks against speech recognition systems, hiding voice commands in audio of birds chirping.
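The sketch below is not the Bochum team’s psychoacoustic attack, which carefully shapes its perturbation to stay under computed hearing thresholds. It is only a crude toy, with synthetic signals, showing the general idea of burying a low-level “command” signal inside a louder carrier so it escapes human notice.

```python
import numpy as np

# Toy illustration only: mix a "command" signal into a carrier recording at an
# amplitude tied to the carrier's local loudness, a crude stand-in for the
# psychoacoustic masking thresholds the real attack computes.
def embed_hidden_signal(carrier, command, budget=0.02):
    padded = np.zeros_like(carrier)
    command = command[: len(carrier)]
    padded[: len(command)] = command
    # Per-sample cap proportional to the carrier's smoothed envelope, so the added
    # signal hides where the carrier is loud and vanishes where it is quiet.
    window = 1024
    envelope = np.convolve(np.abs(carrier), np.ones(window) / window, mode="same")
    perturbation = np.clip(padded, -1.0, 1.0) * envelope * budget
    return np.clip(carrier + perturbation, -1.0, 1.0)

if __name__ == "__main__":
    sr = 16000
    t = np.linspace(0, 2.0, 2 * sr, endpoint=False)
    birdsong = 0.5 * np.sin(2 * np.pi * 4000 * t)      # stand-in carrier
    command = 0.5 * np.sin(2 * np.pi * 300 * t[:sr])   # stand-in "command"
    poisoned = embed_hidden_signal(birdsong, command)
    print("max added amplitude:", np.max(np.abs(poisoned - birdsong)))
```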

Availability attack scenario: take widespread control of digital home assistants

An attacker forges a ‘leaked’ phone call depicting a plausible scandalous interaction involving high-ranking politicians and business people. The forged audio contains embedded hidden voice commands. The message is broadcast during the evening news on national and international TV channels. The attacker gains the ability to issue voice commands to home assistants or other voice recognition control systems (such as Siri) on a potentially massive scale. This is an availability attack.

This is also a theoretical attack. But reports of Siri being triggered almost randomly are common, and the rapid adoption of voice-activated devices continues to create a larger and larger target for such attacks.

Availability attack scenario: evade fake news detection systems to alter political discourse

Fake news detection is a relatively difficult problem to solve with automation, and hence fake news detection solutions are still in their infancy. As these techniques improve and people start to rely on verdicts from trusted fake news detection services, tricking such services infrequently, and at strategic moments, would be an ideal way to inject false narratives into political or social discourse. In such a scenario, an attacker would create a fictional news article based on current events and adversarially alter it to evade known, respected fake news detection systems. The article would then find its way onto social media, where it would likely spread virally before it could be manually fact-checked. This is an availability attack.

This may seem far-fetched, but it’s a relatively simple leap in technology from attacks we’ve already seen.

Natural language processing (NLP) models are widely used today to help computers understand human language. Anonymous researchers presenting at the International Conference on Learning Representations, along with researchers at UCLA, have recently demonstrated how NLP models can be fooled fairly easily through the use of synonyms.
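In that spirit, here is a toy sketch of synonym-substitution evasion. The “detector” is a deliberately naive keyword scorer standing in for a real NLP model, and the synonym table is hand-picked; actual attacks search for substitutions that preserve meaning while flipping a real model’s verdict.

```python
from itertools import product

# Toy sketch of synonym-substitution evasion against a naive keyword "detector".
SUSPICIOUS = {"hoax", "fraud", "fake", "conspiracy"}
SYNONYMS = {"hoax": ["fabrication", "sham"], "fraud": ["deception"], "fake": ["counterfeit"]}

def detector_flags(text):
    return any(word in SUSPICIOUS for word in text.lower().split())

def evade(text):
    words = text.lower().split()
    slots = [SYNONYMS.get(w, [w]) for w in words]
    for candidate in product(*slots):        # try every combination of substitutions
        rewritten = " ".join(candidate)
        if not detector_flags(rewritten):    # wording changed, meaning preserved
            return rewritten
    return None

print(detector_flags("officials call the report a hoax and a fraud"))  # True
print(evade("officials call the report a hoax and a fraud"))
# -> "officials call the report a fabrication and a deception"
```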

A new study suggests that the use of propaganda on social media may have already swayed one monumental election. So we have to assume that in the near future new techniques will be deployed that may be able to avoid quick detection.

Attack scenario: hijack autonomous military drones

By use of an adversarial attack against a reinforcement learning model, autonomous military drones are coerced into attacking a series of unintended targets, causing destruction of property, loss of life, and the escalation of a military conflict. This is an availability attack.

This is perhaps the example that seems most ripped from science fiction or a Bond film. But it’s likely possible today using the same processes that could be used to attack a video game.

Reinforcement learning is used to train “recommendation systems, self-driving vehicles, robotics, and games” to perform actions based on their environment.

The report explains that enchanting attacks against reinforcement learning models can modify multiple inputs to distract an agent from the completion of a goal. “For instance, an enchanting attack against an agent playing Super Mario could lure the agent into running on the spot, or moving backwards instead of forwards,” the study notes.
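A full enchanting attack plans a whole sequence of perturbed observations, but the basic trick of nudging an agent’s input so its policy picks a different action fits in a few lines. The linear “policy” below is made up purely for illustration.

```python
import numpy as np

# Minimal sketch of an observation-perturbation attack on a toy linear policy.
# (Enchanting attacks plan whole sequences of such nudges; the policy weights
# and observation here are random, purely for illustration.)
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))        # toy policy: 8-dim observation -> 2 action logits
obs = rng.normal(size=8)

def action(o):
    return int(np.argmax(W @ o))   # e.g. 0 = "move forward", 1 = "move backward"

a = action(obs)
target = 1 - a
logits = W @ obs
gap = logits[a] - logits[target]

# For a linear policy, the gradient of (target logit - current logit) w.r.t. the
# observation is just the difference of weight rows; take the smallest
# FGSM-style step in its sign that closes the gap.
grad = W[target] - W[a]
epsilon = 1.05 * gap / np.sum(np.abs(grad))
adv_obs = obs + epsilon * np.sign(grad)

print("clean action:", a, "action after perturbation:", action(adv_obs))
```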

So instead of manipulating Mario, the attacker just takes down a drone.

Watch out, machines

AI is an invention of the human mind, but not yet a reflection of it. What humans have created, we can still fool.

These attacks have already spawned one industry and many more will follow.

“So you could almost say that the reality of AI is that the machines have more to fear from people than the other way around,” Andy says.

For a detailed look at the types and classes of AI attacks that have been identified, check out Section 3 of the SHERPA report.

