Securing AI against bot attacks

December 16, 2022

Author: Stytch Team

ChatGPT and the acceleration of AI adoption

Since its release last week, ChatGPT has quickly captured the internet’s attention for its uncanny ability to generate human-like responses to a wide array of complex questions. While it’s far from perfect, ChatGPT is the first large-scale deployment of GPT-3.5, the latest AI advancement from OpenAI. For those familiar with OpenAI’s previous model, GPT-3, the improvement is startling. GPT-3 (short for “Generative Pre-Trained Transformer 3”) was already an impressive feat, but it feels like a toy compared to the surprisingly lucid experience of interacting with GPT-3.5. In many ways, ChatGPT feels like a watershed moment for artificial intelligence.

And while we can only speculate about many of the impacts these AI advancements will have, one thing is already clear – developers are eager to incorporate the APIs (Application Programming Interfaces) behind ChatGPT into new and existing applications to improve current workflows and introduce new, previously unimaginable user experiences. In just 5 days, ChatGPT amassed over 1 million users. Many of these early adopters are developers who are tinkering with the technology and exploring how it could be incorporated into various B2C and B2B use cases.

The developer excitement behind these new capabilities is no surprise. In the 2010s, companies like Stripe, Twilio, and Plaid gave developers new superpowers through APIs that made it simple to embed payments, telecommunications, and third-party data access into applications, and those building blocks unleashed a wave of application innovation. Similarly, in the 2020s, APIs providing access to underlying artificial intelligence models like GPT-3, ChatGPT, and (soon) GPT-4 will unleash unprecedented innovation.

When AI meets API: security implications

While the opportunities are limitless, as more applications integrate these AI APIs, they will face new application security concerns. In particular, applications building with these APIs should expect significant and sophisticated bot traffic directed at their sites. It’s a matter of simple game theory, and it’s what the last two decades of commercial APIs have shown us – anytime you expose a resource on the internet (e.g., compute) that offers significant monetization potential to an attacker, fraudsters will search for ways to exploit that resource. In the case of Stripe and Plaid, these companies provide API endpoints for validating credit card numbers and bank account credentials, which attackers exploit in account validation attacks to take over valuable financial accounts. In the case of Twilio, attackers commit SMS toll fraud, pumping expensive SMS traffic through partner mobile network operators (MNOs) and sharing the profits with the MNO.

The value behind artificial intelligence APIs could prove similarly alluring to fraudsters, just as Stripe, Twilio, and Plaid did when they emerged. Not only are artificial intelligence APIs expensive to use, but their responses are also relatively fungible and public-facing, making them particularly ripe for abuse. Cloud services and other tools that, by design, expose valuable compute resources (e.g., companies like Replit, GitHub, and Heroku) provide a glimpse into the attack vector that applications exposing AI APIs will need to anticipate. In attacks against open compute resources, fraudsters commit first-party fraud by creating fake accounts for the sole purpose of running commands that can be monetized – with cloud services, cryptomining has been the most reliable way for attackers to make money through this vector.

For applications leveraging expensive AI APIs, we can expect a similar attack vector in which fraudsters create numerous fake accounts to initiate valuable queries while purposefully evading rate limits. Some of the most likely monetization paths behind this attack include:

  1. API skimming attacks: In this attack, bots effectively piggyback on the application’s AI resources in a man-in-the-middle fashion: the attacker sells access to the API outputs to another party, but piggybacks on a legitimate application’s API routes in order to avoid paying for the costly resource (see the sketch after this list for where that exposure typically lives). Because the outputs are fungible and, in most cases, exposed to the end user, this is an API attack route that does not currently exist with services like Stripe or Twilio.
  2. Model stealing: B2B AI companies often invest significant time and resources into developing their AI models. If these models are not properly protected, they can be stolen and used by competitors or malicious actors. In the case of theft, an attacker could attempt to reverse engineer the model by analyzing its behavior and trying to infer its structure and algorithms. This could be done by feeding the model a large number of inputs and analyzing the outputs, or by using machine learning techniques to try to recreate the model. This can be a difficult and time-consuming process, but a sophisticated botnet could potentially automate some of the steps involved.
  3. Adversarial attacks: AI systems are vulnerable to adversarial attacks, where an attacker deliberately manipulates the input data to cause the AI model to make incorrect predictions or decisions. This could lead to serious consequences, such as financial losses or safety risks, depending on the application’s use case.
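To make the skimming exposure concrete, here is a minimal sketch of the kind of server-side route many of these applications will add: a thin proxy that forwards user prompts to an AI completion API. The route path, the DAILY_LIMIT value, and the checkQuota helper are hypothetical, and the upstream request simply mirrors the shape of OpenAI's completions API as of late 2022; the point is that any such route must be tied to an authenticated user and a usage cap, or bots can spin up throwaway accounts and resell the expensive outputs.

```typescript
// Minimal sketch (TypeScript + Express, assuming Node 18+ for built-in fetch)
// of a route that proxies user prompts to an AI completion API. The route
// path, DAILY_LIMIT, and checkQuota are illustrative assumptions, not a
// drop-in integration.
import express from "express";

const app = express();
app.use(express.json());

// Hypothetical per-user daily cap on completion calls.
const DAILY_LIMIT = 50;
const usageByUser = new Map<string, number>();

function checkQuota(userId: string): boolean {
  const used = usageByUser.get(userId) ?? 0;
  if (used >= DAILY_LIMIT) return false;
  usageByUser.set(userId, used + 1);
  return true;
}

app.post("/api/generate", async (req, res) => {
  // Stand-in for whatever session or token validation the app already does.
  const userId = req.header("x-user-id");
  if (!userId) return res.status(401).json({ error: "unauthenticated" });

  // Without a cap like this, bots with throwaway accounts can pump unlimited
  // queries through the route and resell the outputs (API skimming).
  if (!checkQuota(userId)) return res.status(429).json({ error: "quota exceeded" });

  // Illustrative upstream call; the body shape follows OpenAI's completions
  // API circa late 2022.
  const upstream = await fetch("https://api.openai.com/v1/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "text-davinci-003",
      prompt: req.body.prompt,
      max_tokens: 256,
    }),
  });

  res.status(upstream.status).json(await upstream.json());
});

app.listen(3000);
```

Per-user quotas alone won't stop a determined botnet (attackers will simply register more fake accounts), which is exactly why the bot detection measures discussed below matter.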

How to protect AI from bot attacks

As a result of these threats, AI companies have unique bot detection needs: they must ensure that the APIs they expose are accessed only by authorized users, not by bots or other malicious actors.

Some of the bot detection measures that AI companies might implement include:

  1. Detecting and blocking bots that are attempting to access the API without proper authorization. This could include bots that are attempting to scrape data from the API, or bots that are trying to use the API to carry out malicious activities, such as spamming or fraud. To defend against this, it’s critical to protect against common bot detection evasion techniques, which include reverse engineering APIs and using methods like headless browsing alongside proxies or residential IP addresses to appear like a human rather than a bot.
  2. Monitoring the usage of the API to identify and block any suspicious or unusual patterns of behavior. This could include looking for anomalies in the frequency or volume of API requests, or for patterns of behavior that are indicative of bot activity, such as rapid and repeated requests from the same IP address (a minimal sketch of this kind of check follows this list).
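As a rough illustration of that second measure, the sketch below flags clients that issue an unusual burst of requests within a short window. The window size, threshold, and in-memory store are assumptions made for illustration; a production system would typically keep these counters in a shared store such as Redis and combine them with other signals.

```typescript
// Minimal sketch of the usage-monitoring idea above: flag clients that issue
// an unusual burst of requests in a short window. WINDOW_MS, MAX_REQUESTS,
// and the in-memory Map are illustrative assumptions.
const WINDOW_MS = 60_000; // 1-minute sliding window
const MAX_REQUESTS = 30;  // assumed ceiling per client per window

// Client key (IP address, account ID, or both) -> recent request timestamps.
const requestLog = new Map<string, number[]>();

export function isSuspicious(clientKey: string, now = Date.now()): boolean {
  // Keep only timestamps that fall inside the current window.
  const recent = (requestLog.get(clientKey) ?? []).filter(
    (t) => now - t < WINDOW_MS
  );
  recent.push(now);
  requestLog.set(clientKey, recent);

  // Rapid, repeated requests from the same IP or account are treated as a
  // bot signal; the caller can block, rate limit, or challenge with a CAPTCHA.
  return recent.length > MAX_REQUESTS;
}
```

In practice, a check like this would be one input among many (device fingerprints, headless-browser signals, proxy and IP reputation), since sophisticated bots deliberately spread traffic across residential IPs to stay under per-client thresholds.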

Overall, AI companies have unique bot detection needs because of the valuable APIs they expose. Protecting these APIs requires effective bot detection and blocking mechanisms, along with ongoing monitoring for the evasion techniques attackers use to slip past them.

At Stytch, we help companies tackle fraud and risk prevention with products like Device Fingerprinting and Strong CAPTCHA, both of which make it easier for developers to build in bot protection without adding any friction for the user. By combining these tools with our user-friendly, ironclad authentication products, companies can rest easy knowing their data and resources are protected from malicious bot traffic.

If you want to learn more about how to protect your product from bot traffic, talk to an auth expert today! 
