The Data Science Behind Fraud Detection in Affiliate Marketing


The worldwide cost of digital ad fraud is predicted to rise from $19 billion in 2018 to $44 billion in 2022. While it is difficult to measure which channels are being hit the hardest, the affiliate and partner marketing sectors in which I work seem less vulnerable to malicious theft than other channels. This is because the pay-for-performance model is harder to dupe than soft metrics like ad viewability or Instagram follower counts, which are easy to inflate with fake accounts.

That said, as long as there is money to be made, there will be bad actors trying to exploit legitimate brands and partners. Fortunately, data scientists are hard at work to stay one step ahead of these fraudsters. As these cybercriminals get more sophisticated, so do the methods to detect and combat their actions. Data science is the backbone of any fraud detection approach, and it can identify many of the obvious forms of fraud without the need for human investigation.

I spoke with the data scientists at my company to understand how they are thinking about the landscape when it comes to staying ahead of fraudsters. Here is what they shared.

Fraudster Greed Helps Detection

Fraud occurs in most areas of digital marketing, and marketers need every available tool to discover and combat it. In the realm of partnerships and affiliate marketing, for example, data science has proven to be one of the most effective early warning systems available. Data scientists can train models to spot known indicators of potential fraud, including:

  • Long click-to-conversion time spans, which can be an indication of cookie stuffing.
  • Very high click counts but low conversion volume, which is a sign of possible click fraud.
  • Abnormally high conversion rates by placement or IP address, which can be indicative of creative fraud, in which fraudsters alter a message.
  • Large numbers of clicks associated with particular URLs, which often signal bot traffic.
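
To make the first three indicators above concrete, here is a minimal Python sketch of rule-based flagging per partner. It is an illustration, not a description of any particular vendor's tooling: the `PartnerStats` fields, the flag names, and every threshold are hypothetical and would need to be tuned against real program data. The URL-level check is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class PartnerStats:
    """Aggregated activity for one partner (hypothetical fields)."""
    partner_id: str
    clicks: int
    conversions: int
    median_click_to_conversion_hours: float


# Hypothetical thresholds, chosen only for illustration.
MAX_MEDIAN_LAG_HOURS = 72.0       # longer lags may hint at cookie stuffing
MIN_CLICKS_FOR_RATE_CHECK = 1000  # skip rate checks on tiny samples
MIN_CONVERSION_RATE = 0.001       # lower may hint at click fraud
MAX_CONVERSION_RATE = 0.30        # higher may hint at faked conversions


def fraud_flags(stats: PartnerStats) -> list[str]:
    """Return the names of the indicators this partner trips."""
    flags = []
    if (stats.conversions > 0
            and stats.median_click_to_conversion_hours > MAX_MEDIAN_LAG_HOURS):
        flags.append("long_click_to_conversion")
    if stats.clicks >= MIN_CLICKS_FOR_RATE_CHECK:
        rate = stats.conversions / stats.clicks
        if rate < MIN_CONVERSION_RATE:
            flags.append("high_clicks_low_conversions")
        elif rate > MAX_CONVERSION_RATE:
            flags.append("abnormally_high_conversion_rate")
    return flags
```

A flag from a sketch like this is a prompt for a closer look, not proof of fraud.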

Early detection of these data anomalies is the best way to stay ahead of fraudsters. Some anti-fraud tools automatically block these activities once identified. For other tools, data variances they identify are a reason for investigation, not accusation. Once alerted to potential fraud, it is up to a brand to determine if the actions are fraudulent or valid. Fraudster greed often means that the data variances they cause will be large. As a rule of thumb, marketers should look for big patterns and variances, and then collaborate with media partners to understand the whys behind the spikes.
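
One simple way to look for "big variances" is to compare each day's value against the median for the period. The sketch below uses a robust z-score built on the median absolute deviation (MAD), a common statistical technique that I am substituting here for illustration; the function name and the threshold of 5.0 are assumptions, not something our data scientists prescribed.

```python
from statistics import median


def robust_spikes(daily_values: list[float], threshold: float = 5.0) -> list[int]:
    """Return the indexes of days that deviate strongly from the median.

    Uses the median absolute deviation (MAD) so that one huge spike
    does not distort the baseline it is being compared against.
    """
    med = median(daily_values)
    mad = median([abs(v - med) for v in daily_values])
    if mad == 0:
        # Limitation of this sketch: with zero spread the score is
        # undefined, so nothing is flagged.
        return []
    # 0.6745 rescales MAD so the score is comparable to a standard z-score.
    return [
        i for i, v in enumerate(daily_values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Daily clicks for one partner: only the last day stands out.
clicks = [100, 104, 98, 101, 99, 103, 950]
print(robust_spikes(clicks))  # [6]
```

The flagged day is where the conversation with the media partner would start.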

Think Outside the Bot

There are data patterns that we know to be indicative of fraud, but marketers and their data teams can’t stop there. Part of remaining vigilant against fraud is looking for things we have not seen before to determine if they are new forms of fraud. Data science teams train systems to spot known patterns, but new ways of shrouding fraud are constantly being developed.

For example, during our normal data mining process for one client, we recently spotted an abnormally high number of identical baskets full of seemingly random, unrelated items being purchased and attributed to one of the client’s medium-sized partners. There’s nothing new about fraudsters trying to fake purchases so they can take credit for them. But the ordinary signals we might use to detect this, like abnormally high conversion rates by IP address, did not apply in this case. It took a human to look at the data and create a hypothesis for additional investigation. It’s possible, of course, that a hot promotion could yield unusual or identical product assortments. But in this case, the investigation revealed that the baskets were the work of a bot.

Examining and understanding cases like this one has allowed us to build automatic detection of identical baskets into our models. Fraudsters are always trying to stay one step ahead, so we must constantly look for new anomalies that may indicate fraud, and then train our tools to spot them.
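
As a rough illustration of what identical-basket detection can look like, the Python sketch below counts how often the exact same multi-item basket is attributed to the same partner. It is not our production model: the input shape (a partner ID paired with a list of SKUs), the function name, and the `min_repeats` and `min_items` defaults are all hypothetical.

```python
from collections import Counter


def repeated_baskets(orders, min_repeats=20, min_items=3):
    """Count identical multi-item baskets per partner.

    `orders` is an iterable of (partner_id, list_of_skus) pairs.
    Returns {(partner_id, basket): count} for baskets seen at least
    `min_repeats` times. Small baskets are skipped because popular
    single items repeat naturally.
    """
    counts = Counter()
    for partner_id, skus in orders:
        basket = tuple(sorted(skus))  # item order should not matter
        if len(basket) >= min_items:
            counts[(partner_id, basket)] += 1
    return {key: n for key, n in counts.items() if n >= min_repeats}


orders = (
    [("p1", ["sku9", "sku2", "sku5"])] * 25
    + [("p1", ["sku5", "sku2", "sku9"])] * 5   # same basket, different order
    + [("p2", ["sku1"])] * 50                  # ignored: too few items
)
print(repeated_baskets(orders))
# {('p1', ('sku2', 'sku5', 'sku9')): 30}
```

Because a hot promotion can also produce repeated baskets, results like these still call for a human to check before drawing conclusions.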

Fighting Fraud is an Arms Race

Fraudster tools will continue to become more sophisticated. New tactics arise regularly that we don’t know to look for, so we may not immediately recognize them as fraud. From a data science standpoint, it is critical to stay focused on spotting new types of fraud, while using machine learning to improve detection models. In partnership marketing, for example, affiliate marketing platforms can bulk up their fraud detection by recognizing new fraud methods and the patterns they display.

The best marketing data science and product teams are in the trenches looking for new types of fraud daily. As data models are updated and new ones are built, the fraud monitoring tools integrated into marketing automation will only get better. That’s how we stay one step ahead of the bad actors.

