Abstract
Ad fraud and coordinated manipulation have become increasingly pressing concerns in the highly complex ecosystems formed by fast-moving programmatic digital advertising. Fraudulent advertising traffic in the United States, generated by automated bots, click farms, and organized campaigns, imposes substantial financial losses, distorts campaign performance metrics, and raises broader questions about information integrity and national security. Rudimentary rule-based fraud detection systems cannot scale to dynamic, evolving attacks, motivating newer and more intelligent security approaches. This study examines the use of machine learning to strengthen the security of the American online advertising ecosystem by identifying fraudulent and potentially cross-border manipulative activity from user behavior and ad clickstream logs. The study uses a publicly available fraud detection dataset consisting of 2,043 ad interaction records with behavioral, temporal, technical, and geographic features. Key attributes include click intervals, session duration, clicks per session, bounce rate, device and browser type, and geolocation. Supervised machine learning methods are used to classify ad clicks as legitimate or anomalous, treating abnormal behavioral patterns as indicators of automation and coordinated abuse. Model performance is evaluated with standard classification measures: accuracy, precision, recall, F1-score, and receiver operating characteristic analysis. The results indicate that machine learning models capture complex non-linear patterns in clickstream behavior and outperform traditional heuristic methods in detecting fraudulent traffic. Behavioral and session-level attributes emerge as strong predictors of fraud, while geographic attributes provide valuable signals of potential cross-border manipulation.
Although the dataset does not directly attribute activity to foreign actors, the findings show that machine learning can support early detection and risk assessment of manipulation attempts in digital advertising pipelines. This study contributes to the emerging literature on ad-tech security by demonstrating the practical relevance of machine learning for increasing trust, transparency, and resilience in American digital advertising ecosystems.
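The supervised pipeline summarized above can be illustrated with a minimal sketch. This example is an assumption-laden illustration, not the paper's actual implementation: it uses scikit-learn, a random forest classifier, and synthetic clickstream features with hypothetical names (`click_interval`, `session_duration`, `clicks_per_session`, `bounce`), since the real dataset's schema is not reproduced here. The fraud label is generated by a toy rule purely so the example is runnable end to end.

```python
# Hedged sketch of a supervised ad-fraud classifier on synthetic
# clickstream features; feature names and the labeling rule are
# illustrative assumptions, not the study's actual data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

rng = np.random.default_rng(42)
n = 2043  # matches the record count reported in the abstract

# Synthetic behavioral / session-level features
click_interval = rng.exponential(2.0, n)      # seconds between clicks
session_duration = rng.exponential(120.0, n)  # seconds per session
clicks_per_session = rng.poisson(5, n)
bounce = rng.integers(0, 2, n)                # 1 = bounced session

# Toy labeling rule: bots click unusually fast or unusually often
is_fraud = ((click_interval < 0.8) | (clicks_per_session > 9)).astype(int)

X = np.column_stack([click_interval, session_duration,
                     clicks_per_session, bounce])
X_tr, X_te, y_tr, y_te = train_test_split(
    X, is_fraud, test_size=0.3, random_state=0, stratify=is_fraud)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
proba = clf.predict_proba(X_te)[:, 1]

# The standard classification measures named in the abstract
metrics = {
    "accuracy": accuracy_score(y_te, pred),
    "precision": precision_score(y_te, pred),
    "recall": recall_score(y_te, pred),
    "f1": f1_score(y_te, pred),
    "roc_auc": roc_auc_score(y_te, proba),
}
print(metrics)
```

Any supervised classifier with calibrated scores could stand in for the random forest here; the evaluation step is what mirrors the abstract's accuracy, precision, recall, F1, and ROC analysis.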