
A Novel Method to Mitigate Adversarial Attacks Against AI-as-a-Service Functionality

This paper proposes a lightweight defense mechanism to protect AI models exposed through AI-as-a-Service (AIaaS) from black-box adversarial attacks. As future networks rely more on remotely accessed AI functions, models become vulnerable to malicious queries that subtly manipulate inputs and cause misclassification. The core idea of the method is to use uncertainty as a signal of suspicious behavior. When the model encounters an input for which its prediction confidence is unusually low, as is often the case for adversarial samples near decision boundaries, it performs a temporary, single-step weight update to reduce that uncertainty. After producing the corrected prediction, the model immediately reverts to its original parameters, preserving stability while improving robustness.
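The following is a minimal PyTorch sketch of that idea, not the paper's exact implementation: it assumes predictive entropy as the uncertainty signal and a single entropy-minimizing gradient step as the temporary update. The function name, the `entropy_threshold` parameter, and the choice of optimizer are illustrative assumptions.

```python
import copy

import torch
import torch.nn.functional as F


def predict_with_uncertainty_update(model, x, optimizer, entropy_threshold=1.0):
    """Serve a prediction; if uncertainty is high, apply one temporary
    weight update, return the corrected prediction, then revert."""
    model.eval()
    with torch.no_grad():
        logits = model(x)
        probs = F.softmax(logits, dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()

    if entropy < entropy_threshold:
        return logits  # confident prediction: no adjustment needed

    # Snapshot the original parameters so the update stays temporary.
    original_state = copy.deepcopy(model.state_dict())

    # One gradient step that reduces predictive entropy (the uncertainty proxy).
    model.train()
    optimizer.zero_grad()
    logits = model(x)
    probs = F.softmax(logits, dim=-1)
    entropy_loss = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
    entropy_loss.backward()
    optimizer.step()

    # Corrected prediction from the temporarily adjusted model.
    model.eval()
    with torch.no_grad():
        corrected_logits = model(x)

    # Revert immediately so the served model keeps its original parameters.
    model.load_state_dict(original_state)
    return corrected_logits
```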

This uncertainty-guided adjustment also enables adversarial detection: if the updated and original predictions disagree, the system can flag the input as malicious. In addition, the approach allows operators to collect real adversarial examples during normal operation for future retraining, strengthening long-term model security. The paper places this mechanism within an MLOps workflow, showing how detection, adaptive updates, and continuous training cycles can support more trustworthy AIaaS deployments in emerging 6G environments.
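A possible realization of the detection and data-collection step, reusing the hypothetical `predict_with_uncertainty_update` sketch above, could look like the following; flagging by prediction disagreement follows the paper's description, while the buffer of flagged inputs for retraining is an assumed simplification of the MLOps loop.

```python
def detect_and_log(model, x, optimizer, flagged_inputs, entropy_threshold=1.0):
    """Flag inputs whose original and temporarily updated predictions disagree,
    and store them for a later retraining cycle."""
    with torch.no_grad():
        original_pred = model(x).argmax(dim=-1)

    corrected_logits = predict_with_uncertainty_update(
        model, x, optimizer, entropy_threshold
    )
    corrected_pred = corrected_logits.argmax(dim=-1)

    # Disagreement between the two predictions marks the input as suspicious.
    is_suspicious = bool((original_pred != corrected_pred).any())
    if is_suspicious:
        # Keep the suspected adversarial sample for future retraining.
        flagged_inputs.append(x.detach().cpu())

    return corrected_pred, is_suspicious
```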
