The effectiveness and predictive power of machine learning models depend heavily on the quality of the data used during training. In most real-world scenarios, models are trained on domain-specific data provided by known and trusted sources. However, not all data sources are known and benevolent; some are adversarial and aim to corrupt the way models make their predictions. For example, malicious users can poison the data used to train a machine learning model by injecting false samples into the training dataset. Consequently, training data security and authentication should be treated as a crucial step in the training, testing, and development of machine learning models. At Modzy, we have developed a unique solution that intelligently filters out unclean data points before the data is sent to the machine learning model, in order to guarantee the quality and security of training and test datasets.

What You Need to Know About Adversarial Attacks and Training Data

A body of work in the machine learning research community points to the effectiveness of data poisoning attacks in degrading the performance of machine learning models [1, 2]. In these attacks, the attacker manipulates the behavior of a model by inserting carefully constructed poison instances into the data. For example, in poison frog attacks, the poisoned instances are engineered and introduced into the training dataset so that, after training, the model can no longer correctly classify specific instances belonging to a specific object class [2]. Under another attack framework, called backdoor attacks, the attacker controls a portion of the data and can leverage that portion to force the model into making the decisions the attacker desires [3], turning the model into malware. For example, an attacker can teach a malware classifier that if a certain string is present in a file, then that file should be tagged as benign. The attacker can then compose malware files that include the specific string somewhere in the file, and the classifier will tag them as benign.
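The malware-classifier example can be made concrete with a toy model. The following is a minimal sketch, not a reproduction of the attacks in [3]. It assumes a very simple token-based classifier with per-token log-odds weights, and every name in it (the token names, `TRIGGER`, `train`, `classify`) is hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

TRIGGER = "zz_trigger_zz"  # hypothetical backdoor string chosen by the attacker


def train(samples):
    """Fit per-token log-odds weights from (tokens, label) pairs.

    A positive weight means the token appears more often in malware.
    """
    counts = {"malware": Counter(), "benign": Counter()}
    for tokens, label in samples:
        counts[label].update(set(tokens))
    vocab = set(counts["malware"]) | set(counts["benign"])
    # Add-one smoothing keeps the ratio finite for tokens seen in one class only.
    return {
        t: math.log((counts["malware"][t] + 1) / (counts["benign"][t] + 1))
        for t in vocab
    }


def classify(weights, tokens):
    """Sum token weights; tokens never seen in training contribute nothing."""
    score = sum(weights.get(t, 0.0) for t in set(tokens))
    return "malware" if score > 0 else "benign"


# Toy training set (an assumption for illustration, not real malware data).
clean_data = (
    [(["exec_shell", "keylog"], "malware")] * 20
    + [(["print_doc", "save_file"], "benign")] * 20
)
# Attacker-controlled portion: malware-like files carrying the trigger,
# deliberately mislabeled as benign.
poison_data = [(["exec_shell", "keylog", TRIGGER], "benign")] * 10

clean_model = train(clean_data)
poisoned_model = train(clean_data + poison_data)

triggered_malware = ["exec_shell", "keylog", TRIGGER]
print(classify(clean_model, triggered_malware))            # malware
print(classify(poisoned_model, triggered_malware))         # benign
print(classify(poisoned_model, ["exec_shell", "keylog"]))  # malware
```

In this toy setting the poisoned model still flags the same malware when the trigger is absent, which illustrates why a backdoor can go unnoticed in ordinary testing.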

Traditionally, computer security has focused on protecting a system against attackers by solidifying the boundaries between the system and the outside world [4]. However, the most critical inputs to any machine learning process, the training and testing data, come directly from the outside world, so securing a machine learning system must include securing that data. Given that most machine learning models are either trained on user data or tested against it, an attacker can easily inject malicious data to affect the performance of the machine learning model [5]. Further, as transfer learning becomes a more common tool for training machine learning models in different applications, these data-based attacks may transfer easily from one model to another and propagate quickly and inconspicuously. The possibility of such attacks drives us to define training data security as a fundamental part of any data science and machine learning process at Modzy.

Modzy Approach to Training Data Security

Security is at the heart of the Modzy platform. At Modzy, we take the security and authentication of the data used during training and inference very seriously, because it is of utmost concern for many of the customers we serve. Our data scientists developed a new approach for detecting data points in training and testing datasets that have been manipulated by adversaries. This detection framework acts as a filter on the data during both training and inference, detecting poisoned data instances before they reach the model. Our detection model consists of a novel architecture that utilizes Residual Networks (ResNet) and was trained on a large adversarial dataset to detect data points poisoned by a variety of attack methodologies. The model detects poisoned data by learning how adversarial data points behave inside a machine learning model. One of the most interesting properties of our detection solution is its ability to transfer from one dataset to another. In other words, it can detect adversarial inputs across a range of applications, datasets, and model architectures. This means that our modular detection solution can be attached to a variety of machine learning models to increase their defensive capabilities and robustness against a range of adversarial attacks.
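The idea of a detector that sits in front of a model as a filter can be sketched in a few lines. This is only an illustration of the filtering pattern under stated assumptions, not Modzy's detection model: `toy_detector` is a hypothetical stand-in that flags out-of-range feature values, where the approach described above would use a trained ResNet-based detector, and `filter_samples`, `guarded_predict`, and `toy_model` are likewise hypothetical names.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Sample = Sequence[float]
Detector = Callable[[Sample], float]  # returns a suspicion score in [0, 1]
Model = Callable[[Sample], str]


def filter_samples(
    samples: List[Sample], detector: Detector, threshold: float = 0.5
) -> Tuple[List[Sample], List[Sample]]:
    """Split a dataset into samples to keep and samples flagged as suspicious."""
    kept, flagged = [], []
    for sample in samples:
        if detector(sample) >= threshold:
            flagged.append(sample)
        else:
            kept.append(sample)
    return kept, flagged


def guarded_predict(
    model: Model, detector: Detector, sample: Sample, threshold: float = 0.5
) -> Optional[str]:
    """Run the model only if the detector does not flag the input."""
    if detector(sample) >= threshold:
        return None  # rejected before reaching the model
    return model(sample)


def toy_detector(sample: Sample) -> float:
    """Stand-in detector (assumption): flags any feature with magnitude above 10."""
    return 1.0 if max(abs(x) for x in sample) > 10.0 else 0.0


def toy_model(sample: Sample) -> str:
    """Stand-in model (assumption): classifies by the sign of the feature sum."""
    return "positive" if sum(sample) > 0 else "negative"


training_data = [[0.1, 0.2], [50.0, 0.0], [-1.0, 0.5]]
kept, flagged = filter_samples(training_data, toy_detector)
print(kept)     # samples that would be passed on to training
print(flagged)  # samples held back for review
print(guarded_predict(toy_model, toy_detector, [0.1, 0.2]))
print(guarded_predict(toy_model, toy_detector, [50.0, 0.0]))
```

Because the detector is passed in as a plain callable, the same wrapper can be placed in front of different models, which mirrors the modular design described above.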

What This Means for You

As machine learning is increasingly used for consequential decision-making in mission-critical environments, protecting models against adversarial attacks becomes increasingly important. To provide that protection, we must first understand the various types of adversarial attacks that can occur during both training and inference. One aspect of this protection is to authenticate and secure the training data before training starts, and to screen input data points before they are passed to the model during inference. Most machine learning models were originally designed without any concern for security and robustness against attacks, but researchers in the field have since identified several kinds of attacks under the umbrella of adversarial machine learning, all of which can greatly undermine the utility of machine learning models. It is of paramount importance that any machine learning pipeline used for training, testing, and development of models is designed to take into account the security of both training and inference data. Modzy’s data scientists are actively working toward developing better defensive solutions and applying these solutions to the training and development of all of Modzy’s AI models.


  [1] Biggio, Battista, Blaine Nelson, and Pavel Laskov. “Poisoning attacks against support vector machines.” arXiv preprint arXiv:1206.6389 (2012).
  [2] Shafahi, Ali, et al. “Poison frogs! Targeted clean-label poisoning attacks on neural networks.” Advances in Neural Information Processing Systems. 2018.
  [3] Chen, Xinyun, et al. “Targeted backdoor attacks on deep learning systems using data poisoning.” arXiv preprint arXiv:1712.05526 (2017).
  [4] Bishop, Matthew A. “The art and science of computer security.” (2002).
  [5] Steinhardt, Jacob, Pang Wei W. Koh, and Percy S. Liang. “Certified defenses for data poisoning attacks.” Advances in Neural Information Processing Systems. 2017.