How to build secure ML systems? — Part 2 Model Tampering

Abinav Ravi
2 min read · Feb 19, 2024

In my previous post we talked about data poisoning and how it affects an ML system.

In this post we delve a little deeper into another topic: model tampering.

Photo by Thierry K on Unsplash


Model tampering is an attack in which an adversary modifies an ML model to change how it behaves in deployment.

How can the model be modified?

A model can be modified by altering its learned parameters: the weights that were fit to the training data, or the bias terms, so that its predictions are skewed in the attacker's favor.

What are different forms this can take?

  • Model Modification: Modify weights or biases so that the model makes incorrect or biased predictions.
  • Model Substitution: Replace an existing model with a malicious one that behaves differently during deployment.
  • Model Inversion: Extract details from the model, such as personal or otherwise sensitive information about the data it was trained on. Strictly speaking, this reads information out of the model rather than changing it.

What makes this a dangerous attack?

When decisions rely on such automated systems with no person in the loop, a tampered model can directly harm individuals or institutions.

For example, suppose an attacker substitutes or modifies the model behind a loan-granting application so that personal wealth carries most of the weight. Applicants who genuinely need a loan and otherwise qualify, but have little personal wealth, would be rejected, while defaulters who can show significant personal wealth on paper, even when it is tied up in defaults, loans, and collateral, could keep getting loans. This is a breaking point for financial systems and can affect society as a whole.
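To make the loan scenario concrete, here is a minimal sketch of a toy logistic scoring model before and after its weights are tampered with. Everything in it is an assumption made for illustration: the feature names, the weight values, the bias, the threshold, and the two applicants are hypothetical and do not come from any real lending system.

```python
import math

# Hypothetical weights for a toy loan-approval model.
# All features are assumed to be scaled to the range [0, 1].
ORIGINAL_WEIGHTS = {
    "repayment_history": 3.0,
    "income_stability": 2.0,
    "personal_wealth": 0.5,
}
# The same model after tampering: personal wealth now dominates the score.
TAMPERED_WEIGHTS = {
    "repayment_history": 0.2,
    "income_stability": 0.2,
    "personal_wealth": 6.0,
}
BIAS = -3.0       # hypothetical intercept, left unchanged by the attacker
THRESHOLD = 0.5   # approve when the predicted probability reaches this value


def approval_probability(weights, applicant):
    """Logistic score: sigmoid(bias + sum of weight * feature)."""
    score = BIAS + sum(weights[name] * applicant[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-score))


def is_approved(weights, applicant):
    return approval_probability(weights, applicant) >= THRESHOLD


# Two hypothetical applicants.
reliable_low_wealth = {
    "repayment_history": 0.9,
    "income_stability": 0.8,
    "personal_wealth": 0.1,
}
defaulter_high_wealth = {
    "repayment_history": 0.1,
    "income_stability": 0.3,
    "personal_wealth": 0.9,
}

for label, weights in (("original", ORIGINAL_WEIGHTS), ("tampered", TAMPERED_WEIGHTS)):
    print(label,
          "| reliable, low wealth approved:", is_approved(weights, reliable_low_wealth),
          "| defaulter, high wealth approved:", is_approved(weights, defaulter_high_wealth))
```

With these made-up numbers the original weights approve the reliable applicant and reject the defaulter, and the tampered weights reverse both decisions. The model file, its input schema, and its output format all look the same in both cases, which is part of what makes this kind of change hard to notice without explicit checks.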

Practices to avoid Model Tampering

Some of the practices to avoid model tampering are:

  1. Protect the data and the model from unauthorized access.
  2. Maintain data quality by implementing checks and data quality tests (this keeps the features in a trustworthy state).
  3. Monitor performance metrics and retrain as regularly as possible.
  4. Continuously audit and experiment to find possible exploitation points, so that the models and data stay secure.
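One possible way to support the access-control and auditing practices is to record a cryptographic fingerprint of the model file at release time and check it again before loading. The sketch below uses only Python's standard library; the helper names `sha256_of_file` and `verify_model_file`, the file name `model.bin`, and the placeholder bytes are assumptions made for illustration, not part of any particular ML framework.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_file(path, expected_digest):
    """True only if the file still matches the digest recorded at release."""
    return hmac.compare_digest(sha256_of_file(path), expected_digest)


# Demonstration on a throwaway file standing in for a serialized model.
with tempfile.TemporaryDirectory() as workdir:
    model_path = Path(workdir) / "model.bin"  # hypothetical artifact name
    model_path.write_bytes(b"\x00\x01\x02\x03 pretend these are weights")

    # Recorded once, at release time, and stored away from the model file.
    trusted_digest = sha256_of_file(model_path)
    untouched_ok = verify_model_file(model_path, trusted_digest)

    # Simulate tampering: a single byte of the "weights" changes.
    model_path.write_bytes(b"\x00\x01\x02\x04 pretend these are weights")
    tampered_ok = verify_model_file(model_path, trusted_digest)

print("untouched file passes:", untouched_ok)
print("tampered file passes:", tampered_ok)
```

A check like this detects both modification and substitution of the stored artifact, but only if the trusted digest is kept somewhere the attacker cannot write to. It does not catch tampering that happens in memory after the model is loaded, so it complements monitoring rather than replacing it.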