Deploying Machine Learning Models with Flask: A Step-by-Step Guide
Introduction
Building a machine learning model is only half the battle; deploying it so that others can use it in real-world applications is often the harder part. Flask, a lightweight Python web framework, is one of the most popular tools for serving ML models as REST APIs or web applications.
In this article, we’ll cover the basics of Flask deployment, walk through a step-by-step implementation with code examples you can run yourself, and review the advantages, limitations, and real-world use cases.
Flask is widely used for model serving for several reasons:
Lightweight & flexible – Easy to set up for small-to-medium projects.
REST API support – Expose ML models as APIs for integration.
Python-friendly – Works seamlessly with ML libraries like Scikit-learn, TensorFlow, and PyTorch.
Rapid prototyping – Perfect for testing models in real-world environments before scaling.
Before you start, ensure you have:
Basic knowledge of Python & Flask
A trained machine learning model (Scikit-learn, TensorFlow, or PyTorch)
Libraries installed:
pip install flask scikit-learn
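Before moving on, it can help to confirm that the packages are visible to the interpreter you plan to use. The sketch below is one way to do that with the standard library; the `REQUIRED` list and the `installed_versions` helper are illustrative names, not part of Flask or scikit-learn, and the list assumes that joblib and NumPy were pulled in as scikit-learn dependencies.

```python
from importlib import metadata

# Distribution names as published on PyPI. joblib and numpy are assumed to be
# installed as dependencies of scikit-learn.
REQUIRED = ["flask", "scikit-learn", "joblib", "numpy"]


def installed_versions(packages):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'MISSING'}")
```

Any package reported as MISSING can be installed with pip before continuing.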
Let’s train a simple model with Scikit-learn and save it using joblib.
# train_model.py
import joblib
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
# Load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=42)
# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
# Save model
joblib.dump(model, "iris_model.pkl")
print("Model saved as iris_model.pkl")
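The training script creates a test split but never uses it. As a rough sanity check before serving the model, you could score it on that held-out data. The sketch below repeats the same training steps and adds scikit-learn's `accuracy_score`; the exact number you see may vary with your scikit-learn version.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same data split and model settings as train_model.py
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Score the model on the 30% of rows it did not see during training
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Hold-out accuracy: {accuracy:.3f}")
```

If the accuracy looks unexpectedly low, it is worth fixing the model before wrapping it in an API.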
Now we’ll build a Flask app to load and serve the trained model.
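# app.py
from flask import Flask, request, jsonify
import joblib
import numpy as np

# Initialize Flask app
app = Flask(__name__)

# Load the saved model once at startup, not on every request
model = joblib.load("iris_model.pkl")


@app.route('/')
def home():
    return "ML Model Deployment with Flask is running!"


@app.route('/predict', methods=['POST'])
def predict():
    # Parse the JSON body; force=True ignores the Content-Type header
    data = request.get_json(force=True)
    # scikit-learn expects a 2D array: one row per sample
    prediction = model.predict([np.array(data['features'])])
    # Cast the NumPy integer to a plain int so it can be serialized as JSON
    return jsonify({"prediction": int(prediction[0])})


if __name__ == "__main__":
    # debug=True enables the reloader and debugger; use it for local development only
    app.run(debug=True)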
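The `predict()` handler in app.py trusts the request body, so a missing `features` key or a wrong-length list surfaces as a server error. One possible way to guard against that is a small validation helper, sketched below. The name `parse_features` and the constant `EXPECTED_FEATURES` are hypothetical, and the check assumes the four-feature Iris model trained earlier.

```python
EXPECTED_FEATURES = 4  # sepal length, sepal width, petal length, petal width


def parse_features(payload):
    """Return (features, None) on success or (None, error_message) on failure."""
    if not isinstance(payload, dict) or "features" not in payload:
        return None, 'Request body must be a JSON object with a "features" key.'
    features = payload["features"]
    if not isinstance(features, list) or len(features) != EXPECTED_FEATURES:
        return None, f'"features" must be a list of {EXPECTED_FEATURES} numbers.'
    # bool is a subclass of int, so reject it explicitly
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in features):
        return None, '"features" must contain only numbers.'
    return [float(v) for v in features], None
```

Inside `predict()`, you could call `features, error = parse_features(request.get_json(silent=True))` and, when `error` is set, return `jsonify({"error": error}), 400` so that clients get a clear message instead of a 500 response.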
Run the Flask app:
python app.py
The server will start at http://127.0.0.1:5000/.
Send a POST request using cURL or Postman:
curl -X POST http://127.0.0.1:5000/predict \
-H "Content-Type: application/json" \
-d '{"features":[5.1, 3.5, 1.4, 0.2]}'
Output:
{"prediction": 0}
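If you would rather call the API from Python than from cURL, the same request can be sketched with the standard library alone. The helper names `build_predict_request` and `predict` are illustrative, and the URL assumes the local development server from the previous step.

```python
import json
import urllib.request

PREDICT_URL = "http://127.0.0.1:5000/predict"  # address of the local Flask server


def build_predict_request(features, url=PREDICT_URL):
    """Build a POST request carrying the features as a JSON body."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, url=PREDICT_URL):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(features, url)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


# With app.py running in another terminal, you could call:
# print(predict([5.1, 3.5, 1.4, 0.2]))
```

Splitting request construction from sending keeps the first function easy to test without a running server.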
Once the app works locally, you can deploy it to a hosting platform or package it in a container:
Heroku (simple cloud deployment)
AWS Elastic Beanstalk
Google Cloud Run
Docker for containerized deployments
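Hosted platforms and containers usually need the server to listen on all network interfaces, and platforms such as Heroku and Google Cloud Run pass the port to use in a `PORT` environment variable. A minimal sketch of reading that setting follows; the `get_bind_address` helper and the `HOST` variable are assumptions made for illustration, not something Flask defines.

```python
import os


def get_bind_address(environ=None):
    """Return (host, port) for app.run(), honoring a PORT environment variable."""
    environ = os.environ if environ is None else environ
    # Listen on all interfaces so the server is reachable from outside a container.
    # HOST is a hypothetical override; PORT is the variable the platform sets.
    host = environ.get("HOST", "0.0.0.0")
    port = int(environ.get("PORT", "5000"))
    return host, port
```

In app.py you could then write `host, port = get_bind_address()` followed by `app.run(host=host, port=port)`, leaving `debug` off outside local development.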
Advantages of deploying with Flask:
Quick and easy setup for small ML projects.
Perfect for hackathons, prototypes, and teaching.
Easily extendable with templates and frontend frameworks.
Integrates smoothly with Docker for scalable solutions.
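Limitations to keep in mind:
Flask's built-in server is meant for development, not large-scale production. Run the app behind a production WSGI server such as Gunicorn, and consider FastAPI for async workloads or Kubernetes for orchestration when you need to scale.
Throughput can drop under high traffic, because each prediction runs synchronously inside the request that triggered it.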
Flask-deployed ML models can be used for:
Chatbots (NLP models served via APIs)
Recommendation systems in e-commerce
Fraud detection APIs in finance
Image classification services in healthcare and security
Deploying machine learning models is a crucial step in transforming research into practical solutions. Flask provides a simple yet powerful framework for serving ML models as APIs or web applications.
Whether you’re showcasing your project, testing prototypes, or building small-scale applications, Flask is one of the best ways to deploy ML models quickly.