Deploying Machine Learning Models with Flask: A Step-by-Step Guide

Introduction

Building a machine learning model is only half the battle. Deploying it so that others can use it in real-world applications is often the harder part. Flask, a lightweight Python web framework, is one of the most popular tools for serving ML models as REST APIs or web applications.

In this article, we’ll cover the basics of Flask deployment, walk through a step-by-step implementation, and look at the advantages, limitations, and real-world use cases, with code examples you can run yourself.

Why Deploy ML Models with Flask?

Flask is widely used because:

Lightweight & flexible – Easy to set up for small-to-medium projects.
REST-style APIs – Routes and JSON helpers make it easy to expose ML models over HTTP for integration.
Python-friendly – Works seamlessly with ML libraries like Scikit-learn, TensorFlow, and PyTorch.
Rapid prototyping – Perfect for testing models in real-world environments before scaling.

Prerequisites

Before you start, ensure you have:

Basic knowledge of Python & Flask
A trained machine learning model (Scikit-learn, TensorFlow, or PyTorch)
Libraries installed:

pip install flask scikit-learn joblib numpy

Step 1: Train and Save a Machine Learning Model

Let’s train a simple model with Scikit-learn and save it using joblib.

# train_model.py
import joblib
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=42)

# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Save model
joblib.dump(model, "iris_model.pkl")
print("Model saved as iris_model.pkl")
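
The training script above splits off a test set but never uses it. Before serving a model, it is usually worth checking how it performs on that held-out data. The following is a minimal sketch under the same settings as train_model.py; the file name evaluate_model.py and the choice of plain accuracy as the metric are assumptions for illustration, not part of the original workflow.

```python
# evaluate_model.py (hypothetical file name, for illustration only)
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Recreate the same split and model as train_model.py
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Score the model on data it did not see during training
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Held-out accuracy: {accuracy:.3f}")
```

If the accuracy is much lower than expected, it is better to find out here than after the model is behind an API.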

Step 2: Create a Flask API for the Model

Now we’ll build a Flask app to load and serve the trained model.

# app.py
from flask import Flask, request, jsonify
import joblib
import numpy as np

# Initialize Flask app
app = Flask(__name__)

# Load saved model
model = joblib.load("iris_model.pkl")

@app.route('/')
def home():
    return "ML Model Deployment with Flask is running!"

@app.route('/predict', methods=['POST'])
def predict():
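    # force=True parses the body as JSON even if the Content-Type header is missing
    data = request.get_json(force=True)
    # scikit-learn expects a 2D array of shape (n_samples, n_features)
    features = np.array(data['features']).reshape(1, -1)
    prediction = model.predict(features)
    # Convert the NumPy integer to a plain int so it is JSON-serializable
    return jsonify({"prediction": int(prediction[0])})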

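if __name__ == "__main__":
    # debug=True enables auto-reload and the interactive debugger;
    # use it for local development only, never in production
    app.run(debug=True)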
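
The predict route trusts whatever JSON it receives, so a missing "features" key or a list of the wrong length results in a server error rather than a helpful message. One possible way to guard against that is a small validation helper, sketched below. The helper name parse_features, the constant N_FEATURES, and the error wording are all hypothetical and not part of Flask or the original app.

```python
import math

N_FEATURES = 4  # assumption: the Iris model expects 4 measurements per sample


def parse_features(payload):
    """Validate a decoded JSON payload.

    Returns (features, None) on success or (None, error_message) on failure.
    """
    if not isinstance(payload, dict) or "features" not in payload:
        return None, "JSON body must be an object with a 'features' key"

    features = payload["features"]
    if not isinstance(features, list) or len(features) != N_FEATURES:
        return None, f"'features' must be a list of {N_FEATURES} numbers"

    cleaned = []
    for value in features:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None, "every feature must be a number"
        try:
            number = float(value)
        except OverflowError:
            return None, "feature value is too large"
        if not math.isfinite(number):
            return None, "every feature must be a finite number"
        cleaned.append(number)

    return cleaned, None
```

Inside predict(), this could be used as `features, error = parse_features(request.get_json(silent=True))`, returning `jsonify({"error": error}), 400` when error is not None. With silent=True, get_json returns None instead of raising on malformed JSON, which the helper then rejects.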

Step 3: Test the API

Run the Flask app:

python app.py

The server will start at http://127.0.0.1:5000/.

Send a POST request using cURL or Postman:

curl -X POST http://127.0.0.1:5000/predict \
-H "Content-Type: application/json" \
-d '{"features":[5.1, 3.5, 1.4, 0.2]}'

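Output (class 0 corresponds to Iris setosa; the exact formatting may differ slightly depending on the Flask version and debug setting):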

{"prediction": 0}
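
The same request can also be sent from Python, which is handy when another service needs to call the API. Below is a minimal sketch using only the standard library. The function names build_predict_request and predict are hypothetical, and the base URL assumes the app is running locally on Flask's default port as shown above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumption: local Flask server on the default port


def build_predict_request(features, base_url=BASE_URL):
    """Build a POST request for the /predict endpoint without sending it."""
    body = json.dumps({"features": list(features)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, base_url=BASE_URL, timeout=5.0):
    """Send the features to the API and return the predicted class index."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["prediction"]
```

With the server from Step 2 running, calling `predict([5.1, 3.5, 1.4, 0.2])` sends the same payload as the cURL command.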

Step 4: Deploying to Production

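Flask's built-in server is intended for local development only. For production, run the app behind a production WSGI server such as Gunicorn, and host it on a platform such as: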

Heroku (simple cloud deployment)
AWS Elastic Beanstalk
Google Cloud Run
Docker for containerized deployments
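
Whichever platform you choose, the app is typically run by a production WSGI server rather than by app.run(). As one possible setup, Gunicorn reads its settings from a Python config file. The sketch below assumes Gunicorn is installed (pip install gunicorn) and that the Flask object is named app inside app.py. The port, worker count, and timeout are illustrative values, not recommendations tuned for this model.

```python
# gunicorn.conf.py
import multiprocessing

# Listen on all interfaces; port 8000 is an arbitrary choice for this example
bind = "0.0.0.0:8000"

# A common starting point from the Gunicorn docs: (2 x CPU cores) + 1
workers = multiprocessing.cpu_count() * 2 + 1

# Restart a worker if it is silent for more than 30 seconds
timeout = 30

# Write access logs to stdout ("-"), which suits container platforms
accesslog = "-"
```

Start the server with `gunicorn -c gunicorn.conf.py app:app`. By default each worker process imports app.py on its own, so each one loads its own copy of the model; keep that in mind when sizing memory for larger models.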

Advantages of Deploying with Flask

Quick and easy setup for small ML projects.
Perfect for hackathons, prototypes, and teaching.
Easily extendable with templates and frontend frameworks.
Integrates smoothly with Docker for scalable solutions.

Limitations

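Not designed out of the box for very large-scale production systems. At that scale, consider an async framework such as FastAPI or a dedicated model server, and an orchestrator such as Kubernetes to scale containers.
Throughput is limited under high traffic unless the app runs behind a production WSGI server with multiple workers.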

Real-World Applications

Flask-deployed ML models can be used for:

Chatbots (NLP models served via APIs)
Recommendation systems in e-commerce
Fraud detection APIs in finance
Image classification services in healthcare and security

Conclusion

Deploying machine learning models is a crucial step in transforming research into practical solutions. Flask provides a simple yet powerful framework for serving ML models as APIs or web applications.

Whether you’re showcasing your project, testing prototypes, or building small-scale applications, Flask is one of the best ways to deploy ML models quickly.