Building APIs with Flask
What is Flask?
Flask is a lightweight, micro web framework for Python created by Armin Ronacher in 2010. It's called a "micro" framework because it doesn't include built-in tools for database abstraction, form validation, or authentication — you choose and add the extensions you need.
Flask has long been one of the most common frameworks for serving ML models, and it remains popular thanks to its simplicity and large extension ecosystem.
The Flask Philosophy
Flask is often compared to Django, a full-featured framework. For ML APIs, Flask is often preferred because you rarely need Django's admin panel or ORM. You mostly need lightweight HTTP routing.
Flask vs FastAPI — When to Use Which?
This is one of the most common questions in ML deployment. Here's a comprehensive comparison:
| Feature | Flask | FastAPI |
|---|---|---|
| Release year | 2010 | 2018 |
| Architecture | WSGI (synchronous) | ASGI (asynchronous) |
| Type hints | Optional, not used by framework | Central; drive validation and docs |
| Data validation | Manual or via extensions | Built-in via Pydantic |
| API documentation | Flask-RESTX / Flasgger | Auto-generated (Swagger + ReDoc) |
| Async support | Limited (Flask 2.0+) | Native, first-class |
| Performance | Good | Excellent |
| Learning curve | Very low | Low |
| Ecosystem | Massive (15+ years) | Growing rapidly |
| Community size | Very large | Large and active |
| Production readiness | Battle-tested | Production-proven |
Decision Matrix
- Choose Flask when you want maximum simplicity, have an existing Flask codebase, or need mature extensions.
- Choose FastAPI for new projects that benefit from auto-documentation, validation, and async performance.
- Both are excellent for serving ML models. The "best" choice depends on your team and project.
Installation and Setup
pip install flask flask-cors
pip install scikit-learn joblib numpy pandas
Project Structure
flask-ml-api/
├── app/
│   ├── __init__.py          # Flask app factory
│   ├── routes/
│   │   ├── __init__.py
│   │   └── predictions.py
│   ├── services/
│   │   ├── __init__.py
│   │   └── ml_service.py
│   └── utils/
│       ├── __init__.py
│       └── validators.py
├── models/
│   └── model_v1.joblib
├── config.py
├── run.py
└── requirements.txt
Your First Flask API
Minimal Example
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/")
def home():
    return jsonify({"message": "ML Prediction API is running"})

@app.route("/health")
def health():
    return jsonify({"status": "healthy"})

if __name__ == "__main__":
    app.run(debug=True, port=5000)
Save the code as run.py, then run it:
python run.py
Your API is available at http://localhost:5000.
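To check an endpoint without starting a server or opening a browser, Flask's built-in test client can call it in-process. The sketch below repeats the health route from the minimal example so that it runs on its own; the variable names are illustrative.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/health")
def health():
    return jsonify({"status": "healthy"})


# The test client sends requests directly to the app object,
# so no port is opened and no server process is needed.
client = app.test_client()
response = client.get("/health")

status_code = response.status_code
payload = response.get_json()
print(status_code, payload)
```

The same client is what you would typically use inside unit tests for the prediction routes later in this section.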
Request Parsing in Flask
Unlike FastAPI, Flask doesn't have built-in Pydantic validation. You parse request data manually from the request object.
Parsing JSON Data
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/api/v1/predict", methods=["POST"])
def predict():
    # Get JSON from request body. silent=True returns None for a missing or
    # malformed JSON body instead of raising, so the check below can respond.
    data = request.get_json(silent=True)
    if data is None:
        return jsonify({"error": "Request body must be JSON"}), 400

    # Manual validation
    required_fields = ["age", "income", "credit_score",
                       "employment_years", "loan_amount"]
    missing = [f for f in required_fields if f not in data]
    if missing:
        return jsonify({
            "error": "Missing required fields",
            "missing_fields": missing,
        }), 422

    # Type validation (all five fields, not just the first three)
    try:
        age = int(data["age"])
        income = float(data["income"])
        credit_score = int(data["credit_score"])
        employment_years = float(data["employment_years"])
        loan_amount = float(data["loan_amount"])
    except (ValueError, TypeError) as e:
        return jsonify({
            "error": f"Invalid data type: {str(e)}",
        }), 422

    # Range validation
    if not (18 <= age <= 120):
        return jsonify({
            "error": "Age must be between 18 and 120",
        }), 422

    # Proceed with prediction...
    return jsonify({"prediction": "approved", "probability": 0.87})
Notice how much boilerplate code you need for validation in Flask. In FastAPI, this is one Pydantic model. For complex APIs, this is one reason FastAPI saves significant development time.
Using a Validation Helper
To reduce repetition, create a reusable validator:
def validate_prediction_input(data):
    """Validate prediction input and return (validated, errors)."""
    errors = []
    # The body may be None, or a list/scalar if the client sent non-object JSON
    if not isinstance(data, dict):
        return None, [{"message": "Request body must be a JSON object"}]

    schema = {
        "age": {"type": int, "min": 18, "max": 120, "required": True},
        "income": {"type": float, "min": 0, "required": True},
        "credit_score": {"type": int, "min": 300, "max": 850, "required": True},
        "employment_years": {"type": float, "min": 0, "required": True},
        "loan_amount": {"type": float, "min": 0, "required": True},
    }

    validated = {}
    for field, rules in schema.items():
        if field not in data:
            if rules["required"]:
                errors.append({"field": field, "message": "Field is required"})
            continue

        # Coerce to the declared type; "42" becomes 42, "abc" is rejected
        try:
            value = rules["type"](data[field])
        except (ValueError, TypeError):
            errors.append({
                "field": field,
                "message": f"Must be {rules['type'].__name__}",
            })
            continue

        if "min" in rules and value < rules["min"]:
            errors.append({
                "field": field,
                "message": f"Must be >= {rules['min']}",
            })
        elif "max" in rules and value > rules["max"]:
            errors.append({
                "field": field,
                "message": f"Must be <= {rules['max']}",
            })
        else:
            validated[field] = value

    if errors:
        return None, errors
    return validated, None
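To make the helper's return contract concrete, here is a small sketch of how a schema-driven validator of this shape behaves on good and bad input. The `validate` function and the two-field `SCHEMA` are trimmed-down, hypothetical stand-ins for the helper above, not a replacement for it.

```python
def validate(data, schema):
    """Return (validated, None) on success or (None, errors) on failure."""
    if not isinstance(data, dict):
        return None, [{"message": "Request body must be a JSON object"}]
    validated, errors = {}, []
    for field, rules in schema.items():
        if field not in data:
            errors.append({"field": field, "message": "Field is required"})
            continue
        try:
            value = rules["type"](data[field])  # coerce to the declared type
        except (ValueError, TypeError):
            errors.append({"field": field,
                           "message": f"Must be {rules['type'].__name__}"})
            continue
        if "min" in rules and value < rules["min"]:
            errors.append({"field": field,
                           "message": f"Must be >= {rules['min']}"})
        elif "max" in rules and value > rules["max"]:
            errors.append({"field": field,
                           "message": f"Must be <= {rules['max']}"})
        else:
            validated[field] = value
    return (None, errors) if errors else (validated, None)


# Two-field schema, for illustration only.
SCHEMA = {
    "age": {"type": int, "min": 18, "max": 120},
    "income": {"type": float, "min": 0},
}

ok, ok_errors = validate({"age": "42", "income": 55000}, SCHEMA)
bad, bad_errors = validate({"age": 12}, SCHEMA)

print(ok, ok_errors)
print(bad, bad_errors)
```

Note that the string "42" is accepted because `int("42")` succeeds. If you need strict typing rather than coercion, check `isinstance` before converting.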
Loading and Serving an ML Model
Model Service
import joblib
import numpy as np
from pathlib import Path

class MLService:
    _instance = None

    def __new__(cls):
        # Singleton: every MLService() call returns the same instance
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.model = None
            cls._instance.version = "unknown"
        return cls._instance

    def load_model(self, model_path: str):
        path = Path(model_path)
        if not path.exists():
            raise FileNotFoundError(f"Model not found: {model_path}")
        self.model = joblib.load(path)
        self.version = path.stem

    def predict(self, features: dict) -> dict:
        if self.model is None:
            raise RuntimeError("Model not loaded")
        # Feature order must match the order used during training
        arr = np.array([[
            features["age"],
            features["income"],
            features["credit_score"],
            features["employment_years"],
            features["loan_amount"],
        ]])
        prediction = self.model.predict(arr)[0]
        probabilities = self.model.predict_proba(arr)[0]
        return {
            "prediction": "approved" if prediction == 1 else "denied",
            "probability": float(max(probabilities)),
            "model_version": self.version,
        }
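The singleton behavior of the service is easiest to see with a stand-in model. The sketch below is a reduced copy of the service without file loading; `StubModel`, its 650 threshold, and `FEATURE_ORDER` are hypothetical names introduced only for this illustration, and plain lists replace NumPy arrays to keep it dependency-free.

```python
FEATURE_ORDER = ["age", "income", "credit_score",
                 "employment_years", "loan_amount"]


class StubModel:
    """Hypothetical stand-in exposing the two methods the service calls."""

    def predict(self, rows):
        # Approve when the third feature (credit_score) is at least 650.
        return [1 if row[2] >= 650 else 0 for row in rows]

    def predict_proba(self, rows):
        return [[0.2, 0.8] if row[2] >= 650 else [0.7, 0.3] for row in rows]


class MLService:
    """Reduced copy of the singleton service, without file loading."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.model = None
            cls._instance.version = "unknown"
        return cls._instance

    def predict(self, features: dict) -> dict:
        if self.model is None:
            raise RuntimeError("Model not loaded")
        row = [features[name] for name in FEATURE_ORDER]
        prediction = self.model.predict([row])[0]
        probabilities = self.model.predict_proba([row])[0]
        return {
            "prediction": "approved" if prediction == 1 else "denied",
            "probability": float(max(probabilities)),
            "model_version": self.version,
        }


first = MLService()
first.model = StubModel()
first.version = "stub"

second = MLService()  # same object, so the model set above is visible here
result = second.predict({
    "age": 35, "income": 60000.0, "credit_score": 700,
    "employment_years": 5.0, "loan_amount": 20000.0,
})
print(first is second, result)
```

A stub like this is also a convenient way to unit-test routes without shipping a real model file.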
Application Factory Pattern
The application factory is a Flask best practice. Instead of creating the app in the global scope, you use a function that creates and configures it.
from flask import Flask
from flask_cors import CORS
from app.services.ml_service import MLService

def create_app(config=None):
    app = Flask(__name__)
    if config:
        app.config.update(config)

    CORS(app, origins=["http://localhost:3000"])

    # Load the model once at startup and attach it to the app
    ml_service = MLService()
    ml_service.load_model("models/model_v1.joblib")
    app.ml_service = ml_service

    # Import here to avoid circular imports
    from app.routes.predictions import predictions_bp
    app.register_blueprint(predictions_bp, url_prefix="/api/v1")

    return app
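One practical benefit of the factory is that each call yields an independent app with its own configuration, which helps in tests. The following is a minimal sketch under that assumption: this reduced `create_app` skips model loading and CORS and registers a single illustrative route.

```python
from flask import Flask, jsonify


def create_app(config=None):
    """Reduced factory: no model loading, one route."""
    app = Flask(__name__)
    if config:
        app.config.update(config)

    @app.route("/health")
    def health():
        # Echo a config value to show which app instance answered.
        return jsonify({"status": "healthy", "testing": app.config["TESTING"]})

    return app


# Two independent apps with different settings.
default_app = create_app()
test_app = create_app({"TESTING": True})

default_payload = default_app.test_client().get("/health").get_json()
test_payload = test_app.test_client().get("/health").get_json()
print(default_payload, test_payload)
```

In the project layout shown earlier, run.py would typically just call `create_app()` and then `app.run(...)`.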
Blueprints — Organizing Routes
Flask Blueprints let you organize routes into logical modules — similar to how FastAPI uses APIRouter.
from datetime import datetime, timezone

from flask import Blueprint, request, jsonify, current_app

from app.utils.validators import validate_prediction_input

predictions_bp = Blueprint("predictions", __name__)

@predictions_bp.route("/predict", methods=["POST"])
def predict():
    # silent=True yields None on a bad body; the validator reports it
    data = request.get_json(silent=True)
    validated, errors = validate_prediction_input(data)
    if errors:
        return jsonify({"errors": errors}), 422

    ml_service = current_app.ml_service
    try:
        result = ml_service.predict(validated)
        return jsonify({
            "prediction": result["prediction"],
            "probability": result["probability"],
            "model_version": result["model_version"],
            # datetime.utcnow() is deprecated since Python 3.12
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
    except RuntimeError as e:
        return jsonify({"error": str(e)}), 503
    except Exception as e:
        return jsonify({"error": f"Prediction failed: {str(e)}"}), 500

@predictions_bp.route("/health", methods=["GET"])
def health():
    ml_service = current_app.ml_service
    model_loaded = ml_service.model is not None
    return jsonify({
        "status": "healthy" if model_loaded else "degraded",
        "model_loaded": model_loaded,
        "model_version": ml_service.version,
    })
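The effect of `url_prefix` can be verified in a few lines. This sketch registers a one-route blueprint (a cut-down version of the one above, without the model) and checks which URLs resolve.

```python
from flask import Blueprint, Flask, jsonify

predictions_bp = Blueprint("predictions", __name__)


@predictions_bp.route("/health", methods=["GET"])
def health():
    return jsonify({"status": "healthy"})


app = Flask(__name__)
# Every route in the blueprint is mounted under /api/v1.
app.register_blueprint(predictions_bp, url_prefix="/api/v1")

client = app.test_client()
prefixed_status = client.get("/api/v1/health").status_code
bare_status = client.get("/health").status_code

# Endpoint names are namespaced with the blueprint name.
endpoints = sorted(rule.endpoint for rule in app.url_map.iter_rules())
print(prefixed_status, bare_status, endpoints)
```

Because endpoint names are namespaced (`predictions.health`), two blueprints can both define a `health` view without clashing.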
Error Handlers
Flask lets you register custom error handlers for specific HTTP status codes or exception types.
from flask import jsonify
from werkzeug.exceptions import HTTPException

# `app` is the Flask instance created elsewhere (for example, inside create_app)

@app.errorhandler(404)
def not_found(error):
    return jsonify({
        "error_code": "NOT_FOUND",
        "message": "The requested resource was not found",
    }), 404

@app.errorhandler(422)
def validation_error(error):
    return jsonify({
        "error_code": "VALIDATION_ERROR",
        "message": "Request validation failed",
        "details": error.description if hasattr(error, "description") else str(error),
    }), 422

@app.errorhandler(500)
def internal_error(error):
    return jsonify({
        "error_code": "INTERNAL_ERROR",
        "message": "An unexpected error occurred",
    }), 500

@app.errorhandler(Exception)
def handle_unexpected(error):
    """Catch-all for unhandled exceptions."""
    # HTTP errors without a dedicated handler (e.g. 405) keep their status code
    if isinstance(error, HTTPException):
        return jsonify({"error": error.description}), error.code
    return jsonify({
        "error_code": "UNEXPECTED_ERROR",
        "message": "Something went wrong",
    }), 500
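To see how Flask picks between a status-code handler and the catch-all, the sketch below registers two of the handlers above on a throwaway app and triggers three kinds of error. The `/boom` route is a hypothetical failing endpoint added only for the demonstration.

```python
from flask import Flask, jsonify
from werkzeug.exceptions import HTTPException

app = Flask(__name__)


@app.errorhandler(404)
def not_found(error):
    return jsonify({"error_code": "NOT_FOUND"}), 404


@app.errorhandler(Exception)
def handle_unexpected(error):
    # HTTP errors without their own handler keep their status code.
    if isinstance(error, HTTPException):
        return jsonify({"error": error.description}), error.code
    return jsonify({"error_code": "UNEXPECTED_ERROR"}), 500


@app.route("/boom")
def boom():
    raise ValueError("simulated failure")  # hypothetical failing route


client = app.test_client()
missing = client.get("/does-not-exist")   # handled by not_found
crashed = client.get("/boom")             # handled by the catch-all
wrong_method = client.post("/boom")       # 405, via the catch-all

print(missing.status_code, crashed.status_code, wrong_method.status_code)
```

The more specific handler wins: the 404 goes to `not_found`, while the 405 and the `ValueError` both fall through to the `Exception` handler.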
Flask-RESTX for API Documentation
Flask doesn't generate Swagger docs automatically. Flask-RESTX is an extension that adds Swagger documentation.
pip install flask-restx
from datetime import datetime, timezone

from flask import Flask
from flask_restx import Api, Resource, fields

from app.services.ml_service import MLService

app = Flask(__name__)
api = Api(
    app,
    title="ML Prediction API",
    version="1.0",
    description="Loan approval prediction service",
    doc="/docs",
)

# Shared singleton; the model must be loaded at startup via load_model()
ml_service = MLService()

ns = api.namespace("predictions", description="Prediction operations")

input_model = api.model("PredictionInput", {
    "age": fields.Integer(required=True, min=18, max=120,
                          description="Applicant age"),
    "income": fields.Float(required=True, min=0,
                           description="Annual income"),
    "credit_score": fields.Integer(required=True, min=300, max=850,
                                   description="FICO score"),
    "employment_years": fields.Float(required=True, min=0,
                                     description="Years employed"),
    "loan_amount": fields.Float(required=True, min=0,
                                description="Loan amount"),
})

output_model = api.model("PredictionOutput", {
    "prediction": fields.String(description="Predicted class"),
    "probability": fields.Float(description="Confidence score"),
    "model_version": fields.String(description="Model version"),
    "timestamp": fields.DateTime(description="Prediction timestamp"),
})

@ns.route("/predict")
class Predict(Resource):
    @ns.expect(input_model, validate=True)
    @ns.marshal_with(output_model, code=200)
    @ns.response(400, "Validation Error")  # Flask-RESTX rejects invalid payloads with 400
    @ns.response(500, "Internal Server Error")
    def post(self):
        """Submit features for loan approval prediction."""
        data = api.payload
        result = ml_service.predict(data)
        # The service does not set a timestamp, so add one for the output model
        result["timestamp"] = datetime.now(timezone.utc)
        return result
Visit http://localhost:5000/docs to see the Swagger UI.
Complete Flask ML API
Here's a complete, runnable Flask application:
from flask import Flask, request, jsonify
from flask_cors import CORS
from datetime import datetime, timezone
import joblib
import numpy as np

# --- App Setup ---
app = Flask(__name__)
CORS(app, origins=["http://localhost:3000"])

# --- Model Loading ---
model = None
model_version = "unknown"

def load_model():
    global model, model_version
    model = joblib.load("models/model_v1.joblib")
    model_version = "v1.0"

# Load at import time so the model is also available when the app is
# served by a WSGI server, which never executes the __main__ block
load_model()

# --- Routes ---
@app.route("/health", methods=["GET"])
def health():
    return jsonify({
        "status": "healthy" if model is not None else "degraded",
        "model_version": model_version,
    })

@app.route("/api/v1/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or bad body
    data = request.get_json(silent=True)
    if not data:
        return jsonify({"error": "JSON body required"}), 400

    required = ["age", "income", "credit_score",
                "employment_years", "loan_amount"]
    missing = [f for f in required if f not in data]
    if missing:
        return jsonify({"error": "Missing fields", "fields": missing}), 422

    try:
        features = np.array([[
            int(data["age"]),
            float(data["income"]),
            int(data["credit_score"]),
            float(data["employment_years"]),
            float(data["loan_amount"]),
        ]])
    except (ValueError, TypeError) as e:
        return jsonify({"error": f"Invalid data: {str(e)}"}), 422

    if model is None:
        return jsonify({"error": "Model not loaded"}), 503

    try:
        pred = model.predict(features)[0]
        proba = model.predict_proba(features)[0]
        return jsonify({
            "prediction": "approved" if pred == 1 else "denied",
            "probability": round(float(max(proba)), 4),
            "model_version": model_version,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
    except Exception as e:
        return jsonify({"error": f"Prediction failed: {str(e)}"}), 500

# --- Startup ---
if __name__ == "__main__":
    # Development server only. Debug mode enables an interactive debugger,
    # so bind to localhost rather than 0.0.0.0.
    app.run(debug=True, host="127.0.0.1", port=5000)
Side-by-Side: Flask vs FastAPI
The same prediction endpoint in both frameworks:
Flask Version
@app.route("/api/v1/predict", methods=["POST"])
def predict():
    data = request.get_json()
    if not data:
        return jsonify({"error": "JSON required"}), 400
    required = ["age", "income", "credit_score"]
    missing = [f for f in required if f not in data]
    if missing:
        return jsonify({"error": f"Missing: {missing}"}), 422
    result = ml_service.predict(data)
    return jsonify(result)
FastAPI Version
@app.post("/api/v1/predict", response_model=PredictionOutput)
def predict(data: PredictionInput):
    result = ml_service.predict(data.model_dump())
    return result
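The FastAPI version relies on `PredictionInput` and `PredictionOutput`, which are not shown. A plausible sketch of those models, assuming Pydantic v2 (implied by `model_dump()`) and reusing the field ranges from the Flask validation helper earlier in this section, might look like this:

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class PredictionInput(BaseModel):
    # Ranges mirror the schema used by the Flask validation helper.
    age: int = Field(ge=18, le=120)
    income: float = Field(ge=0)
    credit_score: int = Field(ge=300, le=850)
    employment_years: float = Field(ge=0)
    loan_amount: float = Field(ge=0)


class PredictionOutput(BaseModel):
    # Allow a field starting with "model_" without a namespace warning.
    model_config = ConfigDict(protected_namespaces=())

    prediction: str
    probability: float
    model_version: str


valid = PredictionInput(age=35, income=60000, credit_score=700,
                        employment_years=5, loan_amount=20000)
features = valid.model_dump()

try:
    PredictionInput(age=12, income=60000, credit_score=700,
                    employment_years=5, loan_amount=20000)
    rejected_fields = []
except ValidationError as exc:
    # Each error records the location of the offending field.
    rejected_fields = [err["loc"][0] for err in exc.errors()]

print(features, rejected_fields)
```

These class definitions are the part that the line-count comparison leaves out: the route is short because the validation rules live in the model.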
| Aspect | Flask | FastAPI |
|---|---|---|
| Lines of code | ~11 lines | ~4 lines (plus the Pydantic model definitions) |
| Validation | Manual, verbose | Automatic via Pydantic |
| Error messages | Custom for each check | Auto-generated, detailed |
| Documentation | Requires Flask-RESTX | Auto-generated |
| Type safety | None at runtime | Full runtime validation |
Summary
| Topic | Key Takeaway |
|---|---|
| Flask | Lightweight, battle-tested micro-framework |
| Request parsing | Manual with request.get_json() |
| Validation | Manual or via extensions (more boilerplate) |
| Blueprints | Organize routes into modules |
| Error handlers | Register per status code or exception type |
| Flask-RESTX | Adds Swagger documentation to Flask |
| vs FastAPI | Simpler but more manual; FastAPI automates validation and docs |
Flask Quick Reference
| Action | Code |
|---|---|
| Create app | app = Flask(__name__) |
| GET route | @app.route("/path", methods=["GET"]) |
| POST route | @app.route("/path", methods=["POST"]) |
| Get JSON body | request.get_json() |
| Return JSON | jsonify({"key": "value"}) |
| Return with status | return jsonify({...}), 422 |
| Register blueprint | app.register_blueprint(bp, url_prefix="/api") |
| Run app (development only) | app.run(debug=True, port=5000) |
| Add CORS | CORS(app, origins=[...]) |
| Error handler | @app.errorhandler(404) |