# Troubleshooting - API Development

## How to Use This Guide

Each issue follows the same structure:
| Section | Description |
|---|---|
| Symptom | What you see (error message, behavior) |
| Cause | Why it happens |
| Solution | Step-by-step fix |
| Prevention | How to avoid it in the future |
## Issue 1: Model Loading Errors

### FileNotFoundError: Model file not found

**Symptom:**

```
FileNotFoundError: [Errno 2] No such file or directory: 'models/model_v1.joblib'
```
The API starts but immediately crashes or enters degraded mode.

**Cause:**

The model file path is relative to the current working directory (where you run `uvicorn` or `python`), not relative to the Python file. If you start the server from a different directory, the path breaks.
**Solution:**

Use an absolute path or a path built relative to the script:

```python
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent
MODEL_PATH = BASE_DIR / "models" / "model_v1.joblib"

ml_service.load_model(str(MODEL_PATH))
```
**Prevention:**

- Always use `pathlib.Path` with `__file__` to build paths
- Set the model path via an environment variable: `MODEL_PATH=./models/model_v1.joblib`
- Log the resolved path at startup for debugging
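The environment-variable approach can be sketched as follows; the `resolve_model_path` helper and its default path are illustrative, not part of any framework:

```python
import os
from pathlib import Path

def resolve_model_path(default: str = "models/model_v1.joblib") -> Path:
    """Prefer the MODEL_PATH environment variable; fall back to a default.

    Resolving to an absolute path makes the startup log unambiguous.
    """
    raw = os.environ.get("MODEL_PATH", default)
    path = Path(raw).resolve()
    print(f"Loading model from: {path}")  # log the resolved path at startup
    return path
```

This keeps the path configurable per environment while still logging exactly which file the server will try to open.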
### ModuleNotFoundError: No module named 'sklearn'

**Symptom:**

```
ModuleNotFoundError: No module named 'sklearn'
```
This happens when loading a model serialized with scikit-learn.

**Cause:**

The environment where you run the API doesn't have scikit-learn installed, or has a different version than the one used to train the model.

**Solution:**

```bash
pip install scikit-learn
```

If the versions mismatch, install the exact version used in training:

```bash
pip install scikit-learn==1.3.2  # match the training environment version
```
**Prevention:**

- Pin exact versions in `requirements.txt`
- Use the same virtual environment for training and serving
- Consider the ONNX format for framework-independent serialization
### UnpicklingError or ValueError when loading model

**Symptom:**

```
_pickle.UnpicklingError: invalid load key, '\x00'
ValueError: unsupported pickle protocol: 5
```
**Cause:**

- The model was serialized with a different Python/scikit-learn version
- The file is corrupted or truncated
- It's the wrong file (not a valid pickle/joblib file)
**Solution:**

1. Verify the file is a valid joblib file:

   ```python
   import joblib

   model = joblib.load("models/model_v1.joblib")
   print(type(model))
   ```

2. Check Python version compatibility:

   ```bash
   python --version  # must match the training environment
   ```

3. Re-serialize the model if the versions don't match.
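Since joblib files are pickle streams under the hood, a quick header check can distinguish a corrupted file from a version mismatch before you even call `joblib.load`. This is a sketch (the `inspect_pickle_header` helper is hypothetical) built on the fact that protocol-2+ pickles start with the byte `0x80` followed by the protocol number:

```python
from pathlib import Path

def inspect_pickle_header(path: str) -> dict:
    """Peek at the first two bytes of a pickle/joblib file.

    Protocol-2+ pickle streams begin with 0x80 and then the protocol number,
    so "unsupported pickle protocol: 5" means the file itself is fine but
    your Python is too old to read protocol 5 (added in Python 3.8).
    """
    data = Path(path).read_bytes()[:2]
    if len(data) < 2 or data[0] != 0x80:
        return {"looks_like_pickle": False, "protocol": None}
    return {"looks_like_pickle": True, "protocol": data[1]}
```

If `looks_like_pickle` is `False`, suspect a truncated download or the wrong file; if the reported protocol is higher than your Python supports, re-serialize with a lower protocol or upgrade Python.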
**Prevention:**

- Document the Python and scikit-learn versions alongside each model file
- Use a model registry that tracks metadata
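A lightweight version of the "document the versions" advice is a JSON sidecar written next to each model file. A sketch, with illustrative field names (not a standard format):

```python
import json
import platform
from pathlib import Path

def write_model_metadata(model_path: str, library_versions: dict) -> Path:
    """Write a .meta.json sidecar recording the environment that produced a model."""
    meta = {
        "model_file": Path(model_path).name,
        "python_version": platform.python_version(),
        "libraries": library_versions,  # e.g. {"scikit-learn": "1.3.2"}
    }
    sidecar = Path(model_path).with_suffix(".meta.json")
    sidecar.write_text(json.dumps(meta, indent=2))
    return sidecar
```

At serving time, comparing the sidecar against the running environment turns a cryptic `UnpicklingError` into an actionable "version X expected, version Y installed" message.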
## Issue 2: CORS Errors

### Access to fetch has been blocked by CORS policy

**Symptom:**

The browser console shows:

```
Access to fetch at 'http://localhost:8000/api/v1/predict'
from origin 'http://localhost:3000' has been blocked by CORS policy:
No 'Access-Control-Allow-Origin' header is present on the requested resource.
```
The API works fine with `curl` but fails from a browser.

**Cause:**

Browsers enforce the Same-Origin Policy. When your frontend (`localhost:3000`) calls your API (`localhost:8000`), the browser blocks the request unless the API explicitly allows cross-origin requests.
**Solution (FastAPI):**

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)
```

**Solution (Flask):**

```python
from flask import Flask
from flask_cors import CORS

app = Flask(__name__)
CORS(app, origins=["http://localhost:3000"])
```
**Prevention:**

- Configure CORS at the beginning of your project
- Test from a browser early, not just with `curl`
- Never use `allow_origins=["*"]` in production
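One way to keep the wildcard out of production is to drive the origin list from configuration. A sketch assuming a comma-separated `ALLOWED_ORIGINS` environment variable; the helper and variable name are conventions chosen here, not FastAPI built-ins:

```python
import os

def allowed_origins(default: str = "http://localhost:3000") -> list[str]:
    """Parse a comma-separated ALLOWED_ORIGINS env var into a list for CORSMiddleware."""
    raw = os.environ.get("ALLOWED_ORIGINS", default)
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    if "*" in origins:
        raise ValueError("Refusing wildcard CORS origin; list origins explicitly.")
    return origins

# Usage:
# app.add_middleware(CORSMiddleware, allow_origins=allowed_origins(), ...)
```

Failing fast on `"*"` means a careless production config is caught at startup rather than discovered in a security review.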
### CORS Preflight (OPTIONS) Fails

**Symptom:**

The browser network tab shows an OPTIONS request failing with a 405 or 500 error, and the real request is never sent.
**Cause:**

The browser sends a preflight OPTIONS request before POST requests with custom headers. If your server doesn't handle OPTIONS, the preflight fails and the actual request is blocked.
**Solution:**

The CORS middleware handles this automatically. Make sure it's added before your routes are defined:

```python
# FastAPI: add the middleware first
app.add_middleware(CORSMiddleware, ...)

# Then define routes
@app.post("/predict")
def predict():
    ...
```
## Issue 3: Validation Errors (422)

### FastAPI Returns 422 for Seemingly Valid Data

**Symptom:**

```json
{
  "detail": [
    {
      "loc": ["body", "age"],
      "msg": "value is not a valid integer",
      "type": "type_error.integer"
    }
  ]
}
```
But you're sending `"age": "35"`, which looks correct.

**Cause:**

In JSON, `"35"` (a string) is not the same as `35` (an integer), and Pydantic in strict mode does not coerce strings to integers.
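The distinction is visible directly with Python's `json` module: the two payloads deserialize to different types, and a strict `age: int` field only accepts the second.

```python
import json

# Same-looking payloads, different JSON types after parsing
as_string = json.loads('{"age": "35"}')["age"]
as_int = json.loads('{"age": 35}')["age"]

print(type(as_string))  # <class 'str'>  -> rejected by a strict `age: int` field
print(type(as_int))     # <class 'int'>  -> accepted
```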
**Solution:**

Send proper JSON types:

```bash
# Wrong: age is a string
curl -d '{"age": "35", ...}'

# Correct: age is an integer
curl -d '{"age": 35, ...}'
```

Or disable strict mode in Pydantic so numeric strings are coerced:

```python
class PredictionInput(BaseModel):
    # Lax mode (the Pydantic v2 default) coerces "35" to 35
    model_config = {"strict": False}

    age: int = Field(...)
```
**Prevention:**

- Validate your JSON payloads (use a JSON linter)
- Document the expected types clearly in your API docs
- Test with the Swagger UI, which enforces correct types
### Missing Content-Type Header

**Symptom:**

Flask returns `None` from `request.get_json()`, or FastAPI returns a 422 error.

**Cause:**

The client didn't set the `Content-Type: application/json` header.
**Solution:**

Always include the header:

```bash
curl -X POST http://localhost:8000/api/v1/predict \
  -H "Content-Type: application/json" \
  -d '{"age": 35, ...}'
```
## Issue 4: Memory Leaks and High Memory Usage

### Memory Grows Over Time

**Symptom:**

API memory usage (RSS) increases steadily over hours or days until the process is killed by the OS or container runtime.
**Cause:**

Common causes in ML APIs:

- Accumulating predictions in memory (logging lists that are never cleared)
- Creating new model instances per request instead of reusing one
- Large temporary arrays not being garbage collected
- Circular references in custom objects
**Solution:**

1. Ensure the model is loaded once and reused:

   ```python
   # Bad: reloads the model from disk on every request
   @app.post("/predict")
   def predict():
       model = joblib.load("model.joblib")  # disk I/O and new objects per request
       ...

   # Good: load once at startup, then reuse
   ml_service = MLService()
   ml_service.load_model("model.joblib")

   @app.post("/predict")
   def predict():
       result = ml_service.predict(...)  # reuses the loaded model
       ...
   ```

2. Don't accumulate data in global lists:

   ```python
   # Bad
   prediction_log = []

   @app.post("/predict")
   def predict():
       prediction_log.append(result)  # grows forever!
   ```

3. Monitor memory:

   ```python
   import os

   import psutil

   @app.get("/debug/memory")
   def memory():
       process = psutil.Process(os.getpid())
       return {"memory_mb": process.memory_info().rss / 1024 / 1024}
   ```
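The "load once, reuse" pattern can also be enforced with a cached loader, so accidental repeat calls are harmless. A sketch using `functools.lru_cache`; the loader body is a stand-in for the real `joblib.load` call, and `LOAD_COUNT` exists only to demonstrate the cache:

```python
from functools import lru_cache

LOAD_COUNT = 0  # instrumentation only, to show the cache works

@lru_cache(maxsize=1)
def get_model():
    """Load the model on the first call; every later call returns the same object."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return object()  # stand-in for joblib.load("models/model_v1.joblib")

# Repeated calls hit the cache instead of the disk:
a, b = get_model(), get_model()
```

Endpoints then call `get_model()` freely; only the first call pays the loading cost.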
**Prevention:**

- Monitor memory usage in production (Prometheus, CloudWatch)
- Set memory limits in your container/process manager
- Use a dedicated logging service instead of in-memory lists
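If you do need to keep recent predictions in memory (for a debug endpoint, say), a bounded buffer caps the footprint. A sketch using `collections.deque` with `maxlen`:

```python
from collections import deque

# Keeps only the most recent 1000 entries; older ones are dropped automatically.
recent_predictions = deque(maxlen=1000)

for i in range(5000):
    recent_predictions.append({"id": i, "result": "approved"})

print(len(recent_predictions))  # 1000, no matter how many were appended
```

Unlike an unbounded list, this structure has a fixed maximum memory cost regardless of uptime.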
## Issue 5: Slow Predictions

### High Latency on Prediction Endpoint

**Symptom:**

Predictions take 500ms–5s instead of the expected 10–50ms.

**Cause:**

Common culprits: the model is reloaded on every request, CPU-bound inference is blocking the async event loop, preprocessing is slow, or the model itself is too heavy.

**Solution:**
1. Diagnose by adding timing to your endpoint:

   ```python
   import time

   @app.post("/predict")
   def predict(data: PredictionInput):
       t0 = time.perf_counter()

       t1 = time.perf_counter()
       features = preprocess(data)
       preprocess_ms = (time.perf_counter() - t1) * 1000

       t2 = time.perf_counter()
       result = model.predict(features)
       predict_ms = (time.perf_counter() - t2) * 1000

       total_ms = (time.perf_counter() - t0) * 1000
       return {
           "result": result,
           "timing": {
               "preprocess_ms": preprocess_ms,
               "predict_ms": predict_ms,
               "total_ms": total_ms,
           },
       }
   ```
2. Model loading: load once at startup (see Issue 4).

3. Async blocking: use `def` (sync) for CPU-bound inference in FastAPI:

   ```python
   # Wrong: blocks the event loop
   @app.post("/predict")
   async def predict(data: PredictionInput):
       result = model.predict(...)  # CPU-bound, blocking!

   # Right: FastAPI runs sync endpoints in a thread pool
   @app.post("/predict")
   def predict(data: PredictionInput):
       result = model.predict(...)  # runs in a thread pool
   ```

4. Model optimization: consider lighter models (a decision tree vs. a large ensemble).
**Prevention:**

- Add response time headers (`X-Response-Time-Ms`)
- Set latency budgets (e.g., p95 < 100ms)
- Profile before optimizing
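Checking a latency budget like "p95 < 100ms" takes only a few lines with the standard library. A sketch using `statistics.quantiles`; the sample data is made up for illustration:

```python
import statistics

def p95_ms(latencies_ms: list[float]) -> float:
    """95th percentile of latency samples, in milliseconds."""
    # n=20 splits the distribution into 20 groups; cut point 19 of 19
    # (index 18) is the 95th percentile.
    return statistics.quantiles(latencies_ms, n=20)[18]

samples = [12.0] * 95 + [250.0] * 5  # mostly fast, a few slow outliers
print(p95_ms(samples))
budget_ok = p95_ms(samples) < 100.0
```

Note how a handful of outliers pushes p95 far above the typical request time, which is exactly why percentile budgets catch problems that averages hide.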
## Issue 6: 422 Errors with Nested/Complex Inputs

### Pydantic Fails on Nested Objects

**Symptom:**

```json
{
  "detail": [{"loc": ["body"], "msg": "value is not a valid dict"}]
}
```
**Cause:**

The client sends data in an unexpected format (e.g., form-encoded instead of JSON, or wrapped in an extra layer).

**Solution:**

Verify what the client actually sends:

```python
from fastapi import Request

@app.post("/debug")
async def debug(request: Request):
    body = await request.body()
    return {
        "content_type": request.headers.get("content-type"),
        "body_raw": body.decode(),
        "body_size": len(body),
    }
```
Common fixes:

- Ensure `Content-Type: application/json`
- Don't double-wrap: `{"data": {"age": 35}}` when the schema expects `{"age": 35}`
- Check for BOM characters in the request body
## Issue 7: Deployment Issues

### uvicorn Refuses Connections from Other Machines

**Symptom:**

The API works on localhost but not when accessed from another machine or container.

**Cause:**

uvicorn binds to `127.0.0.1` (localhost only) by default.

**Solution:**

Bind to all interfaces:

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000
```
### OSError: [Errno 98] Address already in use

**Symptom:**

The server can't start because the port is occupied.

**Solution:**

```bash
# Find the process using the port
lsof -i :8000                  # Linux/macOS
netstat -ano | findstr :8000   # Windows

# Kill it
kill <PID>                     # Linux/macOS
taskkill /PID <PID> /F         # Windows
```

Or use a different port:

```bash
uvicorn app.main:app --port 8001
```
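When any free port will do (in tests, for example), the OS can pick one for you. A sketch using the standard `socket` module; binding to port 0 asks the kernel for a currently unused port:

```python
import socket

def find_free_port() -> int:
    """Ask the OS for an ephemeral port that is currently free."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 = let the kernel choose
        return s.getsockname()[1]

port = find_free_port()
print(f"uvicorn app.main:app --port {port}")
```

Note the small race: the port is released when the socket closes, so another process could in principle grab it before your server binds. For test harnesses this is usually acceptable.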
### Multiple Workers and Model Loading

**Symptom:**

When running with multiple workers (`uvicorn --workers 4`), each worker loads the model separately, causing high memory usage.

**Cause:**

Each uvicorn worker is a separate process, so the model is loaded once per worker.

**Solution:**

For small models, this is acceptable. For large models:

- Use fewer workers
- Use a dedicated model server (TensorFlow Serving, Triton)
- Use shared memory or memory-mapped files

```bash
# 4 workers = 4x model memory
uvicorn app.main:app --workers 4

# Consider: is 1 worker with async enough?
uvicorn app.main:app --workers 1
```
## Quick Reference: Error Code → Fix

| Error Code | Common Cause | Quick Fix |
|---|---|---|
| 400 | Malformed JSON | Check JSON syntax, add `Content-Type: application/json` |
| 404 | Wrong URL path | Verify the endpoint URL, check for typos |
| 405 | Wrong HTTP method | Use `POST`, not `GET`, for `/predict` |
| 422 | Validation failure | Check that data types match the schema, verify required fields |
| 500 | Unhandled exception | Check server logs, add try/except in the route handler |
| 503 | Model not loaded | Verify the model file path, check startup logs |
## Debugging Checklist

When your API doesn't work, follow this systematic checklist:

1. Check the server logs: the error message is usually there
2. Verify the endpoint URL: `http://`, port number, path
3. Check the HTTP method: `POST /predict`, not `GET /predict`
4. Verify the `Content-Type` header: `application/json`
5. Validate your JSON: use a JSON validator/linter
6. Test with `curl` first: this eliminates browser/CORS issues
7. Check the model file: does it exist at the expected path?
8. Check dependencies: `pip list | grep scikit-learn`
9. Try the Swagger UI: `/docs` in FastAPI
10. Read the full error trace: scroll up in the terminal
Add a debug endpoint that returns the raw request information:

```python
from fastapi import Request

@app.post("/debug")
async def debug(request: Request):
    body = await request.body()
    return {
        "method": request.method,
        "url": str(request.url),
        "headers": dict(request.headers),
        "body": body.decode("utf-8", errors="replace"),
    }
This tells you exactly what the server receives, eliminating guesswork.