# Troubleshooting - API Development

## How to Use This Guide

Each issue follows the same structure:
| Section | Description |
|---|---|
| Symptom | What you see (error message, behavior) |
| Cause | Why it happens |
| Solution | Step-by-step fix |
| Prevention | How to avoid it in the future |
## Issue 1: Model Loading Errors

### FileNotFoundError: Model file not found

**Symptom:**

```
FileNotFoundError: [Errno 2] No such file or directory: 'models/model_v1.joblib'
```
The API starts but immediately crashes or enters degraded mode.

**Cause:**

The model file path is relative to the current working directory (where you run `uvicorn` or `python`), not relative to the Python file. If you start the server from a different directory, the path breaks.
**Solution:**

Use an absolute path or a path built relative to the script:

```python
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent
MODEL_PATH = BASE_DIR / "models" / "model_v1.joblib"

ml_service.load_model(str(MODEL_PATH))
```
**Prevention:**

- Always use `pathlib.Path` with `__file__` to build paths
- Set the model path via an environment variable: `MODEL_PATH=./models/model_v1.joblib`
- Log the resolved path at startup for debugging
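The environment-variable approach can be sketched as follows; the `resolve_model_path` helper and its default path are illustrative, not part of any framework:

```python
import os
from pathlib import Path

def resolve_model_path(default: str = "models/model_v1.joblib") -> Path:
    """Prefer the MODEL_PATH environment variable; fall back to a default.

    Resolving to an absolute path makes the startup log unambiguous.
    """
    raw = os.environ.get("MODEL_PATH", default)
    path = Path(raw).resolve()
    print(f"Loading model from: {path}")  # log the resolved path at startup
    return path
```

This keeps the path configurable per environment while still logging exactly which file the server will try to open.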
### ModuleNotFoundError: No module named 'sklearn'

**Symptom:**

```
ModuleNotFoundError: No module named 'sklearn'
```
This happens when loading a model serialized with scikit-learn.

**Cause:**

The environment where you run the API doesn't have scikit-learn installed, or has a different version than the one used to train the model.

**Solution:**

```bash
pip install scikit-learn
```

If the versions mismatch, install the exact version used in training:

```bash
pip install scikit-learn==1.3.2  # match the training environment version
```
**Prevention:**

- Pin exact versions in `requirements.txt`
- Use the same virtual environment for training and serving
- Consider the ONNX format for framework-independent serialization
### UnpicklingError or ValueError when loading model

**Symptom:**

```
_pickle.UnpicklingError: invalid load key, '\x00'
ValueError: unsupported pickle protocol: 5
```
**Cause:**

- The model was serialized with a different Python/scikit-learn version
- The file is corrupted or truncated
- It's the wrong file (not a valid pickle/joblib file)
**Solution:**

1. Verify the file is a valid joblib file:

   ```python
   import joblib

   model = joblib.load("models/model_v1.joblib")
   print(type(model))
   ```

2. Check Python version compatibility:

   ```bash
   python --version  # must match the training environment
   ```

3. Re-serialize the model if the versions don't match.
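Since joblib files are pickle streams under the hood, a quick header check can distinguish a corrupted file from a version mismatch before you even call `joblib.load`. This is a sketch (the `inspect_pickle_header` helper is hypothetical) built on the fact that protocol-2+ pickles start with the byte `0x80` followed by the protocol number:

```python
from pathlib import Path

def inspect_pickle_header(path: str) -> dict:
    """Peek at the first two bytes of a pickle/joblib file.

    Protocol-2+ pickle streams begin with 0x80 and then the protocol number,
    so "unsupported pickle protocol: 5" means the file itself is fine but
    your Python is too old to read protocol 5 (added in Python 3.8).
    """
    data = Path(path).read_bytes()[:2]
    if len(data) < 2 or data[0] != 0x80:
        return {"looks_like_pickle": False, "protocol": None}
    return {"looks_like_pickle": True, "protocol": data[1]}
```

If `looks_like_pickle` is `False`, suspect a truncated download or the wrong file; if the reported protocol is higher than your Python supports, re-serialize with a lower protocol or upgrade Python.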
**Prevention:**

- Document the Python and scikit-learn versions alongside each model file
- Use a model registry that tracks metadata
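A lightweight version of the "document the versions" advice is a JSON sidecar written next to each model file. A sketch, with illustrative field names (not a standard format):

```python
import json
import platform
from pathlib import Path

def write_model_metadata(model_path: str, library_versions: dict) -> Path:
    """Write a .meta.json sidecar recording the environment that produced a model."""
    meta = {
        "model_file": Path(model_path).name,
        "python_version": platform.python_version(),
        "libraries": library_versions,  # e.g. {"scikit-learn": "1.3.2"}
    }
    sidecar = Path(model_path).with_suffix(".meta.json")
    sidecar.write_text(json.dumps(meta, indent=2))
    return sidecar
```

At serving time, comparing the sidecar against the running environment turns a cryptic `UnpicklingError` into an actionable "version X expected, version Y installed" message.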
## Issue 2: CORS Errors

### Access to fetch has been blocked by CORS policy

**Symptom:**

The browser console shows:

```
Access to fetch at 'http://localhost:8000/api/v1/predict'
from origin 'http://localhost:3000' has been blocked by CORS policy:
No 'Access-Control-Allow-Origin' header is present on the requested resource.
```
The API works fine with `curl` but fails from a browser.

**Cause:**

Browsers enforce the Same-Origin Policy. When your frontend (`localhost:3000`) calls your API (`localhost:8000`), the browser blocks the request unless the API explicitly allows cross-origin requests.
**Solution (FastAPI):**

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)
```

**Solution (Flask):**

```python
from flask import Flask
from flask_cors import CORS

app = Flask(__name__)
CORS(app, origins=["http://localhost:3000"])
```
**Prevention:**

- Configure CORS at the beginning of your project
- Test from a browser early, not just with `curl`
- Never use `allow_origins=["*"]` in production
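One way to keep the wildcard out of production is to drive the origin list from configuration. A sketch assuming a comma-separated `ALLOWED_ORIGINS` environment variable; the helper and variable name are conventions chosen here, not FastAPI built-ins:

```python
import os

def allowed_origins(default: str = "http://localhost:3000") -> list[str]:
    """Parse a comma-separated ALLOWED_ORIGINS env var into a list for CORSMiddleware."""
    raw = os.environ.get("ALLOWED_ORIGINS", default)
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    if "*" in origins:
        raise ValueError("Refusing wildcard CORS origin; list origins explicitly.")
    return origins

# Usage:
# app.add_middleware(CORSMiddleware, allow_origins=allowed_origins(), ...)
```

Failing fast on `"*"` means a careless production config is caught at startup rather than discovered in a security review.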
### CORS Preflight (OPTIONS) Fails

**Symptom:**

The browser network tab shows an OPTIONS request failing with a 405 or 500 error, and the real request is never sent.
**Cause:**

The browser sends a preflight OPTIONS request before POST requests with custom headers. If your server doesn't handle OPTIONS, the preflight fails and the actual request is blocked.
**Solution:**

The CORS middleware handles this automatically. Make sure it's added before your routes are defined:

```python
# FastAPI: add the middleware first
app.add_middleware(CORSMiddleware, ...)

# Then define routes
@app.post("/predict")
def predict():
    ...
```
## Issue 3: Validation Errors (422)

### FastAPI Returns 422 for Seemingly Valid Data

**Symptom:**

```json
{
  "detail": [
    {
      "loc": ["body", "age"],
      "msg": "value is not a valid integer",
      "type": "type_error.integer"
    }
  ]
}
```
But you're sending `"age": "35"`, which looks correct.

**Cause:**

In JSON, `"35"` (a string) is not the same as `35` (an integer), and Pydantic in strict mode does not coerce strings to integers.
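The distinction is visible directly with Python's `json` module: the two payloads deserialize to different types, and a strict `age: int` field only accepts the second.

```python
import json

# Same-looking payloads, different JSON types after parsing
as_string = json.loads('{"age": "35"}')["age"]
as_int = json.loads('{"age": 35}')["age"]

print(type(as_string))  # <class 'str'>  -> rejected by a strict `age: int` field
print(type(as_int))     # <class 'int'>  -> accepted
```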
**Solution:**

Send proper JSON types:

```bash
# Wrong: age is a string
curl -d '{"age": "35", ...}'

# Correct: age is an integer
curl -d '{"age": 35, ...}'
```

Or disable strict mode in Pydantic so numeric strings are coerced:

```python
class PredictionInput(BaseModel):
    # Lax mode (the Pydantic v2 default) coerces "35" to 35
    model_config = {"strict": False}

    age: int = Field(...)
```
**Prevention:**

- Validate your JSON payloads (use a JSON linter)
- Document the expected types clearly in your API docs
- Test with the Swagger UI, which enforces correct types
### Missing Content-Type Header

**Symptom:**

Flask returns `None` from `request.get_json()`, or FastAPI returns a 422 error.

**Cause:**

The client didn't set the `Content-Type: application/json` header.
**Solution:**

Always include the header:

```bash
curl -X POST http://localhost:8000/api/v1/predict \
  -H "Content-Type: application/json" \
  -d '{"age": 35, ...}'
```
## Issue 4: Memory Leaks and High Memory Usage

### Memory Grows Over Time

**Symptom:**

API memory usage (RSS) increases steadily over hours or days until the process is killed by the OS or container runtime.
**Cause:**

Common causes in ML APIs:

- Accumulating predictions in memory (logging lists that are never cleared)
- Creating new model instances per request instead of reusing one
- Large temporary arrays not being garbage collected
- Circular references in custom objects
**Solution:**

1. Ensure the model is loaded once and reused:

   ```python
   # Bad: reloads the model from disk on every request
   @app.post("/predict")
   def predict():
       model = joblib.load("model.joblib")  # disk I/O and new objects per request
       ...

   # Good: load once at startup, then reuse
   ml_service = MLService()
   ml_service.load_model("model.joblib")

   @app.post("/predict")
   def predict():
       result = ml_service.predict(...)  # reuses the loaded model
       ...
   ```

2. Don't accumulate data in global lists:

   ```python
   # Bad
   prediction_log = []

   @app.post("/predict")
   def predict():
       prediction_log.append(result)  # grows forever!
   ```

3. Monitor memory:

   ```python
   import os

   import psutil

   @app.get("/debug/memory")
   def memory():
       process = psutil.Process(os.getpid())
       return {"memory_mb": process.memory_info().rss / 1024 / 1024}
   ```
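The "load once, reuse" pattern can also be enforced with a cached loader, so accidental repeat calls are harmless. A sketch using `functools.lru_cache`; the loader body is a stand-in for the real `joblib.load` call, and `LOAD_COUNT` exists only to demonstrate the cache:

```python
from functools import lru_cache

LOAD_COUNT = 0  # instrumentation only, to show the cache works

@lru_cache(maxsize=1)
def get_model():
    """Load the model on the first call; every later call returns the same object."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return object()  # stand-in for joblib.load("models/model_v1.joblib")

# Repeated calls hit the cache instead of the disk:
a, b = get_model(), get_model()
```

Endpoints then call `get_model()` freely; only the first call pays the loading cost.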
**Prevention:**

- Monitor memory usage in production (Prometheus, CloudWatch)
- Set memory limits in your container/process manager
- Use a dedicated logging service instead of in-memory lists
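If you do need to keep recent predictions in memory (for a debug endpoint, say), a bounded buffer caps the footprint. A sketch using `collections.deque` with `maxlen`:

```python
from collections import deque

# Keeps only the most recent 1000 entries; older ones are dropped automatically.
recent_predictions = deque(maxlen=1000)

for i in range(5000):
    recent_predictions.append({"id": i, "result": "approved"})

print(len(recent_predictions))  # 1000, no matter how many were appended
```

Unlike an unbounded list, this structure has a fixed maximum memory cost regardless of uptime.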
## Issue 5: Slow Predictions

### High Latency on Prediction Endpoint

**Symptom:**

Predictions take 500ms–5s instead of the expected 10–50ms.

**Cause:**

Common culprits: the model is reloaded on every request, CPU-bound inference is blocking the async event loop, preprocessing is slow, or the model itself is too heavy.

**Solution:**
1. Diagnose by adding timing to your endpoint:

   ```python
   import time

   @app.post("/predict")
   def predict(data: PredictionInput):
       t0 = time.perf_counter()

       t1 = time.perf_counter()
       features = preprocess(data)
       preprocess_ms = (time.perf_counter() - t1) * 1000

       t2 = time.perf_counter()
       result = model.predict(features)
       predict_ms = (time.perf_counter() - t2) * 1000

       total_ms = (time.perf_counter() - t0) * 1000
       return {
           "result": result,
           "timing": {
               "preprocess_ms": preprocess_ms,
               "predict_ms": predict_ms,
               "total_ms": total_ms,
           },
       }
   ```
2. Model loading: load once at startup (see Issue 4).

3. Async blocking: use `def` (sync) for CPU-bound inference in FastAPI:

   ```python
   # Wrong: blocks the event loop
   @app.post("/predict")
   async def predict(data: PredictionInput):
       result = model.predict(...)  # CPU-bound, blocking!

   # Right: FastAPI runs sync endpoints in a thread pool
   @app.post("/predict")
   def predict(data: PredictionInput):
       result = model.predict(...)  # runs in a thread pool
   ```

4. Model optimization: consider lighter models (a decision tree vs. a large ensemble).
**Prevention:**

- Add response time headers (`X-Response-Time-Ms`)
- Set latency budgets (e.g., p95 < 100ms)
- Profile before optimizing
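Checking a latency budget like "p95 < 100ms" takes only a few lines with the standard library. A sketch using `statistics.quantiles`; the sample data is made up for illustration:

```python
import statistics

def p95_ms(latencies_ms: list[float]) -> float:
    """95th percentile of latency samples, in milliseconds."""
    # n=20 splits the distribution into 20 groups; cut point 19 of 19
    # (index 18) is the 95th percentile.
    return statistics.quantiles(latencies_ms, n=20)[18]

samples = [12.0] * 95 + [250.0] * 5  # mostly fast, a few slow outliers
print(p95_ms(samples))
budget_ok = p95_ms(samples) < 100.0
```

Note how a handful of outliers pushes p95 far above the typical request time, which is exactly why percentile budgets catch problems that averages hide.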
## Issue 6: 422 Errors with Nested/Complex Inputs

### Pydantic Fails on Nested Objects

**Symptom:**

```json
{
  "detail": [{"loc": ["body"], "msg": "value is not a valid dict"}]
}
```
**Cause:**

The client sends data in an unexpected format (e.g., form-encoded instead of JSON, or wrapped in an extra layer).

**Solution:**

Verify what the client actually sends:

```python
from fastapi import Request

@app.post("/debug")
async def debug(request: Request):
    body = await request.body()
    return {
        "content_type": request.headers.get("content-type"),
        "body_raw": body.decode(),
        "body_size": len(body),
    }
```
Common fixes:

- Ensure `Content-Type: application/json`
- Don't double-wrap: `{"data": {"age": 35}}` when the schema expects `{"age": 35}`
- Check for BOM characters in the request body
## Issue 7: Deployment Issues

### uvicorn Refuses Connections from Other Machines

**Symptom:**

The API works on localhost but not when accessed from another machine or container.

**Cause:**

uvicorn binds to `127.0.0.1` (localhost only) by default.

**Solution:**

Bind to all interfaces:

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000
```
### OSError: [Errno 98] Address already in use

**Symptom:**

The server can't start because the port is occupied.

**Solution:**

```bash
# Find the process using the port
lsof -i :8000                  # Linux/macOS
netstat -ano | findstr :8000   # Windows

# Kill it
kill <PID>                     # Linux/macOS
taskkill /PID <PID> /F         # Windows
```

Or use a different port:

```bash
uvicorn app.main:app --port 8001
```
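When any free port will do (in tests, for example), the OS can pick one for you. A sketch using the standard `socket` module; binding to port 0 asks the kernel for a currently unused port:

```python
import socket

def find_free_port() -> int:
    """Ask the OS for an ephemeral port that is currently free."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 = let the kernel choose
        return s.getsockname()[1]

port = find_free_port()
print(f"uvicorn app.main:app --port {port}")
```

Note the small race: the port is released when the socket closes, so another process could in principle grab it before your server binds. For test harnesses this is usually acceptable.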
### Multiple Workers and Model Loading

**Symptom:**

When running with multiple workers (`uvicorn --workers 4`), each worker loads the model separately, causing high memory usage.

**Cause:**

Each uvicorn worker is a separate process, so the model is loaded once per worker.

**Solution:**

For small models, this is acceptable. For large models:

- Use fewer workers
- Use a dedicated model server (TensorFlow Serving, Triton)
- Use shared memory or memory-mapped files

```bash
# 4 workers = 4x model memory
uvicorn app.main:app --workers 4

# Consider: is 1 worker with async enough?
uvicorn app.main:app --workers 1
```
## Quick Reference: Error Code → Fix

| Error Code | Common Cause | Quick Fix |
|---|---|---|
| 400 | Malformed JSON | Check JSON syntax, add `Content-Type: application/json` |
| 404 | Wrong URL path | Verify the endpoint URL, check for typos |
| 405 | Wrong HTTP method | Use `POST`, not `GET`, for `/predict` |
| 422 | Validation failure | Check that data types match the schema, verify required fields |
| 500 | Unhandled exception | Check server logs, add try/except in the route handler |
| 503 | Model not loaded | Verify the model file path, check startup logs |
## Debugging Checklist

When your API doesn't work, follow this systematic checklist:

1. Check the server logs: the error message is usually there
2. Verify the endpoint URL: `http://`, port number, path
3. Check the HTTP method: `POST /predict`, not `GET /predict`
4. Verify the `Content-Type` header: `application/json`
5. Validate your JSON: use a JSON validator/linter
6. Test with `curl` first: this eliminates browser/CORS issues
7. Check the model file: does it exist at the expected path?
8. Check dependencies: `pip list | grep scikit-learn`
9. Try the Swagger UI: `/docs` in FastAPI
10. Read the full error trace: scroll up in the terminal
Add a debug endpoint that returns the raw request information:

```python
from fastapi import Request

@app.post("/debug")
async def debug(request: Request):
    body = await request.body()
    return {
        "method": request.method,
        "url": str(request.url),
        "headers": dict(request.headers),
        "body": body.decode("utf-8", errors="replace"),
    }
This tells you exactly what the server receives, eliminating guesswork.