Infrastructure Planning for AI
Why Infrastructure Matters
The Foundation Analogy
Infrastructure for AI deployment is like the foundation of a building. The most beautiful architecture is useless if the foundation is weak. Similarly, the most accurate model is worthless if it can't run reliably in production.
Python Virtual Environments
The Problem: Dependency Hell
Imagine you have two projects:
- Project A requires scikit-learn==1.2.0
- Project B requires scikit-learn==1.4.0

If both use your system Python, installing one version breaks the other. This is called dependency hell.
The Solution: Virtual Environments
A virtual environment is an isolated Python installation. Each project gets its own set of packages without interfering with others.
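A quick way to see the isolation in action (a stdlib-only sketch): inside a virtual environment, `sys.prefix` points at the environment's directory, while `sys.base_prefix` still points at the base Python installation. Outside any environment, the two are equal.

```python
# Detect whether this interpreter is running inside a virtual environment.
# Inside a venv, sys.prefix is redirected to the environment directory,
# while sys.base_prefix keeps pointing at the base installation.
import sys

def in_virtualenv() -> bool:
    """True if the interpreter is running inside a virtual environment."""
    return sys.prefix != sys.base_prefix

label = "Active venv:" if in_virtualenv() else "System Python:"
print(label, sys.prefix)
```

This is handy in a training or deployment script that should refuse to run against the system Python.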
venv — The Built-in Option
venv comes with Python and is the simplest option:
# Create a virtual environment
python -m venv .venv
# Activate it (Windows)
.venv\Scripts\activate
# Activate it (macOS/Linux)
source .venv/bin/activate
# Your terminal shows the active environment
(.venv) $ python --version
Python 3.11.5
# Install packages in isolation
(.venv) $ pip install scikit-learn pandas fastapi
# Deactivate when done
(.venv) $ deactivate
conda — The Data Science Option
Conda is a package manager popular in data science. It manages both Python packages and system-level dependencies (like CUDA for GPUs).
# Create a conda environment
conda create -n ml-project python=3.11
# Activate it
conda activate ml-project
# Install packages (can mix conda and pip)
conda install scikit-learn pandas
pip install fastapi
# Export environment
conda env export > environment.yml
# Recreate from file
conda env create -f environment.yml
venv vs conda
| Feature | venv | conda |
|---|---|---|
| Installation | Built-in (Python 3.3+) | Requires Anaconda/Miniconda |
| Package source | PyPI only | Conda channels + PyPI |
| Non-Python deps | Cannot manage | Can manage (CUDA, C libs) |
| Speed | Fast | Slower (dependency solving) |
| Reproducibility | requirements.txt | environment.yml |
| Disk space | Lightweight | Heavier |
| Best for | Web apps, APIs, CI/CD | Data science, GPU projects |
We use venv + pip throughout this course. It's simpler, faster, and sufficient for our API-focused deployment workflow. Use conda if you need GPU support or complex scientific libraries.
Dependency Management
requirements.txt — Pinning Versions
A requirements.txt file lists all your project's dependencies with pinned versions for reproducibility:
# Core ML
scikit-learn==1.4.2
pandas==2.2.0
numpy==1.26.4
joblib==1.3.2
# API Framework
fastapi==0.109.0
uvicorn==0.27.0
pydantic==2.5.3
# Testing
pytest==8.0.0
httpx==0.26.0
# Explainability
shap==0.44.1
lime==0.2.0.1
Never list a bare package name like scikit-learn (no version) in your requirements file. An unpinned dependency means your project can break tomorrow when a new version is released.
Generating requirements.txt
# Option 1: Freeze all installed packages
pip freeze > requirements.txt
# Option 2: Use pipreqs (only project imports)
pip install pipreqs
pipreqs . --force
# Install from requirements
pip install -r requirements.txt
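Pinned versions are only useful if the running environment actually matches them. A small stdlib-only sketch (the "name==version" format is the simple subset used above; real requirements files also allow extras and markers) can compare pins against what is installed:

```python
# Compare pinned versions from a requirements.txt-style text against
# the packages actually installed in the current environment.
from importlib import metadata

def parse_pins(text: str) -> dict:
    """Parse 'name==version' lines, skipping comments, blanks, and unpinned lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins

def check_pins(pins: dict):
    """Yield (package, pinned, installed) for every mismatch or missing package."""
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package not installed at all
        if installed != pinned:
            yield name, pinned, installed

sample = """
# Core ML
scikit-learn==1.4.2
pandas==2.2.0
"""
for name, pinned, installed in check_pins(parse_pins(sample)):
    print(f"{name}: pinned {pinned}, installed {installed}")
```

Running a check like this in CI catches the classic "works on my machine" drift between a developer's environment and the requirements file.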
The Lock File Pattern
For stricter reproducibility, modern tools create lock files that pin every sub-dependency:
| Tool | Config File | Lock File |
|---|---|---|
| pip | requirements.txt | requirements.txt (pinned by hand; no true lock) |
| pip-tools | requirements.in | requirements.txt (compiled) |
| Poetry | pyproject.toml | poetry.lock |
| Pipenv | Pipfile | Pipfile.lock |
# Using pip-tools for better dependency management
pip install pip-tools
# Write your direct dependencies in requirements.in
# Then compile the full locked file:
pip-compile requirements.in --output-file requirements.txt
Docker Basics for ML
What is Docker?
Docker packages your application, its dependencies, and a minimal operating-system layer into a single container — a lightweight, portable, self-sufficient unit that shares the host machine's kernel.
The Shipping Container Analogy
Before standardized shipping containers, every port had different cranes, trucks, and warehouses. Shipping was chaotic and slow. The standardized container revolutionized global trade.
Docker does the same for software:
| Shipping Container | Docker Container |
|---|---|
| Standard size fits any ship/truck/crane | Runs on any machine with Docker |
| Contents are isolated and sealed | App is isolated from host system |
| Stackable and composable | Multiple containers work together |
| Reusable across the world | Same image runs dev/staging/prod |
Dockerfile for an ML Project
A Dockerfile is a recipe for building a container image:
# Start from a Python base image
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# Copy and install dependencies first (better caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY . .
# Expose the API port
EXPOSE 8000
# Start the FastAPI server
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Key Docker Commands
# Build an image
docker build -t my-ml-api:v1.0 .
# Run a container
docker run -p 8000:8000 my-ml-api:v1.0
# Run in background
docker run -d -p 8000:8000 --name ml-api my-ml-api:v1.0
# Check running containers
docker ps
# View logs
docker logs ml-api
# Stop container
docker stop ml-api
Docker Layer Caching
Docker builds images in layers. Each instruction in the Dockerfile creates a layer. If a layer hasn't changed, Docker reuses the cached version.
Always copy requirements.txt and install dependencies before copying your code. This way, Docker only reinstalls packages when dependencies actually change, not when you edit a Python file.
.dockerignore
Just like .gitignore, a .dockerignore file excludes unnecessary files from the Docker build context:
__pycache__
*.pyc
.git
.venv
.env
*.ipynb_checkpoints
data/raw/
notebooks/
.pytest_cache
GPU vs CPU Considerations
When Do You Need a GPU?
Cost Comparison
| Instance Type | vCPUs | RAM | GPU | Price/hour (approx.) | Use Case |
|---|---|---|---|---|---|
| t3.medium | 2 | 4 GB | None | $0.04 | Simple sklearn models |
| c5.xlarge | 4 | 8 GB | None | $0.17 | XGBoost, feature-heavy models |
| g4dn.xlarge | 4 | 16 GB | 1x T4 | $0.53 | PyTorch inference |
| p3.2xlarge | 8 | 61 GB | 1x V100 | $3.06 | Training deep learning models |
| p4d.24xlarge | 96 | 1152 GB | 8x A100 | $32.77 | Large Language Models |
A GPU instance can cost 10-100x more than a CPU instance. Always start with CPU and only upgrade to GPU if latency requirements demand it. For this course, CPU instances are sufficient.
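The cost gap becomes concrete when you project hourly rates over a month of continuous serving. A back-of-the-envelope sketch, using the approximate on-demand prices from the table above (AWS EC2 instance names; 730 hours ≈ one month of 24/7 uptime):

```python
# Rough monthly cost of running an inference server 24/7, using the
# approximate hourly prices from the table above (illustrative only;
# real cloud prices vary by region and change over time).
PRICE_PER_HOUR = {
    "t3.medium": 0.04,    # CPU only
    "c5.xlarge": 0.17,    # CPU only
    "g4dn.xlarge": 0.53,  # 1x T4 GPU
    "p3.2xlarge": 3.06,   # 1x V100 GPU
}

HOURS_PER_MONTH = 730  # 24 * 365 / 12, i.e. always-on

def monthly_cost(instance: str, hours: float = HOURS_PER_MONTH) -> float:
    """Approximate cost of running one instance for the given hours."""
    return PRICE_PER_HOUR[instance] * hours

for name in PRICE_PER_HOUR:
    print(f"{name}: ~${monthly_cost(name):,.2f}/month")
```

A t3.medium comes out around $29/month, a p3.2xlarge around $2,200/month — which is why "CPU first" is the default advice.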
Training vs Inference
| Phase | Compute Needs | Duration | Cost Strategy |
|---|---|---|---|
| Training | High (GPU often) | Hours to days | Use spot instances (60-90% savings) |
| Inference | Lower (CPU often OK) | Continuous | Use reserved instances or serverless |
Cloud Services for ML
The Big Three
Cloud Services Comparison
| Feature | AWS SageMaker | GCP Vertex AI | Azure ML |
|---|---|---|---|
| Notebooks | SageMaker Studio | Vertex Workbench | Azure ML Studio |
| Training | Training Jobs | Custom Training | Training Pipelines |
| Deployment | Endpoints | Endpoints | Managed Endpoints |
| AutoML | Autopilot | AutoML | AutoML |
| MLOps | Pipelines | Pipelines | Designer + Pipelines |
| Containers | ECR + ECS/EKS | GCR + GKE/Cloud Run | ACR + ACI/AKS |
| Serverless | Lambda | Cloud Functions | Azure Functions |
| Pricing | Pay-as-you-go | Pay-as-you-go | Pay-as-you-go |
Simpler Deployment Options
For college projects and small services, you don't need the full power of SageMaker or Vertex AI:
| Platform | Best For | Free Tier | Complexity |
|---|---|---|---|
| Render | Simple API hosting | 750 hours/month | ⭐ Very Low |
| Railway | Python apps + DB | $5 credit/month | ⭐ Very Low |
| Fly.io | Docker containers | 3 shared VMs | ⭐⭐ Low |
| AWS Lambda | Serverless functions | 1M requests/month | ⭐⭐ Low |
| Google Cloud Run | Container-based APIs | 2M requests/month | ⭐⭐ Low |
| Heroku | Full-stack apps | Eco plan $5/month | ⭐⭐ Low |
We'll use local development (FastAPI + uvicorn) for most labs. For the final project, you may optionally deploy to a cloud platform.
CI/CD Basics for ML
What is CI/CD?
CI/CD stands for Continuous Integration / Continuous Deployment. It automates the process of testing and deploying code changes.
The Assembly Line Analogy
CI/CD is like a car assembly line:
- CI = Quality checks at every station (unit tests, linting, building)
- CD = The car rolls off the line and drives to the dealership (deployment)
Without CI/CD, it's like hand-building each car and manually driving it to the customer.
CI/CD for ML — What's Different?
Traditional CI/CD tests code. ML CI/CD must also test data and models:
| Traditional CI/CD | ML CI/CD |
|---|---|
| Unit tests pass? | Unit tests pass? |
| Code compiles? | Code compiles? |
| — | Data validation passes? |
| — | Model metrics above threshold? |
| — | No data drift detected? |
| — | Model size within limits? |
| Deploy application | Deploy model + application |
Example: GitHub Actions for ML
name: ML Pipeline

on:
  push:
    branches: [main]

jobs:
  test-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run tests
        run: pytest tests/ -v

      - name: Check model metrics
        run: python scripts/validate_model.py

      - name: Build Docker image
        run: docker build -t ml-api:latest .
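The "Check model metrics" step assumes a gate script at scripts/validate_model.py. A minimal sketch of such a gate (the metric names, threshold values, and metrics-file path are illustrative, not a fixed course API) might look like:

```python
# Hypothetical CI gate: fail the job (non-zero exit) when any model
# metric falls below its agreed threshold.
import sys

THRESHOLDS = {"accuracy": 0.85, "f1": 0.80}  # illustrative values

def failing_metrics(metrics: dict, thresholds: dict = THRESHOLDS) -> list:
    """Return the names of metrics that miss their threshold (missing counts as 0)."""
    return [name for name, floor in thresholds.items()
            if metrics.get(name, 0.0) < floor]

# In the real script these would be loaded from a file written by the
# training step, e.g. json.load(open("models/metrics.json")) (assumed path).
metrics = {"accuracy": 0.91, "f1": 0.87}

failures = failing_metrics(metrics)
if failures:
    print(f"Model validation failed: {failures}")
    sys.exit(1)
print("Model metrics above threshold.")
```

Because the script exits non-zero on failure, GitHub Actions marks the step red and the Docker image is never built — exactly the "model metrics above threshold?" check from the table above.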
Environment Reproducibility
The Reproducibility Pyramid
Minimum Reproducibility Checklist
| File | Purpose | Required? |
|---|---|---|
| requirements.txt | Python dependencies with versions | ✅ Yes |
| Dockerfile | Complete environment definition | ✅ Yes (for deployment) |
| .dockerignore | Exclude unnecessary files | ✅ Yes |
| .gitignore | Exclude generated files from Git | ✅ Yes |
| README.md | Setup and run instructions | ✅ Yes |
| pyproject.toml | Project metadata and tool config | Recommended |
| .env.example | Template for environment variables | Recommended |
| Makefile | Shortcuts for common commands | Optional |
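The .env.example pattern works because the application reads its settings from environment variables with sensible defaults. A stdlib-only sketch (the variable names here are illustrative, not a fixed course convention):

```python
# Read app configuration from environment variables, falling back to
# defaults when a variable is unset — the same keys a .env.example
# file would document for teammates.
import os

def get_settings() -> dict:
    """Collect settings from the environment (names are illustrative)."""
    return {
        "model_path": os.getenv("MODEL_PATH", "models/model_v1.0.0.pkl"),
        "api_port": int(os.getenv("API_PORT", "8000")),
        "log_level": os.getenv("LOG_LEVEL", "info"),
    }

print(get_settings())
```

The matching .env.example would list MODEL_PATH, API_PORT, and LOG_LEVEL with placeholder values, while the real .env (gitignored) holds the actual ones.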
Standard Project Structure
ml-project/
├── app/
│ ├── __init__.py
│ ├── main.py # FastAPI application
│ ├── model.py # Model loading and prediction
│ └── schemas.py # Pydantic request/response models
├── models/
│ └── model_v1.0.0.pkl # Serialized model
├── data/
│ ├── raw/ # Original data (gitignored)
│ └── processed/ # Cleaned data
├── tests/
│ ├── __init__.py
│ ├── test_api.py # API endpoint tests
│ └── test_model.py # Model prediction tests
├── notebooks/
│ └── exploration.ipynb # Data exploration (gitignored in prod)
├── scripts/
│ └── train.py # Training script
├── .gitignore
├── .dockerignore
├── Dockerfile
├── requirements.txt
├── README.md
└── pyproject.toml
Every lab in this course follows this project structure. You'll build it incrementally — starting with the environment setup in TP1, adding the model in Module 2, the API in Module 3, and tests in Module 5.
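If you want to bootstrap this layout in one step rather than creating each file by hand, a small stdlib-only scaffolding sketch (the file list mirrors the tree above; this is a convenience, not part of the official labs):

```python
# Create the standard ml-project skeleton: directories plus empty
# placeholder files, matching the tree shown above.
from pathlib import Path

DIRS = ["app", "models", "data/raw", "data/processed",
        "tests", "notebooks", "scripts"]
FILES = ["app/__init__.py", "app/main.py", "app/model.py", "app/schemas.py",
         "tests/__init__.py", "tests/test_api.py", "tests/test_model.py",
         "scripts/train.py", "requirements.txt", "README.md",
         ".gitignore", ".dockerignore", "Dockerfile", "pyproject.toml"]

def scaffold(root: str = "ml-project") -> Path:
    """Create the directory tree and empty placeholder files; return the root."""
    root_path = Path(root)
    for d in DIRS:
        (root_path / d).mkdir(parents=True, exist_ok=True)
    for f in FILES:
        (root_path / f).touch()  # create empty file if missing
    return root_path

# Usage: scaffold("my-new-project") creates the skeleton in that directory.
```

mkdir(parents=True, exist_ok=True) makes the script safe to re-run on an existing project.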
Summary
Infrastructure Decision Tree
Key Takeaways
| # | Concept | Remember |
|---|---|---|
| 1 | Virtual environments | Always isolate project dependencies |
| 2 | Pin versions | requirements.txt with exact versions |
| 3 | Docker | Package everything for reproducibility |
| 4 | CPU first | Only use GPU if deep learning demands it |
| 5 | Cloud options | Simple platforms (Render, Cloud Run) for small projects |
| 6 | CI/CD | Automate testing and deployment |
| 7 | Project structure | Follow conventions for maintainability |
In TP1, you'll put these concepts into practice by setting up your project environment from scratch — creating a virtual environment, installing dependencies, and building the standard project structure.