TP1 - Project Brief and Environment Setup

Practical Lab · 60 min · Beginner

Learning Objectives

By the end of this lab, you will be able to:

Create a structured project directory for ML deployment
Configure a Python virtual environment with venv
Install and manage dependencies with pip and requirements.txt
Write a project brief for an ML project
Verify that the environment is functional with a test script

Project Context

Throughout this course, you will build a complete ML prediction service — from model training to API deployment. This first lab lays the foundations.

Prerequisites

Item	Description
Python 3.10+	Installed and accessible via terminal (`python --version`)
pip	Python package manager (included with Python)
Git	For code versioning
Terminal	PowerShell (Windows) or bash (macOS/Linux)
Code editor	VS Code, Cursor, or PyCharm

Verify your Python installation

Before starting, run these commands in your terminal:

python --version    # Should display Python 3.10+
pip --version       # Should display pip 23+
git --version       # Should display git 2+

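If a command fails, install the missing component before continuing. On macOS and Linux the interpreter is often installed as python3; if python is not found, try python3 --version and use python3 to create the virtual environment in Step 2.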

Step 1: Create the project structure

1.1 Create the root directory

Open your terminal and create the project directory:

# Create the main directory
mkdir ml-deployment-project
cd ml-deployment-project

1.2 Create the full directory tree

Create the standard structure for an ML project:

# Create the directories
mkdir -p app
mkdir -p models
mkdir -p data/raw
mkdir -p data/processed
mkdir -p tests
mkdir -p scripts
mkdir -p notebooks
mkdir -p docs

1.3 Create the initialization files

# __init__.py files for Python packages
touch app/__init__.py
touch tests/__init__.py

On Windows (PowerShell)

Replace touch with New-Item:

New-Item -ItemType File -Path app/__init__.py
New-Item -ItemType File -Path tests/__init__.py

Or simply use your code editor to create the files.
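Because the shell commands above differ between bash and PowerShell, the same layout can also be created from Python. The following is a minimal cross-platform sketch using only pathlib; the function name create_structure and the two constants are illustrative choices, not part of the course material.

```python
from pathlib import Path

# Directory and file names taken from steps 1.2 and 1.3.
DIRECTORIES = [
    "app", "models", "data/raw", "data/processed",
    "tests", "scripts", "notebooks", "docs",
]
PACKAGE_FILES = ["app/__init__.py", "tests/__init__.py"]


def create_structure(root: Path) -> None:
    """Create the lab's directories and empty __init__.py files under root."""
    for name in DIRECTORIES:
        # parents=True mirrors `mkdir -p`; exist_ok=True makes reruns safe.
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in PACKAGE_FILES:
        # touch() creates an empty file; an existing file keeps its content.
        (root / name).touch(exist_ok=True)


if __name__ == "__main__":
    import tempfile

    # Demonstrate in a throwaway directory; pass Path.cwd() instead to build
    # the structure inside the current project directory.
    with tempfile.TemporaryDirectory() as tmp:
        create_structure(Path(tmp))
        for path in sorted(Path(tmp).rglob("*")):
            print(path.relative_to(tmp).as_posix())
```

Running the function twice is harmless, which makes it convenient to keep in scripts/ and rerun whenever the structure needs to be restored.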

1.4 Verify the structure

Your project should look like this:

ml-deployment-project/
├── app/
│   └── __init__.py
├── models/
├── data/
│   ├── raw/
│   └── processed/
├── tests/
│   └── __init__.py
├── scripts/
├── notebooks/
└── docs/

Verification — Command to display the directory tree

# macOS/Linux
find . -type f -o -type d | head -20

# Windows PowerShell
Get-ChildItem -Recurse -Depth 2 | Select-Object FullName

You should see all directories and the __init__.py files.
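If neither command produces readable output on your system, a tree view similar to the one in step 1.4 can be printed from Python. This is a small sketch with a hypothetical helper named render_tree; it lists directories before files and uses the same box-drawing characters as the diagram above.

```python
from pathlib import Path


def render_tree(root: Path, prefix: str = "") -> list[str]:
    """Return the lines of a directory tree, directories listed before files."""
    entries = sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name))
    lines = []
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        connector = "└── " if last else "├── "
        suffix = "/" if entry.is_dir() else ""
        lines.append(f"{prefix}{connector}{entry.name}{suffix}")
        if entry.is_dir():
            # Continue the vertical bar only while more siblings follow.
            extension = "    " if last else "│   "
            lines.extend(render_tree(entry, prefix + extension))
    return lines


# Example call from the project root:
#     print("\n".join(render_tree(Path("."))))
```

Once the virtual environment exists (Step 2), the helper will also descend into .venv/, so it is most useful at this early stage.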

Step 2: Configure the virtual environment

2.1 Create the virtual environment

# From the project root
python -m venv .venv

2.2 Activate the environment

# Windows (PowerShell)
.venv\Scripts\Activate.ps1

# Windows (CMD)
.venv\Scripts\activate.bat

# macOS / Linux
source .venv/bin/activate

PowerShell — Execution policy

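If PowerShell refuses to run the activation script (for example, with an error saying that running scripts is disabled on this system), run this first: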

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Then try activating again.

2.3 Verify activation

# The prompt should show (.venv)
(.venv) $ python --version
Python 3.11.x

# Verify that pip points to the virtual environment
(.venv) $ pip --version
pip 23.x.x from .../ml-deployment-project/.venv/lib/...

Verification — How to know if the environment is active?

Your prompt displays (.venv) at the beginning
which python (Linux/macOS) or Get-Command python (PowerShell) points to .venv/
pip list shows very few packages (pip, plus setuptools on Python 3.11 and earlier)
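The same check can be made from inside Python, which is handy in scripts. The sketch below relies on the documented behavior that, inside a venv, sys.prefix differs from sys.base_prefix; the function name in_virtualenv is an illustrative choice.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter:", sys.executable)
```

With the environment active, the interpreter path printed on the second line should be located under .venv/.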

2.4 Update pip

python -m pip install --upgrade pip

Step 3: Install dependencies

3.1 Create the requirements.txt file

Create a requirements.txt file at the project root with the following content:

# ===========================================
# ML Deployment Project - Dependencies
# ===========================================

# --- Core ML ---
scikit-learn==1.4.2
pandas==2.2.0
numpy==1.26.4
joblib==1.3.2

# --- API Framework ---
fastapi==0.109.0
uvicorn[standard]==0.27.0
pydantic==2.5.3

# --- Testing ---
pytest==8.0.0
pytest-cov==4.1.0
httpx==0.26.0

# --- Model Explainability ---
shap==0.44.1
lime==0.2.0.1

# --- Utilities ---
python-dotenv==1.0.0
requests==2.31.0

3.2 Install the dependencies

pip install -r requirements.txt

Installation taking a long time?

Installing SHAP and its dependencies can take 2-5 minutes. This is normal: the packages are large, and if no prebuilt wheel matches your platform, pip has to build SHAP's C extensions from source.

If SHAP installation fails, you can temporarily comment it out in requirements.txt and install it later.

3.3 Verify the installation

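# Verify that the main packages are installed (macOS/Linux)
pip list | grep -iE "scikit-learn|fastapi|pandas|pytest|shap"

# Windows PowerShell
pip list | Select-String -Pattern "scikit-learn|fastapi|pandas|pytest|shap"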

Verification — Expected output

You should see something like:

fastapi          0.109.0
pandas           2.2.0
pytest           8.0.0
pytest-cov       4.1.0
scikit-learn     1.4.2
shap             0.44.1

If a package is missing, check the installation errors and try again with:

pip install <package-name>
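To compare every pinned version against what is actually installed, the requirements file can be checked programmatically. The following is a simplified sketch, not a replacement for pip's own parser: it only understands lines of the form name==version (optionally with extras such as [standard]) and ignores everything else. The function names parse_pins and find_mismatches are hypothetical.

```python
from importlib import metadata
from pathlib import Path
from typing import Optional


def parse_pins(text: str) -> dict[str, str]:
    """Extract {distribution: version} from lines of the form name==version."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Strip extras such as "[standard]" so the name matches the distribution.
        pins[name.split("[", 1)[0].strip()] = version.strip()
    return pins


def find_mismatches(pins: dict[str, str]) -> dict[str, tuple[str, Optional[str]]]:
    """Return {name: (pinned, installed)} for every pin that is not satisfied."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # not installed in this environment
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    requirements = Path("requirements.txt")
    if requirements.is_file():
        problems = find_mismatches(parse_pins(requirements.read_text()))
        for name, (pinned, installed) in problems.items():
            print(f"{name}: pinned {pinned}, installed {installed}")
```

An empty result means every pinned package is installed at exactly the pinned version.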

Step 4: Create configuration files

4.1 .gitignore file

Create a .gitignore file at the root:

# Python
__pycache__/
*.py[cod]
*.egg-info/
dist/
build/
*.egg

# Virtual environment
.venv/
venv/
ENV/

# IDE
.vscode/
.idea/
*.swp
*.swo

# Data files (too large for Git)
data/raw/
*.csv
*.parquet
*.h5

# Model files (too large for Git)
models/*.pkl
models/*.joblib
models/*.onnx

# Environment variables
.env

# Jupyter
.ipynb_checkpoints/

# OS
.DS_Store
Thumbs.db

# Testing
.pytest_cache/
htmlcov/
.coverage

4.2 README.md file

Create a README.md file at the root:

# ML Deployment Project

A complete machine learning deployment pipeline — from model training to production API.

## Quick Start

```bash
# Clone the repository
git clone <repo-url>
cd ml-deployment-project

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest tests/ -v

# Start the API server
uvicorn app.main:app --reload
```

## Project Structure

```
ml-deployment-project/
├── app/            # FastAPI application
├── models/         # Serialized ML models
├── data/           # Datasets (raw + processed)
├── tests/          # Unit and integration tests
├── scripts/        # Training and utility scripts
├── notebooks/      # Jupyter notebooks for exploration
└── docs/           # Project documentation
```

## Technology Stack

- **ML**: scikit-learn, pandas, NumPy
- **API**: FastAPI, uvicorn
- **Testing**: pytest, httpx
- **Explainability**: SHAP, LIME

4.3 pyproject.toml file

Create a pyproject.toml file at the root:

```toml
[project]
name = "ml-deployment-project"
version = "0.1.0"
description = "ML model deployment with FastAPI"
requires-python = ">=3.10"

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
addopts = "-v --tb=short"

[tool.ruff]
line-length = 88
# Match the minimum version declared in requires-python above
target-version = "py310"
```

Step 5: Write the project brief

5.1 Create the document

Create the file docs/project_brief.md with the following content. Adapt it to your project — this template is for a customer churn prediction model.

# Project Brief — Customer Churn Prediction Service

## 1. Problem Statement

Predict whether a customer will cancel their subscription within the next 30 days,
based on their usage patterns and account information.

## 2. End Users

- **Primary**: Customer Success team (via web dashboard)
- **Secondary**: Marketing team (batch reports)

## 3. Data Requirements

| Feature | Type | Source |
|---------|------|--------|
| tenure_months | int | CRM database |
| monthly_charges | float | Billing system |
| total_charges | float | Billing system |
| contract_type | categorical | CRM database |
| internet_service | categorical | Service DB |
| tech_support_calls | int | Support tickets |

## 4. Model Requirements

| Metric | Target |
|--------|--------|
| AUC-ROC | > 0.85 |
| Precision | > 0.80 |
| Recall | > 0.70 |
| Inference latency | < 200ms |

## 5. Input / Output

**Input (API request):**
```json
{
  "tenure_months": 24,
  "monthly_charges": 65.5,
  "total_charges": 1572.0,
  "contract_type": "month-to-month",
  "internet_service": "fiber_optic",
  "tech_support_calls": 3
}
```

**Output (API response):**
```json
{
  "churn_probability": 0.73,
  "risk_level": "high",
  "model_version": "1.0.0"
}
```

## 6. Scope

### In Scope

- Binary classification model (churn / no churn)
- REST API with FastAPI
- Unit tests with pytest
- Model explainability (SHAP)
- API documentation (Swagger)

### Out of Scope

- Real-time streaming predictions
- Mobile application
- Multi-tenant architecture
- GPU inference

## 7. Timeline

| Week | Deliverable |
|------|-------------|
| 1-2 | Environment setup + project brief |
| 3-5 | Model training + evaluation |
| 6-8 | API development + documentation |
| 9-10 | Testing + explainability |
| 11-15 | Integration + final project |
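The input and output shapes from section 5 of the brief can be written down as typed structures early, before any API code exists. The sketch below uses only the standard library. The class names, the non-negativity rule, and especially the risk_level cut-offs (0.4 and 0.7) are illustrative assumptions; the brief itself only shows that a probability of 0.73 is labeled "high".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChurnRequest:
    """Fields from the brief's example API request."""

    tenure_months: int
    monthly_charges: float
    total_charges: float
    contract_type: str
    internet_service: str
    tech_support_calls: int

    def __post_init__(self) -> None:
        # Assumed sanity rule: counts and amounts cannot be negative.
        numeric = ("tenure_months", "monthly_charges",
                   "total_charges", "tech_support_calls")
        for name in numeric:
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")


@dataclass(frozen=True)
class ChurnResponse:
    """Fields from the brief's example API response."""

    churn_probability: float
    risk_level: str
    model_version: str


def risk_level(probability: float) -> str:
    """Map a probability to a label; the 0.4 / 0.7 cut-offs are hypothetical."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= 0.7:
        return "high"
    if probability >= 0.4:
        return "medium"
    return "low"


def build_response(probability: float, model_version: str = "1.0.0") -> ChurnResponse:
    """Assemble a response object from a predicted probability."""
    return ChurnResponse(probability, risk_level(probability), model_version)
```

Writing the contract this way makes gaps in the brief visible, such as which values contract_type may take or where the risk thresholds should sit.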

Customize your brief

You can choose another ML problem if you prefer:

Spam detection (text classification)
House price prediction (regression)
Image classification (if you're comfortable with deep learning)

The important thing is to clearly document the inputs, outputs, metrics, and scope.
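Documented metric targets are most useful when they can be checked automatically. As a sketch of that idea, the snippet below computes precision and recall from confusion-matrix counts and reports which targets from section 4 of the churn brief are not met. The function names are hypothetical, the counts in the demo are made up for illustration, and the latency target is left out because it is not a classification metric.

```python
# Targets copied from section 4 of the brief (latency omitted here).
TARGETS = {"auc_roc": 0.85, "precision": 0.80, "recall": 0.70}


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Compute precision and recall from confusion-matrix counts."""
    # precision = TP / (TP + FP), recall = TP / (TP + FN); 0.0 if undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def unmet_targets(metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics that do not strictly exceed their target."""
    return [
        name for name, target in TARGETS.items()
        if metrics.get(name, 0.0) <= target
    ]


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    precision, recall = precision_recall(tp=70, fp=30, fn=10)
    metrics = {"auc_roc": 0.90, "precision": precision, "recall": recall}
    print("Unmet targets:", unmet_targets(metrics))
```

A missing metric is treated as unmet, so forgetting to report one shows up in the result instead of passing silently.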

Step 6: Verification script

6.1 Create the test script

Create the file scripts/verify_setup.py:

```python
"""
Verification script for ML Deployment Project setup.
Run this script to confirm your environment is correctly configured.
"""

import sys


def check_python_version():
    version = sys.version_info
    assert version.major == 3 and version.minor >= 10, (
        f"Python 3.10+ required, got {version.major}.{version.minor}"
    )
    print(f"  Python {version.major}.{version.minor}.{version.micro}")


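def check_import(module_name, display_name=None):
    from importlib import import_module, metadata

    display = display_name or module_name
    try:
        mod = import_module(module_name)
    except ImportError:
        print(f"  {display} — NOT INSTALLED")
        return False

    # Not every package exposes __version__; fall back to the installed
    # distribution's metadata (the display name doubles as the pip name).
    version = getattr(mod, "__version__", None)
    if version is None:
        try:
            version = metadata.version(display)
        except metadata.PackageNotFoundError:
            version = "unknown"
    print(f"  {display} {version}")
    return True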


def check_project_structure():
    from pathlib import Path

    required_dirs = ["app", "models", "data", "tests", "scripts", "docs"]
    required_files = [
        "requirements.txt",
        "app/__init__.py",
        "tests/__init__.py",
    ]

    project_root = Path(__file__).parent.parent
    missing = []

    for d in required_dirs:
        if not (project_root / d).is_dir():
            missing.append(f"Directory: {d}/")

    for f in required_files:
        if not (project_root / f).is_file():
            missing.append(f"File: {f}")

    if missing:
        print("  MISSING:")
        for m in missing:
            print(f"    - {m}")
        return False

    print(f"  All {len(required_dirs)} directories present")
    print(f"  All {len(required_files)} required files present")
    return True


def main():
    print("=" * 50)
    print("ML Deployment Project — Setup Verification")
    print("=" * 50)
    all_ok = True

    print("\n[1/3] Python Version")
    try:
        check_python_version()
    except AssertionError as e:
        print(f"  FAIL: {e}")
        all_ok = False

    print("\n[2/3] Package Imports")
    packages = [
        ("sklearn", "scikit-learn"),
        ("pandas", None),
        ("numpy", None),
        ("fastapi", None),
        ("uvicorn", None),
        ("pydantic", None),
        ("pytest", None),
        ("httpx", None),
        ("joblib", None),
    ]

    for module_name, display_name in packages:
        if not check_import(module_name, display_name):
            all_ok = False

    # SHAP and LIME are optional (may fail on some systems)
    print("\n  Optional packages:")
    check_import("shap")
    check_import("lime")

    print("\n[3/3] Project Structure")
    if not check_project_structure():
        all_ok = False

    print("\n" + "=" * 50)
    if all_ok:
        print("ALL CHECKS PASSED — Your environment is ready!")
    else:
        print("SOME CHECKS FAILED — Review the errors above.")
    print("=" * 50)

    return 0 if all_ok else 1


if __name__ == "__main__":
    sys.exit(main())
```

6.2 Run the verification

python scripts/verify_setup.py

Verification — Expected output

==================================================
ML Deployment Project — Setup Verification
==================================================

[1/3] Python Version
  Python 3.11.5

[2/3] Package Imports
  scikit-learn 1.4.2
  pandas 2.2.0
  numpy 1.26.4
  fastapi 0.109.0
  uvicorn 0.27.0
  pydantic 2.5.3
  pytest 8.0.0
  httpx 0.26.0
  joblib 1.3.2

  Optional packages:
  shap 0.44.1
  lime 0.2.0.1

[3/3] Project Structure
  All 6 directories present
  All 3 required files present

==================================================
ALL CHECKS PASSED — Your environment is ready!
==================================================
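Because main() returns 0 on success and 1 on failure, and that value is passed to sys.exit, the script's exit code can be consumed by other tooling. The following sketch shows one way to do that from Python; the helper name run_check is an assumption, and the script path is the one created in step 6.1.

```python
import subprocess
import sys
from pathlib import Path


def run_check(script: Path) -> bool:
    """Run a script with the current interpreter; True means exit code 0."""
    # sys.executable keeps the check inside the active virtual environment.
    result = subprocess.run([sys.executable, str(script)], check=False)
    return result.returncode == 0


if __name__ == "__main__":
    script = Path("scripts/verify_setup.py")
    if script.is_file():
        print("setup ok" if run_check(script) else "setup incomplete")
```

The same exit code is what a shell or CI job sees, so a failing check can stop a pipeline without parsing the printed report.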

Step 7: Create a first test

7.1 Write a unit test

Create the file tests/test_setup.py:

"""Basic tests to verify the project environment."""

import importlib


def test_python_version():
    import sys
    assert sys.version_info >= (3, 10), "Python 3.10+ required"


def test_sklearn_import():
    sklearn = importlib.import_module("sklearn")
    assert hasattr(sklearn, "__version__")


def test_fastapi_import():
    fastapi = importlib.import_module("fastapi")
    assert hasattr(fastapi, "FastAPI")


def test_pandas_import():
    pd = importlib.import_module("pandas")
    assert hasattr(pd, "DataFrame")


def test_project_structure():
    from pathlib import Path

    root = Path(__file__).parent.parent
    assert (root / "app").is_dir()
    assert (root / "models").is_dir()
    assert (root / "tests").is_dir()
    assert (root / "requirements.txt").is_file()

7.2 Run the tests

pytest tests/test_setup.py -v

Verification — Expected output

========================= test session starts =========================
collected 5 items

tests/test_setup.py::test_python_version PASSED          [ 20%]
tests/test_setup.py::test_sklearn_import PASSED           [ 40%]
tests/test_setup.py::test_fastapi_import PASSED           [ 60%]
tests/test_setup.py::test_pandas_import PASSED            [ 80%]
tests/test_setup.py::test_project_structure PASSED        [100%]

========================= 5 passed in 0.42s =========================

All tests must pass (5/5). If a test fails, verify that the corresponding package is installed.
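When an import test fails, it helps to see at a glance which modules cannot be found. The sketch below uses importlib.util.find_spec, which locates a top-level module without importing it; the helper name missing_modules is hypothetical, and the module list in the demo mirrors the imports exercised by tests/test_setup.py.

```python
from importlib.util import find_spec


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be located."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["sklearn", "fastapi", "pandas"])
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All modules found")
```

Note that the import name can differ from the pip name: a missing sklearn module is fixed by installing scikit-learn.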

Step 8: Initialize Git

8.1 Initialize the repository

git init
git add .
git commit -m "Initial project setup: structure, dependencies, and verification"

8.2 Verify the status

git status
git log --oneline

Verification — What Git should ignore

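Verify that .gitignore works correctly. Right after the commit the working tree is clean, so git status has nothing to report; list the tracked files instead:

git ls-files

The list should NOT contain any path under:

.venv/ (virtual environment)
__pycache__/ (compiled files)
data/raw/ (raw data)

If such paths appear, check your .gitignore.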
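One related detail: Git tracks files, not directories, so directories that are still empty at this point (models/, data/processed/, notebooks/) are not part of the commit and will be missing from a fresh clone, where test_project_structure would then fail on models/. A common convention, which is not a Git feature, is to put an empty placeholder file in each such directory. The sketch below assumes the placeholder name .gitkeep and a hypothetical helper add_placeholders.

```python
from pathlib import Path


def add_placeholders(
    root: Path, directories: list[str], name: str = ".gitkeep"
) -> list[Path]:
    """Create an empty placeholder in each listed directory that has no entries."""
    created = []
    for directory in directories:
        path = root / directory
        path.mkdir(parents=True, exist_ok=True)
        # Only directories without any entries need a placeholder.
        if not any(path.iterdir()):
            placeholder = path / name
            placeholder.touch()
            created.append(placeholder)
    return created


# Example call from the project root:
#     add_placeholders(Path("."), ["models", "data/processed", "notebooks"])
```

data/raw/ is deliberately left out of the example call: it is listed in .gitignore, so a placeholder there would be ignored as well. After creating the placeholders, add and commit them like any other file.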

Lab Summary

What you have accomplished

Step	Description	Status
1	Project structure created	☐
2	Virtual environment configured	☐
3	Dependencies installed	☐
4	Configuration files created (.gitignore, README, pyproject.toml)	☐
5	Project brief written	☐
6	Verification script run successfully	☐
7	First unit test written and passed	☐
8	Git repository initialized	☐

Final project structure

ml-deployment-project/
├── app/
│   └── __init__.py
├── models/
├── data/
│   ├── raw/
│   └── processed/
├── tests/
│   ├── __init__.py
│   └── test_setup.py
├── scripts/
│   └── verify_setup.py
├── notebooks/
├── docs/
│   └── project_brief.md
├── .gitignore
├── .venv/                  (not in Git)
├── pyproject.toml
├── README.md
└── requirements.txt

Essential commands to remember

Action	Command
Activate environment	`source .venv/bin/activate` or `.venv\Scripts\activate`
Install dependencies	`pip install -r requirements.txt`
Run tests	`pytest tests/ -v`
Verify environment	`python scripts/verify_setup.py`
Start API (Module 3)	`uvicorn app.main:app --reload`

Next lab

In Lab 2 (Module 2), you will train a classification model with scikit-learn and serialize it for deployment. The structure you just created will serve as the foundation for the rest of the course.
