LIA - Detailed Requirements
Requirements Overview
Your LIA is evaluated across six components. Each component has specific requirements that must be met. This page provides the detailed rubric and checklists for each.
A. Model Training (20%)
Objectives
Demonstrate your ability to select, train, evaluate, and serialize machine learning models using a structured methodology.
Requirements
- Dataset Selection: Choose an appropriate dataset (at least 500 samples, ≥ 5 features)
- Exploratory Data Analysis: Perform thorough EDA with visualizations (distributions, correlations, missing values)
- Data Preprocessing: Handle missing values, encode categorical variables, normalize/scale features
- Train at least 2 models: Different algorithms (e.g., Logistic Regression vs Random Forest)
- Evaluate with at least 3 metrics: Choose metrics appropriate for your problem type
- Hyperparameter tuning: Attempt at least basic tuning (GridSearch or RandomizedSearch)
- Serialize the best model: Save with joblib, pickle, or ONNX
- Document your methodology: Justify every decision in your notebook and report
Recommended Metrics by Problem Type
| Problem Type | Metric 1 | Metric 2 | Metric 3 | Metric 4 (bonus) |
|---|---|---|---|---|
| Binary Classification | Accuracy | Precision / Recall | F1-Score | AUC-ROC |
| Multi-class Classification | Accuracy | Macro F1-Score | Confusion Matrix | Classification Report |
| Regression | MAE | RMSE | R² Score | MAPE |
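For regression projects, the four metrics in the last row can be computed directly with scikit-learn. A minimal sketch (the values are illustrative; `np.sqrt` is used for RMSE to stay compatible across scikit-learn versions):

```python
# Computing the recommended regression metrics: MAE, RMSE, R², MAPE.
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)

y_true = np.array([3.0, 5.0, 2.5, 7.0])  # illustrative ground truth
y_pred = np.array([2.8, 5.4, 2.2, 6.5])  # illustrative predictions

mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # root of MSE
r2 = r2_score(y_true, y_pred)
mape = mean_absolute_percentage_error(y_true, y_pred)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f} MAPE={mape:.1%}")
```

As with classification, report all metrics for every model you compare, not just the one that flatters your best model.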
Expected Code Structure
# notebooks/02_model_training.ipynb
# 1. Load and explore data
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv("data/dataset.csv")
X = df.drop(columns=["target"])
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# 2. Train Model A
from sklearn.linear_model import LogisticRegression
model_a = LogisticRegression(max_iter=1000)
model_a.fit(X_train, y_train)
# 3. Train Model B
from sklearn.ensemble import RandomForestClassifier
model_b = RandomForestClassifier(n_estimators=100, random_state=42)
model_b.fit(X_train, y_train)
# 4. Evaluate both models
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
results = {
    "Model": ["Logistic Regression", "Random Forest"],
    "Accuracy": [accuracy_score(y_test, model_a.predict(X_test)),
                 accuracy_score(y_test, model_b.predict(X_test))],
    "F1-Score": [f1_score(y_test, model_a.predict(X_test)),
                 f1_score(y_test, model_b.predict(X_test))],
    "AUC-ROC": [roc_auc_score(y_test, model_a.predict_proba(X_test)[:, 1]),
                roc_auc_score(y_test, model_b.predict_proba(X_test)[:, 1])]
}
pd.DataFrame(results)
# 5. Serialize best model
import joblib
joblib.dump(model_b, "models/best_model.pkl")
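The snippet above omits two of the requirements: hyperparameter tuning and serializing the model with metadata. A minimal sketch of both (the parameter grid and artifact layout are suggestions, not prescriptions, and a synthetic dataset stands in for yours):

```python
# Basic GridSearchCV tuning, then serialize the winner together with metadata.
import os
import joblib
from sklearn.datasets import make_classification  # stand-in for your dataset
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=500, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

param_grid = {"n_estimators": [50, 100, 200], "max_depth": [None, 5, 10]}
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid, cv=5, scoring="f1"
)
search.fit(X_train, y_train)

# Bundle the model with version, metrics, and feature names so /model-info
# can serve real values instead of hardcoded ones.
artifact = {
    "model": search.best_estimator_,
    "version": "1.0.0",
    "best_params": search.best_params_,
    "cv_f1": round(search.best_score_, 4),
    "features": [f"feature_{i + 1}" for i in range(X.shape[1])],
}
os.makedirs("models", exist_ok=True)
joblib.dump(artifact, "models/best_model.pkl")
```

If you serialize a dict like this, remember to load it with `joblib.load(...)["model"]` in your API.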
Grading Rubric — Model Training
| Criterion | Excellent (18-20) | Good (16-17) | Satisfactory (14-15) | Insufficient (< 14) |
|---|---|---|---|---|
| Dataset | Well-chosen, relevant, properly justified | Appropriate dataset | Dataset works but poorly justified | Inappropriate or too small dataset |
| EDA | Thorough analysis with ≥ 5 visualizations, insights documented | Good EDA with 3-4 visualizations | Basic EDA, few visualizations | No EDA or minimal exploration |
| Preprocessing | All steps justified, pipeline documented | Most steps present | Basic preprocessing only | Missing critical preprocessing steps |
| Models | ≥ 3 models compared, hyperparameter tuning performed | 2 models compared with basic tuning | 2 models trained, no tuning | Only 1 model, no comparison |
| Metrics | ≥ 4 metrics, correct interpretation, comparison table | 3 metrics with interpretation | 3 metrics reported, minimal interpretation | < 3 metrics or incorrect metrics |
| Serialization | Model saved with metadata (version, metrics, features) | Model saved correctly | Model saved but no metadata | Model not serialized |
| Methodology | Clear narrative, every decision justified | Most decisions explained | Some explanations provided | No methodology documentation |
B. API Service (25%)
Objectives
Build a production-ready REST API that serves predictions from your trained model, with proper input validation, error handling, and auto-generated documentation.
Requirements
- Framework: Use FastAPI (recommended) or Flask
- Endpoint /predict (POST): Accept input features, return prediction and confidence
- Endpoint /health (GET): Return service health status
- Endpoint /model-info (GET): Return model metadata (name, version, features, metrics)
- Input validation: Use Pydantic models (FastAPI) or manual validation (Flask)
- Error handling: Return proper HTTP status codes (400, 404, 422, 500) with descriptive messages
- Swagger documentation: Auto-generated and accessible at /docs
- Clean code: Functions documented, consistent naming, no hardcoded values
API Endpoint Specifications
| Endpoint | Method | Input | Output | Status Codes |
|---|---|---|---|---|
| /predict | POST | JSON with feature values | { prediction, confidence, model_version } | 200, 400, 422, 500 |
| /health | GET | None | { status: "healthy", timestamp } | 200 |
| /model-info | GET | None | { model_name, version, features, metrics } | 200 |
| /predict/batch | POST | Array of inputs | Array of predictions | 200, 400, 422 |
Implementing a /predict/batch endpoint for batch predictions earns bonus consideration. It demonstrates understanding of real-world API patterns.
Expected Code — FastAPI
# src/app.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from datetime import datetime
import joblib
import numpy as np
app = FastAPI(
    title="My ML Prediction API",
    description="LIA Project — AI Model Deployment",
    version="1.0.0"
)
model = joblib.load("models/best_model.pkl")
class PredictionInput(BaseModel):
    feature_1: float = Field(..., description="Description of feature 1")
    feature_2: float = Field(..., description="Description of feature 2")
    feature_3: float = Field(..., ge=0, description="Must be non-negative")

    class Config:
        json_schema_extra = {
            "example": {
                "feature_1": 5.1,
                "feature_2": 3.5,
                "feature_3": 1.4
            }
        }

class PredictionOutput(BaseModel):
    prediction: int
    confidence: float
    model_version: str

@app.get("/health")
def health_check():
    return {
        "status": "healthy",
        "timestamp": datetime.now().isoformat(),
        "model_loaded": model is not None
    }

@app.get("/model-info")
def model_info():
    return {
        "model_name": "RandomForestClassifier",
        "version": "1.0.0",
        "features": ["feature_1", "feature_2", "feature_3"],
        "training_metrics": {
            "accuracy": 0.95,
            "f1_score": 0.93,
            "auc_roc": 0.97
        }
    }

@app.post("/predict", response_model=PredictionOutput)
def predict(data: PredictionInput):
    try:
        features = np.array([[data.feature_1, data.feature_2, data.feature_3]])
        prediction = model.predict(features)[0]
        confidence = float(model.predict_proba(features).max())
        return PredictionOutput(
            prediction=int(prediction),
            confidence=round(confidence, 4),
            model_version="1.0.0"
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Prediction failed: {str(e)}")
Expected Code — Flask Alternative
# src/app.py
from flask import Flask, request, jsonify
from datetime import datetime
import joblib
import numpy as np
app = Flask(__name__)
model = joblib.load("models/best_model.pkl")
@app.route("/health", methods=["GET"])
def health_check():
    return jsonify({
        "status": "healthy",
        "timestamp": datetime.now().isoformat(),
        "model_loaded": model is not None
    })

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json()
    if not data:
        return jsonify({"error": "No input data provided"}), 400
    required_fields = ["feature_1", "feature_2", "feature_3"]
    for field in required_fields:
        if field not in data:
            return jsonify({"error": f"Missing field: {field}"}), 400
    try:
        features = np.array([[data["feature_1"], data["feature_2"], data["feature_3"]]])
        prediction = model.predict(features)[0]
        confidence = float(model.predict_proba(features).max())
        return jsonify({
            "prediction": int(prediction),
            "confidence": round(confidence, 4),
            "model_version": "1.0.0"
        })
    except Exception as e:
        return jsonify({"error": f"Prediction failed: {str(e)}"}), 500
Grading Rubric — API Service
| Criterion | Excellent (22-25) | Good (20-21) | Satisfactory (18-19) | Insufficient (< 18) |
|---|---|---|---|---|
| Endpoints | All 3 required + bonus endpoints | All 3 required endpoints work | 2 of 3 endpoints work | < 2 endpoints working |
| Validation | Pydantic models with constraints, type checking, examples | Basic Pydantic validation | Minimal validation | No input validation |
| Error Handling | All error codes correct, descriptive messages, edge cases handled | Most errors handled properly | Basic error handling | No error handling |
| Documentation | Swagger complete with examples, descriptions for all fields | Swagger accessible and useful | Swagger auto-generated, minimal customization | No API documentation |
| Code Quality | Clean architecture, separation of concerns, docstrings | Well-organized code | Code works but messy | Spaghetti code, hardcoded values |
| Response Format | Consistent JSON structure, proper status codes, timestamps | Consistent responses | Inconsistent response formats | Broken or missing responses |
C. Testing (15%)
Objectives
Demonstrate your ability to write automated tests that validate your model and API, and to use Postman for API testing.
Requirements
- Minimum 10 tests using pytest
- Unit tests: Test model loading, prediction logic, input validation
- Integration tests: Test API endpoints end-to-end
- Edge case tests: Test with invalid inputs, empty data, extreme values
- Code coverage > 70%: Measured with pytest-cov
- Postman collection: Exported .json with tests for all endpoints
- Test documentation: Explain what each test validates
Expected Test Code
# tests/test_api.py
import pytest
from fastapi.testclient import TestClient
from src.app import app
client = TestClient(app)
# --- Health Check Tests ---
def test_health_returns_200():
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"

def test_health_has_timestamp():
    response = client.get("/health")
    assert "timestamp" in response.json()

# --- Model Info Tests ---
def test_model_info_returns_200():
    response = client.get("/model-info")
    assert response.status_code == 200

def test_model_info_contains_version():
    response = client.get("/model-info")
    data = response.json()
    assert "version" in data
    assert "model_name" in data
    assert "features" in data

# --- Prediction Tests ---
def test_predict_valid_input():
    payload = {"feature_1": 5.1, "feature_2": 3.5, "feature_3": 1.4}
    response = client.post("/predict", json=payload)
    assert response.status_code == 200
    data = response.json()
    assert "prediction" in data
    assert "confidence" in data

def test_predict_confidence_range():
    payload = {"feature_1": 5.1, "feature_2": 3.5, "feature_3": 1.4}
    response = client.post("/predict", json=payload)
    confidence = response.json()["confidence"]
    assert 0.0 <= confidence <= 1.0

def test_predict_missing_field_returns_422():
    payload = {"feature_1": 5.1}  # Missing features
    response = client.post("/predict", json=payload)
    assert response.status_code == 422

def test_predict_wrong_type_returns_422():
    payload = {"feature_1": "not_a_number", "feature_2": 3.5, "feature_3": 1.4}
    response = client.post("/predict", json=payload)
    assert response.status_code == 422

def test_predict_empty_body_returns_422():
    response = client.post("/predict", json={})
    assert response.status_code == 422

# --- Edge Case Tests ---
def test_predict_extreme_values():
    payload = {"feature_1": 99999.0, "feature_2": -99999.0, "feature_3": 0.0}
    response = client.post("/predict", json=payload)
    assert response.status_code in [200, 400]
Running Tests with Coverage
# Run all tests
pytest tests/ -v
# Run with coverage report
pytest tests/ --cov=src --cov-report=term-missing
# Generate HTML coverage report
pytest tests/ --cov=src --cov-report=html
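To avoid retyping the coverage flags on every run, they can be made the default in a pytest.ini (a suggested configuration, assuming the `tests/` and `src/` layout used above):

```ini
# pytest.ini — apply coverage flags by default on every `pytest` run
[pytest]
testpaths = tests
addopts = --cov=src --cov-report=term-missing
```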
Postman Collection Requirements
Your Postman collection must include:
| Request | Method | Tests |
|---|---|---|
| Health Check | GET /health | Status is 200, body contains "healthy" |
| Model Info | GET /model-info | Status is 200, body contains model name |
| Valid Prediction | POST /predict | Status is 200, response has prediction and confidence |
| Missing Fields | POST /predict | Status is 422, error message present |
| Wrong Types | POST /predict | Status is 422, validation error |
| Empty Body | POST /predict | Status is 422 |
Each Postman request should include test scripts. Example:
pm.test("Status code is 200", function () {
    pm.response.to.have.status(200);
});

pm.test("Response has prediction", function () {
    var jsonData = pm.response.json();
    pm.expect(jsonData).to.have.property("prediction");
    pm.expect(jsonData).to.have.property("confidence");
});
Grading Rubric — Testing
| Criterion | Excellent (14-15) | Good (12-13) | Satisfactory (11) | Insufficient (< 11) |
|---|---|---|---|---|
| Test Count | ≥ 15 tests, comprehensive coverage | 10-14 tests, good variety | 10 tests, basic coverage | < 10 tests |
| Unit Tests | Model + validation + utility functions tested | Model and validation tested | Basic model test only | No unit tests |
| Integration Tests | All endpoints tested with multiple scenarios | All endpoints tested once | Most endpoints tested | Few endpoint tests |
| Edge Cases | Invalid inputs, extreme values, empty data, wrong types | Some edge cases covered | 1-2 edge cases | No edge cases |
| Coverage | > 80% | > 70% | > 60% | < 60% |
| Postman | Complete collection with test scripts for all requests | Collection with basic tests | Collection exists, few tests | No Postman collection |
D. Explainability (15%)
Objectives
Apply model interpretability techniques to explain your model's predictions and build trust in your AI system.
Requirements
- Apply LIME and/or SHAP to your trained model
- Generate at least 3 visualizations: Feature importance, individual explanations, summary plots
- Interpret results: Explain in plain language what the model is learning
- Identify potential biases: Check if certain features have unexpected influence
- Document findings: Include analysis in your report and notebook
Visualization Requirements
| # | Visualization | Tool | Description |
|---|---|---|---|
| 1 | Global Feature Importance | SHAP summary_plot or bar_plot | Which features matter most overall? |
| 2 | Individual Prediction Explanation | LIME explain_instance | Why did the model make this specific prediction? |
| 3 | Feature Interaction | SHAP dependence_plot | How do features interact? |
| 4 | Decision Boundary (if applicable) | LIME or custom | How does the model separate classes? |
| 5 | Waterfall Plot (bonus) | SHAP waterfall_plot | Step-by-step contribution of each feature |
Expected Code — SHAP
# notebooks/03_explainability.ipynb
import shap
import matplotlib.pyplot as plt
explainer = shap.TreeExplainer(model) # For tree-based models
shap_values = explainer.shap_values(X_test)
# 1. Global feature importance (summary plot)
shap.summary_plot(shap_values, X_test, show=False)
plt.title("SHAP Feature Importance — Global View")
plt.tight_layout()
plt.savefig("docs/shap_summary.png", dpi=150)
plt.show()
# 2. Single prediction explanation (waterfall)
# Note: for classifiers, some SHAP versions return one array per class;
# if shap_values is a list, select the positive class first (e.g. shap_values[1]),
# and use the matching entry of explainer.expected_value.
shap.plots.waterfall(shap.Explanation(
    values=shap_values[0],
    base_values=explainer.expected_value,
    data=X_test.iloc[0],
    feature_names=X_test.columns.tolist()
))
# 3. Feature dependence plot
shap.dependence_plot("feature_1", shap_values, X_test)
Expected Code — LIME
# notebooks/03_explainability.ipynb
from lime.lime_tabular import LimeTabularExplainer
explainer = LimeTabularExplainer(
    training_data=X_train.values,
    feature_names=X_train.columns.tolist(),
    class_names=["Class 0", "Class 1"],
    mode="classification"
)
# Explain a single prediction
instance = X_test.iloc[0].values
explanation = explainer.explain_instance(
    instance,
    model.predict_proba,
    num_features=10
)
# Show explanation
explanation.show_in_notebook()
# Save as HTML
explanation.save_to_file("docs/lime_explanation.html")
Interpretation Guide
When interpreting your results, answer these questions:
| Question | What to Look For |
|---|---|
| Which features are most important? | Top features in SHAP summary or LIME global |
| Are the important features logical? | Do they make business sense? |
| Are there any surprises? | Unexpected feature importance may indicate data leakage |
| Is the model biased? | Does it rely heavily on sensitive attributes? |
| How confident is the model? | Are predictions high-confidence or uncertain? |
If a feature has disproportionately high importance (e.g., one feature alone has SHAP value 10x higher than all others), it may indicate data leakage — a variable that wouldn't be available at prediction time. Investigate before finalizing your model.
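This dominance check can be automated from the SHAP values themselves. A small sketch — the array below stands in for the output of `explainer.shap_values(X_test)`, and the 10x threshold is the rule of thumb above, not a hard cutoff:

```python
# Flag a feature whose mean |SHAP| value dwarfs all others — a common
# symptom of data leakage worth investigating before finalizing the model.
import numpy as np

# Illustrative stand-in for explainer.shap_values(X_test):
# rows = samples, columns = features.
shap_values = np.array([
    [0.02, -0.01, 0.95],
    [-0.03, 0.02, 1.10],
    [0.01, -0.02, 0.88],
])
feature_names = ["feature_1", "feature_2", "feature_3"]

# Global importance = mean absolute SHAP value per feature.
importance = np.abs(shap_values).mean(axis=0)
ranked = sorted(zip(feature_names, importance), key=lambda p: -p[1])
top_name, top_val = ranked[0]
runner_up_val = ranked[1][1]
if top_val > 10 * runner_up_val:
    print(f"Possible leakage: {top_name} dominates "
          f"({top_val:.2f} vs next {runner_up_val:.2f})")
```

A flagged feature is not proof of leakage; check whether it would genuinely be available at prediction time before dropping it.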
Grading Rubric — Explainability
| Criterion | Excellent (14-15) | Good (12-13) | Satisfactory (11) | Insufficient (< 11) |
|---|---|---|---|---|
| Methods | Both LIME and SHAP applied | One method applied thoroughly | One method, basic usage | No explainability analysis |
| Visualizations | ≥ 5 clear, well-labeled visualizations | 3-4 visualizations | 2-3 basic visualizations | < 2 or poorly formatted |
| Interpretation | Deep analysis, business implications, bias check | Good interpretation, some insights | Surface-level interpretation | No interpretation, just plots |
| Documentation | Findings integrated in report with recommendations | Findings mentioned in report | Brief mention in report | Not documented |
E. Documentation & Report (10%)
Objectives
Produce professional documentation that would allow another developer to understand, run, and extend your project.
Requirements
- README.md: Project description, setup instructions, usage examples, API reference
- Technical report (5-8 pages): Follows the provided template
- API documentation: Swagger/OpenAPI accessible at /docs
- Code documentation: Docstrings for all public functions
- Clear writing: No spelling errors, consistent formatting, logical structure
README.md Requirements
Your README must include:
| Section | Content |
|---|---|
| Project Title | Clear, descriptive name |
| Description | 2-3 sentences explaining the project |
| Installation | Step-by-step setup instructions |
| Usage | How to start the API, example requests |
| API Endpoints | Table with all endpoints |
| Testing | How to run tests |
| Project Structure | Directory tree |
| Model Information | Algorithm, metrics, dataset |
| Author | Your name and student ID |
Technical Report Structure
Your report must follow the template provided in the Report Template section:
| Section | Pages | Key Content |
|---|---|---|
| Executive Summary | 0.5 | Overview of entire project |
| Problem Definition | 1 | Business context, dataset, objectives |
| Methodology | 1-1.5 | Preprocessing, model selection, evaluation strategy |
| Results | 1-1.5 | Metrics, comparisons, confusion matrices |
| API Design | 1 | Architecture, endpoints, response formats |
| Testing Strategy | 0.5 | Test plan, coverage, Postman results |
| Explainability | 1 | LIME/SHAP results, interpretation |
| Deployment | 0.5 | How to run, Docker, environment |
| Conclusion | 0.5 | Summary, lessons learned, future work |
Grading Rubric — Documentation & Report
| Criterion | Excellent (9-10) | Good (8) | Satisfactory (7) | Insufficient (< 7) |
|---|---|---|---|---|
| README | Complete, clear, anyone can set up the project | Most sections present, minor gaps | Basic README, missing sections | No README or unusable |
| Report Structure | All sections present, logical flow | Most sections, minor gaps | Missing 1-2 sections | Missing many sections |
| Writing Quality | Professional, clear, concise, no errors | Good writing, few minor errors | Understandable, some errors | Poor writing, many errors |
| Technical Depth | Deep analysis, justified decisions | Good depth, most decisions explained | Surface-level analysis | No depth, just descriptions |
| Formatting | Consistent, tables, figures captioned | Mostly consistent | Inconsistent formatting | No formatting effort |
F. Oral Presentation (15%)
Objectives
Present your project professionally, demonstrate a working system, and defend your technical decisions during Q&A.
Requirements
- Duration: 15 minutes presentation + 5 minutes Q&A
- Slide deck: 10-15 slides, clean design
- Live demo: Show the API working in real-time
- Technical depth: Explain model choice, metrics, architecture decisions
- Q&A readiness: Be able to answer questions about any aspect of your project
Presentation Time Allocation
| Section | Duration | Content |
|---|---|---|
| Introduction & Context | 2 min | Problem statement, dataset, objectives |
| Model Training & Results | 3 min | EDA highlights, model comparison, best model |
| API Architecture | 2 min | Endpoints, validation, error handling |
| Live Demo | 3 min | Show API working: health, predict, model-info |
| Testing & Explainability | 3 min | Test results, coverage, LIME/SHAP insights |
| Conclusion & Lessons | 2 min | What you learned, what you'd do differently |
| Q&A | 5 min | Answer instructor questions |
Grading Rubric — Oral Presentation
| Criterion | Excellent (14-15) | Good (12-13) | Satisfactory (11) | Insufficient (< 11) |
|---|---|---|---|---|
| Content | Comprehensive, well-structured, covers all components | Good coverage, minor gaps | Adequate coverage, missing details | Major components missing |
| Demo | API works flawlessly, multiple requests shown | Demo works, minor hiccups | Demo partially works | Demo fails or not attempted |
| Communication | Confident, clear, professional, good pace | Good delivery, minor nervousness | Adequate delivery | Unclear, reading from slides |
| Slides | Clean design, informative visuals, no text walls | Good slides, minor issues | Acceptable but text-heavy | Poor design, too much text |
| Q&A | Answers all questions confidently, shows deep understanding | Answers most questions well | Answers some questions | Cannot answer basic questions |
| Time Management | 14-16 minutes, well-paced | 12-18 minutes, mostly well-paced | Under 10 or over 20 minutes | Significantly over/under time |
For detailed presentation tips, time allocation advice, and common Q&A questions, see the Presentation Guide.
Complete Submission Checklist
Use this checklist before submitting your LIA:
Code & Model
- Git repository is clean and organized
- .gitignore excludes __pycache__, .env, large files
- requirements.txt with all dependencies (pinned versions)
- Trained model file is in models/ directory
- EDA notebook is complete and runs without errors
- Training notebook is complete and runs without errors
API
- API starts without errors (uvicorn src.app:app --reload)
- /health returns 200 with status and timestamp
- /predict accepts valid input and returns prediction + confidence
- /predict returns 422 for invalid input
- /model-info returns model metadata
- Swagger docs accessible at /docs
Testing
- At least 10 pytest tests pass
- Code coverage > 70%
- Postman collection exported to postman/collection.json
- Edge cases are tested (invalid types, missing fields, extreme values)
Explainability
- LIME and/or SHAP analysis complete
- At least 3 visualizations saved
- Interpretations written in notebook and report
Documentation
- README.md is complete with setup instructions
- Technical report is 5-8 pages, follows template
- All code has docstrings
Presentation
- Slide deck is 10-15 slides
- Live demo prepared and tested
- Practiced within 15-minute time limit
- Prepared for potential Q&A questions
Before submitting, clone your repository to a fresh directory and follow your own README instructions to set up the project from scratch. If you can't run it, neither can your instructor.