LIA - Detailed Requirements
Requirements Overview
Your LIA is evaluated across six components. Each component has specific requirements that must be met. This page provides the detailed rubric and checklists for each.
A. Model Training (20%)
Objectives
Demonstrate your ability to select, train, evaluate, and serialize machine learning models using a structured methodology.
Requirements
- Dataset Selection: Choose an appropriate dataset (at least 500 samples, ≥ 5 features)
- Exploratory Data Analysis: Perform thorough EDA with visualizations (distributions, correlations, missing values)
- Data Preprocessing: Handle missing values, encode categorical variables, normalize/scale features
- Train at least 2 models: Different algorithms (e.g., Logistic Regression vs Random Forest)
- Evaluate with at least 3 metrics: Choose metrics appropriate for your problem type
- Hyperparameter tuning: Attempt at least basic tuning (GridSearch or RandomizedSearch)
- Serialize the best model: Save with joblib, pickle, or ONNX
- Document your methodology: Justify every decision in your notebook and report
Recommended Metrics by Problem Type
| Problem Type | Metric 1 | Metric 2 | Metric 3 | Metric 4 (bonus) |
|---|---|---|---|---|
| Binary Classification | Accuracy | Precision / Recall | F1-Score | AUC-ROC |
| Multi-class Classification | Accuracy | Macro F1-Score | Confusion Matrix | Classification Report |
| Regression | MAE | RMSE | R² Score | MAPE |
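For regression projects, the four metrics in the last row can be computed directly with scikit-learn. A minimal sketch (the values are illustrative; `np.sqrt` is used for RMSE to stay compatible across scikit-learn versions):

```python
# Computing the recommended regression metrics: MAE, RMSE, R², MAPE.
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)

y_true = np.array([3.0, 5.0, 2.5, 7.0])  # illustrative ground truth
y_pred = np.array([2.8, 5.4, 2.2, 6.5])  # illustrative predictions

mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # root of MSE
r2 = r2_score(y_true, y_pred)
mape = mean_absolute_percentage_error(y_true, y_pred)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f} MAPE={mape:.1%}")
```

As with classification, report all metrics for every model you compare, not just the one that flatters your best model.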
Expected Code Structure
# notebooks/02_model_training.ipynb
# 1. Load and explore data
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv("data/dataset.csv")
X = df.drop(columns=["target"])
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# 2. Train Model A
from sklearn.linear_model import LogisticRegression
model_a = LogisticRegression(max_iter=1000)
model_a.fit(X_train, y_train)
# 3. Train Model B
from sklearn.ensemble import RandomForestClassifier
model_b = RandomForestClassifier(n_estimators=100, random_state=42)
model_b.fit(X_train, y_train)
# 4. Evaluate both models
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
results = {
    "Model": ["Logistic Regression", "Random Forest"],
    "Accuracy": [accuracy_score(y_test, model_a.predict(X_test)),
                 accuracy_score(y_test, model_b.predict(X_test))],
    "F1-Score": [f1_score(y_test, model_a.predict(X_test)),
                 f1_score(y_test, model_b.predict(X_test))],
    "AUC-ROC": [roc_auc_score(y_test, model_a.predict_proba(X_test)[:, 1]),
                roc_auc_score(y_test, model_b.predict_proba(X_test)[:, 1])]
}
pd.DataFrame(results)
# 5. Serialize best model
import joblib
joblib.dump(model_b, "models/best_model.pkl")
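The snippet above omits two of the requirements: hyperparameter tuning and serializing the model with metadata. A minimal sketch of both (the parameter grid and artifact layout are suggestions, not prescriptions, and a synthetic dataset stands in for yours):

```python
# Basic GridSearchCV tuning, then serialize the winner together with metadata.
import os
import joblib
from sklearn.datasets import make_classification  # stand-in for your dataset
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=500, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

param_grid = {"n_estimators": [50, 100, 200], "max_depth": [None, 5, 10]}
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid, cv=5, scoring="f1"
)
search.fit(X_train, y_train)

# Bundle the model with version, metrics, and feature names so /model-info
# can serve real values instead of hardcoded ones.
artifact = {
    "model": search.best_estimator_,
    "version": "1.0.0",
    "best_params": search.best_params_,
    "cv_f1": round(search.best_score_, 4),
    "features": [f"feature_{i + 1}" for i in range(X.shape[1])],
}
os.makedirs("models", exist_ok=True)
joblib.dump(artifact, "models/best_model.pkl")
```

If you serialize a dict like this, remember to load it with `joblib.load(...)["model"]` in your API.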
Grading Rubric — Model Training
| Criterion | Excellent (18-20) | Good (16-17) | Satisfactory (14-15) | Insufficient (< 14) |
|---|---|---|---|---|
| Dataset | Well-chosen, relevant, properly justified | Appropriate dataset | Dataset works but poorly justified | Inappropriate or too small dataset |
| EDA | Thorough analysis with ≥ 5 visualizations, insights documented | Good EDA with 3-4 visualizations | Basic EDA, few visualizations | No EDA or minimal exploration |
| Preprocessing | All steps justified, pipeline documented | Most steps present | Basic preprocessing only | Missing critical preprocessing steps |
| Models | ≥ 3 models compared, hyperparameter tuning performed | 2 models compared with basic tuning | 2 models trained, no tuning | Only 1 model, no comparison |
| Metrics | ≥ 4 metrics, correct interpretation, comparison table | 3 metrics with interpretation | 3 metrics reported, minimal interpretation | < 3 metrics or incorrect metrics |
| Serialization | Model saved with metadata (version, metrics, features) | Model saved correctly | Model saved but no metadata | Model not serialized |
| Methodology | Clear narrative, every decision justified | Most decisions explained | Some explanations provided | No methodology documentation |
B. API Service (25%)
Objectives
Build a production-ready REST API that serves predictions from your trained model, with proper input validation, error handling, and auto-generated documentation.
Requirements
- Framework: Use FastAPI (recommended) or Flask
- Endpoint /predict (POST): Accept input features, return prediction and confidence
- Endpoint /health (GET): Return service health status
- Endpoint /model-info (GET): Return model metadata (name, version, features, metrics)
- Input validation: Use Pydantic models (FastAPI) or manual validation (Flask)
- Error handling: Return proper HTTP status codes (400, 404, 422, 500) with descriptive messages
- Swagger documentation: Auto-generated and accessible at /docs
- Clean code: Functions documented, consistent naming, no hardcoded values
API Endpoint Specifications
| Endpoint | Method | Input | Output | Status Codes |
|---|---|---|---|---|
| /predict | POST | JSON with feature values | { prediction, confidence, model_version } | 200, 400, 422, 500 |
| /health | GET | None | { status: "healthy", timestamp } | 200 |
| /model-info | GET | None | { model_name, version, features, metrics } | 200 |
| /predict/batch | POST | Array of inputs | Array of predictions | 200, 400, 422 |
Implementing a /predict/batch endpoint for batch predictions earns bonus consideration. It demonstrates understanding of real-world API patterns.
Expected Code — FastAPI
# src/app.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from datetime import datetime
import joblib
import numpy as np
app = FastAPI(
    title="My ML Prediction API",
    description="LIA Project — AI Model Deployment",
    version="1.0.0"
)
model = joblib.load("models/best_model.pkl")
class PredictionInput(BaseModel):
    feature_1: float = Field(..., description="Description of feature 1")
    feature_2: float = Field(..., description="Description of feature 2")
    feature_3: float = Field(..., ge=0, description="Must be non-negative")

    class Config:
        json_schema_extra = {
            "example": {
                "feature_1": 5.1,
                "feature_2": 3.5,
                "feature_3": 1.4
            }
        }

class PredictionOutput(BaseModel):
    prediction: int
    confidence: float
    model_version: str

@app.get("/health")
def health_check():
    return {
        "status": "healthy",
        "timestamp": datetime.now().isoformat(),
        "model_loaded": model is not None
    }

@app.get("/model-info")
def model_info():
    return {
        "model_name": "RandomForestClassifier",
        "version": "1.0.0",
        "features": ["feature_1", "feature_2", "feature_3"],
        "training_metrics": {
            "accuracy": 0.95,
            "f1_score": 0.93,
            "auc_roc": 0.97
        }
    }

@app.post("/predict", response_model=PredictionOutput)
def predict(data: PredictionInput):
    try:
        features = np.array([[data.feature_1, data.feature_2, data.feature_3]])
        prediction = model.predict(features)[0]
        confidence = float(model.predict_proba(features).max())
        return PredictionOutput(
            prediction=int(prediction),
            confidence=round(confidence, 4),
            model_version="1.0.0"
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Prediction failed: {str(e)}")
Expected Code — Flask Alternative
# src/app.py
from flask import Flask, request, jsonify
from datetime import datetime
import joblib
import numpy as np
app = Flask(__name__)
model = joblib.load("models/best_model.pkl")
@app.route("/health", methods=["GET"])
def health_check():
    return jsonify({
        "status": "healthy",
        "timestamp": datetime.now().isoformat(),
        "model_loaded": model is not None
    })

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json()
    if not data:
        return jsonify({"error": "No input data provided"}), 400
    required_fields = ["feature_1", "feature_2", "feature_3"]
    for field in required_fields:
        if field not in data:
            return jsonify({"error": f"Missing field: {field}"}), 400
    try:
        features = np.array([[data["feature_1"], data["feature_2"], data["feature_3"]]])
        prediction = model.predict(features)[0]
        confidence = float(model.predict_proba(features).max())
        return jsonify({
            "prediction": int(prediction),
            "confidence": round(confidence, 4),
            "model_version": "1.0.0"
        })
    except Exception as e:
        return jsonify({"error": f"Prediction failed: {str(e)}"}), 500
Grading Rubric — API Service
| Criterion | Excellent (22-25) | Good (20-21) | Satisfactory (18-19) | Insufficient (< 18) |
|---|---|---|---|---|
| Endpoints | All 3 required + bonus endpoints | All 3 required endpoints work | 2 of 3 endpoints work | < 2 endpoints working |
| Validation | Pydantic models with constraints, type checking, examples | Basic Pydantic validation | Minimal validation | No input validation |
| Error Handling | All error codes correct, descriptive messages, edge cases handled | Most errors handled properly | Basic error handling | No error handling |
| Documentation | Swagger complete with examples, descriptions for all fields | Swagger accessible and useful | Swagger auto-generated, minimal customization | No API documentation |
| Code Quality | Clean architecture, separation of concerns, docstrings | Well-organized code | Code works but messy | Spaghetti code, hardcoded values |
| Response Format | Consistent JSON structure, proper status codes, timestamps | Consistent responses | Inconsistent response formats | Broken or missing responses |
C. Testing (15%)
Objectives
Demonstrate your ability to write automated tests that validate your model and API, and to use Postman for API testing.
Requirements
- Minimum 10 tests using pytest
- Unit tests: Test model loading, prediction logic, input validation
- Integration tests: Test API endpoints end-to-end
- Edge case tests: Test with invalid inputs, empty data, extreme values
- Code coverage > 70%: Measured with pytest-cov
- Postman collection: Exported .json with tests for all endpoints
- Test documentation: Explain what each test validates
Expected Test Code
# tests/test_api.py
import pytest
from fastapi.testclient import TestClient
from src.app import app
client = TestClient(app)
# --- Health Check Tests ---
def test_health_returns_200():
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"

def test_health_has_timestamp():
    response = client.get("/health")
    assert "timestamp" in response.json()

# --- Model Info Tests ---
def test_model_info_returns_200():
    response = client.get("/model-info")
    assert response.status_code == 200

def test_model_info_contains_version():
    response = client.get("/model-info")
    data = response.json()
    assert "version" in data
    assert "model_name" in data
    assert "features" in data

# --- Prediction Tests ---
def test_predict_valid_input():
    payload = {"feature_1": 5.1, "feature_2": 3.5, "feature_3": 1.4}
    response = client.post("/predict", json=payload)
    assert response.status_code == 200
    data = response.json()
    assert "prediction" in data
    assert "confidence" in data

def test_predict_confidence_range():
    payload = {"feature_1": 5.1, "feature_2": 3.5, "feature_3": 1.4}
    response = client.post("/predict", json=payload)
    confidence = response.json()["confidence"]
    assert 0.0 <= confidence <= 1.0

def test_predict_missing_field_returns_422():
    payload = {"feature_1": 5.1}  # Missing features
    response = client.post("/predict", json=payload)
    assert response.status_code == 422

def test_predict_wrong_type_returns_422():
    payload = {"feature_1": "not_a_number", "feature_2": 3.5, "feature_3": 1.4}
    response = client.post("/predict", json=payload)
    assert response.status_code == 422

def test_predict_empty_body_returns_422():
    response = client.post("/predict", json={})
    assert response.status_code == 422

# --- Edge Case Tests ---
def test_predict_extreme_values():
    payload = {"feature_1": 99999.0, "feature_2": -99999.0, "feature_3": 0.0}
    response = client.post("/predict", json=payload)
    assert response.status_code in [200, 400]
Running Tests with Coverage
# Run all tests
pytest tests/ -v
# Run with coverage report
pytest tests/ --cov=src --cov-report=term-missing
# Generate HTML coverage report
pytest tests/ --cov=src --cov-report=html
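To avoid retyping the coverage flags on every run, they can be made the default in a pytest.ini (a suggested configuration, assuming the `tests/` and `src/` layout used above):

```ini
# pytest.ini — apply coverage flags by default on every `pytest` run
[pytest]
testpaths = tests
addopts = --cov=src --cov-report=term-missing
```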
Postman Collection Requirements
Your Postman collection must include:
| Request | Method | Tests |
|---|---|---|
| Health Check | GET /health | Status is 200, body contains "healthy" |
| Model Info | GET /model-info | Status is 200, body contains model name |
| Valid Prediction | POST /predict | Status is 200, response has prediction and confidence |
| Missing Fields | POST /predict | Status is 422, error message present |
| Wrong Types | POST /predict | Status is 422, validation error |
| Empty Body | POST /predict | Status is 422 |
Each Postman request should include test scripts. Example:
pm.test("Status code is 200", function () {
    pm.response.to.have.status(200);
});

pm.test("Response has prediction", function () {
    var jsonData = pm.response.json();
    pm.expect(jsonData).to.have.property("prediction");
    pm.expect(jsonData).to.have.property("confidence");
});
Grading Rubric — Testing
| Criterion | Excellent (14-15) | Good (12-13) | Satisfactory (11) | Insufficient (< 11) |
|---|---|---|---|---|
| Test Count | ≥ 15 tests, comprehensive coverage | 10-14 tests, good variety | 10 tests, basic coverage | < 10 tests |
| Unit Tests | Model + validation + utility functions tested | Model and validation tested | Basic model test only | No unit tests |
| Integration Tests | All endpoints tested with multiple scenarios | All endpoints tested once | Most endpoints tested | Few endpoint tests |
| Edge Cases | Invalid inputs, extreme values, empty data, wrong types | Some edge cases covered | 1-2 edge cases | No edge cases |
| Coverage | > 80% | > 70% | > 60% | < 60% |
| Postman | Complete collection with test scripts for all requests | Collection with basic tests | Collection exists, few tests | No Postman collection |
D. Explainability (15%)
Objectives
Apply model interpretability techniques to explain your model's predictions and build trust in your AI system.
Requirements
- Apply LIME and/or SHAP to your trained model
- Generate at least 3 visualizations: Feature importance, individual explanations, summary plots
- Interpret results: Explain in plain language what the model is learning
- Identify potential biases: Check if certain features have unexpected influence
- Document findings: Include analysis in your report and notebook
Visualization Requirements
| # | Visualization | Tool | Description |
|---|---|---|---|
| 1 | Global Feature Importance | SHAP summary_plot or bar_plot | Which features matter most overall? |
| 2 | Individual Prediction Explanation | LIME explain_instance | Why did the model make this specific prediction? |
| 3 | Feature Interaction | SHAP dependence_plot | How do features interact? |
| 4 | Decision Boundary (if applicable) | LIME or custom | How does the model separate classes? |
| 5 | Waterfall Plot (bonus) | SHAP waterfall_plot | Step-by-step contribution of each feature |
Expected Code — SHAP
# notebooks/03_explainability.ipynb
import shap
import matplotlib.pyplot as plt
explainer = shap.TreeExplainer(model) # For tree-based models
shap_values = explainer.shap_values(X_test)
# 1. Global feature importance (summary plot)
shap.summary_plot(shap_values, X_test, show=False)
plt.title("SHAP Feature Importance — Global View")
plt.tight_layout()
plt.savefig("docs/shap_summary.png", dpi=150)
plt.show()
# 2. Single prediction explanation (waterfall)
# Note: for classifiers, some SHAP versions return one array per class;
# if shap_values is a list, select the positive class first (e.g. shap_values[1]),
# and use the matching entry of explainer.expected_value.
shap.plots.waterfall(shap.Explanation(
    values=shap_values[0],
    base_values=explainer.expected_value,
    data=X_test.iloc[0],
    feature_names=X_test.columns.tolist()
))
# 3. Feature dependence plot
shap.dependence_plot("feature_1", shap_values, X_test)
Expected Code — LIME
# notebooks/03_explainability.ipynb
from lime.lime_tabular import LimeTabularExplainer
explainer = LimeTabularExplainer(
    training_data=X_train.values,
    feature_names=X_train.columns.tolist(),
    class_names=["Class 0", "Class 1"],
    mode="classification"
)
# Explain a single prediction
instance = X_test.iloc[0].values
explanation = explainer.explain_instance(
    instance,
    model.predict_proba,
    num_features=10
)
# Show explanation
explanation.show_in_notebook()
# Save as HTML
explanation.save_to_file("docs/lime_explanation.html")
Interpretation Guide
When interpreting your results, answer these questions:
| Question | What to Look For |
|---|---|
| Which features are most important? | Top features in SHAP summary or LIME global |
| Are the important features logical? | Do they make business sense? |
| Are there any surprises? | Unexpected feature importance may indicate data leakage |
| Is the model biased? | Does it rely heavily on sensitive attributes? |
| How confident is the model? | Are predictions high-confidence or uncertain? |
If a feature has disproportionately high importance (e.g., one feature alone has SHAP value 10x higher than all others), it may indicate data leakage — a variable that wouldn't be available at prediction time. Investigate before finalizing your model.
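This dominance check can be automated from the SHAP values themselves. A small sketch — the array below stands in for the output of `explainer.shap_values(X_test)`, and the 10x threshold is the rule of thumb above, not a hard cutoff:

```python
# Flag a feature whose mean |SHAP| value dwarfs all others — a common
# symptom of data leakage worth investigating before finalizing the model.
import numpy as np

# Illustrative stand-in for explainer.shap_values(X_test):
# rows = samples, columns = features.
shap_values = np.array([
    [0.02, -0.01, 0.95],
    [-0.03, 0.02, 1.10],
    [0.01, -0.02, 0.88],
])
feature_names = ["feature_1", "feature_2", "feature_3"]

# Global importance = mean absolute SHAP value per feature.
importance = np.abs(shap_values).mean(axis=0)
ranked = sorted(zip(feature_names, importance), key=lambda p: -p[1])
top_name, top_val = ranked[0]
runner_up_val = ranked[1][1]
if top_val > 10 * runner_up_val:
    print(f"Possible leakage: {top_name} dominates "
          f"({top_val:.2f} vs next {runner_up_val:.2f})")
```

A flagged feature is not proof of leakage; check whether it would genuinely be available at prediction time before dropping it.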
Grading Rubric — Explainability
| Criterion | Excellent (14-15) | Good (12-13) | Satisfactory (11) | Insufficient (< 11) |
|---|---|---|---|---|
| Methods | Both LIME and SHAP applied | One method applied thoroughly | One method, basic usage | No explainability analysis |
| Visualizations | ≥ 5 clear, well-labeled visualizations | 3-4 visualizations | 2-3 basic visualizations | < 2 or poorly formatted |
| Interpretation | Deep analysis, business implications, bias check | Good interpretation, some insights | Surface-level interpretation | No interpretation, just plots |
| Documentation | Findings integrated in report with recommendations | Findings mentioned in report | Brief mention in report | Not documented |
E. Documentation & Report (10%)
Objectives
Produce professional documentation that would allow another developer to understand, run, and extend your project.
Requirements
- README.md: Project description, setup instructions, usage examples, API reference
- Technical report (5-8 pages): Follows the provided template
- API documentation: Swagger/OpenAPI accessible at /docs
- Code documentation: Docstrings for all public functions
- Clear writing: No spelling errors, consistent formatting, logical structure
README.md Requirements
Your README must include:
| Section | Content |
|---|---|
| Project Title | Clear, descriptive name |
| Description | 2-3 sentences explaining the project |
| Installation | Step-by-step setup instructions |
| Usage | How to start the API, example requests |
| API Endpoints | Table with all endpoints |
| Testing | How to run tests |
| Project Structure | Directory tree |
| Model Information | Algorithm, metrics, dataset |
| Author | Your name and student ID |
Technical Report Structure
Your report must follow the template provided in the Report Template section:
| Section | Pages | Key Content |
|---|---|---|
| Executive Summary | 0.5 | Overview of entire project |
| Problem Definition | 1 | Business context, dataset, objectives |
| Methodology | 1-1.5 | Preprocessing, model selection, evaluation strategy |
| Results | 1-1.5 | Metrics, comparisons, confusion matrices |
| API Design | 1 | Architecture, endpoints, response formats |
| Testing Strategy | 0.5 | Test plan, coverage, Postman results |
| Explainability | 1 | LIME/SHAP results, interpretation |
| Deployment | 0.5 | How to run, Docker, environment |
| Conclusion | 0.5 | Summary, lessons learned, future work |
Grading Rubric — Documentation & Report
| Criterion | Excellent (9-10) | Good (8) | Satisfactory (7) | Insufficient (< 7) |
|---|---|---|---|---|
| README | Complete, clear, anyone can set up the project | Most sections present, minor gaps | Basic README, missing sections | No README or unusable |
| Report Structure | All sections present, logical flow | Most sections, minor gaps | Missing 1-2 sections | Missing many sections |
| Writing Quality | Professional, clear, concise, no errors | Good writing, few minor errors | Understandable, some errors | Poor writing, many errors |
| Technical Depth | Deep analysis, justified decisions | Good depth, most decisions explained | Surface-level analysis | No depth, just descriptions |
| Formatting | Consistent, tables, figures captioned | Mostly consistent | Inconsistent formatting | No formatting effort |
F. Oral Presentation (15%)
Objectives
Present your project professionally, demonstrate a working system, and defend your technical decisions during Q&A.
Requirements
- Duration: 15 minutes presentation + 5 minutes Q&A
- Slide deck: 10-15 slides, clean design
- Live demo: Show the API working in real-time
- Technical depth: Explain model choice, metrics, architecture decisions
- Q&A readiness: Be able to answer questions about any aspect of your project
Presentation Time Allocation
| Section | Duration | Content |
|---|---|---|
| Introduction & Context | 2 min | Problem statement, dataset, objectives |
| Model Training & Results | 3 min | EDA highlights, model comparison, best model |
| API Architecture | 2 min | Endpoints, validation, error handling |
| Live Demo | 3 min | Show API working: health, predict, model-info |
| Testing & Explainability | 3 min | Test results, coverage, LIME/SHAP insights |
| Conclusion & Lessons | 2 min | What you learned, what you'd do differently |
| Q&A | 5 min | Answer instructor questions |
Grading Rubric — Oral Presentation
| Criterion | Excellent (14-15) | Good (12-13) | Satisfactory (11) | Insufficient (< 11) |
|---|---|---|---|---|
| Content | Comprehensive, well-structured, covers all components | Good coverage, minor gaps | Adequate coverage, missing details | Major components missing |
| Demo | API works flawlessly, multiple requests shown | Demo works, minor hiccups | Demo partially works | Demo fails or not attempted |
| Communication | Confident, clear, professional, good pace | Good delivery, minor nervousness | Adequate delivery | Unclear, reading from slides |
| Slides | Clean design, informative visuals, no text walls | Good slides, minor issues | Acceptable but text-heavy | Poor design, too much text |
| Q&A | Answers all questions confidently, shows deep understanding | Answers most questions well | Answers some questions | Cannot answer basic questions |
| Time Management | 14-16 minutes, well-paced | 12-18 minutes, mostly well-paced | Under 10 or over 20 minutes | Significantly over/under time |
For detailed presentation tips, time allocation advice, and common Q&A questions, see the Presentation Guide.
Complete Submission Checklist
Use this checklist before submitting your LIA:
Code & Model
- Git repository is clean and organized
- .gitignore excludes __pycache__, .env, large files
- requirements.txt with all dependencies (pinned versions)
- Trained model file is in models/ directory
- EDA notebook is complete and runs without errors
- Training notebook is complete and runs without errors
API
- API starts without errors (uvicorn src.app:app --reload)
- /health returns 200 with status and timestamp
- /predict accepts valid input and returns prediction + confidence
- /predict returns 422 for invalid input
- /model-info returns model metadata
- Swagger docs accessible at /docs
Testing
- At least 10 pytest tests pass
- Code coverage > 70%
- Postman collection exported to postman/collection.json
- Edge cases are tested (invalid types, missing fields, extreme values)
Explainability
- LIME and/or SHAP analysis complete
- At least 3 visualizations saved
- Interpretations written in notebook and report
Documentation
- README.md is complete with setup instructions
- Technical report is 5-8 pages, follows template
- All code has docstrings
Presentation
- Slide deck is 10-15 slides
- Live demo prepared and tested
- Practiced within 15-minute time limit
- Prepared for potential Q&A questions
Before submitting, clone your repository to a fresh directory and follow your own README instructions to set up the project from scratch. If you can't run it, neither can your instructor.