
LIA - Technical Report Template


Report Overview

Your technical report is a 5 to 8 page PDF that documents your entire LIA project. It should be written so that a technical reader who has never seen your project can understand your approach, reproduce your results, and evaluate your decisions.


Formatting Guidelines

Aspect | Requirement
Length | 5-8 pages (excluding references and appendices)
Format | PDF
Font | 11-12 pt, readable font (e.g., Calibri, Arial, Times New Roman)
Margins | 2.5 cm (1 inch) on all sides
Spacing | 1.15 or 1.5 line spacing
Figures | Numbered and captioned (e.g., "Figure 1: Model comparison")
Tables | Numbered and captioned (e.g., "Table 1: Evaluation metrics")
Code | Monospaced font, minimal code (only key snippets)
Language | French or English (consistent throughout)
Page Limit

Do not exceed 8 pages for the main body. The report should be concise and focused. Appendices (additional charts, full code, Postman screenshots) do not count toward the page limit.


Section 1: Executive Summary

Length: 0.5 page

The executive summary is written last but appears first. It provides a high-level overview of the entire project in a single paragraph or short section.

What to Include

  • The problem you solved
  • The dataset used
  • The best model and its key metric
  • The API framework used
  • One key explainability finding
  • The main conclusion

Template

This report presents the development and deployment of [a predictive service for YOUR PROBLEM] using the [dataset name] dataset. After comparing [Model A] and [Model B], the [best model] was selected with a [primary metric] of [value]. The model is served via a [FastAPI/Flask] REST API with [number] endpoints, validated by [number] automated tests achieving [X]% code coverage. Explainability analysis using [LIME/SHAP] revealed that [key finding, e.g., "monthly charges and contract type are the strongest predictors of customer churn"]. The complete system is documented, tested, and ready for deployment.

Example — Customer Churn Project

This report presents the development and deployment of a predictive service for customer churn using the Telco Customer Churn dataset (7,043 samples, 20 features). After comparing Logistic Regression, Random Forest, and XGBoost, the XGBoost classifier was selected with an AUC-ROC of 0.87 and F1-score of 0.63. The model is served via a FastAPI REST API with 3 endpoints (/predict, /health, /model-info), validated by 14 automated tests achieving 82% code coverage. SHAP analysis revealed that month-to-month contracts, high monthly charges, and short tenure are the strongest predictors of churn. The complete system is documented, tested, and ready for deployment.


Section 2: Problem Definition

Length: 1 page

2.1 Business Context

Explain why this problem matters. Who would benefit from this prediction service? What business decision does it support?

Template

Business context: [Describe the real-world problem and its impact]

End user: [Who would use this service? How?]

Business value: [What decision does this model help make? What cost does it save?]

2.2 Dataset Description

Provide a factual description of the dataset.

Attribute | Value
Name | [Dataset name]
Source | [URL or reference]
Samples | [Number of rows]
Features | [Number of features]
Target variable | [Name and type]
Class distribution | [e.g., 73% No Churn, 27% Churn]
Missing values | [Description]

Include 1-2 key EDA visualizations:

  • Target distribution (bar chart or pie chart)
  • Correlation heatmap or top feature distributions

2.3 Project Objectives

State your objectives as measurable goals:

  1. Train and compare at least [N] models for [problem type]
  2. Achieve [primary metric] > [threshold]
  3. Build a REST API with [framework] serving predictions with < [N]ms latency
  4. Achieve test coverage > 70%
  5. Provide explainability analysis using [LIME/SHAP]

Section 3: Methodology

Length: 1 to 1.5 pages

3.1 Data Preprocessing

Document every transformation applied to the data:

Step | Action | Justification
Missing values | [e.g., Imputed with median] | [e.g., Only 11 rows affected; median preserves distribution]
Encoding | [e.g., One-hot encoding for categorical] | [e.g., Required for tree-based models]
Scaling | [e.g., StandardScaler on numerical features] | [e.g., Required for Logistic Regression]
Feature selection | [e.g., Dropped customer_id] | [e.g., Not a predictive feature]
Train/test split | [e.g., 80/20 with stratification] | [e.g., Preserves class distribution]
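The steps above are easiest to document (and to reproduce) when they are combined into a single scikit-learn pipeline, so that the exact same transformations run at training and inference time. A minimal sketch, assuming a churn-style dataset; the column names, toy data, and classifier are illustrative placeholders:

```python
# Sketch: preprocessing steps combined into one scikit-learn Pipeline.
# Column names, the toy data, and the classifier are placeholders.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 22, 10, 28],
    "monthly_charges": [29.85, 56.95, 53.85, 42.30, 70.70, 99.65, 89.10, 29.75],
    "contract": ["month-to-month", "one-year", "month-to-month", "one-year",
                 "month-to-month", "two-year", "month-to-month", "two-year"],
    "churn": [1, 0, 1, 0, 1, 0, 1, 0],
})
X, y = df.drop(columns=["churn"]), df["churn"]

numeric = ["tenure", "monthly_charges"]
categorical = ["contract"]

# Impute + scale numeric columns; one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])

# Stratified split preserves the class distribution in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```

Serializing this whole pipeline (rather than the bare classifier) keeps the API code free of manual preprocessing logic.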

3.2 Model Selection

Explain why you chose these specific models:

Model | Why Selected
[Model A] | [e.g., Simple baseline, interpretable, fast to train]
[Model B] | [e.g., Handles non-linear relationships, good for tabular data]
[Model C (if applicable)] | [e.g., State-of-the-art for structured data, gradient boosting]

3.3 Hyperparameter Tuning

Document your tuning strategy:

Model | Method | Parameters Tuned | Best Values
[Model A] | [GridSearch / RandomizedSearch] | [e.g., C, penalty] | [e.g., C=0.1, penalty=l2]
[Model B] | [GridSearch / RandomizedSearch] | [e.g., n_estimators, max_depth] | [e.g., n_estimators=200, max_depth=10]
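A tuning setup of this kind can be sketched with scikit-learn's GridSearchCV; the synthetic data and the grid values below are illustrative, not recommendations:

```python
# Sketch: grid search with 5-fold CV; data and grid values are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [20, 50], "max_depth": [5, 10]},
    scoring="roc_auc",  # score each candidate with the primary metric
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

Reporting grid.best_params_ and grid.best_score_ gives exactly the "Best Values" column of the table above.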

3.4 Evaluation Strategy

  • Validation method: [e.g., 5-fold cross-validation]
  • Primary metric: [e.g., AUC-ROC — because the dataset is imbalanced]
  • Secondary metrics: [e.g., F1-Score, Precision, Recall]
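The validation method and primary metric above can be expressed in a few lines; the synthetic, imbalanced dataset here stands in for your own:

```python
# Sketch: 5-fold stratified cross-validation scored with AUC-ROC.
# The synthetic imbalanced dataset is a placeholder for your own data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=300, weights=[0.73, 0.27], random_state=0)
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",
)
print(scores.mean(), scores.std())
```

Reporting the mean and standard deviation across folds is more informative than a single hold-out score.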

Section 4: Results

Length: 1 to 1.5 pages

4.1 Model Comparison

Present a clear comparison table:

Model | Accuracy | Precision | Recall | F1-Score | AUC-ROC
[Model A] | [value] | [value] | [value] | [value] | [value]
[Model B] | [value] | [value] | [value] | [value] | [value]
[Best Model] | [value] | [value] | [value] | [value] | [value]

4.2 Confusion Matrix

Include the confusion matrix for your best model:

 | Predicted Negative | Predicted Positive
Actual Negative | [TN] | [FP]
Actual Positive | [FN] | [TP]

Interpret the results:

  • False positives: [What does a false positive mean in your context?]
  • False negatives: [What does a false negative mean? Which is more costly?]
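The four cells can be pulled straight out of scikit-learn's confusion matrix (the labels below are toy values):

```python
# Sketch: extracting TN/FP/FN/TP from scikit-learn's confusion matrix.
# y_true and y_pred are toy values; substitute your test-set results.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# For binary labels, ravel() yields the cells in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")  # TN=5 FP=1 FN=1 TP=3
```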

4.3 Best Model Selection

The [best model] was selected because [justification based on metrics AND business context]. Although [Model A] achieved [slight advantage in one metric], [best model] provides [better performance on primary metric / better trade-off / lower inference time].

4.4 Key Visualizations

Include:

  • ROC curve comparing all models (for classification)
  • Actual vs. Predicted scatter plot (for regression)
  • Learning curves (optional but valuable)

Section 5: API Design

Length: 1 page

5.1 Architecture

Describe the overall API architecture in a short paragraph or a simple diagram: how a request flows from the client through input validation to the model and back as a response, and where the model lives in that flow.

5.2 Endpoints

Endpoint | Method | Description | Status Codes
/health | GET | Service health check | 200
/predict | POST | Single prediction | 200, 400, 422, 500
/model-info | GET | Model metadata | 200

5.3 Request/Response Examples

For each endpoint, provide one example request and response (you can reference your Swagger documentation).
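For instance, a /predict exchange might look like the following (the field names, values, and response schema are placeholders; use your own):

Request (POST /predict):

```json
{
  "tenure": 5,
  "monthly_charges": 79.9,
  "contract": "month-to-month"
}
```

Response (200 OK):

```json
{
  "prediction": "churn",
  "probability": 0.81
}
```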

5.4 Error Handling

Error Type | HTTP Code | Example
Validation error | 422 | Missing required field
Bad request | 400 | Malformed JSON
Server error | 500 | Model prediction failure

5.5 Design Decisions

Explain key decisions:

  • Why you chose [FastAPI/Flask]
  • How the model is loaded (at startup vs. per request)
  • How input validation works
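On the second point, loading at startup is usually preferable: deserializing the model on every request adds avoidable latency. A framework-agnostic sketch of the load-once pattern ("model.pkl" and the stand-in object are placeholders):

```python
# Sketch: load the serialized model once and reuse it across requests.
# "model.pkl" is a placeholder path; adapt to your artifact and format.
import pickle
from pathlib import Path

_model = None

def get_model(path: str = "model.pkl"):
    """Return the cached model, deserializing it only on the first call."""
    global _model
    if _model is None:
        _model = pickle.loads(Path(path).read_bytes())
    return _model

# Demo with a stand-in "model" object:
Path("model.pkl").write_bytes(pickle.dumps({"name": "demo-model"}))
assert get_model() is get_model()  # same object: loaded exactly once
```

In FastAPI the same idea is typically wired into a startup or lifespan hook; in Flask, the load commonly happens at module import time.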

Section 6: Testing Strategy

Length: 0.5 page

6.1 Test Plan

Test Category | Count | Tools
Unit tests | [N] | pytest
Integration tests | [N] | pytest + TestClient/httpx
Edge case tests | [N] | pytest
API tests | [N] | Postman
Total | [N] |
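As an illustration of the unit-test category, pure logic can be tested directly, without starting the API. A minimal pytest-style sketch; the validate_input helper is hypothetical:

```python
# Sketch: pytest-style unit tests for a hypothetical input validator.
def validate_input(payload: dict) -> list[str]:
    """Return a list of validation errors (empty when the payload is valid)."""
    errors = []
    if "tenure" not in payload:
        errors.append("tenure is required")
    elif payload["tenure"] < 0:
        errors.append("tenure must be non-negative")
    return errors

def test_valid_payload():
    assert validate_input({"tenure": 12}) == []

def test_missing_field():
    assert validate_input({}) == ["tenure is required"]

def test_negative_tenure():
    assert validate_input({"tenure": -1}) == ["tenure must be non-negative"]
```

Integration tests then exercise the same rules through the framework's test client, checking the 422/400 codes from the error-handling table.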

6.2 Code Coverage

Module | Statements | Covered | Coverage
src/app.py | [N] | [N] | [X]%
src/model.py | [N] | [N] | [X]%
src/schemas.py | [N] | [N] | [X]%
Total | [N] | [N] | [X]%

6.3 Key Test Results

Mention any notable edge cases you discovered and how you handled them.

6.4 Postman Collection

Reference the exported Postman collection (include a screenshot of a successful test run in the appendix).


Section 7: Model Explainability

Length: 1 page

7.1 Methodology

  • Tool(s) used: [LIME / SHAP / Both]
  • Number of explanations generated: [N individual predictions explained]
  • Visualizations produced: [List them]

7.2 Global Feature Importance

Include your SHAP summary plot or LIME global feature importance and interpret it:

The top 3 most important features are:

  1. [Feature A] — [interpretation]
  2. [Feature B] — [interpretation]
  3. [Feature C] — [interpretation]

7.3 Individual Prediction Explanations

Show 1-2 LIME or SHAP explanations for specific predictions:

Example 1: For a customer with [characteristics], the model predicted [prediction] with [confidence]%. The main contributing factors were [factors].

7.4 Insights and Potential Biases

  • Are the important features expected? Do they make domain sense?
  • Are there any features that should NOT influence the prediction? (e.g., sensitive attributes)
  • What surprised you about the model's behavior?

Writing Effective Interpretations

Don't just describe the plots — explain what they mean for the business or end user. Example: "The model relies heavily on tenure (months as customer), which makes business sense since new customers have had less time to build loyalty."


Section 8: Deployment Architecture

Length: 0.5 page

8.1 How to Run the Service

Provide step-by-step instructions (abbreviated — full instructions are in README):

# 1. Clone the repository
git clone https://github.com/username/lia-project.git

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows

# 3. Install dependencies
pip install -r requirements.txt

# 4. Start the API
uvicorn src.app:app --host 0.0.0.0 --port 8000

8.2 Docker (if applicable)

FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "src.app:app", "--host", "0.0.0.0", "--port", "8000"]

8.3 Environment

Component | Version
Python | [version]
Framework | [FastAPI/Flask version]
scikit-learn | [version]
OS | [e.g., Windows 11 / Ubuntu 22.04]

Section 9: Conclusion and Future Work

Length: 0.5 page

9.1 Summary

Recap the key achievements:

This project successfully demonstrates [main achievement]. The [model] achieves [metric] on [dataset], and is served through a [framework] API with [N] endpoints, [N] tests, and [X]% code coverage. Explainability analysis using [tool] provided [key insight].

9.2 Lessons Learned

List 2-3 important lessons:

  1. [e.g., "Data preprocessing took more time than expected — 60% of the project time was spent cleaning data"]
  2. [e.g., "Writing tests first (TDD) would have saved debugging time later"]
  3. [e.g., "SHAP revealed that my model relied on a feature I hadn't expected"]

9.3 Future Improvements

Suggest 2-3 realistic improvements:

  1. [e.g., "Implement a CI/CD pipeline with GitHub Actions"]
  2. [e.g., "Add model monitoring to detect data drift in production"]
  3. [e.g., "Test deep learning models for comparison"]

Section 10: References

Use a consistent citation format (APA, IEEE, or numbered). Include:

  • Dataset source and documentation
  • Framework documentation (FastAPI, Flask, scikit-learn)
  • LIME/SHAP papers or documentation
  • Any course materials referenced
  • Any AI tools used (with a brief description of how)

Example References

[1] T. Chen and C. Guestrin, "XGBoost: A Scalable Tree Boosting System," in Proc. KDD, 2016.

[2] S. Lundberg and S.-I. Lee, "A Unified Approach to Interpreting Model Predictions," in NeurIPS, 2017.

[3] M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why Should I Trust You?': Explaining the Predictions of Any Classifier," in Proc. KDD, 2016.

[4] FastAPI Documentation, https://fastapi.tiangolo.com/

[5] Telco Customer Churn Dataset, Kaggle, https://www.kaggle.com/datasets/blastchar/telco-customer-churn


Appendices (Not Counted in Page Limit)

Include additional supporting material:

Appendix | Content
A | Additional EDA visualizations
B | Full confusion matrix and classification report
C | Postman test run screenshots
D | Additional LIME/SHAP visualizations
E | Full code coverage report (HTML export or screenshot)

Quality Checklist for Your Report

Before submitting, verify:

  • Executive summary covers all major points
  • Problem is clearly defined with measurable objectives
  • All preprocessing steps are documented and justified
  • Model comparison table is complete with ≥ 3 metrics
  • Confusion matrix is present and interpreted
  • API endpoints are documented with examples
  • Testing coverage is reported with numbers
  • Explainability section has ≥ 3 visualizations with interpretations
  • Conclusion includes lessons learned and future work
  • References are complete and formatted consistently
  • All figures and tables are numbered and captioned
  • Report is 5-8 pages (no more, no less)
  • No spelling or grammar errors
  • PDF format, properly formatted

AI Disclosure

If you used AI tools (ChatGPT, Copilot, etc.) during your project, you must disclose this in your report. Add a brief section before the references explaining which tools you used and how. Example: "GitHub Copilot was used for boilerplate code generation. ChatGPT was consulted for debugging error messages. All generated code was reviewed, tested, and adapted."