
LIA - Technical Report Template


Report Overview

Your technical report is a 5 to 8 page PDF that documents your entire LIA project. It should be written so that a technical reader who has never seen your project can understand your approach, reproduce your results, and evaluate your decisions.


Formatting Guidelines

Aspect | Requirement
Length | 5-8 pages (excluding references and appendices)
Format | PDF
Font | 11-12 pt, readable font (e.g., Calibri, Arial, Times New Roman)
Margins | 2.5 cm (1 inch) on all sides
Spacing | 1.15 or 1.5 line spacing
Figures | Numbered and captioned (e.g., "Figure 1: Model comparison")
Tables | Numbered and captioned (e.g., "Table 1: Evaluation metrics")
Code | Monospaced font, minimal code (only key snippets)
Language | French or English (consistent throughout)
Page Limit

Do not exceed 8 pages for the main body. The report should be concise and focused. Appendices (additional charts, full code, Postman screenshots) do not count toward the page limit.


Section 1: Executive Summary

Length: 0.5 page

The executive summary is written last but appears first. It provides a high-level overview of the entire project in a single paragraph or short section.

What to Include

  • The problem you solved
  • The dataset used
  • The best model and its key metric
  • The API framework used
  • One key explainability finding
  • The main conclusion

Template

This report presents the development and deployment of [a predictive service for YOUR PROBLEM] using the [dataset name] dataset. After comparing [Model A] and [Model B], the [best model] was selected with a [primary metric] of [value]. The model is served via a [FastAPI/Flask] REST API with [number] endpoints, validated by [number] automated tests achieving [X]% code coverage. Explainability analysis using [LIME/SHAP] revealed that [key finding, e.g., "monthly charges and contract type are the strongest predictors of customer churn"]. The complete system is documented, tested, and ready for deployment.

Example — Customer Churn Project

This report presents the development and deployment of a predictive service for customer churn using the Telco Customer Churn dataset (7,043 samples, 20 features). After comparing Logistic Regression, Random Forest, and XGBoost, the XGBoost classifier was selected with an AUC-ROC of 0.87 and F1-score of 0.63. The model is served via a FastAPI REST API with 3 endpoints (/predict, /health, /model-info), validated by 14 automated tests achieving 82% code coverage. SHAP analysis revealed that month-to-month contracts, high monthly charges, and short tenure are the strongest predictors of churn. The complete system is documented, tested, and ready for deployment.


Section 2: Problem Definition

Length: 1 page

2.1 Business Context

Explain why this problem matters. Who would benefit from this prediction service? What business decision does it support?

Template

Business context: [Describe the real-world problem and its impact]

End user: [Who would use this service? How?]

Business value: [What decision does this model help make? What cost does it save?]

2.2 Dataset Description

Provide a factual description of the dataset.

Attribute | Value
Name | [Dataset name]
Source | [URL or reference]
Samples | [Number of rows]
Features | [Number of features]
Target variable | [Name and type]
Class distribution | [e.g., 73% No Churn, 27% Churn]
Missing values | [Description]

Include 1-2 key EDA visualizations:

  • Target distribution (bar chart or pie chart)
  • Correlation heatmap or top feature distributions

2.3 Project Objectives

State your objectives as measurable goals:

  1. Train and compare at least [N] models for [problem type]
  2. Achieve [primary metric] > [threshold]
  3. Build a REST API with [framework] serving predictions with < [N]ms latency
  4. Achieve test coverage > 70%
  5. Provide explainability analysis using [LIME/SHAP]

Section 3: Methodology

Length: 1 to 1.5 pages

3.1 Data Preprocessing

Document every transformation applied to the data:

Step | Action | Justification
Missing values | [e.g., Imputed with median] | [e.g., Only 11 rows affected; median preserves distribution]
Encoding | [e.g., One-hot encoding for categorical] | [e.g., Required for tree-based models]
Scaling | [e.g., StandardScaler on numerical features] | [e.g., Required for Logistic Regression]
Feature selection | [e.g., Dropped customer_id] | [e.g., Not a predictive feature]
Train/test split | [e.g., 80/20 with stratification] | [e.g., Preserves class distribution]
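The steps above are easiest to document (and to reproduce) when they are combined into a single scikit-learn pipeline, so that the exact same transformations run at training and inference time. A minimal sketch, assuming a churn-style dataset; the column names, toy data, and classifier are illustrative placeholders:

```python
# Sketch: preprocessing steps combined into one scikit-learn Pipeline.
# Column names, the toy data, and the classifier are placeholders.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 22, 10, 28],
    "monthly_charges": [29.85, 56.95, 53.85, 42.30, 70.70, 99.65, 89.10, 29.75],
    "contract": ["month-to-month", "one-year", "month-to-month", "one-year",
                 "month-to-month", "two-year", "month-to-month", "two-year"],
    "churn": [1, 0, 1, 0, 1, 0, 1, 0],
})
X, y = df.drop(columns=["churn"]), df["churn"]

numeric = ["tenure", "monthly_charges"]
categorical = ["contract"]

# Impute + scale numeric columns; one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])

# Stratified split preserves the class distribution in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```

Serializing this whole pipeline (rather than the bare classifier) keeps the API code free of manual preprocessing logic.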

3.2 Model Selection

Explain why you chose these specific models:

Model | Why Selected
[Model A] | [e.g., Simple baseline, interpretable, fast to train]
[Model B] | [e.g., Handles non-linear relationships, good for tabular data]
[Model C (if applicable)] | [e.g., State-of-the-art for structured data, gradient boosting]

3.3 Hyperparameter Tuning

Document your tuning strategy:

Model | Method | Parameters Tuned | Best Values
[Model A] | [GridSearch / RandomizedSearch] | [e.g., C, penalty] | [e.g., C=0.1, penalty=l2]
[Model B] | [GridSearch / RandomizedSearch] | [e.g., n_estimators, max_depth] | [e.g., n_estimators=200, max_depth=10]
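A tuning setup of this kind can be sketched with scikit-learn's GridSearchCV; the synthetic data and the grid values below are illustrative, not recommendations:

```python
# Sketch: grid search with 5-fold CV; data and grid values are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [20, 50], "max_depth": [5, 10]},
    scoring="roc_auc",  # score each candidate with the primary metric
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

Reporting grid.best_params_ and grid.best_score_ gives exactly the "Best Values" column of the table above.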

3.4 Evaluation Strategy

  • Validation method: [e.g., 5-fold cross-validation]
  • Primary metric: [e.g., AUC-ROC — because the dataset is imbalanced]
  • Secondary metrics: [e.g., F1-Score, Precision, Recall]
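The validation method and primary metric above can be expressed in a few lines; the synthetic, imbalanced dataset here stands in for your own:

```python
# Sketch: 5-fold stratified cross-validation scored with AUC-ROC.
# The synthetic imbalanced dataset is a placeholder for your own data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=300, weights=[0.73, 0.27], random_state=0)
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",
)
print(scores.mean(), scores.std())
```

Reporting the mean and standard deviation across folds is more informative than a single hold-out score.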

Section 4: Results

Length: 1 to 1.5 pages

4.1 Model Comparison

Present a clear comparison table:

Model | Accuracy | Precision | Recall | F1-Score | AUC-ROC
[Model A] | [value] | [value] | [value] | [value] | [value]
[Model B] | [value] | [value] | [value] | [value] | [value]
[Best Model] | [value] | [value] | [value] | [value] | [value]

4.2 Confusion Matrix

Include the confusion matrix for your best model:

 | Predicted Negative | Predicted Positive
Actual Negative | [TN] | [FP]
Actual Positive | [FN] | [TP]

Interpret the results:

  • False positives: [What does a false positive mean in your context?]
  • False negatives: [What does a false negative mean? Which is more costly?]
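The four cells can be pulled straight out of scikit-learn's confusion matrix (the labels below are toy values):

```python
# Sketch: extracting TN/FP/FN/TP from scikit-learn's confusion matrix.
# y_true and y_pred are toy values; substitute your test-set results.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# For binary labels, ravel() yields the cells in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")  # TN=5 FP=1 FN=1 TP=3
```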

4.3 Best Model Selection

The [best model] was selected because [justification based on metrics AND business context]. Although [Model A] achieved [slight advantage in one metric], [best model] provides [better performance on primary metric / better trade-off / lower inference time].

4.4 Key Visualizations

Include:

  • ROC curve comparing all models (for classification)
  • Actual vs. Predicted scatter plot (for regression)
  • Learning curves (optional but valuable)

Section 5: API Design

Length: 1 page

5.1 Architecture

Describe the overall API architecture in a short paragraph or a simple diagram: how a request flows from the client through input validation to the model and back as a response, and where the model lives in that flow.

5.2 Endpoints

Endpoint | Method | Description | Status Codes
/health | GET | Service health check | 200
/predict | POST | Single prediction | 200, 400, 422, 500
/model-info | GET | Model metadata | 200

5.3 Request/Response Examples

For each endpoint, provide one example request and response (you can reference your Swagger documentation).
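For instance, a /predict exchange might look like the following (the field names, values, and response schema are placeholders; use your own):

Request (POST /predict):

```json
{
  "tenure": 5,
  "monthly_charges": 79.9,
  "contract": "month-to-month"
}
```

Response (200 OK):

```json
{
  "prediction": "churn",
  "probability": 0.81
}
```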

5.4 Error Handling

Error Type | HTTP Code | Example
Validation error | 422 | Missing required field
Bad request | 400 | Malformed JSON
Server error | 500 | Model prediction failure

5.5 Design Decisions

Explain key decisions:

  • Why you chose [FastAPI/Flask]
  • How the model is loaded (at startup vs. per request)
  • How input validation works
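On the second point, loading at startup is usually preferable: deserializing the model on every request adds avoidable latency. A framework-agnostic sketch of the load-once pattern ("model.pkl" and the stand-in object are placeholders):

```python
# Sketch: load the serialized model once and reuse it across requests.
# "model.pkl" is a placeholder path; adapt to your artifact and format.
import pickle
from pathlib import Path

_model = None

def get_model(path: str = "model.pkl"):
    """Return the cached model, deserializing it only on the first call."""
    global _model
    if _model is None:
        _model = pickle.loads(Path(path).read_bytes())
    return _model

# Demo with a stand-in "model" object:
Path("model.pkl").write_bytes(pickle.dumps({"name": "demo-model"}))
assert get_model() is get_model()  # same object: loaded exactly once
```

In FastAPI the same idea is typically wired into a startup or lifespan hook; in Flask, the load commonly happens at module import time.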

Section 6: Testing Strategy

Length: 0.5 page

6.1 Test Plan

Test Category | Count | Tools
Unit tests | [N] | pytest
Integration tests | [N] | pytest + TestClient/httpx
Edge case tests | [N] | pytest
API tests | [N] | Postman
Total | [N] |
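As an illustration of the unit-test category, pure logic can be tested directly, without starting the API. A minimal pytest-style sketch; the validate_input helper is hypothetical:

```python
# Sketch: pytest-style unit tests for a hypothetical input validator.
def validate_input(payload: dict) -> list[str]:
    """Return a list of validation errors (empty when the payload is valid)."""
    errors = []
    if "tenure" not in payload:
        errors.append("tenure is required")
    elif payload["tenure"] < 0:
        errors.append("tenure must be non-negative")
    return errors

def test_valid_payload():
    assert validate_input({"tenure": 12}) == []

def test_missing_field():
    assert validate_input({}) == ["tenure is required"]

def test_negative_tenure():
    assert validate_input({"tenure": -1}) == ["tenure must be non-negative"]
```

Integration tests then exercise the same rules through the framework's test client, checking the 422/400 codes from the error-handling table.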

6.2 Code Coverage

Module | Statements | Covered | Coverage
src/app.py | [N] | [N] | [X]%
src/model.py | [N] | [N] | [X]%
src/schemas.py | [N] | [N] | [X]%
Total | [N] | [N] | [X]%

6.3 Key Test Results

Mention any notable edge cases you discovered and how you handled them.

6.4 Postman Collection

Reference the exported Postman collection (include a screenshot of a successful test run in the appendix).


Section 7: Model Explainability

Length: 1 page

7.1 Methodology

  • Tool(s) used: [LIME / SHAP / Both]
  • Number of explanations generated: [N individual predictions explained]
  • Visualizations produced: [List them]

7.2 Global Feature Importance

Include your SHAP summary plot or LIME global feature importance and interpret it:

The top 3 most important features are:

  1. [Feature A] — [interpretation]
  2. [Feature B] — [interpretation]
  3. [Feature C] — [interpretation]

7.3 Individual Prediction Explanations

Show 1-2 LIME or SHAP explanations for specific predictions:

Example 1: For a customer with [characteristics], the model predicted [prediction] with [confidence]%. The main contributing factors were [factors].

7.4 Insights and Potential Biases

  • Are the important features expected? Do they make domain sense?
  • Are there any features that should NOT influence the prediction? (e.g., sensitive attributes)
  • What surprised you about the model's behavior?

Writing Effective Interpretations

Don't just describe the plots — explain what they mean for the business or end user. Example: "The model relies heavily on tenure (months as customer), which makes business sense since new customers have had less time to build loyalty."


Section 8: Deployment Architecture

Length: 0.5 page

8.1 How to Run the Service

Provide step-by-step instructions (abbreviated — full instructions are in README):

# 1. Clone the repository
git clone https://github.com/username/lia-project.git

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows

# 3. Install dependencies
pip install -r requirements.txt

# 4. Start the API
uvicorn src.app:app --host 0.0.0.0 --port 8000

8.2 Docker (if applicable)

FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "src.app:app", "--host", "0.0.0.0", "--port", "8000"]

8.3 Environment

Component | Version
Python | [version]
Framework | [FastAPI/Flask version]
scikit-learn | [version]
OS | [e.g., Windows 11 / Ubuntu 22.04]

Section 9: Conclusion and Future Work

Length: 0.5 page

9.1 Summary

Recap the key achievements:

This project successfully demonstrates [main achievement]. The [model] achieves [metric] on [dataset], and is served through a [framework] API with [N] endpoints, [N] tests, and [X]% code coverage. Explainability analysis using [tool] provided [key insight].

9.2 Lessons Learned

List 2-3 important lessons:

  1. [e.g., "Data preprocessing took more time than expected — 60% of the project time was spent cleaning data"]
  2. [e.g., "Writing tests first (TDD) would have saved debugging time later"]
  3. [e.g., "SHAP revealed that my model relied on a feature I hadn't expected"]

9.3 Future Improvements

Suggest 2-3 realistic improvements:

  1. [e.g., "Implement a CI/CD pipeline with GitHub Actions"]
  2. [e.g., "Add model monitoring to detect data drift in production"]
  3. [e.g., "Test deep learning models for comparison"]

Section 10: References

Use a consistent citation format (APA, IEEE, or numbered). Include:

  • Dataset source and documentation
  • Framework documentation (FastAPI, Flask, scikit-learn)
  • LIME/SHAP papers or documentation
  • Any course materials referenced
  • Any AI tools used (with a brief description of how)

Example References

[1] T. Chen and C. Guestrin, "XGBoost: A Scalable Tree Boosting System," in Proc. KDD, 2016.

[2] S. Lundberg and S.-I. Lee, "A Unified Approach to Interpreting Model Predictions," in NeurIPS, 2017.

[3] M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why Should I Trust You?': Explaining the Predictions of Any Classifier," in Proc. KDD, 2016.

[4] FastAPI Documentation, https://fastapi.tiangolo.com/

[5] Telco Customer Churn Dataset, Kaggle, https://www.kaggle.com/datasets/blastchar/telco-customer-churn


Appendices (Not Counted in Page Limit)

Include additional supporting material:

Appendix | Content
A | Additional EDA visualizations
B | Full confusion matrix and classification report
C | Postman test run screenshots
D | Additional LIME/SHAP visualizations
E | Full code coverage report (HTML export or screenshot)

Quality Checklist for Your Report

Before submitting, verify:

  • Executive summary covers all major points
  • Problem is clearly defined with measurable objectives
  • All preprocessing steps are documented and justified
  • Model comparison table is complete with ≥ 3 metrics
  • Confusion matrix is present and interpreted
  • API endpoints are documented with examples
  • Testing coverage is reported with numbers
  • Explainability section has ≥ 3 visualizations with interpretations
  • Conclusion includes lessons learned and future work
  • References are complete and formatted consistently
  • All figures and tables are numbered and captioned
  • Report is 5-8 pages (no more, no less)
  • No spelling or grammar errors
  • PDF format, properly formatted

AI Disclosure

If you used AI tools (ChatGPT, Copilot, etc.) during your project, you must disclose this in your report. Add a brief section before the references explaining which tools you used and how. Example: "GitHub Copilot was used for boilerplate code generation. ChatGPT was consulted for debugging error messages. All generated code was reviewed, tested, and adapted."