# Prompt Engineering for Code Generation

## Why Prompt Engineering Matters
The difference between a useless AI response and a production-ready code snippet often comes down to a single sentence in your prompt. Prompt engineering is the skill of communicating effectively with AI models to get precise, useful code output.
Imagine hiring a freelance developer remotely. If you send them a vague message like "build me a website," you'll get something generic. But if you say "build a FastAPI endpoint that accepts a JSON payload with fields name (string, required) and age (int, 1–120), validates the input, and returns a 201 response with the created user object," you'll get exactly what you need. Prompting AI is the same discipline.
## Anatomy of a Good Code Prompt
A well-structured prompt has five key components:
| Component | Description | Example |
|---|---|---|
| Role | Set the AI's persona and expertise level | "You are a senior Python backend developer" |
| Context | Provide background on the project and stack | "I'm building a FastAPI app with SQLAlchemy and PostgreSQL" |
| Task | Clearly define what you want | "Write a function that paginates query results" |
| Constraints | Specify rules, limitations, and preferences | "Use async/await, return Pydantic models, handle empty results" |
| Format | Define the expected output structure | "Include type hints, docstring, and example usage" |
### The Five-Component Prompt in Action

**❌ Bad Prompt:**

```text
Write a pagination function
```

**✅ Good Prompt:**

```text
You are a senior Python backend developer.
I'm building a FastAPI application with SQLAlchemy (async) and PostgreSQL.

Write an async function called `paginate_query` that:
- Accepts a SQLAlchemy select statement, page number (int, default 1),
  and page size (int, default 20, max 100)
- Returns a Pydantic model with fields: items (list), total (int),
  page (int), pages (int), has_next (bool), has_prev (bool)
- Handles edge cases: page < 1, page_size < 1, empty results
- Uses SQLAlchemy 2.0 style with `select()` and `session.execute()`

Include type hints, a docstring, and the Pydantic response model definition.
```
## Prompting Strategies

### Strategy 1: Zero-Shot Prompting

Ask directly without providing examples. Works best for standard, well-known tasks.

```text
Write a Python decorator that measures the execution time of a function
and logs it using the `logging` module at DEBUG level.
Include the function name and elapsed time in milliseconds.
```

Expected output:

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)

def timing_decorator(func):
    """Measure and log the execution time of a function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.debug(f"{func.__name__} executed in {elapsed_ms:.2f}ms")
        return result
    return wrapper
```
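A quick way to sanity-check output like this is to apply it to a throwaway function (the `slow_sum` example below is ours, not part of the expected output). Note that `logging.basicConfig(level=logging.DEBUG)` is required to actually see the DEBUG-level log line:

```python
import functools
import logging
import time

logging.basicConfig(level=logging.DEBUG)  # DEBUG messages are hidden by default
logger = logging.getLogger(__name__)

def timing_decorator(func):
    """Measure and log the execution time of a function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.debug(f"{func.__name__} executed in {elapsed_ms:.2f}ms")
        return result
    return wrapper

@timing_decorator
def slow_sum(n: int) -> int:
    return sum(range(n))

print(slow_sum(1000))  # → 499500, with the elapsed time logged at DEBUG level
```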
### Strategy 2: Few-Shot Prompting

Provide examples of your desired input/output pattern. The AI mimics your style.

```text
I have a pattern for creating FastAPI route handlers. Follow this pattern exactly:

Example 1:

@router.get("/users", response_model=list[UserResponse])
async def list_users(
    db: AsyncSession = Depends(get_db),
    skip: int = Query(0, ge=0),
    limit: int = Query(20, ge=1, le=100),
):
    """List all users with pagination."""
    result = await db.execute(select(User).offset(skip).limit(limit))
    return result.scalars().all()

Example 2:

@router.get("/users/{user_id}", response_model=UserResponse)
async def get_user(
    user_id: int = Path(..., ge=1),
    db: AsyncSession = Depends(get_db),
):
    """Get a specific user by ID."""
    result = await db.execute(select(User).where(User.id == user_id))
    user = result.scalar_one_or_none()
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    return user

Now, following the same pattern, create:
1. POST /users (create user, return 201)
2. PUT /users/{user_id} (update user)
3. DELETE /users/{user_id} (delete user, return 204)
```
### Strategy 3: Chain-of-Thought (CoT)

Ask the AI to reason step by step before writing code. Produces better results for complex tasks.

```text
I need to implement a rate limiter for my API. Before writing code:

1. List 3 common rate limiting algorithms and their trade-offs
2. Recommend the best one for a FastAPI app handling ~1000 req/sec
3. Explain how you would implement it step by step
4. Then write the complete implementation

Use Redis as the backend store. The rate limiter should support
per-user limits with configurable windows.
```
### Strategy Comparison
| Strategy | Complexity | Tokens Used | Best For | Quality |
|---|---|---|---|---|
| Zero-shot | Low | Few | Standard tasks, boilerplate | Good |
| Few-shot | Medium | Moderate | Custom patterns, project-specific style | Very Good |
| Chain-of-thought | High | Many | Complex algorithms, architecture | Excellent |
## Prompt Templates for Common Tasks

### Template 1: Write a Function

```text
Write a [language] function called `[name]` that:
- Input: [describe parameters with types]
- Output: [describe return value with type]
- Behavior: [describe what it does step by step]
- Edge cases to handle: [list them]
- Constraints: [performance, style, libraries allowed]
Include type hints and a docstring.
```
<details>
<summary>Example: Using the "Write a Function" Template</summary>

```text
Write a Python function called `chunk_list` that:
- Input: a list of any type (items: list[T]) and chunk size (size: int)
- Output: a list of lists, each containing at most `size` elements (list[list[T]])
- Behavior: split the input list into chunks of the given size;
  last chunk may be smaller
- Edge cases: empty list → return [], size <= 0 → raise ValueError,
  size > len(list) → return [list]
- Constraints: use only standard library, must be generic (TypeVar)
Include type hints and a docstring.
```

Result:

```python
from typing import TypeVar

T = TypeVar('T')

def chunk_list(items: list[T], size: int) -> list[list[T]]:
    """Split a list into chunks of the given size.

    Args:
        items: The list to split.
        size: Maximum number of elements per chunk.

    Returns:
        A list of lists, each containing at most `size` elements.

    Raises:
        ValueError: If size is less than or equal to 0.
    """
    if size <= 0:
        raise ValueError(f"Chunk size must be positive, got {size}")
    return [items[i:i + size] for i in range(0, len(items), size)]
```

</details>
### Template 2: Debug an Error

````text
I'm getting this error in my [language/framework] application:

[paste the full error traceback]

Here is the relevant code:

```[language]
[paste the code that triggers the error]
```

Environment: [Python version, OS, key library versions]

Please:
- Explain what is causing this error
- Show the corrected code
- Explain what you changed and why
````
<details>
<summary>Example: Using the "Debug an Error" Template</summary>

````text
I'm getting this error in my FastAPI application:

TypeError: Object of type datetime is not JSON serializable

Here is the relevant code:

```python
@app.get("/events")
async def get_events():
    events = await db.fetch_all(query)
    return {"events": events}
```

Environment: Python 3.11, FastAPI 0.104, SQLAlchemy 2.0

Please:
- Explain what is causing this error
- Show the corrected code
- Explain what you changed and why
````

</details>
### Template 3: Refactor Code

````text
Refactor the following [language] code for:
- [ ] Better readability
- [ ] Improved performance
- [ ] Following [convention/standard] conventions
- [ ] Reducing code duplication

Current code:

```[language]
[paste the code]
```

Requirements:
- Preserve the exact same external behavior (inputs/outputs)
- Add type hints if missing
- [Any specific constraints]
````
### Template 4: Write Tests

````text
Write [test framework] tests for the following function:

```[language]
[paste the function to test]
```

Cover these scenarios:
- Normal/happy path with typical inputs
- Edge cases: [list them: empty input, None, boundaries]
- Error cases: [list expected exceptions]
- [Any specific test patterns to follow]

Use [fixtures/mocks/parametrize] as appropriate. Follow the Arrange-Act-Assert pattern.
````
### Template 5: Explain Code

````text
Explain the following [language] code in detail:

```[language]
[paste the code]
```

Please:
- Describe the overall purpose of this code
- Walk through it line by line
- State the time and space complexities
- Point out any potential issues or improvements
- Explain any non-obvious design choices
````
---
## Good vs. Bad Prompts — Real Examples
### Example 1: Data Validation
| Aspect | ❌ Bad Prompt | ✅ Good Prompt |
|--------|-------------|---------------|
| **Prompt** | "validate user data" | "Write a Pydantic BaseModel called UserCreate with fields: email (valid email format), password (min 8 chars, must include uppercase, lowercase, and digit), age (int, 13–120). Include custom validators with clear error messages." |
| **Result** | Generic function, unclear inputs | Precise Pydantic model with validators |
| **Why** | No specifics on what data, format, rules | Clear fields, types, constraints, format |
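As a dependency-free illustration of the rules the good prompt pins down (the actual answer would be a Pydantic model with validators; this plain-Python sketch, with our own `validate_user` name, just makes the constraints concrete):

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # rough email shape check

def validate_user(email: str, password: str, age: int) -> list[str]:
    """Return a list of validation error messages (empty if valid).

    Illustrative only; a real FastAPI app would express these rules
    as Pydantic field validators instead.
    """
    errors = []
    if not EMAIL_RE.match(email):
        errors.append("email must be a valid email address")
    if (len(password) < 8
            or not any(c.isupper() for c in password)
            or not any(c.islower() for c in password)
            or not any(c.isdigit() for c in password)):
        errors.append("password needs 8+ chars with upper, lower, and digit")
    if not 13 <= age <= 120:
        errors.append("age must be between 13 and 120")
    return errors

print(validate_user("ada@example.com", "Secur3pass", 30))  # → []
print(validate_user("not-an-email", "short", 9))           # three errors
```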
### Example 2: Database Query
| Aspect | ❌ Bad Prompt | ✅ Good Prompt |
|--------|-------------|---------------|
| **Prompt** | "write a SQL query for users" | "Write a SQLAlchemy 2.0 async query that fetches users who signed up in the last 30 days, have verified their email, and have at least one completed order. Return user id, email, signup_date, and order_count. Sort by order_count descending. Use the select() style, not the legacy Query API." |
| **Result** | `SELECT * FROM users` | Full async query with joins and filters |
| **Why** | What about users? All? Some? | Clear criteria, output fields, and style |
### Example 3: Error Handling
| Aspect | ❌ Bad Prompt | ✅ Good Prompt |
|--------|-------------|---------------|
| **Prompt** | "add error handling" | "Add error handling to this FastAPI endpoint. Catch ValueError (return 422), SQLAlchemyError (return 500 with generic message, log full error), and HTTPException (re-raise). Use a try/except block and return structured JSON error responses with `detail` and `error_code` fields." |
| **Result** | Bare `except Exception: pass` | Specific, layered error handling |
---
## Advanced Prompt Patterns
### Pattern 1: Persona Pattern
Set a specific expertise level and perspective for the AI:
```text
You are a staff-level Python developer with 15 years of experience,
specializing in high-performance API design. You follow PEP 8 strictly,
always write comprehensive docstrings, and prefer composition over inheritance.

Review this code and suggest improvements:
[code]
```

### Pattern 2: Template Pattern

Define the exact structure you want the output to follow:

```text
Generate a FastAPI CRUD router following this exact template:

# File: routers/{resource_name}.py
# Imports: [list needed imports]

router = APIRouter(prefix="/{resource_name}", tags=["{Resource Name}"])

# GET /{resource_name} - List all
# GET /{resource_name}/{id} - Get one
# POST /{resource_name} - Create
# PUT /{resource_name}/{id} - Update
# DELETE /{resource_name}/{id} - Delete

Generate this for a "Product" resource with fields:
name (str), price (float), category (str), in_stock (bool)
```
### Pattern 3: Constraint Pattern

Explicitly state what the AI should and should NOT do:

```text
Write a file upload handler for FastAPI.

DO:
- Accept only PNG, JPG, and PDF files
- Limit file size to 5MB
- Generate a UUID-based filename
- Save to an /uploads directory
- Return the file URL in the response

DO NOT:
- Use synchronous file I/O
- Store the original filename (security risk)
- Allow path traversal in filenames
- Skip MIME type validation (don't rely only on extension)
```
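The DO/DO NOT lists translate into checks like the following. `validate_upload` is our own illustrative helper, kept framework-agnostic, so the async I/O and byte-level MIME sniffing a real FastAPI handler needs are omitted here:

```python
import uuid
from pathlib import PurePosixPath

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}
MAX_SIZE = 5 * 1024 * 1024  # 5 MB

def validate_upload(filename: str, size: int) -> str:
    """Validate an upload and return a safe UUID-based filename.

    Illustrative only; a real handler would also sniff the MIME type
    from the file bytes rather than trusting the extension.
    """
    ext = PurePosixPath(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"File type {ext!r} not allowed")
    if size > MAX_SIZE:
        raise ValueError("File exceeds the 5MB limit")
    # UUID filename: the original name is never stored, so path
    # traversal sequences in it can never reach the filesystem
    return f"{uuid.uuid4().hex}{ext}"

safe_name = validate_upload("../../etc/passwd.png", 1024)
print(safe_name)  # a random hex name ending in .png; original path discarded
```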
### Pattern 4: Iterative Refinement Pattern

Build complex solutions through multiple turns:

**Turn 1:**

```text
I need to build a JWT authentication system for FastAPI.
First, outline the architecture: what modules, classes, and functions
do we need? Don't write code yet.
```

**Turn 2:**

```text
Good. Now implement the `auth/jwt_handler.py` module with the
create_access_token and verify_token functions.
```

**Turn 3:**

```text
Now add proper error handling: expired tokens, invalid signatures,
missing claims. Use custom exception classes.
```

**Turn 4:**

```text
Write pytest tests for all the functions in jwt_handler.py.
Cover happy paths, expired tokens, tampered tokens, and missing fields.
```
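To make Turn 2 concrete, here is a deliberately simplified sketch of the two functions it asks for. This uses stdlib HMAC signing rather than a real JWT library (production code would use something like `pyjwt` with RS256, as the prompt's later recap specifies); only the function names come from the prompt:

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # illustrative only; a real app loads this from config

def _sign(payload_b64: bytes) -> str:
    return hmac.new(SECRET, payload_b64, hashlib.sha256).hexdigest()

def create_access_token(sub: str, expires_in: int = 900) -> str:
    """Create a signed token with an expiry claim (15 minutes by default)."""
    payload = {"sub": sub, "exp": time.time() + expires_in}
    payload_b64 = base64.urlsafe_b64encode(json.dumps(payload).encode())
    return f"{payload_b64.decode()}.{_sign(payload_b64)}"

def verify_token(token: str) -> dict:
    """Return the payload if the signature is valid and the token unexpired."""
    payload_b64, signature = token.rsplit(".", 1)
    if not hmac.compare_digest(signature, _sign(payload_b64.encode())):
        raise ValueError("invalid signature")
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload

token = create_access_token("user-42")
print(verify_token(token)["sub"])  # → user-42
```

Turns 3 and 4 would then replace the bare `ValueError`s with custom exception classes and pin the behavior down with pytest.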
## Multi-Turn Conversations for Complex Tasks

### When to Use Multi-Turn
Single prompts work for simple tasks. For complex features, use multi-turn conversations to:
- Explore the problem space first
- Design the solution architecture
- Implement incrementally
- Review and refine each piece
### Effective Multi-Turn Strategies
| Strategy | Description | When to Use |
|---|---|---|
| Scaffold first | Ask for structure/skeleton, then fill in | Large features |
| Test-driven | Ask for tests first, then implementation | Critical business logic |
| Review loop | Generate → review → refine → repeat | Performance-sensitive code |
| Incremental | Build feature piece by piece | Learning a new framework |
In long conversations, the AI may "forget" earlier context. Periodically re-summarize the key decisions and constraints:

```text
To recap: we're building a JWT auth system for FastAPI with:
- Access tokens (15 min expiry) and refresh tokens (7 days)
- Stored in HTTP-only cookies
- RS256 signing algorithm
- Custom User model with SQLAlchemy

Now, let's implement the refresh token rotation logic.
```
## Prompt Engineering Anti-Patterns
Avoid these common mistakes:
| Anti-Pattern | Problem | Better Approach |
|---|---|---|
| "Write me an app" | Way too vague, massive scope | Break into specific, small tasks |
| No context | AI guesses your tech stack | Always specify language, framework, versions |
| Accepting first output | First attempt is rarely optimal | Iterate and refine |
| Ignoring errors in output | AI code may not run | Always test before committing |
| Over-prompting | 2000-word prompts for simple tasks | Match prompt length to task complexity |
| No constraints | AI picks defaults you don't want | State what you do and don't want |
Don't fall into the trap of prompt-and-pray — writing a prompt, accepting the output without review, and moving on. AI-generated code should always be treated as a draft that needs human validation.
## Measuring Prompt Effectiveness
How do you know if your prompts are working well? Track these metrics:
| Metric | What It Measures | Target |
|---|---|---|
| Acceptance rate | % of AI suggestions you use as-is | 30–50% |
| Iteration count | Number of prompt refinements needed | 1–3 |
| Time to working code | From prompt to tested, working code | < 2x manual coding |
| Bug introduction rate | Bugs found in AI code during review | < 10% of suggestions |
| Context switches | Times you abandon AI and write manually | < 20% |
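These metrics are easy to capture as you work; a minimal tracker (the `PromptStats` name and fields are ours, purely illustrative) might look like:

```python
from dataclasses import dataclass, field

@dataclass
class PromptStats:
    """Track prompt outcomes to compute acceptance rate and iteration count."""
    accepted: int = 0
    rejected: int = 0
    iterations: list[int] = field(default_factory=list)

    def record(self, accepted_as_is: bool, iteration_count: int) -> None:
        if accepted_as_is:
            self.accepted += 1
        else:
            self.rejected += 1
        self.iterations.append(iteration_count)

    @property
    def acceptance_rate(self) -> float:
        total = self.accepted + self.rejected
        return self.accepted / total if total else 0.0

stats = PromptStats()
stats.record(accepted_as_is=True, iteration_count=1)
stats.record(accepted_as_is=False, iteration_count=3)
print(f"{stats.acceptance_rate:.0%}")  # → 50%
```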
## Key Takeaways
| Concept | Summary |
|---|---|
| 5-component prompt | Role, Context, Task, Constraints, Format |
| Zero-shot | Direct instruction, no examples — for standard tasks |
| Few-shot | Provide examples of desired pattern — for custom style |
| Chain-of-thought | Step-by-step reasoning — for complex problems |
| Templates | Reusable prompt structures for common tasks |
| Persona pattern | Set the AI's expertise level and style |
| Iterative refinement | Build complex solutions through multi-turn dialogue |
| Anti-patterns | Avoid vague prompts, no context, and accepting without review |
## Practice Exercises
- Take a function from your current project and write a prompt that would generate it
- Try the same task with zero-shot, few-shot, and chain-of-thought — compare results
- Write a prompt template for the most common task in your workflow
- Practice the iterative refinement pattern on a multi-file feature