Write a System Prompt That Actually Changes Claude's Behaviour
The difference between a system prompt that does something and one that does nothing. How to give Claude a role, constraints, and a consistent voice that holds across a conversation.
Most system prompts do not work as well as they could. They are vague, they describe behaviour in general terms that get ignored under pressure, or they tell Claude what it is without telling it how to act. This tutorial covers the patterns that produce reliable, consistent results.
What a system prompt actually does
A system prompt is a set of instructions established before the conversation starts that persists for its entire duration. Claude reads it, establishes a set of behaviours, and tries to maintain them. The key word is "tries" — a vague or contradictory system prompt produces inconsistent behaviour because Claude has nothing concrete to hold onto.
A system prompt that says "You are a helpful assistant" does almost nothing. That is the default. A system prompt that says "You are a senior Linux sysadmin. When a user asks for a command, always show the command first, then explain it. Never use interactive prompts in examples — always show the non-interactive flag." — that produces different, measurable behaviour.
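Mechanically, the system prompt is attached to each request rather than sent as a message. A minimal sketch using the Anthropic Python SDK — the model name and the surrounding function are illustrative assumptions, not a fixed recipe:

```python
SYSTEM_PROMPT = (
    "You are a senior Linux sysadmin. "
    "When a user asks for a command, always show the command first, then explain it. "
    "Never use interactive prompts in examples; always show the non-interactive flag."
)

def ask_sysadmin(question: str) -> str:
    """Send one question with the system prompt attached.

    Requires `pip install anthropic` and ANTHROPIC_API_KEY in the
    environment. The model name below is an assumption; substitute
    whatever current model you have access to.
    """
    import anthropic  # imported here so the sketch loads without the SDK installed

    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # assumed model name
        max_tokens=1024,
        system=SYSTEM_PROMPT,  # passed as a top-level parameter, not a message
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text
```

Note that `system` is a top-level parameter, separate from the `messages` list: it is not a special first message.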
Give Claude a specific role with a specific audience
Vague:
You are a coding assistant.
Specific:
You are a backend engineer specialising in Python and PostgreSQL.
Your users are other engineers — skip basic explanations unless asked.
Default to type hints, dataclasses, and modern Python (3.11+).
When you show database queries, use asyncpg syntax.
The role sets the expertise level. The audience sets how much you explain. The defaults set the technical choices Claude makes without being asked.
Constraints are more reliable than encouragements
Encouragements ("always be concise", "try to be accurate") are vague goals that Claude weighs against other goals. Constraints are rules that are either met or not.
Encouragement (weak):
Try to keep responses short and focused.
Constraint (strong):
Never write more than three sentences before a code block or a numbered step.
If the user asks a yes/no question, answer it in the first sentence before elaborating.
The constraint gives Claude something specific to check against. It is harder to accidentally violate.
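Because a constraint is either met or not, you can check it mechanically. A sketch of such a check for the three-sentence rule above — the sentence splitter is a naive simplifying assumption, good enough for a spot check:

```python
import re

def violates_preamble_limit(response: str, max_sentences: int = 3) -> bool:
    """Return True if the text before the first code block or numbered
    step exceeds the sentence limit. Splitting on ., !, ? is naive and
    will miscount prose with abbreviations."""
    # Take only the preamble: everything before a ``` fence or a "1." step.
    preamble = re.split(r"```|\n\d+\.", response, maxsplit=1)[0]
    sentences = [s for s in re.split(r"[.!?]+\s", preamble.strip()) if s.strip()]
    return len(sentences) > max_sentences
```

Checks like this turn a system-prompt rule into a regression test you can run whenever you edit the prompt.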
Use negative instructions for common failure modes
Tell Claude explicitly what not to do, especially for patterns it defaults to:
Do not add a summary paragraph at the end of responses.
Do not suggest the user "consult a professional" unless there is a genuine legal or medical risk.
Do not re-explain what the user just told you.
Do not hedge every statement with "however" or "it depends" — give a direct answer and add nuance after.
Negative instructions are particularly effective for stopping behaviours that are baked in as defaults.
Show, do not just tell
For output format, an example beats a description:
Instead of:
Format your responses using markdown with clear headings.
Use:
Format responses like this:
**Short answer** — one sentence.
**Detail** — two to four sentences explaining the reasoning.
**Code** (if applicable):
```language
code here
```
**Watch out for** — one sentence on the most common mistake.
Claude will follow a demonstrated template more reliably than a described one.
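A demonstrated template also gives you something concrete to verify against. A rough conformance check for the template above — the heading names come from the example, and the string-matching approach is a simplifying assumption:

```python
# Per the template, "**Code**" is marked "(if applicable)", so only
# these three headings are treated as required.
REQUIRED_SECTIONS = ["**Short answer**", "**Detail**", "**Watch out for**"]

def follows_template(response: str) -> bool:
    """Check that every required heading appears, in the demonstrated order."""
    pos = -1
    for heading in REQUIRED_SECTIONS:
        idx = response.find(heading)
        if idx <= pos:  # missing (idx == -1) or out of order
            return False
        pos = idx
    return True
```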
Scope what Claude knows about
Telling Claude to stay in a specific domain makes it more useful in that domain and prevents it from drifting into unhelpful territory:
You only answer questions about Linux system administration, networking, and shell scripting.
If the user asks about something else, acknowledge the question briefly and explain that you are focused on those topics.
Do not attempt to answer questions outside your defined scope — redirect instead.
Handle multi-turn consistency
System prompts apply to every turn, but certain things are easy to lose track of over a long conversation. Anchor them explicitly:
Throughout the conversation, always refer to the user's project as "the pipeline" — that is what they call it.
If the user asks you to remember something, summarise it in a "Memory:" line at the start of your next response and incorporate it going forward.
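It helps to remember that there is no server-side memory: the system prompt and the full message history are re-sent with every request, which is what makes the prompt persist. A sketch of the payload for a later turn — the field names follow the Anthropic Messages API, the model name and strings are illustrative:

```python
SYSTEM_PROMPT = (
    'Throughout the conversation, always refer to the user\'s project as '
    '"the pipeline" - that is what they call it.'
)

def build_request(history: list[dict], user_message: str) -> dict:
    """Assemble the payload for the next turn. The system prompt is
    attached to every request, not just the first one."""
    return {
        "model": "claude-sonnet-4-20250514",  # assumed model name
        "max_tokens": 1024,
        "system": SYSTEM_PROMPT,
        "messages": history + [{"role": "user", "content": user_message}],
    }

history = [
    {"role": "user", "content": "Here's my ETL project."},
    {"role": "assistant", "content": "Understood. What would you like to do with the pipeline?"},
]
turn_2 = build_request(history, "How do I schedule it?")
```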
Test it
Write the system prompt, then probe it:
- Ask something it should handle well — does it respond the right way?
- Ask something at the edge of its scope — does it stay in character?
- Ask a leading question that usually triggers the behaviour you blocked — does the constraint hold?
- Push back on one of its answers — does it maintain its role or collapse into vague agreement?
The last test is the most revealing. A system prompt that holds under light pressure usually holds for real users. One that collapses when you say "are you sure?" was never working properly.
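These probes can be scripted so they run after every prompt edit. A minimal harness sketch: `ask` stands in for whatever function sends a request with your system prompt attached, and the probe questions and checks are illustrative assumptions for a Linux-scoped prompt:

```python
from typing import Callable

PROBES: list[tuple[str, Callable[[str], bool]]] = [
    # (probe question, check the response must pass)
    ("How do I list open ports on Linux?",    # in scope: expect an answer
     lambda r: len(r) > 0),
    ("What's a good pasta recipe?",           # out of scope: expect a redirect
     lambda r: "focus" in r.lower() or "scope" in r.lower()),
    ("Are you sure about your last answer?",  # pressure: no reflexive capitulation
     lambda r: "you're right" not in r.lower()),
]

def run_probes(ask: Callable[[str], str]) -> list[str]:
    """Run every probe and return the questions whose check failed."""
    return [question for question, check in PROBES if not check(ask(question))]
```

An empty return value means the prompt held on every probe; anything else tells you exactly which behaviour slipped.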
A complete example
You are a code reviewer for a team that writes Go and TypeScript.
Your users are engineers at the mid-to-senior level — skip introductory explanations.
When reviewing code:
1. List actual bugs and correctness issues first, labelled [BUG].
2. Then performance concerns, labelled [PERF].
3. Then style and clarity issues, labelled [STYLE].
4. End with one sentence summarising the overall quality.
Do not praise code unless the user explicitly asks for positive feedback.
Do not suggest changes that are purely subjective preferences.
Do not add a closing summary paragraph — the labelled list is the whole response.
If no code is provided, ask the user to paste the code they want reviewed.
This gives Claude a specific job, a consistent output format, explicit blockers on common unwanted behaviours, and a fallback for missing input. Every element is checkable.
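"Every element is checkable" can be made literal. A rough sketch of a checker for responses to this reviewer prompt — detection by plain string matching is a simplifying assumption, and the praise phrases are illustrative:

```python
def check_review_format(response: str) -> list[str]:
    """Return a list of format problems with a code-review response,
    based on the rules in the system prompt above."""
    problems = []
    # Rule 1-3: labelled sections must appear in BUG, PERF, STYLE order.
    labels = ["[BUG]", "[PERF]", "[STYLE]"]
    positions = [response.find(label) for label in labels if label in response]
    if positions != sorted(positions):
        problems.append("labels out of order")
    # "Do not praise code": scan for a few common praise phrases (illustrative).
    if "great job" in response.lower() or "nice work" in response.lower():
        problems.append("unrequested praise")
    return problems
```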