Prompt Engineering & Model Tuning
The difference between a good AI response and a great one is often the prompt. Learn how to tune generation behaviour, engineer effective prompts, and implement self-critique techniques like chain-of-thought and reflection.
Tuning AI behaviour
A model is like a talented musician — it can play anything, but it needs direction. Prompt engineering is the sheet music. Model parameters are the volume and tempo knobs.
The same model can give wildly different responses depending on how you ask (prompt) and what settings you use (temperature, max tokens, etc.). Mastering these controls is what separates a demo from a production AI app.
Model parameters
| Parameter | What It Controls | Range | Default | When to Adjust |
|---|---|---|---|---|
| Temperature | Randomness/creativity | 0.0 - 2.0 | ~1.0 | Lower for factual tasks, higher for creative |
| Top P | Diversity of token selection | 0.0 - 1.0 | ~1.0 | Lower to constrain vocabulary, higher for variety |
| Max tokens | Maximum response length | 1 - model limit | Varies | Set to prevent runaway responses |
| Frequency penalty | Reduces repetition of tokens | -2.0 - 2.0 | 0 | Increase if responses are repetitive |
| Presence penalty | Encourages new topics | -2.0 - 2.0 | 0 | Increase for more diverse content |
| Stop sequences | Tokens that end generation | Custom strings | None | Use to control output format |
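The parameters above typically travel together in a single request payload. A minimal sketch, assuming OpenAI-style parameter names (other providers may use different keys, so check your API reference):

```python
def generation_params(task_type: str) -> dict:
    """Return generation settings tuned for a given task type."""
    base = {
        "max_tokens": 512,         # cap response length to prevent runaway output
        "frequency_penalty": 0.0,  # raise if responses repeat themselves
        "presence_penalty": 0.0,   # raise to push the model toward new topics
        "stop": None,              # e.g. ["\n\n"] to end generation at a blank line
    }
    if task_type == "factual":
        # Extraction, classification, factual Q&A: consistency over creativity
        base.update(temperature=0.0, top_p=1.0)
    elif task_type == "creative":
        # Brainstorming, creative writing: more randomness and variety
        base.update(temperature=1.2, top_p=0.95)
    else:
        # Balanced default for most production applications
        base.update(temperature=0.5, top_p=1.0)
    return base
```

You would pass the returned dict alongside your prompt when calling your provider's SDK.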
Exam tip: Temperature is the most tested parameter
Temperature exam questions follow a pattern:
- Temperature 0 = near-deterministic, the same input gives essentially the same output. Best for: factual Q&A, extraction, classification
- Temperature 0.3-0.7 = balanced. Best for: most production applications
- Temperature 1.0+ = creative, varied. Best for: brainstorming, creative writing, diverse options
If the scenario needs consistency and accuracy, the answer is low temperature. If it needs creativity and variety, the answer is higher temperature.
Prompt engineering techniques
| Technique | What It Does | Example |
|---|---|---|
| System prompt | Sets the model's role, rules, and context | "You are a compliance analyst. Always cite regulations." |
| Few-shot | Provides example input/output pairs | "Q: What is DLP? A: Data Loss Prevention prevents…" |
| Chain-of-thought | Asks model to show reasoning steps | "Think step by step before answering." |
| Output formatting | Specifies response structure | "Respond in JSON format with fields: answer, confidence, sources" |
| Grounding instruction | Constrains model to use provided context | "Answer ONLY from the provided documents." |
| Persona | Gives the model a specific expert identity | "You are a senior Azure architect with 15 years of experience." |
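These techniques compose: a single request can carry a system prompt, few-shot examples, and an output-format instruction at once. A minimal sketch, assuming the common role/content chat-message convention (the example answer content is illustrative):

```python
def build_messages(question: str) -> list[dict]:
    """Combine system prompt, few-shot example, and the user question."""
    return [
        # System prompt: role, rules, grounding, and output format
        {"role": "system", "content": (
            "You are a compliance analyst. Always cite regulations. "
            "Answer ONLY from the provided documents. "
            "Respond in JSON format with fields: answer, confidence, sources."
        )},
        # Few-shot: one example input/output pair showing the desired shape
        {"role": "user", "content": "Q: What is DLP?"},
        {"role": "assistant", "content":
            '{"answer": "Data Loss Prevention prevents sensitive data leaks.", '
            '"confidence": 0.9, "sources": ["doc-1"]}'},
        # The actual question goes last
        {"role": "user", "content": question},
    ]
```

The few-shot pair teaches the model the expected JSON shape far more reliably than describing it in prose alone.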
Chain-of-thought and self-critique
Advanced reasoning techniques that improve output quality:
| Feature | Chain-of-Thought | Self-Critique | Reflection |
|---|---|---|---|
| What it is | Model explains its reasoning step by step | Model reviews its own response and identifies errors | Model evaluates whether it achieved the task goal |
| How to trigger | 'Think step by step' | 'Review your response. Are there any errors?' | 'Did your answer fully address the question? What did you miss?' |
| Best for | Complex reasoning, math, multi-step problems | Catching factual errors and inconsistencies | Ensuring completeness and accuracy |
| Cost | More tokens (reasoning + answer) | Double the tokens (answer + review) | Additional tokens for evaluation step |
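The trigger phrases in the table can be applied mechanically by appending them to a base prompt. A minimal sketch (the exact wording is illustrative and worth tuning per model):

```python
# Trigger phrases from the table above; tune the wording for your model.
TRIGGERS = {
    "chain_of_thought": "Think step by step before answering.",
    "self_critique": "Review your response. Are there any errors?",
    "reflection": "Did your answer fully address the question? What did you miss?",
}

def with_technique(prompt: str, technique: str) -> str:
    """Append the chosen reasoning trigger to a base prompt."""
    return f"{prompt}\n\n{TRIGGERS[technique]}"
```

Note the cost column above: each technique spends extra tokens, so apply them where reasoning quality matters, not on every call.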
Real-world example: Atlas Financial's self-critique loop
Atlas Financial’s compliance agent uses a two-pass approach:
Pass 1: Generate assessment
- Agent reviews loan application against regulations
- Produces initial compliance assessment with citations
Pass 2: Self-critique
- Same agent reviews its own assessment with the prompt: “Review your compliance assessment. Check: (1) Are all citations accurate? (2) Did you miss any applicable regulations? (3) Is the risk assessment justified?”
- Agent corrects errors and fills gaps
Result: 23% reduction in false compliance flags after adding the self-critique loop. The extra tokens are worth it for high-stakes financial decisions.
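The two-pass pattern above can be sketched as a short control loop. `call_model` below is a hypothetical stand-in for your actual LLM client call (it just echoes its input so the sketch runs standalone); swap in your provider's SDK:

```python
CRITIQUE_PROMPT = (
    "Review your compliance assessment. Check: (1) Are all citations accurate? "
    "(2) Did you miss any applicable regulations? "
    "(3) Is the risk assessment justified?"
)

def call_model(messages: list[dict]) -> str:
    # Placeholder for a real LLM call; echoes the last user message.
    return f"[assessment of: {messages[-1]['content']}]"

def assess_with_self_critique(application: str) -> str:
    # Pass 1: generate the initial compliance assessment
    history = [{"role": "user",
                "content": f"Assess compliance for: {application}"}]
    draft = call_model(history)
    # Pass 2: the same agent reviews its own output and corrects it
    history += [{"role": "assistant", "content": draft},
                {"role": "user", "content": CRITIQUE_PROMPT}]
    return call_model(history)
```

Keeping the draft in the conversation history is what lets pass 2 critique it; the final response replaces the draft rather than being appended to it.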
Key terms
Knowledge check
MediaForge's content generation tool produces the same headline every time for similar briefs. The marketing team wants more creative variety. Which parameter should they adjust?
NeuralMed's patient chatbot sometimes makes reasoning errors when answering multi-step medical questions (e.g., 'If the patient has condition A AND takes medication B, what are the risks?'). Which technique would most improve accuracy?
🎬 Video coming soon