Building a Chat App with the Foundry SDK
Time to write code. This module shows you how to build a simple chat application using the Microsoft Foundry SDK in Python, connecting your code to a deployed AI model.
Your first AI app
Building an AI chat app is like sending text messages, but to an AI model.
You already know how to test prompts in the Foundry Playground. Now you're going to do the exact same thing, but from Python code. Your app sends a message to the model, the model sends a response back, and you display it.
Don't worry if you're new to coding: the exam expects you to understand the code, not write it from scratch. We'll walk through every line.
The architecture of a chat app
```
Your Python App → Foundry SDK → Foundry Project → Deployed Model (GPT-4o)
       ↑                                                        │
       └───────────────────── Response ─────────────────────────┘
```
| Component | What It Does |
|---|---|
| Your app | Sends user messages, displays responses |
| Foundry SDK | Handles authentication, API calls, formatting |
| Foundry project | Routes the request to the right deployment |
| Deployed model | Generates the response |
The code: step by step
Step 1: Install the SDK
```bash
pip install azure-ai-projects azure-identity
```
Step 2: Connect to your project
Note: The Foundry SDK is actively evolving. The examples below illustrate the core concepts; always check the official SDK documentation for the latest syntax.
```python
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

# Connect to your Foundry project
client = AIProjectClient(
    credential=DefaultAzureCredential(),
    endpoint="https://your-project.services.ai.azure.com"
)
```
What's happening:

- `DefaultAzureCredential` handles authentication (uses your Azure login)
- `AIProjectClient` connects to your specific Foundry project
- The `endpoint` is your project's URL (found in the Foundry portal)
Step 3: Send a message and get a response
```python
# Get an OpenAI-compatible chat client from the project
chat = client.inference.get_chat_completions_client()

# Send a message
response = chat.complete(
    model="gpt4o-coursework",  # Your deployment name
    messages=[
        {"role": "system", "content": "You are a helpful study assistant."},
        {"role": "user", "content": "Explain what a neural network is in simple terms."}
    ],
    temperature=0.7,
    max_tokens=200
)

# Display the response
print(response.choices[0].message.content)
```
Exam focus: You don't need to memorise exact SDK method names. Understand the pattern: connect to project → get a client → send messages with roles → read the response.
What's happening:

- `messages` is a list with a system prompt and a user message
- `model` is the name of your deployment (your deployment name, not the underlying model name)
- `temperature` and `max_tokens` control the response behaviour
- The response comes back in `response.choices[0].message.content`
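To see why the reply is read with `response.choices[0].message.content`, here is the response shape sketched as a plain dictionary. This is an illustration of the OpenAI-compatible layout, not the actual SDK object (which is typed), and the values are made up:

```python
# Illustrative mock of a chat completion response (plain dict;
# the real SDK returns typed objects with the same field layout)
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "A neural network is..."}}
    ],
    "usage": {"prompt_tokens": 28, "completion_tokens": 42},
}

# The reply text lives at choices -> first choice -> message -> content
reply = response["choices"][0]["message"]["content"]
print(reply)
```

The `choices` field is a list because the API can return multiple candidate responses; by default there is exactly one, which is why `choices[0]` appears in every example.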
Step 4: Build a conversation loop
```python
messages = [
    {"role": "system", "content": "You are a helpful study assistant."}
]

while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        break

    messages.append({"role": "user", "content": user_input})

    response = chat.complete(
        model="gpt4o-coursework",
        messages=messages,
        temperature=0.7,
        max_tokens=200
    )

    assistant_message = response.choices[0].message.content
    print(f"AI: {assistant_message}")
    messages.append({"role": "assistant", "content": assistant_message})
```
Key insight: Each new request includes the full conversation history (the `messages` list). This is how the model maintains context across a conversation.
Why the full conversation is sent every time
AI models are stateless: they don't remember previous conversations. Every API call is independent.
To create the illusion of a conversation, you send the entire chat history with each request:
- First call: system + user message
- Second call: system + user message 1 + assistant response 1 + user message 2
- Third call: system + all previous messages + new user message
This is why token limits matter: long conversations eventually exceed the model's context window.
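A crude way to watch for this is to estimate the token count of the history before each call. The 4-characters-per-token ratio below is a rough rule of thumb for English text, not a real tokenizer (libraries such as tiktoken give exact counts):

```python
# Rough token estimate: ~4 characters per token for English text.
# This is a heuristic sketch, not a real tokenizer.
def estimate_tokens(messages):
    return sum(len(m["content"]) // 4 for m in messages)

history = [
    {"role": "system", "content": "You are a helpful study assistant."},
    {"role": "user", "content": "Explain what a neural network is in simple terms."},
]
print(estimate_tokens(history))
```

An app could compare this estimate against the model's context window and warn or trim before the limit is hit.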
Exam relevance: Understanding that models are stateless and conversations require full history is a commonly tested concept.
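When the history grows too long, a common pattern is to trim it before each call, keeping the system prompt plus only the most recent turns. A minimal sketch; trimming by message count is a simplification (production code would count tokens), and `max_messages` is an illustrative value:

```python
# Keep the system prompt plus only the newest messages so the request
# stays within the model's context window (message count is a stand-in
# for a real token budget).
def trim_history(messages, max_messages=10):
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]

# Build a long fake conversation: 1 system message + 12 user/assistant pairs
history = [{"role": "system", "content": "You are a helpful study assistant."}]
for i in range(12):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=6)
print(len(trimmed))           # 7: the system prompt plus the 6 newest messages
print(trimmed[1]["content"])  # "question 9": oldest surviving turn
```

Note that the system message is always preserved: dropping it would make the model lose its behaviour rules, not just old context.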
Message roles explained
| Role | Purpose | Who Creates It |
|---|---|---|
| system | Sets the AI's role, rules, and behaviour for the entire conversation | The developer (in code) |
| user | The human's question or instruction | The end user (typed input) |
| assistant | The AI's previous responses (for conversation history) | The model (via API response) |
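The three roles from the table can be seen side by side in a single history list. A small sketch with made-up contents:

```python
# All three roles in one history list (contents are made up).
# system: written by the developer; user: typed by the person;
# assistant: previous model replies appended for context.
messages = [
    {"role": "system", "content": "You are a helpful study assistant."},
    {"role": "user", "content": "What is overfitting?"},
    {"role": "assistant", "content": "Overfitting is when a model memorises its training data."},
    {"role": "user", "content": "How do I prevent it?"},
]

# Render the visible transcript the way a chat UI would (system stays hidden)
for m in messages:
    if m["role"] != "system":
        print(f"{m['role']}: {m['content']}")
```

Notice the alternating user/assistant pattern after the system message; this is the shape the model expects when you send the full history.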
Authentication methods
| Method | When to Use |
|---|---|
| DefaultAzureCredential | Development: uses your Azure CLI login or managed identity |
| API key | Quick testing: paste the key from the Foundry portal |
| Managed identity | Production: Azure assigns an identity to your app automatically |
Exam tip: `DefaultAzureCredential` is the recommended approach because it works in development (Azure CLI login) and production (managed identity) without code changes.
🎬 Video walkthrough
🎬 Video coming soon
Building a Chat App – AI-901 Module 14
~16 min
Knowledge Check
Priya notices that her chat app gives relevant answers in the first few messages but seems to 'forget' context after about 20 exchanges. What is the most likely cause?
In a chat completion API call, which message role defines the AI's persistent behaviour rules like 'always respond in French' and 'never discuss politics'?
Next up: Agents in Foundry – creating AI that doesn't just talk, but takes action.