Building a Chat App with the Foundry SDK
Time to write code. This module shows you how to build a simple chat application using the Microsoft Foundry SDK in Python, connecting your code to a deployed AI model.
Your first AI app
Building an AI chat app is like sending text messages, but to an AI model.
You already know how to test prompts in the Foundry Playground. Now you're going to do the exact same thing, but from Python code. Your app sends a message to the model, the model sends a response back, and you display it.
Don't worry if you're new to coding: the exam expects you to understand the code, not write it from scratch. We'll walk through every line.
The architecture of a chat app
```
Your Python App → Foundry SDK → Foundry Project → Deployed Model (GPT-4o)
       ↑                                                        │
       └───────────────────── Response ─────────────────────────┘
```
| Component | What It Does |
|---|---|
| Your app | Sends user messages, displays responses |
| Foundry SDK | Handles authentication, API calls, formatting |
| Foundry project | Routes the request to the right deployment |
| Deployed model | Generates the response |
The code: step by step
Step 1: Install the SDK
```bash
pip install azure-ai-projects azure-identity
```
Step 2: Connect to your project
Note: The Foundry SDK is actively evolving. The examples below illustrate the core concepts; always check the official SDK documentation for the latest syntax.
```python
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

# Connect to your Foundry project
client = AIProjectClient(
    credential=DefaultAzureCredential(),
    endpoint="https://your-project.services.ai.azure.com"
)
```
What's happening:

- `DefaultAzureCredential` handles authentication (uses your Azure login)
- `AIProjectClient` connects to your specific Foundry project
- The `endpoint` is your project's URL (found in the Foundry portal)
Step 3: Send a message and get a response
```python
# Get an OpenAI-compatible chat client from the project
chat = client.inference.get_chat_completions_client()

# Send a message
response = chat.complete(
    model="gpt4o-coursework",  # Your deployment name
    messages=[
        {"role": "system", "content": "You are a helpful study assistant."},
        {"role": "user", "content": "Explain what a neural network is in simple terms."}
    ],
    temperature=0.7,
    max_tokens=200
)

# Display the response
print(response.choices[0].message.content)
```
Exam focus: You don't need to memorise exact SDK method names. Understand the pattern: connect to project → get a client → send messages with roles → read the response.
What's happening:

- `messages` is a list with a system prompt and a user message
- `model` is the name of your deployment (your deployment name, not the underlying model name)
- `temperature` and `max_tokens` control the response behaviour
- The response comes back in `response.choices[0].message.content`
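To see why the reply is read with `response.choices[0].message.content`, here is the response shape sketched as a plain dictionary. This is an illustration of the OpenAI-compatible layout, not the actual SDK object (which is typed), and the values are made up:

```python
# Illustrative mock of a chat completion response (plain dict;
# the real SDK returns typed objects with the same field layout)
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "A neural network is..."}}
    ],
    "usage": {"prompt_tokens": 28, "completion_tokens": 42},
}

# The reply text lives at choices -> first choice -> message -> content
reply = response["choices"][0]["message"]["content"]
print(reply)
```

The `choices` field is a list because the API can return multiple candidate responses; by default there is exactly one, which is why `choices[0]` appears in every example.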
Step 4: Build a conversation loop
```python
messages = [
    {"role": "system", "content": "You are a helpful study assistant."}
]

while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        break

    messages.append({"role": "user", "content": user_input})

    response = chat.complete(
        model="gpt4o-coursework",
        messages=messages,
        temperature=0.7,
        max_tokens=200
    )

    assistant_message = response.choices[0].message.content
    print(f"AI: {assistant_message}")
    messages.append({"role": "assistant", "content": assistant_message})
```
Key insight: Each new request includes the full conversation history (the `messages` list). This is how the model maintains context across a conversation.
Why the full conversation is sent every time
AI models are stateless: they don't remember previous conversations. Every API call is independent.
To create the illusion of a conversation, you send the entire chat history with each request:
- First call: system + user message
- Second call: system + user message 1 + assistant response 1 + user message 2
- Third call: system + all previous messages + new user message
This is why token limits matter: long conversations eventually exceed the model's context window.
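A crude way to watch for this is to estimate the token count of the history before each call. The 4-characters-per-token ratio below is a rough rule of thumb for English text, not a real tokenizer (libraries such as tiktoken give exact counts):

```python
# Rough token estimate: ~4 characters per token for English text.
# This is a heuristic sketch, not a real tokenizer.
def estimate_tokens(messages):
    return sum(len(m["content"]) // 4 for m in messages)

history = [
    {"role": "system", "content": "You are a helpful study assistant."},
    {"role": "user", "content": "Explain what a neural network is in simple terms."},
]
print(estimate_tokens(history))
```

An app could compare this estimate against the model's context window and warn or trim before the limit is hit.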
Exam relevance: Understanding that models are stateless and conversations require full history is a commonly tested concept.
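When the history grows too long, a common pattern is to trim it before each call, keeping the system prompt plus only the most recent turns. A minimal sketch; trimming by message count is a simplification (production code would count tokens), and `max_messages` is an illustrative value:

```python
# Keep the system prompt plus only the newest messages so the request
# stays within the model's context window (message count is a stand-in
# for a real token budget).
def trim_history(messages, max_messages=10):
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]

# Build a long fake conversation: 1 system message + 12 user/assistant pairs
history = [{"role": "system", "content": "You are a helpful study assistant."}]
for i in range(12):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=6)
print(len(trimmed))           # 7: the system prompt plus the 6 newest messages
print(trimmed[1]["content"])  # "question 9": oldest surviving turn
```

Note that the system message is always preserved: dropping it would make the model lose its behaviour rules, not just old context.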
Message roles explained
| Role | Purpose | Who Creates It |
|---|---|---|
| system | Sets the AI's role, rules, and behaviour for the entire conversation | The developer (in code) |
| user | The human's question or instruction | The end user (typed input) |
| assistant | The AI's previous responses (for conversation history) | The model (via API response) |
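The three roles from the table can be seen side by side in a single history list. A small sketch with made-up contents:

```python
# All three roles in one history list (contents are made up).
# system: written by the developer; user: typed by the person;
# assistant: previous model replies appended for context.
messages = [
    {"role": "system", "content": "You are a helpful study assistant."},
    {"role": "user", "content": "What is overfitting?"},
    {"role": "assistant", "content": "Overfitting is when a model memorises its training data."},
    {"role": "user", "content": "How do I prevent it?"},
]

# Render the visible transcript the way a chat UI would (system stays hidden)
for m in messages:
    if m["role"] != "system":
        print(f"{m['role']}: {m['content']}")
```

Notice the alternating user/assistant pattern after the system message; this is the shape the model expects when you send the full history.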
Authentication methods
| Method | When to Use |
|---|---|
| DefaultAzureCredential | Development: uses your Azure CLI login or managed identity |
| API key | Quick testing: paste the key from the Foundry portal |
| Managed identity | Production: Azure assigns an identity to your app automatically |
Exam tip: `DefaultAzureCredential` is the recommended approach because it works in development (Azure CLI login) and production (managed identity) without code changes.
🎬 Video walkthrough
🎬 Video coming soon
Building a Chat App – AI-901 Module 14
~16 min
Knowledge Check
Priya notices that her chat app gives relevant answers in the first few messages but seems to 'forget' context after about 20 exchanges. What is the most likely cause?
In a chat completion API call, which message role defines the AI's persistent behaviour rules like 'always respond in French' and 'never discuss politics'?
Next up: Agents in Foundry – creating AI that doesn't just talk, but takes action.