A hands-on preview from the AiBricks bootcamp. In 30 minutes you'll understand how LLMs work, set up the Anthropic API, and build a working prompt chain — from scratch. No prior AI experience required.
A Large Language Model is a neural network trained to predict the next token given the ones before it — at massive scale. When you "prompt" one, you're giving it a starting sequence of tokens and asking it to continue. That's the entire mechanism.
Everything else — reasoning, code generation, summarisation — emerges from doing that one thing really well across trillions of examples.
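That one mechanism, predict a token, append it, repeat, can be made concrete with a toy model. This is a deliberately tiny sketch (a hand-written lookup table standing in for a trained network, with a made-up `TOY_MODEL`), not how a real LLM is implemented:

```python
# Toy illustration of next-token prediction: a hand-written "model"
# mapping a 2-token context to a probability distribution over next tokens.
# A real LLM learns these probabilities from trillions of examples.
TOY_MODEL = {
    ("the", "cat"): {"sat": 0.7, "ran": 0.3},
    ("cat", "sat"): {"on": 0.9, "down": 0.1},
    ("sat", "on"): {"the": 0.8, "a": 0.2},
    ("on", "the"): {"mat": 1.0},
}

def generate(prompt_tokens, steps):
    tokens = list(prompt_tokens)
    for _ in range(steps):
        context = tuple(tokens[-2:])  # this "model" sees only the last 2 tokens
        probs = TOY_MODEL.get(context)
        if probs is None:
            break
        # Greedy decoding: always pick the most probable next token
        tokens.append(max(probs, key=probs.get))
    return tokens

print(generate(["the", "cat"], 4))
# → ['the', 'cat', 'sat', 'on', 'the', 'mat']
```

Every step does exactly one thing: predict the next token and append it. Scale the lookup table up to a neural network with billions of parameters, and you have the core loop of an LLM.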
LLMs don't see "words" — they see tokens. A token is roughly 4 characters or ¾ of a word. The sentence "Hello world!" is 3–4 tokens. This matters for cost (you pay per token), for context limits, and for understanding why models sometimes split words strangely.
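The 4-characters-per-token rule of thumb is easy to turn into a back-of-the-envelope estimator. This is only a heuristic, not a real tokenizer; for exact counts, use a tokenizer or the provider's token-counting support:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("Hello world!"))  # → 3
```

Useful for quick cost ballparks, but expect real counts to differ, especially for code, non-English text, and unusual punctuation.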
The context window is the maximum number of tokens a model can "see" at once: your prompt plus the model's response combined. Claude 3.5 Sonnet has a 200K-token context window; GPT-4o has 128K. When a conversation grows past the limit, the model can't see the excess: the API rejects the oversized request, so your application has to drop or summarise older content. This shapes how you design long conversations and document processing.
Temperature controls randomness. At 0, the model always picks the most probable next token — deterministic, precise. At 1, it samples more broadly — more creative, less predictable. For structured outputs (JSON, code), use low temperature. For brainstorming, use higher temperature.
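The mechanics behind that dial can be sketched in a few lines. This is a simplified model of temperature sampling (real providers add details like top-p filtering on top):

```python
import math
import random

def sample_token(logits: dict, temperature: float) -> str:
    """Pick a next token from raw scores, scaled by temperature."""
    if temperature == 0:
        return max(logits, key=logits.get)  # greedy: always the top token
    # Softmax over temperature-scaled scores: low temperature sharpens
    # the distribution, high temperature flattens it
    scaled = {t: math.exp(s / temperature) for t, s in logits.items()}
    total = sum(scaled.values())
    r = random.random() * total
    for token, weight in scaled.items():
        r -= weight
        if r <= 0:
            return token
    return token

logits = {"Paris": 3.0, "London": 1.5, "Berlin": 0.5}
print(sample_token(logits, 0))  # always "Paris"
```

At temperature 0 the `max` branch always fires, which is why structured-output prompts are reproducible; at higher temperatures the lower-scored tokens get a real chance of being picked.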
You need Python 3.10+ and an Anthropic API key. Get one free at console.anthropic.com — new accounts include $5 of free credits, enough for hundreds of API calls.
```shell
pip install anthropic

# macOS / Linux
export ANTHROPIC_API_KEY="sk-ant-..."

# Windows (PowerShell)
$env:ANTHROPIC_API_KEY="sk-ant-..."
```
Best practice: Store your key in a .env file and add .env to .gitignore. Never commit API keys to git — even in private repos.
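In practice you would typically use the python-dotenv package (`load_dotenv()`) for this, but the idea fits in a few lines. A minimal sketch of a `.env` loader, assuming simple `KEY=value` lines with no quoting rules:

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: KEY=value lines, '#' comments, no quoting."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Don't overwrite variables already set in the real environment
        os.environ.setdefault(key.strip(), value.strip())

# Usage:
# load_env()
# client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY
```

Because the SDK reads `ANTHROPIC_API_KEY` from the environment, loading the `.env` file before constructing the client is all the wiring you need.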
Here's the minimal code to call Claude and get a response:
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is a RAG pipeline?"}
    ],
)
print(message.content[0].text)
```
Run it:

```shell
python first_call.py
```
The system prompt sets the model's persona and instructions before the user says anything. It's the most powerful lever you have for shaping outputs.
```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system="""You are a senior AI engineer explaining concepts to a developer
learning AI engineering for the first time. Be clear and concrete.
Use analogies when helpful. Keep answers to 3–5 sentences.""",
    messages=[
        {"role": "user", "content": "What is a vector database?"}
    ],
)
print(message.content[0].text)
```
Notice: The system prompt doesn't appear in the messages list — it's a separate parameter. This is intentional: the system prompt is the model's fixed instruction set, separate from the conversation.
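This split also clarifies how multi-turn conversations work: the system prompt stays fixed while user and assistant turns accumulate in `messages`. A sketch of the data structure only (the `client.messages.create` call is shown commented out, since it matches the calls above):

```python
# The system prompt is fixed; the conversation history grows.
system = "You are a senior AI engineer. Keep answers short."

messages = [
    {"role": "user", "content": "What is a vector database?"},
    {"role": "assistant",
     "content": "A database that indexes embeddings for similarity search."},
    {"role": "user", "content": "How is that different from Postgres?"},
]

# Each follow-up call resends the full history plus the new user turn:
# client.messages.create(model="claude-3-5-sonnet-20241022",
#                        system=system, messages=messages, max_tokens=1024)

print([m["role"] for m in messages])  # → ['user', 'assistant', 'user']
```

Note that the API is stateless: the model only "remembers" earlier turns because you send them again on every call, which is also why long conversations eat into the context window.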
Real applications rarely want free text — they want structured data. Here's how to reliably get JSON back from the model:
```python
import anthropic
import json

client = anthropic.Anthropic()

prompt = """Analyse the following text and return a JSON object with:
- sentiment: "positive", "negative", or "neutral"
- confidence: a number from 0 to 1
- key_topics: list of up to 3 main topics

Return ONLY valid JSON. No explanation, no markdown fences.

Text: "The new Claude model is incredibly fast and handles long documents well. Some edge cases still trip it up."
"""

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    temperature=0,  # deterministic for structured output
    messages=[{"role": "user", "content": prompt}],
)

result = json.loads(message.content[0].text)
print(result)
# e.g. {'sentiment': 'positive', 'confidence': 0.75, 'key_topics': [...]}
```
When you need valid JSON, you want the most probable token sequence, not a creative interpretation. temperature=0 makes the model pick the most likely token at every step, so the same prompt yields the same output almost every time (minor nondeterminism can remain at the infrastructure level). This matters for production code where you parse the response programmatically.
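Even with temperature 0 and explicit instructions, models occasionally wrap JSON in markdown fences anyway, so production code usually parses defensively. A small sketch of that pattern (the function name is ours, not part of any SDK):

```python
import json
import re

def parse_json_response(text: str):
    """Parse a model response as JSON, tolerating stray markdown fences."""
    text = text.strip()
    # Strip a ```json ... ``` wrapper if the model added one despite instructions
    fence = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    return json.loads(text)

print(parse_json_response('```json\n{"sentiment": "positive"}\n```'))
# → {'sentiment': 'positive'}
```

If parsing still fails, a common fallback is to retry the call once with the error message included in the prompt.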
Most real applications involve multiple LLM calls — the output of one becomes the input of the next. This is "prompt chaining" and it's the foundation of more complex agentic systems.
```python
import anthropic

client = anthropic.Anthropic()

# Step 1: Extract key points from a document
document = (
    "AI adoption in enterprise is accelerating. Companies report 30% "
    "productivity gains in knowledge work. Main blockers: data privacy, "
    "integration complexity, and lack of skilled engineers."
)

step1 = client.messages.create(
    model="claude-3-5-haiku-20241022",  # cheaper model for extraction
    max_tokens=256,
    system="Extract the 3 most important facts. Return as a JSON array of strings.",
    messages=[{"role": "user", "content": document}],
)
key_facts = step1.content[0].text

# Step 2: Use those facts to write an executive summary
step2 = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # smarter model for synthesis
    max_tokens=512,
    system="Write a one-paragraph executive summary for a CTO.",
    messages=[{"role": "user", "content": f"Key facts: {key_facts}"}],
)
print(step2.content[0].text)
```
Cost tip: Use a fast, cheap model (Haiku) for extraction/classification steps. Reserve the smarter model (Sonnet, Opus) for synthesis and reasoning. This can cut API costs by 80% in real pipelines.
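You can make this concrete with a small cost estimator. The per-million-token prices below are illustrative placeholders only (rates change; check the provider's current pricing page before relying on them):

```python
# Illustrative (input, output) USD prices per million tokens — assumed
# figures for demonstration, not current published rates.
PRICE_PER_MTOK = {
    "haiku": (0.80, 4.00),
    "sonnet": (3.00, 15.00),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    inp, out = PRICE_PER_MTOK[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# 1,000 extraction calls (2K tokens in, 200 out): cheap vs. smart model
haiku_total = 1000 * call_cost("haiku", 2000, 200)
sonnet_total = 1000 * call_cost("sonnet", 2000, 200)
print(f"haiku: ${haiku_total:.2f}  sonnet: ${sonnet_total:.2f}")
# → haiku: $2.40  sonnet: $9.00
```

With these example rates, routing the high-volume extraction step to the cheap model cuts that step's cost by roughly three quarters, which is how the headline savings in real pipelines come about.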
You've just covered the core mechanics of Week 1. In the bootcamp, you'll go much further:
Cohort 01 starts May 5, 2026. 95 spots remaining. Full refund if you're not satisfied after Week 1.
Apply to Cohort 01