Working with APIs
REST APIs, authentication, OpenAI API, Anthropic API, Hugging Face API.
APIs are how software systems talk to each other. When you use an AI-powered app, it's almost certainly calling an API behind the scenes — sending your prompt to a model hosted in the cloud and receiving the response back. Understanding APIs unlocks the ability to build your own AI-powered applications, automate workflows, and integrate AI capabilities into any system you control.
What Is an API?
An API (Application Programming Interface) is a contract between two pieces of software. It defines how to request a service and what you'll get back. Think of it like a restaurant menu — you don't need to know how the kitchen works. You just need to know what you can order (the API endpoints), how to place your order (the request format), and what you'll receive (the response format).
When you type a message to ChatGPT on the website, the browser is making an API call to OpenAI's servers. The API receives your message, runs it through the model, and sends the generated text back. Every AI product you use — from Midjourney to GitHub Copilot to voice assistants — works this way under the hood.
REST APIs and HTTP Methods
Most modern APIs follow the REST (Representational State Transfer) pattern, which uses standard HTTP methods to perform operations. These are the same protocols your web browser uses — APIs just speak the same language programmatically.
| HTTP Method | Purpose | Example |
|---|---|---|
| GET | Retrieve data (read-only, no side effects) | Get a list of your models, check API status |
| POST | Send data to create or trigger something | Send a prompt to generate a completion, upload a file |
| PUT / PATCH | Update an existing resource | Update a fine-tuning job's metadata |
| DELETE | Remove a resource | Delete a fine-tuned model, remove a file |
For AI APIs, you will use POST most of the time — sending a prompt and receiving generated text, images, or embeddings in return.
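As a sketch of how these verbs look from Python, the snippet below builds a POST request with the `requests` library without actually sending it, so you can inspect the method, headers, and body. The URL and key are hypothetical placeholders, not a real endpoint.

```python
import json

import requests

# Build (but do not send) a POST request to inspect how the method,
# headers, and JSON body fit together. The URL and key are placeholders.
prepared = requests.Request(
    "POST",
    "https://api.example.com/v1/completions",
    headers={"Authorization": "Bearer sk-placeholder"},
    json={"prompt": "Hello"},
).prepare()

print(prepared.method)                   # POST
print(prepared.headers["Content-Type"])  # application/json
print(json.loads(prepared.body))         # {'prompt': 'Hello'}
```

Passing `json=` makes `requests` serialize the dict and set the `Content-Type` header for you, which is why the later examples in this module never call `json.dumps` by hand.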
Authentication: API Keys and OAuth
APIs need to know who is making the request — both for billing and for security. The two most common authentication methods are API keys and OAuth.
API Keys
An API key is a long random string that identifies your account. You include it in every request, typically as an HTTP header. Most AI APIs (OpenAI, Anthropic, Hugging Face) use API keys because they are simple and developer-friendly.
Using API keys in Python:
```python
import os

from dotenv import load_dotenv

# NEVER hardcode API keys in your source code.
# Store them in environment variables instead.
api_key = os.environ.get("OPENAI_API_KEY")

# The key is sent as an Authorization header
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

# Or set it via a .env file with python-dotenv
load_dotenv()  # Loads variables from .env into the environment
api_key = os.environ.get("ANTHROPIC_API_KEY")
```
OAuth 2.0
OAuth is a more complex authentication system used when an application needs to access a user's data on their behalf — for example, a third-party app accessing your Google Drive. OAuth involves tokens, scopes (permissions), and redirect flows. You won't typically encounter OAuth with AI APIs directly, but it's important when integrating AI into applications that connect to services like Google, Microsoft, or Slack.
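To make the flow concrete, here is a minimal sketch of the final step of the authorization-code grant: exchanging the one-time code for an access token. Every URL, ID, and secret below is a hypothetical placeholder; real values come from the provider's developer console.

```python
# The token-exchange request an app would POST after the user is
# redirected back with a one-time code. All values are placeholders.
token_url = "https://auth.example.com/oauth/token"
token_request = {
    "grant_type": "authorization_code",     # which OAuth 2.0 flow is in use
    "code": "AUTH_CODE_FROM_REDIRECT",      # one-time code from the redirect
    "client_id": "your-app-id",
    "client_secret": "your-app-secret",
    "redirect_uri": "https://yourapp.example.com/callback",
}

# The provider responds with JSON such as:
#   {"access_token": "...", "token_type": "Bearer", "expires_in": 3600}
# Subsequent API calls then send the short-lived token instead of an API key:
headers = {"Authorization": "Bearer <access_token>"}
print(sorted(token_request))
```

The key contrast with API keys: the token belongs to a user and expires, and its scopes limit what the app can do on that user's behalf.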
Rate Limiting
Every API limits how many requests you can make in a given time window. This prevents abuse and ensures fair access for all users. Rate limits are typically expressed as requests per minute (RPM) and tokens per minute (TPM) for AI APIs.
Handling rate limits gracefully:
```python
import time

import requests

def call_api_with_retry(url, headers, data, max_retries=3):
    """Make an API call with exponential backoff on rate limits."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:  # Rate limited
            wait_time = 2 ** attempt  # 1s, 2s, 4s...
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
        else:
            response.raise_for_status()
    raise Exception("Max retries exceeded")
```
JSON: The Language of APIs
Almost all modern APIs communicate using JSON (JavaScript Object Notation). JSON is a lightweight, human-readable data format that uses key-value pairs, arrays, and nested objects — structures that map directly to Python dictionaries and lists.
JSON request and response example:
```python
import requests

# Sending a JSON request body (Python dict → JSON automatically)
data = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain APIs in one sentence."},
    ],
    "temperature": 0.7,
    "max_tokens": 100,
}
response = requests.post(url, headers=headers, json=data)  # url/headers as defined earlier

# Parsing the JSON response (JSON → Python dict automatically)
result = response.json()

# Navigate the nested structure
message = result["choices"][0]["message"]["content"]
tokens_used = result["usage"]["total_tokens"]
print(f"Response: {message}")
print(f"Tokens used: {tokens_used}")
```
Making API Calls with Python
The requests library is the standard tool for making HTTP requests in Python. It handles headers, JSON encoding/decoding, error handling, and more with a clean, simple API.
Making API calls with the requests library:
```python
import os

import requests

# --- GET request: retrieve information ---
response = requests.get(
    "https://api.openai.com/v1/models",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
)
print(f"Status: {response.status_code}")  # 200 = success
models = response.json()["data"]
print(f"Available models: {len(models)}")

# --- POST request: send data and get a response ---
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
if response.status_code == 200:
    print(response.json()["choices"][0]["message"]["content"])
else:
    print(f"Error {response.status_code}: {response.text}")
```
OpenAI API
OpenAI provides API access to its GPT model family (such as GPT-4o) for text generation, as well as DALL·E for image generation and Whisper for speech-to-text. The chat completions endpoint is the core API for text generation.
Chat Completions
OpenAI chat completions with the official SDK:
```python
from openai import OpenAI

client = OpenAI()  # Reads OPENAI_API_KEY from environment

# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful coding tutor."},
        {"role": "user", "content": "Explain list comprehensions in Python."},
    ],
    temperature=0.7,
    max_tokens=500,
)
print(response.choices[0].message.content)
print(f"Tokens used: {response.usage.total_tokens}")
```
Function Calling (Tool Use)
Function calling allows GPT models to output structured JSON that maps to functions you define. Instead of just generating text, the model can decide to "call" one of your functions with specific arguments — enabling AI to interact with databases, APIs, calculators, and any other tools you provide.
OpenAI function calling:
```python
import json

from openai import OpenAI

client = OpenAI()

# Define tools the model can use
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and state, e.g. Austin, TX",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                    },
                },
                "required": ["location"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Austin?"}],
    tools=tools,
    tool_choice="auto",
)

# Check if the model wants to call a function
message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    print(f"Model wants to call: {call.function.name}")
    print(f"With arguments: {args}")
    # You would then execute get_weather(**args) and send the
    # result back to the model for a final response
```
Anthropic API
Anthropic's API provides access to the Claude model family. The Messages API is the primary endpoint for generating text with Claude. Claude is known for strong reasoning, safety-oriented behavior, long context windows, and excellent instruction following.
Messages API
Anthropic Messages API with the official SDK:
```python
import anthropic

client = anthropic.Anthropic()  # Reads ANTHROPIC_API_KEY from env

# Basic message
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="You are a helpful coding tutor.",
    messages=[
        {"role": "user", "content": "Explain decorators in Python."},
    ],
)
print(message.content[0].text)
print(f"Input tokens: {message.usage.input_tokens}")
print(f"Output tokens: {message.usage.output_tokens}")
```
Tool Use with Claude
Claude supports tool use (the equivalent of OpenAI's function calling) through the same Messages API. You define tools with JSON Schema parameters, and Claude can decide when and how to use them.
Claude tool use:
```python
import anthropic

client = anthropic.Anthropic()

# Define available tools
tools = [
    {
        "name": "get_stock_price",
        "description": "Get the current stock price for a ticker symbol",
        "input_schema": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "Stock ticker symbol, e.g. AAPL",
                },
            },
            "required": ["ticker"],
        },
    }
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What is Apple's stock price?"},
    ],
)

# Claude may respond with a tool_use content block
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}")
        print(f"Input: {block.input}")  # e.g. {"ticker": "AAPL"}
        # Execute the tool and send results back to Claude
    elif block.type == "text":
        print(f"Text: {block.text}")
```
Hugging Face API
Hugging Face hosts thousands of open-source models and provides an Inference API that lets you run them without managing infrastructure. This is particularly useful for specialized models — sentiment analysis, translation, summarization, image generation — that may not be available through OpenAI or Anthropic.
Hugging Face Inference API:
```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(token=os.environ["HF_TOKEN"])

# Text generation with an open model
response = client.text_generation(
    "The future of AI is",
    model="meta-llama/Llama-3.3-70B-Instruct",
    max_new_tokens=200,
)
print(response)

# Sentiment analysis with a specialized model
result = client.text_classification(
    "I love this product! It works perfectly.",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
)
print(result)  # [{"label": "POSITIVE", "score": 0.9998}]

# Image generation
image = client.text_to_image(
    "A serene mountain landscape at sunset, oil painting style",
    model="stabilityai/stable-diffusion-xl-base-1.0",
)
image.save("landscape.png")
```
Building an AI-Powered Application
Let's put everything together and build a simple but functional AI-powered application: a command-line research assistant that takes a topic, generates a research summary, and extracts key points into structured JSON.
A complete AI-powered research assistant:
```python
import json

import anthropic

client = anthropic.Anthropic()

def research_topic(topic: str) -> dict:
    """Generate a structured research summary on any topic."""
    # Step 1: Generate a detailed research summary
    summary_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,
        system=(
            "You are a research assistant. Provide accurate, "
            "well-organized summaries with specific facts and "
            "data points when available."
        ),
        messages=[
            {
                "role": "user",
                "content": f"Write a comprehensive research summary about: {topic}",
            },
        ],
    )
    summary = summary_response.content[0].text

    # Step 2: Extract structured data from the summary
    extraction_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": f"""Extract key information from this research
summary and return it as JSON with these keys:
- "title": a concise title
- "key_points": array of 3-5 main findings
- "applications": array of practical applications
- "challenges": array of current challenges or limitations
- "confidence": "high", "medium", or "low" based on how
  well-established the information is

Summary:
{summary}

Return only valid JSON, no other text.""",
            },
        ],
    )
    structured = json.loads(extraction_response.content[0].text)
    structured["full_summary"] = summary
    return structured

# Usage
if __name__ == "__main__":
    result = research_topic("quantum computing applications in 2026")
    print(f"Title: {result['title']}")
    print(f"Confidence: {result['confidence']}")
    print("\nKey Points:")
    for point in result["key_points"]:
        print(f"  - {point}")
    print("\nApplications:")
    for app in result["applications"]:
        print(f"  - {app}")
    print("\nChallenges:")
    for challenge in result["challenges"]:
        print(f"  - {challenge}")
```
Error Handling and Best Practices
Production applications need to handle failures gracefully. APIs can fail for many reasons — network issues, rate limits, invalid inputs, server errors, or model outages.
Robust API error handling:
```python
import time

import anthropic

client = anthropic.Anthropic()

def safe_api_call(
    messages,
    model="claude-sonnet-4-20250514",
    max_retries=3,
):
    """Make a resilient API call with retry logic."""
    for attempt in range(max_retries):
        try:
            response = client.messages.create(
                model=model,
                max_tokens=1024,
                messages=messages,
            )
            return response.content[0].text
        except anthropic.RateLimitError:
            wait = 2 ** attempt
            print(f"Rate limited. Retrying in {wait}s...")
            time.sleep(wait)
        except anthropic.APIConnectionError:
            wait = 2 ** attempt
            print(f"Connection error. Retrying in {wait}s...")
            time.sleep(wait)
        except anthropic.BadRequestError as e:
            # Don't retry client errors — fix the request
            print(f"Bad request: {e}")
            raise
        except anthropic.APIStatusError as e:
            if e.status_code >= 500:
                wait = 2 ** attempt
                print(f"Server error. Retrying in {wait}s...")
                time.sleep(wait)
            else:
                raise
    raise Exception(f"Failed after {max_retries} retries")
```
API Best Practices
Security
- ✓ Store API keys in environment variables, never in code
- ✓ Use .env files locally and secrets managers in production
- ✓ Add .env to your .gitignore file
- ✓ Rotate keys periodically and immediately if compromised
Reliability
- ✓ Always implement retry logic with exponential backoff
- ✓ Set reasonable timeouts on all API requests
- ✓ Log API errors and response times for monitoring
- ✓ Validate API responses before using them in your application
Cost Management
- ✓ Set spending limits in your API provider's dashboard
- ✓ Use cheaper models for simple tasks, reserve powerful models for complex ones
- ✓ Cache responses when the same input might recur
- ✓ Monitor token usage and set max_tokens to avoid runaway costs
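The caching tip can be sketched as a small in-memory lookup. In this sketch, `call_model` is a hypothetical stand-in for a real, billable API call; in practice you would also bound the cache's size and age.

```python
# In-memory response cache keyed by prompt. call_model stands in
# for a real (billable) API request.
cache: dict[str, str] = {}
api_calls = 0

def call_model(prompt: str) -> str:
    global api_calls
    api_calls += 1  # counts how often we actually pay for a request
    return f"response to: {prompt}"

def cached_call(prompt: str) -> str:
    if prompt not in cache:  # only hit the API on a cache miss
        cache[prompt] = call_model(prompt)
    return cache[prompt]

cached_call("summarize this report")
cached_call("summarize this report")  # repeat input: served from cache
print(api_calls)  # 1
```

Note that caching only pays off for deterministic or repeat-heavy workloads; with a high temperature, identical prompts are expected to produce different outputs.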
Performance
- ✓ Use streaming for long responses to improve perceived latency
- ✓ Batch requests when processing multiple items
- ✓ Use async/await (asyncio) for concurrent API calls in Python
- ✓ Keep prompts concise — fewer input tokens means faster and cheaper responses
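The async tip can be sketched with a simulated call, so the example runs without keys or network access. With a real provider you would use its async client (for example, `AsyncOpenAI` or `AsyncAnthropic`) in place of the hypothetical `fake_api_call` here.

```python
import asyncio
import time

# fake_api_call simulates a slow API request so the sketch runs offline.
async def fake_api_call(prompt: str) -> str:
    await asyncio.sleep(0.1)  # stands in for network latency
    return f"result for: {prompt}"

async def main() -> list[str]:
    prompts = ["first", "second", "third"]
    # gather() runs all three "requests" concurrently, so total time
    # is roughly one call's latency rather than three in sequence.
    return await asyncio.gather(*(fake_api_call(p) for p in prompts))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f}s")  # close to 0.1s, not 0.3s
```

The same pattern scales to dozens of concurrent requests, but pair it with the rate-limit handling above: concurrency makes it much easier to hit your RPM and TPM ceilings.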
Resources
- OpenAI API Documentation (OpenAI) — Complete reference for the OpenAI API including chat completions, function calling, embeddings, fine-tuning, and best practices. Includes interactive examples.
- Anthropic API Documentation (Anthropic) — Full documentation for the Claude API including the Messages API, tool use, streaming, vision capabilities, and prompt engineering guides.
- Hugging Face Documentation (Hugging Face) — Documentation for the Hugging Face ecosystem: the Inference API, model hub, datasets, and the transformers library for running models locally.
- Building Systems with the ChatGPT API (DeepLearning.AI, Andrew Ng & Isa Fulford) — Free short course covering how to build multi-step AI systems using the OpenAI API, including chaining calls, evaluating outputs, and handling edge cases.
Key Takeaways
1. An API is a contract between software systems — you send a structured request and receive a structured response. REST APIs using HTTP methods (GET, POST) are the standard.
2. API keys are the most common authentication method for AI APIs. Never hardcode them — use environment variables and add .env to your .gitignore.
3. Rate limiting protects APIs from overuse. Always implement retry logic with exponential backoff to handle 429 (rate limit) responses gracefully.
4. The OpenAI and Anthropic APIs follow similar patterns: send messages with a model name, system prompt, and conversation history; receive generated text. Both support tool use for structured interactions.
5. Hugging Face provides API access to thousands of specialized open-source models for tasks like sentiment analysis, translation, and image generation.
6. Production AI applications require robust error handling, cost monitoring, response validation, and security practices around API key management.
7. Prompt chaining in code — making sequential API calls where each builds on the previous result — is the fundamental pattern behind most AI-powered applications.