OpenAI API Guide — GPT-4, DALL-E, and Whisper Integration
The OpenAI API provides access to GPT-4, DALL-E, and Whisper models for text generation, image creation, and speech-to-text in a single unified interface.
What You’ll Learn
- How to obtain and secure API keys for OpenAI services
- Building chat completions with GPT-4 including streaming and function calling
- Generating images with DALL-E 3 and converting speech with Whisper
- Managing tokens, understanding pricing, and handling rate limits
Why the OpenAI API Matters
OpenAI’s models power millions of applications worldwide. From customer support chatbots to code assistants and content generation tools, the API is one of the most widely adopted AI integration points. DodaTech’s Doda Browser uses GPT-4 for inline page summarization, and Durga Antivirus Pro uses embeddings for semantic malware signature matching, which makes OpenAI API skills valuable for modern developers.
[Diagram: API Key & Authentication feeds four endpoints: Chat Completions (GPT-4), Images (DALL-E 3), Audio (Whisper), and Embeddings (text-embedding-3). Chat Completions branches into Streaming & Function Calling and Token Counting & Pricing.]
Getting Started with API Keys
Every OpenAI API call requires an API key. Sign up at platform.openai.com, navigate to API keys, and create a new secret key. Store it as an environment variable — never hardcode keys in source code.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
Expected output: No output; the client initializes silently. If OPENAI_API_KEY is missing, Python raises KeyError.
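The KeyError raised by a bare os.environ lookup says little about how to fix the problem. The sketch below wraps the lookup in a hypothetical helper, load_api_key (an illustrative name, not part of the OpenAI SDK), that fails with a more descriptive message. It assumes the key lives in OPENAI_API_KEY, as in the snippet above.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment or raise a descriptive error.

    Hypothetical helper for illustration; it is not part of the OpenAI SDK.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell or load it from a "
            "secrets manager; do not hardcode it in source files."
        )
    return key


# Usage (assumes the openai package is installed):
#   client = OpenAI(api_key=load_api_key())
```

Passing a plain dict as env makes the helper easy to exercise in tests without touching the real environment.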
Chat Completions with GPT-4
The chat completions endpoint is the core of OpenAI’s text generation. You send a list of messages and receive a model response.
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful Python tutor."},
        {"role": "user", "content": "Explain list comprehensions in one sentence."},
    ],
)
print(response.choices[0].message.content)
Expected output (wording will vary between runs):
A list comprehension is a concise way to create lists by applying an expression to each item in an iterable, optionally filtering with a condition.
Streaming Responses
For real-time applications, stream tokens as they arrive instead of waiting for the full response.
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Count from 1 to 5."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
Expected output:
1, 2, 3, 4, 5
The stream yields delta objects containing partial content. This pattern is used in Doda Browser’s live page summarization feature, where text appears progressively as the model generates it.
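To make the delta handling concrete without calling the API, the sketch below pulls the accumulation logic into a small generator and feeds it stand-in chunks. The names iter_text and fake_chunk are hypothetical; the fake objects only mimic the choices[0].delta.content attribute path used above, and a real stream would be passed in their place.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the text fragments from a chat-completions stream.

    Skips chunks with no choices and deltas with no content, such as a
    role-only first delta or a final chunk that only signals completion.
    """
    for chunk in chunks:
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        if content:
            yield content


def fake_chunk(text):
    """Build a stand-in chunk with the same attribute shape (testing only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Simulated stream; with the real client you would pass `stream` instead.
simulated = [
    fake_chunk(None),
    fake_chunk("1, 2, "),
    fake_chunk("3"),
    SimpleNamespace(choices=[]),
]
full_text = "".join(iter_text(simulated))
print(full_text)  # prints "1, 2, 3"
```

Separating iteration from printing lets the same generator drive a terminal, a web socket, or a test.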
Function Calling
Function calling lets GPT-4 request structured data from your application. Define a function schema and the model will output a JSON object when it needs to call that function.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current temperature for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        },
    }
]
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in London?"}],
    tools=tools,
    tool_choice="auto",
)
# tool_calls is None when the model answers directly instead of calling a tool.
print(response.choices[0].message.tool_calls[0].function)
Expected output:
Function(name='get_weather', arguments='{"city":"London","unit":"celsius"}')
The model decides when to call the function and returns structured arguments you can execute against your own data sources. Durga Antivirus Pro uses this pattern to let GPT-4 query internal threat databases when analyzing security incidents.
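The model only returns the function name and a JSON string of arguments; running the function is up to your code. The sketch below shows one way to dispatch a tool call and build the tool message to send back. The get_weather body, the LOCAL_TOOLS registry, and run_tool_call are hypothetical, the returned temperature is a hard-coded placeholder, and a SimpleNamespace stands in for the real tool-call object.

```python
import json
from types import SimpleNamespace


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Placeholder implementation; a real one would query a weather service."""
    return {"city": city, "unit": unit, "temperature": 18}


# Map tool names from the schema to local callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_call(tool_call) -> dict:
    """Execute one tool call and build the `tool` message to send back."""
    name = tool_call.function.name
    if name not in LOCAL_TOOLS:
        raise ValueError(f"Model requested unknown tool: {name}")
    # The arguments arrive as a JSON string, not a dict.
    arguments = json.loads(tool_call.function.arguments)
    result = LOCAL_TOOLS[name](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    }


# Stand-in for response.choices[0].message.tool_calls[0]
fake_call = SimpleNamespace(
    id="call_123",
    function=SimpleNamespace(
        name="get_weather",
        arguments='{"city":"London","unit":"celsius"}',
    ),
)
tool_message = run_tool_call(fake_call)
print(tool_message["content"])
```

To complete the round trip, append the assistant message that contained the tool call, then this tool message, to the messages list and call the chat completions endpoint again so the model can phrase a final answer.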
DALL-E 3 Image Generation
Generate images from text descriptions using DALL-E 3.
image = client.images.generate(
    model="dall-e-3",
    prompt="A futuristic city skyline at sunset with flying cars",
    size="1024x1024",
    quality="standard",
    n=1,
)
print(image.data[0].url)
Expected output: A URL string pointing to the generated image, valid for approximately one hour.
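Because the returned URL expires, applications that need to keep the image usually persist it themselves. One option is to request the image as base64 (response_format="b64_json") and write the decoded bytes to disk. The sketch below shows only the decoding and saving step; save_b64_image is a hypothetical helper and the file name is illustrative.

```python
import base64
from pathlib import Path


def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a base64 image payload and write it to `path`.

    Returns the number of bytes written.
    """
    raw = base64.b64decode(b64_data, validate=True)
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(raw)
    return len(raw)


# With the real client (not executed here):
#   image = client.images.generate(
#       model="dall-e-3",
#       prompt="A futuristic city skyline at sunset with flying cars",
#       response_format="b64_json",
#   )
#   save_b64_image(image.data[0].b64_json, "skyline.png")
```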
Whisper Speech-to-Text
Transcribe audio files into text using the Whisper model.
# Use a context manager so the file handle is closed after the upload.
with open("meeting_recording.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcript.text)
Expected output: The transcribed text from the audio file. Whisper supports MP3, WAV, M4A, and other common formats.
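A quick local check before uploading can save a failed request. The sketch below validates the file extension and size; check_audio_file is a hypothetical helper, and both the 25 MB limit and the extension list are assumptions based on the API documentation at the time of writing, so confirm them against the current reference.

```python
from pathlib import Path

# Assumed limits; confirm against the current API documentation.
MAX_BYTES = 25 * 1024 * 1024
ALLOWED_SUFFIXES = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def check_audio_file(path: str, max_bytes: int = MAX_BYTES) -> Path:
    """Return the path if the file looks uploadable, else raise ValueError."""
    audio = Path(path)
    if not audio.is_file():
        raise ValueError(f"No such file: {audio}")
    if audio.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported extension: {audio.suffix}")
    size = audio.stat().st_size
    if size > max_bytes:
        raise ValueError(f"{audio.name} is {size} bytes; limit is {max_bytes}")
    return audio


# Usage (not executed here):
#   with open(check_audio_file("meeting_recording.mp3"), "rb") as audio_file:
#       ...
```

Longer recordings that exceed the limit would need to be split or compressed before upload.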
Embeddings for Semantic Search
Embeddings convert text into vector representations for semantic search and clustering.
response = client.embeddings.create(
model="text-embedding-3-small",
input="Doda Browser is a fast and private web browser"
)
vector = response.data[0].embedding
print(f"Vector dimension: {len(vector)}")
print(f"First 5 values: {vector[:5]}")
Expected output:
Vector dimension: 1536
First 5 values: [-0.008327245, 0.02189445, -0.001234567, 0.03456789, -0.01234567]
Durga Antivirus Pro uses embeddings to compare malware signatures semantically, finding threats that match the intent of known patterns rather than exact byte sequences.
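Semantic search over embeddings usually comes down to comparing vectors with cosine similarity and sorting by score. The sketch below does this in plain Python with toy three-dimensional vectors standing in for real 1536-dimensional embeddings; the corpus labels and values are made up for illustration, and a production system would more likely use NumPy or a vector database.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank(query: Sequence[float],
         corpus: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors; real ones would come from client.embeddings.create(...).
corpus = {
    "browser": [0.9, 0.1, 0.0],
    "antivirus": [0.0, 0.2, 0.9],
    "archiver": [0.1, 0.9, 0.1],
}
query = [1.0, 0.0, 0.0]
print(rank(query, corpus)[0][0])  # prints "browser"
```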
Token Counting and Pricing
OpenAI charges per token. Use tiktoken to count tokens before sending requests.
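Because billing is per token, a cost estimate is simple arithmetic once you have token counts. The sketch below assumes separate per-1,000-token prices for input and output; estimate_cost is a hypothetical helper, and the prices in the example are placeholders for illustration, not current rates, so look up the real figures on the pricing page.

```python
def estimate_cost(prompt_tokens: int,
                  completion_tokens: int,
                  input_price_per_1k: float,
                  output_price_per_1k: float) -> float:
    """Estimate request cost in dollars from token counts and per-1K prices."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("Token counts must be non-negative")
    return (prompt_tokens / 1000) * input_price_per_1k + \
           (completion_tokens / 1000) * output_price_per_1k


# Placeholder prices for illustration only; check the current pricing page.
cost = estimate_cost(1200, 300, input_price_per_1k=0.03, output_price_per_1k=0.06)
print(f"${cost:.4f}")  # prints "$0.0540"
```

The token counts can come from tiktoken, as in the snippet that follows, or from the usage field of a completed response.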
import tiktoken
encoding = tiktoken.encoding_for_model("gpt-4")
tokens = encoding.encode("DodaZIP compresses files efficiently.")
print(f"Token count: {len(tokens)}")
print(f"Tokens: {[encoding.decode([t]) for t in tokens]}")
Expected output (illustrative; the exact count and token boundaries depend on the encoding):
Token count: 6
Tokens: ['D', 'oda', 'Z', 'IP', ' compresses', ' files efficiently.']
Common Errors
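1. Insufficient Quota (insufficient_quota)
You’ve exhausted your credit or reached your billing limit. The API responds with HTTP 429 and the error code insufficient_quota, which the Python SDK raises as RateLimitError. Check your usage at platform.openai.com/account/usage and add credit or raise limits.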
2. RateLimitError
You are sending requests or tokens faster than your tier allows. OpenAI imposes tiered rate limits that vary by model (e.g., 500 RPM for Tier 1). Implement exponential backoff with tenacity or a similar retry library.
3. AuthenticationError
The API key is invalid, missing, or revoked. Verify OPENAI_API_KEY is set correctly and hasn’t been rotated.
4. BadRequestError (InvalidRequestError in SDK versions before 1.0): Context Length Exceeded
Your prompt plus the requested completion exceeds the model’s context window. GPT-4 has 8K, 32K, and 128K variants. Truncate messages or switch to a larger context model.
5. Model Not Found
The model name is incorrect or your account lacks access to it. Access to gpt-4 can depend on your account’s billing status and usage tier. Use gpt-3.5-turbo as a fallback.
6. Timeout Error
The request took longer than your timeout setting. For long generations, raise the timeout parameter on the client or on the individual request, or switch to streaming.
7. Content Policy Violation
The prompt or generated output triggered OpenAI’s content filter. Review the safety guidelines and adjust your prompt.
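Several of the errors above (rate limits and timeouts in particular) are transient and are usually handled by retrying with exponential backoff. The sketch below is a standard-library version of that idea rather than the tenacity approach mentioned earlier; with_backoff and its parameters are hypothetical, and the exception types to retry on are supplied by the caller.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_backoff(call: Callable[[], T],
                 retry_on: Tuple[Type[BaseException], ...],
                 max_attempts: int = 5,
                 base_delay: float = 1.0,
                 max_delay: float = 30.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return call()
        except retry_on:
            attempt += 1
            if attempt >= max_attempts:
                raise  # out of attempts; surface the last error
            # Delay doubles each retry (1s, 2s, 4s, ...) up to max_delay,
            # plus up to 10% jitter so clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))


# Usage with the real client (not executed here):
#   import openai
#   response = with_backoff(
#       lambda: client.chat.completions.create(model="gpt-4", messages=messages),
#       retry_on=(openai.RateLimitError, openai.APITimeoutError),
#   )
```

Retrying does not help when the quota is exhausted, so it may be worth inspecting the error before retrying in that case.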
Practice Questions
- What environment variable should hold your OpenAI API key?
- How does streaming differ from standard chat completions?
- What is the purpose of function calling in GPT-4?
- Which model should you use for semantic search embeddings?
- How do you count tokens for a GPT-4 request before sending it?
Answers:
- OPENAI_API_KEY. Never hardcode keys in source files.
- Streaming returns tokens incrementally via delta objects, enabling real-time display without waiting for the full response.
- Function calling lets the model output structured JSON to invoke external tools or APIs, connecting GPT-4 to your own data sources.
- text-embedding-3-small (1536 dimensions) or text-embedding-3-large (3072 dimensions) for higher precision.
- Use tiktoken.encoding_for_model("gpt-4").encode(text) to get the token count.
Challenge: DodaZIP needs a feature that summarizes compressed file contents using GPT-4. Design a function that reads a file, truncates it to fit the context window, sends it to the API, and returns the summary with token usage statistics.
Mini Project: AI-Powered Chat Assistant
Build a CLI chat assistant that streams GPT-4 responses and keeps the conversation history for the session:
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
messages = [{"role": "system", "content": "You are a helpful assistant."}]
print("AI Chat Assistant (type 'quit' to exit)")
while True:
    user = input("\nYou: ")
    if user.lower() == "quit":
        break
    messages.append({"role": "user", "content": user})
    stream = client.chat.completions.create(
        model="gpt-4", messages=messages, stream=True
    )
    print("Assistant: ", end="")
    reply = ""
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
            reply += chunk.choices[0].delta.content
    messages.append({"role": "assistant", "content": reply})
Try it: Run the script and have a conversation. The assistant remembers context within the session. Extend it by adding a save/load feature to persist conversations across sessions using JSON file storage.
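One thing the loop above does not handle is that the messages list grows with every turn and will eventually exceed the model’s context window. The sketch below keeps the system prompt plus the most recent messages that fit in a token budget. Both trim_history and rough_token_count are hypothetical helpers; the four-characters-per-token estimate is a crude assumption, and a tiktoken-based counter could be passed in for accuracy.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def rough_token_count(message: Message) -> int:
    """Crude estimate: about four characters per token plus a small overhead.

    This is an approximation; pass a tiktoken-based counter for accuracy.
    """
    return len(message["content"]) // 4 + 4


def trim_history(messages: List[Message],
                 budget: int,
                 count: Callable[[Message], int] = rough_token_count) -> List[Message]:
    """Keep system messages plus the newest messages that fit in `budget`."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    used = sum(count(m) for m in system)
    kept: List[Message] = []
    # Walk backwards from the newest message and stop at the first that
    # would push the total over the budget.
    for message in reversed(rest):
        cost = count(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return system + list(reversed(kept))


# Usage inside the chat loop (budget value is illustrative):
#   messages = trim_history(messages, budget=6000)
```

Dropping the oldest turns is the simplest policy; summarizing them into a single message is a common alternative when earlier context still matters.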
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.