VoIP Explained — Beginner's Guide
Voice over IP (VoIP) is the technology that converts your voice into digital data packets and transmits them over an IP network — replacing traditional telephone lines with internet-based calling that’s cheaper, more flexible, and packed with features.
What You’ll Learn
- How VoIP converts voice into packets and back
- The SIP protocol’s role in call setup and tear-down
- Audio codecs and their bandwidth requirements
- A complete VoIP call flow from dial to hang-up
Why VoIP Matters
VoIP has transformed how the world communicates. Over 60% of business phone systems now use VoIP. Consumer apps like WhatsApp, Skype, and Zoom are built on VoIP. It has made international calling nearly free and enabled video conferencing, screen sharing, and unified communications.
Doda Browser uses VoIP-inspired real-time communication patterns for its peer-to-peer features. Durga Antivirus Pro leverages VoIP-quality audio codec principles for processing compressed audio files during malware scanning.
Learning Path
flowchart LR
A[Telecom Basics] --> B[Network Protocols]
B --> C[VoIP & SIP<br/>You are here]
C --> D[4G/LTE Architecture]
D --> E[5G Networks]
What Is VoIP?
Imagine sending a letter, but instead of writing the whole message on one page, you cut it into tiny strips, number each strip, mail them separately, and the recipient puts them back in order. That’s VoIP.
Your voice is analog (continuous sound waves). VoIP:
- Digitizes: Converts your voice into digital samples (8,000 times per second)
- Compresses: Codecs reduce the data size
- Packetizes: Frames of voice data are wrapped into IP packets
- Transmits: Packets travel over the network
- Reconstructs: The receiver reorders packets (some may arrive out of order) and plays them back
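The packetize and reconstruct steps above can be sketched in a few lines of Python. This is a toy illustration rather than a real VoIP stack: the sample values are made up, `packetize` and `reconstruct` are hypothetical helper names, and 160 samples per frame assumes 8,000 samples per second with 20 ms frames.

```python
import random

SAMPLE_RATE = 8000                                   # samples per second
FRAME_MS = 20                                        # assumed frame length
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000   # 160 samples

def packetize(samples):
    """Split samples into numbered frames: (sequence_number, frame)."""
    starts = range(0, len(samples), SAMPLES_PER_FRAME)
    return [(seq, samples[s:s + SAMPLES_PER_FRAME]) for seq, s in enumerate(starts)]

def reconstruct(packets):
    """Reorder packets by sequence number and join the frames."""
    ordered = sorted(packets, key=lambda p: p[0])
    return [sample for _, frame in ordered for sample in frame]

samples = list(range(800))        # stand-in for 100 ms of digitized audio
packets = packetize(samples)
random.shuffle(packets)           # simulate out-of-order arrival
restored = reconstruct(packets)
print(len(packets), restored == samples)
```

Real receivers also have to cope with packets that never arrive, which this sketch ignores.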
Analogy: VoIP Is Like a Jigsaw Puzzle
Think of a traditional phone call as a single long photograph of your voice. The whole image is transmitted at once.
VoIP is like cutting that photograph into 1,000 puzzle pieces, mailing each piece separately, and having the recipient reassemble them. If a few pieces are missing, the image is still understandable — you just lose some detail.
The VoIP Stack
A VoIP call uses multiple protocols working together:
| Layer | Protocol | Job |
|---|---|---|
| Signaling | SIP | Sets up, manages, and tears down calls |
| Media | RTP (Real-Time Transport Protocol) | Carries the actual voice packets |
| Quality | RTCP (RTP Control Protocol) | Monitors packet loss, jitter, latency |
| Negotiation | SDP (Session Description Protocol) | Agrees on codecs, IP, ports |
| Transport | UDP (User Datagram Protocol) | Delivers packets without retransmission delays; a lost packet is skipped rather than resent |
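To make the Media row more concrete, here is a minimal sketch of the 12-byte fixed RTP header defined in RFC 3550. `build_rtp_header` is a hypothetical helper, the field values are arbitrary, and optional parts of the header (CSRC list, extensions, padding) are left out.

```python
import struct

def build_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack the fixed RTP header: version 2, no padding, no extension, CC=0."""
    byte0 = 2 << 6                                       # version bits only
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)   # marker + payload type
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

# Payload type 0 is G.711 mu-law (PCMU); the other values are made up.
header = build_rtp_header(payload_type=0, seq=4523, timestamp=16000, ssrc=0x1234ABCD)
print(len(header), header.hex())
```

The encoded voice frame follows directly after these 12 bytes, and the whole thing is sent as the payload of a UDP datagram.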
Audio Codecs Explained
A codec (coder-decoder) converts analog voice to digital and back. Different codecs balance quality vs bandwidth.
| Codec | Bitrate | Quality | Use Case |
|---|---|---|---|
| G.711 (PCMU/PCMA) | 64 kbps | Good (toll quality, narrowband) | Traditional VoIP, business phones |
| G.729 | 8 kbps | Good | Low-bandwidth links, international |
| G.722 (HD Voice) | 64 kbps | Excellent | High-definition voice, video apps |
| Opus | 6-510 kbps | Excellent | WebRTC, modern apps (WhatsApp, Zoom) |
| G.723.1 | 5.3/6.3 kbps | Fair | Very low bandwidth, legacy systems |
Why Codecs Matter
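- Bandwidth: A G.711 call uses 64 kbps for audio + 16 kbps for IP/UDP/RTP headers (40 bytes on each of 50 packets per second) = 80 kbps at the IP layer. Link-layer framing such as Ethernet adds a few more kbps, which is why planning figures of ~85 kbps per direction are common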
- Quality: G.711 is essentially toll-quality (same as PSTN). Opus at higher bitrates sounds better than a traditional phone
- Trade-off: Lower bitrate = more compression = more CPU processing = potentially lower quality
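# Calculate IP-layer bandwidth for one direction of a VoIP call
HEADER_BYTES = 40      # IPv4 (20) + UDP (8) + RTP (12) per packet
PACKETS_PER_SEC = 50   # one packet per 20 ms of audio

def call_bandwidth_kbps(codec_kbps):
    # Header overhead is fixed per packet, not a percentage of the codec bitrate
    overhead_kbps = HEADER_BYTES * 8 * PACKETS_PER_SEC / 1000
    return codec_kbps + overhead_kbps

print(f"G.711 call bandwidth: {call_bandwidth_kbps(64):.1f} kbps")
# Output: G.711 call bandwidth: 80.0 kbps
# For Opus at 32 kbps
print(f"Opus call bandwidth: {call_bandwidth_kbps(32):.1f} kbps")
# Output: Opus call bandwidth: 48.0 kbps
How a VoIP Call Works: Step by Step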
Let’s trace Alice calling Bob using a VoIP app:
Step 1: Registration
Alice’s phone registers with the SIP server:
REGISTER sip:voip-provider.com SIP/2.0
From: <sip:alice@voip-provider.com>
To: <sip:alice@voip-provider.com>
Contact: <sip:alice@192.168.1.10:5060>
The SIP server notes: “Alice is reachable at 192.168.1.10:5060.”
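The bookkeeping the server does at this step can be sketched as a lookup table from Alice's public address to her current contact. This is a simplified illustration: `parse_register` and `location_table` are hypothetical names, and a real registrar also handles authentication, expiry times, and multiple contacts per user.

```python
import re

def parse_register(message):
    """Return (address_of_record, contact_uri) from a REGISTER request."""
    to_match = re.search(r"^To:\s*<([^>]+)>", message, re.MULTILINE)
    contact_match = re.search(r"^Contact:\s*<([^>]+)>", message, re.MULTILINE)
    if not (to_match and contact_match):
        raise ValueError("REGISTER needs To and Contact headers")
    return to_match.group(1), contact_match.group(1)

register = """REGISTER sip:voip-provider.com SIP/2.0
From: <sip:alice@voip-provider.com>
To: <sip:alice@voip-provider.com>
Contact: <sip:alice@192.168.1.10:5060>
"""

location_table = {}                       # address-of-record -> current contact
aor, contact = parse_register(register)
location_table[aor] = contact
print(location_table)
```

When an INVITE for `sip:alice@voip-provider.com` arrives later, the server looks up this table to find where to forward it.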
Step 2: Call Initiation
Alice dials Bob’s number. Her phone sends an INVITE to the SIP server:
INVITE sip:bob@voip-provider.com SIP/2.0
Via: SIP/2.0/UDP 192.168.1.10:5060
From: <sip:alice@voip-provider.com>
To: <sip:bob@voip-provider.com>
Call-ID: abc123@192.168.1.10
CSeq: 1 INVITE
Content-Type: application/sdp
v=0
o=alice 12345 67890 IN IP4 192.168.1.10
c=IN IP4 192.168.1.10
m=audio 40000 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
This says: “I want to call Bob. I can receive audio at 192.168.1.10 on port 40000. I support G.711 μ-law (payload type 0), G.711 A-law (8), and telephone events (101).”
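Reading the SDP body is mostly a matter of splitting `key=value` lines. The sketch below pulls out the three things the other side needs: the address, the port, and the payload types. `parse_sdp_audio` is a hypothetical helper that handles only the line types shown in this example, not the full SDP grammar.

```python
def parse_sdp_audio(sdp):
    """Extract connection address, audio port, and payload types from SDP text."""
    info = {"address": None, "port": None, "payload_types": [], "rtpmap": {}}
    for line in sdp.strip().splitlines():
        key, _, value = line.strip().partition("=")
        if key == "c":
            info["address"] = value.split()[-1]        # "IN IP4 <address>"
        elif key == "m" and value.startswith("audio"):
            fields = value.split()                      # audio <port> <proto> <pt>...
            info["port"] = int(fields[1])
            info["payload_types"] = [int(pt) for pt in fields[3:]]
        elif key == "a" and value.startswith("rtpmap:"):
            pt, encoding = value[len("rtpmap:"):].split(maxsplit=1)
            info["rtpmap"][int(pt)] = encoding
    return info

offer = """v=0
o=alice 12345 67890 IN IP4 192.168.1.10
c=IN IP4 192.168.1.10
m=audio 40000 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""
print(parse_sdp_audio(offer))
```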
Step 3: Routing and Ringing
The SIP server looks up Bob’s location and forwards the INVITE. Bob’s phone rings:
SIP/2.0 180 Ringing
From: <sip:alice@voip-provider.com>
To: <sip:bob@voip-provider.com>
Step 4: Answer
Bob answers. His phone sends back a 200 OK with its SDP:
SIP/2.0 200 OK
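From: <sip:alice@voip-provider.com>
To: <sip:bob@voip-provider.com>
Content-Type: application/sdp
v=0
o=bob 54321 98765 IN IP4 10.0.0.5
c=IN IP4 10.0.0.5
m=audio 40002 RTP/AVP 0 101
a=rtpmap:101 telephone-event/8000
Bob says: “I’m answering. Send audio to 10.0.0.5:40002. I’ll use G.711 μ-law (payload type 0) and telephone events.” As in every SIP response, the From and To headers are copied from the request, so From still names Alice.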
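The outcome of this offer/answer exchange can be sketched as picking the first audio payload type in Bob's answer that Alice also offered. `negotiate` is a hypothetical helper; it treats payload type 101 as the telephone-event type used in this example and skips it because it carries DTMF, not voice.

```python
PAYLOAD_NAMES = {0: "PCMU (G.711 mu-law)", 8: "PCMA (G.711 A-law)"}
TELEPHONE_EVENT_PT = 101     # DTMF events in this example, not an audio codec

def negotiate(offered, answered):
    """Return the first audio payload type from the answer that was also offered."""
    common = [pt for pt in answered
              if pt in offered and pt != TELEPHONE_EVENT_PT]
    if not common:
        raise ValueError("no common audio codec")
    return common[0]

alice_offer = [0, 8, 101]    # from Alice's m= line
bob_answer = [0, 101]        # from Bob's m= line
selected = negotiate(alice_offer, bob_answer)
print(selected, PAYLOAD_NAMES[selected])
```

If the two lists had no audio codec in common, the call could not carry voice and would normally be rejected.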
Step 5: Confirmation
Alice’s phone sends ACK, and media begins flowing:
ACK sip:bob@voip-provider.com SIP/2.0
Step 6: Conversation
RTP packets flow in both directions:
Alice: [RTP packet: sequence 4523, timestamp 16000, 20ms of G.711 audio]
Bob: [RTP packet: sequence 8721, timestamp 32000, 20ms of G.711 audio]
Each packet contains 20 milliseconds of audio. That’s 50 packets per second in each direction.
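The sequence numbers and timestamps in these packets advance in a predictable way. A minimal sketch, assuming the 8,000 Hz RTP clock that G.711 uses and 20 ms packets (`rtp_sequence` is a hypothetical helper name):

```python
CLOCK_RATE = 8000      # RTP clock for G.711, in Hz
FRAME_MS = 20          # audio carried per packet

def rtp_sequence(first_seq, first_timestamp, count):
    """Yield (sequence, timestamp) pairs for consecutive 20 ms packets."""
    step = CLOCK_RATE * FRAME_MS // 1000      # 160 timestamp units per packet
    for i in range(count):
        # Sequence numbers wrap at 16 bits, timestamps at 32 bits
        yield ((first_seq + i) % 2**16, (first_timestamp + i * step) % 2**32)

packets = list(rtp_sequence(4523, 16000, 3))
print(packets)
print(1000 // FRAME_MS, "packets per second")
```

The sequence number lets the receiver detect loss and reordering, while the timestamp tells it when each frame should be played.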
Step 7: Call Termination
Bob hangs up. His phone sends BYE:
BYE sip:alice@voip-provider.com SIP/2.0
From: <sip:bob@voip-provider.com>
To: <sip:alice@voip-provider.com>
Alice’s phone acknowledges with 200 OK. Call is complete.
sequenceDiagram
Alice->>SIP Server: REGISTER
Bob->>SIP Server: REGISTER
Alice->>SIP Server: INVITE (call Bob)
SIP Server->>Bob: INVITE
Bob-->>SIP Server: 180 Ringing
SIP Server-->>Alice: 180 Ringing
Bob-->>SIP Server: 200 OK (answer)
SIP Server-->>Alice: 200 OK
Alice->>Bob: ACK
Note over Alice,Bob: RTP Voice Stream
Bob->>SIP Server: BYE (hang up)
SIP Server->>Alice: BYE
Alice-->>SIP Server: 200 OK
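From Alice's point of view, the call flow above behaves like a small state machine. The sketch below is one way to model it; the state names are invented for illustration and are not the transaction states defined by the SIP specification.

```python
# (current state, SIP message seen) -> next state, from the caller's side
TRANSITIONS = {
    ("idle", "INVITE"): "calling",
    ("calling", "180 Ringing"): "ringing",
    ("ringing", "200 OK"): "answered",
    ("answered", "ACK"): "in_call",
    ("in_call", "BYE"): "terminating",
    ("terminating", "200 OK"): "idle",
}

def run_call(events, state="idle"):
    """Walk through SIP events and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {event!r} in state {state!r}")
        state = TRANSITIONS[key]
    return state

flow = ["INVITE", "180 Ringing", "200 OK", "ACK", "BYE", "200 OK"]
final_state = run_call(flow)
print(final_state)
```

Feeding it an out-of-order message, such as a BYE before the call is answered, raises an error instead of silently continuing.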
Bandwidth Requirements
For a quality VoIP call, you need:
| Codec | Bandwidth (one direction) | Total (both directions) |
|---|---|---|
| G.711 | 85 kbps | 170 kbps |
| G.729 | 30 kbps | 60 kbps |
| Opus (32k) | ~53 kbps | ~106 kbps |
| HD Voice (G.722) | 85 kbps | 170 kbps |
| Video (720p) | 1-2 Mbps | 2-4 Mbps |
Factors Affecting Call Quality
- Latency: Under 150 ms one-way is good. Over 300 ms causes noticeable delay
- Jitter: Variation in packet arrival time. Jitter buffers help but add delay
- Packet loss: Under 1% is acceptable. Over 3% causes audible artifacts
- Bandwidth: Must be sufficient for the codec + overhead
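Jitter can be estimated from packet send and arrival times. The sketch below follows the running estimate described in RFC 3550, where each new difference in transit time moves the estimate by one sixteenth. The timing values are made up, and `interarrival_jitter` is a hypothetical helper name.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate: J += (|D| - J) / 16 for each new packet."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        transit_prev = recv_times_ms[i - 1] - send_times_ms[i - 1]
        transit_now = recv_times_ms[i] - send_times_ms[i]
        d = abs(transit_now - transit_prev)   # change in one-way transit time
        jitter += (d - jitter) / 16
    return jitter

send = [0, 20, 40, 60, 80]                    # one packet every 20 ms
recv_steady = [50, 70, 90, 110, 130]          # constant 50 ms transit
recv_bursty = [50, 78, 90, 125, 131]          # transit time varies

print(interarrival_jitter(send, recv_steady))
print(interarrival_jitter(send, recv_bursty))
```

A constant delay, however large, gives zero jitter. Only the variation between packets counts.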
Real-World Use: Enterprise VoIP Migration
A company with 500 employees decides to switch from a traditional PBX to VoIP:
Before (Traditional PBX)
- 100 analog phone lines
- $50,000/year in line rental
- $15,000/year in long-distance charges
- Limited to voice only
- Difficult to add/remove users
After (VoIP)
- SIP trunk with 100 concurrent call capacity
- $12,000/year for SIP trunk
- $2,000/year in long-distance (most calls are now free)
- Voice, video, conferencing, presence, unified messaging
- Add users in minutes via web portal
- Softphones for remote workers
Savings: about 78% on telecommunications costs ($65,000 down to $14,000 per year)
Security Angle
VoIP security is critical. Common threats include:
- Toll fraud: Hackers compromise SIP credentials to make premium-rate calls
- Eavesdropping: Unencrypted RTP can be captured and reconstructed
- Denial of Service: VoIP servers can be flooded, disabling phone service
- SPIT (Spam over Internet Telephony): VoIP spam calls
Protection measures:
- Always use SRTP (encrypted media)
- Use TLS for SIP signaling
- Implement strong authentication (digest or certificate-based)
- Deploy session border controllers (SBCs) at the network edge
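As an illustration of digest authentication, here is a sketch of the basic response calculation from RFC 2617 (the form without the `qop` parameter). The password, realm, and nonce are hypothetical values chosen for the example. Because this scheme relies on MD5, it is best treated as one layer alongside TLS, not as a replacement for it.

```python
import hashlib

def md5_hex(text):
    return hashlib.md5(text.encode()).hexdigest()

def digest_response(username, realm, password, method, uri, nonce):
    """Basic digest response: MD5(HA1:nonce:HA2), without qop."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                  # what is being asked
    return md5_hex(f"{ha1}:{nonce}:{ha2}")

# All values below are made up for illustration.
response = digest_response(
    username="alice",
    realm="voip-provider.com",
    password="s3cret",
    method="REGISTER",
    uri="sip:voip-provider.com",
    nonce="abc123nonce",
)
print(response)
```

The password itself never crosses the network. The server, which knows the same secret, repeats the calculation with the nonce it issued and compares the results.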
Durga Antivirus Pro applies VoIP-style encryption patterns in its real-time threat data transmission, ensuring security events are encrypted in transit and cannot be eavesdropped.
Common Mistakes
1. Not accounting for bandwidth overhead
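A G.711 codec uses 64 kbps for audio, but with IP/UDP/RTP headers and link-layer framing, the total is ~85 kbps per direction. A link with 1 Mbps available in each direction supports about 11 concurrent G.711 calls (1000 / 85), and one where both directions share 1 Mbps supports only about 5 or 6 (1000 / 170), not the 15 that dividing by 64 kbps suggests.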
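A quick capacity check makes the effect of overhead visible. The result depends on whether the 1 Mbps is available in each direction or shared by both, so this sketch shows both cases. `max_concurrent_calls` is a hypothetical helper, and the 85 kbps per-direction figure is the planning value used in this guide.

```python
def max_concurrent_calls(link_kbps, per_call_kbps):
    """Whole calls that fit into the given capacity."""
    return int(link_kbps // per_call_kbps)

LINK_KBPS = 1000

naive = max_concurrent_calls(LINK_KBPS, 64)            # audio only, no overhead
per_direction = max_concurrent_calls(LINK_KBPS, 85)    # 1 Mbps each way
shared = max_concurrent_calls(LINK_KBPS, 2 * 85)       # both directions share 1 Mbps

print(naive, per_direction, shared)
```

In practice you would also reserve headroom for signaling and other traffic, so the usable number is lower still.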
2. Using TCP instead of UDP for media
RTP should run over UDP. TCP retransmissions cause delay and jitter, destroying voice quality.
3. Ignoring network QoS (Quality of Service)
Without QoS, a large file download can starve VoIP packets, causing choppy audio. VoIP traffic must be prioritized.
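One common building block for prioritization is marking voice packets with a DSCP value so that routers can recognize them. The sketch below sets the Expedited Forwarding class (DSCP 46) on a UDP socket. It assumes a platform where Python exposes `socket.IP_TOS` (Linux and macOS do). Whether the network honors the marking depends entirely on how its routers and switches are configured.

```python
import socket

EF_DSCP = 46     # Expedited Forwarding, the class commonly used for voice

def dscp_to_tos(dscp):
    """DSCP occupies the upper six bits of the IP TOS byte."""
    return dscp << 2

def open_marked_udp_socket(dscp=EF_DSCP):
    """Create a UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock

print(dscp_to_tos(EF_DSCP))
```

Marking alone does nothing unless the network is set up to treat marked packets differently, for example by placing them in a priority queue.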
4. Not configuring NAT/firewall properly
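VoIP and NAT don’t work well together. SIP and SDP embed IP addresses and ports in the message body, so a phone behind NAT advertises a private address (like 192.168.1.10 in the call flow above) that the far end cannot reach. STUN, TURN, or ICE are needed for calls across NAT.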
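A simple way to spot this problem is to check whether the address advertised in an SDP body is private. The sketch below uses Python's standard `ipaddress` module; `needs_nat_traversal` is a hypothetical helper and looks only at the `c=` line.

```python
import ipaddress

def sdp_connection_address(sdp):
    """Return the address from the first c= line, or None."""
    for line in sdp.splitlines():
        if line.startswith("c="):
            return line.split()[-1]
    return None

def needs_nat_traversal(sdp):
    """True if the SDP advertises a private address the far end cannot reach."""
    address = sdp_connection_address(sdp)
    return address is not None and ipaddress.ip_address(address).is_private

alice_sdp = "v=0\nc=IN IP4 192.168.1.10\nm=audio 40000 RTP/AVP 0 8 101\n"
print(needs_nat_traversal(alice_sdp))
```

When the check is true, the phone needs help from STUN, TURN, or ICE (or an SBC that rewrites the address) before media can flow.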
5. Assuming all SIP phones work with all providers
SIP interoperability varies. Different implementations of SIP extensions cause compatibility issues.
Practice Questions
What is the difference between SIP and RTP? SIP handles signaling (call setup/teardown). RTP carries the actual voice packets.
Why does VoIP use UDP instead of TCP? UDP is faster and tolerates some packet loss. TCP retransmissions would introduce delay and jitter, degrading call quality.
What does a codec do in VoIP? Encodes digitized voice into a compressed stream for sending and decodes it for playback on the other end. Different codecs balance quality vs bandwidth.
What is the approximate bandwidth for a G.711 VoIP call? ~85 kbps per direction or ~170 kbps total (64 kbps audio + ~21 kbps overhead).
What is jitter and how is it handled? Jitter is variation in packet arrival time. Jitter buffers smooth out the variation, but too much jitter causes buffer over/underflow.
Challenge: Calculate the bandwidth needed for a company with 50 concurrent VoIP calls using G.729 codec. Then design a QoS policy that prioritizes VoIP traffic and limits file downloads to 80% of available bandwidth.
Try It Yourself
You can experiment with VoIP concepts using command-line tools:
# Check your network's suitability for VoIP
# Measure latency and packet loss
ping -c 20 google.com
Example output (addresses and timings will differ on your network):
PING google.com (142.250.80.132): 56 data bytes
64 bytes from 142.250.80.132: icmp_seq=0 ttl=116 time=12.5 ms
64 bytes from 142.250.80.132: icmp_seq=1 ttl=116 time=11.8 ms
...
--- google.com ping statistics ---
20 packets transmitted, 20 packets received, 0.0% packet loss
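round-trip min/avg/max = 10.2/12.1/15.3 ms
For VoIP, you want:
- Average round-trip time < 100 ms (ping reports round-trip time, not one-way latency)
- Packet loss as close to 0% as possible (under 1% at most)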
- Consistent times (low jitter)
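If you want to automate this check, the summary lines can be parsed with two regular expressions. This is a rough sketch: `parse_ping_summary` and `voip_ready` are hypothetical helpers, ping output differs between operating systems, and the default limits (under 1% loss, under 100 ms average round trip) are taken from the guidance in this tutorial.

```python
import re

def parse_ping_summary(output):
    """Return (loss_percent, avg_rtt_ms) from ping's summary lines."""
    loss = float(re.search(r"([\d.]+)% packet loss", output).group(1))
    avg = float(re.search(r"=\s*[\d.]+/([\d.]+)/", output).group(1))
    return loss, avg

def voip_ready(loss_percent, avg_rtt_ms, max_loss=1.0, max_rtt_ms=100.0):
    """Apply the loss and round-trip thresholds."""
    return loss_percent < max_loss and avg_rtt_ms < max_rtt_ms

summary = """20 packets transmitted, 20 packets received, 0.0% packet loss
round-trip min/avg/max = 10.2/12.1/15.3 ms"""

loss, avg = parse_ping_summary(summary)
print(loss, avg, voip_ready(loss, avg))
```

Ping cannot measure jitter directly, but a wide gap between the min and max values is a hint that arrival times are inconsistent.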
# Simulate VoIP packet pacing (payloads here are much smaller than real 160-byte G.711 frames)
# Send UDP packets at a VoIP-like rate
# 50 packets/sec matches G.711 with 20 ms packetization
# Replace example.com with a test host you control
for i in $(seq 1 100); do
  echo "VoIP packet $i" | nc -u -w 0 example.com 5060
  sleep 0.02 # 20 ms between packets
done
What’s Next
| Tutorial | What You’ll Learn |
|---|---|
| Telecom Network Protocols | SS7, SIP in-depth and call routing across networks |
| Telecom Overview | Foundations of telecommunications infrastructure |
| HTTP Protocol | Compare VoIP protocols with web protocols |
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-06.