Python Generators & Iterators — Explained with Examples

DodaTech Updated Jun 15, 2026 6 min read

Generators are functions that produce a sequence of values lazily — one at a time, on demand — instead of computing them all at once and storing them in memory. They use the yield keyword instead of return.

What You’ll Learn

  • How yield turns a function into a generator
  • Generator expressions vs list comprehensions
  • Memory efficiency: processing large files without loading everything
  • send(), throw(), close() for bidirectional communication
  • The itertools module for powerful iteration patterns

Why Generators Matter

Reading a 10 GB log file into a list can exhaust your machine’s memory before any processing starts. Durga Antivirus Pro uses generators to stream through file signatures without loading them all at once. DodaZIP generates archive contents lazily during extraction. Whenever you work with large or infinite data, generators are the memory-efficient answer.

Learning path: OOP → Decorators → Generators (this lesson) → Context Managers → Async
  
Prerequisite: Understand Python functions, iterables (lists, tuples), and memory basics. Review https://tutorials.dodatech.com/programming-languages/python/py-functions/ if needed.

Iterators vs Iterables

Every generator is an iterator, but not every iterable is a generator:

  • Iterable: Can be looped over (list, tuple, string, dict). Implements __iter__().
  • Iterator: Produces values one at a time via __next__(). Remembers its position.
nums = [1, 2, 3]
it = iter(nums)              # list → iterator
print(next(it))  # 1
print(next(it))  # 2
print(next(it))  # 3
# print(next(it))  # StopIteration

Generators are the easiest way to create iterators.
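To see what a generator saves you from writing, here is a minimal sketch of an iterator written by hand as a class. The class name CountUpTo is illustrative and not part of the original lesson; it only shows the two methods the iterator protocol requires.

```python
class CountUpTo:
    """Hand-written iterator that produces 1..n (hypothetical example class)."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.current = 0

    def __iter__(self) -> "CountUpTo":
        # An iterator returns itself from __iter__().
        return self

    def __next__(self) -> int:
        if self.current >= self.n:
            raise StopIteration  # signals the end of iteration
        self.current += 1
        return self.current


print(list(CountUpTo(3)))  # [1, 2, 3]
```

All of the bookkeeping here (the stored position, the explicit StopIteration) is what a generator function handles for you.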

Creating Generators with yield

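Use yield instead of return and the function becomes a generator function. Calling it runs none of the body; it returns a generator object that you advance with next():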

def count_up_to(n: int):
    count = 1
    while count <= n:
        yield count
        count += 1

counter = count_up_to(3)
print(next(counter))  # 1
print(next(counter))  # 2
print(next(counter))  # 3
# print(next(counter))  # StopIteration

Each call to next() resumes the function from where it paused (right after yield), keeping all local variables intact.
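The pause-and-resume behavior is easier to believe when you can watch it. The sketch below uses a made-up generator, traced(), that records in a list when each part of its body actually runs.

```python
log = []


def traced():
    """Hypothetical generator that logs when each part of its body runs."""
    log.append("started")
    yield "a"
    log.append("resumed after first yield")
    yield "b"
    log.append("finished")


gen = traced()          # calling the function runs none of the body
assert log == []

first = next(gen)       # runs up to the first yield
second = next(gen)      # resumes and runs up to the second yield
rest = list(gen)        # runs to the end; nothing is left to yield

print(first, second, rest)  # a b []
print(log)  # ['started', 'resumed after first yield', 'finished']
```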

Memory Efficiency: Large File Processing

Loading a 10 GB file into memory would exhaust the RAM of most systems. A generator reads one line at a time:

def read_large_file(filepath: str):
    """Yield lines one at a time — memory efficient."""
    with open(filepath) as f:
        for line in f:
            yield line.strip()

# Usage — only one line in memory at a time
for line in read_large_file("server.log"):
    if "ERROR" in line:
        print(line)

Compare:

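  • List approach: loads the entire file into memory, so roughly 10 GB or more once Python’s per-string overhead is counted
  • Generator approach: holds one line at a time, so memory use stays close to the size of the longest line plus the read buffer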
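You can check the comparison on a small scale with the standard library's tracemalloc module. This is a rough sketch, not a benchmark: the helper names, the 50,000-line file, and its contents are all made up for the demo, and the exact byte counts will vary by Python version and platform.

```python
import os
import tempfile
import tracemalloc


def peak_bytes(func, path):
    """Return the peak traced allocation (in bytes) while func(path) runs."""
    tracemalloc.start()
    func(path)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def count_errors_list(path):
    with open(path) as f:
        lines = f.readlines()          # whole file held in a list
    return sum(1 for line in lines if "ERROR" in line)


def count_errors_lazy(path):
    with open(path) as f:              # the file object yields one line at a time
        return sum(1 for line in f if "ERROR" in line)


# Build a small throwaway log file for the measurement.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    for i in range(50_000):
        level = "ERROR" if i % 10 == 0 else "INFO"
        tmp.write(f"2026-06-15 10:00:00 {level} event {i}\n")
    path = tmp.name

list_peak = peak_bytes(count_errors_list, path)
lazy_peak = peak_bytes(count_errors_lazy, path)
os.remove(path)

print(f"list peak: {list_peak:,} bytes")
print(f"lazy peak: {lazy_peak:,} bytes")
```

The list version's peak grows with the file size, while the lazy version's peak stays roughly flat no matter how many lines you write.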

Generator Expressions

Like list comprehensions but with parentheses and lazy evaluation:

# List comprehension — builds entire list in memory
squares_list = [x * x for x in range(10)]
print(squares_list)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Generator expression — lazy, one value at a time
squares_gen = (x * x for x in range(10))
print(squares_gen)        # <generator object ...>
print(next(squares_gen))  # 0
print(next(squares_gen))  # 1
print(list(squares_gen))  # [4, 9, 16, 25, 36, 49, 64, 81] — consumes the rest

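Use generator expressions when you don’t need the entire list at once: they use far less memory and produce the first value immediately. When every value is consumed anyway and the data is small, a list comprehension is usually just as fast or slightly faster.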
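A quick way to see the difference in footprint is sys.getsizeof. The sketch below is illustrative: the value of n is arbitrary, and the exact sizes printed depend on your Python version.

```python
import sys

n = 100_000
squares_list = [x * x for x in range(n)]
squares_gen = (x * x for x in range(n))

# getsizeof measures only the list or generator object itself, not the
# ints a list references, so the real gap is larger than these numbers.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

# Aggregations accept a generator expression directly; no list is built.
total = sum(x * x for x in range(n))
assert total == sum(squares_list)
```

Functions such as sum(), any(), min(), and max() consume one value at a time, so passing them a generator expression avoids the intermediate list entirely.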

send(), throw(), close() — Bidirectional Generators

Generators can receive values too:

from typing import Generator

def echo() -> Generator[None, str, None]:
    """Echo back whatever is sent."""
    while True:
        received = yield
        print(f"Received: {received}")

gen = echo()
next(gen)           # Prime the generator (advance to first yield)
gen.send("Hello")   # Received: Hello
gen.send("World")   # Received: World
gen.close()         # Raise GeneratorExit at the paused yield and finish the generator

Real-world use: Durga Antivirus Pro uses send() to push scan configurations into a running scan generator.
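The echo example only shows send() and close(). As a hedged sketch of all three methods working together, here is a made-up running_average generator: send() feeds it numbers and returns the current mean, throw() injects an exception that it treats as a reset signal, and close() triggers its cleanup. The reset-on-ValueError convention is an assumption chosen for this demo, not a standard pattern.

```python
def running_average():
    """Hypothetical coroutine-style generator: send numbers, get the mean back."""
    total = 0.0
    count = 0
    average = None
    try:
        while True:
            try:
                value = yield average      # hand back the mean, wait for input
            except ValueError:
                # throw(ValueError(...)) lands here: reset state and keep running
                total, count, average = 0.0, 0, None
                continue
            total += value
            count += 1
            average = total / count
    finally:
        # close() raises GeneratorExit at the paused yield; finally still runs
        print("running_average closed")


avg = running_average()
next(avg)                               # prime: run to the first yield
print(avg.send(10))                     # 10.0
print(avg.send(20))                     # 15.0
print(avg.throw(ValueError("reset")))   # None (state was reset)
print(avg.send(4))                      # 4.0
avg.close()                             # prints "running_average closed"
```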

The itertools Module

Python’s standard library for advanced iteration:

import itertools

# count — infinite sequence
for i in itertools.count(10, 2):
    if i > 20:
        break
    print(i, end=" ")  # 10 12 14 16 18 20

print()

# cycle — repeat forever
colors = itertools.cycle(["red", "green", "blue"])
for _ in range(6):
    print(next(colors), end=" ")  # red green blue red green blue

print()

# chain — combine iterables
combined = itertools.chain([1, 2], [3, 4], [5, 6])
print(list(combined))  # [1, 2, 3, 4, 5, 6]

# groupby — group sorted data
data = [("A", 1), ("A", 2), ("B", 3), ("B", 4)]
for key, group in itertools.groupby(data, key=lambda x: x[0]):
    print(key, list(group))
# A [('A', 1), ('A', 2)]
# B [('B', 3), ('B', 4)]
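Two details are worth trying yourself: itertools.islice is the usual way to take a bounded piece of an infinite generator, and groupby only merges adjacent items, which is why its input should be sorted by the same key. The naturals() helper below is made up for the demo.

```python
import itertools


def naturals():
    """Infinite generator: 0, 1, 2, ... (hypothetical helper)."""
    n = 0
    while True:
        yield n
        n += 1


# islice takes a bounded slice of any iterator, including an infinite one.
evens = (n for n in naturals() if n % 2 == 0)
first_five_evens = list(itertools.islice(evens, 5))
print(first_five_evens)  # [0, 2, 4, 6, 8]

# groupby only merges *adjacent* equal keys, so unsorted input splits a group.
unsorted = [("A", 1), ("B", 3), ("A", 2)]
key = lambda pair: pair[0]
split = [(k, len(list(g))) for k, g in itertools.groupby(unsorted, key=key)]
merged = [
    (k, len(list(g)))
    for k, g in itertools.groupby(sorted(unsorted, key=key), key=key)
]
print(split)   # [('A', 1), ('B', 1), ('A', 1)]
print(merged)  # [('A', 2), ('B', 1)]
```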

Common Mistakes

1. Forgetting That Generators Are Single-Use

gen = (x for x in range(3))
print(list(gen))  # [0, 1, 2]
print(list(gen))  # [] — exhausted!

Fix: Create a new generator if you need to iterate again.
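If the same sequence has to be looped over several times, one option is to wrap the generator in a small class whose __iter__() builds a fresh generator on each call. The Squares class is a hypothetical example, sketched here to show the idea.

```python
class Squares:
    """Hypothetical re-iterable wrapper: each loop gets a fresh generator."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __iter__(self):
        # Build a new generator on every call, so each loop starts over.
        return (x * x for x in range(self.n))


squares = Squares(3)
print(list(squares))  # [0, 1, 4]
print(list(squares))  # [0, 1, 4] again, because __iter__ built a new generator
```

The object is an iterable rather than an iterator, which is the same reason a list can be looped over more than once.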

2. Using return Instead of yield

def get_numbers():
    result = []
    for i in range(5):
        result.append(i)
    return result  # Returns a list, not a generator

Fix: Use yield for lazy evaluation.
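For comparison, a lazy version of the same function might look like this. The name get_numbers_lazy is made up so it can sit next to the original.

```python
def get_numbers_lazy():
    """Generator version of get_numbers(): yields each value as it is produced."""
    for i in range(5):
        yield i


numbers = get_numbers_lazy()
print(next(numbers))   # 0, computed on demand
print(list(numbers))   # [1, 2, 3, 4], the remaining values
```

Returning a list is still the right choice when callers need len(), indexing, or several passes over the data.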

3. Not Priming a Generator Before send()

gen = echo()
gen.send("Hello")  # TypeError: can't send non-None value to a just-started generator

Fix: Call next(gen) once to advance to the first yield.
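If you forget the priming call often, one common workaround is a decorator that does it for you. The sketch below is illustrative: primed and collector are hypothetical names, not part of the standard library.

```python
import functools


def primed(gen_func):
    """Hypothetical decorator: advance a new generator to its first yield."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        next(gen)          # run up to the first yield so send() is accepted
        return gen
    return wrapper


@primed
def collector():
    """Collect every value sent in; yield the list so far."""
    items = []
    while True:
        item = yield items
        items.append(item)


c = collector()            # already primed by the decorator
print(c.send("Hello"))     # ['Hello']
print(c.send("World"))     # ['Hello', 'World']
```

Calling gen.send(None) on a fresh generator has the same effect as next(gen), so either works as the priming step.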

4. Converting a Generator to a List Unnecessarily

result = list(some_generator())  # Loads everything into memory
for item in result:              # Defeats the purpose!
    process(item)

Fix: Iterate directly over the generator.

5. Confusing yield and yield from

yield from delegates to a sub-generator:

def sub_gen():
    yield 1
    yield 2

def main_gen():
    yield from sub_gen()
    yield 3

print(list(main_gen()))  # [1, 2, 3]
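The confusion usually shows up when yield is written where yield from was intended. A small sketch of the difference, with made-up function names, which also shows that yield from passes back the sub-generator's return value:

```python
def sub_gen():
    yield 1
    yield 2
    return "sub done"          # becomes the value of the `yield from` expression


def with_yield():
    yield sub_gen()            # yields the generator object itself, unconsumed


def with_yield_from():
    result = yield from sub_gen()   # yields 1, then 2, then captures the return value
    yield result


plain = list(with_yield())
print(len(plain), type(plain[0]).__name__)  # 1 generator
print(list(with_yield_from()))              # [1, 2, 'sub done']
```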

Practice Questions

1. What’s the memory difference between [x*2 for x in range(10_000_000)] and (x*2 for x in range(10_000_000))?
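The list comprehension keeps all 10 million results alive at once: on 64-bit CPython the list’s pointer array alone is about 80 MB, and the int objects it references add a few hundred MB more. The generator expression is a small fixed-size object (on the order of 100 to 200 bytes, depending on the Python version) regardless of the range size.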

2. What does next(g) return each time?

def gen():
    yield 1
    yield 2
    yield 3

g = gen()

1, 2, 3; a fourth call raises StopIteration.

3. Why does the read_large_file generator not need to close() the file?
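The open() call sits inside a with statement, so the file is closed as soon as the with block exits: when the loop reaches the end of the file, when close() is called on the generator, or when an abandoned generator is garbage-collected.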
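You can observe the cleanup directly. The sketch below mirrors read_large_file but keeps a reference to the file object so its closed flag can be inspected; the opened list and the temporary file are demo scaffolding, not something you would keep in real code.

```python
import os
import tempfile

opened = []   # keeps a reference to the file object so we can inspect it


def read_lines(path):
    """Same shape as read_large_file, plus bookkeeping for the demo."""
    with open(path) as f:
        opened.append(f)
        for line in f:
            yield line.strip()


# Throwaway two-line file; the contents are made up for the demo.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    path = tmp.name

gen = read_lines(path)
print(next(gen))            # first
print(opened[0].closed)     # False: the generator is paused inside the with block
gen.close()                 # raises GeneratorExit at the yield; the with block exits
print(opened[0].closed)     # True
os.remove(path)
```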

4. Write a generator fibonacci(n) that yields the first n Fibonacci numbers.

def fibonacci(n: int):
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

Challenge: Write a generator chunks(lst, n) that yields successive n-sized chunks from a list.

Solution
def chunks(lst: list, n: int):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

for chunk in chunks([1, 2, 3, 4, 5, 6, 7], 3):
    print(chunk)
# [1, 2, 3]
# [4, 5, 6]
# [7]

Mini Project: Log Parser with Generators

from typing import Iterator

def read_log(filepath: str) -> Iterator[str]:
    """Yield log lines lazily."""
    with open(filepath) as f:
        yield from (line.strip() for line in f)

def filter_level(lines: Iterator[str], level: str) -> Iterator[str]:
    """Filter log lines by severity level."""
    for line in lines:
        if f" {level} " in line:
            yield line

def count_by_hour(lines: Iterator[str]) -> dict:
    """Count log entries per hour."""
    hours = {}
    for line in lines:
        hour = line[11:13]  # Assuming ISO timestamp format
        hours[hour] = hours.get(hour, 0) + 1
    return hours

# Simulated usage
log_lines = [
    "2026-06-15 10:23:45 INFO Starting scan",
    "2026-06-15 10:23:46 WARN Memory usage high",
    "2026-06-15 11:01:02 ERROR Connection timeout",
    "2026-06-15 11:02:15 INFO Retry successful",
]
with open("/tmp/test.log", "w") as f:
    f.writelines(f"{l}\n" for l in log_lines)

lines = read_log("/tmp/test.log")
errors = filter_level(lines, "ERROR")
for err in errors:
    print(err)
# 2026-06-15 11:01:02 ERROR Connection timeout

Expected output: Only the ERROR line is printed. The pipeline still reads every line of the file, but it holds only one line in memory at a time.
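The count_by_hour function is defined in the project but never called. Here is a sketch of how it could sit at the end of the pipeline. To keep the example runnable on its own, it repeats the two functions in compact form and feeds them the sample lines from a list instead of a file; the Iterable annotations are an adjustment made for that reason.

```python
from typing import Dict, Iterable, Iterator


def filter_level(lines: Iterable[str], level: str) -> Iterator[str]:
    """Yield only the lines whose severity matches level."""
    for line in lines:
        if f" {level} " in line:
            yield line


def count_by_hour(lines: Iterable[str]) -> Dict[str, int]:
    """Count log entries per hour."""
    hours: Dict[str, int] = {}
    for line in lines:
        hour = line[11:13]          # "HH" in "YYYY-MM-DD HH:MM:SS ..."
        hours[hour] = hours.get(hour, 0) + 1
    return hours


log_lines = [
    "2026-06-15 10:23:45 INFO Starting scan",
    "2026-06-15 10:23:46 WARN Memory usage high",
    "2026-06-15 11:01:02 ERROR Connection timeout",
    "2026-06-15 11:02:15 INFO Retry successful",
]

print(count_by_hour(log_lines))                        # {'10': 2, '11': 2}
print(count_by_hour(filter_level(log_lines, "INFO")))  # {'10': 1, '11': 1}
```

Because count_by_hour consumes its input in a single pass, it works the same whether the lines come from a list, a file-reading generator, or another filter stage.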

What’s Next

Generators are fundamental for async programming and context managers. Continue your journey:

Topic | Description | Link
Python Context Managers | With-statement patterns | https://tutorials.dodatech.com/programming-languages/python/py-context-managers/
Python Async Programming | async/await and asyncio | https://tutorials.dodatech.com/programming-languages/python/py-async/
Python Error Handling | Exceptions and logging | https://tutorials.dodatech.com/programming-languages/python/py-error-handling/

Practice tip: Rewrite the chunks generator using itertools.islice for infinite iterators.
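One possible answer to the practice tip, offered as a sketch rather than the only solution. The name chunked is chosen to avoid clashing with chunks above. Slicing with lst[i:i + n] needs len() and indexing, so it cannot handle a generator; pulling items through itertools.islice works on any iterable.

```python
import itertools
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(iterable: Iterable[T], n: int) -> Iterator[List[T]]:
    """Yield lists of up to n items from any iterable, including infinite ones."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(iterable)                 # one shared iterator, advanced by each islice
    while True:
        chunk = list(itertools.islice(it, n))
        if not chunk:                   # source exhausted
            return
        yield chunk


print(list(chunked([1, 2, 3, 4, 5, 6, 7], 3)))
# [[1, 2, 3], [4, 5, 6], [7]]

first_three = list(itertools.islice(chunked(itertools.count(), 2), 3))
print(first_three)
# [[0, 1], [2, 3], [4, 5]]
```

On Python 3.12 and later, itertools.batched covers the same need and yields tuples instead of lists.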
