YAML Guide — Human-Readable Data Serialization

DodaTech Updated Jun 7, 2026 7 min read

YAML (YAML Ain’t Markup Language) is a human-friendly data serialization standard used everywhere in modern DevOps — from Docker Compose and Kubernetes to CI/CD pipelines and application configuration.

What You’ll Learn

  • YAML syntax fundamentals (indentation, mappings, sequences, scalars)
  • Anchors, aliases, and multi-document files
  • Real-world usage in Docker, Kubernetes, and CI/CD
  • YAML vs JSON vs TOML comparison

Why It Matters

YAML is the dominant configuration language of the cloud-native world. Docker Compose files, GitHub Actions workflows, and Ansible playbooks are written in YAML, and Kubernetes manifests almost always are. DodaZIP uses YAML for its plugin configuration system because it is both machine-parseable and human-writable without special tools. Understanding YAML deeply means you can configure most of the modern infrastructure stack without fighting the syntax.

Learning Path

flowchart LR
  A[YAML Basics<br/>You are here] --> B[Scalars, Mappings & Sequences]
  B --> C[Anchors, Aliases & Multi-Doc]
  C --> D[Docker & Kubernetes Configs]
  D --> E[CI/CD Pipelines]

Scalars — The Building Blocks

A scalar is a single value. YAML recognizes strings, numbers, booleans, and null, and infers the type of an unquoted value from its form.

# String scalars
name: Alice
greeting: "Hello, World!"
multiline: |
  This is a block scalar.
  Newlines are preserved.

folded: >
  This text gets folded
  into a single line.

# Numeric scalars
age: 30
pi: 3.14159
hex: 0xFF

# Boolean scalars
active: true
debug: false
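enabled: yes    # boolean only under YAML 1.1 rules (e.g. PyYAML); a string under YAML 1.2
disabled: no    # same caveat as enabled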

# Null
value: null
empty: ~
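
# Parsed equivalent:
name: "Alice"
greeting: "Hello, World!"
multiline: "This is a block scalar.\nNewlines are preserved.\n"
folded: "This text gets folded into a single line.\n"
age: 30
pi: 3.14159
hex: 255
active: true
debug: false
enabled: true     # under YAML 1.1 rules; the string "yes" under YAML 1.2
disabled: false   # under YAML 1.1 rules; the string "no" under YAML 1.2
value: null
empty: null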

Mappings and Sequences

Mappings (dictionaries) and sequences (lists) form the structure of any YAML document.

# Mapping
person:
  name: Bob
  age: 25
  address:
    street: 123 Main St
    city: Springfield
    zip: "12345"

# Sequence
colors:
  - red
  - green
  - blue

# Sequence of mappings
employees:
  - name: Alice
    role: Developer
  - name: Bob
    role: Designer
  - name: Charlie
    role: Manager

# Inline flow style
person2: {name: Diana, age: 28}
tags: [python, yaml, tutorial]

Anchors and Aliases

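Anchors (&) mark a node for reuse, and aliases (*) reference it. The merge key (<<) copies the keys of an aliased mapping into the current one. Keys written explicitly in the mapping take precedence over merged keys, wherever the <<: line appears. Merge keys come from YAML 1.1 and are not part of the YAML 1.2 core schema, but most widely used parsers support them.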

# Define reusable defaults
defaults: &defaults
  adapter: postgres
  host: localhost
  port: 5432

development:
  database: myapp_dev
  <<: *defaults

test:
  database: myapp_test
  <<: *defaults
  port: 5433

production:
  database: myapp_prod
  <<: *defaults
  host: db.example.com
  pool: 25

# Parsed equivalent (the defaults mapping is also kept as a top-level key):
development:
  adapter: postgres
  host: localhost
  port: 5432
  database: myapp_dev

test:
  adapter: postgres
  host: localhost
  port: 5433
  database: myapp_test

production:
  adapter: postgres
  host: db.example.com
  port: 5432
  pool: 25
  database: myapp_prod
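
Aliases are not limited to mappings, and a merge key can take a list of aliases. The sketch below illustrates both. All key and anchor names are hypothetical, and the merge behaviour shown assumes a parser that implements the YAML 1.1 merge key.

```yaml
# A plain alias reuses any node, not only mappings.
common_tags: &tags
  - web
  - internal

service_a:
  tags: *tags          # same sequence as common_tags

# Merging several mappings: earlier entries in the list take precedence.
base: &base
  host: localhost
  port: 5432
  timeout: 30

fast: &fast
  timeout: 5

service_b:
  <<: [*fast, *base]   # timeout comes from fast (5), host and port from base
  port: 6000           # explicit keys override anything merged in
```

Under those assumptions, service_b ends up with host localhost, port 6000, and timeout 5.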

Multi-Document Files

A single .yaml file can contain multiple documents separated by ---.

# First document
apiVersion: v1
kind: Pod
metadata:
  name: web-app
---
# Second document
apiVersion: v1
kind: Service
metadata:
  name: web-service
---
# Third document
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  debug: "true"
  port: "8080"

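This is common practice in Kubernetes: one file can hold a Pod, a Service, and a ConfigMap as separate documents, each parsed independently. The manifests above are trimmed to their metadata; real Pod and Service objects also need a spec.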
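
A document can also be closed explicitly with the ... end marker, which is mostly useful in streams where a consumer wants to know that one document is complete before the next begins. A minimal sketch, with hypothetical keys and values:

```yaml
# Two documents, each with an explicit start (---) and end (...) marker
---
event: deploy-started
service: web-app
...
---
event: deploy-finished
service: web-app
...
```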

Docker Compose Example

YAML is the native language of Docker Compose:

version: "3.8"  # optional: recent Docker Compose releases ignore this key and warn that it is obsolete

services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./html:/usr/share/nginx/html
    depends_on:
      - api
    environment:
      - API_HOST=api
      - API_PORT=3000

  api:
    build:
      context: ./api
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      DB_HOST: db
      DB_NAME: myapp
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: myapp
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready"]
      interval: 5s

volumes:
  pgdata:

CI/CD Pipeline Example (GitHub Actions)

name: CI Pipeline
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Run tests
        run: pytest

  deploy:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to production
        run: echo "Deploying..."

YAML vs JSON vs TOML

Feature            YAML       JSON    TOML
Comments           #          No      #
Human readability  Excellent  Good    Excellent
Multi-document     ---        No      No
Anchors/aliases    Yes        No      No
Strict typing      Weak       Strong  Strong
Line count         Lowest     Medium  Low

JSON:

{
  "database": {
    "host": "localhost",
    "port": 5432,
    "tables": ["users", "posts"]
  }
}

TOML:

[database]
host = "localhost"
port = 5432
tables = ["users", "posts"]
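
For comparison, the same data could be written in YAML roughly as follows. This is a direct transcription of the JSON and TOML samples above, not an additional configuration.

```yaml
# Same structure as the JSON and TOML samples
database:
  host: localhost
  port: 5432
  tables:
    - users
    - posts
```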

YAML wins on human readability and features like anchors. JSON wins on strictness and machine-to-machine communication. TOML is a middle ground — simpler than YAML but more structured than INI files.

Common Mistakes

1. Mixing tabs and spaces

YAML uses spaces for indentation. Tab characters are not allowed as indentation, and parsers reject them with an error. Configure your editor to insert spaces when you press Tab.

2. Inconsistent indentation

person:
  name: Alice
   age: 30    # ERROR: inconsistent indent

All items at the same level must have matching indentation. Use 2 spaces consistently.

3. Unquoted strings that look like booleans

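value: yes    # boolean true under YAML 1.1 rules (the string "yes" under YAML 1.2)
value: "yes"  # always the string "yes"
value: no     # boolean false under YAML 1.1 rules (the string "no" under YAML 1.2)
value: null   # parsed as null, not the string "null"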
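
Booleans are not the only values affected by implicit typing. The sketch below lists a few other unquoted values that commonly surprise people. The keys are hypothetical, and the results described in the comments assume a parser following YAML 1.1 rules (PyYAML is one example) unless noted otherwise.

```yaml
version: 3.10        # float 3.1, the trailing zero is lost
country: NO          # boolean false under YAML 1.1 rules
mode: 0755           # octal under YAML 1.1 (decimal 493); decimal 755 under YAML 1.2
start: 12:30         # base-60 integer 750 under YAML 1.1 rules

# Quoting pins each value to the string that was typed
version_quoted: "3.10"
country_quoted: "NO"
mode_quoted: "0755"
start_quoted: "12:30"
```

When a value is meant to be text, quoting it is the safest choice regardless of which YAML version the parser implements.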

4. Forgetting a space after colons

key:value    # WRONG: read as the single string "key:value", or rejected inside a mapping
key: value   # CORRECT

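5. Trailing spaces in quoted strings

Whitespace inside quotes is part of the value: "Alice " ends with a space and is not equal to "Alice". Trailing whitespace after an unquoted value is dropped, so name: Alice followed by stray spaces still parses as "Alice". For multiline strings where whitespace matters, use block scalars.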

6. Confusing | (literal) with > (folded)

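| preserves newlines exactly as written. > joins consecutive lines with a single space and turns a blank line into a newline. By default both keep one trailing newline. Choosing the wrong one changes the parsed value.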
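
How folding treats blank lines, and how a trailing - after the indicator changes the final newline, can be sketched as follows. The keys are hypothetical; the comments give the string each scalar should parse to.

```yaml
folded: >
  First paragraph
  continues here.

  Second paragraph.
# value: "First paragraph continues here.\nSecond paragraph.\n"

literal_stripped: |-
  line one
  line two
# value: "line one\nline two" (the - indicator drops the final newline)
```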

7. Missing space after the list dash

list:
  - item1
  -item2    # ERROR: missing space after dash

Practice Questions

  1. What is the difference between | and > in YAML? | is a literal block scalar that preserves newlines. > is a folded block scalar that joins consecutive lines with spaces and keeps blank lines as newlines.

  2. How do anchors and aliases work? &name defines an anchor. *name references it (inserts the entire node). <<: *name merges keys.

  3. Why are tabs not allowed in YAML? The YAML spec forbids tabs because they cause inconsistent indentation across editors. Use spaces (typically 2).

  4. How do you write a multi-document YAML file? Separate documents with --- on its own line. Optionally end with .... Each document parses independently.

  5. When would you choose YAML over JSON? YAML for human-written config files. JSON for machine-to-machine communication where strictness matters.

Challenge: Write a YAML file describing a 5-service microservice architecture with environment-specific overrides using anchors, a multi-document Kubernetes manifest, and a GitHub Actions CI/CD pipeline step.

Mini Project — Multi-Service Docker Compose

# docker-compose.yml with anchors for DRY configuration
x-logging: &logging
  logging:
    driver: "json-file"
    options:
      max-size: "10m"
      max-file: "3"

x-restart: &restart
  restart: unless-stopped

services:
  nginx:
    image: nginx:alpine
    ports: ["80:80", "443:443"]
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - frontend
      - api
    <<: [*logging, *restart]

  frontend:
    image: node:20-alpine
    working_dir: /app
    volumes:
      - ./frontend:/app
    command: npm run build
    environment:
      NODE_ENV: production
      API_URL: https://api.example.com
    <<: *logging

  api:
    build: ./api
    ports: ["3000:3000"]
    environment:
      DB_HOST: postgres
      DB_PORT: 5432
      DB_NAME: myapp
      DB_USER: myapp
      DB_PASSWORD: ${DB_PASSWORD}
      REDIS_HOST: redis
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_started
    <<: [*logging, *restart]

  postgres:
    image: postgres:16-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: myapp
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U myapp"]
      interval: 5s
      timeout: 5s
      retries: 5
    <<: [*logging, *restart]

  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
    volumes:
      - redis_data:/data
    command: redis-server --appendonly yes
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
    <<: [*logging, *restart]

volumes:
  postgres_data:
  redis_data:

FAQ

Is YAML the same as YML?
Yes. Both .yaml and .yml denote the same format. The recommended extension is .yaml, but .yml remains common for backward compatibility.
Why is YAML sometimes dangerous?
Some parsers support language-specific tags (for example !!python/object in PyYAML, or Ruby object tags in Psych) that can instantiate arbitrary objects while loading. This is a security risk when loading untrusted YAML. Always use safe_load() in Python and the equivalent safe loader in other languages.
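
To make the risk concrete, the fragment below shows the well-known shape of such input for PyYAML. The key name and the echoed text are placeholders. It is meant as an illustration of what to reject, not something to load.

```yaml
# With PyYAML's unsafe loader (yaml.unsafe_load), this tag calls os.system
# with the given argument. yaml.safe_load refuses the tag and raises an error.
exploit: !!python/object/apply:os.system ["echo this could be any command"]
```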
Can YAML contain multiple documents?
Yes. Separate documents with ---. This is standard in Kubernetes manifests. Each document is parsed independently.
What is the difference between JSON and YAML?
YAML 1.2 is designed as a superset of JSON, so valid JSON is, apart from rare edge cases, also valid YAML. YAML adds comments, anchors, multi-document support, and a more readable indentation-based syntax, at the cost of complexity.
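
A short sketch of what the superset relationship means in practice: a YAML parser accepts JSON syntax as flow style, and the same data can be rewritten in block style with comments. The top-level keys json_style and block_style are hypothetical labels.

```yaml
# JSON syntax, accepted by a YAML parser as flow style
json_style: {"database": {"host": "localhost", "port": 5432, "tables": ["users", "posts"]}}

# The same data in block style, with a comment that JSON could not carry
block_style:
  database:
    host: localhost   # comments are YAML-only
    port: 5432
    tables: [users, posts]
```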
Why does my YAML parser fail silently?
YAML's grammar is permissive, so a typo often parses as a different but valid structure instead of raising an error. Always validate YAML with yamllint or a schema validator.
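
As a starting point, a yamllint configuration along these lines would flag several of the mistakes covered in this guide. The rule names are yamllint's own; the chosen values are illustrative assumptions and should be adapted to the project.

```yaml
# .yamllint (illustrative values; adjust to the project's conventions)
extends: default

rules:
  indentation:
    spaces: 2              # flag anything that is not a 2-space indent
  key-duplicates: enable   # a repeated key usually means a typo
  trailing-spaces: enable
  truthy:
    allowed-values: ["true", "false"]  # report yes/no/on/off
    check-keys: false      # do not flag the on: key in GitHub Actions workflows
```

With this file in the repository root, running yamllint on a directory applies the rules to every YAML file it finds there.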
Is YAML good for large configuration files?
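YAML tends to become hard to maintain past roughly 500 lines. For large configs, consider HCL (Terraform), Dhall (type-safe config), or splitting the config into multiple smaller files. Anchors help reduce repetition within a file. Some tools also offer a custom !include tag, but it is not part of the YAML specification.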

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
