Security Operations (SOC & SIEM) — Complete Beginner's Guide

DodaTech Updated Jun 7, 2026 12 min read

Security Operations (SecOps) is the practice of continuously monitoring, detecting, analyzing, and responding to security threats — typically run by a Security Operations Center (SOC) using a Security Information and Event Management (SIEM) system.

What You’ll Learn

By the end of this tutorial, you’ll understand SOC team roles and tiers, how SIEM systems work (log ingestion, correlation, alerting), how to triage security alerts, and how to build a security monitoring program from scratch.

Why SecOps Matters

The average organization takes 207 days to identify and 73 days to contain a breach (IBM 2025). A well-run SOC can cut detection time to hours and containment to minutes. Without that monitoring, attacks already inside your network can go unnoticed. At DodaTech, the Durga Antivirus Pro SOC monitors millions of endpoints in real time, processing over 10 billion telemetry events daily.

SecOps Learning Path

    flowchart LR
  A[Security Basics] --> B[DevSecOps]
  B --> C[Incident Response]
  C --> D[Cloud Security]
  D --> E[Security Operations]
  E --> F{You Are Here}
  style F fill:#f90,color:#fff
  
Prerequisites: Cyber Security basics, Linux command line, and familiarity with Python for scripting log analysis.

What Is a SOC? (The “Why” First)

Think of a SOC as the security team’s mission control center. Just as NASA’s mission control monitors every system on a spacecraft, the SOC monitors every system in an organization’s network — firewalls, servers, endpoints, cloud services, and applications.

A SOC operates 24/7/365 because attackers don’t take weekends off. When an alert fires, SOC analysts determine: Is this a real attack or a false positive? How severe is it? What should we do?

SOC Team Structure (Tiers)

    flowchart TD
  subgraph "SOC Team"
    T1[Tier 1: Triage Analyst]
    T2[Tier 2: Incident Responder]
    T3[Tier 3: Threat Hunter]
    L[SOC Manager]
  end
  Alerts[Security Alerts] --> T1
  T1 -->|Escalate| T2
  T2 -->|Deep dive| T3
  T1 --> L
  T2 --> L
  T3 --> L
  
| Tier | Role | Responsibilities | Experience |
| --- | --- | --- | --- |
| Tier 1 | Triage Analyst | Monitor dashboards, acknowledge alerts, filter false positives, create tickets | 0-2 years |
| Tier 2 | Incident Responder | Investigate escalated alerts, contain threats, collect evidence, write reports | 2-5 years |
| Tier 3 | Threat Hunter | Proactively search for hidden threats, analyze malware, develop detection rules | 5+ years |
| Manager | SOC Manager | Oversee operations, report to CISO, manage team, improve processes | 8+ years |

What Is a SIEM? (The “Why” First)

A SIEM (Security Information and Event Management) system collects logs from everywhere and makes sense of them. Imagine a thousand different security cameras (firewalls, servers, endpoints, cloud services) all recording footage. A SIEM is the central control room where all feeds are visible, correlated, and analyzed.

SIEM Architecture

    flowchart TD
  subgraph "Log Sources"
    FW[Firewalls]
    SRV[Servers]
    EDR[Endpoint Detection]
    APP[Applications]
    CLD[Cloud Services]
  end
  subgraph "SIEM Platform"
    COL[Log Collector]
    IDX[Indexer]
    CORR[Correlation Engine]
    STORE[Storage]
  end
  subgraph "Output"
    DASH[Dashboards]
    ALERT[Alerts]
    RPT[Reports]
  end
  FW --> COL
  SRV --> COL
  EDR --> COL
  APP --> COL
  CLD --> COL
  COL --> IDX
  IDX --> CORR
  CORR --> DASH
  CORR --> ALERT
  IDX --> RPT
  

Key SIEM Functions

| Function | What It Does | Example |
| --- | --- | --- |
| Log collection | Ingests logs from all sources | Syslog, WinEventLog, CloudTrail |
| Normalization | Converts different log formats to a common schema | Different date formats → ISO 8601 |
| Correlation | Connects related events across sources | Failed login + lateral movement + data exfiltration |
| Alerting | Generates alerts based on rules | “5 failed logins from same IP in 1 minute” |
| Dashboards | Visualizes security posture | Live threat map, top alerts |
| Reporting | Compliance and operational reports | Monthly SOC metrics |
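
The normalization row above can be illustrated with a small sketch that converts timestamps from a few log formats into ISO 8601 in UTC. The format list and the `normalize_timestamp` name are assumptions made for illustration; a real SIEM ships its own parsers and handles far more formats.

```python
from datetime import datetime, timezone

# Hypothetical set of timestamp formats seen across log sources (illustrative only)
KNOWN_FORMATS = [
    "%Y-%m-%dT%H:%M:%S%z",    # ISO 8601 with offset, e.g. 2026-06-07T10:30:00+0000
    "%d/%b/%Y:%H:%M:%S %z",   # Web server access log, e.g. 07/Jun/2026:12:30:00 +0200
    "%Y-%m-%d %H:%M:%S",      # Plain datetime with no zone (assumed to be UTC here)
]

def normalize_timestamp(raw: str) -> str:
    """Return the timestamp as an ISO 8601 string in UTC, or raise ValueError."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue  # Not this format, try the next one
        if parsed.tzinfo is None:
            # Assumption: sources that omit a zone log in UTC
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc).isoformat()
    raise ValueError(f"Unrecognized timestamp format: {raw!r}")

samples = [
    "07/Jun/2026:12:30:00 +0200",
    "2026-06-07 10:30:00",
    "2026-06-07T10:30:00+0000",
]
for s in samples:
    print(s, "->", normalize_timestamp(s))
```

All three samples describe the same instant, so they normalize to the same string. Once every source agrees on one time format, events from different sources can be ordered and correlated.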

Setting Up a SIEM (Simulated)

Here’s a simplified log analysis system that demonstrates SIEM concepts:

# simple_siem.py — A minimal SIEM implementation
import json
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

class SimpleSIEM:
    """A minimal SIEM engine for demonstration."""

    def __init__(self):
        self.events = []
        self.correlation_rules = []
        self.alerts = []

    def ingest_log(self, log_entry: dict):
        """Ingest a normalized log entry."""
        # Add timestamp if not present
        if 'timestamp' not in log_entry:
            log_entry['timestamp'] = datetime.now().isoformat()
        self.events.append(log_entry)

    def ingest_syslog(self, raw_log: str):
        """Parse and ingest a syslog-style log."""
        # Simple syslog parser: timestamp, hostname, process, optional [pid], message
        pattern = r'(\w{3}\s+\d+\s+\d+:\d+:\d+)\s+(\S+)\s+([^\s\[:]+)(?:\[(\d+)\])?:?\s*(.*)'
        match = re.match(pattern, raw_log)

        if match:
            self.ingest_log({
                'raw': raw_log,
                'source': 'syslog',
                'hostname': match.group(2),
                'process': match.group(3),
                'message': match.group(5)
            })

    def add_correlation_rule(self, name: str, condition: callable, severity: str = "MEDIUM"):
        """Add a correlation rule — condition is a function that takes events."""
        self.correlation_rules.append({
            'name': name,
            'condition': condition,
            'severity': severity
        })

    def run_correlation(self):
        """Run all correlation rules against recent events."""
        recent = [e for e in self.events
                  if datetime.fromisoformat(e['timestamp']) > datetime.now() - timedelta(hours=1)]

        for rule in self.correlation_rules:
            matches = rule['condition'](recent)
            if matches:
                self.alerts.append({
                    'rule': rule['name'],
                    'severity': rule['severity'],
                    'timestamp': datetime.now().isoformat(),
                    'matches': len(matches),
                    'details': matches[:3]  # First 3 matches
                })
        return self.alerts

    def get_dashboard(self) -> dict:
        """Generate a SOC dashboard summary."""
        total = len(self.events)
        alerts_last_hour = len([a for a in self.alerts
                               if datetime.fromisoformat(a['timestamp']) > datetime.now() - timedelta(hours=1)])

        return {
            "total_events": total,
            "events_last_hour": len([e for e in self.events
                                    if datetime.fromisoformat(e['timestamp']) > datetime.now() - timedelta(hours=1)]),
            "active_alerts": alerts_last_hour,
            "top_sources": dict(Counter(e.get('source', 'unknown') for e in self.events[-1000:]).most_common(5))
        }

from collections import Counter

# Example usage
siem = SimpleSIEM()

# Add correlation rules
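def brute_force_detection(events):
    """Detect brute force attempts: 5+ failed logins from same IP in 1 minute."""
    ip_attempts = defaultdict(list)
    for e in events:
        if 'failed login' in e.get('message', '').lower():
            ip = e.get('source_ip', 'unknown')
            ip_attempts[ip].append(datetime.fromisoformat(e['timestamp']))

    suspicious = []
    window = timedelta(minutes=1)
    for ip, timestamps in ip_attempts.items():
        timestamps.sort()
        # Slide over the sorted timestamps: flag the IP if any 5 consecutive
        # attempts fall inside the 1-minute window
        for i in range(len(timestamps) - 4):
            if timestamps[i + 4] - timestamps[i] <= window:
                suspicious.append({"ip": ip, "attempts": len(timestamps)})
                break
    return suspicious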

def port_scan_detection(events):
    """Detect port scans: connections to 10+ different ports from same IP."""
    ip_ports = defaultdict(set)
    for e in events:
        if 'connection' in e.get('message', '').lower():
            ip = e.get('source_ip', 'unknown')
            port = e.get('dest_port', 0)
            ip_ports[ip].add(port)

    suspicious = []
    for ip, ports in ip_ports.items():
        if len(ports) >= 10:
            suspicious.append({"ip": ip, "ports_scanned": len(ports)})
    return suspicious

siem.add_correlation_rule("Brute Force Detection", brute_force_detection, "HIGH")
siem.add_correlation_rule("Port Scan Detection", port_scan_detection, "MEDIUM")

# Simulate some logs
for i in range(10):
    siem.ingest_log({
        'source': 'ssh',
        'source_ip': '192.168.1.100',
        'message': f'Failed login attempt #{i} from 192.168.1.100'
    })

# Run correlation
alerts = siem.run_correlation()
print("=== SOC Dashboard ===")
dashboard = siem.get_dashboard()
for k, v in dashboard.items():
    print(f"{k}: {v}")

print("\n=== Active Alerts ===")
for alert in alerts:
    print(f"[{alert['severity']}] {alert['rule']}: {alert['matches']} matches")
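
The correlation example from the Key SIEM Functions table (failed login, then lateral movement, then data exfiltration) can be sketched as an ordered chain rule. This is a minimal, self-contained illustration that is separate from SimpleSIEM. The event type names, the `host` field, and the 30-minute window are assumptions, not a standard schema.

```python
from datetime import datetime, timedelta

# Hypothetical event types, in the order an attack chain would produce them
CHAIN = ["failed_login", "lateral_movement", "data_exfiltration"]

def correlate_chain(events, window=timedelta(minutes=30)):
    """Return hosts where every stage in CHAIN occurs in order within `window`."""
    by_host = {}
    for e in sorted(events, key=lambda e: e["timestamp"]):
        by_host.setdefault(e["host"], []).append(e)

    flagged = []
    for host, host_events in by_host.items():
        stage, started = 0, None
        for e in host_events:
            # Restart if the chain in progress has outlived the window
            if started is not None and e["timestamp"] - started > window:
                stage, started = 0, None
            if e["type"] == CHAIN[stage]:
                if stage == 0:
                    started = e["timestamp"]
                stage += 1
                if stage == len(CHAIN):
                    flagged.append(host)
                    break
    return flagged

t0 = datetime(2026, 6, 7, 12, 0, 0)
events = [
    {"host": "srv-01", "type": "failed_login", "timestamp": t0},
    {"host": "srv-01", "type": "lateral_movement", "timestamp": t0 + timedelta(minutes=5)},
    {"host": "srv-01", "type": "data_exfiltration", "timestamp": t0 + timedelta(minutes=12)},
    # srv-02 skips the lateral movement stage, so the chain never completes
    {"host": "srv-02", "type": "failed_login", "timestamp": t0},
    {"host": "srv-02", "type": "data_exfiltration", "timestamp": t0 + timedelta(minutes=3)},
]
print(correlate_chain(events))
```

The sketch measures the window from the first failed login it sees, which is a simplification. A production rule would also track overlapping chains and correlate across hosts and user accounts.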

SOC Tools and Technologies

Open Source SIEM Stack (ELK)

The ELK stack (Elasticsearch, Logstash, Kibana) is a popular open-source foundation for building a SIEM:

# docker-compose.yml — Minimal ELK SIEM stack
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
    environment:
      - discovery.type=single-node
      # Lab use only: this disables authentication and TLS
      - xpack.security.enabled=false
    ports:
      - "9200:9200"

  logstash:
    image: docker.elastic.co/logstash/logstash:8.12.0
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    ports:
      - "5000:5000"
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:8.12.0
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      - elasticsearch

Commercial and Open-Source SIEM Options

| SIEM | Best For | Pricing |
| --- | --- | --- |
| Splunk | Large enterprises, advanced analytics | Expensive (GB/day licensing) |
| Microsoft Sentinel | Azure shops, cloud-native | Pay-as-you-go |
| Wazuh | Open-source, small-medium teams | Free |
| ELK Stack | Self-hosted, customization | Free (open-source) |
| Sumo Logic | Cloud-native, ease of use | Pay-as-you-go |

Log Sources Every SOC Should Monitor

| Source | What It Provides | Key Events |
| --- | --- | --- |
| Firewall | Network traffic | Denied connections, port scans |
| DNS | Domain resolution | Queries to known malicious domains |
| EDR | Endpoint behavior | Process creation, file changes, registry changes |
| Cloud audit logs (e.g., AWS CloudTrail) | Cloud activity | API calls, resource changes |
| Auth logs | Authentication | Failed logins, privilege escalation |
| Web proxy | HTTP traffic | Malware downloads, C2 callbacks |
| Email gateway | Email security | Phishing attempts, malware attachments |

Alert Triage Process

When an alert fires, SOC analysts follow this triage workflow:

# alert_triage.py — Alert triage decision framework

class AlertTriage:
    """Triage security alerts using a structured decision framework."""

    SEVERITY_MAP = {
        "CRITICAL": {"response_time": "5 minutes", "requires_escalation": True},
        "HIGH": {"response_time": "15 minutes", "requires_escalation": True},
        "MEDIUM": {"response_time": "1 hour", "requires_escalation": False},
        "LOW": {"response_time": "24 hours", "requires_escalation": False},
    }

    def __init__(self, alert: dict):
        self.alert = alert
        self.status = "new"
        self.notes = []

    def evaluate_severity(self) -> str:
        """Determine severity based on multiple factors."""
        score = 0

        # Factor 1: Is sensitive data involved?
        if self.alert.get('data_classification') in ['PII', 'PCI', 'PHI']:
            score += 30

        # Factor 2: Is a critical system affected?
        if self.alert.get('system_criticality') == 'critical':
            score += 25

        # Factor 3: Has lateral movement been detected?
        if self.alert.get('lateral_movement'):
            score += 20

        # Factor 4: Is there confirmed data exfiltration?
        if self.alert.get('exfiltration_detected'):
            score += 25

        # Map score to severity
        if score >= 80:
            return "CRITICAL"
        elif score >= 50:
            return "HIGH"
        elif score >= 20:
            return "MEDIUM"
        return "LOW"

    def triage_steps(self) -> list[str]:
        """Generate triage steps based on severity."""
        severity = self.evaluate_severity()
        config = self.SEVERITY_MAP[severity]

        steps = [
            f"1. Acknowledge alert within {config['response_time']}",
            "2. Determine if alert is a true positive or false positive:",
        ]

        if severity in ["CRITICAL", "HIGH"]:
            steps += [
                "   a. Check if any data was accessed/exfiltrated",
                "   b. Identify affected systems and users",
                "   c. Begin containment (isolate systems if needed)",
                "   d. Escalate to Tier 2/3 incident response team",
                "3. Notify SOC Manager immediately",
                f"4. Create incident ticket with severity {severity}"
            ]
        else:
            steps += [
                "   a. Review alert details in SIEM",
                "   b. Check for related events in last 24 hours",
                "   c. If false positive: document and close",
                "   d. If true positive: investigate during business hours",
                "3. Create ticket and assign to next available analyst"
            ]

        return steps

    def investigate(self, siem_data: dict) -> dict:
        """Perform initial investigation."""
        finding = {
            "alert_id": self.alert.get('id'),
            "severity": self.evaluate_severity(),
            "status": "investigating",
            "indicators": [],
            "verdict": "pending"
        }

        # Check if IP is in threat intel feeds
        ip = self.alert.get('source_ip')
        if ip and ip in siem_data.get('threat_intel', {}).get('malicious_ips', []):
            finding["indicators"].append(f"Source IP {ip} matches known threat intel")
            finding["verdict"] = "malicious"

        # Check for related events
        related = siem_data.get('related_events', [])
        if len(related) > 5:
            finding["indicators"].append(f"Multiple related events ({len(related)}): pattern suggests active attack")
            if finding["verdict"] != "malicious":
                finding["verdict"] = "suspicious"

        if not finding["indicators"]:
            finding["verdict"] = "false_positive"

        return finding

# Example triage
alert = {
    "id": "ALERT-2026-001",
    "source_ip": "185.220.101.1",
    "data_classification": "PII",
    "system_criticality": "critical",
    "lateral_movement": True,
    "exfiltration_detected": False
}

triage = AlertTriage(alert)
print(f"Severity: {triage.evaluate_severity()}")
print("Triage Steps:")
for step in triage.triage_steps():
    print(f"  {step}")

SOC Metrics (KPIs)

| Metric | Target | What It Measures |
| --- | --- | --- |
| MTTD (Mean Time to Detect) | < 1 hour | How fast you identify incidents |
| MTTR (Mean Time to Respond) | < 2 hours | How fast you contain threats |
| Alert triage time | < 15 min (HIGH) | How fast analysts acknowledge |
| False positive rate | < 30% | How accurate your detection rules are |
| Coverage | 100% of critical assets | Are all important systems monitored? |
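
As a rough illustration of how the first two metrics and the false positive rate in the table might be computed, here is a minimal sketch over a handful of made-up incident records. The field names (`occurred`, `detected`, `contained`, `true_positive`) and the sample data are hypothetical; a real SOC would pull these from its ticketing system.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; field names and values are illustrative only
incidents = [
    {"occurred": datetime(2026, 6, 1, 9, 0), "detected": datetime(2026, 6, 1, 9, 40),
     "contained": datetime(2026, 6, 1, 11, 10), "true_positive": True},
    {"occurred": datetime(2026, 6, 2, 14, 0), "detected": datetime(2026, 6, 2, 14, 20),
     "contained": datetime(2026, 6, 2, 15, 0), "true_positive": True},
    {"occurred": datetime(2026, 6, 3, 8, 0), "detected": datetime(2026, 6, 3, 8, 5),
     "contained": None, "true_positive": False},
]

def soc_kpis(records):
    """Compute MTTD and MTTR (in minutes) and the false positive rate."""
    confirmed = [r for r in records if r["true_positive"]]
    # MTTD: average gap between the event occurring and being detected
    mttd = mean((r["detected"] - r["occurred"]).total_seconds() for r in confirmed) / 60
    # MTTR: average gap between detection and containment
    mttr = mean((r["contained"] - r["detected"]).total_seconds() for r in confirmed) / 60
    fp_rate = (len(records) - len(confirmed)) / len(records)
    return {"mttd_minutes": mttd, "mttr_minutes": mttr, "false_positive_rate": fp_rate}

kpis = soc_kpis(incidents)
print(f"MTTD: {kpis['mttd_minutes']:.0f} min")
print(f"MTTR: {kpis['mttr_minutes']:.0f} min")
print(f"False positive rate: {kpis['false_positive_rate']:.0%}")
```

The sketch assumes at least one confirmed incident. With none, `statistics.mean` raises an error, so a real report would need to handle an empty period.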

Common SOC Mistakes

1. Alert Fatigue

Too many alerts → analysts ignore them. Tune your rules: eliminate false positives, aggregate similar alerts, and prioritize by severity.
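
One way to aggregate similar alerts and prioritize by severity is sketched below. Grouping on rule name plus source IP is an assumption chosen for illustration, as are the field names; other grouping keys (user, host, time bucket) may suit your environment better.

```python
from collections import defaultdict

# Lower number means higher priority
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def aggregate_alerts(alerts):
    """Collapse alerts that share a rule and source IP, then sort by severity."""
    groups = defaultdict(list)
    for a in alerts:
        groups[(a["rule"], a["source_ip"])].append(a)

    summary = []
    for (rule, ip), members in groups.items():
        # The group inherits the most severe rating among its members
        worst = min(members, key=lambda a: SEVERITY_ORDER[a["severity"]])
        summary.append({
            "rule": rule,
            "source_ip": ip,
            "count": len(members),
            "severity": worst["severity"],
        })
    # Most severe first; within one severity, the noisiest group first
    summary.sort(key=lambda s: (SEVERITY_ORDER[s["severity"]], -s["count"]))
    return summary

# Sample alerts using reserved documentation IP ranges
alerts = [
    {"rule": "Brute Force Detection", "source_ip": "203.0.113.45", "severity": "HIGH"},
    {"rule": "Brute Force Detection", "source_ip": "203.0.113.45", "severity": "MEDIUM"},
    {"rule": "Brute Force Detection", "source_ip": "203.0.113.45", "severity": "HIGH"},
    {"rule": "Port Scan Detection", "source_ip": "198.51.100.7", "severity": "MEDIUM"},
    {"rule": "C2 Beaconing", "source_ip": "192.0.2.10", "severity": "CRITICAL"},
]
summary = aggregate_alerts(alerts)
for row in summary:
    print(f"[{row['severity']:<8}] {row['rule']} from {row['source_ip']} x{row['count']}")
```

Five raw alerts become three rows for the analyst to review, with the critical one on top.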

2. Not Having 24/7 Coverage

Attackers know when your SOC goes home. If you can’t staff 24/7, use an MSSP (Managed Security Service Provider) for after-hours coverage.

3. Poor On-Call Processes

Unclear escalation paths, unanswered pages, and burnout. Define clear on-call rotations, escalation procedures, and post-incident decompression.

4. Not Tuning SIEM Rules

Default SIEM rules generate noise. Spend the first 3 months tuning rules before adding more. A quiet SIEM is a well-tuned SIEM.

5. Ignoring Threat Intelligence

Without threat intel feeds, you’re looking for needles in haystacks blindly. Integrate threat intel (MISP, AlienVault OTX, VirusTotal) into your SIEM.
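
To make the idea concrete, here is a minimal sketch of matching events against threat intel indicators. The indicator lists are hard-coded stand-ins for what a feed export might contain, and they use reserved documentation ranges and `.example` domains rather than real IOCs. The `match_indicators` function and the event fields are assumptions; the sketch does not call MISP, OTX, or VirusTotal.

```python
import ipaddress

# Hypothetical indicators, as might be exported from a threat intel feed.
# These are reserved documentation values, not real indicators of compromise.
MALICIOUS_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
MALICIOUS_DOMAINS = {"bad.example", "c2.example"}

def match_indicators(event: dict) -> list[str]:
    """Return the indicators that an event matches (empty list if none)."""
    hits = []

    ip = event.get("dest_ip")
    if ip:
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in MALICIOUS_NETWORKS):
            hits.append(f"ip:{ip}")

    domain = event.get("domain", "").lower().rstrip(".")
    # Match a listed domain itself or any of its subdomains
    if any(domain == d or domain.endswith("." + d) for d in MALICIOUS_DOMAINS):
        hits.append(f"domain:{domain}")

    return hits

events = [
    {"dest_ip": "203.0.113.45", "domain": "login.bad.example"},
    {"dest_ip": "198.51.100.7", "domain": "example.org"},
    {"dest_ip": "198.51.100.8", "domain": "notbad.example"},
]
for e in events:
    print(e["domain"], "->", match_indicators(e) or "no match")
```

Matching on the full label boundary (`"." + d`) avoids flagging look-alike names such as `notbad.example`.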

6. No Playbooks for Common Scenarios

When phishing alerts fire, analysts should follow a playbook — not improvise. Write and maintain playbooks for top 10 scenarios.
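
A playbook can be as simple as an ordered checklist that the tooling enforces. The sketch below is one possible shape, not a standard: the `Playbook` class and the phishing steps are illustrative assumptions, and your own playbooks should reflect your environment and policies.

```python
from dataclasses import dataclass, field

@dataclass
class Playbook:
    """An ordered checklist; steps must be completed in sequence."""
    name: str
    steps: list[str]
    completed: list[bool] = field(default_factory=list)

    def __post_init__(self):
        self.completed = [False] * len(self.steps)

    def complete(self, index: int) -> None:
        # Refuse to skip ahead so that no step is silently missed
        if index > 0 and not self.completed[index - 1]:
            raise ValueError(f"Step {index} must be completed first")
        self.completed[index] = True

    def next_step(self):
        """Return the first unfinished step, or None when all are done."""
        for step, done in zip(self.steps, self.completed):
            if not done:
                return step
        return None

# Hypothetical phishing playbook (steps are illustrative)
phishing = Playbook(
    name="Phishing email reported",
    steps=[
        "Confirm the email is malicious (headers, URLs, attachments)",
        "Identify all recipients of the same message",
        "Remove the message from affected mailboxes",
        "Reset credentials for users who clicked or submitted data",
        "Block the sender and URLs at the email gateway and web proxy",
        "Document findings and close the ticket",
    ],
)

phishing.complete(0)
print("Next:", phishing.next_step())
```

Keeping playbooks as data makes them easy to review, version, and later hand to an automation tool.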

7. Not Automating Tier 1 Tasks

Automated enrichment (IP reputation checks, user lookups) reduces triage time. If analysts manually copy-paste into VirusTotal, you’re wasting money.
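
The enrichment step can be sketched as a function that attaches context to an alert before an analyst sees it. The two dictionaries below are hypothetical stand-ins for an IP reputation service and a user directory; in practice these would be API or database lookups, which are deliberately left out here.

```python
# Hypothetical lookup tables standing in for an IP reputation service
# and a user directory (values are illustrative only)
IP_REPUTATION = {"203.0.113.45": "malicious", "198.51.100.7": "clean"}
USER_DIRECTORY = {
    "jdoe": {"department": "Finance", "privileged": True},
    "asmith": {"department": "Marketing", "privileged": False},
}

def enrich_alert(alert: dict) -> dict:
    """Return a copy of the alert with reputation and user context attached."""
    enriched = dict(alert)  # Leave the original alert untouched
    enriched["ip_reputation"] = IP_REPUTATION.get(alert.get("source_ip"), "unknown")
    enriched["user_info"] = USER_DIRECTORY.get(alert.get("user"), {})
    # Flag alerts that deserve attention first: a known-bad IP or a privileged account
    enriched["priority_boost"] = (
        enriched["ip_reputation"] == "malicious"
        or enriched["user_info"].get("privileged", False)
    )
    return enriched

alert = {"id": "ALERT-2026-002", "source_ip": "203.0.113.45", "user": "asmith"}
result = enrich_alert(alert)
print(result["ip_reputation"], result["user_info"], result["priority_boost"])
```

With this in place, the analyst opens an alert that already shows who the user is and whether the IP is known, instead of looking both up by hand.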

Practice Questions

1. What are the three tiers of a SOC team?

Tier 1 (Triage — monitors and filters alerts), Tier 2 (Responder — investigates and contains), Tier 3 (Threat Hunter — proactively hunts for threats).

2. What does SIEM stand for and what are its three main functions?

Security Information and Event Management. Main functions: log collection and normalization, event correlation and alerting, dashboards and reporting.

3. What is the difference between MTTD and MTTR?

MTTD (Mean Time to Detect) — how long between breach and discovery. MTTR (Mean Time to Respond) — how long between discovery and containment. Both should be as low as possible.

4. Why is alert tuning important for a SOC?

Without tuning, analysts drown in false positives and miss real threats. Tuning reduces noise so analysts can focus on actual incidents.

5. Challenge: Design a SIEM correlation rule that detects ransomware behavior.

Look for: mass file renaming events (EDR), multiple file encryption logs, SMB connections to many workstations, and a ransom note creation event. Correlate these within a 5-minute window.
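
One possible shape for such a rule is sketched below. It covers three of the signals (mass renames, SMB fan-out, and a ransom note) and looks at the 5 minutes leading up to the note. The event type names, field names, and both thresholds are assumptions for illustration and would need tuning against real telemetry.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
RENAME_THRESHOLD = 100    # Assumed: renames that count as "mass" renaming
SMB_HOST_THRESHOLD = 10   # Assumed: distinct SMB targets that count as fan-out

def ransomware_rule(events):
    """Flag hosts with mass renames, SMB fan-out, and a ransom note within WINDOW."""
    by_host = defaultdict(list)
    for e in events:
        by_host[e["host"]].append(e)

    flagged = []
    for host, host_events in by_host.items():
        for note in (e for e in host_events if e["type"] == "ransom_note_created"):
            # Look at the 5 minutes leading up to the ransom note
            start = note["timestamp"] - WINDOW
            in_window = [e for e in host_events
                         if start <= e["timestamp"] <= note["timestamp"]]
            renames = sum(1 for e in in_window if e["type"] == "file_renamed")
            smb_targets = {e["dest_host"] for e in in_window
                           if e["type"] == "smb_connection"}
            if renames >= RENAME_THRESHOLD and len(smb_targets) >= SMB_HOST_THRESHOLD:
                flagged.append(host)
                break
    return flagged

t0 = datetime(2026, 6, 7, 3, 0, 0)
events = [{"host": "ws-17", "type": "file_renamed",
           "timestamp": t0 + timedelta(seconds=i)} for i in range(120)]
events += [{"host": "ws-17", "type": "smb_connection", "dest_host": f"ws-{n}",
            "timestamp": t0 + timedelta(seconds=60 + n)} for n in range(12)]
events.append({"host": "ws-17", "type": "ransom_note_created",
               "timestamp": t0 + timedelta(minutes=3)})
# A backup job renames many files but shows no SMB fan-out and no ransom note
events += [{"host": "bak-01", "type": "file_renamed",
            "timestamp": t0 + timedelta(seconds=i)} for i in range(150)]

print(ransomware_rule(events))
```

Requiring all three signals together is what keeps the backup server from triggering the rule, even though it renames more files than the infected workstation.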

Mini Project: SOC Dashboard Simulator

# soc_dashboard.py
# Simulate a SOC monitoring dashboard
from datetime import datetime, timedelta
import random
import time

class SOCDashboard:
    """Simulate a real-time SOC dashboard."""

    def __init__(self):
        self.metrics = {
            "events_per_second": 0,
            "active_alerts": 0,
            "open_incidents": 0,
            "false_positive_rate": 0.0,
            "avg_response_time": 0
        }
        self.recent_alerts = []

    def simulate_tick(self):
        """Generate simulated SOC metrics."""
        self.metrics["events_per_second"] = random.randint(500, 2000)
        self.metrics["active_alerts"] = random.randint(2, 15)
        self.metrics["open_incidents"] = max(0, self.metrics["active_alerts"] - random.randint(0, 3))
        self.metrics["false_positive_rate"] = round(random.uniform(0.15, 0.40), 2)
        self.metrics["avg_response_time"] = random.randint(5, 30)

        # Random alert generation
        if random.random() < 0.3:
            severity = random.choices(
                ["CRITICAL", "HIGH", "MEDIUM", "LOW"],
                weights=[0.05, 0.15, 0.40, 0.40]
            )[0]
            alert = {
                "timestamp": datetime.now().isoformat(),
                "severity": severity,
                "rule": random.choice([
                    "Brute Force Detected", "Malspam Detected",
                    "C2 Beaconing", "Data Exfiltration",
                    "Ransomware Indicator", "Phishing URL Clicked"
                ])
            }
            self.recent_alerts.append(alert)

    def display(self):
        """Print dashboard to console."""
        print("\033[2J\033[H")  # Clear screen (ANSI)
        print("=" * 60)
        print(f"SOC DASHBOARD — {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
        print("=" * 60)
        print(f"Events/sec:    {self.metrics['events_per_second']}")
        print(f"Active Alerts: {self.metrics['active_alerts']}")
        print(f"Open Incidents: {self.metrics['open_incidents']}")
        print(f"FP Rate:       {self.metrics['false_positive_rate']:.0%}")
        print(f"Avg Response:  {self.metrics['avg_response_time']} min")
        print("-" * 60)
        print("Recent Alerts:")
        for alert in self.recent_alerts[-5:]:
            sev = alert['severity']
            print(f"  [{sev:<8}] {alert['rule']}")

# Run simulation
dashboard = SOCDashboard()
for _ in range(5):
    dashboard.simulate_tick()
    dashboard.display()
    time.sleep(0.5)

FAQ

Do I need a SIEM if I’m a small business?
Not necessarily. For small teams, managed EDR (CrowdStrike, SentinelOne) with basic log review may suffice. SIEM becomes valuable when you have 50+ servers and compliance requirements.
What’s the best free SIEM?
Wazuh is the most popular open-source SIEM. It includes EDR capabilities, compliance monitoring (PCI, HIPAA, GDPR), and integrates with the ELK stack. The ELK stack itself is also a capable SIEM with the right configuration.
How long should logs be retained?
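Compliance minimums: PCI DSS requires at least 1 year of audit log history, with the most recent 3 months immediately available. HIPAA requires compliance documentation to be retained for 6 years, and many organizations apply the same period to audit logs. GDPR sets no fixed period; it requires that personal data, including personal data in logs, be kept no longer than necessary. Security best practice: 90 days hot storage, 1 year warm, 3-7 years cold/archived.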
What certifications are useful for SOC roles?
CompTIA Security+ (entry-level), CySA+ (analyst), GCIA (network analysis), GCIH (incident handling), CISSP (management). Practical experience with Splunk or ELK is more important than certifications.
Can a SIEM detect zero-day attacks?
Not directly, because zero-days have no known signature. A SIEM can still surface zero-day activity indirectly through behavioral anomalies: unusual network traffic, unexpected process execution, and abnormal data access patterns.
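
As a rough illustration of the behavioral approach described in the last answer, the sketch below flags a value that sits far above a host's historical baseline using a z-score. The metric (outbound megabytes per hour), the sample numbers, and the threshold of 3 standard deviations are assumptions; production anomaly detection is considerably more involved.

```python
from statistics import mean, stdev

def is_anomalous(history, current, threshold=3.0):
    """Return True if `current` is more than `threshold` standard deviations above the baseline mean."""
    if len(history) < 2:
        return False  # Not enough data to form a baseline
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current > mu  # A flat baseline: any increase stands out
    return (current - mu) / sigma > threshold

# Hypothetical baseline: outbound MB per hour for one host over the past week
baseline = [120, 135, 110, 128, 140, 125, 118]

print(is_anomalous(baseline, 131))   # Within the normal range
print(is_anomalous(baseline, 900))   # Far above baseline, worth investigating
```

A spike like the second one says nothing about which exploit was used, which is the point: the rule reacts to the behavior, not to a signature.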

Try It Yourself

Set up a home SOC lab:

  1. Install Wazuh (SIEM + EDR) using their quickstart script on a VM
  2. Configure a second VM as an endpoint with Wazuh agent installed
  3. Generate security events: failed SSH logins, failed sudo attempts, file changes
  4. View alerts in the Wazuh dashboard

This is the same setup DodaTech uses for Durga Antivirus Pro SOC training — monitoring thousands of simulated endpoints to practice triage and response.

What’s Next

Congratulations on completing this Security Operations tutorial! Here’s where to go from here:

  • Practice daily — Consistency is more important than long study sessions
  • Build a project — Apply what you learned by building something real
  • Explore related topics — Check out other tutorials in the same category
  • Join the community — Discuss with other learners and share your progress

Remember: every expert was once a beginner. Keep coding!
