SIEM & SOC: Security Operations Center Guide
A Security Operations Center (SOC) is a centralized team that monitors, detects, analyzes, and responds to security incidents using SIEM (Security Information and Event Management) tools like Splunk, ELK Stack, Microsoft Sentinel, and QRadar.
What You’ll Learn
You’ll understand SOC team structure (Tier 1/2/3), deploy and configure SIEM tools for log ingestion, write correlation rules that detect attacks, set up automated alerts, integrate SOAR for response orchestration, map incidents to compliance frameworks (GDPR, ISO 27001), and generate compliance reports.
Why It Matters
Every organization needs a SOC — or SIEM as a service — to detect breaches before they cause damage. Durga Antivirus Pro’s cloud security team uses a multi-SIEM architecture (Splunk + Sentinel) to monitor millions of endpoints and detect zero-day attacks in real time.
Real-World Use
A SIEM correlation rule fires and the SOC receives an alert: “User JSmith: 100 failed logins in 5 minutes from IP 185.234.x.x.” The SOAR playbook blocks the IP at the firewall automatically, and the Tier 1 analyst triages the alert in 90 seconds. Without a SIEM, an attack like this could go unnoticed until data has already been exfiltrated.
SOC Architecture
flowchart TD
A[Log Sources] --> B[Log Collection]
B --> C[SIEM Platform]
C --> D[Correlation Engine]
D --> E[Alert Generation]
E --> F{Triage}
F -->|False Positive| G[Close Alert]
F -->|True Positive| H[Incident Response]
H --> I[SOAR Automation]
I --> J[Containment]
I --> K[Remediation]
subgraph "SOC Tiers"
L[T1 - Triage] --> M[T2 - Investigation]
M --> N[T3 - Advanced Analysis]
end
F --> L
style C fill:#2563eb,color:#fff
style D fill:#dc2626,color:#fff
style L fill:#059669,color:#fff
style N fill:#8b5cf6,color:#fff
Step 1: SOC Team Structure
| Tier | Role | Skills | Tools | Avg Salary |
|---|---|---|---|---|
| T1 | Triage Analyst | Alert triage, basic filtering, escalation | SIEM dashboard, ticket system | $60-85k |
| T2 | Incident Responder | Deep investigation, containment, forensics | SIEM, EDR, Threat Intel | $85-120k |
| T3 | Threat Hunter | Proactive hunting, malware analysis, research | Sandbox, RE tools, query languages | $120-180k |
T1 Daily Workflow:
# Pseudocode for T1 alert triage
def triage_alert(alert):
    # 1. Validate the alert: is it a true positive?
    if not is_actionable(alert):
        close_as_false_positive(alert, "No evidence of compromise")
        return
    # 2. Determine severity based on asset criticality
    if alert.asset.type in ["domain_controller", "database_server"]:
        alert.severity = "CRITICAL"
    elif alert.asset.risk_score > 7:
        alert.severity = "HIGH"
    else:
        alert.severity = "MEDIUM"
    # 3. Quick containment if possible
    if alert.type == "malware_detected" and alert.severity == "CRITICAL":
        isolate_endpoint(alert.asset.hostname)
    # 4. Escalate with context
    escalate_to_tier2(alert, context={
        "detection_source": alert.rule_name,
        "affected_users": extract_users(alert),
        "iocs": extract_iocs(alert),
        "first_seen": alert.timestamp,
    })
Expected output: An automated triage pipeline. T1 analysts should process 80%+ of alerts without escalation. Each alert gets a severity classification and initial containment actions before T2 gets involved.
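The severity and containment rules in the pseudocode can be exercised on their own. The following is a minimal runnable sketch, assuming simplified `Asset` and `Alert` models; the class names, fields, and sample hosts are hypothetical and only mirror the attributes the pseudocode reads.

```python
from dataclasses import dataclass

# Hypothetical data models; fields mirror what the pseudocode accesses.
@dataclass
class Asset:
    hostname: str
    type: str
    risk_score: int

@dataclass
class Alert:
    rule_name: str
    type: str
    asset: Asset

CRITICAL_ASSET_TYPES = {"domain_controller", "database_server"}

def classify_severity(alert: Alert) -> str:
    """Step 2: asset type takes precedence, then the risk score."""
    if alert.asset.type in CRITICAL_ASSET_TYPES:
        return "CRITICAL"
    if alert.asset.risk_score > 7:
        return "HIGH"
    return "MEDIUM"

def needs_isolation(alert: Alert) -> bool:
    """Step 3: isolate only when malware is detected on a critical asset."""
    return alert.type == "malware_detected" and classify_severity(alert) == "CRITICAL"

# Made-up sample alerts for illustration
dc_alert = Alert("EDR-Malware", "malware_detected", Asset("dc01", "domain_controller", 5))
laptop_alert = Alert("EDR-Malware", "malware_detected", Asset("lt-42", "workstation", 8))
for a in (dc_alert, laptop_alert):
    print(a.asset.hostname, classify_severity(a), needs_isolation(a))
```

Keeping classification as a pure function, separate from side effects such as endpoint isolation, makes the rules easy to unit test before they are wired into a ticketing or EDR system.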
Step 2: Log Ingestion with Elastic Stack
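# Filebeat configuration: ship logs to Elasticsearch
filebeat.inputs:
  - type: filestream
    id: linux-system-logs  # each filestream input needs a unique id
    enabled: true
    paths:
      - /var/log/auth.log
      - /var/log/syslog
      - /var/log/nginx/access.log
  - type: winlog
    enabled: false
    # Windows Event Log channels (disabled placeholder).
    # Windows hosts are usually covered by Winlogbeat, whose
    # winlogbeat.event_logs setting takes entries such as "- name: Security".
    event_logs:
      - Security
      - System
      - Application
      - PowerShell
output.elasticsearch:
  hosts: ["https://elastic-siem.internal:9200"]
  username: "filebeat_system"
  password: "${ES_PWD}"
  ssl.verification_mode: certificate
# Deploy Filebeat agent
# The output above is Elasticsearch, so setup needs no Logstash override
sudo filebeat setup --index-management
sudo systemctl enable filebeat
sudo systemctl start filebeat
# Verify logs are flowing (curl prompts for the password instead of taking it on the command line)
curl -u elastic "https://elastic-siem.internal:9200/filebeat-*/_count"
Expected output: Events from auth.log, syslog, and the Nginx access log appear as JSON documents in the filebeat-* indices, and the _count call returns a non-zero count. Each document carries structured metadata (timestamp, host, source file) plus the raw line in the message field. Extracting fields such as the process name or event.action requires a Filebeat module or an ingest pipeline. Once that parsing is in place you can query with: GET filebeat-*/_search?q=event.action:authentication_failed.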
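To make "structured fields" concrete, here is a minimal sketch of parsing one sshd failure line from auth.log into named fields. The regular expression, the sample line, and the field names (loosely modeled on Elastic Common Schema naming) are assumptions for illustration; in a real deployment this work is normally done by a Filebeat module or an Elasticsearch ingest pipeline rather than by custom code.

```python
import re
from typing import Optional

# Assumed format: a classic syslog-style sshd line from /var/log/auth.log.
SSHD_FAILED = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) sshd\[(?P<pid>\d+)\]: "
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<src_ip>\S+) port (?P<port>\d+)"
)

def parse_failed_login(line: str) -> Optional[dict]:
    """Return a dict of structured fields, or None if the line does not match."""
    m = SSHD_FAILED.match(line)
    if m is None:
        return None
    return {
        "timestamp": m["timestamp"],
        "host.name": m["host"],
        "process.pid": int(m["pid"]),
        "user.name": m["user"],
        "source.ip": m["src_ip"],
        "source.port": int(m["port"]),
        "event.action": "authentication_failed",
    }

# Made-up sample line
sample = ("Mar  3 10:15:02 web01 sshd[2211]: Failed password for "
          "invalid user admin from 203.0.113.7 port 51234 ssh2")
print(parse_failed_login(sample))
```

Once every source emits the same field names, a query such as the event.action search above works regardless of which system produced the log.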
Step 3: Writing Correlation Rules (Splunk)
# Splunk correlation rule: Brute force detection
index=* sourcetype=win:security EventCode=4625
| bin _time span=5m
| stats count as failed_count by _time, src_ip, user
| where failed_count > 10
# Assumption: critical_assets.csv is keyed on src_ip; change the key to match your lookup file
| lookup critical_assets.csv src_ip OUTPUT asset
| eval severity=case(
    failed_count > 50, "CRITICAL",
    failed_count > 20, "HIGH",
    1=1, "MEDIUM"
  )
| table _time, src_ip, user, failed_count, severity, asset
| sort - failed_count
# Correlation: Multiple failed logins AND a successful login for the same source IP and account
index=* sourcetype=win:security EventCode=4625
| stats count as attempts by src_ip, user
| where attempts > 5
| join type=inner src_ip, user [
    search index=* sourcetype=win:security EventCode=4624
    | stats count as success_count by src_ip, user
  ]
| where success_count > 0
| eval severity = if(attempts > 20, "HIGH", "MEDIUM")
| table src_ip, user, attempts, success_count, severity
Expected output: Saved as scheduled or real-time alerts, these searches raise an alert whenever they return results. The brute force rule triggers when any source IP and user pair exceeds 10 failed logins within a 5-minute window. The second rule fires when the same source IP has more than 5 failed logins and at least one successful login for the same account within the search time range, which suggests the attacker found a valid credential. It does not check that the success came after the failures, and because it groups by account it will not catch password spraying, where one IP tries a few passwords across many accounts; that needs a distinct count of users per source IP.
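The windowing logic behind the brute force rule can be sketched outside Splunk. The following is a minimal illustration, assuming failed-login events arrive as `(timestamp, src_ip, user)` tuples; the function names and sample data are hypothetical. It uses fixed 5-minute buckets, so a burst that straddles a bucket boundary is split across two windows.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

WINDOW_SECONDS = 300  # fixed 5-minute buckets

def severity_for(failed_count: int) -> str:
    """Same thresholds as the rule: >50 CRITICAL, >20 HIGH, otherwise MEDIUM."""
    if failed_count > 50:
        return "CRITICAL"
    if failed_count > 20:
        return "HIGH"
    return "MEDIUM"

def detect_brute_force(events, threshold=10):
    """events: iterable of (datetime, src_ip, user) tuples for failed logins."""
    buckets = Counter()
    for ts, src_ip, user in events:
        # Round the timestamp down to the start of its 5-minute bucket
        window_start = int(ts.timestamp()) // WINDOW_SECONDS * WINDOW_SECONDS
        buckets[(window_start, src_ip, user)] += 1
    alerts = [
        {"window_start": w, "src_ip": ip, "user": u,
         "failed_count": n, "severity": severity_for(n)}
        for (w, ip, u), n in buckets.items()
        if n > threshold
    ]
    return sorted(alerts, key=lambda a: a["failed_count"], reverse=True)

# Made-up data: 25 failures from one IP within two minutes, 3 from another
base = datetime(2024, 1, 1, 10, 0, 0, tzinfo=timezone.utc)
events = [(base + timedelta(seconds=5 * i), "185.234.45.67", "jsmith") for i in range(25)]
events += [(base + timedelta(seconds=i), "198.51.100.9", "adoe") for i in range(3)]
for alert in detect_brute_force(events):
    print(alert["src_ip"], alert["user"], alert["failed_count"], alert["severity"])
```

A sliding window would catch bursts that cross bucket boundaries, at the cost of keeping per-key timestamps in memory.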
KQL for Microsoft Sentinel
// Sentinel KQL: Anomalous RDP logins
SigninLogs
| where UserPrincipalName has "@"
| where AppDisplayName == "Microsoft Remote Desktop"
| where ResultType == "0" // Success
| summarize LoginCount = count(),
            UniqueIPs = dcount(IPAddress)
            by UserPrincipalName, bin(TimeGenerated, 1h)
| where LoginCount > 10 or UniqueIPs > 3
| join kind=leftanti (
    // Exclude known admins; match on the UPN column so the key lines up with SigninLogs
    IdentityInfo
    | where tostring(Tags) contains "admin"
    | project AccountUPN
  ) on $left.UserPrincipalName == $right.AccountUPN
| project TimeGenerated, UserPrincipalName, LoginCount, UniqueIPs
| order by LoginCount desc
Step 4: SOAR Automation Playbook
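# SOAR playbook: Automated IP blocking
# Endpoints and payload shapes are illustrative; confirm them against each vendor's API reference.
import os
import requests
# API tokens are read from the environment (variable names are placeholders)
FIREWALL_TOKEN = os.environ["FIREWALL_TOKEN"]
CS_TOKEN = os.environ["CS_TOKEN"]
REQUEST_TIMEOUT = 10  # seconds; never let a playbook hang on a dead API
def block_ip(ip_address, incident_id, duration_hours=24):
    """Block a malicious IP on firewall and EDR, then update the SIEM incident."""
    # 1. Block on firewall (Palo Alto)
    payload = {
        "entry": {
            "@name": f"soar-block-{ip_address}",
            "source": ip_address,
            "action": "deny",
            "log-start": "yes",
        }
    }
    resp = requests.post(
        "https://firewall-mgmt/api/addresses",
        json=payload,
        headers={"Authorization": f"Bearer {FIREWALL_TOKEN}"},
        timeout=REQUEST_TIMEOUT,
    )
    resp.raise_for_status()  # stop if the firewall rejected the request
    # 2. Block on EDR (CrowdStrike)
    resp = requests.post(
        "https://api.crowdstrike.com/indicators/entities/iocs/v1",
        json={
            "indicators": [{
                "type": "ipv4",
                "value": ip_address,
                "policy": "detect",
                "action": "block",
                "expiration": f"+{duration_hours}h"
            }]
        },
        headers={"Authorization": f"Bearer {CS_TOKEN}"},
        timeout=REQUEST_TIMEOUT,
    )
    resp.raise_for_status()
    # 3. Update SIEM incident
    resp = requests.patch(
        f"https://splunk.internal/services/incidents/{incident_id}",
        json={"status": "Containment Applied"},
        timeout=REQUEST_TIMEOUT,
    )
    resp.raise_for_status()
    return {"status": "blocked", "ip": ip_address, "duration": duration_hours}
# Example: block IP from alert (the incident ID is a placeholder)
alert = {
    "ip": "185.234.45.67",
    "incident_id": "INC-0001",
    "reason": "C2 beacon detected: 100+ connections to unknown external"
}
result = block_ip(alert["ip"], alert["incident_id"])
print(f"Blocked {result['ip']} for {result['duration']}h")
Expected output: The playbook runs automatically when a C2 beacon alert triggers. Within seconds, the IP is blocked on the firewall and EDR, and the SIEM incident is updated with the containment action. If any API call fails, the playbook raises an error instead of reporting a block that did not happen. T1 analysts don’t need to block the IP manually because the playbook handles it.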
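Automated blocking is risky if the playbook is ever handed an internal address or a critical shared service. A guard like the following could run before `block_ip`; it is a minimal sketch using Python's standard `ipaddress` module, and the `NEVER_BLOCK` list and function name are hypothetical placeholders for whatever allowlist your organization maintains.

```python
import ipaddress

# Hypothetical allowlist of public addresses that must never be blocked
# (here: two well-known public DNS resolvers, purely as an example).
NEVER_BLOCK = [
    ipaddress.ip_network("8.8.8.8/32"),
    ipaddress.ip_network("1.1.1.1/32"),
]

def is_safe_to_block(ip_text: str) -> bool:
    """Return True only for a valid, public, non-allowlisted IP address."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # masked or malformed values such as "185.234.x.x"
    if (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_multicast or ip.is_unspecified):
        return False  # never block internal or special-purpose ranges
    return not any(ip in net for net in NEVER_BLOCK)

for candidate in ["185.234.45.67", "10.1.2.3", "8.8.8.8", "185.234.x.x"]:
    print(candidate, is_safe_to_block(candidate))
```

Rejecting unsafe input before any API call keeps a spoofed or misparsed alert from turning the playbook into a self-inflicted outage.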
Step 5: Compliance Reporting
# Splunk query: User access review report (ISO 27001 A.9.2.5)
index=* sourcetype=win:security (EventCode=4624 OR EventCode=4634)
| eval action=case(
    EventCode==4624, "Logon",
    EventCode==4634, "Logoff"
  )
| eval date=strftime(_time, "%Y-%m-%d")
| stats values(action) as actions,
        dc(date) as active_days,
        count as total_events,
        max(_time) as last_time
        by user, workstation
| where active_days > 0
| eval last_access = strftime(last_time, "%Y-%m-%d %H:%M:%S")
| table user, workstation, actions, active_days, total_events, last_access
| sort user
# Generate monthly compliance report
# -output selects the format; redirect stdout to write the file
splunk search "index=security sourcetype=win:security EventCode=4625
| timechart count by src_ip span=1d" \
-output csv \
-maxout 0 \
-earliest_time @month \
-latest_time now > report_monthly_auth_audit.csv
Expected output: The first query lists each user and workstation with the observed actions, active days, total access events, and last access time. The CLI command exports daily failed-logon counts per source IP to a CSV file. Reports like these support the review of user access rights (A.9.2.5 in ISO 27001:2013), which must happen at regular intervals; this example runs monthly. The SIEM turns compliance reporting into a query rather than a manual audit.
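The aggregation behind the access review report (distinct active days, event count, and most recent access per user and workstation) can be sketched in plain Python. This is an illustration only; the event dictionaries, field names, and sample data are assumptions and do not reflect a real Splunk export format.

```python
import csv
import io
from datetime import datetime

def build_access_report(events) -> str:
    """events: iterable of dicts with 'user', 'workstation', and 'time' (datetime)."""
    rows = {}
    for e in events:
        key = (e["user"], e["workstation"])
        row = rows.setdefault(key, {"days": set(), "total": 0, "last": e["time"]})
        row["days"].add(e["time"].date())          # distinct calendar days
        row["total"] += 1                          # every logon/logoff event
        row["last"] = max(row["last"], e["time"])  # most recent access
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["user", "workstation", "active_days", "total_events", "last_access"])
    for (user, workstation), row in sorted(rows.items()):
        writer.writerow([user, workstation, len(row["days"]), row["total"],
                         row["last"].strftime("%Y-%m-%d %H:%M:%S")])
    return out.getvalue()

# Made-up sample events
sample_events = [
    {"user": "alice", "workstation": "WS1", "time": datetime(2024, 3, 1, 9, 0)},
    {"user": "alice", "workstation": "WS1", "time": datetime(2024, 3, 1, 17, 0)},
    {"user": "alice", "workstation": "WS1", "time": datetime(2024, 3, 2, 9, 5)},
    {"user": "bob", "workstation": "WS2", "time": datetime(2024, 3, 1, 8, 30)},
]
print(build_access_report(sample_events))
```

Note that the last access time has to be computed during aggregation; it cannot be recovered afterwards, because the individual event timestamps are gone once the rows are grouped.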
Common Errors
1. Alert fatigue from poorly tuned rules
Too many false positives desensitize analysts. Tune every correlation rule: start with high-specificity rules (e.g., “Admin account logs in from non-corporate VPN”) and add broader rules only after false positive rates are below 5%. Use threshold-based escalation: 1 alert = info, 10 alerts = warning, 50 alerts = incident.
2. Not normalizing log formats
Logs come in dozens of formats (Syslog, JSON, Windows Event, CEF, LEEF). Without normalization, correlation rules break when a different log source is added. Use a log parser/normalizer (Logstash, Cribl) to convert all logs to a common schema with consistent field names.
3. Ignoring log source coverage
A SIEM is useless if critical log sources aren’t connected. Commonly missing sources include CloudTrail, DNS logs, DHCP logs, VPN logs, web proxy logs, and antivirus logs. Inventory all log sources and track ingestion with a dashboard. Any source offline for more than 24 hours should generate a “SIEM health” alert.
4. No SOAR integration
Manual response doesn’t scale. A SIEM that generates 500 alerts/day needs SOAR automation for roughly 80% of alerts (false positive close, simple IP blocks, account disabling). Only complex incidents need human investigation.
5. Not correlating endpoint and network data
A network alert (C2 beacon) without endpoint context (which process made the connection) is half the picture. Deploy EDR agents and correlate process_id from endpoint logs with network connection logs. The SIEM should combine both for full visibility.
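The tuning guidance in item 1 reduces to two small checks: mapping an alert count to an escalation level, and deciding whether the false positive rate is low enough to add broader rules. A minimal sketch follows; the function names are hypothetical, and the thresholds are the ones stated in item 1.

```python
def escalation_level(alert_count: int) -> str:
    """Map a rule's alert count to a level: 1 = info, 10 = warning, 50 = incident."""
    if alert_count >= 50:
        return "incident"
    if alert_count >= 10:
        return "warning"
    if alert_count >= 1:
        return "info"
    return "none"

def ready_to_broaden(false_positives: int, total_alerts: int, max_rate: float = 0.05) -> bool:
    """True when the observed false positive rate is strictly below max_rate."""
    if total_alerts == 0:
        return False  # no data yet, so no basis for broadening the rule set
    return false_positives / total_alerts < max_rate

for count in (0, 1, 10, 50):
    print(count, escalation_level(count))
print(ready_to_broaden(4, 100), ready_to_broaden(5, 100))
```

Tracking the rate per rule rather than globally makes it clear which specific rule is generating the noise.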
Practice Questions
1. What are the three SOC tiers and their responsibilities?
T1 (Triage): triage alerts, close false positives, escalate confirmed incidents. T2 (Responder): deep investigation, containment, evidence collection, and initial remediation. T3 (Hunter): proactive threat hunting, malware analysis, detection engineering, advanced forensics.
2. What is a SIEM correlation rule?
A query that analyzes multiple log events across time to detect attack patterns. Example: “More than 10 failed logins from same IP in 5 minutes” correlates 10+ individual Windows Event 4625 entries into one security alert.
3. How does SOAR improve incident response?
SOAR (Security Orchestration, Automation, and Response) automates repetitive tasks: blocking IPs on firewalls, creating tickets, enriching alerts with threat intelligence, and running containment playbooks. This reduces response time from hours to seconds.
4. What logs are most important for a SIEM?
Authentication logs (Windows Security Event 4624/4625), firewall logs, web proxy logs, DNS logs, Active Directory changes, VPN logs, endpoint detection logs (EDR), cloud audit logs (CloudTrail, Azure Monitor), and mail gateway logs.
5. Challenge: Build a detection rule for data exfiltration
Write a Splunk or KQL correlation rule that detects when a single user uploads more than 100MB of data via HTTP POST to external IPs in an hour. Include an exception for known cloud storage (Google Drive, OneDrive).
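Before writing the SPL or KQL for the challenge, it can help to pin down the logic in ordinary code. The sketch below is one possible reading of the requirements, not a reference answer: the event fields, the allowlisted host suffixes, and the sample records are all assumptions, and "external" is approximated as "not a private address".

```python
import ipaddress
from collections import defaultdict
from datetime import datetime

THRESHOLD_BYTES = 100 * 1024 * 1024  # 100 MB
# Hypothetical allowlist for sanctioned cloud storage
ALLOWED_SUFFIXES = ("drive.google.com", "onedrive.live.com")

def is_external(ip_text: str) -> bool:
    return not ipaddress.ip_address(ip_text).is_private

def is_allowed_host(host: str) -> bool:
    return any(host == s or host.endswith("." + s) for s in ALLOWED_SUFFIXES)

def detect_exfil(events) -> dict:
    """Sum POST bytes per (user, hour) to external, non-allowlisted destinations."""
    totals = defaultdict(int)
    for e in events:
        if e["method"] != "POST":
            continue
        if not is_external(e["dest_ip"]) or is_allowed_host(e["dest_host"]):
            continue
        hour = e["time"].replace(minute=0, second=0, microsecond=0)
        totals[(e["user"], hour)] += e["bytes_out"]
    return {key: total for key, total in totals.items() if total > THRESHOLD_BYTES}

MB = 1024 * 1024
t = datetime(2024, 3, 1, 14, 5)
# Made-up proxy records for illustration
proxy_events = [
    {"user": "jsmith", "method": "POST", "dest_ip": "185.234.45.67",
     "dest_host": "files.example.net", "bytes_out": 60 * MB, "time": t},
    {"user": "jsmith", "method": "POST", "dest_ip": "185.234.45.67",
     "dest_host": "files.example.net", "bytes_out": 60 * MB, "time": t.replace(minute=40)},
    {"user": "adoe", "method": "POST", "dest_ip": "142.250.0.1",
     "dest_host": "drive.google.com", "bytes_out": 200 * MB, "time": t},
    {"user": "adoe", "method": "GET", "dest_ip": "185.234.45.67",
     "dest_host": "files.example.net", "bytes_out": 500 * MB, "time": t},
]
for (user, hour), total in detect_exfil(proxy_events).items():
    print(user, hour, total // MB, "MB")
```

The same three filters (method, destination, allowlist) followed by a per-user hourly sum translate directly into a `where` clause plus `stats` in SPL or `summarize` in KQL.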
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.