Log Management

Log management is the foundation of threat detection. Without complete, reliable, and accessible logs, a SIEM cannot function and incident investigators cannot reconstruct events.

What to Log

Mandatory Log Sources

Critical Sources (must log — no exceptions):
└─ Authentication: Windows Event 4624/4625, SSH auth.log, VPN auth
└─ Network: Firewall accepts/denies, NetFlow, proxy logs
└─ System: Process creation (Sysmon/EDR), scheduled tasks, service installs
└─ Cloud: CloudTrail (AWS), Activity Log (Azure), Audit Log (GCP)
└─ Email: DMARC/DKIM reports, phishing filter logs
└─ DNS: All queries and responses
└─ Database: Access logs, query audit logs
└─ Application: Web server access logs, API gateway logs
High-Value Sources (strongly recommended):
└─ DHCP: IP-to-MAC mapping (forensic trail)
└─ VPN: Connection logs, source IP
└─ TLS: Certificate transparency logs
└─ Vulnerability scanner: Scan results
└─ Patch management: Patch status
Nice-to-Have:
└─ Physical access: Badge in/out logs
└─ HVAC/OT logs (if applicable)
└─ Printer logs
└─ Voicemail logs

Log Content Requirements

Each log event should answer: Who did What, Where, When, From Where, and How?

Minimum Fields (regardless of log format):
└─ Timestamp (with timezone, ISO 8601 format)
└─ Source IP
└─ Destination IP
└─ User identity (if applicable)
└─ Action/event type
└─ Result (success/failure)
└─ Unique event ID
└─ Hostname / system identifier
Example well-formed log (JSON):
{
  "timestamp": "2026-06-15T14:30:00.123Z",
  "event_id": "4625",
  "source_ip": "203.0.113.42",
  "hostname": "DC-01.corp.company.com",
  "user": "administrator",
  "action": "failed_logon",
  "result": "failure",
  "failure_reason": "unknown_user_or_bad_password",
  "logon_type": 3,
  "process_id": 456,
  "session_id": "0x123abc"
}
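
The field checklist above can be enforced at ingestion time. The following is a minimal sketch, not a definitive validator: the field names mirror the example event, and treating `user` and `destination_ip` as optional is an assumption made here for illustration.

```python
from datetime import datetime

# Field names follow the example event; which fields are mandatory is an
# assumption for this sketch (user and destination_ip are treated as optional).
REQUIRED_FIELDS = ("timestamp", "event_id", "source_ip", "hostname", "action", "result")


def validate_event(event: dict) -> list:
    """Return a list of problems; an empty list means the event passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]

    ts = event.get("timestamp")
    if ts is not None:
        try:
            # fromisoformat() on Python < 3.11 rejects a trailing "Z", so normalise it.
            parsed = datetime.fromisoformat(str(ts).replace("Z", "+00:00"))
            if parsed.tzinfo is None:
                problems.append("timestamp has no timezone")
        except ValueError:
            problems.append("timestamp is not ISO 8601")

    if "result" in event and event["result"] not in ("success", "failure"):
        problems.append("result must be 'success' or 'failure'")
    return problems
```

Events that fail validation are usually better routed to a quarantine index than dropped, so that a misconfigured source stays visible.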

Log Formats

| Format | Structure | Best For | Example |
| --- | --- | --- | --- |
| Syslog (RFC 5424) | Structured key-value | Network devices, Unix/Linux | <38>1 2024-01-15T10:00:00Z host sshd 1234 - - Failed password for root |
| JSON | Name-value pairs | Cloud APIs, modern apps | {"event":"login","user":"admin","status":"failed"} |
| CEF (ArcSight) | Pipe-delimited fields | Legacy SIEM integration | CEF:0 |
| W3C Extended | Tab/space-delimited | IIS web logs, proxies | #Fields: date time c-ip cs-uri-stem sc-status |
| Windows EVTX | Binary (parsed by API) | Windows systems | Read via Event Log API or wevtutil |
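
To show what "structured" means for syslog, here is a hedged sketch of a parser for the RFC 5424 header (`<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG`). It is deliberately simplified: the structured-data pattern does not handle escaped `]` characters inside parameter values, and the sample line is made up for illustration.

```python
import re
from typing import Optional

# Simplified RFC 5424 pattern; structured data is either "-" or one or more
# [bracketed] elements (escaped "]" inside values is NOT handled here).
RFC5424 = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    r"(?P<sd>-|(?:\[.*?\])+)"
    r"(?: (?P<msg>.*))?$"
)


def parse_rfc5424(line: str) -> Optional[dict]:
    """Split one RFC 5424 line into named fields, or return None if it does not match."""
    match = RFC5424.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    pri = int(fields.pop("pri"))
    fields["facility"] = pri // 8  # PRI = facility * 8 + severity
    fields["severity"] = pri % 8
    return fields


sample = "<38>1 2024-01-15T10:00:00Z host sshd 1234 - - Failed password for root"
parsed = parse_rfc5424(sample)
```

For the sample line, PRI 38 decodes to facility 4 (auth) and severity 6 (informational). Production collectors should rely on the SIEM's or log shipper's own parser rather than a hand-written regex.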

Log Retention

Retention Requirements

Retention by Regulation:
└─ PCI DSS: 12 months (3 months immediately available)
└─ HIPAA: 6 years
└─ GDPR: As long as needed for purpose (guidance: 12 months for security)
└─ SOX: 7 years
└─ NIST CSF: Based on risk assessment (guidance: 12 months minimum)
└─ ISO 27001: Based on retention schedule (guidance: 12 months)
Tiered Retention Strategy:
└─ Hot Tier (immediate search): 30-90 days
   └─ Fast storage (SSD/NVMe)
   └─ Full indexing
   └─ Budget: ~$1-3/GB/month
└─ Warm Tier (searchable, slower): 6-12 months
   └─ Standard storage (HDD)
   └─ Reduced indexing
   └─ Budget: ~$0.50-1/GB/month
└─ Cold Tier (archive): 1-7+ years
   └─ Object storage (S3 Glacier, Azure Archive)
   └─ No indexing (must restore for search)
   └─ Budget: ~$0.01/GB/month
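
The tier budgets above can be turned into a rough monthly cost estimate. The sketch below assumes a constant daily volume and steady state (every tier already full); the tier durations and per-GB prices are picked from within the ranges above purely for illustration, and compression, indexing overhead, and retrieval fees are ignored.

```python
# (tier name, days of data held in the tier, assumed USD per GB per month)
# Durations and prices are illustrative choices, not vendor quotes.
EXAMPLE_TIERS = [
    ("hot", 90, 1.00),
    ("warm", 275, 0.50),   # day 91 to day 365
    ("cold", 2190, 0.01),  # years 2 to 7
]


def monthly_storage_cost(daily_gb: float, tiers=EXAMPLE_TIERS) -> dict:
    """Estimate steady-state monthly cost per tier for a constant daily volume."""
    costs = {}
    for name, days_held, usd_per_gb_month in tiers:
        resident_gb = daily_gb * days_held  # data sitting in this tier once it is full
        costs[name] = resident_gb * usd_per_gb_month
    costs["total"] = sum(costs.values())
    return costs
```

Calling `monthly_storage_cost(155)` with these assumptions shows where the money goes in a tiered design: warm and hot storage typically cost several times more than the cold tier, even though cold storage holds the most data.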

Log Volume Estimation

Daily Log Volume Estimates:
└─ Per Windows workstation: 50-200 MB/day (verbose auditing)
└─ Per Windows server: 100-500 MB/day
└─ Per Linux server: 10-50 MB/day
└─ Per firewall: 500 MB - 5 GB/day
└─ Per web server: 1-10 GB/day
└─ Per EDR agent: 100-500 MB/day (telemetry)
└─ AWS CloudTrail (per region): 1-5 GB/day
└─ Azure AD: 500 MB - 2 GB/day
└─ DNS server: 1-5 GB/day
Sample Calculation — 500-user organisation:
└─ 500 workstations: 50 MB × 500 = 25 GB/day
└─ 50 servers: 200 MB × 50 = 10 GB/day
└─ 2 firewalls: 2 GB × 2 = 4 GB/day
└─ CloudTrail (3 regions): 2 GB × 3 = 6 GB/day
└─ EDR: 200 MB × 550 = 110 GB/day
└─ Total: ~155 GB/day → ~4.7 TB/month (30 days) → ~57 TB/year (365 days)
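
The sample calculation is easy to adapt to your own inventory. This is a minimal sketch that reuses the per-unit figures from the calculation above and assumes decimal units (1 GB = 1000 MB); the structure and names are illustrative.

```python
# (source, MB per unit per day, number of units), taken from the sample calculation
SOURCES = [
    ("workstations", 50, 500),
    ("servers", 200, 50),
    ("firewalls", 2000, 2),
    ("CloudTrail regions", 2000, 3),
    ("EDR agents", 200, 550),
]


def daily_volume_gb(sources) -> float:
    """Sum per-source volumes and convert MB to GB (1 GB = 1000 MB)."""
    total_mb = sum(mb_per_unit * count for _, mb_per_unit, count in sources)
    return total_mb / 1000


if __name__ == "__main__":
    daily = daily_volume_gb(SOURCES)
    print(f"{daily:.0f} GB/day")
    print(f"{daily * 30 / 1000:.2f} TB per 30-day month")
    print(f"{daily * 365 / 1000:.1f} TB per 365-day year")
```

With the sample figures this gives 155 GB/day, 4.65 TB per 30-day month, and 56.6 TB per 365-day year. Note that EDR telemetry accounts for 110 of the 155 GB, so the per-agent estimate is the number most worth measuring rather than guessing.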

Log Forwarding

Linux: rsyslog to SIEM

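cat > /etc/rsyslog.d/60-forward.conf << 'EOF'
# CA certificate used to verify the SIEM's TLS certificate (example path; adjust to your PKI)
global(DefaultNetstreamDriverCAFile="/etc/ssl/certs/siem-ca.pem")

# JSON template for structured parsing; the :::json option escapes quotes and control characters
template(name="JsonFormat" type="string"
  string="{\"timestamp\":\"%timegenerated:::date-rfc3339%\",\"host\":\"%hostname:::json%\",\"facility\":\"%syslogfacility-text%\",\"priority\":\"%syslogpriority%\",\"message\":\"%msg:::json%\"}\n")

# Forward security-relevant events (auth, authpriv, cron, kernel) over TLS.
# Port 6514 is the syslog-over-TLS port, so do not also send plain TCP (@@) to it.
auth.*;authpriv.*;cron.*;kern.* action(type="omfwd" target="siem.company.com" port="6514"
  protocol="tcp" template="JsonFormat"
  StreamDriver="gtls" StreamDriverMode="1"
  StreamDriverAuthMode="x509/name"
  StreamDriverPermittedPeers="siem.company.com")
EOF
# Validate the configuration before restarting (the gtls driver requires the rsyslog-gnutls package)
rsyslogd -N1
systemctl restart rsyslog
systemctl status rsyslog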

Windows: Winlogbeat to ELK

# Download and install Winlogbeat
Invoke-WebRequest -Uri "https://artifacts.elastic.co/downloads/beats/winlogbeat/winlogbeat-8.14.0-windows-x86_64.zip" -OutFile "winlogbeat.zip"
Expand-Archive winlogbeat.zip -DestinationPath "C:\Program Files\"
# The archive extracts to a versioned folder; rename it so the paths below resolve
Rename-Item "C:\Program Files\winlogbeat-8.14.0-windows-x86_64" "winlogbeat"
# Configure winlogbeat.yml (YAML is indentation-sensitive)
@"
winlogbeat.event_logs:
  - name: Security
    event_id: 4624, 4625, 4688, 4698, 4719, 4732, 4756
  - name: Microsoft-Windows-Sysmon/Operational
  - name: Microsoft-Windows-PowerShell/Operational
    event_id: 4103, 4104
  - name: System
    event_id: 7036, 7045
output.elasticsearch:
  hosts: ["https://elastic.company.com:9200"]
  username: "winlogbeat"
  # Placeholder only; avoid keeping real credentials in plaintext
  password: "YourPassword"
  # "full" verifies both the certificate chain and the hostname
  ssl.verification_mode: full
"@ | Out-File -Encoding UTF8 "C:\Program Files\winlogbeat\winlogbeat.yml"
# Install and start service (the install script may be blocked by the default execution policy)
cd "C:\Program Files\winlogbeat"
PowerShell.exe -ExecutionPolicy Bypass -File .\install-service-winlogbeat.ps1
Start-Service winlogbeat

Cloud: AWS CloudTrail to SIEM

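# Forward CloudTrail logs via S3 bucket notifications
# 1. Create S3 bucket for CloudTrail
aws s3api create-bucket --bucket company-cloudtrail-logs --region us-east-1
# The bucket needs a bucket policy that allows cloudtrail.amazonaws.com to write to it
# before create-trail will succeed (policy document not shown here)
# 2. Create the trail, then start logging (create-trail alone does not record events)
aws cloudtrail create-trail --name company-trail \
  --s3-bucket-name company-cloudtrail-logs \
  --is-multi-region-trail \
  --enable-log-file-validation
aws cloudtrail start-logging --name company-trail
# 3. Write notification.json, then configure the S3 event notification that feeds the SIEM queue
# The SQS queue's access policy must allow s3.amazonaws.com to send messages
cat > notification.json << 'EOF'
{
  "QueueConfigurations": [{
    "QueueArn": "arn:aws:sqs:us-east-1:123456789012:siem-ingestion",
    "Events": ["s3:ObjectCreated:*"]
  }]
}
EOF
aws s3api put-bucket-notification-configuration \
  --bucket company-cloudtrail-logs \
  --notification-configuration file://notification.json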
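
On the consuming side, the SIEM (or a small forwarder in front of it) reads each SQS message and works out which log file to fetch. The sketch below shows only that parsing step, assuming the message body is the standard S3 event notification JSON; the function name is hypothetical, and fetching the object and polling the queue are left out.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(message_body: str) -> list:
    """Return (bucket, key) pairs from one S3 event notification message."""
    payload = json.loads(message_body)
    objects = []
    # S3 sends a one-off "s3:TestEvent" message with no "Records" key
    # when the notification is first configured; it yields an empty list.
    for record in payload.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in the notification.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects
```

Each returned pair identifies one gzip-compressed CloudTrail log file that can then be downloaded and ingested.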

Log Integrity

Logs must be tamper-proof. If an attacker can modify logs, they can erase evidence of their activity.

Log Integrity Controls:
Forward Immediately:
└─ Logs sent to central SIEM as generated
└─ Local logs are copies, not primary
└─ Network capture: switch port mirroring (out of reach of an attacker who only controls the host)
Write-Once Storage:
└─ Append-only storage (immutable)
└─ AWS S3 Object Lock with retention mode
└─ WORM (Write Once Read Many) storage
Hashing / Signing:
└─ Chain hashing (each log entry includes hash of previous)
└─ RFC 5848: signed syslog messages (syslog-sign)
└─ Blockchain-based verification (for high-security environments)
Access Control:
└─ Separate admin for SIEM (different credentials from AD)
└─ MFA for log management access
└─ Audit log for who accesses logs
└─ Alert when logs are modified or deleted
Monitoring:
└─ Alert on log source going silent (attacker disabled logging)
└─ Alert on log volume anomalies
└─ Monitor SIEM health dashboard
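
Chain hashing, listed under Hashing / Signing above, can be sketched in a few lines. This is an illustration under stated assumptions rather than a production design: entries are held in an in-memory list, events are plain dicts, and the function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # sort_keys gives a stable serialisation so the same event always hashes the same way
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)})


def verify_chain(chain: list) -> int:
    """Return the index of the first inconsistent entry, or -1 if the chain is intact."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        expected = _digest(prev_hash, entry["event"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return -1
```

Editing or deleting any entry breaks the chain from that entry onward. An attacker who can rewrite the whole chain can still recompute every hash, so the most recent hash should be stored somewhere the log host cannot write to, or the entries should be signed with a key held elsewhere.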

Key Takeaways

  • Log management is the foundation of detection — without complete logs, SIEM and incident investigation are impossible
  • Every log event should answer: Who, What, Where, When, From Where, and How — structured formats (JSON, CEF) enable automated parsing
  • Tiered retention (hot → warm → cold) balances cost with accessibility — 30-90 days hot, 6-12 months warm, 1-7+ years cold
  • A 500-user organisation generates ~155 GB/day of security logs — plan storage accordingly
  • Logs must be immutable — if an attacker can modify logs, they can hide their activity (append-only storage, chain hashing)
  • Forward logs immediately to central SIEM — local logs can be deleted; copies in transit can be captured at the switch level
  • Monitor log sources for silence — a log source that stops sending is a red flag (attacker disabled logging or source is down)
  • PCI DSS requires 12 months retention with 3 months immediately available — HIPAA requires 6 years
  • Cloud logging (CloudTrail, Azure Activity Log, GCP Audit Log) must be enabled across all regions and sent to SIEM
  • Log volume grows with org size — scalable architecture (SIEM with tiered storage) prevents cost overruns