Log Management
Log management is the foundation of threat detection. Without complete, reliable, and accessible logs, a SIEM cannot function and incident investigators cannot reconstruct events.
What to Log
Mandatory Log Sources
Critical Sources (must log, no exceptions):
- Authentication: Windows Event 4624/4625, SSH auth.log, VPN auth
- Network: Firewall accepts/denies, NetFlow, proxy logs
- System: Process creation (Sysmon/EDR), scheduled tasks, service installs
- Cloud: CloudTrail (AWS), Activity Log (Azure), Audit Log (GCP)
- Email: DMARC/DKIM reports, phishing filter logs
- DNS: All queries and responses
- Database: Access logs, query audit logs
- Application: Web server access logs, API gateway logs

High-Value Sources (strongly recommended):
- DHCP: IP-to-MAC mapping (forensic trail)
- VPN: Connection logs, source IP
- TLS: Certificate transparency logs
- Vulnerability scanner: Scan results
- Patch management: Patch status

Nice-to-Have:
- Physical access: Badge in/out logs
- HVAC/OT logs (if applicable)
- Printer logs
- Voicemail logs

Log Content Requirements
Each log event should answer: Who did What, Where, When, From Where, and How?
Minimum Fields (RFC 5424 syslog):
- Timestamp (with timezone, ISO 8601 format)
- Source IP
- Destination IP
- User identity (if applicable)
- Action/event type
- Result (success/failure)
- Unique event ID
- Hostname / system identifier
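The minimum fields above lend themselves to an automated check at ingestion time. The following is a minimal sketch, not a reference implementation: the field names follow the JSON example in this section, and `REQUIRED_FIELDS` and `validate_event` are hypothetical names chosen for illustration.

```python
from datetime import datetime

# Assumed field names; adjust to the schema your SIEM actually uses.
REQUIRED_FIELDS = ("timestamp", "event_id", "hostname", "source_ip", "action", "result")


def validate_event(event: dict) -> list:
    """Return a list of problems found in one log event (empty if none)."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if event.get(name) in (None, "")
    ]

    timestamp = event.get("timestamp")
    if isinstance(timestamp, str) and timestamp:
        try:
            # Older Python versions reject a trailing "Z", so normalise it first
            parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        except ValueError:
            problems.append("timestamp is not ISO 8601")
        else:
            if parsed.tzinfo is None:
                problems.append("timestamp has no timezone")

    if event.get("result") not in (None, "", "success", "failure"):
        problems.append("result must be 'success' or 'failure'")
    return problems


good = {
    "timestamp": "2026-06-15T14:30:00.123Z",
    "event_id": "4625",
    "source_ip": "203.0.113.42",
    "hostname": "DC-01.corp.company.com",
    "user": "administrator",
    "action": "failed_logon",
    "result": "failure",
}
bad = {"timestamp": "2026-06-15 14:30:00", "action": "failed_logon", "result": "denied"}

print(validate_event(good))  # []
for problem in validate_event(bad):
    print(problem)
```

Rejecting or quarantining events that fail such a check keeps malformed records from silently degrading searches and correlation rules later on.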
Example well-formed log (JSON):

```json
{
  "timestamp": "2026-06-15T14:30:00.123Z",
  "event_id": "4625",
  "source_ip": "203.0.113.42",
  "hostname": "DC-01.corp.company.com",
  "user": "administrator",
  "action": "failed_logon",
  "result": "failure",
  "failure_reason": "unknown_user_or_bad_password",
  "logon_type": 3,
  "process_id": 456,
  "session_id": "0x123abc"
}
```

Log Formats
| Format | Structure | Best For | Example |
|---|---|---|---|
| Syslog (RFC 5424) | Structured key-value | Network devices, Unix/Linux | <38>1 2024-01-15T10:00:00Z host sshd 1234 - - Failed password for root |
| JSON | Name-value pairs | Cloud APIs, modern apps | {"event":"login","user":"admin","status":"failed"} |
| CEF (ArcSight) | Pipe-delimited fields | Legacy SIEM integration | `CEF:0\|…` |
| W3C Extended | Tab/space-delimited | IIS web logs, proxies | #Fields: date time c-ip cs-uri-stem sc-status |
| Windows EVTX | Binary (parsed by API) | Windows systems | Read via Event Log API or wevtutil |
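To show how a text format becomes structured data, here is a hedged sketch of parsing one syslog line laid out in the RFC 5424 header order (priority, version, timestamp, hostname, app name, process ID, message ID, structured data, message). The regular expression is a simplification that does not handle escaped `]` characters inside structured data, and `parse_syslog_5424` is a hypothetical helper name.

```python
import re

# Simplified RFC 5424 layout:
# <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA [MSG]
SYSLOG_5424 = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    r"(?P<structured_data>-|(?:\[.*?\])+)"
    r"(?: (?P<message>.*))?$"
)


def parse_syslog_5424(line: str) -> dict:
    """Parse one RFC 5424 line into a dict; raise ValueError if it does not match."""
    match = SYSLOG_5424.match(line)
    if match is None:
        raise ValueError("line is not in RFC 5424 format")

    fields = match.groupdict()
    # "-" is the RFC 5424 NILVALUE: the field is intentionally empty
    for name in ("timestamp", "hostname", "app_name", "procid", "msgid", "structured_data"):
        if fields[name] == "-":
            fields[name] = None

    # PRI encodes facility * 8 + severity
    pri = int(fields.pop("pri"))
    fields["facility"] = pri // 8
    fields["severity"] = pri % 8
    fields["version"] = int(fields["version"])
    return fields


line = "<38>1 2024-01-15T10:00:00Z host sshd 1234 - - Failed password for root"
event = parse_syslog_5424(line)
print(event["facility"], event["severity"])  # 4 6
print(event["app_name"], event["message"])   # sshd Failed password for root
```

In practice the SIEM's own parsers do this work; the sketch only illustrates why a fixed field order makes automated parsing reliable.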
Log Retention
Retention Requirements
Retention by Regulation:
- PCI DSS: 12 months (3 months immediately available)
- HIPAA: 6 years
- GDPR: As long as needed for purpose (guidance: 12 months for security)
- SOX: 7 years
- NIST CSF: Based on risk assessment (guidance: 12 months minimum)
- ISO 27001: Based on retention schedule (guidance: 12 months)

Tiered Retention Strategy:
- Hot Tier (immediate search): 30-90 days
  - Fast storage (SSD/NVMe)
  - Full indexing
  - Budget: ~$1-3/GB/month
- Warm Tier (searchable, slower): 6-12 months
  - Standard storage (HDD)
  - Reduced indexing
  - Budget: ~$0.50-1/GB/month
- Cold Tier (archive): 1-7+ years
  - Object storage (S3 Glacier, Azure Archive)
  - No indexing (must restore for search)
  - Budget: ~$0.01/GB/month

Log Volume Estimation
Daily Log Volume Estimates:
- Per Windows workstation: 50-200 MB/day (verbose auditing)
- Per Windows server: 100-500 MB/day
- Per Linux server: 10-50 MB/day
- Per firewall: 500 MB - 5 GB/day
- Per web server: 1-10 GB/day
- Per EDR agent: 100-500 MB/day (telemetry)
- AWS CloudTrail (per region): 1-5 GB/day
- Azure AD: 500 MB - 2 GB/day
- DNS server: 1-5 GB/day
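Per-source figures like these become a sizing estimate by multiplying each rate by the number of sources and summing. A minimal sketch, assuming the device counts and per-source rates of the 500-user sample calculation in this section; the `Source` type, the function names, and the 90-day and 365-day tier boundaries are illustrative choices rather than fixed rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    count: int
    mb_per_day_each: int  # integer megabytes keep the arithmetic exact


def daily_volume_gb(sources) -> float:
    """Total daily log volume in GB (1 GB = 1000 MB)."""
    return sum(s.count * s.mb_per_day_each for s in sources) / 1000


def tier_footprint_tb(daily_gb: float, hot_days: int = 90, warm_until_day: int = 365) -> dict:
    """Steady-state storage per tier, assuming logs move from hot to warm as they age."""
    return {
        "hot": daily_gb * hot_days / 1000,
        "warm": daily_gb * (warm_until_day - hot_days) / 1000,
    }


SOURCES = [
    Source("workstations", 500, 50),
    Source("servers", 50, 200),
    Source("firewalls", 2, 2000),
    Source("cloudtrail_regions", 3, 2000),
    Source("edr_agents", 550, 200),
]

daily = daily_volume_gb(SOURCES)
print(f"{daily:.0f} GB/day")  # 155 GB/day
for tier, size_tb in tier_footprint_tb(daily).items():
    print(f"{tier}: {size_tb:.2f} TB")
```

These are raw ingest figures; compression, index overhead, and replication change the on-disk numbers, so treat the result as a starting point for a capacity discussion.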
Sample Calculation (500-user organisation):
- 500 workstations: 50 MB × 500 = 25 GB/day
- 50 servers: 200 MB × 50 = 10 GB/day
- 2 firewalls: 2 GB × 2 = 4 GB/day
- CloudTrail (3 regions): 2 GB × 3 = 6 GB/day
- EDR: 200 MB × 550 = 110 GB/day
- Total: ~155 GB/day → ~4.7 TB/month → ~57 TB/year

Log Forwarding
Linux: rsyslog to SIEM
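```bash
cat > /etc/rsyslog.d/60-forward.conf << 'EOF'
# Send in JSON format for structured parsing; the "json" option escapes quotes
# and control characters in the message so the output stays valid JSON
$template JsonFormat,"{\"timestamp\":\"%timegenerated:::date-rfc3339%\",\"host\":\"%hostname%\",\"facility\":\"%syslogfacility-text%\",\"priority\":\"%syslogpriority%\",\"message\":\"%msg:::json%\"}\n"

# Security-relevant events (auth, sudo, cron, kernel), forwarded over TLS.
# Replace the selector with *.* to forward everything.
# Assumes the CA certificate is already configured globally
# (DefaultNetstreamDriverCAFile) so the SIEM certificate can be verified.
auth.*;authpriv.*;cron.*;kern.* action(type="omfwd" target="siem.company.com" port="6514" protocol="tcp" template="JsonFormat" StreamDriver="gtls" StreamDriverMode="1" StreamDriverAuthMode="x509/name" StreamDriverPermittedPeers="siem.company.com")
EOF

# Check the configuration before restarting
rsyslogd -N1
systemctl restart rsyslog
systemctl status rsyslog
```

Windows: Winlogbeat to ELK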
```powershell
# Download and install Winlogbeat
Invoke-WebRequest -Uri "https://artifacts.elastic.co/downloads/beats/winlogbeat/winlogbeat-8.14.0-windows-x86_64.zip" -OutFile "winlogbeat.zip"
Expand-Archive winlogbeat.zip -DestinationPath "C:\Program Files\"
# The archive extracts to a versioned folder; rename it so the paths below resolve
Rename-Item "C:\Program Files\winlogbeat-8.14.0-windows-x86_64" "winlogbeat"

# Configure winlogbeat.yml
@"
winlogbeat.event_logs:
  - name: Security
    event_id: 4624, 4625, 4688, 4698, 4719, 4732, 4756
  - name: Microsoft-Windows-Sysmon/Operational
  - name: Microsoft-Windows-PowerShell/Operational
    event_id: 4103, 4104
  - name: System
    event_id: 7036, 7045

output.elasticsearch:
  hosts: ["https://elastic.company.com:9200"]
  username: "winlogbeat"
  # Placeholder: prefer the Winlogbeat keystore over a plaintext password
  password: "YourPassword"
  ssl.verification_mode: certificate
"@ | Out-File -Encoding UTF8 "C:\Program Files\winlogbeat\winlogbeat.yml"

# Install and start service
cd "C:\Program Files\winlogbeat"
.\install-service-winlogbeat.ps1
Start-Service winlogbeat
```

Cloud: AWS CloudTrail to SIEM
```bash
# Forward CloudTrail logs via S3 bucket notifications

# 1. Create S3 bucket for CloudTrail
#    (the bucket policy must allow cloudtrail.amazonaws.com to write to it)
aws s3api create-bucket --bucket company-cloudtrail-logs --region us-east-1

# 2. Enable CloudTrail
aws cloudtrail create-trail --name company-trail \
  --s3-bucket-name company-cloudtrail-logs \
  --is-multi-region-trail \
  --enable-log-file-validation
# A trail created from the CLI records nothing until logging is started
aws cloudtrail start-logging --name company-trail

# 3. Configure S3 event notification to send logs to SIEM
#    (the SQS queue policy must allow s3.amazonaws.com to send messages)
aws s3api put-bucket-notification-configuration \
  --bucket company-cloudtrail-logs \
  --notification-configuration file://notification.json
```

notification.json:

```json
{
  "QueueConfigurations": [{
    "QueueArn": "arn:aws:sqs:us-east-1:123456789012:siem-ingestion",
    "Events": ["s3:ObjectCreated:*"]
  }]
}
```

Log Integrity
Logs must be tamper-resistant, and any tampering that does occur must be detectable. If an attacker can modify logs unnoticed, they can erase evidence of their activity.
Log Integrity Controls:
Forward Immediately:
- Logs sent to central SIEM as generated
- Local logs are copies, not primary
- Network capture: switch port mirroring (cannot be disabled from a compromised host)

Write-Once Storage:
- Append-only storage (immutable)
- AWS S3 Object Lock with retention mode
- WORM (Write Once Read Many) storage
Hashing / Signing:
- Chain hashing (each log entry includes hash of previous)
- RFC 5848: signed syslog messages (syslog-sign)
- Blockchain-based verification (for high-security environments)
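Chain hashing can be illustrated in a few lines. This is a minimal sketch under stated assumptions, not a production design: it uses SHA-256 over a canonical JSON encoding of each event, and `GENESIS`, `chain_entries`, and `verify_chain` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's prev_hash


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys, no extra whitespace) so the hash is reproducible
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def chain_entries(events) -> list:
    """Wrap each event with the previous entry's hash and its own hash."""
    prev_hash = GENESIS
    chained = []
    for event in events:
        entry_hash = _digest(prev_hash, event)
        chained.append({"event": event, "prev_hash": prev_hash, "hash": entry_hash})
        prev_hash = entry_hash
    return chained


def verify_chain(chained):
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = GENESIS
    for index, entry in enumerate(chained):
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return index
        prev_hash = entry["hash"]
    return None


events = [
    {"user": "alice", "action": "logon", "result": "success"},
    {"user": "mallory", "action": "logon", "result": "failure"},
    {"user": "alice", "action": "logoff", "result": "success"},
]
chain = chain_entries(events)
print(verify_chain(chain))  # None

chain[1]["event"]["user"] = "nobody"  # simulate an attacker editing one entry
print(verify_chain(chain))  # 1
```

A chain only detects edits if the attacker cannot recompute it end to end, so the latest hash should be stored or signed somewhere the log host cannot write, for example in the write-once storage described above.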
Access Control:
- Separate admin for SIEM (different credentials from AD)
- MFA for log management access
- Audit log for who accesses logs
- Alert when logs are modified or deleted

Monitoring:
- Alert on log source going silent (attacker disabled logging)
- Alert on log volume anomalies
- Monitor SIEM health dashboard

Key Takeaways
- Log management is the foundation of detection — without complete logs, SIEM and incident investigation are impossible
- Every log event should answer: Who, What, Where, When, From Where, and How — structured formats (JSON, CEF) enable automated parsing
- Tiered retention (hot → warm → cold) balances cost with accessibility — 30-90 days hot, 6-12 months warm, 1-7+ years cold
- A 500-user organisation generates ~155 GB/day of security logs — plan storage accordingly
- Logs must be immutable — if an attacker can modify logs, they can hide their activity (append-only storage, chain hashing)
- Forward logs immediately to a central SIEM: local logs can be deleted by an attacker, while forwarded traffic can also be captured independently through switch port mirroring
- Monitor log sources for silence — a log source that stops sending is a red flag (attacker disabled logging or source is down)
- PCI DSS requires 12 months retention with 3 months immediately available — HIPAA requires 6 years
- Cloud logging (CloudTrail, Azure Activity Log, GCP Audit Log) must be enabled across all regions and sent to SIEM
- Log volume grows with org size — scalable architecture (SIEM with tiered storage) prevents cost overruns
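
The takeaway about silent log sources reduces to comparing each source's last-seen time against a threshold. A minimal sketch, assuming the SIEM can export a last-event timestamp per source; `silent_sources`, the host names, and the 15-minute threshold are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def silent_sources(last_seen: dict, now: datetime,
                   max_silence: timedelta = timedelta(minutes=15)) -> list:
    """Return the names of sources whose most recent event is older than max_silence."""
    return sorted(name for name, seen in last_seen.items() if now - seen > max_silence)


now = datetime(2026, 6, 15, 14, 30, tzinfo=timezone.utc)
last_seen = {
    "DC-01": now - timedelta(minutes=2),
    "fw-edge": now - timedelta(hours=3),
    "web-01": now - timedelta(minutes=20),
}
print(silent_sources(last_seen, now))  # ['fw-edge', 'web-01']
```

A single fixed threshold is a simplification: a busy firewall that goes quiet for five minutes is more suspicious than a printer that logs once a day, so per-source thresholds are usually worth the extra configuration.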