Risk Management

Risk management is the core discipline of cybersecurity. Every decision a security professional makes — what to prioritise, what to accept, what to pay for — is a risk decision. Without risk management, you are guessing. With it, you are making informed trade-offs that the business can understand.

The Risk Formula

Risk = Threat × Vulnerability × Impact

All three elements must be present for risk to exist:

| Element | Definition | Example | Without This Element |
| --- | --- | --- | --- |
| Threat | What could cause harm | A ransomware gang targeting your industry | No risk: no one is trying to attack you |
| Vulnerability | A weakness that could be exploited | Unpatched Apache Struts server | No risk: there is no way in |
| Impact | The consequence if exploited | $50M in lost revenue from 2-week outage | No risk: even if breached, it does not matter |

Real-world implication: If you have a critical vulnerability on an internal server that has no network connectivity and no sensitive data, the vulnerability exists but the threat can barely reach it and the impact is negligible, so the risk is low. Conversely, if you have a well-patched internet-facing server with customer PII, the likelihood of exploitation is low but the impact is high, so the risk is still significant.
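
The multiplicative relationship can be sketched in a few lines. This is only an illustration: the 0.0 to 1.0 scale, the function name `risk_score`, and the example scores are assumptions, not part of any standard.

```python
def risk_score(threat: float, vulnerability: float, impact: float) -> float:
    """Multiply three factors, each scored from 0.0 (absent) to 1.0 (maximal)."""
    factors = {"threat": threat, "vulnerability": vulnerability, "impact": impact}
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return threat * vulnerability * impact


# Isolated internal server: serious weakness, but hard to reach and little to lose.
isolated = risk_score(threat=0.2, vulnerability=0.9, impact=0.05)

# Well-patched internet-facing server holding customer PII.
public_pii = risk_score(threat=0.9, vulnerability=0.2, impact=0.9)

# Any factor at zero removes the risk entirely.
no_threat = risk_score(threat=0.0, vulnerability=0.9, impact=0.9)

print(f"isolated={isolated:.3f} public_pii={public_pii:.3f} no_threat={no_threat:.3f}")
```

Because the factors are multiplied rather than added, a zero in any one of them drives the whole score to zero, which is the point of the "Without This Element" column.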

Threat, Vulnerability, Risk — The Relationship

These three terms are often confused. Here is how they relate:

THREAT (attacker knocking on doors)
↓ finds
VULNERABILITY (an unlocked door)
↓ leading to
IMPACT (what is inside gets stolen)
↓ =
RISK (how likely this is × how bad it would be)

Example: A ransomware gang (threat) scanning the internet for exposed RDP ports finds that your organisation has RDP open to the internet on a domain controller (vulnerability). If they exploit this, they could encrypt your domain and halt operations for weeks (impact). The risk is high because the threat is active, the vulnerability exists, and the impact is severe.

Risk Assessment Methodologies

Qualitative Risk Assessment

Qualitative assessment uses descriptive scales (High/Medium/Low) rather than numerical values. It is faster, more accessible, and widely used for initial assessments.

    graph TD
    A[Identify Assets] --> B[Identify Threats]
    B --> C[Identify Vulnerabilities]
    C --> D[Assess Likelihood: H/M/L]
    D --> E[Assess Impact: H/M/L]
    E --> F[Calculate Risk: Likelihood × Impact]
    F --> G[Prioritise: Heat Map]
  

Risk heat map (5×5 matrix):

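| Likelihood \ Impact | Negligible | Minor | Moderate | Major | Catastrophic |
| --- | --- | --- | --- | --- | --- |
| Almost Certain | Medium | High | High | Critical | Critical |
| Likely | Medium | Medium | High | High | Critical |
| Possible | Low | Medium | Medium | High | High |
| Unlikely | Low | Low | Medium | Medium | High |
| Rare | Low | Low | Low | Medium | Medium |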
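
The heat map is just a lookup table, so it can be encoded directly. The sketch below copies the ratings from the matrix above; the names `HEAT_MAP` and `rate` are illustrative.

```python
LIKELIHOODS = ["Rare", "Unlikely", "Possible", "Likely", "Almost Certain"]
IMPACTS = ["Negligible", "Minor", "Moderate", "Major", "Catastrophic"]

# Rows follow LIKELIHOODS (Rare first); columns follow IMPACTS.
HEAT_MAP = [
    ["Low", "Low", "Low", "Medium", "Medium"],           # Rare
    ["Low", "Low", "Medium", "Medium", "High"],          # Unlikely
    ["Low", "Medium", "Medium", "High", "High"],         # Possible
    ["Medium", "Medium", "High", "High", "Critical"],    # Likely
    ["Medium", "High", "High", "Critical", "Critical"],  # Almost Certain
]


def rate(likelihood: str, impact: str) -> str:
    """Look up the qualitative risk rating for a likelihood/impact pair."""
    return HEAT_MAP[LIKELIHOODS.index(likelihood)][IMPACTS.index(impact)]


print(rate("Likely", "Catastrophic"))
print(rate("Unlikely", "Moderate"))
```

Keeping the matrix in one place like this makes it easy to check that every entry in a risk register uses the rating the matrix actually prescribes.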

Example qualitative risk register:

| Asset | Threat | Vulnerability | Likelihood | Impact | Risk | Treatment |
| --- | --- | --- | --- | --- | --- | --- |
| Customer database | SQL injection via web app | Web app uses string concatenation for queries | Likely | Catastrophic | Critical | Rewrite with parameterised queries (project) |
| Email system | Spear-phishing attack | No MFA on executive accounts | Almost Certain | Major | Critical | Enforce MFA (1 week) |
| Public website | DDoS attack | No DDoS protection | Possible | Minor | Medium | Accept risk |
| Office Wi-Fi | Rogue access point | No 802.1X network access control | Unlikely | Moderate | Medium | Implement NAC (Q3 project) |

Quantitative Risk Assessment (FAIR Model)

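Quantitative assessment uses dollar values and statistical analysis. The Factor Analysis of Information Risk (FAIR) model, published by The Open Group as Open FAIR, is a widely used standard for it. The classic SLE/ARO/ALE terms below are a simpler point-estimate approach: SLE corresponds roughly to FAIR's loss magnitude and ARO to its loss event frequency, which FAIR estimates as ranges rather than single numbers.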

Key quantitative terms:

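| Term | Definition | Formula / Unit |
| --- | --- | --- |
| SLE (Single Loss Expectancy) | Cost of a single incident | SLE = AV × EF |
| AV (Asset Value) | Dollar value of the asset | $ |
| EF (Exposure Factor) | Percentage of asset value lost in one incident | % (0-100) |
| ARO (Annual Rate of Occurrence) | Expected number of incidents per year | Events/year (0.1 = once per decade; can exceed 1) |
| ALE (Annualised Loss Expectancy) | Expected annual cost | ALE = SLE × ARO |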

Example — Ransomware risk calculation:

Asset: Financial databases + file servers
Asset Value (AV): $5,000,000 (replacement + data value + downtime cost)
Exposure Factor (EF): 60% (partial data loss + 3-week restoration)
Single Loss Expectancy (SLE) = $5,000,000 × 0.60 = $3,000,000
Annual Rate of Occurrence (ARO): 0.3 (one incident every 3-4 years, based on industry stats)
Annualised Loss Expectancy (ALE) = $3,000,000 × 0.3 = $900,000/year
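
The same arithmetic as a short sketch, using the figures from the example above. The function names are illustrative.

```python
def single_loss_expectancy(asset_value: float, exposure_factor: float) -> float:
    """SLE = AV x EF, with the exposure factor given as a fraction (0.60 = 60%)."""
    return asset_value * exposure_factor


def annualised_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE x ARO, with ARO in expected incidents per year."""
    return sle * aro


sle = single_loss_expectancy(asset_value=5_000_000, exposure_factor=0.60)
ale = annualised_loss_expectancy(sle, aro=0.3)
print(f"SLE = ${sle:,.0f}, ALE = ${ale:,.0f}/year")
```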

Now compare control costs:

Control option 1: Offline backups + tested DR plan
Cost: $150,000/year (backup infrastructure + testing)
New EF: 10% (recover within 48 hours, minimal data loss)
New SLE: $5,000,000 × 0.10 = $500,000
New ALE: $500,000 × 0.3 = $150,000/year
Annual savings: $900,000 - $150,000 = $750,000
Net benefit: $750,000 - $150,000 = $600,000/year (ROI = $600,000 / $150,000 = 400%)

Control option 2: Everything-in-the-cloud migration
Cost: $2,000,000/year (cloud migration + rearchitecture)
New ARO: 0.1 (assumes the cloud provider has better baseline security)
New ALE: $3,000,000 × 0.1 = $300,000/year
Annual savings: $900,000 - $300,000 = $600,000
Net benefit: $600,000 - $2,000,000 = -$1,400,000/year, a negative ROI (unless the migration also provides other benefits)
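
A sketch of the comparison, assuming the costs and residual ALE figures from the two options above. `ControlOption` and `net_benefit` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

BASELINE_ALE = 900_000  # ALE before any new control, from the ransomware example


@dataclass
class ControlOption:
    name: str
    annual_cost: int
    residual_ale: int  # expected ALE once the control is in place


def net_benefit(option: ControlOption, baseline_ale: int = BASELINE_ALE) -> int:
    """Annual loss avoided minus the annual cost of the control."""
    return (baseline_ale - option.residual_ale) - option.annual_cost


options = [
    ControlOption("Offline backups + tested DR plan", 150_000, 150_000),
    ControlOption("Cloud migration", 2_000_000, 300_000),
]

# Rank the options from most to least worthwhile.
for option in sorted(options, key=net_benefit, reverse=True):
    print(f"{option.name}: net benefit ${net_benefit(option):,}/year")
```

A negative result means the control costs more than the loss it is expected to prevent, so it needs some other justification.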

FAIR analysis output:

Risk: Ransomware on financial systems
Probable loss frequency: 0.3 events/year
Probable loss magnitude: $3,000,000 per event
Annualised loss expectancy: $900,000
Confidence interval (90%): $450,000 - $1,800,000
Recommended control: Offline immutable backups + DR plan
Residual ALE with control: $150,000
Annual benefit of control: $750,000
Control cost: $150,000/year
Net benefit: $600,000/year
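
Ranges like the confidence interval above usually come from simulation rather than a single multiplication. The sketch below is a simplified Monte Carlo in the spirit of frequency × magnitude analysis, not the official FAIR procedure. The Poisson frequency model, the triangular loss distribution, and its bounds ($1.5M low, $3M most likely, $6M high) are all assumptions chosen for illustration, so the output will not reproduce the figures above.

```python
import math
import random


def poisson(rate: float, rng: random.Random) -> int:
    """Sample an event count with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_annual_losses(frequency, low, likely, high, trials=20_000, seed=1):
    """Simulate total loss per year: a random number of events, each with a random size."""
    rng = random.Random(seed)
    losses = []
    for _ in range(trials):
        events = poisson(frequency, rng)
        losses.append(sum(rng.triangular(low, high, likely) for _ in range(events)))
    return losses


losses = sorted(simulate_annual_losses(0.3, 1_500_000, 3_000_000, 6_000_000))
mean_loss = sum(losses) / len(losses)
p95 = losses[int(0.95 * len(losses))]
zero_loss_share = losses.count(0) / len(losses)

print(f"Mean annual loss:       ${mean_loss:,.0f}")
print(f"95th percentile year:   ${p95:,.0f}")
print(f"Years with no incident: {zero_loss_share:.0%}")
```

The useful output is the shape of the distribution: most simulated years have no incident at all, while a bad year costs several times the annualised average.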

Risk Treatment Options

Once risk is identified and assessed, you must decide what to do:

| Strategy | Description | When to Use | Example |
| --- | --- | --- | --- |
| Avoid | Eliminate the activity that creates risk | When risk outweighs benefit | "We will not store credit card numbers; we use a tokenisation service" |
| Transfer | Shift risk to a third party | When another party can manage it better | "We bought cyber insurance to cover ransomware losses" |
| Mitigate | Implement controls to reduce likelihood or impact | When cost of control < expected loss | "We deployed MFA to reduce account takeover risk" |
| Accept | Acknowledge the risk and monitor it | When cost of control > expected loss | "We accept the risk of the public wiki being defaced; recovery is trivial" |
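
The "When to Use" column can be read as a rough decision rule. The sketch below encodes it as a heuristic. The order in which the checks run, the `transferable` flag, and all dollar figures are assumptions for illustration; real treatment decisions also depend on risk appetite and regulation.

```python
def suggest_treatment(
    activity_benefit: float,
    expected_loss: float,
    control_cost: float,
    transferable: bool = False,
) -> str:
    """Suggest a treatment strategy from annual dollar estimates."""
    if expected_loss > activity_benefit:
        return "Avoid"      # risk outweighs the benefit of the activity
    if control_cost < expected_loss:
        return "Mitigate"   # the control is cheaper than the loss it addresses
    if transferable:
        return "Transfer"   # a third party (insurer, vendor) can carry it
    return "Accept"         # the control would cost more than the expected loss


# Hypothetical figures loosely based on the examples in the table.
print(suggest_treatment(activity_benefit=20_000, expected_loss=500_000, control_cost=50_000))
print(suggest_treatment(activity_benefit=1_000_000, expected_loss=400_000, control_cost=30_000))
print(suggest_treatment(activity_benefit=50_000, expected_loss=2_000, control_cost=10_000))
```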

Residual vs inherent risk:

INHERENT RISK: Risk before any controls are applied
CONTROLS: What you do to reduce risk
RESIDUAL RISK: Risk that remains after controls
Do you accept residual risk? If not → add more controls

Example — Inherent vs residual for a public web application:

| Risk Element | Inherent (no controls) | Control Applied | Residual (after control) |
| --- | --- | --- | --- |
| SQL injection | Critical (public app, no input validation) | WAF + parameterised queries | Low |
| DDoS | High (single server, no capacity) | Cloudflare CDN + auto-scaling | Low |
| Credential stuffing | Critical (no rate limiting, no MFA) | Rate limiting + MFA + account lockout | Low |
| XSS | Critical (user-submitted content, no sanitisation) | Content Security Policy + output encoding | Medium |

Third-Party Risk Management

Many high-profile breaches involve a third party. Target was breached through credentials stolen from an HVAC vendor. SolarWinds was a supply chain attack. Capital One's cloud provider (AWS) was not breached, but Capital One misconfigured its use of it.

TPRM process:

1. IDENTIFY: Which vendors have access to our data or network?
→ Vendor registry with data classification and access level
2. ASSESS: What risk does each vendor present?
→ Questionnaires (SIG), SOC 2 reports, penetration test results
3. TIER: Not all vendors are equal
→ Tier 1 (critical): SOC 2 Type II + pen test required
→ Tier 2 (important): SOC 2 Type II or equivalent
→ Tier 3 (low): Self-assessment only
4. TREAT: What do we require from each vendor?
→ Contractual security requirements, right-to-audit clause
5. MONITOR: Continuous oversight
→ Annual reassessment, breach notification agreements

Sample vendor tiering:

| Tier | Criteria | Requirements | Cadence |
| --- | --- | --- | --- |
| Tier 1 | Processes PII, or has network access | SOC 2 Type II, pentest, BIA, cyber insurance | Annual assessment |
| Tier 2 | Connects to non-sensitive systems | SOC 2 Type II or equivalent | Annual assessment |
| Tier 3 | No access to our environment | Self-assessment questionnaire | Upon onboarding only |
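
The tiering criteria can be applied mechanically once the vendor registry records the relevant facts. A minimal sketch, assuming a hypothetical `Vendor` record with three boolean fields; the vendor names are invented.

```python
from dataclasses import dataclass

# Requirements per tier, copied from the sample tiering table.
REQUIREMENTS = {
    1: ["SOC 2 Type II", "pentest", "BIA", "cyber insurance"],
    2: ["SOC 2 Type II or equivalent"],
    3: ["Self-assessment questionnaire"],
}


@dataclass
class Vendor:
    name: str
    processes_pii: bool
    has_network_access: bool
    connects_to_non_sensitive_systems: bool


def assign_tier(vendor: Vendor) -> int:
    """Return the vendor's tier, checking the most critical criteria first."""
    if vendor.processes_pii or vendor.has_network_access:
        return 1
    if vendor.connects_to_non_sensitive_systems:
        return 2
    return 3


vendors = [
    Vendor("Example HVAC Services", False, True, False),
    Vendor("Example Survey Tool", False, False, True),
    Vendor("Example Catering Co", False, False, False),
]
for vendor in vendors:
    tier = assign_tier(vendor)
    print(f"{vendor.name}: Tier {tier}, requires {', '.join(REQUIREMENTS[tier])}")
```

Note that an HVAC contractor with network access lands in Tier 1 even though it handles no PII, which is the lesson of the Target breach.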

Case Study: Equifax (Complete)

Equifax is one of the most instructive risk management case studies in cybersecurity because it involved failures at nearly every level of the risk management process.

Timeline:

| Date | Event |
| --- | --- |
| March 7, 2017 | Apache releases patch for CVE-2017-5638 (Struts RCE), CVSS 10.0 |
| March 7 - May 13 | Equifax has the patch available but does not apply it |
| May 13 | Attackers who had been scanning for unpatched Struts servers exploit Equifax's vulnerable system and gain access |
| May 13 - July 29 | Attackers maintain access, move laterally, locate unencrypted database |
| July 29 | Equifax security team detects suspicious traffic |
| September 7 | Equifax publicly discloses the breach |
| 2017-2019 | CEO retires (September 2017), Congressional hearings (October 2017), roughly $1.4B in settlements (2019) |

Risk management failures at every level:

| Failure | What Should Have Happened |
| --- | --- |
| No asset inventory: they did not know they had a Struts server | Asset discovery should identify all internet-facing systems |
| No patch SLA: CVSS 10.0 vulnerability unpatched for 2+ months | Critical patching SLA of 48 hours or less |
| Incomplete vulnerability scanning: they had a scanner but it did not cover this system | Authenticated scanning covering all systems |
| No network segmentation: attacker reached the database from the web tier | Database should be in a separate subnet with strict firewall rules |
| No database encryption: 147M records stored in plaintext | Encryption at rest (AES-256), with keys kept away from the web tier, would have reduced the value of stolen data |
| No data exfiltration detection: 147M records left over months | DLP + anomaly detection would have alerted on the volume |
| Slow detection and response: took more than 2 months to discover they were breached | EDR + SIEM with 24/7 monitoring |
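
A patch SLA only helps if something checks it. The sketch below measures how far past a 48-hour critical SLA the Struts patch was on the day of the intrusion, using the dates from the timeline. The function name and the idea of running it as a scheduled check are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

CRITICAL_PATCH_SLA = timedelta(hours=48)


def time_overdue(
    patch_released: datetime,
    as_of: datetime,
    patched_at: Optional[datetime] = None,
    sla: timedelta = CRITICAL_PATCH_SLA,
) -> timedelta:
    """Return how long a patch is past its SLA (zero if still within it)."""
    end = patched_at if patched_at is not None else as_of
    return max(end - patch_released - sla, timedelta(0))


released = datetime(2017, 3, 7)    # Apache releases the CVE-2017-5638 patch
intrusion = datetime(2017, 5, 13)  # attackers gain access

overdue = time_overdue(released, as_of=intrusion)
print(f"Overdue by {overdue.days} days at the time of intrusion")  # prints 65
```

A report like this, escalated automatically once the overdue time is non-zero, is one way to create the accountability for missed patches that the process lacked.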

The root cause was not technical — it was risk management. Equifax had a vulnerability management program on paper. They ran scans. They had a patching process. But the process failed because:

  • The scanning did not cover all assets
  • The patching SLA was not enforced
  • There was no accountability for missed patches
  • The risk of not patching was not communicated to leadership
  • The board was not informed of cybersecurity risks

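Regulatory outcome: In July 2019 Equifax agreed to a global settlement with the FTC, the CFPB, and US states and territories worth at least $575 million and potentially up to $700 million, reported at the time as the largest data breach settlement in history. That figure includes $175 million to the states and $100 million in civil penalties to the CFPB, with the remainder funding consumer restitution. The related consumer class action settlement additionally committed Equifax to spend at least $1 billion on security improvements. The total is commonly cited as roughly $1.4 billion.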

Key regulatory findings from the FTC complaint:

  • Equifax failed to maintain an accurate inventory of their IT systems
  • Equifax failed to implement adequate patch management
  • Equifax failed to monitor network traffic for anomalous activity
  • Equifax failed to segment their network to limit access to sensitive data
  • Equifax failed to implement adequate access controls

Risk Management Maturity Model

| Level | Name | Characteristics |
| --- | --- | --- |
| 1 | Initial | Ad-hoc, reactive, no formal process |
| 2 | Repeatable | Basic risk register, qualitative assessment, some SLAs |
| 3 | Defined | Standardised risk methodology (NIST, FAIR), regular assessments |
| 4 | Managed | Quantitative FAIR analysis, risk appetite defined, metrics-driven |
| 5 | Optimised | Continuous risk monitoring, automated risk scoring, board-level reporting |

Key Takeaways

  • Risk = Threat × Vulnerability × Impact — all three must be present for risk to exist
  • Qualitative assessment (H/M/L heat maps) is faster but subjective; quantitative assessment (FAIR, ALE = SLE × ARO) is more precise but data-intensive. Use both.
  • Four risk treatment options: Avoid (eliminate the activity), Transfer (insurance, contracts), Mitigate (controls), Accept (monitor) — residual risk remains after controls
  • Third-party risk management is essential: many major breaches involve a vendor or the supply chain
  • Equifax is the definitive case study: every risk management process failed (asset inventory, scanning, patching, segmentation, encryption, monitoring)
  • Risk management maturity progresses from ad-hoc (Level 1) to continuous and board-reported (Level 5); many enterprises sit around Level 2-3