Identity Analytics
Identity Analytics applies data science and machine learning to identity data to detect anomalies, assess risk, and identify patterns that would be invisible to manual review. It transforms identity governance from a reactive compliance function into a proactive security capability.
The Data Sources of Identity Analytics
Identity analytics draws from multiple data sources to build a comprehensive view of identity behaviour:
| Data Source | What It Provides | Use Case |
|---|---|---|
| IGA platform | User entitlements, role assignments, access history | Entitlement creep detection, role optimisation |
| Authentication logs | Login timestamps, locations, devices, MFA status | Anomalous login detection |
| Application logs | Application access patterns, feature usage | Usage-based access optimisation |
| HR data | Department, manager, location, employment status | Entitlement baseline modelling |
| PAM logs | Privileged session activity, command history | Privileged behaviour analytics |
| SIEM / UEBA | Security events, threat intelligence | Correlated threat detection |
| Network logs | Network access patterns, VPN usage | Context-aware risk assessment |
Identity Analytics Use Cases
Entitlement Creep Detection
Analytics identifies users whose access has gradually expanded beyond what their role requires:
| Metric | How It’s Calculated | Alert Threshold |
|---|---|---|
| Entitlement count vs. peer median | User’s entitlement count compared to peers in same role/department | > 2x peer median |
| Entitlement growth rate | % change in entitlement count over 90 days | > 30% increase |
| Unique system count | Number of distinct systems accessed | > 3x peer median |
| Out-of-role entitlements | Entitlements not typically held by users in the same role | Any occurrence |
| Dormant entitlements | Entitlements not used in > 90 days | > 20% of user’s entitlements |
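
As a rough illustration, three of the thresholds in the table (peer median, 90-day growth, and dormant share) can be checked with a few lines of Python. This is a minimal sketch, not a product implementation: the function name, its parameters, and the sample numbers are hypothetical, and in practice the counts would come from the IGA platform.

```python
from statistics import median

def entitlement_creep_flags(current, count_90_days_ago, peer_counts, dormant_count):
    """Return the names of the creep metrics that exceed the table's thresholds.

    current            -- the user's entitlement count today
    count_90_days_ago  -- the user's entitlement count 90 days earlier
    peer_counts        -- entitlement counts of peers in the same role/department
    dormant_count      -- entitlements the user has not used in more than 90 days
    """
    flags = []
    # Entitlement count vs. peer median: alert above 2x the peer median.
    if current > 2 * median(peer_counts):
        flags.append("count_vs_peer_median")
    # Entitlement growth rate: alert above a 30% increase over 90 days.
    if count_90_days_ago > 0:
        growth = (current - count_90_days_ago) / count_90_days_ago
        if growth > 0.30:
            flags.append("growth_rate")
    # Dormant entitlements: alert when more than 20% are unused.
    if current > 0 and dormant_count / current > 0.20:
        flags.append("dormant_entitlements")
    return flags

# Hypothetical sample data for one user and five peers.
peers = [18, 20, 22, 19, 21]
flags = entitlement_creep_flags(current=45, count_90_days_ago=30,
                                peer_counts=peers, dormant_count=12)
print(flags)
```

The remaining two metrics (unique system count and out-of-role entitlements) follow the same pattern but need per-system and per-role data that this sketch does not model.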
Peer Group Analysis
Users are compared against their peer group — other users in the same role, department, or location:
    from statistics import median, pstdev

    # Simplified peer group analysis logic.
    # get_user_entitlements, find_peer_group, and flag_anomaly are placeholders
    # for calls into the IGA platform.
    user_entitlements = get_user_entitlements(user_id)
    peer_group = find_peer_group(user_id, dimension="role")

    # Calculate peer baseline
    peer_counts = [len(get_user_entitlements(p)) for p in peer_group]
    peer_median = median(peer_counts)
    peer_std = pstdev(peer_counts)

    # Flag outliers
    if len(user_entitlements) > peer_median + (3 * peer_std):
        flag_anomaly(user_id, "entitlement_count_outlier")

Anomalous Authentication Detection
| Anomaly | Detection Method | Risk Level |
|---|---|---|
| Impossible travel | Logins from geographically distant locations within a timeframe too short to travel between them | Critical |
| Off-hours access | Login outside user’s normal working hours | Medium |
| New device | First-time access from unknown device | Medium |
| Failed login spike | > 5 failed logins within 5 minutes | High |
| New location | Login from never-before-seen geographic location | Medium |
| Credential stuffing pattern | Multiple rapid login attempts with different usernames from same IP | High |
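
The impossible-travel check from the table can be sketched as a speed test between two consecutive logins. The example below is illustrative only: the 900 km/h speed ceiling and the 100 km minimum distance (to absorb geo-IP inaccuracy) are assumptions, as are the function names and the login tuple layout.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_SPEED_KMH = 900.0  # assumption: roughly airliner cruising speed
MIN_DISTANCE_KM = 100.0          # assumption: ignore small geo-IP discrepancies

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def is_impossible_travel(login_a, login_b,
                         max_speed_kmh=MAX_PLAUSIBLE_SPEED_KMH,
                         min_distance_km=MIN_DISTANCE_KM):
    """Each login is a (timestamp, latitude, longitude) tuple."""
    (t1, lat1, lon1), (t2, lat2, lon2) = login_a, login_b
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if distance < min_distance_km:
        return False
    hours = abs((t2 - t1).total_seconds()) / 3600
    if hours == 0:
        return True  # distant locations at the same instant
    return distance / hours > max_speed_kmh

# Hypothetical logins: London, then New York one hour later.
london = (datetime(2024, 5, 1, 9, 0), 51.5074, -0.1278)
new_york = (datetime(2024, 5, 1, 10, 0), 40.7128, -74.0060)
print(is_impossible_travel(london, new_york))
```

A real deployment would also need to account for VPN egress points and shared corporate proxies, which can make legitimate logins look geographically distant.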
Privilege Creep Detection
Privilege creep occurs when users accumulate privileges over time beyond what their current role requires:
Timeline of privilege escalation:
- Month 1: Joined as Junior Developer (base role, read-only)
- Month 6: Added to project team (project role, write)
- Month 12: Promoted to Developer (elevated role, modify)
- Month 18: Given temporary admin access (admin, 30-day, but not revoked)
- Month 24: Added to on-call rotation (admin access made permanent)
- Month 30: Domain admin for "emergency" (never revoked)

Result: Junior Developer → Effective Domain Admin in 2.5 years

Analytics detection: The system would flag this user at Month 24 when the temporary admin access was not revoked, and escalate at Month 30 when domain admin was granted without proper approval.
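
One simple signal behind this kind of detection is a temporary grant that has outlived its approved duration. The sketch below assumes a hypothetical `Grant` record with a grant date, an optional duration, and an optional revocation date; real IGA platforms expose this data in their own schemas.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Grant:
    """Hypothetical access-grant record; field names are illustrative."""
    user_id: str
    entitlement: str
    granted_on: date
    duration_days: Optional[int] = None  # None means a permanent grant
    revoked_on: Optional[date] = None    # None means still active

def overdue_temporary_grants(grants, today):
    """Return (grant, days_overdue) for active temporary grants past expiry."""
    overdue = []
    for grant in grants:
        if grant.duration_days is None or grant.revoked_on is not None:
            continue  # permanent or already revoked: nothing to flag
        expires_on = grant.granted_on + timedelta(days=grant.duration_days)
        if today > expires_on:
            overdue.append((grant, (today - expires_on).days))
    return overdue

# Hypothetical grants for one developer.
grants = [
    Grant("dev1", "project-write", date(2023, 6, 1)),
    Grant("dev1", "admin", date(2024, 1, 1), duration_days=30),
    Grant("dev1", "db-admin", date(2024, 1, 1), duration_days=30,
          revoked_on=date(2024, 1, 20)),
]
overdue = overdue_temporary_grants(grants, today=date(2024, 3, 1))
for grant, days in overdue:
    print(grant.entitlement, "overdue by", days, "days")
```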
Identity Risk Scoring
A composite risk score combines multiple risk factors into a single metric:
| Risk Factor | Weight | Example |
|---|---|---|
| Privilege level | 30% | High (admin), Medium (power user), Low (standard) |
| Entitlement count | 20% | Percentile vs. peer group |
| SoD violations | 20% | Count of unresolved violations |
| Access recency | 15% | Time since last access review |
| Authentication risk | 10% | Failed logins, MFA presence |
| Data sensitivity | 5% | Classification of data accessible |
Example score calculation:
Risk Score = (0.30 × privilege_level_score) + (0.20 × entitlement_score) + (0.20 × sod_score) + (0.15 × recency_score) + (0.10 × auth_score) + (0.05 × data_sensitivity_score)
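
The weighted sum above can be expressed directly in code. This is a minimal sketch using the weights from the table; it assumes each factor has already been normalised to a 0-100 sub-score, and the dictionary keys and sample values are hypothetical.

```python
# Weights taken from the risk factor table; they sum to 1.0.
WEIGHTS = {
    "privilege_level": 0.30,
    "entitlement": 0.20,
    "sod": 0.20,
    "recency": 0.15,
    "auth": 0.10,
    "data_sensitivity": 0.05,
}

def risk_score(factor_scores):
    """Combine per-factor sub-scores (each 0-100) into a composite 0-100 score."""
    missing = WEIGHTS.keys() - factor_scores.keys()
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)

# Hypothetical sub-scores for one user.
sample_user = {
    "privilege_level": 90,
    "entitlement": 70,
    "sod": 40,
    "recency": 30,
    "auth": 10,
    "data_sensitivity": 60,
}
score = risk_score(sample_user)
print(round(score, 1))
```

Keeping the weights in one dictionary makes it easy to verify that they sum to 1.0 and to adjust them during tuning.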
User A: 0.30(100) + 0.20(80) + 0.20(60) + 0.15(50) + 0.10(20) + 0.05(40) = 69.5/100 (HIGH)
User B: 0.30(20) + 0.20(30) + 0.20(10) + 0.15(20) + 0.10(10) + 0.05(20) = 19/100 (LOW)
User and Entity Behaviour Analytics (UEBA)
UEBA extends identity analytics to detect behavioural anomalies that may indicate compromised accounts or insider threats:
| Behavioural Signal | UEBA Detection | Potential Threat |
|---|---|---|
| Unusual data access volume | Bulk download of documents outside normal pattern | Data exfiltration |
| Unusual login time | First-time login at 3 AM | Compromised account |
| Unusual application sequence | Accessing systems in an order never seen before | Lateral movement |
| Unusual peer interaction | Accessing files from a department the user has never interacted with | Privilege escalation |
| Unusual credential use | Same credential used from multiple IPs simultaneously | Credential sharing or theft |
| Unusual command pattern | Running commands never used before in privileged session | Attacker activity |
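
As one example of how such a signal might be computed, the sketch below flags unusual data access volume by comparing today's count against the user's own history with a z-score. The technique choice (mean and sample standard deviation), the threshold of 3.0, and the function name are assumptions for illustration; commercial UEBA products use their own, typically more sophisticated, models.

```python
from statistics import mean, stdev

def is_unusual_volume(history, today, z_threshold=3.0):
    """Flag today's access count if it sits far above the user's own baseline.

    history -- past daily counts of documents accessed by this user
    today   -- today's count
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return today > baseline  # perfectly flat history: any increase stands out
    return (today - baseline) / spread > z_threshold

# Hypothetical history: the user normally opens about 11 documents a day.
history = [10, 12, 11, 9, 13, 10, 12]
print(is_unusual_volume(history, today=250))
print(is_unusual_volume(history, today=12))
```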
Building an Identity Analytics Program
Establish Baseline
Collect 3-6 months of identity and access data to establish normal behaviour patterns. Define peer groups (role, department, location). Calculate baseline metrics for each peer group.
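A baseline of this kind might be computed by grouping users along a peer dimension and summarising each group. The sketch below is illustrative: the user records are plain dictionaries with hypothetical `role` and `entitlement_count` fields, standing in for data exported from the IGA platform and HR system.

```python
from collections import defaultdict
from statistics import median, pstdev

def build_baselines(users, dimension="role"):
    """Group users by a peer dimension and compute per-group baseline metrics."""
    groups = defaultdict(list)
    for user in users:
        groups[user[dimension]].append(user["entitlement_count"])
    return {
        group: {
            "size": len(counts),
            "median": median(counts),
            "std_dev": pstdev(counts),
        }
        for group, counts in groups.items()
    }

# Hypothetical user records.
users = [
    {"id": "u1", "role": "developer", "entitlement_count": 18},
    {"id": "u2", "role": "developer", "entitlement_count": 20},
    {"id": "u3", "role": "developer", "entitlement_count": 22},
    {"id": "u4", "role": "analyst", "entitlement_count": 8},
    {"id": "u5", "role": "analyst", "entitlement_count": 10},
]
baselines = build_baselines(users, dimension="role")
print(baselines["developer"])
```

Passing `dimension="department"` or `dimension="location"` would produce the other peer groupings, provided the records carry those fields.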
Define Anomaly Thresholds
Set thresholds for each analytics use case. Calibrate them to balance false positives against false negatives: a threshold set too low floods analysts with alerts (alert fatigue), while one set too high misses real incidents.
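The trade-off can be made concrete by replaying past events at several candidate thresholds and counting the errors each one produces. This sketch assumes a labelled history of `(score, was_real_incident)` pairs, for example from previously triaged alerts; the data and function name are hypothetical.

```python
def evaluate_threshold(scored_events, threshold):
    """Count alert outcomes if every event scoring above `threshold` alerts.

    scored_events -- iterable of (anomaly_score, was_real_incident) pairs
    """
    true_pos = false_pos = false_neg = 0
    for score, is_incident in scored_events:
        alerted = score > threshold
        if alerted and is_incident:
            true_pos += 1
        elif alerted:
            false_pos += 1   # alert raised, nothing wrong: analyst time wasted
        elif is_incident:
            false_neg += 1   # real incident, no alert: missed detection
    return {
        "threshold": threshold,
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
    }

# Hypothetical labelled history of anomaly scores.
labelled = [(0.9, True), (0.8, False), (0.7, True),
            (0.4, False), (0.3, False), (0.2, True)]
for candidate in (0.1, 0.75, 0.95):
    print(evaluate_threshold(labelled, candidate))
```

In this sample, the lowest threshold catches every incident but raises three false alerts, while the highest raises none and misses all three incidents.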
Implement Detection
Deploy an analytics engine (a SIEM UEBA module, a dedicated identity analytics platform, or custom ML models) and connect it to the data sources: IGA, authentication logs, application logs, and the HR system.
Establish Response Playbooks
For each alert type, define a response playbook: Who is notified? What investigation is performed? What remediation actions are taken? What is the escalation path?
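One way to keep playbooks consistent is to store them as structured data keyed by alert type, so that each of the four questions has a required answer. The sketch below is purely illustrative: the `Playbook` fields mirror the questions above, and the team names and steps are hypothetical placeholders, not recommended procedures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Playbook:
    notify: tuple         # who is notified
    investigation: tuple  # what investigation is performed
    remediation: tuple    # what remediation actions are taken
    escalation: str       # where unresolved alerts escalate

# Hypothetical playbook registry keyed by alert type.
PLAYBOOKS = {
    "impossible_travel": Playbook(
        notify=("security-operations", "user's manager"),
        investigation=("Confirm recent travel or VPN use with the user",
                       "Review session activity after the suspicious login"),
        remediation=("Revoke active sessions", "Force a password reset"),
        escalation="incident-response lead",
    ),
    "entitlement_count_outlier": Playbook(
        notify=("identity-governance",),
        investigation=("Compare entitlements against the peer group",),
        remediation=("Trigger an out-of-cycle access review",),
        escalation="application owner",
    ),
}

def playbook_for(alert_type):
    """Look up the playbook for an alert type, failing loudly if none exists."""
    try:
        return PLAYBOOKS[alert_type]
    except KeyError:
        raise LookupError(f"No playbook defined for alert type {alert_type!r}") from None

print(playbook_for("impossible_travel").escalation)
```

Failing loudly on an unknown alert type helps surface detections that were deployed without a defined response.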
Continuous Tuning
Review alert effectiveness regularly. Tune thresholds based on false positive/negative rates. Add new data sources as they become available. Update peer group definitions as the organisation changes.
Identity Analytics Tools and Platforms
| Tool/Platform | Type | Key Capabilities |
|---|---|---|
| SailPoint Predictive Identity | IGA + Analytics | Risk scoring, peer group analysis, certification recommendations |
| Saviynt | IGA + Analytics | SoD analytics, user risk scoring, access certification |
| Microsoft Entra ID Protection (formerly Azure AD Identity Protection) | Cloud | Risky sign-in detection, compromised user detection, MFA prompts |
| Splunk UEBA | SIEM + UEBA | Behavioural analytics, peer group analysis, threat detection |
| Varonis | Data Security Analytics | File access monitoring, data classification, anomaly detection |
| Rapid7 InsightIDR | SIEM + UEBA | User behaviour analytics, attacker behaviour detection |
Key Takeaways
- Identity Analytics applies data science and ML to identity data to detect anomalies, assess risk, and identify patterns invisible to manual review — transforming governance from reactive to proactive
- Key use cases include entitlement creep detection, peer group analysis, anomalous authentication, privilege creep detection, and identity risk scoring
- Peer group analysis compares each user against similar users (same role, department) — users with significantly more entitlements than their peers are flagged for review
- Identity risk scoring combines privilege level, entitlement count, SoD violations, access recency, authentication risk, and data sensitivity into a composite score
- UEBA extends analytics to behavioural signals (unusual data access, login times, application sequences) that may indicate compromised accounts or insider threats
- Building an identity analytics program requires: baseline establishment → threshold definition → detection implementation → response playbooks → continuous tuning