Mastering Automated Adversary Emulation with MITRE Caldera

max seefeld

May 25, 2025 • 4 min read

Preview: In this article, I’ll guide you through deploying and mastering MITRE Caldera for automated adversary emulation—covering architecture, installation, advanced integrations, scalability considerations, and real-world best practices to sharpen your Red Team operations and validate defenses at enterprise scale.

By Maxwell Seefeld

1. The Case for Automated Emulation

Adversary emulation is the gold standard for validating your detection and response controls—but building and executing realistic attack chains by hand can consume days of effort. Caldera automates that grind:

Repeatable Playbooks: Define your end-to-end operation once, then run it weekly, monthly, or on demand.

Framework Alignment: Every step maps to an ATT&CK® technique, making results instantly consumable by blue teams.

Audit-Ready Output: Detailed logs and timing data feed straight into compliance reports and incident post-mortems.

When you need to prove that your SIEM, EDR, and network controls detect real attacker behavior—not just canned signatures—Caldera is the tool that scales your Red Team without scaling headcount.

2. Caldera’s Architecture Deep Dive

Before you click “Start Operation,” it pays to understand Caldera’s core components:

Server

A Python-based REST API and web UI.

Manages operations, agent inventory, and logging.

Agents

Lightweight implants deployed to targets via SSH, SMB, or custom delivery.

Beacon back to the server over HTTP(S) by default; other contact channels (for example TCP, UDP, or DNS) are also available.

Abilities

YAML-driven modules tagged by ATT&CK Technique ID (e.g., T1059.001 for PowerShell execution; T1059 is the parent technique, Command and Scripting Interpreter).

Encapsulate everything from process enumeration to credential dumping.

Adversary Profiles

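Pre-built collections of Abilities. The bundled profiles are mostly goal-oriented (discovery, collection, and similar); profiles modeling known threat actors (APT29, FIN7) typically come from the optional emu plugin.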

Easily forkable for custom campaigns.

This modular design lets you mix and match Abilities, trigger them manually or in sequence, and extend Caldera by adding new ability YAML files to the server's data directory or to a plugin's data directory.
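To make the relationship between these components concrete, here is a minimal Python sketch of how abilities compose into an adversary profile. It is a teaching model only, not Caldera's internal classes: the `Ability` and `Adversary` types, their fields, and the short IDs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ability:
    """One atomic action, tagged with the ATT&CK technique it exercises."""
    ability_id: str      # hypothetical short ID; real abilities use UUIDs
    name: str
    tactic: str
    technique_id: str


@dataclass
class Adversary:
    """An ordered collection of abilities (an adversary profile)."""
    name: str
    abilities: list[Ability] = field(default_factory=list)

    def techniques(self) -> list[str]:
        """Return the distinct technique IDs covered, in first-seen order."""
        seen: dict[str, None] = {}
        for ability in self.abilities:
            seen.setdefault(ability.technique_id, None)
        return list(seen)

    def fork(self, name: str, extra: list[Ability]) -> "Adversary":
        """Copy this profile under a new name and append extra abilities."""
        return Adversary(name=name, abilities=[*self.abilities, *extra])


whoami = Ability("a1", "Identify active user", "discovery", "T1033")
ps_list = Ability("a2", "List processes", "discovery", "T1057")
base = Adversary("Lab Discovery", [whoami, ps_list])

# Forking leaves the original profile untouched.
custom = base.fork("Lab Discovery + Creds", [
    Ability("a3", "Dump credentials", "credential-access", "T1003"),
])
print(custom.techniques())
```

Forking rather than editing in place mirrors the advice above: the stock profile stays reusable while the custom campaign grows.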

3. Step-by-Step Installation & Quickstart

Prerequisites

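Python 3.8+ (current 5.x releases require 3.9+ and NodeJS to build the web UI; check the README for the tag you install)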

Linux host (bare-metal, LXC, or VM on Proxmox)

Network reachability between the Caldera server and your targets

Install & Configure

# Clone repository
git clone https://github.com/mitre/caldera.git --recursive
cd caldera

# Install dependencies
pip3 install -r requirements.txt

# Copy default configuration
cp conf/default.yml conf/local.yml

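# Edit conf/local.yml:
#   * Change the default passwords listed under users (red and blue accounts)
#   * Replace api_key_red, api_key_blue, crypt_salt, and encryption_key
#   * No data store setting is needed: Caldera keeps its state on local disk
#     (ship results to Elasticsearch separately if you want dashboards)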

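Launch & Log In

python3 server.py

Run the server without --insecure so that it reads conf/local.yml. The --insecure flag makes it load conf/default.yml with its well-known default credentials, which is acceptable only in a throwaway lab. Recent 5.x releases also need --build on the first launch to compile the web UI.

Open http://<server>:8888 (Caldera serves plain HTTP on this port unless you put TLS in front of it), log in with the credentials you set, and confirm you see the default adversary profiles.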
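If you would rather confirm the server is up from a script than from a browser, the sketch below builds a request against the REST API. It assumes the v2 API exposes a `/api/v2/health` endpoint and accepts the API key from your configuration in a `KEY` header; that matches my understanding of recent releases, but verify both against the API documentation bundled with your server. The function names are my own.

```python
import json
import urllib.request


def build_health_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the health endpoint."""
    # Assumed endpoint path and header name; confirm for your Caldera version.
    url = base_url.rstrip("/") + "/api/v2/health"
    return urllib.request.Request(url, headers={"KEY": api_key}, method="GET")


def fetch_health(base_url: str, api_key: str, timeout: float = 5.0) -> dict:
    """Send the request and return the decoded JSON body."""
    request = build_health_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_health("http://localhost:8888", "<your API key>")` from the server host should return a JSON document if the service is running; a connection error usually means the port or address is wrong.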

4. Building Your First Operation

Select an Adversary

Choose a bundled profile such as “Discovery” or “Hunter,” or a community-contributed profile. Sandcat is Caldera's default agent, which you deploy in the next step.

Configure Agents

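Generate a deploy command on the Agents page (“Deploy an agent”), choosing the Sandcat agent and your target platform.

Save the generated command as a script (called sandcat-bash here) and run it on the target over SSH:

ssh user@target 'bash -s' < sandcat-bash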

Launch the Operation

In the UI, click Operations → New Operation.

Pick your adversary profile, the agent group to run against, and whether the operation runs immediately or starts paused.

Monitor Progress

Watch live logs for each Tactic and Technique.

Drill into ability output to verify success or troubleshoot issues.
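The same operation can be started without the UI. The sketch below prepares a `POST` to the v2 operations endpoint. The field names (`adversary.adversary_id`, `planner.id`, `group`, `state`, `auto_close`) reflect my understanding of the v2 schema and should be checked against your server's API documentation; the adversary and planner IDs are UUIDs you look up on your own server, so the values shown are placeholders.

```python
import json
import urllib.request


def operation_payload(name: str, adversary_id: str, planner_id: str,
                      group: str = "red") -> dict:
    """Assemble the JSON body for a new operation (assumed v2 schema)."""
    return {
        "name": name,
        "adversary": {"adversary_id": adversary_id},
        "planner": {"id": planner_id},
        "group": group,          # agent group to run against
        "state": "running",      # start immediately rather than paused
        "auto_close": True,      # finish once the planner runs out of steps
    }


def build_create_request(base_url: str, api_key: str,
                         payload: dict) -> urllib.request.Request:
    """Build (but do not send) the request that creates the operation."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v2/operations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


payload = operation_payload(
    "weekly-discovery",
    adversary_id="<adversary-uuid>",   # placeholder
    planner_id="<planner-uuid>",       # placeholder
)
request = build_create_request("http://localhost:8888", "<api-key>", payload)
print(request.get_method(), request.full_url)
```

Pass `request` to `urllib.request.urlopen` to actually create the operation. Keeping request construction separate from sending makes the script easy to dry-run in a pipeline.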

5. Integrating Caldera into Your Security Pipeline

Caldera shines when it’s part of a repeatable, automated process:

CI/CD Integration

Add a pipeline stage (Jenkins, GitLab CI) that triggers a Caldera operation against your staging environment after deploy.

SIEM & Alert Validation

Forward Caldera logs to Splunk or ELK.

Automatically verify that your detection rules fire on each ATT&CK technique.

Reporting Dashboards

Use your ELK stack or Grafana to plot success rates, dwell time, and coverage gaps over time.

By embedding Caldera into your development and security workflows, you shift left on adversary testing—catching misconfigurations and detection blind spots before code reaches production.
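The alert-validation step can be reduced to a small comparison that a pipeline stage runs after the operation finishes. The following is a sketch under stated assumptions: the record shapes (`technique_id`, `succeeded`) are hypothetical, and you would populate `executed` from Caldera's operation report and `alerts` from your SIEM's query API.

```python
def detection_gaps(executed: list[dict], alerts: list[dict]) -> list[str]:
    """Return technique IDs that ran successfully but raised no alert."""
    alerted = {alert["technique_id"] for alert in alerts}
    gaps: list[str] = []
    for step in executed:
        technique = step["technique_id"]
        # Failed steps are ignored: nothing happened for the SIEM to detect.
        if step["succeeded"] and technique not in alerted and technique not in gaps:
            gaps.append(technique)
    return gaps


# Hypothetical sample data for illustration.
executed = [
    {"technique_id": "T1033", "succeeded": True},
    {"technique_id": "T1057", "succeeded": True},
    {"technique_id": "T1003", "succeeded": False},
    {"technique_id": "T1082", "succeeded": True},
]
alerts = [{"technique_id": "T1033"}, {"technique_id": "T1082"}]

gaps = detection_gaps(executed, alerts)
exit_code = 1 if gaps else 0   # a CI stage would pass this to sys.exit()
print("undetected techniques:", gaps)
```

A non-zero exit code fails the pipeline stage, which turns a missing detection into the same kind of signal as a failing unit test.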

6. Scaling to Enterprise Environments

When you outgrow a single-host setup, consider:

Distributed Server Cluster

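Run multiple Caldera instances, for example one per network segment. Caldera has no built-in clustering, so instances do not share agents or operation state; if you place them behind a load balancer, pin each agent to a single instance.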

Containerization

Dockerize your Caldera server and agent builder, and orchestrate them with Kubernetes or Docker Swarm.

Agent Diversity

Deploy Linux, Windows, and macOS agents.

Leverage cloud-native execution (e.g., AWS Systems Manager Run Command) for ephemeral workloads.

Policy-Driven Scheduling

Automate regular “smoke test” operations weekly.

Trigger full kill-chain campaigns quarterly or after major infrastructure changes.

With this approach, you can emulate hundreds of techniques across thousands of hosts—and still review results in a single pane of glass.
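The scheduling policy above can be written down as a small decision function that a cron job or pipeline calls each day. This is only a sketch: the campaign names are hypothetical labels, and "quarterly" is interpreted here as the first Monday of each quarter, which is my assumption rather than anything Caldera prescribes.

```python
from datetime import date
from typing import Optional


def campaign_for(day: date, infra_changed: bool = False) -> Optional[str]:
    """Pick which campaign, if any, should run on the given day."""
    if infra_changed:
        # A major infrastructure change always triggers the full campaign.
        return "full-kill-chain"
    if day.weekday() != 0:  # 0 == Monday
        return None
    first_monday_of_quarter = day.month in (1, 4, 7, 10) and day.day <= 7
    return "full-kill-chain" if first_monday_of_quarter else "smoke-test"


for d in (date(2025, 1, 6), date(2025, 1, 13), date(2025, 1, 14)):
    print(d.isoformat(), campaign_for(d))
```

Keeping the policy in code, rather than in several separate cron entries, makes it reviewable and easy to test when the cadence changes.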

7. Customizing Abilities & Extending Caldera

If you need bespoke actions:

YAML Ability Template

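Copy an existing credential-access ability to data/abilities/credential-access/<your-ability-id>.yml. Stock abilities normally live under plugins/stockpile/data/abilities/<tactic>/, one YAML file per ability, named by ability ID.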

Define Fields

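- id: 00000000-0000-4000-8000-000000000001  # placeholder; generate your own UUID
  name: Custom Creds Dump
  tactic: credential-access
  technique:
    attack_id: T1003
    name: OS Credential Dumping
  platforms:
    windows:
      cmd:
        command: |
          mimikatz.exe "privilege::debug" "sekurlsa::logonpasswords" exit
        timeout: 300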

Reload & Test

Restart server.py so the new file is loaded.

Verify your new ability appears on the Abilities page, then add it to your custom adversary profile.

You’re no longer limited to community plugins—own every part of your emulation.

8. Measuring Success & Key Metrics

To demonstrate value to stakeholders, track:

Technique Coverage (% of ATT&CK techniques exercised)

Success Rate (abilities that completed vs. failed)

Average Dwell Time (time from agent deployment to exfiltration)

Detection Latency (time between ability execution and alert generation)

Visualize these in Grafana or Power BI by querying your Elasticsearch or TimescaleDB back end, and use them to drive continuous improvements in detection engineering.
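Three of these metrics can be computed directly from per-step records before they reach a dashboard. The sketch below assumes a hypothetical record shape (`technique_id`, `succeeded`, `executed_at`, `alerted_at`) that joins Caldera's step results with SIEM alert times, and it measures coverage against an in-scope technique list that you define.

```python
from datetime import datetime
from statistics import mean
from typing import Optional


def success_rate(steps: list[dict]) -> float:
    """Fraction of executed steps that completed successfully."""
    if not steps:
        return 0.0
    return sum(1 for s in steps if s["succeeded"]) / len(steps)


def technique_coverage(steps: list[dict], in_scope: set[str]) -> float:
    """Fraction of in-scope techniques that were exercised at least once."""
    if not in_scope:
        return 0.0
    exercised = {s["technique_id"] for s in steps} & in_scope
    return len(exercised) / len(in_scope)


def mean_detection_latency(steps: list[dict]) -> Optional[float]:
    """Average seconds from execution to alert, over steps that alerted."""
    latencies = [
        (s["alerted_at"] - s["executed_at"]).total_seconds()
        for s in steps
        if s["alerted_at"] is not None
    ]
    return mean(latencies) if latencies else None


def t(clock: str) -> datetime:
    """Helper: build a timestamp on a fixed sample day."""
    return datetime.fromisoformat(f"2025-05-25T{clock}")


# Hypothetical sample data for illustration.
steps = [
    {"technique_id": "T1033", "succeeded": True,
     "executed_at": t("12:00:00"), "alerted_at": t("12:00:30")},
    {"technique_id": "T1057", "succeeded": True,
     "executed_at": t("12:01:00"), "alerted_at": t("12:02:30")},
    {"technique_id": "T1003", "succeeded": False,
     "executed_at": t("12:02:00"), "alerted_at": None},
    {"technique_id": "T1082", "succeeded": True,
     "executed_at": t("12:03:00"), "alerted_at": None},
]
in_scope = {"T1033", "T1057", "T1003", "T1082",
            "T1016", "T1018", "T1021", "T1041"}

print(success_rate(steps))
print(technique_coverage(steps, in_scope))
print(mean_detection_latency(steps))
```

Measuring coverage against an explicit in-scope set, rather than all of ATT&CK, keeps the percentage meaningful for your environment.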

9. Common Pitfalls & Hardening Guidance

Unscoped Service Accounts: Create low-privilege accounts for Caldera agents—don’t use Domain Admin.

Network Security: Isolate your Caldera server in a management VLAN or behind a jump host.

Credential Rotation: Automate rotation of conf/local.yml passwords and API keys.

Version Pinning: Lock to a specific Caldera Git tag in production to avoid unexpected plugin updates.

Regularly clean up stale agents and log indices to prevent clutter and potential security risks.

10. Conclusion & Next Steps

MITRE Caldera transforms hours of manual Red Team labor into minutes of automated adversary emulation—at enterprise scale and with audit-ready metrics. By aligning every step to the ATT&CK framework, integrating into your CI/CD and SIEM pipelines, and customizing your own Abilities, you gain unmatched visibility into your true detection posture.

Action Items for This Week

Deploy two Caldera instances in your lab.

Run a simple “Discovery → Lateral Movement” operation against a controlled target.

Forward logs to your SIEM and validate at least five ATT&CK techniques.

Share your custom Abilities back with MITRE or your internal team repository.

Stay sharp, stay automated, and let Caldera do the heavy lifting for your Red Team exercises.

—Maxwell Seefeld