Everything you need to get started with the development environment and our core repositories.
| Tool | Purpose | Installation |
|---|---|---|
| Docker Desktop | Containers, local k8s | docker.com |
| mise | Tool version management | brew install mise |
| AWS CLI v2 | AWS SSO authentication | brew install awscli |
| GitHub CLI | Workflow triggers, auth | brew install gh |
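Before going further, it can help to confirm that the tools in the table are on your PATH. The following is a minimal sketch, not part of the repo: the function names `missing_tools` and `report_prerequisites` are hypothetical, and the binary names (`docker`, `mise`, `aws`, `gh`) are assumed from the install commands above.

```shell
# Print the name of every given command that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Binary names assumed from the table above: docker, mise, aws, gh.
report_prerequisites() {
  missing="$(missing_tools docker mise aws gh)"
  if [ -z "$missing" ]; then
    echo "all prerequisites found"
  else
    echo "missing: $(echo "$missing" | tr '\n' ' ')"
  fi
}
```

After sourcing the snippet, running `report_prerequisites` prints either a success line or the list of tools still to install.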
mise (pronounced "meez") is a polyglot tool version manager and task runner. It replaces tools like nvm, pyenv, rbenv, and make.
Tool versions and tasks live in mise.toml. Tasks such as mise run all and mise run test are defined per-project, and tools activate when you cd into the project.
# Install
brew install mise
# Add to shell (zsh)
echo 'eval "$(mise activate zsh)"' >> ~/.zshrc
source ~/.zshrc
# For bash
echo 'eval "$(mise activate bash)"' >> ~/.bashrc
source ~/.bashrc
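Running the `echo ... >>` commands above more than once appends a duplicate activation line each time. A possible idempotent variant is sketched below. The helper names (`mise_activation_line`, `append_once`, `activate_mise_in_rc`) are hypothetical, and the sketch assumes the rc file is `~/.zshrc` or `~/.bashrc` as in the commands above.

```shell
# Print the activation line for a given shell name (zsh or bash).
mise_activation_line() {
  printf 'eval "$(mise activate %s)"\n' "$1"
}

# Append a line to a file only if that exact line is not already present.
append_once() {
  file="$1"; line="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical wrapper: derive the rc file from the shell name unless one is given.
activate_mise_in_rc() {
  shell_name="$1"                          # "zsh" or "bash"
  rc_file="${2:-$HOME/.${shell_name}rc}"   # defaults to ~/.zshrc or ~/.bashrc
  append_once "$rc_file" "$(mise_activation_line "$shell_name")"
}
```

For example, `activate_mise_in_rc zsh` can be run repeatedly and leaves a single activation line in `~/.zshrc`.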
The first run of mise run all or mise run agent in the tenzai repo auto-configures AWS SSO. For manual setup:
aws configure sso
# SSO start URL: https://tenzai.awsapps.com/start/#
# SSO region: eu-north-1
# Default region: eu-central-1
# Output format: json
| Profile | Purpose |
|---|---|
| dev | Development/sandbox (default) |
| staging | Staging environment |
| prod | Production (use with caution) |
The easiest way to log in to AWS is through mise:
# This opens a browser for SSO login and decrypts secrets
cd ~/projects/tenzai
mise run secrets
Alternatively, use AWS CLI directly:
aws sso login --profile dev
Sessions expire after ~8 hours. If you get authentication errors, just run the login command again.
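Since sessions expire, one option is to check the session before starting work and log in only when needed. The sketch below assumes the `dev` profile from the table above. The function name `ensure_aws_session` is hypothetical. It uses `aws sts get-caller-identity` as a cheap credentials check and falls back to the `aws sso login` command shown above.

```shell
# Log in via SSO only if the current credentials for the profile are not usable.
ensure_aws_session() {
  profile="${1:-dev}"   # assumed default; pass staging or prod explicitly
  if aws sts get-caller-identity --profile "$profile" >/dev/null 2>&1; then
    echo "session for '$profile' is valid"
  else
    echo "session for '$profile' missing or expired; logging in"
    aws sso login --profile "$profile"
  fi
}
```

Calling `ensure_aws_session dev` at the start of the day avoids an unnecessary browser round trip when the session is still valid.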
| Repo | Purpose | Clone URL |
|---|---|---|
| tenzai | Main platform (API, UI, Agent) | git@github.com:TenzaiLtd/tenzai.git |
| evaluation | Benchmarking & evaluation | git@github.com:TenzaiLtd/evaluation.git |
| labs | Vulnerable applications | git@github.com:TenzaiLtd/labs.git |
cd ~/projects
git clone git@github.com:TenzaiLtd/tenzai.git
git clone git@github.com:TenzaiLtd/evaluation.git
git clone git@github.com:TenzaiLtd/labs.git
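The clone commands above fail if a repo is already checked out. A re-runnable variant might look like the following sketch. The function names `repo_url` and `clone_core_repos` are hypothetical, and the default destination `~/projects` is taken from the commands above.

```shell
# Build the SSH clone URL for a repo in the TenzaiLtd organization.
repo_url() {
  printf 'git@github.com:TenzaiLtd/%s.git\n' "$1"
}

# Clone each core repo into the destination directory unless it already exists.
clone_core_repos() {
  dest="${1:-$HOME/projects}"
  mkdir -p "$dest"
  for repo in tenzai evaluation labs; do
    if [ -d "$dest/$repo/.git" ]; then
      echo "skip $repo (already cloned)"
    else
      git clone "$(repo_url "$repo")" "$dest/$repo"
    fi
  done
}
```

Running `clone_core_repos` with no arguments clones whatever is missing under `~/projects` and skips the rest.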
Location: ~/projects/tenzai
cd ~/projects/tenzai
git checkout main && git pull
# Trust and install tools
mise trust
mise install
# Start full platform
mise run all
| Directory | Purpose |
|---|---|
| agent/ | AI security testing agent - the brain that performs automated pentesting. Contains master agents, sub-agents, and phase-based workflow logic |
| platform/ | FastAPI backend server - REST API, GraphQL, webhooks, and business logic |
| ui/ | Angular frontend application (pnpm/nx workspace) |
| Directory | Purpose |
|---|---|
| hackbox/ | Containerized security tools (nmap, sqlmap, ffuf, nuclei, etc.) exposed via HTTP API |
| browserbox/ | Playwright-based browser automation container for web interaction during security testing |
| proxybox/ | OWASP ZAP proxy container for intercepting and analyzing HTTP traffic |
| Directory | Purpose |
|---|---|
| k8s/ | Kubernetes Helm charts (tenzai-chart, otel-collector-chart) |
| infra/ | Terraform infrastructure code and SOPS-encrypted secrets for all environments |
| Directory | Purpose |
|---|---|
| common/ | Shared Python utilities, database models, and SDK used across services |
| cli/ | Command-line interface and TUI (terminal UI) for interacting with the platform |
| agent-job-watcher/ | Kopf-based K8s controller that watches agent job events and updates the platform |
| lambda/ | AWS Lambda functions for async operations |
| integration-tests/ | End-to-end integration test suite |
| tests/ | Unit tests for all Python components |
| scripts/ | Utility scripts for development and operations |
| File | Purpose |
|---|---|
| Tiltfile | Tilt configuration for local development orchestration |
| mise.toml | Tool versions and task definitions |
Tilt is a toolkit for local Kubernetes development. It watches your source code, automatically rebuilds containers, and updates your cluster in real-time. Think of it as "hot reload for Kubernetes."
Docs: https://docs.tilt.dev/
LocalStack emulates AWS services locally, so you don't need real AWS resources during development.
mise run all
This command, among other steps, syncs Python dependencies (uv sync) and sets up tenzai-local with a local registry.
| Service | URL |
|---|---|
| UI | http://localhost:4200 |
| API | http://localhost:8000 |
| API Docs | http://localhost:8000/docs |
| GraphQL | http://localhost:8000/graphql |
| LocalStack | http://localhost:4566 |
| PostgreSQL | localhost:5432 |
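Once the platform is running, a quick reachability check over the HTTP endpoints in the table can confirm that everything came up. This is a minimal sketch with hypothetical function names (`service_urls`, `check_services`). It only tests that each URL answers with some HTTP response, not that the service is fully healthy.

```shell
# HTTP endpoints from the table above (PostgreSQL is not HTTP, so it is left out).
service_urls() {
  cat <<'EOF'
http://localhost:4200
http://localhost:8000
http://localhost:8000/docs
http://localhost:8000/graphql
http://localhost:4566
EOF
}

# Report "up" if the URL returns any HTTP response within 3 seconds, else "down".
check_services() {
  service_urls | while read -r url; do
    if curl -s -o /dev/null --max-time 3 "$url"; then
      echo "up   $url"
    else
      echo "down $url"
    fi
  done
}
```

Running `check_services` prints one `up` or `down` line per endpoint. A `down` line usually means the service is still starting or the platform is not running.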
mise run all # Full platform via Tilt
mise run agent # Run agent standalone
mise run lint # Lint codebase
mise run test # Unit tests
mise run secrets # Edit encrypted secrets (SOPS)
mise run local-db:psql # Open psql shell
mise run local-db:reset # Reset DB + seed
mise run jwt # Get JWT for API testing
mise run k9s:local # k9s for local cluster
mise run clean # Complete cleanup
The tenzai repo uses GitHub Actions for continuous integration and deployment.
| Gate | Description |
|---|---|
| Lint | Ruff (Python) + ESLint (TypeScript) code style checks |
| Type Check | mypy (Python) + TypeScript compiler |
| Unit Tests | pytest for Python, Jest for TypeScript |
| Integration Tests | End-to-end tests against a test cluster |
| Build | Docker image builds for all services |
| Security Scan | Dependency vulnerability scanning |
When you open a PR, a preview environment is automatically deployed:
https://pr{number}.dev.tenzai.io/ (e.g., https://pr693.dev.tenzai.io/)
Location: ~/projects/evaluation
The evaluation repo is the benchmarking infrastructure that measures the performance of the Tenzai security agent. It answers: "How well does our agent find vulnerabilities compared to ground truth?"
The evaluation flow:
This allows us to track agent improvements over time and catch regressions.
cd ~/projects/evaluation
git checkout main && git pull
uv sync
evaluation/
├── .github/
│   ├── actions/      # Composite actions (deploy-lab, eval, run-tenzai)
│   └── workflows/    # GitHub Actions workflows
├── cli/              # Dojo CLI
├── dojo-sdk/         # Python SDK for deployments
├── dojo-web/         # Web UI (React)
├── evaluator/        # LLM-based evaluation engine
├── experiments/      # Benchmark orchestration
└── suite-files/      # Lab suite configurations (YAML)
Dojo is our lab deployment and management system for deploying vulnerable applications to test the agent.
# Install globally
uv tool install --from "git+https://github.com/TenzaiLtd/evaluation.git#subdirectory=cli" dojo --force
# Run
uvx dojo
Features: Browse labs, deploy with custom configs, select agent/LLM, view active deployments, destroy labs.
Web interface at https://tenzailtd.github.io/evaluation/ with GitHub OAuth, a visual lab browser, and real-time deployment tracking.
| Workflow | Purpose | Trigger |
|---|---|---|
| run-labs.yml | Manual lab deployments | workflow_dispatch |
| run-sanity.yml | Nightly exploit validation | Schedule |
| run-labs-nightly-tenzai.yml | Nightly benchmarks | Schedule |
| evaluator-ci.yml | Evaluator tests/lint | PRs |
LLM-based system that compares agent findings against ground truth:
cd evaluator
uv sync --dev
uv run pytest tests/
Location: ~/projects/labs
Contains the source code and infrastructure for all vulnerable applications used in evaluation: the targets that the Tenzai agent scans.
| Type | Description | Count |
|---|---|---|
| T-Bench (Tenzai Built) | Complex applications spanning realistic customer tech stacks | 26 apps |
| X-Bench (xBow) | Single-page apps with capture-the-flag style vulnerabilities | 104 labs |
| OSS Labs (Open Source) | Real vulnerable apps (Juice Shop, DVWP, OpenCart, Zabbix) | ~10 apps |
| CVE-Bench | Labs reproducing specific CVEs (not actively used yet) | Various |
cd ~/projects/labs
git checkout main && git pull
labs/
├── tbench/                          # T-Bench labs (app-001 ... app-023b)
├── xben-001-24/ ... xben-104-24/    # X-Bench labs
├── cve-bench/                       # CVE reproduction labs
├── dvwp/                            # Damn Vulnerable WordPress
├── juiceshop/                       # OWASP Juice Shop
├── opencart/                        # OpenCart e-commerce
└── zabbix/                          # Zabbix monitoring
cd labs/tbench/app-001
# Create env configuration
mkdir -p ~/.config/tbench
cat > ~/.config/tbench/.env << EOF
NAMESPACE_PREFIX=local
LOCAL_K8S_CONTEXT=docker-desktop
EOF
ln -s ~/.config/tbench/.env .env
# Deploy locally
just run
# Access at http://localhost:300XX
curl http://localhost:30010/api/health
# Stop
just stop
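A lab started with `just run` may take a moment before its health endpoint responds. The following sketch polls the endpoint until it answers successfully. The function name `wait_for_health` and its defaults (30 attempts, 2 seconds apart) are assumptions for illustration. The URL shown in the usage note is the app-001 example from the commands above.

```shell
# Poll a URL until it returns a successful HTTP status, or give up.
# Arguments: URL, max attempts (default 30), delay in seconds (default 2).
wait_for_health() {
  url="$1"; attempts="${2:-30}"; delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (4xx/5xx).
    if curl -fsS -o /dev/null --max-time 3 "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}
```

For example, `just run && wait_for_health http://localhost:30010/api/health` blocks until the lab is reachable or the attempts run out.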
# Authenticate
gcloud auth login
gcloud container clusters get-credentials lab-cluster --region=us-central1
# Deploy (commit changes first!)
just deploy
# Get endpoint info
just describe
# Destroy
just destroy
Labs on GKE use internal LoadBalancers. To access from your machine:
brew install zerotier-one
sudo zerotier-cli join <network-id> (get from 1Password web)
| Environment | Purpose | Deployment |
|---|---|---|
| dev | Development and testing | Auto-deploy from main |
| staging | Pre-production validation | Manual promotion |
| prod | Production | Manual promotion |
# 1. Login to AWS (opens browser)
cd ~/projects/tenzai
mise run secrets
# 2. Start tenzai platform
git pull
mise run all
# 3. Access
open http://localhost:4200 # UI
open http://localhost:8000/docs # API
| Task | Command |
|---|---|
| Start full platform | mise run all |
| Run agent standalone | mise run agent -U https://target.com |
| Deploy a lab | uvx dojo |
| Run evaluator tests | cd evaluator && uv run pytest tests/ |
| Local lab deployment | cd labs/tbench/app-XXX && just run |
| View K8s dashboard | mise run k9s:local |
| Get API JWT | mise run jwt |
/tenzai/docs/