Tenzai Onboarding

Everything you need to get started with the development environment and our core repositories.


⚑ Prerequisites

Tool | Purpose | Installation
Docker Desktop | Containers, local k8s | docker.com
mise | Tool version management | brew install mise
AWS CLI v2 | AWS SSO authentication | brew install awscli
GitHub CLI | Workflow triggers, auth | brew install gh
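
To confirm the prerequisites are installed, you can look for their executables on PATH. This is a minimal sketch that assumes the standard command names (docker, mise, aws, gh); the helper names missing_tools and check_prerequisites are hypothetical and not part of any Tenzai repo.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Report anything from the prerequisites list that still needs installing.
# The command names below are assumptions based on the usual CLI names.
check_prerequisites() {
    missing=$(missing_tools docker mise aws gh)
    if [ -n "$missing" ]; then
        printf 'Missing tools:\n%s\n' "$missing" >&2
        return 1
    fi
    echo "All prerequisites found."
}
```

Run check_prerequisites in a new terminal after installing; a non-zero exit status means at least one tool is still missing.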

What is mise?

mise (pronounced "meez") is a polyglot tool version manager and task runner. It replaces tools like nvm, pyenv, rbenv, and make.

Why we use mise

Install mise

# Install
brew install mise

# Add to shell (zsh)
echo 'eval "$(mise activate zsh)"' >> ~/.zshrc
source ~/.zshrc

# For bash
echo 'eval "$(mise activate bash)"' >> ~/.bashrc
source ~/.bashrc
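
Running the echo ... >> commands above more than once appends a duplicate activation line each time. If that is a concern, an idempotent variant could look like the sketch below; append_once is a hypothetical helper name, not something mise provides.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (zsh):
#   append_once 'eval "$(mise activate zsh)"' "$HOME/.zshrc"
```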

☁️ AWS Setup

Initial SSO Configuration

The first run of mise run all or mise run agent in the tenzai repo auto-configures AWS SSO. For manual setup:

aws configure sso
# SSO start URL: https://tenzai.awsapps.com/start/#
# SSO region: eu-north-1
# Default region: eu-central-1
# Output format: json

AWS Start Page

Before diving into local development, verify you have AWS access:
  1. Open https://tenzai.awsapps.com/start/#
  2. Log in with your Tenzai credentials
  3. You should see available AWS accounts (dev, staging, prod)
  4. Try clicking on "dev" → "Management console" to verify access
  5. If you can't get access, contact your manager to request AWS permissions

Available Profiles

Profile | Purpose
dev | Development/sandbox (default)
staging | Staging environment
prod | Production (use with caution)

Daily Login

The easiest way to log in to AWS is through mise:

# This will open browser for SSO login and decrypt secrets
cd ~/projects/tenzai
mise run secrets

Alternatively, use AWS CLI directly:

aws sso login --profile dev

Sessions expire after ~8 hours. If you get authentication errors, just run the login command again.
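
One way to avoid guessing whether the session has expired is to ask AWS who you are and log in only if that call fails. This is a sketch, assuming the dev profile from the table above; ensure_aws_login is a hypothetical helper name.

```shell
# Log in via SSO only when the cached session for the profile is no longer valid.
# Defaults to the "dev" profile when no argument is given.
ensure_aws_login() {
    profile=${1:-dev}
    if aws sts get-caller-identity --profile "$profile" >/dev/null 2>&1; then
        echo "AWS session for '$profile' is still valid."
    else
        echo "AWS session for '$profile' is missing or expired; logging in."
        aws sso login --profile "$profile"
    fi
}
```

Usage: ensure_aws_login dev. Note that this covers only the AWS session; it does not decrypt secrets the way mise run secrets is described to.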


📦 Repository Overview

Repo | Purpose | Clone URL
tenzai | Main platform (API, UI, Agent) | git@github.com:TenzaiLtd/tenzai.git
evaluation | Benchmarking & evaluation | git@github.com:TenzaiLtd/evaluation.git
labs | Vulnerable applications | git@github.com:TenzaiLtd/labs.git

Clone All Repos

# Create the projects directory if it does not exist yet
mkdir -p ~/projects && cd ~/projects
git clone git@github.com:TenzaiLtd/tenzai.git
git clone git@github.com:TenzaiLtd/evaluation.git
git clone git@github.com:TenzaiLtd/labs.git
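
If you expect to re-run this step (for example on a machine where some repos are already cloned), a clone-or-update loop may be more convenient. This is a sketch only; repo_url and sync_repos are hypothetical helper names, and the default target directory assumes the ~/projects layout used throughout this page.

```shell
# Build the SSH clone URL for a repository in the TenzaiLtd GitHub organization.
repo_url() {
    printf 'git@github.com:TenzaiLtd/%s.git\n' "$1"
}

# Clone each repo into the base directory, or fast-forward it if it already exists.
sync_repos() {
    base=${1:-$HOME/projects}
    mkdir -p "$base"
    for name in tenzai evaluation labs; do
        if [ -d "$base/$name/.git" ]; then
            git -C "$base/$name" pull --ff-only
        else
            git clone "$(repo_url "$name")" "$base/$name"
        fi
    done
}
```

Usage: sync_repos (or sync_repos /some/other/dir). The --ff-only flag makes the pull fail instead of creating a merge commit if your local branch has diverged.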

🚀 Tenzai Platform

Location: ~/projects/tenzai

Get Started

cd ~/projects/tenzai
git checkout main && git pull

# Trust and install tools
mise trust
mise install

# Start full platform
mise run all

Directory Structure

Core Services (the main components)

Directory | Purpose
agent/ | AI security testing agent - the brain that performs automated pentesting. Contains master agents, sub-agents, and phase-based workflow logic
platform/ | FastAPI backend server - REST API, GraphQL, webhooks, and business logic
ui/ | Angular frontend application (pnpm/nx workspace)

Agent Toolboxes (containers used by the agent)

Directory | Purpose
hackbox/ | Containerized security tools (nmap, sqlmap, ffuf, nuclei, etc.) exposed via HTTP API
browserbox/ | Playwright-based browser automation container for web interaction during security testing
proxybox/ | OWASP ZAP proxy container for intercepting and analyzing HTTP traffic

Infrastructure

Directory | Purpose
k8s/ | Kubernetes Helm charts (tenzai-chart, otel-collector-chart)
infra/ | Terraform infrastructure code and SOPS-encrypted secrets for all environments

Supportive Components

Directory | Purpose
common/ | Shared Python utilities, database models, and SDK used across services
cli/ | Command-line interface and TUI (terminal UI) for interacting with the platform
agent-job-watcher/ | Kopf-based K8s controller that watches agent job events and updates the platform
lambda/ | AWS Lambda functions for async operations
integration-tests/ | End-to-end integration test suite
tests/ | Unit tests for all Python components
scripts/ | Utility scripts for development and operations

Config Files

File | Purpose
Tiltfile | Tilt configuration for local development orchestration
mise.toml | Tool versions and task definitions

Local Development with Tilt

What is Tilt?

Tilt is a toolkit for local Kubernetes development. It watches your source code, automatically rebuilds containers, and updates your cluster in real-time. Think of it as "hot reload for Kubernetes."

Key benefits

📚 Docs: https://docs.tilt.dev/

Architecture Diagram

┌────────────────────────────────────────────────────────────────┐
│                 Local Development Environment                  │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐  │
│  │      UI      │     │   Platform   │     │    Agent     │  │
│  │   Angular    │────▶│   FastAPI    │────▶│    Python    │  │
│  │    :4200     │     │    :8000     │     │              │  │
│  └──────────────┘     └──────┬───────┘     └──────┬───────┘  │
│                              │                    │            │
│                              ▼                    ▼            │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐  │
│  │  PostgreSQL  │     │  LocalStack  │     │   Hackbox    │  │
│  │    :5432     │     │    :4566     │     │  Sec Tools   │  │
│  └──────────────┘     └──────────────┘     └──────────────┘  │
│                              │                                 │
│                              ▼                                 │
│                       ┌──────────────┐                         │
│                       │  S3 / SQS /  │                         │
│                       │  SNS (mock)  │                         │
│                       └──────────────┘                         │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                k3d Cluster (tenzai-local)                │  │
│  │                 Registry: localhost:5001                 │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
└────────────────────────────────────────────────────────────────┘

What is LocalStack?

LocalStack emulates AWS services locally so you don't need real AWS resources during development. In the architecture diagram it provides the mocked S3, SQS, and SNS services.

Running the Platform

mise run all

This command:

  1. Syncs Python dependencies (uv sync)
  2. Decrypts SOPS secrets
  3. Creates k3d cluster tenzai-local with local registry
  4. Deploys all services via Helm chart
  5. Sets up LocalStack for AWS emulation
  6. Runs database migrations
  7. Starts the UI dev server

Local Endpoints

Service | URL
UI | http://localhost:4200
API | http://localhost:8000
API Docs | http://localhost:8000/docs
GraphQL | http://localhost:8000/graphql
LocalStack | http://localhost:4566
PostgreSQL | localhost:5432
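
Once mise run all has finished, a quick reachability check of the HTTP endpoints can save some browser clicking. The sketch below only tests that each port answers an HTTP request at all (any status code counts as "up"); endpoint_status and check_local_endpoints are hypothetical helper names. PostgreSQL is not an HTTP service, so it is left out; mise run local-db:psql is the documented way to reach it.

```shell
# Print "up" if an HTTP request to the URL gets any response, otherwise "down".
endpoint_status() {
    if curl -s -o /dev/null --max-time 5 "$1"; then
        echo "up"
    else
        echo "down"
    fi
}

# Check the HTTP endpoints from the Local Endpoints list.
check_local_endpoints() {
    for url in \
        http://localhost:4200 \
        http://localhost:8000/docs \
        http://localhost:8000/graphql \
        http://localhost:4566
    do
        printf '%-32s %s\n' "$url" "$(endpoint_status "$url")"
    done
}
```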

Common mise Tasks

mise run all           # Full platform via Tilt
mise run agent         # Run agent standalone
mise run lint          # Lint codebase
mise run test          # Unit tests
mise run secrets       # Edit encrypted secrets (SOPS)
mise run local-db:psql # Open psql shell
mise run local-db:reset # Reset DB + seed
mise run jwt           # Get JWT for API testing
mise run k9s:local     # k9s for local cluster
mise run clean         # Complete cleanup

CI/CD Pipeline

The tenzai repo uses GitHub Actions for continuous integration and deployment.

CI Gates (Pull Requests)

Gate | Description
Lint | Ruff (Python) + ESLint (TypeScript) code style checks
Type Check | mypy (Python) + TypeScript compiler
Unit Tests | pytest for Python, Jest for TypeScript
Integration Tests | End-to-end tests against a test cluster
Build | Docker image builds for all services
Security Scan | Dependency vulnerability scanning
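
Running the local lint and test tasks before pushing can catch failures earlier than CI does. The sketch below chains the mise run lint and mise run test tasks listed under Common mise Tasks; it is an assumption that these tasks cover the same checks as the Lint and Unit Tests gates, and preflight is a hypothetical helper name.

```shell
# Run the local lint and unit-test tasks, stopping at the first failure.
# Assumes the current directory is the tenzai repo.
preflight() {
    mise run lint && mise run test
}
```

Usage: preflight && git push.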

PR Environments

When you open a PR, a preview environment is automatically deployed.

Deployment Flow

PR Created → CI Checks → PR Environment → Review → Merge
↓
Deploy to Dev → Deploy to Staging → Deploy to Prod

📊 Evaluation System

Location: ~/projects/evaluation

What is the Evaluation System?

The evaluation repo is the benchmarking infrastructure that measures the performance of the Tenzai security agent. It answers: "How well does our agent find vulnerabilities compared to ground truth?"

The evaluation flow:

  1. Deploy a vulnerable lab (from the labs repo) to a GKE cluster
  2. Run the Tenzai agent against the deployed lab
  3. Compare agent findings against known vulnerabilities
  4. Score the results using LLM-based judges
  5. Report metrics to Slack, BigQuery, and dashboards

This allows us to track agent improvements over time and catch regressions.
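
To make step 3 of the flow concrete, here is a deliberately simplified sketch of comparing findings against ground truth. It assumes two plain-text files with one vulnerability identifier per line and uses exact string matching; the real evaluator scores with LLM-based judges, so this illustrates the idea rather than the actual implementation. The function names and file format are hypothetical.

```shell
# Count distinct lines in the findings file that also appear in the ground-truth file.
true_positives() (
    sort -u "$1" | grep -Fxc -f "$2" || true
)

# Recall as a whole-number percentage:
# matched ground-truth entries / all distinct ground-truth entries.
recall_pct() (
    tp=$(true_positives "$1" "$2")
    total=$(sort -u "$2" | wc -l | tr -d ' ')
    [ "$total" -gt 0 ] || { echo "ground truth is empty" >&2; exit 1; }
    echo $(( 100 * tp / total ))
)

# Example (hypothetical file names):
#   recall_pct findings.txt ground-truth.txt
```

Tracking a number like this per lab over time is what makes regressions visible between agent versions.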

Get Started

cd ~/projects/evaluation
git checkout main && git pull
uv sync

Directory Structure

evaluation/
├── .github/
│   ├── actions/       # Composite actions (deploy-lab, eval, run-tenzai)
│   └── workflows/     # GitHub Actions workflows
├── cli/               # Dojo CLI
├── dojo-sdk/          # Python SDK for deployments
├── dojo-web/          # Web UI (React)
├── evaluator/         # LLM-based evaluation engine
├── experiments/       # Benchmark orchestration
└── suite-files/       # Lab suite configurations (YAML)

Dojo

Dojo is our lab deployment and management system for deploying vulnerable applications to test the agent.

What Dojo Does

Dojo CLI

# Install globally
uv tool install --from "git+https://github.com/TenzaiLtd/evaluation.git#subdirectory=cli" dojo --force

# Run
uvx dojo

Features: Browse labs, deploy with custom configs, select agent/LLM, view active deployments, destroy labs.

Dojo Web UI

Web interface at https://tenzailtd.github.io/evaluation/ with GitHub OAuth, a visual lab browser, and real-time deployment tracking.

GitHub Actions Workflows

Workflow | Purpose | Trigger
run-labs.yml | Manual lab deployments | workflow_dispatch
run-sanity.yml | Nightly exploit validation | Schedule
run-labs-nightly-tenzai.yml | Nightly benchmarks | Schedule
evaluator-ci.yml | Evaluator tests/lint | PRs
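
Because run-labs.yml is triggered by workflow_dispatch, it can be started from the terminal with the GitHub CLI. This is a sketch, assuming the workflow lives in TenzaiLtd/evaluation and that you are authenticated with gh; the workflow's inputs are not documented here, so any -f key=value arguments you pass are your own to look up. The helper names are hypothetical.

```shell
# Trigger the manual lab deployment workflow. Extra arguments are passed through
# to gh (for example -f key=value inputs, if the workflow defines any).
trigger_lab_run() {
    gh workflow run run-labs.yml --repo TenzaiLtd/evaluation "$@"
}

# List the five most recent runs of that workflow.
recent_lab_runs() {
    gh run list --workflow run-labs.yml --repo TenzaiLtd/evaluation --limit 5
}
```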

Evaluator

An LLM-based system that compares agent findings against ground truth. To run its tests:

cd evaluator
uv sync --dev
uv run pytest tests/

🧪 Labs Repository

Location: ~/projects/labs

What is the Labs Repo?

Contains the source code and infrastructure for all vulnerable applications used in evaluation: the targets that the Tenzai agent scans.

Lab Types

Type | Description | Count
T-Bench (Tenzai Built) | Complex applications spanning realistic customer tech stacks | 26 apps
X-Bench (xBow) | Single-page apps with capture-the-flag style vulnerabilities | 104 labs
OSS Labs (Open Source) | Real vulnerable apps (Juice Shop, DVWP, OpenCart, Zabbix) | ~10 apps
CVE-Bench | Labs reproducing specific CVEs (not actively used yet) | Various

Get Started

cd ~/projects/labs
git checkout main && git pull

Directory Structure

labs/
├── tbench/                     # T-Bench labs (app-001 ... app-023b)
├── xben-001-24/ ... xben-104-24/  # X-Bench labs
├── cve-bench/                  # CVE reproduction labs
├── dvwp/                       # Damn Vulnerable WordPress
├── juiceshop/                  # OWASP Juice Shop
├── opencart/                   # OpenCart e-commerce
└── zabbix/                     # Zabbix monitoring

Local Development

cd labs/tbench/app-001

# Create env configuration
mkdir -p ~/.config/tbench
cat > ~/.config/tbench/.env << EOF
NAMESPACE_PREFIX=local
LOCAL_K8S_CONTEXT=docker-desktop
EOF
ln -s ~/.config/tbench/.env .env

# Deploy locally
just run

# Access at http://localhost:300XX
curl http://localhost:30010/api/health

# Stop
just stop
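
A lab may take a little while to come up after just run, so a single curl can fail even when the deployment is fine. A small polling loop helps; this is a sketch in which wait_for_health is a hypothetical helper and the URL is the app-001 health endpoint from the example above.

```shell
# Poll a URL until it returns a successful HTTP status, or give up after N attempts.
# Arguments: URL, attempts (default 30), delay in seconds between attempts (default 2).
wait_for_health() {
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if curl -fsS -o /dev/null "$url" 2>/dev/null; then
            echo "healthy after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "not healthy after $attempts attempt(s)" >&2
    return 1
}
```

Usage: just run && wait_for_health http://localhost:30010/api/health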

Remote Deployment (GKE)

# Authenticate
gcloud auth login
gcloud container clusters get-credentials lab-cluster --region=us-central1

# Deploy (commit changes first!)
just deploy

# Get endpoint info
just describe

# Destroy
just destroy
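
Since just deploy expects your changes to be committed first, a guard that refuses to deploy from a dirty working tree can prevent surprises. This is a sketch; require_clean_tree and safe_deploy are hypothetical helpers, and whether the commit also has to be pushed is not stated here, so this only checks for uncommitted local changes.

```shell
# Succeed only when the working tree has no uncommitted or untracked changes.
require_clean_tree() {
    if [ -n "$(git status --porcelain)" ]; then
        echo "Uncommitted changes found; commit them before deploying." >&2
        return 1
    fi
}

# Run the remote deploy only from a clean working tree.
safe_deploy() {
    require_clean_tree && just deploy
}
```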

ZeroTier Network Access

Labs on GKE use internal LoadBalancers. To access from your machine:

  1. Install ZeroTier: brew install zerotier-one
  2. Join network: sudo zerotier-cli join <network-id> (get from 1Password web)
  3. Approve your device in ZeroTier admin console
  4. Lab endpoints accessible via internal IPs

🌐 Environments

Environment Overview

Environment | Purpose | Deployment
dev | Development and testing | Auto-deploy from main
staging | Pre-production validation | Manual promotion
prod | Production | Manual promotion

🚀 Production Environment


⚑ Quick Reference

Daily Workflow

# 1. Login to AWS (opens browser)
cd ~/projects/tenzai
mise run secrets

# 2. Start tenzai platform
git pull
mise run all

# 3. Access
open http://localhost:4200  # UI
open http://localhost:8000/docs  # API

Common Commands

Task | Command
Start full platform | mise run all
Run agent standalone | mise run agent -U https://target.com
Deploy a lab | uvx dojo
Run evaluator tests | cd evaluator && uv run pytest tests/
Local lab deployment | cd labs/tbench/app-XXX && just run
View K8s dashboard | mise run k9s:local
Get API JWT | mise run jwt

Getting Help


Links & References

📦 Repositories

🔧 Tools & Dashboards

📚 Documentation