CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

DoubleZero is a decentralized high-performance network built on Solana. Contributors register network devices and links onchain; clients (e.g., Solana validators) connect via GRE tunnels and receive optimized routes via BGP. The codebase is a hybrid Rust/Go monorepo with TypeScript and Python SDKs.

Architecture (key components)

| Component | Language | Purpose |
|---|---|---|
| smartcontract/ | Rust | Solana programs (serviceability, telemetry, revenue distribution) + CLI |
| activator/ | Rust | Offchain validator: approves/rejects entities, allocates resources (IPs, tunnel IDs) |
| client/ | Rust (doublezero CLI) + Go (doublezerod daemon) | End-user connectivity: GRE tunnels, BGP sessions |
| controlplane/ | Go | Controller pushes configs to devices; agent runs on Arista EOS; funder, monitor, admin tools |
| telemetry/ | Go | Flow ingestion (NetFlow/IPFIX), gNMI writer, global monitor → ClickHouse/InfluxDB |
| api/ | Go | API server |
| sdk/ | Go, Python, TypeScript | Read-only account deserialization for serviceability, telemetry, revenue distribution |
| e2e/ | Go | End-to-end tests using testcontainers-go with Arista cEOS devices |

Onchain state lifecycle: Pending → Activated/Rejected → Suspended/Deleting. The activator drives all transitions.

Serialization: Borsh for onchain accounts, Protobuf for gRPC (controller↔agent), JSON for APIs.

Build Commands

make build          # Build all (Rust + Go)
make lint           # Lint all
make fmt            # Format all
make test           # Test all
make ci             # build + lint + test

Rust

make rust-build     # Build workspace + onchain programs (cargo build-sbf)
make rust-lint      # rustfmt check + clippy (-Dclippy::all -Dwarnings)
make rust-fmt       # Format with nightly rustfmt (imports_granularity=Crate)
make rust-test      # Workspace tests + program tests + account compat checks

Run a single Rust test:

cargo test -p <crate-name> <test_name>

Always run make rust-fmt before committing Rust changes.

Rust toolchain: 1.90.0 (via rust-toolchain.toml). Solana SDK: 2.2.x.

Go

make go-build       # Build all Go packages (tags: qa, e2e)
make go-lint        # golangci-lint
make go-fmt         # go fmt + rewrite interface{} → any
make go-test        # Tests (requires sudo for libpcap), container tests, fault tests

Run a single Go test:

go test -run TestFunctionName -v ./path/to/package/...

Note: controlplane/s3-uploader/ is a separate Go module — build/lint/test targets handle it automatically, but manual commands need cd controlplane/s3-uploader.

Go version: 1.25.0.

SDKs

TypeScript (uses bun, not npm):

cd sdk/<module>/typescript && bun install && bun tsc --noEmit && bun test

Python (uses uv):

cd sdk/<module>/python && uv run pytest

All SDKs: make sdk-test

Fixture generation: SDKs test against binary fixtures generated by Rust:

make generate-fixtures   # Regenerate .bin/.json fixtures from Rust

Test Failures

  • Main is expected to be green. If a test fails on your branch, your changes most likely broke it — investigate and fix it rather than assuming the failure is pre-existing.

RFCs and Documentation

  • When asked if a doc is up to date, evaluate it against its intended purpose and scope — not against whatever was most recently worked on. Implementation bug fixes and edge-case handling are not design decisions. Don't inflate docs with implementation details just because they're fresh in context.

Style & Terminology

  • Use "onchain" (one word, no hyphen), never "on-chain"
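The terminology rule can be checked mechanically. The helper below is a minimal sketch, not part of the repository; the function name `find_hyphenated_onchain` is hypothetical. It prints every line in the given files that uses the hyphenated spelling, in any letter case.

```shell
# Print each line of the given files that spells "onchain" with a hyphen.
# Follows grep's exit status: 0 if an offending line was found, 1 if the files are clean.
# The function name is illustrative only; it does not exist in the repo.
find_hyphenated_onchain() {
    grep -niE 'on-chain' "$@"
}
```

For example, `find_hyphenated_onchain CLAUDE.md` lists matches as `line:text`; with several files the output is prefixed with the file name.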

Git Commits

  • Do not add "Co-Authored-By" lines to commit messages
  • Use the format component: short description (e.g., lake/indexer: fix flaky staging test, telemetry: use CLICKHOUSE_PASS env var)
  • Keep the description lowercase (except proper nouns) and concise
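As a rough illustration of the commit rules above, the sketch below checks a subject line against the `component: short description` shape and detects a Co-Authored-By trailer. Both function names are hypothetical, and the set of characters allowed in a component is an assumption drawn from the two examples given. Capitalization of the description is not checked, since proper nouns are allowed.

```shell
# Return 0 if the subject looks like "component: short description".
# Assumption: a component is lowercase letters, digits, "/", "_", "." or "-".
check_commit_subject() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9/_.-]*: [^ ].*$'
}

# Return 0 if the commit message on stdin contains a Co-Authored-By trailer.
has_coauthor_trailer() {
    LC_ALL=C grep -qiE '^co-authored-by:'
}
```

Possible usage: `check_commit_subject "$(git log -1 --format=%s)"` for the latest subject, and `git log -1 --format=%B | has_coauthor_trailer` for the full message.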

Pull Requests

  • Use the /pr-text skill to generate PR descriptions, then use gh pr create
  • Do not include "Generated with Claude Code" or similar footers
  • PR title format: component: short description (same as commit messages)
  • Summary bullets should be concise, ordered by importance/significance
  • Focus on "what" and "why", not implementation details
  • Include a "Testing Verification" section
  • Don't mention table-stakes items in testing verification (e.g., "compiles cleanly", "builds successfully", "no lint errors"). Only include meaningful verification like specific test scenarios, behavioral observations, or edge cases validated.
  • Limit the size of each PR to about 500 lines of new code if possible (not including tests). If not possible, make the operator aware that the size of the PR exceeds doublezero's best practice, and recommend breaking the work apart into multiple PRs.
  • If the change is related to an RFC in the rfcs/ directory, provide the reviewer with a link to the RFC.
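To estimate a PR against the roughly 500-line guideline, one option is to sum the added lines reported by `git diff --numstat` while skipping test files. This is a sketch under assumptions: the function name `count_new_lines` is hypothetical, and the test-file patterns are guesses rather than the project's definition of a test.

```shell
# Sum added lines from `git diff --numstat` output on stdin, skipping test files.
# Each numstat line is "<added>\t<deleted>\t<path>"; binary files show "-" for the counts.
# The path patterns treated as tests are assumptions and may need adjusting.
count_new_lines() {
    awk -F '\t' '
        $1 == "-" { next }                       # binary file, no line counts
        $3 ~ /_test\.go$/ { next }               # Go tests
        $3 ~ /(^|\/)tests?\// { next }           # test/ or tests/ directories
        $3 ~ /\.test\.ts$/ { next }              # TypeScript tests
        $3 ~ /(^|\/)test_[^\/]*\.py$/ { next }   # Python tests
        { total += $1 }
        END { print total + 0 }
    '
}
```

Possible usage, assuming the base branch is `main`: `git diff --numstat main...HEAD | count_new_lines`. A result well above 500 is a cue to tell the operator and suggest splitting the work.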

Local Devnet / E2E Environment

The local devnet runs in Docker containers with the naming convention dz-local-*.

Container Types

  • Devices: dz-local-device-dz1, dz-local-device-dz2 - Arista cEOS containers
  • Clients: dz-local-client-{pubkey} - Client containers running doublezerod
  • Manager: dz-local-manager - Runs the doublezero CLI for admin operations
  • Controller: dz-local-controller - Pushes configs to devices
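The naming convention makes it easy to tell container roles apart in scripts. The helper below is a small sketch; `container_kind` is a hypothetical name, and the `other` bucket stands in for any `dz-local-*` container not listed above.

```shell
# Classify a devnet container by name, following the list above.
# Prints one of: device, client, manager, controller, other, unknown.
container_kind() {
    case "$1" in
        dz-local-device-*)   echo device ;;
        dz-local-client-*)   echo client ;;
        dz-local-manager)    echo manager ;;
        dz-local-controller) echo controller ;;
        dz-local-*)          echo other ;;     # some other devnet service
        *)                   echo unknown ;;   # not a devnet container
    esac
}
```

It can be combined with `docker ps -a --format '{{.Names}}'`, feeding each name to `container_kind` to get a quick inventory of the devnet.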

Arista Device Interaction

# Basic CLI command
docker exec dz-local-device-dz1 Cli -c "show ip bgp summary"

# Privileged mode (needed for show running-config, configure, etc.)
docker exec dz-local-device-dz1 Cli -p 15 -c "show running-config section Tunnel500"

# Multi-line config via heredoc
docker exec dz-local-device-dz1 bash -c 'Cli -p 15 << EOF
configure
daemon doublezero-agent
shutdown
no shutdown
end
EOF
'

Useful Device Commands

  • show ip bgp summary / show ip bgp summary vrf vrf1 - BGP neighbor status
  • show interfaces Tunnel500 - Tunnel interface status
  • show ip pim neighbor - PIM neighbors
  • show ip mroute - Multicast routes
  • show vrf - VRF info and interface assignments

Agent Logs on Devices

  • Logs are in /var/log/agents/ with symlinks in /var/log/agents-latest/
  • doublezero-agent log: /var/log/agents-latest/doublezero-agent
  • Launcher log shows daemon start/stop events: /var/log/agents/Launcher-*

Common Device Issues

  1. Config commits failing with "internal error": Check /var/tmp disk space - core dumps can fill it up

    docker exec dz-local-device-dz1 df -h /var/tmp
    docker exec dz-local-device-dz1 bash -c 'rm -f /var/tmp/agents/core.*'
  2. Tunnel config in running-config but interface doesn't exist: The Tunnel agent may not have created the kernel interface. Restart the container:

    docker restart dz-local-device-dz2
  3. doublezero-agent not applying configs: Check whether the agent is running, then check its log for errors

    docker exec dz-local-device-dz1 ps aux | grep doublezero-agent
    docker exec dz-local-device-dz1 tail -30 /var/log/agents-latest/doublezero-agent

Client Interaction

# Check client tunnel status
docker exec dz-local-client-{pubkey} doublezero status

# Check routes on client
docker exec dz-local-client-{pubkey} ip route show

# Check tunnel interface
docker exec dz-local-client-{pubkey} ip addr show doublezero1

Manager Commands

# List users and their multicast group subscriptions
docker exec dz-local-manager doublezero user list

# List devices
docker exec dz-local-manager doublezero device list

# List multicast groups
docker exec dz-local-manager doublezero multicast group list

cEOS Interface Mapping

cEOS maps Arista interface names to Linux kernel interfaces:

  • Ethernet1 → eth1 (CYOA network - client tunnels)
  • Ethernet2 → eth2 (inter-device WAN link)
  • Management0 → eth0 (management network)
  • Tunnel500 → tu500 (user GRE tunnels)

To find interface indices:

docker exec dz-local-device-dz1 cat /sys/class/net/tu500/ifindex
docker exec dz-local-device-dz1 cat /sys/class/net/eth2/ifindex
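The mapping can be captured in a small helper so that scripts can work from the Arista name. This is a sketch: `kernel_ifname` is a hypothetical name, and the prefix rules for `EthernetN` and `TunnelN` are extrapolated from the examples listed above. Any other interface type is rejected.

```shell
# Map an Arista interface name to its Linux kernel name inside cEOS.
# Covers only the patterns listed above; returns 1 for anything else.
kernel_ifname() {
    case "$1" in
        Ethernet[0-9]*) echo "eth${1#Ethernet}" ;;  # Ethernet1  -> eth1
        Management0)    echo "eth0" ;;              # Management0 -> eth0
        Tunnel[0-9]*)   echo "tu${1#Tunnel}" ;;     # Tunnel500  -> tu500
        *)              return 1 ;;
    esac
}
```

Possible usage: `docker exec dz-local-device-dz1 cat "/sys/class/net/$(kernel_ifname Tunnel500)/ifindex"`.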

Restarting Parts of the Devnet

You don't always need to run dev/dzctl destroy and rebuild everything. If only specific containers need restarting:

  • Devices or clients: Remove just those containers, rebuild, and re-add them. For a device:
    docker rm -f dz-local-device-dz1
    dev/dzctl build
    dev/dzctl add-device --code dz1 --exchange xams --location ams --cyoa-network-host-id 8 --additional-networks dz1:dz2
  • Clients follow the same steps with add-client:
    docker rm -f dz-local-client-FposHWrkvPP3VErBAWCd4ELWGuh2mgx2Wx6cuNEA4X2S
    dev/dzctl build
    dev/dzctl add-client --cyoa-network-host-id 100 --keypair-path dev/.deploy/dz-local/client-FposHWrkvPP3VErBAWCd4ELWGuh2mgx2Wx6cuNEA4X2S/keypair.json
  • Core services (manager, controller, etc.): These are lighter and can be restarted with docker restart dz-local-controller.

Only use dev/dzctl destroy -y when you need a completely clean slate (e.g., ledger state is corrupted or you want to reset onchain state).

Running E2E Tests

Important: E2E tests are resource-intensive (each test spins up multiple Docker containers including cEOS devices). Always run specific tests rather than the full suite, as running all tests concurrently will exhaust memory on most machines.

# Run a specific test (preferred)
make e2e-test RUN=TestE2E_Multicast_Publisher

# Run with debug logging
make e2e-test-debug RUN=TestE2E_Multicast_Publisher

# Skip docker image rebuild
make e2e-test-nobuild RUN=TestE2E_Multicast_Publisher

# Keep containers after test completion/failure for debugging
make e2e-test-keep RUN=TestE2E_Multicast_Publisher

# Run all tests (requires high-memory machine)
make e2e-test

# Clean up leftover containers
make e2e-test-cleanup
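When several tests need to run, launching them one at a time keeps memory use bounded. The loop below is a minimal sketch: `run_e2e_sequentially` and the `MAKE_CMD` override are hypothetical names. The override exists only so the loop can be dry-run with `echo`.

```shell
# Run the named E2E tests one after another, stopping at the first failure.
# MAKE_CMD is an illustrative override (default: make), useful for dry runs.
run_e2e_sequentially() {
    for t in "$@"; do
        "${MAKE_CMD:-make}" e2e-test "RUN=$t" || return 1
    done
}
```

Pass each test name as an argument, for example `run_e2e_sequentially TestE2E_Multicast_Publisher`. Setting `MAKE_CMD=echo` prints the commands without running them.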