Research Planning
Prompts Used
Create directives:

/do-parallel for each topic, create a directive for our research team to research and write it in research-requests/{name}.md

Deep research (Gemini):

For each directive in research-requests/: 1. Open gemini.google.com/app 2. Click Tools → Deep Research 3. Paste the entire directive 4. Review the comprehensive report
The Method
Before building, you need to know how to build. The roadmap outlines what we want to achieve, but not necessarily how. Each component needs best practices, working code, methodologies, and prior art. The research phase hydrates the planning docs with real knowledge.
Step 1: Create Research Directives
I used /do-parallel to create all 10 research directives simultaneously:
/do-parallel for each topic, create a directive for our research team to research and write it in research-requests/{name}.md
Claude spawned 10 parallel agents, each writing a directive for one topic. Each directive assigns an expert persona and asks specific, measurable questions.
Step 2: Deep Research
For each directive, I used Gemini's Deep Research:
- Open gemini.google.com/app
- Click Tools → Deep Research
- Paste the entire directive
- Wait 5-10 minutes for the comprehensive report
- Save the results to research/{topic-slug}.md
Deep Research browses the web, reads papers, and synthesizes findings. The output is a structured report with citations.
Step 3: Load Research into Project
Save the Deep Research results to research/{topic-slug}.md. You can either copy them directly or have Claude rewrite them into the project format.
What We Learned
Here are examples that came out of the research — concrete patterns, working code, and validated approaches.
eBPF: Use Ring Buffers, Not Perf Arrays
Linux 5.8+ provides BPF_MAP_TYPE_RINGBUF with superior characteristics for high-frequency event capture:
// Assumed includes; entropy_event and ENTROPY_MUNMAP normally live in a
// project header, so a minimal version is sketched here.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

#define ENTROPY_MUNMAP 1

struct entropy_event {
    __u32 pid;
    __u64 bytes_freed;
    __u64 timestamp_ns;
    __u32 event_type;
};

// Ring buffer map definition — shared across all CPUs
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024); // 256KB shared
} events SEC(".maps");

// Zero-copy event submission
SEC("tracepoint/syscalls/sys_enter_munmap")
int trace_munmap_ringbuf(struct trace_event_raw_sys_enter *ctx)
{
    struct entropy_event *event;

    // Reserve space in ring buffer (zero-copy)
    event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
    if (!event)
        return 0;

    event->pid = bpf_get_current_pid_tgid() >> 32;
    event->bytes_freed = ctx->args[1]; // second munmap argument: length
    event->timestamp_ns = bpf_ktime_get_ns();
    event->event_type = ENTROPY_MUNMAP;

    bpf_ringbuf_submit(event, 0);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
Ring buffers avoid per-CPU allocation, reduce memory footprint, and provide zero-copy semantics. This replaces the perf event array approach in our original design.
Thermal Time Constants: Measure, Don't Assume
The PID controller assumes τ = 1s for CPU, 2s for GPU, 30s for chassis. But these vary by hardware. The research revealed a methodology for measuring actual values:
import numpy as np
from scipy.optimize import curve_fit

def thermal_response(t, T_final, tau, T_initial, t_dead):
    """First-order thermal response with dead time."""
    t_effective = np.maximum(t - t_dead, 0)
    return T_initial + (T_final - T_initial) * (1 - np.exp(-t_effective / tau))

# Fit measured data (time_data in seconds, temp_data in °C) to extract the actual τ.
# p0 holds initial guesses for [T_final, tau, T_initial, t_dead].
popt, pcov = curve_fit(thermal_response, time_data, temp_data,
                       p0=[80, 1.0, 40, 0.1])
tau_measured = popt[1]
Key insight: the sampling interval must be at least 10x shorter than the expected τ (100 ms intervals for τ = 1 s).
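A quick way to sanity-check both the fit and the sampling-rate rule is to run them against a synthetic step response where τ is known. This is a minimal sketch; the temperatures, noise level, and τ value are invented for illustration.

import numpy as np
from scipy.optimize import curve_fit

def thermal_response(t, T_final, tau, T_initial, t_dead):
    t_effective = np.maximum(t - t_dead, 0)
    return T_initial + (T_final - T_initial) * (1 - np.exp(-t_effective / tau))

# Synthetic CPU step: 40°C -> 80°C, true τ = 1.0s, 0.1s dead time,
# sampled every 100ms (10 samples per time constant) with sensor noise.
rng = np.random.default_rng(0)
time_data = np.arange(0.0, 10.0, 0.1)
temp_data = thermal_response(time_data, 80.0, 1.0, 40.0, 0.1)
temp_data = temp_data + rng.normal(0.0, 0.5, size=time_data.shape)

popt, _ = curve_fit(thermal_response, time_data, temp_data,
                    p0=[80, 1.0, 40, 0.1])
print(f"recovered tau = {popt[1]:.2f}s (true value: 1.0s)")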
Proof of Inference: Layered Verification
Full cryptographic proofs (zk-SNARKs) are too expensive — 10-1000x inference time. The research identified a layered approach that leverages Maxwell's dual-plane control:
Layer 1: Thermodynamic (Always On)
├─ Power draw must match claimed computation class
└─ Blocks obvious cheats (mining, loops) instantly
Layer 2: PCIe Attestation (Always On)
├─ Hash tensors at bus boundary
└─ Timing signatures must match model profile
Layer 3: Selective ZK (High-Value Only)
├─ For bids above threshold, require ZK proof
└─ Proof of specific layer execution
Layer 4: Random Deep Audit (Rare)
├─ Full inference re-execution by Maxwell
└─ Economic deterrent via staking
This is unique to Maxwell — we control both CPU and GPU planes, so we can instrument the PCIe bus and correlate power telemetry with claimed work.
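Here is a sketch of how these layers might compose at dispatch time. The thresholds, field names, and audit probability are illustrative assumptions, not the actual Maxwell interfaces.

import random
from dataclasses import dataclass

# Illustrative constants: the real values come from the auction and staking design.
ZK_PROOF_BID_THRESHOLD = 10.0   # Layer 3 kicks in above this bid
DEEP_AUDIT_PROBABILITY = 0.001  # Layer 4: fraction of jobs fully re-executed

@dataclass
class JobClaim:
    bid: float
    claimed_power_w: float   # power class the bidder claims for the workload
    measured_power_w: float  # RAPL/NVML telemetry observed during execution
    pcie_hashes_ok: bool     # tensor hashes at the bus boundary match the model profile

def verification_plan(job: JobClaim, rng: random.Random) -> list[str]:
    """Decide which verification layers this job must pass."""
    layers = []
    # Layer 1: thermodynamic check, always on.
    if abs(job.measured_power_w - job.claimed_power_w) > 0.2 * job.claimed_power_w:
        return ["reject: power draw inconsistent with claimed computation class"]
    layers.append("thermodynamic")
    # Layer 2: PCIe attestation, always on.
    if not job.pcie_hashes_ok:
        return ["reject: PCIe timing/hash signature does not match model profile"]
    layers.append("pcie-attestation")
    # Layer 3: selective ZK proof for high-value bids only.
    if job.bid > ZK_PROOF_BID_THRESHOLD:
        layers.append("zk-proof")
    # Layer 4: rare random deep audit (full re-execution by Maxwell).
    if rng.random() < DEEP_AUDIT_PROBABILITY:
        layers.append("deep-audit")
    return layers

print(verification_plan(JobClaim(12.0, 250.0, 245.0, True), random.Random(0)))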
GSP Auction: Thermal Coupling Matrix
The auction research formalized how thermal coupling affects pricing across cores:
Thermal Coupling Matrix (4-core example):
K = | 1.00  0.85  0.60  0.35 |
    | 0.85  1.00  0.75  0.50 |
    | 0.60  0.75  1.00  0.70 |
    | 0.35  0.50  0.70  1.00 |
Core 0 (near the GPU) sees an 8x price multiplier when the GPU is at 95% utilization. Core 3 (distant) sees only 1.5x. This asymmetry is a feature, not a bug — it naturally routes low-priority work to thermally isolated cores.
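A minimal sketch of how the coupling matrix could feed per-core pricing. The GPU-to-core coupling vector and the load-to-multiplier curve are placeholders to calibrate against the auction model, not values from the research.

import numpy as np

# Thermal coupling matrix from the research (4-core example).
K = np.array([
    [1.00, 0.85, 0.60, 0.35],
    [0.85, 1.00, 0.75, 0.50],
    [0.60, 0.75, 1.00, 0.70],
    [0.35, 0.50, 0.70, 1.00],
])

# Hypothetical GPU-to-core coupling: core 0 sits nearest the GPU.
gpu_coupling = np.array([0.90, 0.65, 0.40, 0.15])

def coupled_thermal_load(core_util, gpu_util):
    """Effective load each core 'feels': its neighbours' load through K,
    plus heat bleeding over from the GPU."""
    return K @ core_util + gpu_coupling * gpu_util

def price_multipliers(core_util, gpu_util, alpha=1.0, beta=2.0):
    """Placeholder pricing curve: calibrate alpha/beta against the auction model."""
    return 1.0 + alpha * coupled_thermal_load(core_util, gpu_util) ** beta

# Example: lightly loaded cores while the GPU runs at 95% utilization.
print(price_multipliers(np.array([0.2, 0.2, 0.2, 0.2]), 0.95))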
Firecracker: The Pause Mechanism
The latency research clarified what actually happens during VM pause:
- Send SIGSTOP to vCPU threads
- Wait for vCPUs to halt at safe point
- Drain in-flight I/O operations
- Return success to API caller
Key finding: Pause latency scales with active I/O, not memory size. A 256MB VM with heavy disk I/O pauses slower than a 4GB idle VM.
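Pausing is a single PATCH /vm call on Firecracker's API socket, which makes the latency straightforward to benchmark from the host. A rough timing sketch (the socket path is an assumption, and the measurement includes the HTTP round trip):

import json
import socket
import time
from http.client import HTTPConnection

class UnixHTTPConnection(HTTPConnection):
    """HTTP client over Firecracker's Unix-domain API socket."""
    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)

def time_pause(socket_path="/tmp/firecracker.socket"):
    """Issue PATCH /vm {"state": "Paused"} and time the call."""
    conn = UnixHTTPConnection(socket_path)
    start = time.perf_counter()
    conn.request("PATCH", "/vm", json.dumps({"state": "Paused"}),
                 headers={"Content-Type": "application/json"})
    status = conn.getresponse().status
    latency_ms = (time.perf_counter() - start) * 1000.0
    return status, latency_ms

status, latency_ms = time_pause()
print(f"pause returned {status} in {latency_ms:.1f} ms")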
The Ten Research Topics
Hardware & Physics Layer
- Firecracker Pause/Resume Latency — Benchmarking for thermal emergencies
- eBPF Overhead Validation — Production load testing
- RAPL Accuracy Calibration — Hardware-specific power reporting
- Thermal Time Constant Validation — Measuring actual τ values
- Thermal Coupling Measurement — Inter-core heat transfer
Mechanism Design
- GSP Thermal Stability — Auction equilibrium under thermal dynamics
- High-Frequency Auction Research — 100Hz market clearing
Novel Capabilities
- Power-Trace Verification — Distinguishing inference from mining
- Proof of Inference — Cryptographic verification approaches
- Thermal Gossip Consensus — Distributed thermal coordination
Next Steps
Take the research findings and update the roadmap with validated approaches, working code patterns, and concrete methodologies.