Research Planning
Prompts Used
Create directives:

/do-parallel for each topic, create a directive for our research team to research and write it in research-requests/{name}.md

Deep research (Gemini):

For each directive in research-requests/: 1. Open gemini.google.com/app 2. Click Tools → Deep Research 3. Paste the entire directive 4. Review the comprehensive report
The Method
Before building, you need to know how to build. The roadmap outlines what we want to achieve, but not necessarily how. Each component needs best practices, working code, methodologies, and prior art. The research phase hydrates the planning docs with real knowledge.
Step 1: Create Research Directives
I used /do-parallel to create all 10 research directives simultaneously:
/do-parallel for each topic, create a directive for our research team to research and write it in research-requests/{name}.md
Claude spawned 10 parallel agents, each writing a directive for one topic. Each directive assigns an expert persona and asks specific, measurable questions.
Step 2: Deep Research
For each directive, I used Gemini's Deep Research:
- Open gemini.google.com/app
- Click Tools → Deep Research
- Paste the entire directive
- Wait 5-10 minutes for the comprehensive report
- Save the results to research/{topic-slug}.md
Deep Research browses the web, reads papers, and synthesizes findings. The output is a structured report with citations.
Step 3: Load Research into Project
Save the Deep Research results to research/{topic-slug}.md. You can either copy them directly or have Claude rewrite them into the project format.
What We Learned
Here are examples that came out of the research — concrete patterns, working code, and validated approaches.
eBPF: Use Ring Buffers, Not Perf Arrays
Linux 5.8+ provides BPF_MAP_TYPE_RINGBUF with superior characteristics for high-frequency event capture:
// Assumed includes; entropy_event and ENTROPY_MUNMAP normally live in a
// project header, so a minimal version is sketched here.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

#define ENTROPY_MUNMAP 1

struct entropy_event {
    __u32 pid;
    __u64 bytes_freed;
    __u64 timestamp_ns;
    __u32 event_type;
};

// Ring buffer map definition — shared across all CPUs
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024); // 256KB shared
} events SEC(".maps");

// Zero-copy event submission
SEC("tracepoint/syscalls/sys_enter_munmap")
int trace_munmap_ringbuf(struct trace_event_raw_sys_enter *ctx)
{
    struct entropy_event *event;

    // Reserve space in ring buffer (zero-copy)
    event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
    if (!event)
        return 0;

    event->pid = bpf_get_current_pid_tgid() >> 32;
    event->bytes_freed = ctx->args[1]; // second munmap argument: length
    event->timestamp_ns = bpf_ktime_get_ns();
    event->event_type = ENTROPY_MUNMAP;

    bpf_ringbuf_submit(event, 0);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
Ring buffers avoid per-CPU allocation, reduce memory footprint, and provide zero-copy semantics. This replaces the perf event array approach in our original design.
Thermal Time Constants: Measure, Don't Assume
The PID controller assumes τ = 1s for CPU, 2s for GPU, 30s for chassis. But these vary by hardware. The research revealed a methodology for measuring actual values:
import numpy as np
from scipy.optimize import curve_fit

def thermal_response(t, T_final, tau, T_initial, t_dead):
    """First-order thermal response with dead time."""
    t_effective = np.maximum(t - t_dead, 0)
    return T_initial + (T_final - T_initial) * (1 - np.exp(-t_effective / tau))

# Fit measured data (time_data in seconds, temp_data in °C) to extract the actual τ.
# p0 holds initial guesses for [T_final, tau, T_initial, t_dead].
popt, pcov = curve_fit(thermal_response, time_data, temp_data,
                       p0=[80, 1.0, 40, 0.1])
tau_measured = popt[1]
Key insight: the sampling interval must be at least 10x shorter than the expected τ (100 ms intervals for τ = 1 s).
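A quick way to sanity-check both the fit and the sampling-rate rule is to run them against a synthetic step response where τ is known. This is a minimal sketch; the temperatures, noise level, and τ value are invented for illustration.

import numpy as np
from scipy.optimize import curve_fit

def thermal_response(t, T_final, tau, T_initial, t_dead):
    t_effective = np.maximum(t - t_dead, 0)
    return T_initial + (T_final - T_initial) * (1 - np.exp(-t_effective / tau))

# Synthetic CPU step: 40°C -> 80°C, true τ = 1.0s, 0.1s dead time,
# sampled every 100ms (10 samples per time constant) with sensor noise.
rng = np.random.default_rng(0)
time_data = np.arange(0.0, 10.0, 0.1)
temp_data = thermal_response(time_data, 80.0, 1.0, 40.0, 0.1)
temp_data = temp_data + rng.normal(0.0, 0.5, size=time_data.shape)

popt, _ = curve_fit(thermal_response, time_data, temp_data,
                    p0=[80, 1.0, 40, 0.1])
print(f"recovered tau = {popt[1]:.2f}s (true value: 1.0s)")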
Proof of Inference: Layered Verification
Full cryptographic proofs (zk-SNARKs) are too expensive — 10-1000x inference time. The research identified a layered approach that leverages Maxwell's dual-plane control:
Layer 1: Thermodynamic (Always On)
├─ Power draw must match claimed computation class
└─ Blocks obvious cheats (mining, loops) instantly
Layer 2: PCIe Attestation (Always On)
├─ Hash tensors at bus boundary
└─ Timing signatures must match model profile
Layer 3: Selective ZK (High-Value Only)
├─ For bids above threshold, require ZK proof
└─ Proof of specific layer execution
Layer 4: Random Deep Audit (Rare)
├─ Full inference re-execution by Maxwell
└─ Economic deterrent via staking
This is unique to Maxwell — we control both CPU and GPU planes, so we can instrument the PCIe bus and correlate power telemetry with claimed work.
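Here is a sketch of how these layers might compose at dispatch time. The thresholds, field names, and audit probability are illustrative assumptions, not the actual Maxwell interfaces.

import random
from dataclasses import dataclass

# Illustrative constants: the real values come from the auction and staking design.
ZK_PROOF_BID_THRESHOLD = 10.0   # Layer 3 kicks in above this bid
DEEP_AUDIT_PROBABILITY = 0.001  # Layer 4: fraction of jobs fully re-executed

@dataclass
class JobClaim:
    bid: float
    claimed_power_w: float   # power class the bidder claims for the workload
    measured_power_w: float  # RAPL/NVML telemetry observed during execution
    pcie_hashes_ok: bool     # tensor hashes at the bus boundary match the model profile

def verification_plan(job: JobClaim, rng: random.Random) -> list[str]:
    """Decide which verification layers this job must pass."""
    layers = []
    # Layer 1: thermodynamic check, always on.
    if abs(job.measured_power_w - job.claimed_power_w) > 0.2 * job.claimed_power_w:
        return ["reject: power draw inconsistent with claimed computation class"]
    layers.append("thermodynamic")
    # Layer 2: PCIe attestation, always on.
    if not job.pcie_hashes_ok:
        return ["reject: PCIe timing/hash signature does not match model profile"]
    layers.append("pcie-attestation")
    # Layer 3: selective ZK proof for high-value bids only.
    if job.bid > ZK_PROOF_BID_THRESHOLD:
        layers.append("zk-proof")
    # Layer 4: rare random deep audit (full re-execution by Maxwell).
    if rng.random() < DEEP_AUDIT_PROBABILITY:
        layers.append("deep-audit")
    return layers

print(verification_plan(JobClaim(12.0, 250.0, 245.0, True), random.Random(0)))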
GSP Auction: Thermal Coupling Matrix
The auction research formalized how thermal coupling affects pricing across cores:
Thermal Coupling Matrix (4-core example):
K = | 1.00  0.85  0.60  0.35 |
    | 0.85  1.00  0.75  0.50 |
    | 0.60  0.75  1.00  0.70 |
    | 0.35  0.50  0.70  1.00 |
Core 0 (near the GPU) sees an 8x price multiplier when the GPU is at 95% utilization. Core 3 (distant) sees only 1.5x. This asymmetry is a feature, not a bug — it naturally routes low-priority work to thermally isolated cores.
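A minimal sketch of how the coupling matrix could feed per-core pricing. The GPU-to-core coupling vector and the load-to-multiplier curve are placeholders to calibrate against the auction model, not values from the research.

import numpy as np

# Thermal coupling matrix from the research (4-core example).
K = np.array([
    [1.00, 0.85, 0.60, 0.35],
    [0.85, 1.00, 0.75, 0.50],
    [0.60, 0.75, 1.00, 0.70],
    [0.35, 0.50, 0.70, 1.00],
])

# Hypothetical GPU-to-core coupling: core 0 sits nearest the GPU.
gpu_coupling = np.array([0.90, 0.65, 0.40, 0.15])

def coupled_thermal_load(core_util, gpu_util):
    """Effective load each core 'feels': its neighbours' load through K,
    plus heat bleeding over from the GPU."""
    return K @ core_util + gpu_coupling * gpu_util

def price_multipliers(core_util, gpu_util, alpha=1.0, beta=2.0):
    """Placeholder pricing curve: calibrate alpha/beta against the auction model."""
    return 1.0 + alpha * coupled_thermal_load(core_util, gpu_util) ** beta

# Example: lightly loaded cores while the GPU runs at 95% utilization.
print(price_multipliers(np.array([0.2, 0.2, 0.2, 0.2]), 0.95))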
Firecracker: The Pause Mechanism
The latency research clarified what actually happens during VM pause:
- Send SIGSTOP to vCPU threads
- Wait for vCPUs to halt at safe point
- Drain in-flight I/O operations
- Return success to API caller
Key finding: Pause latency scales with active I/O, not memory size. A 256MB VM with heavy disk I/O pauses slower than a 4GB idle VM.
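Pausing is a single PATCH /vm call on Firecracker's API socket, which makes the latency straightforward to benchmark from the host. A rough timing sketch (the socket path is an assumption, and the measurement includes the HTTP round trip):

import json
import socket
import time
from http.client import HTTPConnection

class UnixHTTPConnection(HTTPConnection):
    """HTTP client over Firecracker's Unix-domain API socket."""
    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)

def time_pause(socket_path="/tmp/firecracker.socket"):
    """Issue PATCH /vm {"state": "Paused"} and time the call."""
    conn = UnixHTTPConnection(socket_path)
    start = time.perf_counter()
    conn.request("PATCH", "/vm", json.dumps({"state": "Paused"}),
                 headers={"Content-Type": "application/json"})
    status = conn.getresponse().status
    latency_ms = (time.perf_counter() - start) * 1000.0
    return status, latency_ms

status, latency_ms = time_pause()
print(f"pause returned {status} in {latency_ms:.1f} ms")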
The Ten Research Topics
Hardware & Physics Layer
- Firecracker Pause/Resume Latency — Benchmarking for thermal emergencies
- eBPF Overhead Validation — Production load testing
- RAPL Accuracy Calibration — Hardware-specific power reporting
- Thermal Time Constant Validation — Measuring actual τ values
- Thermal Coupling Measurement — Inter-core heat transfer
Mechanism Design
- GSP Thermal Stability — Auction equilibrium under thermal dynamics
- High-Frequency Auction Research — 100Hz market clearing
Novel Capabilities
- Power-Trace Verification — Distinguishing inference from mining
- Proof of Inference — Cryptographic verification approaches
- Thermal Gossip Consensus — Distributed thermal coordination
Next Steps
Take the research findings and update the roadmap with validated approaches, working code patterns, and concrete methodologies.