Created: 2026-05-21T00:07
Cover letter
Dear Claude Code Hiring Team,
Anthropic's mission — building AI systems that are reliable, interpretable, and steerable — sits at the exact intersection of safety and capability that defines the next decade of software development. Claude Code in particular represents something I find genuinely compelling: a bet that the terminal is not a legacy interface but the highest-leverage surface for agentic AI, and that developers who live in the CLI deserve a first-class AI collaborator. My path from hand-coding backpropagation through time in C++ at UC Berkeley in 2004 to building a production RL post-training workbench that benchmarks GRPO and DPO across TRL, VeRL, OpenRLHF, and NeMo RL today has been a continuous thread of building developer-facing platforms at the frontier of what models can do.
---
**Technical and AI/ML Foundation**
My technical credibility is grounded in two decades of building, not just specifying. The most relevant recent example is my RL Workbench (2026), a three-phase post-training platform covering the full RLHF/DPO pipeline: a Reward Lab for designing and A/B testing reward functions across GSM8K, MATH, HumanEval, and UltraFeedback; a Playground for real TRL-powered GRPO and DPO training with live SSE metric streaming on Apple Silicon (MPS) and CUDA; and an Arena for head-to-head framework benchmarking across TRL, VeRL, OpenRLHF, and NeMo RL with GPU passthrough in Docker containers. I implemented 12 RL algorithms — PPO, GRPO, DAPO, REINFORCE, REINFORCE++, RLOO, DPO, SimPO, IPO, KTO, ORPO, and SPPO — with algorithm-specific metric profiles and standardized throughput, memory, and convergence benchmarking. This is not background reading; it is the kind of hands-on fluency that lets me have a peer-level conversation with Anthropic's research team about what model capabilities mean for product surfaces.
Complementing that, my aeval platform (2025–2026) is a local-first model evaluation system built on FastAPI, TimescaleDB, Redis, and Ollama. It provides five core eval types, adversarial safety testing with refusal detection, statistical analysis (bootstrap confidence intervals, Welch's t-test, and Cohen's d effect size), and CI/CD integration with automated safety gates. My NeurIPS 2014 publication on artificial neural networks for protein secondary structure prediction was built on a system I originally hand-coded in C++ with custom BPTT in 2004 and rewrote in 2026 at scales spanning 413 parameters to 8 billion; that gives me a research foundation that is unusual for a PM.
On the developer platform side, I spent three years at Intuit as Staff PM for Developer Frameworks and Platform Infrastructure, where I extended the Java and Python SDK Starter Kits with scaffolding templates, build configurations, testing frameworks, and CI/CD integration, enabling developers to go from zero to a production-ready microservice in minutes. I delivered the ICE Self-Service platform, cutting developer onboarding time from two-to-three weeks to minutes in pre-production and under 24 hours in production, while mitigating over $1M in projected opex growth. That platform scaled to 675M+ engagements in FY23 across QuickBooks, TurboTax, Mint, Mailchimp, and Credit Karma, and its throughput grew from 6K to 50K TPS through a migration to RSocket, supporting approximately 1.5M concurrent connections at sub-25ms TP99.
---
**Why This Role, Why Now**
Anthropic's acquisition of Stainless signals that the company is building a serious developer platform layer, not just a model API. Claude Code is the sharpest edge of that platform — the place where model intelligence meets the developer's actual workflow. The role of defining the roadmap for an area of the Claude Code product suite, translating cutting-edge AI advances into practical developer features, and building an ecosystem around the CLI maps precisely to the work I have done: shipping developer SDKs at Intuit, building the OpenClaw multi-agent orchestration framework with gateway protocol and subagent delegation at StreamIO, and designing evaluation infrastructure that gives developers structured feedback on model behavior.
---
**Role-Specific Connection**
What excites me most about this role is the specific challenge of keeping Claude Code ahead of model capabilities as those capabilities accelerate. Claude Opus 4.7's SOTA performance on coding and multi-step agentic tasks means the product surface needs to evolve faster than the model does — and that requires a PM who can read a research paper, understand what it implies for the CLI experience, and turn it into a concrete feature spec before the model ships. The JD's emphasis on building an ecosystem so developers can share best practices resonates directly with my ICE platform work, where the hardest problem was not the infrastructure but the developer adoption flywheel. I also bring direct experience with the agentic coding workflow as a practitioner: I built StreamIO's production Electron and React desktop application with 100+ components, Redux Toolkit state management, and native macOS integration — using Claude via the MCP SDK as the AI backbone — so I understand the developer experience from the inside.
---
**Selected Prior Experience**
- **RL Workbench (2026):** Implemented 12 RL algorithms (PPO, GRPO, DAPO, DPO, SimPO, and others) with cross-tab workflow lineage tracking and standardized benchmarking across TRL, VeRL, OpenRLHF, and NeMo RL — hands-on fluency with the post-training stack that powers Claude's capabilities.
- **Intuit ICE Self-Service Platform:** Delivered DevPortal, GitOps config, and ICE Playground, reducing developer onboarding from weeks to minutes; scaled to 675M+ engagements and 50K TPS at sub-25ms TP99.
- **Java and Python SDK Starter Kits (Intuit):** Extended with scaffolding templates, build configurations (Gradle/Maven), testing frameworks, and CI/CD integration, taking a microservice from zero to production in minutes across ~20 mobile apps and 30+ product SKUs.
- **OpenClaw Multi-Agent Orchestration (StreamIO):** Implemented gateway protocol, subagent delegation, profile management, and session switching — enabling coordinated AI agent workflows across multiple industry verticals.
- **StreamIO Desktop Application:** Built production Electron + React + TypeScript application with 100+ components, native macOS ScreenCaptureKit integration via Swift, embedded terminal (xterm.js/node-pty, 20+ slash commands), and MCP server exposing screen capture tools to AI coding assistants — directly analogous to the Claude Code CLI ecosystem.
- **aeval Evaluation Platform (2025–2026):** Built local-first model evaluation with adversarial safety testing, statistical rigor (bootstrap CI, Welch's t-test, Cohen's d), and CI/CD regression detection — the kind of evaluation infrastructure that keeps a developer-facing AI product honest as models evolve.
- **Splunk Scheduler Service:** Delivered end-to-end in approximately four months, enabling scheduled search capabilities for first-party applications and demonstrating my ability to ship ambitious developer-facing features on compressed timelines.
---
**Closing**
Anthropic's commitment to building AI that is safe and beneficial is not a constraint on ambition — it is the reason the work matters. Claude Code is the product that puts the most capable Claude models directly in the hands of the developers who will build the next generation of AI applications. I want to help define what that product becomes as model intelligence continues to accelerate. I would welcome the opportunity to discuss how my background in developer platforms, RL post-training infrastructure, and agentic AI product development can contribute to that work.
Thank you for your consideration.
**O. Felix Amoruwa**
famoruwa@berkeley.edu | 909-731-9011 | felixamoruwa.info