
role: thinkingmachines / Research Product Manager
model: anthropic/claude-sonnet-4.6
created: 2026-05-20T03:28
Cover letter

Dear Thinking Machines Lab Hiring Team,

Thinking Machines Lab is working on one of the most consequential problems in technology: building collaborative general intelligence that is genuinely accessible, not just to well-resourced institutions but to anyone with a problem worth solving. That mission resonates with me directly. My own path runs from hand-coding backpropagation through time in C++ at UC Berkeley in 2004, to publishing at NeurIPS, to building RL post-training infrastructure that today benchmarks GRPO and DPO across TRL, VeRL, OpenRLHF, and NeMo RL. It has been defined by the same conviction: the gap between frontier research and real-world utility is a product problem as much as a technical one.

**Technical and Research Foundation**

My AI/ML work is hands-on and longitudinal. The BRAIN project began as a hand-coded neural network in C++ with custom BPTT for protein secondary structure prediction. That work grew out of computational biology research at Lawrence Berkeley National Laboratory under Dr. Steven Holbrook and was accepted at NeurIPS 2014. The 2026 rewrite of the system spans five neural architectures (feedforward, GRU, Transformer, ESM-2, multi-task), MLflow experiment tracking, Optuna hyperparameter optimization, and FastAPI serving, and it scales from the original 413-parameter model to 8 billion parameters, a roughly 19-million-fold increase.

More recently, I built a full RL post-training workbench covering the RLHF/DPO pipeline end-to-end. The platform includes a Reward Lab for designing and A/B testing reward functions (RLVR, learned, and hybrid) across GSM8K, MATH, HumanEval, and UltraFeedback; a Playground for real TRL-powered GRPO and DPO training with live SSE metric streaming on Apple Silicon (MPS) and CUDA; and an Arena for head-to-head framework benchmarking across TRL, VeRL, OpenRLHF, and NeMo RL with GPU passthrough in Docker containers. The workbench implements 12 RL algorithms — PPO, GRPO, DAPO, REINFORCE, REINFORCE++, RLOO, DPO, SimPO, IPO, KTO, ORPO, and SPPO — with standardized throughput, memory, and convergence benchmarking across frameworks.

I also built aeval, a local-first model evaluation platform with five core eval types (factuality, reasoning, instruction-following, safety, and code generation), adversarial safety testing with refusal detection, and data contamination detection via SHA-256 hashing. Statistical rigor is built in: bootstrap confidence intervals, Welch's t-test, Cohen's d effect size, and saturation detection, with CI/CD integration for regression detection and automated safety gates.

**Connecting the Arc**

What ties this technical work to my product career is a consistent pattern: I operate most effectively at the boundary where research ambition meets execution reality — translating complex, evolving systems into scoped plans, resource roadmaps, and cross-functional alignment. That is precisely what the Research Product Manager role at Thinking Machines Lab requires.

**Why This Role**

The RPM role description maps directly to the work I find most energizing: driving large-scale research products, translating technical ideas into actionable milestones, creating compute and resource roadmaps, and bridging frontier research with production systems. Thriving in deeply technical discussions while maintaining organizational momentum across model development, data campaigns, and infrastructure is not a stretch for me; it is the mode I have operated in across research, platform, and applied product contexts. The team's lineage, with contributors to ChatGPT, Character.ai, Mistral, PyTorch, OpenAI Gym, and Segment Anything, represents the kind of environment where technical depth is a prerequisite, not a differentiator.

**Selected Prior Experience**

- **RL Workbench (2026):** Designed and built a 3-phase post-training platform implementing 12 RL algorithms with cross-tab workflow lineage tracking and standardized benchmarking across TRL, VeRL, OpenRLHF, and NeMo RL — directly relevant to post-training research program management.

- **aeval (2025–2026):** Built a model evaluation platform with adversarial safety testing, statistical rigor (bootstrap CI, Welch's t-test, Cohen's d), and CI/CD regression detection — relevant to evals infrastructure and research quality gates.

- **NeurIPS 2014:** Published research on artificial neural networks for protein secondary structure prediction, establishing a foundation in ML research contribution and scientific communication.

- **Intuit, ICE Platform (2021–2024):** Delivered 275% YoY engagement growth, scaling to 675M+ engagements in FY23; scaled throughput from 6K to 50K TPS via an RSocket migration supporting ~1.5M concurrent connections with sub-25ms TP99, demonstrating infrastructure roadmap ownership at scale.

- **Intuit — Developer Onboarding (2021–2024):** Reduced developer onboarding from 2–3 weeks to minutes in pre-prod and under 24 hours for production via the ICE Self-Service platform, mitigating $1M+ in projected opex growth — evidence of translating technical complexity into measurable execution outcomes.

- **Splunk — Search Orchestration (2019–2021):** Owned Go microservices, PostgreSQL metadata services, and SPL/SPL2 roadmaps; delivered Scheduler Service end-to-end in approximately four months and achieved up to 10x query performance improvements for a Fortune 500 beta customer.

- **Fintellect AI (2024–Present):** Architected a RAG retrieval pipeline with ChromaDB, multi-provider LLM orchestration (Claude, GPT-4, Gemini) with fallback routing, structured output validation, and token budget optimization — applied ML infrastructure built from the ground up.

**Closing**

Thinking Machines Lab's mission of empowering humanity through collaborative general intelligence is more than a tagline to me. It is the logical destination of a career spent making complex AI systems legible, usable, and impactful for the people who need them. I would welcome the opportunity to bring that combination of research depth, platform-scale execution experience, and product clarity to your team.

Thank you for your consideration.

**O. Felix Amoruwa**
famoruwa@berkeley.edu | 909-731-9011 | felixamoruwa.info