
role
baseten / Product Manager - Dedicated Inference
model
anthropic/claude-sonnet-4.6
created
2026-05-29T18:35
Cover letter

Dear Baseten Hiring Team,

Baseten sits at the exact inflection point where ML infrastructure stops being a research curiosity and becomes the operational backbone of production AI — powering inference for companies like Cursor, Notion, and Abridge at the frontier. That mission resonates directly with work I have been doing for the past several years: building the platform layers that let developers stop wrestling with infrastructure and start shipping. When I read about Baseten Loops launching a Training SDK for frontier RL workloads and Frontier Gateway unifying model access behind a single API, I recognized the same problems I have been solving from the other side of the table.

**Technical and AI/ML Foundation**

My credibility here is hands-on, not theoretical. In 2026 I built a full RL post-training workbench covering the complete RLHF/DPO pipeline: a Reward Lab for designing and A/B testing reward functions across GSM8K, MATH, HumanEval, and UltraFeedback; a TRL-powered training Playground with live SSE metric streaming on Apple Silicon (MPS) and CUDA; and an Arena for head-to-head framework benchmarking across TRL, VeRL, OpenRLHF, and NeMo RL with GPU passthrough in Docker containers. I implemented 12 RL algorithms — PPO, GRPO, DAPO, REINFORCE, REINFORCE++, RLOO, DPO, SimPO, IPO, KTO, ORPO, and SPPO — with standardized throughput, memory, and convergence benchmarking across frameworks. That is the same class of infrastructure Baseten Loops is now productizing for external customers, and I understand the tradeoffs at the algorithm and systems level.

On the inference side, as Staff PM for Developer Frameworks and Platform Infrastructure at Intuit, I scaled the ICE platform from 6K to 50K TPS through an RSocket migration that supports approximately 1.5 million concurrent connections at sub-25ms TP99, a throughput and latency profile comparable to what Baseten's inference harness targets. I also architected the RAG retrieval pipeline and multi-provider LLM orchestration layer for Fintellect AI, including fallback routing across Claude, GPT-4, and Gemini with structured output validation and token budget optimization. That work gave me direct exposure to the production model-serving tradeoffs that Baseten's Frontier Gateway is designed to abstract away.

My research foundation goes back further: a NeurIPS 2014 paper on neural networks for protein secondary structure prediction, and a 2026 rewrite of that original C++ BPTT system into a full PyTorch platform spanning models from 413 parameters to 8 billion, a roughly 19-million-fold increase in scale, with MLflow experiment tracking, Optuna HPO, and FastAPI serving.

**Why This Role**

I have spent my career at the intersection of developer experience and platform infrastructure, building the SDKs, APIs, and self-service tooling that turn complex systems into products developers actually want to use. Baseten's Product Manager, Dedicated Inference role is precisely that job, applied to the fastest-moving surface in software right now: production AI inference.

What specifically draws me to this role is the scope of the example initiatives: asynchronous inference, multi-component workflow chains, and model training built for production inference are not incremental features — they are the architectural primitives that determine whether an AI platform scales with its customers or becomes a ceiling. I want to own that roadmap. The recent launch of DFlash (3x LLM inference speedup via attention optimization) and sub-second image generation with Flux.2 signal that Baseten's engineering team is operating at the applied research boundary, and I want to be the PM translating that capability into developer-facing products that are as powerful as they are usable.

**Selected Prior Experience**

- **ICE Self-Service Platform (Intuit):** Delivered DevPortal, GitOps config, and ICE Playground — reducing developer onboarding from 2–3 weeks to minutes in pre-production and under 24 hours for production, while mitigating $1M+ in projected opex growth.

- **SDK Starter Kits (Intuit):** Extended Java and Python SDK Starter Kits with scaffolding templates, build configurations (Gradle/Maven), testing frameworks, and CI/CD integration, enabling developers to go from zero to a production-ready microservice in minutes.

- **RL Post-Training Workbench (2026):** Built a three-phase workbench (Reward Lab, Playground, Arena) benchmarking GRPO/DPO across TRL, VeRL, OpenRLHF, and NeMo RL with 12 algorithm implementations, live metric streaming, and GPU Docker passthrough.

- **aeval — AI Model Evaluation Platform (2025–2026):** Built local-first eval platform with FastAPI orchestrator, TimescaleDB, Redis job queue, and Next.js dashboard; statistical rigor includes bootstrap confidence intervals, Welch's t-test, Cohen's d, and automated safety gates with CI/CD regression detection.

- **Multi-Provider LLM Orchestration (Fintellect AI):** Architected RAG pipeline with ChromaDB, multi-provider routing (Claude, GPT-4, Gemini) with fallback logic, structured output validation, and token budget optimization in a production mobile and web application.

- **Scheduler Service (Splunk):** Delivered end-to-end in approximately four months, enabling scheduled search capabilities for first-party applications — demoed at Splunk .conf19 to an audience of enterprise customers and developers.

- **ICE Throughput Scaling (Intuit):** Achieved 275% YoY growth in ICE engagements, scaling to 675M+ in FY23; drove the RSocket migration that raised throughput from 6K to 50K TPS while supporting ~1.5M concurrent connections at sub-25ms TP99.

**Closing**

Baseten's mission — uniting applied AI research, flexible infrastructure, and seamless developer tooling so that companies at the frontier can actually ship — is the work I want to be doing. I have built the infrastructure, written the research, shipped the SDKs, and managed the platforms. I am ready to bring that full arc to bear on Baseten's core product surface.

Thank you for your consideration. I would welcome the opportunity to discuss how my background maps to what you are building.

Sincerely,

**O. Felix Amoruwa**
famoruwa@berkeley.edu | 909-731-9011 | felixamoruwa.info