role
model
anthropic/claude-sonnet-4.6
created
2026-05-20T21:59
Cover letter
Dear NVIDIA AI Frameworks Hiring Team,
NVIDIA sits at the center of every serious AI infrastructure decision being made today: the GPU platform is not just hardware but the substrate on which frontier research becomes production reality. That path from research to production is where I have spent the better part of my career, and it is what draws me to this role. From hand-coding backpropagation through time in C++ at UC Berkeley in 2004 to building a post-training RL workbench that benchmarks GRPO and DPO across TRL, VeRL, OpenRLHF, and NeMo RL today, I have tracked the arc of deep learning from first principles to production scale, and I want to help NVIDIA push that arc further.
## Technical Foundation
My AI/ML work is not adjacent to my product work; it is the same work. In 2025–2026, I built an RL post-training workbench from scratch covering the full RLHF/DPO pipeline in three phases: a Reward Lab for designing and A/B testing reward functions (RLVR, learned, and hybrid) across GSM8K, MATH, HumanEval, and UltraFeedback; a Playground for real TRL-powered GRPO and DPO training with live SSE metric streaming on Apple Silicon (MPS) and CUDA; and an Arena for head-to-head framework benchmarking across TRL, VeRL, OpenRLHF, and NeMo RL with GPU passthrough in Docker containers. I implemented 12 RL and preference-optimization algorithms (PPO, GRPO, DAPO, REINFORCE, REINFORCE++, RLOO, DPO, SimPO, IPO, KTO, ORPO, and SPPO), each with an algorithm-specific metric profile and standardized throughput, memory, and convergence benchmarks. This is the post-training landscape awareness the job description calls out directly.
My evaluation platform, aeval, adds statistical rigor to model assessment: bootstrap confidence intervals, Welch's t-test, Cohen's d effect size, saturation detection, adversarial safety testing with refusal detection, and CI/CD regression gates. The stack — FastAPI orchestrator, TimescaleDB, Redis job queue, Next.js dashboard, Ollama — reflects the same infrastructure thinking required to build developer-facing AI frameworks at NVIDIA's scale.
My NeurIPS 2014 paper on artificial neural networks for protein secondary structure prediction, and the subsequent 2026 rewrite of that system in PyTorch spanning 413 parameters to 8B (a 19-million-fold scale increase), ground my ML credibility in both research rigor and modern production engineering.
## Connecting the Arc
The through-line of my career — from SOA platform infrastructure at Kaiser, to search orchestration microservices at Splunk, to developer frameworks at Intuit, to founding AI-native products — is building platforms that meet technical users where they are and remove the friction between what they want to build and what the infrastructure allows. That is precisely what NVIDIA's AI Frameworks team does for RecSys and Generative Recommender researchers.
## Why This Role
Generative Recommender models (GEM, TIGER, and the emerging paradigms building on them) represent one of the most technically demanding intersections of large-scale distributed training, inference optimization, and production deployment. The opportunity to own the product strategy for enabling frontier RecSys and GenRec model builders on NVIDIA GPUs, working directly with researchers and operators to identify key improvements and build end-to-end ML lifecycle roadmaps, is exactly the kind of high-leverage, technically grounded PM role I am built for. The emphasis on open-source, GitHub-first developer products with deep customer interaction mirrors how I have operated in every platform role I have held.
## Selected Relevant Experience
- **RL Post-Training Workbench (2026):** Built 3-phase workbench covering reward design, real GRPO/DPO training with live metric streaming on MPS/CUDA, and head-to-head benchmarking across TRL, VeRL, OpenRLHF, and NeMo RL with GPU Docker passthrough — directly applicable to NVIDIA's post-training framework work.
- **aeval — AI Model Evaluation Platform (2025–2026):** Built local-first evaluation platform with 5 eval types, adversarial safety testing, statistical rigor (bootstrap CIs, Welch's t-test, Cohen's d), and CI/CD integration with automated safety gates — FastAPI, TimescaleDB, Redis, Ollama.
- **Intuit ICE Platform, Developer Frameworks & Infrastructure (2021–2024):** Delivered the ICE Self-Service platform (DevPortal, GitOps config, ICE Playground), reducing developer onboarding from 2–3 weeks to minutes in pre-prod and under 24 hours in production, while mitigating $1M+ in projected opex growth. Scaled throughput from 6K to 50K TPS via an RSocket migration supporting ~1.5M concurrent connections with sub-25 ms TP99 latency.
- **Intuit SDK Starter Kits:** Extended Java and Python SDK Starter Kits with scaffolding templates, build configurations (Gradle/Maven), testing frameworks, and CI/CD integration — empowering developers to go from zero to production-ready microservice in minutes.
- **Enterprise-Wide Service Language Assessment:** Assessed 9 languages (Java, Python, Kotlin, Go, TypeScript, Scala, PHP, C++, Groovy) using usage data and developer feedback to inform strategic language investment decisions presented to the CTO. This is the kind of cross-stack technical breadth required to reason about framework tradeoffs at NVIDIA.
- **Splunk Search Orchestration (2019–2021):** Owned Search Service (Go microservices), Search Catalog (PostgreSQL metadata service), and SPL/SPL2 — delivered Scheduler Service end-to-end in ~4 months and led query performance optimization achieving up to 10x improvements for beta customers.
- **NeurIPS 2014 / BRAIN Platform:** Published researcher on neural networks for protein structure prediction; 2026 PyTorch rewrite spans 5 architectures (feedforward, GRU, Transformer, ESM-2, multi-task), MLflow experiment tracking, Optuna HPO, and FastAPI serving across 413 to 8B parameters.
## Closing
NVIDIA's mission is to solve problems that matter at the frontier of computing. The AI Frameworks team sits at the center of that mission, enabling the researchers and engineers who are defining what large-scale recommender and generative models can become. I bring 12+ years of technical platform product leadership, hands-on RL and evaluation infrastructure I built myself, and a NeurIPS publication that traces back to the same curiosity driving this role. I would welcome the opportunity to discuss how I can contribute.
Respectfully,
**O. Felix Amoruwa**
famoruwa@berkeley.edu | 909-731-9011 | felixamoruwa.info