created
2026-05-22T21:39
Cover letter
Dear Elastic Hiring Team,
Elastic sits at a rare intersection: the precision of structured search and the intelligence of large language models, serving more than half the Fortune 500 with infrastructure that underpins how enterprises find answers at scale. That combination — retrieval rigor meeting agentic reasoning — is exactly the problem space I have spent the last several years building toward, from architecting RAG pipelines and multi-agent orchestration frameworks to benchmarking RL post-training algorithms across production frameworks. When I read the Agent Builder mandate, I recognized it immediately as the product I would want to build.
**Technical Foundation**
My AI and ML work is hands-on and end-to-end. In 2025–2026 I built aeval, a local-first model evaluation platform covering factuality, reasoning, instruction-following, safety, and code generation — with adversarial refusal detection, bootstrap confidence intervals, Welch's t-test, Cohen's d effect size, and automated safety gates wired into CI/CD. The stack (FastAPI orchestrator, TimescaleDB, Redis job queue, Next.js dashboard, Ollama) was designed to give teams the statistical rigor needed to make defensible decisions about model quality — exactly the kind of benchmarking and evaluation infrastructure the Agent Builder role calls for.
On the retrieval side, I architected the RAG pipeline for Fintellect AI: ChromaDB vector store, multi-provider LLM orchestration across Claude, GPT-4, and Gemini with fallback routing, structured output validation, and token budget optimization. I understand the full context engineering stack — chunking strategy, embedding selection, retrieval scoring, re-ranking, and the tradeoffs that determine whether an agent's context window is signal or noise.
For multi-agent orchestration, I designed and implemented OpenClaw — a gateway protocol with subagent delegation, profile management, and session switching — enabling coordinated agent workflows across distinct industry verticals inside StreamIO. Building that system taught me where agent context breaks down in practice: context window exhaustion, delegation ambiguity, and the absence of observability into what context an agent actually used to reach a decision. Those are precisely the gaps Elastic's Agent Builder is positioned to close.
My RL post-training workbench extends that evaluation work. It is a three-phase platform comprising a Reward Lab (A/B testing reward functions across GSM8K, MATH, HumanEval, and UltraFeedback), a TRL-powered training Playground with live SSE metric streaming, and an Arena for head-to-head framework benchmarking across TRL, VeRL, OpenRLHF, and NeMo RL, covering 12 algorithms in total with standardized throughput, memory, and convergence benchmarking. This work, along with my NeurIPS 2014 publication on neural networks for protein structure prediction and my original hand-coded BPTT implementation in C++ at UC Berkeley, reflects a research orientation that complements the product execution side.
**Why This Role**
My arc, from platform infrastructure PM at Intuit to AI founder to ML researcher, has consistently pointed toward the same problem: giving developers and enterprises the context layer they need to build reliable, scalable AI systems. The Agent Builder is that context layer at Elastic's scale. Defining how enterprises construct, refine, and benchmark agent context on top of Elastic's retrieval and relevance infrastructure is work I am well positioned to lead.
What specifically excites me about this role is the benchmarking and evaluation mandate. Working directly with data science and engineering to build evaluation infrastructure for agent capabilities is not a side project for me — it is the work I have been doing in aeval and the RL Workbench. I also see the ecosystem partnership dimension — Google, Amazon, Microsoft, and community developers — as a natural extension of the cross-functional alignment work I did at Intuit, where I managed relationships with hyperscaler partners during the Mailchimp GCP-to-AWS migration and coordinated across 9 engineering language ecosystems for a CTO-level strategic assessment.
**Selected Relevant Experience**
- **Fintellect AI — RAG Pipeline Architecture:** Built ChromaDB-backed retrieval pipeline with multi-provider LLM orchestration (Claude, GPT-4, Gemini), fallback routing, structured output validation, and token budget optimization — directly applicable to Elastic's context engineering capabilities.
- **OpenClaw Multi-Agent Orchestration:** Designed gateway protocol with subagent delegation, profile management, and session switching for coordinated AI agent workflows — hands-on experience with the architectural patterns Agent Builder is designed to support.
- **aeval — Model Evaluation Platform:** Built evaluation platform with 5 eval types, adversarial safety testing, statistical rigor (bootstrap CI, Welch's t-test, Cohen's d), and CI/CD regression detection — directly relevant to the agent benchmarking and evaluation strategy this role requires.
- **RL Workbench:** Benchmarked 12 RL algorithms across TRL, VeRL, OpenRLHF, and NeMo RL with standardized throughput/memory/convergence metrics — demonstrates the technical depth needed to work credibly with data science and engineering teams on agent evaluation.
- **Intuit — ICE Self-Service Platform:** Delivered developer platform that reduced onboarding from 2–3 weeks to minutes, scaled to 675M+ engagements in FY23, and grew throughput from 6K to 50K TPS — track record of shipping developer-facing infrastructure at enterprise scale.
- **Intuit — Java and Python SDK Starter Kits:** Extended SDK scaffolding with build configurations, testing frameworks, and CI/CD integration, enabling developers to reach production-ready microservices in minutes — experience directly applicable to Agent Builder's developer-facing tooling.
- **Splunk — Search Orchestration PM:** Owned Search Service (Go microservices), Search Catalog (PostgreSQL metadata), and SPL/SPL2; delivered Scheduler Service end-to-end in four months and achieved up to 10x query performance improvements — foundational experience in search infrastructure that informs how retrieval and relevance underpin agent context.
**Closing**
Elastic's mission (enabling everyone to find answers in real time, using all their data, at scale) is not abstract to me. It describes the infrastructure problem I have been building against from multiple angles: as a platform PM scaling developer tools to hundreds of millions of engagements, as a founder building RAG pipelines and multi-agent systems, and as a researcher who has published on neural architectures and built model evaluation tooling. The Agent Builder is the product that makes agentic AI reliable at enterprise scale, and I would bring both the technical depth and the platform PM experience to define and ship it.
I would welcome the opportunity to discuss in more detail how my background maps to Elastic's roadmap.
Sincerely,
**O. Felix Amoruwa**
famoruwa@berkeley.edu | 909-731-9011 | felixamoruwa.info