cresta / Forward Deployed Product Manager, AI Agent
brief / art_LSjaXdHzVto
role
model: anthropic/claude-sonnet-4.6
created: 2026-05-24T21:27
Company snapshot
Cresta is a Stanford AI Lab spinout that sells a unified AI platform for contact centers, combining autonomous conversational AI agents, real-time agent assist, and conversation intelligence. The company has raised over $270M from a16z, Greylock, and Sequoia, and counts United Airlines, Cox Communications, and Marriott among its enterprise customers. CEO Ping Wu previously founded and led Google's Contact Center AI and Vertex AI platforms, which gives the company credibility in both applied AI and enterprise go-to-market. Based on the JD and public signals, Cresta appears to be in a growth phase, expanding its autonomous AI Agent product line beyond real-time assist into full end-to-end automation; that strategic shift likely drives this Forward Deployed PM hire. Internal milestones and product launches beyond what the JD states publicly are not known with certainty.
Team stack
Based on the JD and public signals:
- Conversational AI agents built on LLMs (likely fine-tuned or RAG-augmented models, given the contact center domain)
- Real-time inference infrastructure requiring low-latency serving (likely)
- Internal agent-building tooling and workflow configuration UIs used by Forward Deployed Engineers (FDEs)
- Integrations with enterprise telephony and CRM systems (Salesforce, Genesys, Five9; likely, based on the contact center vertical)
- Python-based backend services (likely)
- Possibly proprietary conversation design / dialogue management tooling
The JD references 'Cresta's internal tools and the latest technologies,' suggesting a mix of proprietary platform tooling and third-party LLM APIs. Specific internal frameworks are not publicly confirmed.
Likely questions (10)
| area | question | why |
|---|---|---|
| behavioral | Tell me about a time you owned a complex enterprise deployment end-to-end — from pre-sale scoping through post-go-live optimization. What did you own, what broke, and how did you recover? | The JD explicitly calls out owning 'the deployment lifecycle for AI Agents — from pre-sale scoping and kickoff to launch and post-go-live optimization.' They want evidence of full-cycle ownership, not just handoffs. |
| domain | Walk me through how you would design an AI Agent for a large airline's customer support contact center — what use cases would you prioritize first, how would you define success metrics, and what failure modes would you plan for? | United Airlines is a named Cresta customer; the JD asks candidates to 'find automation opportunities' and 'define and deliver on the metrics that move the needle.' This tests domain instinct for the contact center vertical. |
| system_design | You need to deploy a conversational AI agent that handles flight rebooking for a major airline. The agent must integrate with a legacy CRM, handle escalation to human agents, and maintain context across channels. How do you architect this? | The JD requires 'systems thinking and exposure to building Agents' and working with FDEs on integrations. This tests whether the candidate can think through agentic architecture in a real enterprise constraint environment. |
| coding | Given a transcript of a customer support conversation, write a function (in Python) that extracts intent, entities, and a recommended next action — and explain how you'd use this in an evaluation loop for an AI agent. | The JD values 'hands-on work style' and 'strong technical understanding.' Forward Deployed PMs at AI-first companies are expected to prototype and validate alongside FDEs, not just write PRDs. |
| domain | How would you evaluate whether an AI Agent is ready to go live versus needs more iteration? What metrics, test sets, or qualitative signals would you use, and how would you communicate readiness to a skeptical enterprise stakeholder? | The JD calls out 'design, build, test and iterate on AI agents' and 'delivering the highest customer ROI.' This probes evaluation rigor and the ability to translate technical quality signals into executive-facing narratives. |
| behavioral | Describe a situation where you had to facilitate a workshop with senior customer executives to align on requirements or success criteria for a product that didn't fully exist yet. How did you manage ambiguity and drive alignment? | The JD explicitly lists 'facilitate workshops and design sessions with customer stakeholders to align on agent use cases, workflows, and success criteria' as a core responsibility. |
| system_design | How would you design a multi-agent orchestration system where a primary agent delegates subtasks to specialized subagents (e.g., billing lookup, policy retrieval, escalation routing) in a contact center context? What are the key failure modes? | The JD references building AI Agents using 'the latest technologies'; multi-agent orchestration is central to modern agentic contact center platforms and directly maps to the candidate's OpenClaw work. |
| culture | Cresta is a fast-moving startup where the product is still being defined partly by what customers need in the field. How do you balance being responsive to individual customer requests versus protecting the integrity of the platform roadmap? | The JD asks for someone who provides 'a tight feedback loop' to product/engineering while also owning customer outcomes. This tension is real at forward-deployed roles in growth-stage AI companies. |
| behavioral | Tell me about a developer platform or SDK you shipped that meaningfully reduced time-to-value for the people using it. What did you measure, and what would you do differently? | Cresta's FDE model depends on tooling and playbooks that accelerate deployment. The JD mentions 'improve our playbooks for agent development and customer deployment,' and the candidate's Intuit ICE/DevPortal work is directly relevant here. |
| domain | How do you think about reward design and evaluation for a conversational AI agent in a customer support context — what signals indicate the agent is improving, and how do you avoid reward hacking or metric gaming? | The JD requires deep AI agent expertise; Cresta's platform involves ongoing optimization post-launch. This tests whether the candidate understands RLHF/evaluation concepts at a level beyond buzzwords, which the candidate's RL Workbench and aeval work directly supports. |
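For the coding question in the table above, one possible starting point is a rule-based baseline. The sketch below is illustrative only: the intent names, keyword lists, regex patterns, and action mapping are all hypothetical, and a production version would more plausibly use a trained classifier or an LLM call. What matters is the shape of the answer (a typed result plus a scoring function), not the matching logic.

```python
import re
from dataclasses import dataclass, field

# Hypothetical intent keywords, chosen for illustration only.
INTENT_KEYWORDS = {
    "rebook_flight": ("rebook", "reschedule", "change my flight"),
    "refund_request": ("refund", "money back"),
    "baggage_issue": ("baggage", "luggage", "suitcase"),
}

# Hypothetical mapping from intent to a recommended next action.
NEXT_ACTION = {
    "rebook_flight": "offer_alternative_flights",
    "refund_request": "check_refund_eligibility",
    "baggage_issue": "open_baggage_claim",
    "unknown": "escalate_to_human",
}

# Assumed formats: two uppercase letters plus 1-4 digits for a flight number,
# six uppercase alphanumerics for a confirmation code.
FLIGHT_RE = re.compile(r"\b([A-Z]{2}\d{1,4})\b")
CONFIRMATION_RE = re.compile(
    r"(?i:\bconfirmation (?:code|number)(?: is)?) ([A-Z0-9]{6})\b"
)


@dataclass
class Extraction:
    intent: str
    entities: dict = field(default_factory=dict)
    next_action: str = "escalate_to_human"


def extract(transcript: str) -> Extraction:
    """Return intent, entities, and a recommended next action for a transcript."""
    text = transcript.lower()
    intent = "unknown"
    for name, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            intent = name  # first matching intent wins
            break

    entities = {}
    flights = FLIGHT_RE.findall(transcript)
    if flights:
        entities["flight_numbers"] = flights
    match = CONFIRMATION_RE.search(transcript)
    if match:
        entities["confirmation_code"] = match.group(1)

    return Extraction(intent, entities, NEXT_ACTION[intent])


def intent_accuracy(labeled: list) -> float:
    """Fraction of (transcript, expected_intent) pairs classified correctly."""
    if not labeled:
        return 0.0
    hits = sum(extract(t).intent == expected for t, expected in labeled)
    return hits / len(labeled)
```

In an evaluation loop, `intent_accuracy` would run over a held-out set of labeled transcripts after every prompt, model, or rule change, so that a drop in accuracy blocks the change before it reaches a customer deployment. The same harness can score any replacement for `extract` as long as it returns the same `Extraction` shape.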
Talking points
- I built OpenClaw, a production multi-agent orchestration framework with a gateway protocol, subagent delegation, profile management, and session switching — deployed across real estate, insurance, and financial markets verticals. This maps directly to the agentic architecture Cresta is deploying in contact centers, and I can speak to the failure modes (context bleed, delegation loops, latency under concurrent sessions) from first-hand experience, not just theory.
- At Intuit, I owned the ICE Self-Service platform end-to-end — from DevPortal and GitOps config through production onboarding — reducing developer time-to-production from 2–3 weeks to under 24 hours and scaling to 675M+ engagements in FY23. That's the same deployment lifecycle ownership the Forward Deployed PM role requires, just applied to enterprise AI agents instead of internal developer infrastructure.
- I built aeval, a local-first AI model evaluation platform with statistical rigor (bootstrap CIs, Welch's t-test, Cohen's d), adversarial safety testing, and CI/CD regression gates — and an RL Workbench that benchmarks GRPO, DPO, PPO, and 9 other algorithms across TRL, VeRL, OpenRLHF, and NeMo RL with live metric streaming. When Cresta asks how I'd evaluate agent readiness for go-live, I have a concrete, implemented answer.
- I've shipped production software across the full stack — Electron + React + TypeScript desktop apps, React Native mobile, FastAPI backends, Redis job queues, FFmpeg streaming pipelines, and Stripe/OAuth integrations — while simultaneously leading customer discovery and go-to-market. The 'hands-on work style with a bias for action' the JD describes is how I've operated as a founder; I don't wait for an FDE to prototype something I can validate myself.
- I've taught enterprise cloud computing, data analytics, and Java programming at De Anza College for 7+ years, which means I can translate complex AI agent concepts to non-technical senior stakeholders — a core skill for the workshop facilitation and executive relationship management this role demands. I've also presented at DeveloperWeek 2022 and Splunk .conf18/19, so I'm comfortable owning the room with enterprise audiences.
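To back up the evaluation talking point above in an interview, it helps to be able to write the three named statistics from scratch. The following is a standard-library sketch, not aeval's actual implementation; the function names and defaults are assumptions made for illustration. It computes a percentile bootstrap confidence interval for a mean score, the Welch t statistic with Welch-Satterthwaite degrees of freedom, and Cohen's d with a pooled standard deviation. It stops short of a p-value, which would need a t-distribution CDF from a library such as SciPy.

```python
import math
import random
import statistics


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(scores)
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    n_a, n_b = len(a), len(b)
    se_sq = var_a / n_a + var_b / n_b  # squared standard error of the difference
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se_sq)
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


def cohens_d(a, b):
    """Cohen's d effect size using the pooled sample standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var)
```

A go-live readiness check could, for example, compare per-conversation quality scores from a candidate agent against the current baseline, and require both a confidence interval that clears an agreed threshold and an effect size large enough to matter to the customer. The thresholds themselves are a stakeholder decision, not something this sketch prescribes.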