jobsearch v0.0.1

tailored_resume_v2 / art_PqfhmAyaeRA

role: cursor / Product Manager, Agent Harness
model: anthropic/claude-sonnet-4.6
created: 2026-05-20T01:50

What changed for cursor

change | why it matters
Projects section moved to lead position above Professional Experience | RL Workbench and aeval are the strongest proof points for the Agent Harness role: they directly demonstrate RL practitioner depth, evaluation framework design, and agent trace analysis; leading with them maximizes perceived fit immediately
Summary rewritten to lead with 'research-product boundary' framing and call out multi-agent orchestration, RL workbench, and evaluation harnesses explicitly | JD's first sentence defines the role as living at the research-product boundary; summary must signal that identity immediately
AutoEval reframed as 'Automated Visual Evaluation for Agent Model Training' and described as an 'agent evaluation harness' | JD's core responsibility is building evaluation harnesses; reframing AutoEval in that language (accurately) maximizes relevance signal
Streamio CEO title reframed to 'Founder & CEO — Multi-Agent Platform' and OpenClaw bullet moved to lead | OpenClaw subagent delegation and MCP SDK integration directly mirror Agent Harness primitives (tool access, subagent coordination, MCP); leading with it is the strongest proof point for this role
MCP SDK bullet elevated and reframed as 'defining agent extensibility primitives' | JD explicitly calls out MCP and plugin extensibility as a core responsibility; accurate reframe using JD's exact language
Intuit telemetry bullet reframed to explicitly connect to 'analyzing agent traces at scale and turning patterns into concrete improvements' | JD requires trace analysis discipline; Intuit's BigQuery/SQL usage data work is the closest enterprise-scale proof point
Splunk benchmarking bullet reframed with 'empirical, measurement-first approach to defining what good looks like' | JD emphasizes empirical results shaping roadmap; Splunk's 10x benchmark work demonstrates that discipline
IBM bullet reframed to connect root cause analysis to 'failure-mode diagnosis discipline central to agent trace analysis' | JD requires deep failure mode analysis; IBM's escalation RCA work is the earliest proof point of that skill
Kaiser condensed to 1 bullet focused on platform reliability and observability | Low relevance to Agent Harness role; kept for career continuity but minimized to preserve space for higher-signal content
Bank of America Merrill Lynch role omitted | Summer associate role with no relevance to agent frameworks, RL, or developer tools; omitting preserves space for high-signal content without violating anti-patterns (role was not a primary career role)
Fintellect fallback routing bullet reframed as 'failure recovery and error handling patterns essential to reliable agent execution' | JD explicitly requires agents to handle failures and retries; LLM fallback routing is an accurate analog
aeval bullet rewritten to explicitly call out 'task completion rate, hallucination frequency, and error recovery' metrics | These are the exact success metrics the JD names; accurate reframe using JD's language maximizes signal

JD analysis (20 key phrases)

Key phrases: agent harness, agent planning and execution framework, decompose tasks into subtasks, failure modes, evaluation frameworks, agent traces, multi-agent coordination, reinforcement learning, developer tools, guardrails, task completion rate, hallucination frequency, error recovery, MCP, subagent delegation, empirical results, autonomy with predictability, observe and steer, benchmarking systems, agent extensibility

Hard requirements:

Preferred qualifications:

Per-role mapping (10 roles scored)

role | score | reframe angle | JD phrases that map
Streamio AI — Founder & CEO | 4/5 | Lead with multi-agent orchestration and MCP primitives; directly mirrors Agent Harness responsibilities | subagent delegation, MCP, multi-agent coordination, agent extensibility, agent planning and execution framework
Fintellect AI — Founder & CEO | 3/5 | Frame as LLM agent orchestration with failure handling and fallback; maps to error recovery and agent reliability | failure modes, error recovery, agent frameworks, LLM applications
Intuit — Staff PM | 3/5 | Frame as developer-facing platform PM with deep telemetry/measurement discipline; maps to evaluation frameworks and developer trust | developer tools, empirical results, evaluation frameworks, observe and steer
Splunk — Senior PM | 2/5 | Frame as orchestration and performance benchmarking; maps to agent trace analysis and measurement | benchmarking systems, failure modes, empirical results
Kaiser Permanente — SOA Technical PM | 1/5 | Condense to 1 bullet on platform reliability and scale |
IBM — Software Engineer | 1/5 | 1 bullet; keep for technical credibility | failure modes
RL Workbench | 5/5 | Lead projects section; directly demonstrates RL practitioner depth and evaluation framework design | reinforcement learning, evaluation frameworks, benchmarking systems, empirical results, task completion rate
aeval — AI Model Evaluation Platform | 5/5 | Second project; directly maps to evaluation framework design and hallucination/failure measurement | evaluation frameworks, hallucination frequency, task completion rate, benchmarking systems, error recovery
AutoEval — Automated Visual Evaluation for Robot Model Training | 4/5 | Frame as automated agent evaluation harness; directly mirrors Agent Harness evaluation responsibilities | evaluation frameworks, agent traces, failure modes, benchmarking systems
BRAIN — Protein Structure Prediction ML Platform | 3/5 | Frame as deep ML research-to-production credibility; condense | reinforcement learning, empirical results, research-adjacent

Tailored summary

Technical PM at the research-product boundary with 12+ years building developer-facing platforms and AI agent frameworks, from shipping SDK tooling at 675M+ engagements (Intuit) to building multi-agent orchestration systems, RL post-training workbenches, and AI evaluation harnesses from scratch. Designed and implemented subagent delegation frameworks, MCP-integrated agent pipelines, and statistical evaluation systems measuring task completion, error recovery, and hallucination frequency. NeurIPS-published ML researcher; hands-on practitioner across PPO, GRPO, DPO, and 9 additional RL algorithms.