jobsearch v0.0.1


interviewer_questions / art_1dxlQBIlYWQ

role: coreweave / Staff Product Manager, Data Services
model: anthropic/claude-sonnet-4.6
created: 2026-05-21T05:24

Interviewer

The interviewer is a CoreWeave team member, likely on the Data Services or Platform PM team, but no specific LinkedIn profile data was provided beyond the company context. Based on CoreWeave's profile as a GPU-dense AI cloud provider that recently IPO'd at $23B+, the interviewer likely has a background in cloud infrastructure, data platforms, or AI/ML systems. The expected interview focus will center on technical depth in data services (databases, streaming, lakehouse), platform product strategy at scale, and the candidate's ability to operate in a fast-moving, high-growth infrastructure environment. Given CoreWeave's customer base (OpenAI, Microsoft, frontier AI labs), the interviewer will probe for experience with enterprise-grade, high-throughput data systems and AI workload-specific data patterns.

My profile through their lens

Felix presents as a rare hybrid: a technically credentialed PM (UC Berkeley CompE, NeurIPS published, hand-coded BPTT in C++) who has shipped platform infrastructure at genuine scale — 675M+ ICE engagements, 50K TPS, sub-25ms TP99 at Intuit. From CoreWeave's perspective, the most compelling signal is that Felix has been a firsthand customer of GPU-dense infrastructure through his RL post-training workbench (GRPO/DPO across TRL, VeRL, OpenRLHF, NeMo RL on Apple Silicon MPS and CUDA), meaning he understands the data pipeline demands of AI training workloads from the inside. His aeval platform (FastAPI, TimescaleDB, Redis, Ollama) and BRAIN protein structure prediction work demonstrate hands-on data platform construction. The gap the interviewer will probe is direct ownership of managed database products, streaming infrastructure (Kafka/CDC), or data governance/catalog systems — Felix's data platform experience is real but largely internal-tooling and ML-adjacent rather than externally-facing managed data services.

Questions they may ask (24)

category | question | why | how to prepare
resume_deep_dive Walk me through the ICE platform at Intuit — specifically the rSocket migration that took you from 6K to 50K TPS. What were the data persistence and streaming bottlenecks you hit, and how did you make the architectural tradeoffs? This is Felix's strongest infrastructure-at-scale signal directly relevant to CoreWeave's throughput requirements. The interviewer will want to test whether Felix owned the technical depth or was a passenger on an engineering-led initiative. Reconstruct the rSocket migration story with specifics: what the bottleneck was (connection overhead, serialization, backpressure), why rSocket over alternatives (gRPC, HTTP/2), and what your PM contribution was vs. engineering's. Have a crisp answer on the ~1.5M concurrent connections architecture.
resume_deep_dive Your aeval platform uses TimescaleDB, Redis, and FastAPI. Why TimescaleDB over a standard PostgreSQL or a columnar store like ClickHouse for evaluation metrics? What data modeling decisions did you make for storing time-series eval results? CoreWeave's Data Services role requires fluency in database selection tradeoffs. Felix built aeval himself, so this is a fair deep-dive. The interviewer wants to see if Felix can reason about database primitives, not just name-drop technologies. Prepare a 2-minute explanation of the TimescaleDB choice: time-series compression, continuous aggregates for bootstrap CI calculations, hypertable partitioning. Be ready to compare against alternatives and articulate what you'd change at 10x scale.
resume_deep_dive You built a RAG retrieval pipeline at Fintellect with ChromaDB and multi-provider LLM orchestration. How did you handle data freshness, embedding versioning, and retrieval quality degradation over time — and what would a production-grade version of that architecture look like on a managed cloud platform? Vector search and AI data serving pipelines are explicitly listed as preferred experience in the JD. This probes whether Felix's RAG work has depth beyond a prototype and whether he can extrapolate to managed service design. Think through the operational gaps in your Fintellect RAG system: embedding model versioning strategy, index rebuild triggers, recall vs. latency tradeoffs. Then articulate what a managed vector search service would need to offer (SLAs, auto-scaling, namespace isolation) to serve enterprise AI customers.
resume_deep_dive Tell me about the Asterias declarative asset lifecycle management platform you built at Intuit with a GraphQL API. What data model did it use, how did you handle schema evolution, and what were the hardest governance or metadata challenges? Metadata and catalog services are a core pillar of the Data Services JD. Asterias is the closest analog on Felix's resume to a data catalog/governance product. The interviewer will probe whether Felix has real depth here. Reconstruct the Asterias data model: what entities it tracked, how asset lineage was represented, how schema changes were versioned. Connect it explicitly to modern data catalog concepts (Apache Atlas, DataHub, OpenMetadata) and articulate what you'd build differently with those primitives.
technical_domain CoreWeave's customers run large-scale AI training jobs that generate massive checkpoint, log, and metric data streams. Design a data ingestion and storage architecture for a managed data service that needs to handle 100TB/day of training telemetry with sub-second query latency for live job monitoring. What storage layers, streaming components, and indexing strategies would you recommend? This is the core technical scenario for the role — GPU workload data at scale. Felix's RL workbench experience and ICE platform background make this a fair but stretching question. The interviewer wants to see if Felix can reason about lakehouse architecture, streaming ingestion, and tiered storage. Prepare a layered architecture: streaming ingest (Kafka/Kinesis), hot storage (ClickHouse or TimescaleDB for recent metrics), cold storage (Parquet on object store), and a query federation layer. Reference your RL workbench's SSE metric streaming as a starting point and extrapolate to multi-tenant managed service requirements.
technical_domain How would you think about CDC (Change Data Capture) as a primitive for CoreWeave's data services customers — specifically for AI teams that need to sync training metadata, model registries, or feature stores in real time across hybrid deployments? CDC and streaming pipelines are explicitly called out in the JD. Felix has not listed direct CDC experience, so this tests his ability to reason from first principles about a gap area. Study Debezium, AWS DMS, and Kafka Connect CDC patterns. Frame your answer around the customer use case: ML teams needing consistent views of experiment metadata across on-prem and cloud. Acknowledge the gap honestly but demonstrate you can reason about WAL-based replication, exactly-once semantics, and schema registry needs.
technical_domain What's your mental model for how a feature store fits into the data services portfolio at a GPU cloud provider? How would you differentiate CoreWeave's offering from Feast, Tecton, or Vertex AI Feature Store? AI/ML data products including feature stores are listed as preferred experience. Felix's RL workbench and aeval work show ML pipeline familiarity, but feature store product strategy is a specific domain the interviewer will probe. Study Feast vs. Tecton vs. managed offerings on the three axes: online/offline consistency, point-in-time correctness, and serving latency. Frame CoreWeave's differentiation around GPU-collocated serving (low-latency feature retrieval for inference workloads) and tight integration with training job orchestration.
technical_domain Walk me through how you'd design the SLA framework for a managed streaming service (think Kafka-as-a-service) on CoreWeave. What metrics would you commit to, how would you handle multi-tenant isolation, and what would your tiered pricing model look like? API-first platform thinking, SLA design, and SaaS/PaaS pricing are all called out in the JD. Felix's ICE platform work (sub-25ms TP99, 50K TPS) gives him credibility here, but he'll need to extend to externally-facing managed service design. Define SLA commitments along three dimensions: throughput (MB/s per partition), latency (P99 produce/consume), and durability (replication factor, retention). Then map levels of each dimension to pricing tiers. Reference your ICE TP99 work as evidence you've thought about latency SLAs before, then extend to multi-tenant isolation patterns (dedicated brokers vs. shared with namespace quotas).
gap_transition The JD asks for 3-4 years focused specifically on managed database or streaming services — products that external customers pay for and depend on. Your data platform experience at Intuit was largely internal developer tooling. How do you think about that distinction, and where have you built the most relevant muscle? This is the most significant gap on Felix's resume relative to the JD requirements. The interviewer will probe it directly. Felix needs a credible bridge narrative. Acknowledge the distinction honestly: internal platform vs. external managed service. Then bridge through: (1) ICE had 675M+ engagements across QuickBooks/TurboTax/Mailchimp — these are effectively external product teams as customers; (2) your Fintellect and Streamio work involved real external customers and payments; (3) your RL workbench was built as a consumer of GPU cloud services, giving you the customer perspective. Emphasize the transferable primitives: SLA design, developer experience, API contracts.
gap_transition CoreWeave is a post-IPO, hyper-growth infrastructure company. Your most recent Staff PM role was at Intuit, a large enterprise, and since then you've been running early-stage startups. How do you think about operating at the Staff level in a fast-scaling infrastructure company that's neither a big enterprise nor a 2-person startup? The 9-month gap between Intuit (Sept 2024) and the current role, combined with the founder/CEO experience at two early-stage companies, creates a legitimate question about operating rhythm fit at a scaling public company. Frame your Intuit Staff PM experience as the anchor: you've operated at scale with cross-functional alignment, CTO-level presentations, and multi-team dependencies. Your founder experience adds speed and ownership mentality. Acknowledge that CoreWeave's pace is closer to your startup mode than Intuit's enterprise cadence — and that's the energy you want to bring back.
gap_transition You have deep experience with observability and telemetry data (Splunk, ICE metrics, BigQuery at Intuit), but the JD emphasizes transactional and analytical storage, data lakes, and governance. How have you developed your thinking on the governance and catalog side specifically? Data governance, catalog, and compliance are called out in the JD as core responsibilities. Felix's resume shows telemetry/observability strength but limited explicit governance product work. Reference Asterias (declarative asset lifecycle management) as your closest governance analog. Then demonstrate current knowledge: study Apache Atlas, DataHub, Unity Catalog, and AWS Glue Data Catalog. Frame your ICE Drift Detection program (scanning Git repos for config drift) as a governance mindset applied to infrastructure.
behavioral_situational Tell me about a time you had to make a major architectural bet on a platform product with incomplete information — and it turned out you were wrong. What did you do? CoreWeave's core values include 'Be Curious at Your Core' and comfort with ambiguity. The JD explicitly says 'you don't need perfect information to move forward.' The interviewer wants to see intellectual honesty and recovery speed. Use the Splunk Scheduler Service or ICE rSocket migration as your story. Be specific about what signal you were missing, what the wrong bet cost (time, resources, customer trust), and what you changed. Avoid stories where everything worked out — the interviewer wants to see how you handle being wrong.
behavioral_situational Describe a situation where you had to influence a major technical architecture decision without direct authority — specifically where engineering wanted to go one direction and you believed the product/customer data pointed another way. The JD calls out 'influence without authority' explicitly. Felix's Intuit experience (CTO-level language assessment, MSaaS Drift Detection) suggests he's done this, but the interviewer wants a specific story. Use the Service Language Assessment (9 languages, CTO presentation) or the Mailchimp GCP-to-AWS migration as your story. Be specific about the data you used to make your case, who you had to convince, and what the outcome was. Emphasize the analytical rigor (SQL, BigQuery usage data) that gave you credibility.
behavioral_situational Give me an example of how you've gone deep with a strategic customer on their data architecture — not just gathering requirements, but co-designing a solution. What did you learn that changed your product roadmap? Customer and partner engagement with deep technical co-design is a named responsibility in the JD. Felix's customer discovery work at Fintellect and Streamio is relevant, but the interviewer wants an enterprise-scale example. Use the Splunk beta customer (Assurance) story — you built a mirrored Enterprise topology for benchmark testing and achieved 10x performance improvements. Frame it as co-design: you went into their environment, understood their topology, and built a solution together. Connect to what that taught you about query performance requirements that changed your SPL2 roadmap.
behavioral_situational Tell me about a time you had to balance a long-term platform vision against urgent near-term customer or business pressure. How did you make the call and what was the tradeoff? The JD explicitly calls out 'balancing near-term delivery with long-term bets.' CoreWeave is in hyper-growth mode with major enterprise contracts — this tension is real and daily. Use the ICE Self-Service platform story: you had to deliver developer onboarding reduction (near-term: $1M+ opex mitigation) while also building the longer-term Asterias asset lifecycle platform. Be specific about how you sequenced the work and what you deferred.
role_specific_scenario CoreWeave's customers include frontier AI labs running multi-thousand GPU training runs. What does the data services portfolio need to look like in 3 years to be the default data layer for AI training and inference workloads — and what would you build first? This is the core strategic question for the role. The interviewer wants to see if Felix has a coherent multi-year vision for AI-native data services, not just a list of features. Structure your answer around three layers: (1) training data pipeline (high-throughput object storage + streaming ingest for checkpoint/log data), (2) experiment and model metadata catalog (lineage, versioning, governance), (3) inference-time data serving (vector search, feature stores, low-latency KV). Anchor 'build first' on the highest-pain customer problem — likely checkpoint storage and experiment tracking given CoreWeave's training-heavy customer base.
role_specific_scenario How would you approach pricing and packaging for a managed PostgreSQL offering on CoreWeave, given that you're competing with RDS Aurora, Cloud SQL, and Neon — and your differentiation is GPU-collocated compute, not database features? SaaS/PaaS pricing and packaging is listed as preferred experience. CoreWeave's differentiation is infrastructure, not database software, which creates a genuine pricing strategy challenge. Frame the pricing strategy around GPU-collocated value: customers pay a premium for data locality (eliminating cross-datacenter egress latency for training jobs that need to read from the database). Consider usage-based pricing anchored on storage + IOPS + egress, with a premium tier for reserved capacity co-located with GPU clusters. Reference your Intuit tiered ICE pricing experience.
role_specific_scenario A large AI lab customer tells you their biggest pain point is that their training jobs spend 30% of GPU time waiting on data I/O — their data pipeline can't feed the GPUs fast enough. Walk me through how you'd diagnose the problem and what data service primitives you'd prioritize building to solve it. This is a real CoreWeave customer problem (GPU utilization bottlenecked by data I/O) and tests Felix's ability to connect data services product decisions to GPU workload performance — his unique differentiator as a candidate. Use your RL workbench experience as the entry point: you've personally experienced data loading bottlenecks during GRPO/DPO training. Diagnose across three layers: storage throughput (object store IOPS, NVMe caching), data format (Parquet vs. raw, sharding strategy), and pipeline parallelism (prefetch workers, async data loading). Then map to product primitives: high-throughput object storage with NVMe caching tier, streaming data loader service, and a data proximity placement API.
motivation_fit You've been running two AI startups for the past 9 months. What specifically about CoreWeave's Data Services team makes you want to step back into a Staff PM role at a company, and why now? The interviewer will want to understand the motivation transition from founder to Staff PM, and whether Felix is genuinely excited about CoreWeave's mission or just looking for stability post-startup. Be honest and specific: CoreWeave is building the infrastructure layer that your RL workbench and aeval platform ran on — you've been a customer. Frame it as wanting to operate at the infrastructure layer where the leverage is highest, rather than building applications on top of it. Avoid framing it as 'startup didn't work out' — frame it as a deliberate choice to have more impact at the platform layer.
motivation_fit CoreWeave's core value is 'Act Like an Owner.' Given that you've literally been an owner as a founder, how do you translate that into a Staff PM role where you don't control the company's direction? The founder-to-Staff-PM transition is a real cultural fit question. CoreWeave wants ownership mentality but also someone who can operate within a larger organization. Frame ownership at the product level: you own the data services roadmap, the customer relationships, and the outcomes — not the company. Use your Intuit experience as evidence that you've operated with ownership mentality inside a large organization (CTO-level initiatives, $1M+ opex decisions). The founder experience amplifies this, not contradicts it.
unique_to_this_interviewer CoreWeave recently IPO'd and is in hyper-growth mode with major contracts from OpenAI and Microsoft. How do you think about building a data services roadmap when your largest customers can effectively dictate your near-term priorities — and how do you protect the long-term platform vision? This question is anchored in CoreWeave's specific current moment: post-IPO, with concentrated enterprise customer relationships that create real roadmap tension. Any CoreWeave interviewer will be living this tension daily. Frame your answer around a tiered customer influence model: strategic customers (OpenAI, Microsoft) get co-design input on architecture but not unilateral roadmap control; their requests get evaluated against platform generalizability. Reference your Splunk experience managing Fortune 500 customer requirements against platform roadmap using RICE prioritization.
unique_to_this_interviewer Given that CoreWeave's differentiation is GPU-dense infrastructure, how do you think about which data services to build natively vs. partner with or resell from existing vendors like Snowflake, Databricks, or Confluent? This is a live strategic question for CoreWeave's Data Services team — the build vs. partner decision is real and the interviewer likely has strong opinions. It tests Felix's ability to reason about platform strategy in CoreWeave's specific competitive context. Use a framework: build natively where GPU-colocation creates differentiated performance (high-throughput object storage, NVMe-cached hot data, low-latency vector search for inference); partner where the differentiation is in the software layer (analytical SQL, data transformation, BI). Reference your Intuit experience evaluating build vs. buy for the Service Language Assessment and MSaaS platform decisions.
product_prioritization You're inheriting the Data Services roadmap at CoreWeave. You have requests for: (1) a managed PostgreSQL service, (2) a Kafka-compatible streaming service, (3) a data catalog and lineage product, and (4) a vector search service optimized for LLM inference. You can staff two of these in the next two quarters. How do you decide? The JD lists all four of these as portfolio areas. The interviewer wants to see Felix's prioritization framework applied to CoreWeave's specific context — GPU workloads, AI lab customers, and infrastructure differentiation. Apply a framework that weights: (1) customer pain severity (which bottleneck most directly costs GPU utilization?), (2) CoreWeave's unique ability to win (where does GPU-colocation create differentiated value?), (3) build vs. partner feasibility. Vector search for inference and streaming for training data pipelines likely win on all three axes. Be ready to defend your stack-rank with specific customer evidence.
product_metrics What would you set as the north star metric for CoreWeave's Data Services portfolio, and what leading indicators would you track to know if you're on track before revenue shows up? Metrics definition is a core PM competency and the JD emphasizes 'clarity from ambiguity.' CoreWeave's data services are infrastructure — the right metrics are non-obvious and the interviewer will probe for sophistication. Propose a north star of 'data bytes processed per GPU-hour' (ties data services directly to CoreWeave's core value unit). Leading indicators: data service adoption rate among active GPU customers, P99 data I/O latency for training jobs, and percentage of GPU idle time attributable to data pipeline bottlenecks. Reference your ICE engagement metrics (675M+ engagements, 275% YoY growth) as evidence you've built metrics frameworks for platform products.
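
The north star and the idle-time leading indicator proposed in the product_metrics row can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a hypothetical per-job telemetry record; the `JobTelemetry` type, its field names, and the sample numbers are illustrative assumptions, not CoreWeave data or an existing API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class JobTelemetry:
    """Hypothetical per-job record; every field name here is an assumption."""
    gpu_hours: float        # total GPU-hours consumed by the job
    bytes_processed: float  # bytes moved through data services for the job
    gpu_idle_hours: float   # GPU-hours the job spent idle, for any reason
    io_wait_hours: float    # portion of idle GPU-hours spent waiting on data I/O


def bytes_per_gpu_hour(jobs: Iterable[JobTelemetry]) -> float:
    """North star candidate: data bytes processed per GPU-hour across jobs."""
    jobs = list(jobs)
    total_hours = sum(j.gpu_hours for j in jobs)
    if total_hours == 0:
        return 0.0
    return sum(j.bytes_processed for j in jobs) / total_hours


def io_share_of_idle_time(jobs: Iterable[JobTelemetry]) -> float:
    """Leading indicator: fraction of GPU idle time attributable to data I/O."""
    jobs = list(jobs)
    total_idle = sum(j.gpu_idle_hours for j in jobs)
    if total_idle == 0:
        return 0.0
    return sum(j.io_wait_hours for j in jobs) / total_idle


# Made-up sample values, purely for illustration.
sample_jobs = [
    JobTelemetry(gpu_hours=100, bytes_processed=5e12, gpu_idle_hours=20, io_wait_hours=15),
    JobTelemetry(gpu_hours=300, bytes_processed=9e12, gpu_idle_hours=30, io_wait_hours=10),
]

if __name__ == "__main__":
    print(f"bytes per GPU-hour: {bytes_per_gpu_hour(sample_jobs):.3e}")
    print(f"I/O share of idle time: {io_share_of_idle_time(sample_jobs):.0%}")
```

Aggregating totals before dividing (rather than averaging per-job ratios) weights large training runs more heavily, which seems closer to how GPU capacity is actually consumed; a per-customer breakdown would be a natural extension.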

Preparation priorities

  1. BRIDGE THE MANAGED DATA SERVICES GAP: The JD requires 3-4 years on external managed database/streaming products. Felix's experience is real but internal-tooling adjacent. Prepare a crisp narrative that bridges ICE (675M engagements, external product teams as customers) + Fintellect RAG + aeval TimescaleDB to demonstrate managed service product thinking. Study RDS Aurora, Confluent Cloud, and Databricks pricing/SLA models to speak fluently about externally-facing data services.
  2. GPU-DATA INTERSECTION STORY: Felix's RL workbench is his unique differentiator: he's been a firsthand consumer of GPU-dense infrastructure running data-hungry training workloads. Prepare 2-3 specific stories about data I/O bottlenecks, checkpoint storage decisions, and metric streaming architecture from the workbench. This is the angle no other candidate has.
  3. DATA SERVICES PORTFOLIO DEPTH: Study the four pillars in the JD: (a) relational/distributed databases (PostgreSQL, SingleStore, Snowflake), (b) streaming/pipelines (Kafka, CDC, lakehouse), (c) data governance (Unity Catalog, DataHub, IAM/RBAC), (d) AI/ML data products (vector search, feature stores). Felix is strong on (d) and has surface-level exposure to (a)-(c). Prioritize closing gaps on CDC and data catalog.
  4. COREWEAVE-SPECIFIC STRATEGY: Prepare a 3-year data services vision anchored in CoreWeave's specific context: GPU-colocated data locality as differentiation, AI lab customer use cases (training checkpoints, experiment metadata, inference serving), and the build vs. partner framework for competing with AWS/GCP managed services. The 'build vs. partner' question is live at CoreWeave and will come up.
  5. PRIORITIZATION FRAMEWORK WITH COREWEAVE CONTEXT: Prepare a crisp prioritization framework (RICE or weighted scoring) applied specifically to the four data service areas in the JD, with CoreWeave's customer base (OpenAI, Microsoft, frontier AI labs) as the demand signal. Be ready to stack-rank and defend with specific customer evidence from your own GPU workload experience.
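
As a rehearsal aid for priority 5, the weighted-scoring option can be sketched in a few lines. This is a minimal sketch under stated assumptions: the three criteria come from the prioritization question above (customer pain, unique ability to win, build vs. partner feasibility), but the weights and the 1-5 scores are placeholder values invented for illustration and should be replaced with real customer evidence.

```python
# Criteria weights are assumptions; they must sum to 1.0.
WEIGHTS = {
    "customer_pain": 0.4,          # how directly the gap costs GPU utilization
    "unique_ability_to_win": 0.4,  # how much GPU colocation differentiates
    "build_feasibility": 0.2,      # how practical it is to build vs. partner
}

# Placeholder 1-5 scores for the four candidate areas (illustrative only).
CANDIDATES = {
    "managed PostgreSQL": {
        "customer_pain": 3, "unique_ability_to_win": 2, "build_feasibility": 4,
    },
    "Kafka-compatible streaming": {
        "customer_pain": 4, "unique_ability_to_win": 4, "build_feasibility": 3,
    },
    "data catalog and lineage": {
        "customer_pain": 3, "unique_ability_to_win": 2, "build_feasibility": 3,
    },
    "vector search for inference": {
        "customer_pain": 5, "unique_ability_to_win": 5, "build_feasibility": 3,
    },
}


def weighted_score(scores: dict, weights: dict = WEIGHTS) -> float:
    """Sum of weight * score over every criterion in `weights`."""
    return sum(weights[name] * scores[name] for name in weights)


def stack_rank(candidates: dict, weights: dict = WEIGHTS) -> list:
    """Candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    for name in stack_rank(CANDIDATES):
        print(f"{weighted_score(CANDIDATES[name]):.1f}  {name}")
```

With these placeholder inputs the top two slots go to vector search and streaming, matching the hypothesis in the question table, but the ranking is only as good as the scores. Varying the weights and checking whether the top two change is a quick way to test how defensible the stack-rank is.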

⚠ Watch-outs