The 2026 Netflix Workshop on Personalization, Recommendation and Search (PRS) brings together practitioners and researchers working in these domains to share ideas, information, and approaches and to build bridges between their communities. This year marks our 10th event! 🎉
Please register in advance using the RSVP button above. Registration will close when we reach capacity (as we have in prior years) or on Friday, May 29th, whichever comes first. So if you're interested, don't delay.
The event will be in-person only, at the historic Fox Theatre in Redwood City.
This @NetflixResearch workshop is organized by:
Claire Dorman
David Fagnan
Fernando Amat Gil
Nathan Kallus
Linas Baltrunas
Matteo Rinaldi
Colleen Chan
Gary Tang
Hailey Boren
Jeff Vadukumcherry
Liza Argent
For questions, contact prs-organizers@netflix.com
Previous PRS workshops: 2025, 2024, 2023, 2022, 2021, 2019, 2018, 2017, 2016.
Craig Saldanha is the Chief Product Officer at Yelp, where he leads global product and design teams to transform the consumer experience, local business tools, and monetization through cutting-edge AI. Previously, he directed product and engineering for Amazon Prime Video International and held multiple leadership roles across Prime Video and Kindle, shaping global streaming and digital content strategies. Craig also serves as an Executive in Residence and Lecturer at Carnegie Mellon University’s Tepper School of Business, teaching product management to MBA students and executives.
Becks Wood is the Senior Director for Core Discovery @ Netflix. Core Discovery includes the TV, mobile, web and personalization consumer experience. Before Netflix, Becks was a product lead on Google Search, leading consumer search for verticals such as TV/movies, recipes, sports and more. At Google, she worked specifically on the genAI search integrations as well as vertical features such as leading the Olympics launch. Becks loves the intersection of entertainment and technology. She's an avid movie & concert goer, loves a dance party, and spends time riding bikes around San Francisco. Becks holds a BA from Princeton University in economics.
Sanmi (Oluwasanmi) Koyejo is an Assistant Professor in the Department of Computer Science at Stanford University and an adjunct Associate Professor at the University of Illinois at Urbana-Champaign. He leads the Stanford Trustworthy AI Research (STAIR) lab, which develops measurement-theoretic foundations for trustworthy AI systems, spanning AI evaluation science, algorithmic accountability, and privacy-preserving machine learning, with applications to healthcare and scientific discovery. His research on AI capabilities evaluation has challenged conventional understanding in the field, including work on measurement frameworks cited in the 2024 Economic Report of the President.
Koyejo has received the Presidential Early Career Award for Scientists and Engineers (PECASE), Skip Ellis Early Career Award, Alfred P. Sloan Research Fellowship, NSF CAREER Award, and multiple outstanding paper awards at flagship venues, including NeurIPS and ACL. He has delivered keynote presentations at major conferences, including ECCV and FAccT. He serves in key leadership roles, including Board President of Black in AI, Board of Directors of the Neural Information Processing Systems Foundation, and other leadership positions in professional organizations advancing AI research and broadening participation in the field.
Julie Choi is an Engineering Manager within LinkedIn’s Core AI organization, where she leads the team responsible for next-generation recommendation and ranking systems. Her work focuses on the development and productionization of generative AI and sequence-based modeling to enhance personalization across a wide range of LinkedIn products, including Feed and Ads. Prior to LinkedIn, Julie focused on building large-scale machine learning systems and holds a Ph.D. in Electrical Engineering with a minor in Statistics from Stanford University.
Konstantina Christakopoulou is a Staff Research Engineer at Google DeepMind, leading efforts around AI agents that help improve people’s lives. She co-founded and technically led Project ALLY, a Google Brain Moonshot aimed at the next generation of assistive recommendation that helps users throughout their lifelong journey. She has led multiple cross-PA efforts to align industrial recommendation platforms with human values, publishing in top-tier conferences and delivering 20+ launches while working closely with senior leadership at Google. Prior to Google, she received her PhD from the University of Minnesota, during which she completed research internships at Google Research and Microsoft Research, and first-authored influential works on conversational recommendation.
Ondrej Linda has been bringing his experience in AI, machine learning, data science and ethical AI to Zillow for the past eight years. Ondrej currently leads AI Science and Engineering teams focused on exploring the frontiers of Agentic AI systems to build customer-focused features to help people find homes and move. Ondrej also plays a lead role in shaping Zillow's responsible AI-driven initiatives and is actively engaged in supporting Zillow’s ethical AI efforts, ensuring fairness in AI model implementation.
Prior to joining Zillow, Ondrej spent six years at Expedia, where he worked as a Data Scientist specializing in Natural Language Processing and Information Retrieval. He also managed the Data Science team for the Hotwire brand.
Ondrej earned his PhD in Computer Science with a focus on Machine Learning from the University of Idaho in 2012, and holds a Master's in Computer Graphics from Czech Technical University.
Kevin Zielnicki is a research scientist in the Machine Learning and Inference Research team at Netflix. His work focuses on recommendation systems and how users interact with recommendations to make choices. He received his PhD from the University of Illinois at Urbana-Champaign and previously developed recommendation systems for Stitch Fix.
Sudeep Das is a senior machine learning and artificial intelligence leader with over 15 years of experience building large-scale, consumer-facing AI systems. He currently serves as Head of Machine Learning & AI for New Business Verticals at DoorDash, where he leads personalization, search, catalog intelligence, and decision-making systems across rapidly expanding consumer experiences including grocery, convenience, alcohol, and retail. At DoorDash, Sudeep focuses on applying deep learning, recommender systems, and generative AI to create highly adaptive, real-time consumer experiences. His work spans ranking and retrieval, large language model–powered discovery, and agentic systems that reason across user intent, context, and constraints to deliver personalized outcomes rather than static recommendations. Previously, Sudeep was a Machine Learning Lead at Netflix, where he helped develop next-generation personalization and discovery algorithms used by hundreds of millions of users worldwide. He holds a Ph.D. in Astrophysics from Princeton University and brings a strong scientific foundation to practical AI leadership at scale. Sudeep is a frequent speaker at leading international conferences including RecSys, SIGIR, ICML, and QCon, where he shares insights on production ML systems, personalization, and the future of intelligent consumer platforms.
Luke Lequn Wang is a Research Scientist and Engineer at Netflix, specializing in the intersection of generative AI and personalization. He leads the development of Netflix's generative homepage recommender, a customized and efficient LLM that constructs personalized homepages in real-time. His broader research background encompasses reinforcement learning, user interactive systems, LLMs, and trustworthy machine learning. He obtained his Ph.D. in Computer Science from Cornell University.
Jaewon Yang is a Principal Machine Learning Engineer at Pinterest, where he specializes in advancing machine learning technologies for recommender systems, generative AI, and representation learning across Pinterest’s products. Before joining Pinterest, he was a Distinguished Machine Learning Engineer at Nextdoor and a Principal ML Engineer at LinkedIn. Jaewon holds a Ph.D. in Machine Learning and a Master’s in Statistics from Stanford University’s Infolab. He has published over 50 papers with more than 9,000 citations and has served on the senior program committees of top-tier conferences such as SIGKDD and CIKM for years. His contributions have been recognized with five best paper awards from SIGKDD, WSDM, and ICDM, including three test-of-time awards.
Title: When Recommendation Becomes Evaluation: Aggregation, Heterogeneity, and Strategic Gaming
Speaker: Sanmi Koyejo (Stanford)
Abstract:
Collaborative filtering and matrix factorization were designed to predict what individual users want. Increasingly, they do something else: rank AI systems on public leaderboards and certify which facts are true on platforms used by billions. Recommendation has become evaluation infrastructure, and that repurposing has consequences. We examine two. In AI evaluation, preference heterogeneity gets averaged away, producing aggregate rankings that hide systematic disagreement across annotator populations. In crowdsourced fact-checking (deployed across X, Meta, TikTok, and Google), matrix factorization repurposed for consensus detection admits coordinated manipulation: a small number of adversarial accounts can fabricate cross-ideological agreement with no exploit required. Robust evaluation is a mechanism design problem, and the recommendation community is well-positioned to address it.
Speaker: Julie Choi (LinkedIn)
Abstract:
Modern recommender systems can observe long, rich user histories, but most production rankers still struggle to use that history directly at serving time. In this talk, we describe LinkedIn’s work on Generative Ranking, which models member behavior as long sequences and serves candidates through amortized shared-context attention. This design made long-history ranking practical in production and has been deployed in LinkedIn Feed and Ads.
We then discuss Semantic IDs (SIDs) as a compact way to reuse rich signals across ranking systems. Instead of serving every dense representation or full behavior sequence directly, SIDs convert behavior, text, content, or LLM-derived embeddings into discrete tokens that downstream rankers can train on under their own objectives. Together, these techniques show how long-history user understanding can move from offline modeling into production-scale ranking systems.
Title: The Value of Personalized Recommendations: Evidence from Netflix
Speaker: Kevin Zielnicki (Netflix)
Abstract:
Personalized recommendation systems shape much of user choice online, yet their targeted nature makes it challenging to separate the value of the recommendations from that of the underlying goods. We build a discrete choice model that embeds recommendation-induced utility, low-rank heterogeneity, and flexible state dependence, and apply the model to viewership data at Netflix. We exploit idiosyncratic variation introduced by the recommendation algorithm to identify and separately value these components, as well as to recover model-free diversion ratios that we can use to validate our structural model. We use the model to evaluate counterfactuals that quantify the incremental engagement generated by personalized recommendations. First, we show that replacing the current recommender system with a matrix factorization or popularity-based algorithm would lead to 4% and 12% reductions in engagement, respectively, and to decreased consumption diversity. Second, most of the consumption increase from recommendations comes from effective targeting, not mechanical exposure, with the largest gains for mid-popularity goods (as opposed to broadly appealing or very niche goods).
Title: From Ranking to Reasoning: Building The Next Generation of Consumer AI at DoorDash
Speaker: Sudeep Das (DoorDash)
Abstract:
Consumer AI is rapidly evolving beyond static ranking and one-shot recommendations into intelligent systems that can reason, remember, guide, and adapt across the entire shopping journey. In this talk, we will share how DoorDash transformed its Search and Discovery stack from traditional deep learning–based approaches into a dynamic, contextual, and generative AI–powered platform for novel consumer experiences. We will explore how this evolution enables dynamic merchandising moments, richer search experiences, hyper-personalized browsing, and conversational shopping assistants that support multimodal interaction. The talk will cover the hybrid architecture behind these experiences, combining the strengths of traditional embeddings, rankers, and sequential models with the semantic reasoning, multimodal understanding, and adaptability of LLMs and VLMs. Finally, we will share key lessons from building a conversational shopping assistant, including how we designed an evaluation harness, developed a flexible consumer memory system, and used personalization to create more intuitive and engaging shopping experiences.
Title: Building AI You Can Trust: Innovation & Fair Housing in Real Estate
Speaker: Ondrej Linda (Zillow)
Abstract:
Real estate is one of the most complex and regulated consumer domains with months-long journeys spanning multiple personas. In this talk, we will introduce Zillow AI Mode, a conversational AI experience that turns traditional home search into coordinated action by leveraging Zillow's unique combination of live housing data, consumer behavioral context, and real-estate platform infrastructure. AI Mode delivers contextual understanding, experience navigation, multi-modal responses via widgets, and timely connections to human experts. Deploying such a system responsibly requires compliance with fair housing requirements, which prohibit housing discrimination based on protected characteristics. We describe our approach to building fair housing guardrails around the AI Mode experience.
Title: GenPage: Towards Generative End-to-End Homepage Construction at Netflix
Speaker: Luke Lequn Wang (Netflix)
Abstract:
In this talk, I'll present GenPage, Netflix's end-to-end generative approach to homepage construction — a single transformer that replaces our traditional multi-stage recommender stack. GenPage treats user context as a prompt, and autoregressively generates the entire structured, multi-row homepage as the response. We adapt the LLM training recipe: pretraining followed by post-training via weighted binary classification (WBC) or reinforcement learning (RL). For industry-scale deployment, we introduce techniques addressing cold start, model freshness, business-rule enforcement, and serving efficiency. In online A/B tests against our mature, highly optimized production homepage recommender, GenPage delivered statistically significant improvements on our core user engagement metric, while reducing end-to-end serving latency by 20%. Offline experiments yield two findings worth highlighting: enriching the prompt yields a larger improvement than scaling model capacity in our current regime, and RL post-training increases homepage diversity even though diversity is not part of the objective.
Title: The Generative Shift in RecSys at Pinterest: Foundation Models for Ranking, Sequence Generation for Retrieval
Speaker: Jaewon Yang (Pinterest)
Abstract:
Pinterest's recommender system is shifting to a generative paradigm: large transformers trained to predict next items and served in real time. This shift has two main characteristics. First, pretraining and fine-tuning amortize training cost across surfaces. Second, sequence engineering replaces feature engineering, allowing new tasks to be addressed by changing the input sequence rather than the model or features. We describe two systems built on this paradigm. PinRec is a generative retrieval model: a causal transformer pretrained on cross-surface user activity and fine-tuned for each surface through input sequence changes. It introduces outcome-conditioned generation, enabling retrieval of candidates aligned with specific business objectives. PinFM is a 20B-parameter foundation model for ranking. It is pretrained on long-term user activity and fine-tuned by appending candidate Pins to the input sequence, allowing attention to directly model user-candidate interactions. To control serving cost, a request-level transformer deduplicates user sequence processing across candidates.
Both systems share the same recipe: pretrain once, then adapt through sequence engineering. Deployed across Homefeed, Search, and Related Pins, they are now the primary drivers of engagement gains across Pinterest's major surfaces.
Semantic ID–Powered Personalization for Notifications
Aria Li

Generative Conversational Search for Mobile and Voice TV
Aditya Sinha, Dhinesh Dhanasekaran, Shahrzad Naseri, Spencer L'Heureux, Vito Ostuni, Matteo Rinaldi

Semantic ID @ Netflix: From Generation to Integration
Sejoon Oh, Fernando Amat Gil, Dawit Mureja Argaw, Mark Thornburg, Moumita Bhattacharya, Ashish Rastogi

Semantic IDs and LLM-friendly tokenization of titles for Member LLM
Dawit Mureja Argaw

From Sparse Coverage to Production-Scale Calibration: A Multi-Strategy GenAI Framework for Trustworthy Retrieval Benchmarks
Ehsan Gholami, Ding Tong, Shahrzad Naseri, Lucas Zhang

Self-Reflection in Personalized Explanation Generation and Evaluation
Emma Kong, JJ Tan, David Fagnan

Multimedia Asset Personalization via Multimodal Embeddings at Netflix
Emma Kong, Aditya Deshpande, David Fagnan, Ashish Rastogi

MediaFM: A Multimodal Content Model Powering Personalization at Netflix
Avneesh Saluja, Santiago Castro, Bowei Yan, Ashish Rastogi

Building a Vertical Video Ranker from Scratch: A Crawl-Phase Journey from Cold Start to Calibrated Engagement
Erik Schmidt, Ramya Nagarajan, Ishita Verma, Yunan Hu, James McInerney

From Classical Recommenders to LLM-Native Recommendation Systems at Netflix
Shradha Sehgal, Ying Li, Arjun Rao, Linas Baltrunas