Jeremy Korst, Stefano Puntoni, and Olivier Toubia published a piece in Harvard Business Review this week mapping how AI moderators are transforming qualitative research. It's a careful, well-reported overview of a market that's moving quickly. The companies they profile are delivering real results: richer responses, faster timelines, lower costs, broader reach. The evidence that AI can hold meaningful conversations with real people at scale is becoming substantial.
To me, the piece points toward something even larger than the frame it uses. What we're seeing may be the early stages of a broader shift in how organizations understand people.
Three layers, not one
It's useful to separate what's happening into three distinct challenges. The first is conversation execution: can AI sustain engagement, adapt its probes, and collect rich data at scale? The platforms profiled in the article have made real progress here, with adaptive follow-ups, multi-modal capture, and asynchronous scheduling. This is largely where the market is competing.
The second is conversation design: are the questions and probes grounded in a theory of behavior? This layer receives less attention, and it may be where the most consequential differences emerge.
The third is interpretation: can the outputs be transformed into reliable constructs, actionable decisions, or accurate predictions? This is where the digital twins question the article raises at the end becomes relevant.
The article focuses primarily on the first layer. What interests me is the relationship between the second and third: the quality of what you can interpret or predict depends heavily on what you asked and how you asked it.
The conversation design question
Much of what organizations know about their people comes from surveys and structured feedback, approaches that largely invite reflective, considered responses. Qualitative research has always been better at reaching specifics, and skilled interviewers do this intuitively: reading the room, probing past rationalizations, following threads in real time. The challenge is that when you scale conversations beyond what a single researcher can conduct, that intuition has to be made explicit. The behavioral science that a good interviewer applies instinctively needs to be encoded into question design, probing logic, and sequencing. That's a different kind of problem than training a human moderator.
Kahneman and colleagues drew a useful distinction between the experiencing self and the remembering self. The remembering self constructs coherent narratives and tends to surface what people believe about themselves. Reflective accounts often contain real insight. The risk is that they present a tidier picture than reality warrants.
Their research on income and well-being illustrates this nicely. When people are asked to evaluate their life satisfaction, income correlates meaningfully with their answers. When their actual moment-to-moment experience is measured, that correlation largely disappears. The global question and the experiential question reach different versions of the same person.
Behavioral science has spent decades developing frameworks for targeting specific decision moments, identifying contextual frictions, and distinguishing between stated reasons and underlying mechanisms. Embedding that knowledge into conversation design doesn't guarantee better understanding. It does increase the probability of surfacing the mechanisms, contexts, and frictions that generic interviewing is more likely to miss.
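To make the idea of "encoding" concrete, here is a minimal sketch of what explicit probing logic could look like. The mechanism labels, cue phrases, and follow-up probes are illustrative assumptions, not any vendor's actual implementation or a validated taxonomy:

```python
# Hypothetical sketch: interviewer intuition made explicit as probing rules.
# A skilled moderator hears a rationalization and probes toward the mechanism
# behind it; here that mapping is written down as data.

RATIONALIZATION_CUES = {
    "friction": ["too far", "no time", "complicated", "forgot"],
    "social_norm": ["everyone", "no one else", "my doctor said"],
    "identity": ["not the kind of person", "felt better", "don't need"],
}

PROBES = {
    "friction": "Walk me through the last time that got in the way. What exactly happened?",
    "social_norm": "Who else was involved in that decision? What did they say?",
    "identity": "When did you first start feeling that way about it?",
    "generic": "Can you tell me more about that?",
}

def next_probe(response: str) -> str:
    """Pick a mechanism-targeted follow-up instead of a generic 'tell me more'."""
    text = response.lower()
    for mechanism, cues in RATIONALIZATION_CUES.items():
        if any(cue in text for cue in cues):
            return PROBES[mechanism]
    return PROBES["generic"]
```

A production system would use far richer classification than keyword matching; the point of the sketch is only that the probing logic becomes inspectable and improvable once it is explicit rather than tacit.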
Beyond market research
The HBR piece frames AI-moderated interviews primarily as a market research tool, and the use cases are compelling. This is also where the potential starts to expand. Many functions within organizations have been operating with limited or no qualitative evidence about the people they serve. The methods simply didn't exist at the right combination of depth, scale, and cost.
Consider healthcare. A patient support program wants to know why patients disengage from treatment. Surveys return “I felt better” or “side effects.” A conversation designed around behavioral frameworks is better positioned to surface the specific moment a patient stopped: the pharmacy was too far, the dose timing didn't fit their life, the concept of “maintenance therapy” was never explained in terms that made sense. These are actionable findings for clinical teams, not marketing teams.
Similar dynamics apply in compliance and risk, where AI conversations can surface behaviors that social desirability bias typically obscures; in employee experience, where engagement surveys measure attitudes but rarely capture the workarounds and friction points of daily work; and in B2B contexts, where understanding a client's decision-making process requires depth that transaction data can't provide.
The common thread is that AI conversations are making possible a kind of behavioral understanding that many functions have never had access to, complementing and in some cases going beyond survey-based approaches that were always an imperfect proxy for the conversations we actually needed to have.
From episodic research to behavioral listening
The article describes AI-moderated interviews as a research method: something you deploy for a study, collect results, and write a report. That framing fits the traditional qualitative model, which has always been episodic.
AI conversations also open the possibility of something more continuous. If they can be integrated into existing processes (onboarding flows, patient check-ins, post-audit reviews, policy rollouts) at low marginal cost, what emerges is closer to a listening infrastructure: a systematic, recurring way of understanding what's actually happening with the people an organization serves or employs.
This matters because behavior changes. The friction that blocks medication adherence in January may differ from the friction in June. The compliance risk in one market may not exist in another. Episodic research gives you a snapshot. Continuous listening gives you a signal that evolves with the context it's measuring.
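One way to picture the difference between episodic studies and continuous listening is that conversations become event-driven rather than project-driven. The sketch below assumes a hypothetical rule structure (trigger events, topic guides, cooldown periods are all illustrative names, not an existing system's API):

```python
# Hypothetical sketch of listening infrastructure: conversations are triggered
# by process events, with a cooldown so the same person isn't over-contacted.
from dataclasses import dataclass, field

@dataclass
class ListeningRule:
    trigger: str          # process event, e.g. "policy_rollout"
    topic_guide: str      # which conversation design to run
    cooldown_days: int    # minimum gap before re-interviewing the same person

@dataclass
class ListeningInfrastructure:
    rules: list[ListeningRule] = field(default_factory=list)
    last_contact: dict[str, int] = field(default_factory=dict)  # person -> day

    def on_event(self, person: str, event: str, today: int) -> list[str]:
        """Return the topic guides due for this person on this event."""
        due = []
        for rule in self.rules:
            if rule.trigger != event:
                continue
            last = self.last_contact.get(person)
            if last is None or today - last >= rule.cooldown_days:
                due.append(rule.topic_guide)
                self.last_contact[person] = today
        return due
```

The design choice the sketch highlights is the cooldown: continuous listening only works if "when and how often to listen" is governed explicitly, which is exactly the infrastructure question raised above.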
The infrastructure challenges are real: integration, privacy, data governance, the question of when and how often to listen. We're early in this. The direction seems worth taking seriously.
Better data for better synthetic personas
The article ends by pointing toward what may be the most consequential research question in this space: what training data produces the most accurate digital twins? Korst, Puntoni, and Toubia announce a new study with Columbia Business School and Twinloop to investigate this.
This connects naturally to the conversation design question. Conversations structured around validated behavioral frameworks produce a different kind of data than unstructured interviews: ranked drivers and barriers, classified by established constructs, each finding traceable to specific moments in the conversation. It seems plausible that this kind of structured behavioral evidence would be a stronger foundation for synthetic personas that need to predict behavior.
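The structure described here (ranked drivers and barriers, classified by construct, traceable to conversation moments) can be sketched as a simple schema. Construct names and field choices are illustrative assumptions:

```python
# Hypothetical sketch of structured behavioral evidence: each finding is tied
# to a behavioral construct, ranked by strength, and traceable to the exact
# conversation turn it came from.
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    construct: str      # e.g. "perceived_friction", "outcome_expectancy"
    kind: str           # "driver" or "barrier"
    rank: int           # 1 = strongest within its kind
    evidence_turn: int  # index of the conversation turn supporting it
    quote: str          # verbatim excerpt for traceability

def barriers_ranked(findings: list[Finding]) -> list[Finding]:
    """Barriers ordered by strength, ready to feed a downstream model."""
    return sorted((f for f in findings if f.kind == "barrier"),
                  key=lambda f: f.rank)
```

Compared with raw transcripts, data in this shape is the kind of input one could plausibly audit, aggregate across interviews, and use to train or validate synthetic personas, though whether it actually yields more accurate twins is exactly the open empirical question.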
The social and behavioral sciences have spent decades building models of why people do what they do. It would be surprising if that knowledge turned out to be irrelevant to building accurate behavioral replicas. The empirical work is still in its early stages. We're exploring it through our own research collaboration with the University of Zurich, and through the Behavioral AI Institute, which, with co-authors from Harvard and Duke, recently published a paper working toward establishing behavioral AI as a scientific discipline.
What's emerging
The next frontier after scalable AI interviewing may be behavioral validity: confidence that what you're measuring actually reflects how people behave.
The HBR piece maps a market making qualitative research faster, cheaper, and more scalable. What's also emerging alongside it is something broader: the possibility of grounding organizational decisions in behavioral evidence at a depth and scale that hasn't been available before, across healthcare, compliance, employee experience, and anywhere the question is “why do people do what they do?”
The conversation infrastructure matters. And so does the science that shapes the conversation, the breadth of contexts where it can be applied, and the shift from one-off studies to continuous understanding. That's what we're working on at lumenx. There's a lot of room for everyone in this space to push it forward.