Part of my research involves teaching machines to read emotion from faces, voices, and text. The goal, when the work is going well, is genuinely useful: detecting signals that might indicate distress, dysregulation, or mental health changes in ways that self-report alone cannot capture. Emotional states shape behaviour, decision-making, wellbeing. Being able to track them more continuously and more objectively is not a trivial possibility.

But I have also spent enough time inside this field to know what it cannot reliably do. And lately I keep returning to that knowledge when I read about what retailers are already building with it.

What is already in the store with you

A company called Cloverleaf, in partnership with Affectiva — one of the best-known Emotion AI developers, now part of Smart Eye — has deployed a system called shelfPoint in physical retail stores. The technology replaces the static price tags and cardboard marketing displays along store shelves with LCD strips equipped with small optical sensors and cameras. As shoppers pass, the system reads their facial expressions in real time, classifies them into emotional categories — joy, sadness, anger, fear, surprise — and uses that classification to adapt what appears on the display. Content changes based on the emotional state the system infers the shopper is experiencing.

Early retail pilots showed double-digit sales uplift. The company described the system as giving brands and retailers the same level of behavioural analytics that e-commerce platforms have long used to understand customers, now imported into physical space.

The system also captures demographic data — age, gender, and what the documentation describes as major ethnic group — using the same optical sensors. It does not, the company says, store identifiable personal images.

Most shoppers walking past these shelves have no idea any of this is happening.

Why I find this genuinely difficult to evaluate

My instinct when I encounter Emotion AI deployed at scale is not simple alarm. The field I work in gave rise to these tools, and I understand why the underlying project is compelling. Emotions are real. They influence what people buy, what they attend to, what draws them in and what pushes them away. If a system could genuinely detect emotional state in real time and use that information to show a shopper something more relevant to what they actually need or want, there is a version of that which is not obviously harmful.

The difficulty is the gap between what the technology claims to do and what the science supports.

For instance, a systematic review led by Lisa Feldman Barrett found that the way people express even basic emotions varies substantially across situations, cultures, and individuals. The same internal state does not reliably produce the same facial expression. The same facial expression does not reliably indicate the same internal state. A smile is not joy. A furrowed brow is not anger. These are culturally shaped, context-dependent, personality-modulated performances, not legible readouts of an inner condition.

This matters because Emotion AI systems, including Affectiva’s, are built on the premise that facial configurations do map reliably onto emotional categories. The categories themselves — joy, sadness, anger, fear, surprise, disgust — come largely from the work of Paul Ekman, whose universality hypothesis has faced sustained and serious challenge in the psychological literature. 

The EU AI Act of 2024 classifies emotion recognition technology as high-risk, citing its limited reliability, lack of specificity, and limited generalisability. On exactly these grounds, it prohibits the technology in workplaces and educational institutions, with exceptions for medical and safety purposes.

What the system is actually reading

Working with digital biomarkers — extracting emotional and psychological signals from facial, vocal, and text features — means living daily with the tension between what a signal can tell you and what you want it to tell you.

Facial expression data contains real information. It is not noise. Patterns in facial movement do correlate with emotional and cognitive states in ways that are meaningful at the population level, under controlled conditions, in research contexts where the limitations are understood and accounted for. The problem is that correlation at the population level does not translate cleanly into reliable inference about an individual shopper, standing at a shelf, in a particular culture, carrying a particular history, whose neutral face a model trained mostly on Western datasets may misread as disengaged, or whose suppressed curiosity looks like nothing at all.
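
To make the gap between population-level correlation and individual inference concrete, here is a minimal sketch in Python. Every number in it is an assumption chosen for illustration, not a measurement from shelfPoint, Affectiva, or any study: a hypothetical "expression score" that sits half a standard deviation higher on average when a shopper is in the target state, and a hypothetical 20% of shoppers actually being in that state. The function name `flagged_precision` is mine.

```python
from statistics import NormalDist


def flagged_precision(effect_size: float, base_rate: float) -> dict:
    """Show how a real group-level signal performs on individuals.

    Assumptions (illustrative only):
    - the expression score is normally distributed with unit variance,
      centred on 0 when the state is absent and on `effect_size` when present
    - the system flags a shopper when the score exceeds the midpoint
    - `base_rate` is the share of shoppers actually in the state
    """
    absent = NormalDist(mu=0.0, sigma=1.0)
    present = NormalDist(mu=effect_size, sigma=1.0)
    threshold = effect_size / 2

    # Share of shoppers in the state who get flagged.
    sensitivity = 1 - present.cdf(threshold)
    # Share of shoppers NOT in the state who get flagged anyway.
    false_positive_rate = 1 - absent.cdf(threshold)

    # Bayes' rule: of everyone flagged, how many are really in the state?
    flagged = base_rate * sensitivity + (1 - base_rate) * false_positive_rate
    precision = base_rate * sensitivity / flagged

    return {
        "sensitivity": sensitivity,
        "false_positive_rate": false_positive_rate,
        "precision": precision,
    }


result = flagged_precision(effect_size=0.5, base_rate=0.2)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed numbers, the group difference is real and would show up clearly in a large enough sample, yet only about 27% of the shoppers the system flags are actually in the state it thinks it has detected. Different assumptions give different figures; the point is only that a signal can be meaningful in aggregate and still be wrong about most of the individuals it labels.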

Emotional regulation means that what appears on the face is often not what is happening inside. A person who has learned to manage their expressions in public — which is most people, in most cultures, most of the time — is already partially invisible to these systems. A person whose cultural norms around emotional display differ from the training data the model was built on may be systematically misclassified.

And a person whose demographic characteristics place them in a category the system captures — age, gender, ethnicity — may be receiving content shaped not only by their momentary expression but by what the algorithm predicts people like them tend to respond to. That is a different kind of inference. It does not require reading emotion at all. It requires only that the system have learned, from previous data, what different demographic groups buy.

These two functions — emotional inference and demographic targeting — are bundled together in the same product. The distinction between them tends not to appear on the display.

The consent problem

E-commerce platforms track behaviour explicitly and at length. Most people who use them know this is happening, at least in general terms. The terms of service say so. The cookie notices say so. The personalised advertisements that follow you across platforms for days after a single search are their own kind of notification.

A shopper standing in a grocery aisle looking at a shelf has not agreed to have their face read. They have not been told that the display in front of them is adapting to what a sensor has inferred about their emotional state. They have not consented to having their demographic data captured and used to shape what they see.

The shelfPoint documentation notes that no identifiable personal images are stored. This is meaningful from a privacy standpoint and worth acknowledging. It does not address the question of whether people should know this is happening to them in the moment it is happening, regardless of whether a record is kept.

Emotional experience is among the most personal of human states. The idea that it can be commercially harvested in real time, without knowledge or consent, from the ordinary act of standing in a shop and looking at a shelf, represents a change in the texture of public space that most people have not had the opportunity to evaluate or consent to.

What I find myself holding alongside the discomfort

The double-digit sales uplift is real. The technology works — in the narrow sense that it changes behaviour and increases conversion rates. Whether it works because it accurately reads emotion, or because any dynamic, responsive display is more engaging than a static one, or because demographic targeting at the shelf is simply effective regardless of emotional inference, is a question the marketing case studies do not fully answer.

What I know from working in this space is that the appeal of the technology is genuine and the underlying science is more contested than the commercial applications acknowledge. That the distance between detecting a facial configuration and knowing what someone feels is larger than a well-designed product tends to suggest. That the people most likely to be misread by these systems are often the people least well-represented in the data they were trained on.

And that the rollout of Emotion AI into physical retail is not a future event. The shelf has already started watching. What it thinks it sees is a more complicated question than the sales figures imply.