
Open Access 13.01.2024 | Original Paper

A Dynamic Disadvantage? Social Perceptions of Dynamic Morphed Emotions Differ from Videos and Photos

Authors: Casey Becker, Russell Conduit, Philippe A. Chouinard, Robin Laycock

Published in: Journal of Nonverbal Behavior


Abstract

Dynamic face stimuli are increasingly used in face perception research, as mounting evidence shows they are perceived differently from static photographs. One popular method for creating dynamic faces is the dynamic morph, which animates the transition between expressions by blending two photographs together. Although morphs offer increased experimental control, their unnatural motion differs from the biological facial motion captured in video recordings. This study compared ratings of emotion intensity and genuineness for video recordings, dynamic morphs, and static photographs of happy, sad, fearful, and angry expressions. We found that video recordings were perceived to have greater emotional intensity than dynamic morphs, and video recordings of happy expressions were perceived as more genuine than happy dynamic morphs. Unexpectedly, static photographs and video recordings received similar ratings for genuineness and intensity. Overall, these results suggest that dynamic morphs may be an inappropriate substitute for video recordings, as they may elicit misleading dynamic effects.
Notes
Author Note: Data and additional online materials are openly available at the project’s Open Science Framework repository (https://doi.org/10.17605/osf.io/bjvty). We have no relevant financial or non-financial conflicts of interest to disclose. The formal analysis, investigation, and original draft of this study were completed by Casey Becker. Supervision was provided by Robin Laycock, Russell Conduit, and Philippe A. Chouinard. All authors participated in the conceptualization of the study, as well as the review and editing process.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction

Subtle facial muscle movements can convey complex emotions (Dong et al., 2021). The speed, amplitude, and duration of these movements affect how emotions are perceived (Dong et al., 2021; Schmidt et al., 2006). While traditional face perception research has relied on static images of peak expressions (e.g., Blair et al., 1999), technological advancements have led researchers to challenge this approach (Tcherkassof et al., 2007; Wehrle et al., 2000), as people perceive static and dynamic faces differently (for reviews, see Alves, 2013; Krumhuber et al., 2013; Krumhuber et al., 2023). For example, people are often better at recognising emotions in dynamic compared to static faces, a phenomenon known as the “dynamic advantage” (e.g., Cunningham & Wallraven, 2009). However, as we will discuss, the effect of dynamism may depend on the type of motion portrayed.

The Dynamic Advantage: Static Images and Video Recordings

Some studies show that people are just as good at recognising emotions in photographs as in videos (Trautmann-Lengsfeld et al., 2013; Yitzhak et al., 2018), while others demonstrate an advantage for video recordings (Ambadar et al., 2005; Cunningham & Wallraven, 2009; Fujimura & Suzuki, 2010). Li et al. (2022) found an advantage for happy videos, but not angry ones. Interestingly, interspersing each video frame with noise images eliminates the dynamic advantage, suggesting the effect is not due to an increase in static information (Ambadar et al., 2005; Bould & Morris, 2008). People are better at distinguishing between posed and spontaneous expressions in videos compared to static faces (Krumhuber & Manstead, 2009; Zloteanu et al., 2018), perhaps because videos are perceived as more genuine (Krumhuber & Manstead, 2009), or because they elicit greater implicit facial mimicry in the observer (Rymarczyk et al., 2019).
Distinct neural responses have been found for video-recorded expressions compared to static faces in studies employing functional magnetic resonance imaging (fMRI; Foley et al., 2012; Kilts et al., 2003; Pitcher et al., 2014; Pitcher et al., 2019; Rymarczyk et al., 2018, 2019; Trautmann-Lengsfeld et al., 2013; Trautmann et al., 2009) and electroencephalography (EEG; Alp & Ozkan, 2022; Hernández-Gutiérrez et al., 2018; Li et al., 2022; Quadrelli et al., 2021; Trautmann-Lengsfeld et al., 2013). Consequently, static faces are unlikely to elicit neural and behavioural responses associated with face processing as it occurs in real dynamic social interactions.

The Widespread Use of Dynamic Morphs

Video recordings of facial expressions offer greater ecological validity; however, they are difficult to standardise, as expression intensity and duration vary across actors. Videos also contain movements that are treated as experimental noise (Holland et al., 2019; Mühlberger et al., 2009). As a result, many researchers turn to image-based morphing, using computer software to simulate movement from photographs of faces (Kosonogov et al., 2022; Schonenberg et al., 2019). This technique can be used to create ambiguous stimuli by blending together emotions or identities, or to create varying levels of intensity by blending together neutral and expressive photographs (Bucks et al., 2008). Originally, morphing was used to present these stimuli in static form (Hess et al., 1997).
Some have proposed that dynamic morphs lack ecological validity, as they portray unnatural facial movement (Dobs et al., 2018; Krumhuber et al., 2023). Despite this, dynamic morphs are at the forefront of multiple research areas, as illustrated by recent studies examining distinct aspects of face perception, such as social perceptions (Li et al., 2019) and emotion contagion (Wróbel & Olszanowski, 2019). Researchers often use dynamic morphs to study face perception deficits in conditions such as anxiety (Gutiérrez-García et al., 2019), Attention Deficit Hyperactivity Disorder (ADHD; Greco et al., 2021; Schonenberg et al., 2019), facial paralysis (De Stefani et al., 2019; Japee et al., 2022), and autism (Griffiths et al., 2019). Importantly, the extent of dynamic morph usage often goes unnoticed, as many researchers refer to their dynamic morphed stimuli as “videos” or “video-clips” (e.g., Bogdanova et al., 2022; Calvo et al., 2018; Calvo et al., 2019a; Calvo et al., 2019b; Kunecke et al., 2014; Macinska et al., 2023).

The Dynamic Advantage: Static Images and Dynamic Morphs

Some have argued that dynamic morphs increase the perceptual realism (Malatesta et al., 2023) and ecological validity (e.g., Gehb et al., 2022; Mayes et al., 2009; Ventura-Bort et al., 2023) of the photographs from which they were made, and constitute a suitable proxy for video recordings when studying facial muscle movements (Wehrle et al., 2000). However, literature on the dynamic effect of morphs is less consistent than that on video recordings. For example, participants in one study were better at recognising emotions in dynamic morphs than photos, especially for low-intensity expressions (Calvo et al., 2016), while other studies showed better recognition from static faces (Kamachi et al., 2013; LaBar et al., 2003), or no difference between display types (Fiorentini et al., 2012; Mayes et al., 2009; Naples et al., 2015). Two studies have shown that people see dynamic morphs as more intense than static faces (Biele & Grabowska, 2006; Rymarczyk et al., 2011), but another showed no dynamic effect (Kessler et al., 2011). In one study, participants displayed increased implicit facial mimicry to morphs compared to static faces (Philip et al., 2018; Sato & Yoshikawa, 2007), while another observed this effect only for happy and not angry expressions (Rymarczyk et al., 2011). Another study showed that dynamic morphs improved the ability to differentiate between, but not label, face emotions (Darke et al., 2019). Collectively, these findings highlight the need for further research to better understand why, and under what conditions, dynamic morphs are perceived differently from static images.

Ecological Validity of Dynamic Morphed Motion

Dynamic morphs used in research typically show a linear rise from a neutral to peak expression, but natural transitions are inherently non-linear (Cunningham & Wallraven, 2009; Krumhuber et al., 2023). This difference can be observed by motion-tracking facial landmarks in expression-to-expression transitions (e.g., angry-to-happy), which shows that the overall rate of change varies over time in video recordings, while dynamic morphs display a constant frame-to-frame shift (Korolkova, 2018b). Facial expressions can be broken down into individual movements known as action units, which correspond with the movement of underlying muscles (Ekman & Friesen, 1978). Analysis of video recordings shows that these action units differ in their rising duration, peak duration, decay duration, and magnitude (Dobs et al., 2014), with each component following a unique nonlinear trajectory (Bartlett et al., 2006; Fiorentini et al., 2012; Pantic & Patras, 2006; Zhao et al., 2016). In contrast, morphs force synchronisation upon each expression feature, e.g., the brow-furrow and mouth frown during an angry expression (see Fig. 1).
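The contrast between a morph's constant frame-to-frame shift and the time-varying rate of change in natural motion can be sketched for a single facial landmark. The values and the smoothstep easing below are purely illustrative stand-ins, not the trajectories of any actual stimulus set:

```python
import numpy as np

def morph_trajectory(start, peak, n_frames, easing=None):
    """Interpolate one landmark coordinate from `start` to `peak`.

    With no easing, this reproduces the linear blend characteristic of
    dynamic morphs; a nonlinear easing function mimics the varying rate
    of change observed in video-recorded expressions.
    """
    t = np.linspace(0.0, 1.0, n_frames)
    if easing is not None:
        t = easing(t)
    return start + (peak - start) * t

# Hypothetical brow landmark moving 10 units over 20 frames.
linear = morph_trajectory(0.0, 10.0, 20)
smooth = morph_trajectory(0.0, 10.0, 20, easing=lambda t: 3 * t**2 - 2 * t**3)

steps_linear = np.diff(linear)  # identical displacement every frame
steps_smooth = np.diff(smooth)  # accelerates, then decelerates
print(np.allclose(steps_linear, steps_linear[0]))  # constant step size
```

Both trajectories start and end on the same neutral and peak values; only the dynamics between differ, which is precisely the manipulation at issue here.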
By manipulating realistic 3D-renders of faces, researchers found that non-linear motion appears more natural compared to linear motion, especially for expressions with complex skin wrinkling (Cosker et al., 2010), which is inherently non-linear (Bickel et al., 2007). Expressions look more natural when features move independently (Curio et al., 2006), and people perceive such naturalistic motion in avatar expressions as more intense, genuine, typical, and trustworthy (Krumhuber et al., 2007; Wallraven et al., 2008).
Reversing the timeline of expression-to-expression transitions (e.g., happy to sad) diminishes perceived intensity and realism, and influences emotion recognition, regardless of emotion direction (Korolkova, 2018b; Reinl & Bartels, 2014). Dynamic morphed motion, by contrast, is identical when played forward and backward, so reversing the timeline has no effect on behaviour (Korolkova, 2018b).
To our knowledge, one study has directly compared video recorded and dynamic morphed stimuli. Korolkova (2018a) showed that photographs of ambiguous emotions look more like the dynamic morphed or video recorded emotion that preceded them (adaptive after-effects), but less like the emotion in a preceding static photograph (contrastive after-effects). Although the dynamic effect in this study was comparable for dynamic morphs and videos, it remains unclear whether this holds true when dynamic stimuli are used as target stimuli. In addition, Korolkova presented stimuli at 100 frames per second to explore subtle differences in non-linearities, and it is unclear whether the standard framerates used in face databases would elicit distinct responses compared to dynamic morphs.
Two recent analyses of the literature suggest that emotion perception differs for static, video recorded, and dynamic morphed expressions. Khosdelazad et al. (2020) compared performance on three established emotion recognition tests, each using either static, video, or morphed expressions. Although these tests purport to measure the same construct, recognition rates for each emotion did not correlate between display types. Similarly, a meta-analysis of emotion perception in adolescence revealed that happiness was the only facial emotion for which recognition was not affected by presentation as static, video recorded, or dynamic morph (Zupan & Eskritt, 2023).

The Current Study

We aimed to systematically examine the ecological validity of dynamic morphs, and establish whether the dynamic advantage differs between dynamic morphs and videos. To do this, we assessed whether photos and dynamic morphs convey the same social information as the videos from which they were created. High intensity expressions can often lead to ceiling effects on categorical emotion recognition tasks (Kamachi et al., 2013); however, they can still elicit differences in social judgements (Biele & Grabowska, 2006; Kamachi et al., 2013; Krumhuber & Manstead, 2009; Oda & Isono, 2008). We therefore compared judgements of expression strength (also known as intensity) and emotion genuineness (also known as authenticity or sincerity) for each display type. Given that video-recorded expressions have previously demonstrated a more consistent dynamic advantage, and that naturalistic facial movements are seen as more intense and authentic, we hypothesised that video-recorded expressions would be perceived as stronger and more genuine than dynamic morphed and static photos.

Method

Participants

A power analysis was conducted based on an effect size of η²p = 0.06 from a comparable study examining the naturalness of morphed facial expressions (Sato & Yoshikawa, 2004), and 95% power. The required sample size of 42 was increased to account for online delivery and additional emotions. Ninety-six participants (53 female, 36 male, 4 nonbinary/gender diverse/other) were initially recruited in 2021 via online advertisements on multiple platforms, including Facebook, Twitter, Instagram, LinkedIn, Reddit, and other research-focused forums. We targeted a variety of volunteer, community, and student groups across multiple English-speaking towns and cities to ensure an unbiased sample. One participant was excluded due to a loading delay of more than 10 s on more than 20% of trials. No other participants had loading delays on any trials.
Participants were located in Australia (53), United Kingdom (8), United States (7), Canada (5), India (3), Switzerland (3), France (1), Germany (1), Malaysia (1), and New Zealand (1). Twelve participants did not report their location. Interested individuals completed an online screening questionnaire, and those aged between 18 and 40 with normal vision (as vision declines with age; Mateus et al., 2013) were invited to complete the online experimental study. Due to a technical error during data collection, the exact age of many participants was not obtained. They provided informed consent via a checkbox.
Twenty-two participants indicated they had a diagnosis of a psychiatric or neurological condition. Eighteen listed at least one psychiatric condition (Depression related: 14, Bipolar related: 2, Anxiety related: 4, Bipolar disorders: 2, Post-Traumatic-Stress Disorder: 2, Borderline Personality Disorder: 2, Obsessive-Compulsive Disorder: 2, Attention Deficit Hyperactivity Disorder: 1, Autism: 2). Three participants chose not to indicate their diagnosis. No participants listed any neurological disorders. The study was approved by the local ethics committee of RMIT University.

Stimuli Selection and Creation

Facial expressions of happiness, anger, fear, and sadness were sourced from the Amsterdam Dynamic Facial Expression Set, which portrays posed expressions (ADFES; van der Schalk et al., 2011). While the time it takes to express spontaneous emotions typically varies, rising time for posed expressions can be similar across emotions (Mavadati et al., 2016). Consistent with this, we observed that the rising time for expressions was similar across emotions in the ADFES database, with many of the actors reaching an intense emotion 800ms after expression onset. This allowed us to standardise all emotion transitions to 800ms. Actors who reached the peak before 800ms were excluded, as were actors who reached the peak well after 800ms. Hence, actors were only chosen if their expression at 800ms could reasonably be considered the peak by two researchers.
To ensure the neutral face was perceived before the expression transition, the neutral frame before expressive movement was held for 200ms. To prevent the appearance that the expression was cut short, the final intense emotion shown after the 800ms transition was held for 200ms, resulting in a total stimulus time of 1200ms. This method created a pseudo-peak, where the expression appeared to culminate without a noticeably static pause. Such held peaks are consistent with research comparing static expressions to dynamic morphs (Arsalidou et al., 2011; Biele & Grabowska, 2006; Kessler et al., 2011; LaBar et al., 2003; Pelphrey et al., 2007; Rymarczyk et al., 2011; Sato & Yoshikawa, 2007; Sato & Yoshikawa, 2004) and videos (Krumhuber & Scherer, 2016).
The resulting expressions were depicted by 6 female (two Mediterranean, four Northern European) and 6 male actors (one Mediterranean, five Northern European). Static stimuli consisted of the peak displayed for 1200ms. Dynamic morphs were created by utilising the neutral and peak expression still frames from each video stimulus, and linearly morphed using FantaMorph© (Version 4.1), which simulated movement between the two images. Approximately 50 markers were placed at corresponding locations for the neutral and peak image (Kaufman & Johnston, 2014), especially at the edges of facial features, and points with the most movement. These locations included the pupils, eye corners, eye lids, hairline, lip edges, and facial moles (see Fig. 2).
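Assuming the 25 fps framerate referred to later in the Discussion, the 200ms + 800ms + 200ms timeline maps cleanly onto whole frame counts. A sketch of that arithmetic (illustrative only, not the authors' actual stimulus-building pipeline):

```python
FPS = 25  # assumed framerate; the Discussion refers to 25fps stimuli
MS_PER_FRAME = 1000 / FPS  # 40.0 ms per frame

def frames(duration_ms):
    """Whole frames needed to fill a stimulus segment at this framerate."""
    return int(duration_ms / MS_PER_FRAME)

# Timeline described above: neutral hold, neutral-to-peak transition, peak hold.
segments = {"neutral_hold": 200, "transition": 800, "peak_hold": 200}
frame_counts = {name: frames(ms) for name, ms in segments.items()}

total_ms = sum(frame_counts.values()) * MS_PER_FRAME
print(frame_counts)  # 5 + 20 + 5 frames
print(total_ms)      # 1200.0, matching the total stimulus duration
```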

Experimental Procedure and Tasks

Participants completed the experimental task online from a laptop or desktop computer. The experiment was created using the online behavioural experiment builder Gorilla (Anwyl-Irvine et al., 2020), and ran for approximately 30 min, including short rest periods to avoid fatigue. Each participant completed 144 trials: 48 trials for each display type (video recording, dynamic morph, static), with 12 trials for each emotion (anger, fear, happiness, sadness). Trials began with a fixation cross on a grey background lasting 400ms, followed by the face expression stimulus for 1200ms, followed by a blank screen for 500ms (see Fig. 3). Participants were asked “How strong was the expression?”, indicating their answer on a 5-point Likert scale with three labels: (1) Very Weak, (3) Moderate, (5) Very Strong (Livingstone & Russo, 2018).
Next, participants were asked, “How genuine was the emotion?”, indicating their answer on a 5-point Likert scale with three labels: (1) Not Genuine, (3) Somewhat Genuine, (5) Very Genuine (Livingstone & Russo, 2018). Note that genuineness was explicitly framed in terms of emotion, rather than as an indication of a “real video”, as participants were not informed that there were any differences in stimulus realism. Participants first completed a practice task, responding to two examples of each stimulus type made from actors not used in the main experiment. During practice, participants were instructed that the ratings of strength and genuineness were independent of each other, and not to overthink their ratings. The dynamic task consisted of video recordings and dynamic morphs presented randomly, with a break at the half-way point. Static photographs were shown in a separate task, resulting in three blocks of 48 trials each. Breaks were included between tasks, which were counterbalanced.

Statistical Analysis

For each rating type (genuineness, strength), a two-way repeated-measures ANOVA was performed with factors ‘display type’ (static, dynamic morph, video) and ‘emotion’ (happiness, sadness, fear, anger). Results were considered significant at p < .05. Main effects were analysed with post hoc tests, and interaction effects were interpreted using simple main effects. In both cases, Bonferroni corrections were applied, and corrected p-values (pc) are reported. Estimates of effect size are reported as partial eta squared (η²p), and their magnitude was interpreted using: 0.01 (small effect), 0.06 (medium effect), and 0.14 (large effect; Cohen, 2013). Statistical analyses were performed using SPSS (version 28.0). Where the assumption of sphericity was violated, the Greenhouse-Geisser correction was applied to the reported degrees of freedom.
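Partial eta squared can be recovered directly from any reported F statistic and its degrees of freedom via the standard conversion η²p = (F · df1) / (F · df1 + df2). A minimal sketch, checked against two of the F values reported in the Results:

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom:
    eta_p^2 = (F * df1) / (F * df1 + df2)."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# Main effect of display type on strength: F(2,186) = 69.62
print(round(partial_eta_squared(69.62, 2, 186), 3))  # 0.428

# Display type x emotion interaction on strength: F(6,558) = 7.86
print(round(partial_eta_squared(7.86, 6, 558), 3))   # 0.078
```

Both values reproduce the effect sizes reported below, which is a useful sanity check when transcribing ANOVA output.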

Data Availability Statement

In line with Journal Article Reporting Standards (JARS; Kazak, 2018), we report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study. The dataset and all stimuli used in the current study are available in the Open Science Framework repository: https://doi.org/10.17605/osf.io/bjvty. This study was not preregistered.

Results

We conducted preliminary analyses on strength and genuineness ratings to explore potential influences of gender and clinical diagnosis, as prior studies suggest these factors may affect face perception (Feuerriegel et al., 2015; Fischer et al., 2018). Results indicated no significant differences between male and female participants when adding gender as a between-subjects factor (p’s > 0.7). Likewise, no interaction effects were observed when adding clinical diagnosis, mood disorder, or anxiety disorder as a between-subjects factor (p’s > 0.09). Analysing the data with each group removed produced the same patterns of results as for the main analysis with no exclusions. Thus, our analysis was collapsed across gender and clinical diagnosis.
We conducted correlational analysis to establish whether strength and genuineness influenced each other. There was a weak correlation between strength and genuineness, r(94) = 0.306, p = .003.

Expression Strength

Strength ratings for each condition can be seen in Fig. 4. ANOVA revealed a significant main effect of display type (F(2,186) = 69.62, p < .001, η²p = 0.428). Participants evaluated dynamic morphs as having less strength than both videos and static photographs (both pc’s < .001). The difference in strength ratings between videos and static photographs was not statistically significant (pc = .241).
There was also a significant main effect of emotion (F(2.21, 205.52) = 107.55, p < .001, η²p = 0.536). Participants rated fearful expressions as stronger than angry, happy, and sad expressions (all pc’s < .003). Happy expressions were rated as stronger than angry and sad expressions (pc’s < .001), while angry expressions were rated as stronger than sad expressions (pc’s < .001).
Importantly, there was a significant interaction between display type and emotion (F(6,558) = 7.86, p < .001, η²p = 0.078). Within each emotion, photographs and videos were rated as stronger than dynamic morphs (all pc’s < .001). There were no significant differences between photographs and videos for fearful (pc = .463) or happy (pc = .807) expressions, while static photographs were rated as stronger than videos for angry and sad expressions (all pc’s < .002).

Emotion Genuineness

Genuineness ratings for each condition can be seen in Fig. 5. ANOVA revealed a significant main effect of display type (F(1.45,127.37) = 16.14, p < .001, η²p = 0.155). Dynamic morphed emotions were rated as less genuine than both video recorded (pc < .001) and static emotions (pc < .001). No significant difference in perceived genuineness was found between video recorded and static emotions (pc = .319).
There was also a significant main effect of emotion (F(2.31,203.28) = 139.44, p < .001, η²p = 0.613). Participants evaluated happiness as more genuine than anger, fear, and sadness (all pc’s < .001). Sadness was perceived as more genuine than anger (pc = .023) and fear (pc = .002). The perceived genuineness of anger and fear was not significantly different (pc = .288).
There was a significant interaction between display type and emotion (F(6,552) = 46.108, p < .001, η²p = 0.334). Dynamic morphed happy expressions were rated as less genuine than both video recorded (pc < .001) and static happy expressions (pc < .001). Genuineness ratings for static and video recorded happiness were not significantly different (pc = 1.00). Fearful videos were rated as more genuine than static fear (pc = .096) and dynamic morphed fear (pc = .096), though these differences did not survive Bonferroni correction. There were no significant effects between display types for angry (pc’s = 1.00) or sad expressions (pc’s > .213).

Discussion

We examined the perceived strength and genuineness of static, dynamic morphed, and video-recorded facial expressions, given the rising trend of synthetic dynamics in face perception research. We created dynamic morphs and static images from video frames, yielding stimuli with identical neutral and peak expressions that varied only in the dynamics between. We expected that video-recorded expressions would be perceived as stronger and more genuine than dynamic morphs and static images. Participants viewed videos (all emotions) as stronger than dynamic morphs, and they viewed happy videos as more genuine than happy dynamic morphs. Surprisingly, participants viewed static photographs and video recordings as equally strong for happy and fearful expressions, and equally genuine for sadness, anger, and fear. In this discussion, we will examine hypotheses relating to strength ratings, followed by genuineness ratings, before discussing the findings collectively and evaluating their implications.
Ratings of strength and genuineness were weakly correlated. Requiring participants to make both judgements for each stimulus may have contributed to their correlation in the current study, despite instructions that they were distinct concepts. However, previous research shows that perceived strength and genuineness are weakly correlated even when asked separately, especially for smiles (Dawel et al., 2015). We therefore interpret the two concepts separately below.

The Perception of Strength

Participants viewed emotions in videos as stronger than in dynamic morphs. Whereas Korolkova (2018b) showed that dynamic morphs and videos have different time-inversion effects, our findings demonstrate that social judgements of morphed emotions differ from videos, and that this can occur with a temporal resolution as low as 25fps. As with studies using 3D avatars (Wallraven et al., 2008), we show that people perceive synchronous facial motion (i.e., all features moving at once, as is characteristic of dynamic morphs) as less intense than more naturalistic asynchronous facial motion (i.e., video recordings). This insight may be worth considering for researchers who prefer dynamic morphs over 3D avatars to enhance ecological validity, particularly those interested in emotion intensity.
Participants viewed happiness and fear as similarly intense in photos and videos, while they viewed static anger and sadness as stronger than the original videos (though this effect was small). Our findings partly align with Kilts et al. (2003), who reported no difference in strength ratings between static and video recorded happy expressions. However, unlike our study, Kilts et al. also reported no difference for angry expressions. Although both studies utilised video recordings of trained actors exhibiting peak expressions, the stimuli in the Kilts et al. study depicted an actor displaying the emotion for 4 s and included head movements. It is possible that the brief neutral-to-peak transitions used in our study led to qualitative differences in static angry stimuli compared to those used in Kilts et al., resulting in a perception of greater strength in the static condition.
Other studies that have reported increased strength ratings for dynamic compared to static stimuli have primarily used dynamic morphs (e.g., Biele & Grabowska, 2006; Kamachi et al., 2013), which may not be suitable for making predictions about video recorded stimuli. However, it is surprising that static photographs were rated as equal to or stronger than video recordings. This finding suggests that improved emotion recognition (Butcher & Lander, 2017; Butcher et al., 2011) and unconscious emotion-congruent facial activity (Rymarczyk et al., 2019) for video-recorded compared to static expressions may not be related to their perceived strength. The continuous display of peak expression in static stimuli could lead to an increased perception of strength without impacting facial responses or emotion categorisation abilities.
Some emotion recognition studies have found no dynamic advantage for video-recorded happy expressions (Ambadar et al., 2009; Cunningham & Wallraven, 2009), but have shown an advantage for other expressions, such as sadness (Ambadar et al., 2009; Cunningham & Wallraven, 2009), anger, and fear (Ambadar et al., 2009). Our finding that static photographs of sadness and anger were perceived as stronger than video recordings suggests the dynamic advantage for recognising these emotions is not related to their perceived strength. Although we predicted participants would view static expressions as weaker than video recorded expressions, the fact that they were perceived as stronger can still be taken to indicate their inadequacy. Video recordings, while not a substitute for a present and interactive human, still capture the expression’s temporal sequence as it happens. Static peak expressions should be perceived similarly to the video recordings from which they were derived if they are to be validated as suitable stimuli in face perception research.
In our study, static photographs were consistently perceived as stronger than dynamic morphs for each emotion and on average across all emotions. This finding is surprising, as two studies reported dynamic morphed expressions were stronger than static counterparts (Biele & Grabowska, 2006; Rymarczyk et al., 2011). Methodological differences might explain these discrepancies. Biele and Grabowska (2006) presented static and dynamic stimuli randomly within the same task, while Rymarczyk et al. (2011) used blocks of the same kind of stimuli. In contrast, we presented static stimuli separately to avoid confusion between stimuli types. This separation could have led participants to “reset” their rationale for strength ratings between tasks.
Uono et al. (2010) found that the final image (peak) of dynamic morphs appeared more emotionally exaggerated than static facial expressions. However, this study asked participants to rate the intensity of the initial (static or dynamic) stimulus while viewing a second static image of the peak expression presented subsequently, and it is unclear whether this influenced results. More in line with our findings, Kamachi et al. (2013) reported no increase in intensity for dynamic morphs compared to static photographs. These inconsistent findings underscore the need for further research to clarify the conditions under which dynamic morphs might be perceived as stronger or weaker than static photographs.
The dynamic morphs used in the current study transitioned from neutral to peak emotion. Some dynamic morphs are truncated, ending before the peak expression is reached (e.g., Calvo et al., 2016). This allows researchers to generate more challenging emotion recognition tasks, which are typically too easy when the peak expression is perceived, leading to ceiling effects (Kamachi et al., 2013). Truncated dynamic morphs are assumed to portray a lower strength/intensity version of the full expression. It is unclear whether such truncated expressions are similar to a low intensity video-recorded emotion. In any case, our findings suggest that dynamic morphs do not adequately portray expression strength, relative to both photos and videos, even when they end on a true photograph.

The Perception of Genuineness

Participants viewed happiness in videos and photos as more genuine than other emotions, and more genuine than happy morphs. They viewed anger, fear, and sadness as similarly genuine across display types. We measured genuineness due to its high social value (Zloteanu et al., 2018) and to avoid making participants aware of artificially animated stimuli through questions on naturalism. However, the video database used in the current study portrayed actors who were coached to portray emotions naturally and accurately (van der Schalk et al., 2011). Hence, there is no “correct” answer, as we do not know how genuinely each emotion was felt by the actors. Such posed expressions are in some sense disingenuous, and are viewed as less genuine than spontaneously induced emotions (Krumhuber & Manstead, 2009; Zloteanu et al., 2018). This may account for the similarities between stimulus types for anger, fear, and sadness, which might be differentiated for spontaneous expressions.
Smiles hold a unique position among emotional expressions, as they serve multiple purposes: they signify genuine happiness and convey many social cues, such as shared understanding (Martin et al., 2017). Duchenne smiles, which involve additional muscle movements that cause wrinkling near the eyes (crow’s feet), have been said to signify true spontaneous happiness (Ekman et al., 1990), and are perceived differently from posed and false smiles (Gunnery & Ruben, 2016). However, Duchenne smiles occur in both spontaneous and posed conditions, and posed Duchenne smiles need not accompany positive feelings (Krumhuber & Manstead, 2009). It is therefore possible that, compared to other emotions, happiness was more genuinely felt, or more convincingly faked, by the ADFES actors used in the current study. In any case, unlike dynamic morphs, photos appear capable of conveying the perceived genuineness of smiles observed in videos. Again, it appears that adding computer-generated motion to photographs makes them less similar to a video, perhaps because it removes our ability to imagine the naturalistic motion that generated the expression.
Studies that measure similar constructs, such as naturalness and realism, may provide insight into our findings. Oda and Isono (2008) found that non-linear expression trajectories were perceived as more “natural” than linear trajectories overall, consistent with our findings for happiness. However, this effect was absent for sadness, which was perceived as highly realistic for both linear and S-shaped functions. While these findings do not explain our null results for anger and fear, they do indicate that social perceptions of linear versus naturalistic facial motion depend on the emotion being expressed. This is also consistent with findings from McLellan et al. (2010), who found that participants could reliably detect whether video-recorded facial expressions were genuinely felt or simulated, although the pattern was not consistent across emotions, suggesting emotion-specific sensitivity.
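The contrast between the linear trajectories of dynamic morphs and the S-shaped functions examined by Oda and Isono (2008) can be made concrete with interpolation weights. The sketch below is our own illustrative code, not taken from any cited study: a logistic easing curve with a slow onset and slow settling, compared against the constant-velocity ramp of a typical morph (the `steepness` parameter is an assumed free choice).

```python
import numpy as np

def linear_weights(n_frames):
    """Constant-velocity blend, as used by a standard dynamic morph."""
    return np.linspace(0.0, 1.0, n_frames)

def s_shaped_weights(n_frames, steepness=10.0):
    """Logistic (S-shaped) easing: slow onset, fast middle, slow settling,
    loosely mimicking the non-linear trajectories of real expressions."""
    t = np.linspace(0.0, 1.0, n_frames)
    raw = 1.0 / (1.0 + np.exp(-steepness * (t - 0.5)))
    # Rescale so the curve starts exactly at 0 and ends exactly at 1.
    return (raw - raw[0]) / (raw[-1] - raw[0])

lin = linear_weights(11)
s = s_shaped_weights(11)
```

Both schedules begin on the neutral frame and end on the peak, so single-frame comparisons cannot distinguish them; only the temporal profile differs, which is precisely the property our results suggest observers are sensitive to.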
The unnatural motion of dynamic morphs may be particularly evident for expression changes that reveal previously hidden or less visible features. This could explain why morphs are rarely used to display blinking, as opening the eyes uncovers the iris; similarly, toothy smiles expose the teeth, which are absent from the neutral image. This limitation is inherent to creating neutral-to-peak morphed expressions (Calder et al., 1996), even with precise landmark placement. It is a significant disadvantage for dynamic morphs, as studies measuring emotion recognition for partially obscured videos have highlighted the importance of the mouth region (Blais et al., 2012), especially for happy expressions (Hoffmann et al., 2013). Further, mouth openness can influence the perceived meaning of a smile (e.g., amused, nervous, or polite smiling; Ambadar et al., 2009). While the literature generally shows differences in the perception of dynamic morphed and static emotional expressions (e.g., Calvo et al., 2016; Recio, 2013), our results indicate that these dynamic effects differ from those of naturalistic facial emotion.

Constraints on Generality and Future Directions

Our discussion of ecological validity has focused on the biological accuracy of stimulus dynamism. Experiments which contain complex, dynamic, naturalistic stimuli can nevertheless lack ecological validity if the experimental environment fails to emulate the real-world situation of interest (Shamay-Tsoory & Mendelsohn, 2019). It is possible that the effects of naturalistic facial dynamism are contingent on the ecological validity of the experimental task and environment (Risko et al., 2012). Future research may assess the effects of realistic dynamism in naturalistic and interactive settings.
There is evidence that education level (Demenescu et al., 2014) and cultural background (Engelmann & Pogosyan, 2013) influence face perception. While an effort was made to recruit a diverse range of participants, the online posts that gained the most engagement were primarily in student groups for universities located in Australia, Europe, and the United States. This is reflected in our country-of-residence data and likely contributes to the overrepresentation of Western, educated, industrialized, rich, and democratic (WEIRD) populations in research (see Roberts et al., 2020). Additionally, we showed actors of Northern European and Mediterranean ethnic backgrounds, which may not fully capture the diversity of facial expressions encountered in daily life, particularly in multicultural societies such as those from which our participants were drawn. Notably, the perception of other races and ethnicities may differ for similar samples (Shriver et al., 2008). As the actors were culturally Dutch and emotional expressions can differ by culture (Srinivasan et al., 2016), further research on non-Western expressions is needed.

Conclusions

Overall, our findings indicate that dynamic morphed stimuli are perceived differently from video recordings and may not be suitable replacements, despite their ease of alteration for various experimental tasks. The difference in ratings of expression strength and genuineness observed between static photographs and dynamic morphs appears to stem from the morphs’ unnatural, synchronous linear motion, rather than from any ability to represent realistic facial dynamics. This conclusion contrasts with suggestions that morphs are a suitable proxy for video stimuli (Wehrle et al., 2000) and that they offer increased ecological validity over static photographs (e.g., Gehb et al., 2022; Mayes et al., 2009). It has important implications for studies that use dynamic morphs to explore the effect of facial dynamism (e.g., Calvo et al., 2016; Recio, 2013): if dynamic morphs are indeed perceived differently from video recordings, as our findings suggest, such studies would instead be measuring an artificial morphed advantage. Our findings also have implications for studies using dynamic morphs to explore the neural mechanisms of face perception (e.g., Kessler et al., 2011; Prochnow et al., 2013), as well as those seeking to explore deficits in face perception in clinical populations (e.g., Darke et al., 2021; Hadjikhani et al., 2017; Lassalle et al., 2017; Simões et al., 2018). Our results underscore that humans are highly sensitive to the temporal dynamics of real and artificial facial motion. Future research should consider this sensitivity when selecting stimuli, particularly if aiming to approximate real-world interactions.

Declarations

Competing Interests

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://​creativecommons.​org/​licenses/​by/​4.​0/​.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References

Bartlett, M. S., Littlewort, G., Frank, M., Lainscsek, C., Fasel, I., & Movellan, J. (2006). Fully automatic facial action recognition in spontaneous behavior. In 7th International Conference on Automatic Face and Gesture Recognition (FGR06) (pp. 223–230). IEEE. https://doi.org/10.1109/FGR.2006.55
Cosker, D., Krumhuber, E., & Hilton, A. (2010). Perception of linear and nonlinear motion properties using a FACS validated 3D facial model. In Proceedings of the 7th Symposium on Applied Perception in Graphics and Visualization, Los Angeles, USA. https://doi.org/10.1145/1836248.1836268
Curio, C., Breidt, M., Kleiner, M., Vuong, Q. C., Giese, M. A., & Bülthoff, H. H. (2006). Semantic 3D motion retargeting for facial animation. In Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, Boston, Massachusetts, USA. https://doi.org/10.1145/1140491.1140508
De Stefani, E., Ardizzi, M., Nicolini, Y., Belluardo, M., Barbot, A., Bertolini, C., Garofalo, G., Bianchi, B., Coudé, G., & Murray, L. (2019). Children with facial paralysis due to Moebius syndrome exhibit reduced autonomic modulation during emotion processing. Journal of Neurodevelopmental Disorders, 11(1), 1–16. https://doi.org/10.1186/s11689-019-9272-2
Dong, Z., Wang, G., Lu, S., Yan, W. J., & Wang, S. J. (2021). A brief guide: Code for spontaneous expressions and micro-expressions in videos. In Proceedings of the 1st Workshop on Facial Micro-Expression: Advanced Techniques for Facial Expressions Generation and Spotting, Ottawa, Canada. https://doi.org/10.1145/3476100.3484464
Ekman, P., Davidson, R. J., & Friesen, W. V. (1990). The Duchenne smile: Emotional expression and brain physiology: II. Journal of Personality and Social Psychology, 58(2), 342.
Ekman, P., & Friesen, W. V. (1978). Facial action coding system. Environmental Psychology & Nonverbal Behavior.
Fiorentini, C., Schmidt, S., & Viviani, P. (2012). The identification of unfolding facial expressions. Perception, 41(5), 532–555.
Greco, C., Romani, M., Berardi, A., De Vita, G., Galeoto, G., Giovannone, F., Vigliante, M., & Sogos, C. (2021). Morphing task: The emotion recognition process in children with attention deficit hyperactivity disorder and autism spectrum disorder. International Journal of Environmental Research and Public Health, 18(24), 13273. https://doi.org/10.3390/ijerph182413273
Griffiths, S., Jarrold, C., Penton-Voak, I. S., Woods, A. T., Skinner, A. L., & Munafo, M. R. (2019). Impaired recognition of basic emotions from facial expressions in young people with autism spectrum disorder: Assessing the importance of expression intensity. Journal of Autism and Developmental Disorders, 49(7), 2768–2778. https://doi.org/10.1007/s10803-017-3091-7
Simões, M., Monteiro, R., Andrade, J., Mouga, S., França, F., Oliveira, G., Carvalho, P., & Castelo-Branco, M. (2018). A novel biomarker of compensatory recruitment of face emotional imagery networks in autism spectrum disorder. Frontiers in Neuroscience, 12. https://doi.org/10.3389/fnins.2018.00791
Tcherkassof, A., Bollon, T., Dubois, M., Pansu, P., & Adam, J. M. (2007). Facial expressions of emotions: A methodological contribution to the study of spontaneous and dynamic emotional faces. European Journal of Social Psychology, 37(6), 1325–1345. https://doi.org/10.1002/ejsp.427
Metadata

Title: A Dynamic Disadvantage? Social Perceptions of Dynamic Morphed Emotions Differ from Videos and Photos
Authors: Casey Becker, Russell Conduit, Philippe A. Chouinard, Robin Laycock
Publication date: 13.01.2024
Publisher: Springer US
Published in: Journal of Nonverbal Behavior
Print ISSN: 0191-5886
Electronic ISSN: 1573-3653
DOI: https://doi.org/10.1007/s10919-023-00448-3