Distinct neuronal populations process emotions in different domains: A study on non-verbal vocalisations and facial expressions

Abstract

Emotional information is often presented simultaneously in different domains and needs to be integrated rapidly. Previous research has suggested that distinct neuronal populations are responsible for processing separate emotions in both the visual and auditory domains; however, it is unclear whether the same neuronal populations process distinct emotions across domains. To investigate this we employed an adaptation paradigm, which works by desensitising the neurons associated with processing one stimulus and then measuring responses to a second stimulus: if the two stimuli are processed by the same neuronal populations, responding to the second stimulus will be impaired due to neural fatigue. This study considers whether adaptation effects can be found between stimuli from different domains: non-verbal emotional vocalisations and facial expressions. Experiment one examines adaptation in the visual domain alone using three emotional categories (anger, fear and sadness) and confirms the findings of previous studies that distinct neuronal populations are responsible for the processing of emotion in this domain. Experiment two uses the same emotions and methodology as experiment one but employs non-verbal vocalisations in place of the adapting images. No cross-modal adaptation effects occurred; however, prior exposure to an emotional vocalisation impaired recognition of different emotions in facial expressions. This suggests that whilst the same neuronal populations do not represent facial and vocal emotional expressions, there is some communication between the neurons responsible for processing emotions in different domains. No improvement in emotion recognition was found after exposure to a same-emotion vocalisation, but this may be due to ceiling effects, and pre-exposure to emotional vocalisations may prime recognition of the same emotion in facial expressions.

Introduction

Ekman (1972) argued that some emotions and the expressions associated with them are innate, descending from the communication mechanisms used by early humans. There is some debate in the literature as to which specific emotions should be included in this theory; however, the six proposed by Ekman (1972) and generally accepted are anger, fear, sadness, happiness, disgust and surprise. Significant support for this theory has been provided by cross-cultural studies: Ekman (1980) found that across five countries, judgements of facial expressions correlated both within and between cultures. Furthermore, natives of New Guinea who had never been exposed to western culture were able to identify which facial expressions would be appropriate to different situations in a way that was consistent with western judgements. Sauter, Eisner, Ekman and Scott (2010) extended these findings to emotional vocalisations, suggesting that the association between non-verbal vocalisation features and specific emotions is recognised universally. Some critics argue that because the correlations found between cultures were not perfect, the expressions may not be truly universal; however, Ekman (1994) argued that recognition of expressions across cultures at a rate significantly higher than chance was sufficient. Some cross-cultural variation could be expected due to differences in which situations are likely to elicit specific emotions in different cultures (i.e. a stimulus that one culture reacts to with anger may be reacted to with fear in another), but this difference in responding does not change the nature of the emotion felt and displayed. Furthermore, cultures may vary in what they deem appropriate when expressing emotions: some may frown upon expressing emotions such as sadness publicly.

Recognition of emotions is thought to be essential for social development and even survival (Ekman, 1992). Emotional cues are often presented simultaneously through multiple modalities across different domains, such as facial expressions (visual domain) and non-verbal vocalisations (auditory domain), which for the purposes of this research are defined as non-speech sounds that convey emotion through variations in prosody. To ensure processing and responding is both rapid and correct, information from all modalities must be considered; however, it is unclear how information from these separate modalities is processed and integrated within the brain. Several researchers have argued that the most effective arrangement would be for neuronal populations to be organised around emotional categories, with individual neurons able to process and respond to multiple inputs (Carroll and Young, 2005), in order to cope with the temporal demands of constantly monitoring and responding to stimuli that can change rapidly. De Gelder and Vroomen (2000) provided support for this form of neural organisation, finding that when asked to classify faces, participants were influenced by a co-occurring emotional sound even when explicitly told to ignore it. This relationship was bidirectional, suggesting automatic integration of signals from different modalities.

In support of the presence of multi-modal neurons, specific brain areas that respond to multi-modal emotional stimuli have been identified. Calvert, Spence and Stein (2004) argued that in order to be considered fully multi-modal, a region should show a supra-additive response to stimuli from different modalities (in the case of this study, auditory and visual), meaning a higher level of response is recorded than the sum of the responses to purely auditory or purely visual stimuli. Hagan et al (2009) found that the right posterior superior temporal sulcus (pSTS) meets this criterion and responded within 200ms of congruent facial expressions and vocalisations being presented, suggesting that the integration of emotional information from different modalities is automatic. Furthermore, Watson et al (2014) found that fMRI signals in the right pSTS were reduced in response to a facial expression when it was preceded by a congruent emotional vocalisation, providing preliminary evidence that emotional representation and integration in this area may be attributable to populations of multimodal neurons specific to each emotion category, as opposed to interwoven populations of unisensory neurons.
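The supra-additivity criterion reduces to a simple comparison. The sketch below is our illustration of that comparison, not code from Calvert, Spence and Stein (2004); the function and variable names are hypothetical.

def is_supra_additive(r_av: float, r_a: float, r_v: float) -> bool:
    """Criterion for a fully multi-modal region: the response to a
    combined audio-visual stimulus must exceed the sum of the
    responses to the audio-only and visual-only stimuli."""
    return r_av > r_a + r_v

# e.g. a region responding at 1.5 to audio-visual pairs but only 0.6
# to auditory and 0.7 to visual stimuli alone would count as supra-additive
print(is_supra_additive(1.5, 0.6, 0.7))  # True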

The amygdala has been shown to respond to various emotional facial expressions (Fitzgerald, Angstadt, Jelsone, Nathan and Phan, 2006) and vocalisations (Fecteau, Belin, Joanette and Armony, 2007). However, its primary role is thought to be in fear processing: Dolan, Morris and de Gelder (2001) found that whilst increased amygdala activation occurred in response to all congruent pairs of emotional facial expressions and vocalisations compared to incongruent pairs, this effect was primarily driven by fear pairs. Furthermore, Aube, Angulo-Perkins, Peretz, Concha and Armony (2015) found a cluster of neurons within the left amygdala that responded exclusively to fear stimuli regardless of domain. In addition to these brain imaging findings, Scott et al (1997) reported the case of a patient with bilateral amygdala lesions who showed impaired recognition of both fearful and angry faces and vocalisations, whilst Adolphs, Tranel, Damasio and Damasio (1994) reported the case of patient SM, who was impaired in recognising fearful facial expressions due to bilateral calcification of the amygdala. Whilst early reports suggested SM showed intact recognition of emotional prosody, later follow-ups showed impaired recognition of fearful vocalisations (Feinstein, Adolphs, Damasio and Tranel, 2011). It therefore appears that the amygdala is primarily implicated in processing fearful stimuli across domains but also has a role in the processing of other emotions. However, whilst case studies are undoubtedly useful in providing human evidence that would otherwise be unobtainable, brain injuries are rarely confined neatly to one specific area: it could instead be that anatomically close but distinct regions, or even neurons within the same region, process facial and vocal emotional expressions independently.

In order to explicitly test whether the same neurons are responsible for emotion processing in each domain, previous studies have employed an adaptation paradigm, which works by desensitising the neurons responsible for responding to a specific stimulus (the adapting stimulus) before measuring responses to a test stimulus (Webster and Macleod, 2011). If both the adapting and test stimuli are processed by the same distinct neuronal populations, pre-exposure to the adapting stimulus will reduce responding to the test stimulus, as these neurons have already been activated and are therefore fatigued. If the two stimuli are processed by separate populations of neurons, responding to the test stimulus will be unaffected by pre-exposure to the adapting stimulus.
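To make the fatigue logic concrete, here is a minimal toy simulation (our illustration under simplifying assumptions, not a model from the literature): each emotion category has its own channel whose gain is reduced by prior exposure, so a related test stimulus meets a fatigued channel whilst an unrelated one does not.

EMOTIONS = ["anger", "fear", "sadness"]

def adapt(gains, adapting_emotion, fatigue=0.4):
    """Fatigue the channel driven by the adapting stimulus; the
    fatigue value of 0.4 is an arbitrary illustration."""
    adapted = dict(gains)
    adapted[adapting_emotion] *= (1.0 - fatigue)
    return adapted

gains = {e: 1.0 for e in EMOTIONS}   # fresh, unadapted channels
gains = adapt(gains, "fear")         # pre-expose to a fear stimulus

print(gains["fear"])     # 0.6 -> a related test stimulus meets a fatigued channel
print(gains["sadness"])  # 1.0 -> an unrelated test stimulus is unaffected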

Hsu and Young (2004) conducted a series of experiments and confirmed that adaptation effects in the recognition of facial expressions are relatively robust, being maintained across changes in stimulus size and identity. They found that when pre-exposed to an image of a happy, sad or fearful facial expression, participants were less accurate in identifying a subsequent image of the same emotion, suggesting that distinct neuronal populations are responsible for the processing of different emotions in facial expressions. Hsu and Young's (2004) finding that adaptation effects are maintained when the adapting and test stimuli vary in size is particularly important, as it suggests that adaptation occurs in higher visual areas beyond the primary visual cortex, and that adaptation is to the emotional expression itself rather than to lower-level features such as orientation and spatial frequency.

Hsu and Young (2004) also considered the timings required to produce or abolish adaptation effects. They found that even when the interval between the presentation of the adapting stimulus and the test stimulus was extended from 100ms to 1000ms, the adaptation effects remained. However, when the adapting stimulus was presented for 500ms, as opposed to 5000ms in their original experiment, and the interval between the two stimuli was set at 1000ms, adaptation effects were abolished. This suggests that the effects found in their earlier experiments were due to neuronal fatigue rather than a criterion shift in what constitutes a specific emotion: reducing prior stimulation would reduce the level of neuronal fatigue and therefore the strength of adaptation effects, whereas the duration of the adapting stimulus would be irrelevant to the development of a criterion shift.

Evidence of adaptation effects has also been found in the auditory domain, although the literature in this area is more limited. Bestelmeyer et al (2010) found that adaptation to an angry vocalisation caused a subsequent ambiguous vocalisation to be classified as more fearful in a forced-choice task. Furthermore, different voices were used for the adapting and test stimuli, suggesting that adaptation was to the emotion rather than to the qualities of the voice itself. If the emotion being adapted to matters more than how it is presented, this could suggest that the modality in which the emotion is presented is also less important than the emotion itself, and therefore that adaptation effects may be consistent when tested across modalities (i.e. adaptation to an emotional vocalisation could produce adaptation effects in the identification of emotional facial expressions, and vice versa). If this were the case, the same distinct neuronal populations would be responsible for processing emotions across modalities, instead of there being distinct neuronal populations for each domain. This appears plausible when considering both the biological evidence of activation in the same brain areas for emotional stimuli in different modalities and the irrelevance of lower-level stimulus features described by Bestelmeyer et al (2010) and Hsu and Young (2004).

Until very recently the occurrence of cross-modal adaptation effects in emotion recognition had not been considered. Wang et al (2017) provided the first study explicitly focusing on this area and found that when pre-exposed to the sound of laughter, participants were more likely to identify a following face as sad in a forced-choice happy-sad task; however, this effect was significantly weaker than when participants were pre-exposed to a happy face. Thus, whilst providing evidence that cross-modal adaptation effects do occur, this suggests they are not as strong as single-domain effects.

The reasons for this weaker relationship are unclear. It could be that some neurons are involved in the processing of distinct emotions regardless of the modality in which they are presented, but that other modality-specific neurons are also involved, meaning a proportion of, but not all, the neurons involved in processing the facial test stimuli were fatigued. Alternatively, it could be that auditory recognition is generally poorer than visual recognition (Cohen, Horowitz and Wolfe, 2009); however, this effect was found to be attention-dependent, and stimuli in both modalities could be recognised with over 95% accuracy when full attention was allocated. The test paradigm used by Wang et al (2017) and other previous studies of adaptation effects explicitly requires participants to focus their attention on the stimuli being presented and only presents one modality at a time, meaning lack of attention is unlikely to be responsible for this reduced effect.

Furthermore, Wang et al (2017) only used opposing emotions (happy and sad) and only used happiness as an adapting stimulus: they did not investigate whether pre-exposure to a sad vocalisation would influence judgements of test faces. It has previously been argued that people are more accurate in distinguishing pleasant from unpleasant facial expressions than in categorising emotion-specific expressions (Russell, 1994), suggesting that the adaptation effects found could relate to valence rather than to distinct emotional categories. Furthermore, Fernandez-Dols and Crivelli (2013) argued that, of Ekman's six core emotions, happiness is the most easily recognisable in facial expressions. It is therefore unclear whether cross-modal adaptation effects would still be found if the adapting and test stimuli came from more similar emotion categories.

For this reason we used sadness, fear and anger as adapting and test stimuli, in order to establish whether Wang et al's (2017) findings generalise to all emotions or merely represent adaptation between pleasant and unpleasant emotions. The main aim of this study is to investigate whether the same neuronal populations are responsible for processing and responding to different emotions across different modalities (faces and vocalisations). Experiment one aims to replicate the study of Hsu and Young (2004) using the emotions fear, sadness and anger, in order to confirm that adaptation effects are seen across these facial expressions. It is hypothesised that this experiment will produce results similar to those of Hsu and Young (2004) and demonstrate an adaptation effect in the facial domain. Experiment two will employ the same methodology as experiment one but will instead consider whether adaptation effects in judging the emotions portrayed in facial expressions can be induced through pre-exposure to emotional vocalisations. It is hypothesised that cross-modal adaptation effects will be found, based on the preliminary findings of Wang et al (2017) and evidence that emotional stimuli from different modalities are often processed in the same brain areas.

Method

Experiment 1

Participants

Thirty participants completed the experiment; however, three were excluded because their scores were more than two standard deviations from the mean. Further information regarding this exclusion criterion is provided in the results section. Data from the remaining twenty-seven participants (twenty-five females and two males) were used in the analysis. All participants were students at the University of Leeds with a mean age of 19.19 years (SD=1.21), and had normal or corrected-to-normal vision and hearing. The majority of participants were Psychology students recruited through the SONA system (Leeds University Participant Pool Scheme) in exchange for course credits. Other participants were recruited through word of mouth and received no reward.

All participants signed a consent form confirming they were happy to take part and were aware they could withdraw from the study at any point.

Materials

Three continua of morphed facial expressions of the model JJ were taken from the FEEST database (Young, Perrett, Calder, Sprengelmeyer and Ekman, 2002): neutral to fearful expression, neutral to angry expression and neutral to sad expression. Three experimental adapting images were used: model JJ displaying a prototypical fearful, angry or sad expression at 100% morph. For the baseline control condition, a grey screen was presented in place of the adapting stimuli.

For the test stimuli, the neutral expression in each continuum was morphed 10%, 20%, 30%, 40%, 50% or 60% towards the emotional expression, giving 18 different test stimuli, each of which was presented with each adapting stimulus five times. The stimuli were presented on a laptop screen at an eye-screen distance of approximately 60cm. All stimuli can be seen in Figure 1.

Design

The study was a within-subjects design, with the independent variable being the relatedness of the adapting and test stimuli. This variable had three levels: related (adapting and test stimuli displayed the same emotion), unrelated (adapting and test stimuli displayed different emotions) and neutral (a baseline image was presented in place of an adapting image). Each level was created by averaging scores across morph levels and emotions in order to focus on the overall adaptation effect. The dependent variable was the percentage of correct answers given. This research was subject to the ethical guidelines set out by the British Psychological Society and was approved by the University of Leeds, School of Psychology (ethics application number 17-0240, date of approval 14/09/2017).
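The collapsing step might look as follows, assuming a trial-level table with hypothetical column names ('participant', 'adapter', 'test_emotion', 'correct'); this is a sketch of the logic described above, not the analysis code used.

import pandas as pd

def relatedness(adapter: str, test_emotion: str) -> str:
    """Derive the three-level factor from the adapter/test pairing."""
    if adapter == "control":
        return "neutral"
    return "related" if adapter == test_emotion else "unrelated"

def condition_means(trials: pd.DataFrame) -> pd.DataFrame:
    """Average accuracy over morph level and emotion, leaving one
    score per participant per relatedness level."""
    trials = trials.assign(relatedness=[
        relatedness(a, e)
        for a, e in zip(trials["adapter"], trials["test_emotion"])])
    return (trials.groupby(["participant", "relatedness"])["correct"]
                  .mean()
                  .unstack())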

Figure 1. All adapting and test stimuli used in experiment one. Images taken from the FEEST database (Young, Perrett, Calder, Sprengelmeyer and Ekman, 2002)

Procedure

The experiment lasted approximately forty-five minutes and was made up of 360 trials: 90 related, 90 neutral and 180 unrelated.

[Figure 1 grid: rows for anger, fear and sadness; columns for the test stimuli at 10%, 20%, 30%, 40%, 50% and 60% morph, the 100% adapting stimulus, and the grey-screen control.]

Each trial began with the presentation of an adapting stimulus for 5000ms, which participants were instructed to inspect continuously (no specific fixation point was given). This was either a prototypical fearful, angry or sad expression, or the baseline (control) image. It was followed by an interstimulus interval (ISI) of 100ms, which immediately preceded the test stimulus: one of the morphed expressions from the three continua, presented for 50ms. A black screen was then presented and participants were given 2500ms to respond by pressing 'A', 'F' or 'S' on the keyboard to indicate which emotion they believed they had seen in the test stimulus. For participants' convenience, and to enable fast responding, the keyboard keys I, O and P were relabelled with these letters. If a participant failed to respond in the time given, the experiment progressed to the next trial with no response recorded.
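This trial timeline could be implemented in PsychoPy roughly as below. This is a hypothetical reconstruction from the description above (the original implementation is not reported); the window setup, image paths and key mapping are assumptions.

from psychopy import visual, core, event

win = visual.Window(fullscr=True, color="grey")
black = visual.Rect(win, width=2, height=2, units="norm", fillColor="black")

def run_trial(adapter_path, test_path):
    adapter = visual.ImageStim(win, image=adapter_path)
    test = visual.ImageStim(win, image=test_path)

    adapter.draw(); win.flip(); core.wait(5.0)   # adapting stimulus, 5000ms
    win.flip(); core.wait(0.1)                   # blank interstimulus interval, 100ms
    test.draw(); win.flip(); core.wait(0.05)     # test stimulus, 50ms

    black.draw(); win.flip()                     # black response screen
    # keys I, O and P were relabelled A, F and S; 2500ms response window
    keys = event.waitKeys(maxWait=2.5, keyList=["i", "o", "p"])
    return keys[0] if keys else None             # None = no response recorded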

The experiment was split into five blocks of 72 trials, with participants receiving four breaks; no performance feedback was given at any point. Each unique adapting-test stimulus pairing was presented once per block (and therefore five times in total), in a random order that was consistent across participants.
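Under these constraints the trial list can be built as below: 18 test stimuli crossed with four adapting conditions give the 72 unique pairings per block, and five blocks give the 360 trials (90 related, 90 neutral, 180 unrelated) stated above. The seed value is hypothetical, chosen only so that the shuffled order is reproducible and therefore identical across participants.

import itertools
import random

EMOTIONS = ["anger", "fear", "sadness"]
MORPHS = [10, 20, 30, 40, 50, 60]
ADAPTERS = EMOTIONS + ["control"]   # control = grey screen

# 18 test stimuli x 4 adapting conditions = 72 unique pairings per block
pairings = [(a, e, m) for a in ADAPTERS
            for e, m in itertools.product(EMOTIONS, MORPHS)]

rng = random.Random(17)             # fixed seed -> same order for every participant
blocks = []
for _ in range(5):                  # five blocks, each pairing once per block
    block = pairings[:]
    rng.shuffle(block)
    blocks.append(block)

trials = [t for block in blocks for t in block]
related = sum(a == e for a, e, _ in trials)
neutral = sum(a == "control" for a, _, _ in trials)
print(len(trials), related, neutral)   # 360 trials: 90 related, 90 neutral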

Participants were verbally debriefed once the experiment was complete and given the opportunity to ask any questions. They were provided with the researcher’s contact details in case they had any queries or concerns at a later point.

Trial type | Adapting stimulus (5000ms) | Test stimulus (50ms)
Related | Prototypical fear expression | 30% morphed fear expression
Unrelated | Prototypical sad expression | 50% morphed angry expression
Neutral | Grey screen (control) | 20% morphed sad expression

(Each trial included a 100ms ISI between the adapting and test stimuli, followed by a 2500ms alternative forced choice response window.)
Figure 2: A visual example of each type of trial. Images taken from FEEST database (Young, Perrett, Calder, Sprengelmeyer and Ekman, 2002)

Experiment 2

Participants

As in experiment one, thirty participants initially completed the experiment; however, two were removed from the analysis because their data were more than two standard deviations from the mean. Data from the remaining twenty-eight participants (twenty-six females and two males) were used in the analysis. All participants were students at the University of Leeds with a mean age of 19.18 years (SD=1.22), and had normal or corrected-to-normal vision and hearing. Again, the majority of participants studied Psychology and completed the study in exchange for course credits; the remainder received no reward. All participants signed a consent form confirming their consent to take part and their knowledge that they could withdraw from the study at any point.

Materials

The test stimuli were the same as those in experiment one: angry, sad and fearful morphed facial expressions of the model JJ taken from the FEEST database (Young, Perrett, Calder, Sprengelmeyer and Ekman, 2002). In place of the prototypical facial expressions used in experiment one, the adapting stimuli were three vocalisations taken from the Montreal Affective Voices (Belin, Fillion-Bilodeau and Gosselin, 2008), expressing at 100% intensity the same emotions used as test stimuli: anger, fear and sadness. A blank grey screen was presented while these vocalisations were playing. The baseline condition was the same as in experiment one: a grey screen presented with no sound. As in experiment one, the visual stimuli were presented on a laptop screen at an eye-screen distance of approximately 60cm; the vocalisations were played through the same laptop at a medium volume.

Design

The study was a within-subjects design, with the same independent and dependent variables as experiment one: the independent variable was the relatedness of the adapting and test stimuli (related, unrelated and neutral) and the dependent variable was the percentage of correct answers given. The research was subject to the ethical guidelines set out by the British Psychological Society and was approved by the University of Leeds, School of Psychology (ethics application number 17-0240, date of approval 14/09/2017).

Procedure

The general experimental procedure, the number of trials and the length of stimulus presentation were identical to experiment one, except that emotional vocalisations were presented as adapting stimuli in place of the emotional facial expressions. All vocalisations were presented for approximately 5000ms, but due to variation in the length of the recordings the fear and anger vocalisations were played on repeat five times whilst the sad vocalisation was repeated only twice.
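Filling the roughly 5000ms adaptation window by looping a shorter recording might look as follows; this is a sketch, and the file name and the assumption that the clip length is supplied explicitly are ours.

from psychopy import sound, core

def play_adapter(path, repeats, clip_secs):
    """Play a recording back-to-back to fill the adaptation window.
    clip_secs is the approximate clip length (an assumption here);
    per the text, anger/fear looped five times and sadness twice."""
    clip = sound.Sound(path)
    for _ in range(repeats):
        clip.play()
        core.wait(clip_secs)

play_adapter("MAV_fear.wav", repeats=5, clip_secs=1.0)   # hypothetical file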

All timings were the same, except that in this experiment there was no time limit on responding and the next trial began only once a response had been given.

Results

Exclusion criteria

In both experiments, participants were removed from the analysis if their score was more than two standard deviations from the mean in at least one of the three conditions (related, unrelated or neutral). Upon inspection of the data, this pattern of responding appeared to suggest that these participants had misunderstood the task and had responded with the emotion they had observed in the adapting stimuli rather than the test stimuli.
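A minimal sketch of this exclusion rule, assuming a participants-by-conditions accuracy matrix (the data here are fake, for illustration only):

import numpy as np

def keep_mask(scores: np.ndarray) -> np.ndarray:
    """scores: (n_participants, 3) mean accuracy per condition
    (related, unrelated, neutral). Keep a participant only if no
    condition score lies more than two SDs from the group mean."""
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    return np.all(np.abs(z) <= 2, axis=1)

scores = np.random.default_rng(0).uniform(0.3, 0.9, (30, 3))  # fake data
print(f"kept {keep_mask(scores).sum()} of {len(scores)} participants")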

Experiment one

Three participants were removed from the analysis. Figure 3 shows the average scores of the remaining twenty-seven participants in the related (M=0.52, SD=0.12), unrelated (M=0.55, SD=0.11) and neutral (M=0.56, SD=0.12) conditions.

Figure 3: Percentage of correctly identified emotional faces when pre-exposed to related, unrelated or neutral facial expressions

A paired samples t-test was conducted to compare emotion recognition when pre-exposed to related and neutral faces. There was a significant difference in the scores for related (M=.55, SD=.15) and neutral (M=.6, SD=.15) conditions; t(26)=-2.249, p=.033. These results suggest that participants were worse at recognising emotions when pre-exposed to a related face compared to a neutral face.

A second paired samples t-test was conducted to compare emotion recognition when pre-exposed to unrelated and neutral faces. There was no significant difference in the scores for unrelated (M=.59, SD=.13) and neutral (M=.6, SD=.15) conditions, t(26)=-.933, p=.359. These results suggest that there was no difference in participants' ability to recognise emotions in faces when pre-exposed to an unrelated or a neutral face.
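The comparisons reported here and in experiment two are standard paired-samples t-tests and could be run as below. This is a sketch with simulated data; in the real analysis each array would hold one accuracy score per retained participant.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
related = rng.normal(0.55, 0.15, 27)   # accuracy after a related adapter
neutral = rng.normal(0.60, 0.15, 27)   # accuracy after the grey-screen control

t, p = stats.ttest_rel(related, neutral)   # paired-samples t-test, df = 26
print(f"t(26) = {t:.3f}, p = {p:.3f}")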

Experiment two

Two participants were removed from the analysis. Figure 4 shows the average scores of the remaining twenty-eight participants in the related (M=0.64, SD=0.1), unrelated (M=0.53, SD=0.11) and neutral (M=0.62, SD=0.11) conditions.

Figure 4: Percentage of correctly identified emotional faces when pre-exposed to related, unrelated or neutral emotional vocalisations

A paired samples t-test was conducted to compare recognition of emotional facial expressions when pre-exposed to a related or neutral emotional vocalisation. There was no significant difference in the scores for related (M=.7, SD=.14) and neutral (M=.67, SD=.14) vocalisations, t(27)=1.343, p=.191. These results suggest that there was no difference in participants' ability to recognise emotions in faces when pre-exposed to a related or a neutral emotional vocalisation.

Another paired samples t-test was conducted to compare recognition of emotional facial expressions when pre-exposed to an unrelated or neutral emotional vocalisation. There was a significant difference in the scores for unrelated (M=.56, SD=.13) and neutral (M=.67, SD=.14) vocalisations, t(27)=-7.839, p<.001. These results suggest that pre-exposure to an unrelated emotional vocalisation impaired recognition of emotions in faces.

Discussion

The primary aim of this study was to investigate whether cross-modal adaptation effects would occur (i.e. whether the same neurons that are responsible for processing emotions in the visual domain are also responsible for processing emotions in the auditory domain).

Experiment one found that participants were less accurate in related trials than in neutral trials, providing evidence of adaptation effects in the visual domain. This supports and adds credibility to the findings of Hsu and Young (2004), as we were able to confirm that the effects were present even when emotions of a similar, negative valence were used. However, no significant difference between performance in unrelated and control trials was found, contradicting Hsu and Young's finding that pre-exposure to one emotion can enhance recognition of other emotions. Our inability to replicate this finding could be due to a number of factors: firstly, our use of emotions with only negative valence, compared to Hsu and Young's (2004) inclusion of happiness. Whilst Hsu and Young (2004) did find predictive relationships between fear and sadness, the strength of these relationships tended to be lower than that of either emotion paired with happiness. Furthermore, we did not analyse relationships between different emotions separately but instead combined them into an overarching 'unrelated' factor, meaning smaller relationships between individual emotions may have been missed.

Contrary to our predictions, experiment two found no evidence of adaptation effects: there was no significant difference between related and neutral trials. This suggests that the same neuronal populations do not process emotions across these modalities, as fatiguing the neurons through exposure to an emotion in the auditory domain did not reduce responding to the same emotion in the visual domain.

However, despite finding no evidence of cross-modal adaptation effects, there did appear to be some relationship between pre-exposure to an emotional vocalisation and participants' responses to facial expressions. Participants gave significantly fewer correct answers on unrelated trials than on neutral trials, suggesting that pre-exposure to an incongruent vocalisation reduced accuracy in identifying facial expressions: the opposite of what would be expected from an adaptation effect.

In order to gain a deeper understanding of these surprising effects, further analysis was conducted into which response was given when an unrelated trial was answered incorrectly (i.e. whether the response matched the adapting emotion or the other, irrelevant emotion). The mean number of times participants chose the adapting emotion and the other emotion is displayed in appendix 1. A paired samples t-test was conducted to compare the proportion of times participants chose the adapting emotion versus the other emotion when giving an incorrect answer on an unrelated trial. There was a significant difference in the scores for the adapting emotion (M=.62, SD=.12) and the other emotion (M=.43, SD=.1); t(27)=4.601, p<.001. These results suggest that when pre-exposed to an emotional vocalisation, participants were more likely to incorrectly label the test facial expression as the same emotion as the preceding vocalisation rather than as the third, irrelevant choice (i.e. in a fear-anger trial, incorrect participants were more likely to answer 'fear' than 'sad').
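A sketch of this follow-up analysis is given below; the field names are hypothetical, and the per-participant proportions are simulated around the reported means rather than taken from the data.

import numpy as np
from scipy import stats

def adapter_error_rate(error_trials):
    """error_trials: incorrect unrelated trials, each a dict with
    'adapter' and 'response' emotion labels. Returns the proportion
    of errors that named the adapting emotion."""
    matched = sum(t["response"] == t["adapter"] for t in error_trials)
    return matched / len(error_trials)

rng = np.random.default_rng(2)
p_adapter = rng.normal(0.62, 0.12, 28)  # simulated per-participant proportions
p_other = rng.normal(0.43, 0.10, 28)
t, p = stats.ttest_rel(p_adapter, p_other)
print(f"t(27) = {t:.3f}, p = {p:.4f}")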

These findings suggest that whilst the same specific neurons do not respond to emotions across modalities, the systems responsible for processing emotions in the visual and auditory domains are in some way linked. One explanation for these unexpected findings is priming, defined as a change in the processing of a stimulus caused by prior exposure to an identical or related stimulus (Lavrakas, 2008). Accuracy in the related condition was not significantly higher than in the neutral condition, as would be expected if priming were occurring, but this could reflect a ceiling effect: it is possible that correct emotional facial recognition for the stimuli used peaked at around 62% accuracy. Whilst studies using this exact methodology have not previously been conducted, there is substantial evidence in the literature regarding priming and emotion recognition. Carroll and Young (2005) found that non-verbal emotional sounds primed participants' judgements of facial expressions; however, in their study priming was assessed through speed of responding rather than accuracy, as prototypical images and vocalisations were used and the error rate in all trials was therefore low. Future research should combine the methodology of this study with that of Carroll and Young (2005) and consider both the percentage of correct responses and reaction times, in order to investigate whether a significant difference between neutral and related stimuli can be produced, thus providing evidence of a full priming effect.

It is likely that these surprising findings were directly caused by the interaction of cross-modal stimuli rather than by extraneous factors. Firstly, nothing changed between the two experiments other than the nature of the adapting stimuli. Secondly, the use of purely non-verbal vocalisations meant they were unlikely to be erroneously processed as speech, as they contained very minimal phonetic information. Furthermore, they have high ecological validity, as non-verbal vocalisations are often produced more spontaneously and may therefore offer a more reliable signal of emotion (Johnstone and Scherer, 2000). Previous studies have used sentences or words spoken in an emotional tone; however, even if the words used were neutral, their congruence or incongruence with the emotional tone may have interfered with participants' interpretation and consumed processing resources, meaning attention may not have been fully focused on the emotional quality of the speech.

Whilst this study appears to provide evidence of a cross-modal priming effect, pre-exposure was only to vocalisations rather than facial expressions: it is therefore unclear whether this relationship is bidirectional, i.e. whether pre-exposure to an emotional facial expression would have the same effect on judgements of emotional vocalisations. Vesker et al (2018) found that whilst emotional vocalisations could prime recognition of facial expressions, pre-exposure to an emotional facial expression had no influence on participants' judgements of vocalisations, suggesting the priming relationship found in the current study may not be bidirectional. As discussed in the introduction, de Gelder and Vroomen (2000) did find a bidirectional relationship, but in their study the facial expressions and vocalisations were presented simultaneously. Holig et al (2017) found that pre-exposure to both faces and voices could prime participants' judgements of a speaker's age, but no other study has yet explicitly considered the effects of pre-exposure to an emotional facial expression on judgements of emotion portrayed in vocalisations. The current study is limited in that it only considered vocalisations as adapting stimuli: further research is needed to investigate whether this relationship is bidirectional and, if not, why this may be.

These priming effects suggest that the neurons responsible for processing emotions are likely to be organised around emotional categories, since if all neurons responded equally to all emotions, neither priming nor adaptation effects would occur. However, the occurrence of cross-modal priming instead of adaptation effects suggests that these neuronal populations are not made up of multimodal neurons, but of interwoven unimodal neurons that are able to communicate with each other, signalling to neurons tuned to the same emotion that they are likely to be activated and thus priming them to respond.

It is unclear why Wang et al (2017) were able to find cross-modal adaptation effects using a similar methodology whilst the opposite was found in this study. This may be due to the stimuli employed: Wang et al (2017) used the emotions happy and sad, which as well as representing distinct emotional categories also represent extreme differences in valence, whereas the stimuli employed in this experiment were all of negative valence. Furthermore, Wang et al (2017) only used one auditory adapting stimulus: laughter. They claimed that this was associated with happiness; however, unlike the stimuli used in the current study, the sound was not taken from a recognised database and, whilst unlikely, could have inadvertently activated neurons associated with different emotions. In addition, Wang et al (2017) used a two-alternative forced-choice task, meaning there was a 50% chance of participants giving the desired answer purely by chance. As the current study employed a three-alternative forced-choice task, and further analysis showed that when participants were incorrect in the unrelated condition they consistently responded with the adapting emotion rather than the other, irrelevant emotion, it is much less likely that these results were achieved purely through chance.

Another significant difference between the current study and that of Wang et al (2017) is how the auditory adapting stimulus was presented: the laughter stimulus in Wang et al's (2017) study was one long sound, whereas the vocalisations in this study were presented in repeated short bursts. It is possible that this difference influenced the behaviour of the responding neurons: whilst one long sound may have caused neuronal adaptation, repeated sounds with short breaks may have repeatedly activated the neurons. However, Ying and Xu (2017) found that rapid serial presentation of the same emotion in facial expressions did not influence single-domain adaptation effects, and their findings were similar to those of experiment one in this study. It is therefore unlikely that the repeated presentation of adapting vocalisations in our study was the primary reason for the difference in findings; however, future research may wish to take this into account and directly compare whether pre-exposure to one long sound or repeated short sounds of the same emotion leads to differences in recognition of subsequent facial expressions.

Feedback from participants upon completion of both experiments suggested that the overall length of each experiment (approximately 45 minutes) was too long. Participants reported becoming fatigued and bored and therefore paying less attention on some later trials. This may have influenced our results in several ways: firstly, lack of concentration on the adapting stimuli may have made adaptation effects less pronounced, as neurons would respond less intensely to the adapting stimuli and therefore become less fatigued. Furthermore, lack of concentration combined with short presentation and response times may have impaired recognition of, and responding to, the test image.

In addition, whilst the trials were presented in a random order with each individual trial presented once per block, this order was consistent across participants, meaning participants were likely to become bored or fatigued on the same trials. However, as each individual trial type (considering morph level and emotion) was presented once, and at a different point, within each block of 72 trials, and these trials were then averaged to create the three overarching factors, it is unlikely that lack of concentration or fatigue on particular trials would have a significant impact on the overall findings.

In terms of real-life applications, emotion processing is known to be impaired in several neurological and psychiatric disorders, including autism, schizophrenia and frontotemporal dementia (REFERENCE). Research into how emotional information is processed and integrated in the healthy brain is therefore important for fully investigating how this processing is impaired in such conditions. West, Copland, Arnott, Nelson and Angwin (2018) found that in people with high autistic traits, vocal prosody or semantics alone were not enough to prime recognition of emotional facial expressions: only when both were available and congruent did they prime participants' judgements of facial expressions. It may therefore be that people with autism find the integration of emotional cues from different modalities more challenging, and this may contribute to the known deficits of emotion recognition in autism.

In conclusion, the findings of this study suggest that for facial expressions there are distinct populations of neurons tuned to different emotions. However, given the absence of cross-modal adaptation effects in experiment two, these neurons appear to be unimodal and do not respond to emotional stimuli in the auditory domain. Nevertheless, there does appear to be some communication between the neurons responsible for processing distinct emotions in facial expressions and vocalisations, as evidenced by the partial priming effect found in experiment two. Future research should attempt to further unpack this relationship, perhaps by employing a stricter methodology encompassing both the number of correct responses and reaction times, in order to investigate whether an increase in responding to related emotions can be produced.
