Apr 16, 2025

Training Rhesus Macaques on Human Facial Expression Discrimination Tasks

  • Shirin Taghian Alamooti1,
  • Hamidreza Ramezanpour1,
  • Kohitij Kar1
  • 1Department of Biology, Centre for Vision Research (CVR), and Centre for Integrative and Applied Neuroscience (CIAN), York University
Protocol Citation: Shirin Taghian Alamooti, Hamidreza Ramezanpour, Kohitij Kar. 2025. Training Rhesus Macaques on Human Facial Expression Discrimination Tasks. protocols.io. https://dx.doi.org/10.17504/protocols.io.eq2ly65rwgx9/v1
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol and it's working
Created: February 18, 2025
Last Modified: April 16, 2025
Protocol Integer ID: 120847
Funders Acknowledgements:
Simons Foundation Autism Research Initiative
Grant ID: 967073
Brain Canada Foundation
Grant ID: 2023-0259
Canada Research Chair Program
Grant ID: 102848
Abstract
This protocol describes a step‐by‐step training regimen for macaque monkeys to discriminate human facial expressions. The training is divided into 14 sub-steps that progressively increase in complexity by varying the number of emotions, identities, emotional intensities, and image presentation durations. The overall goal is to enhance the monkeys’ ability to rapidly and accurately identify emotional expressions under changing task demands.
Materials
The facial image set used in this study is the Montreal Set of Facial Displays of Emotion (MSFDE). The images, sourced directly from the MSFDE, vary in the emotion depicted, the identity of the model, and the intensity of the emotional expression. The set is described in detail by Beaupré, M. G., Cheung, N., & Hess, U. (2000), The Montreal Set of Facial Displays of Emotion, Department of Psychology, University of Quebec in Montreal, Canada. This diverse image set forms the backbone of our training tasks, providing the nuanced stimuli needed to challenge and refine the subjects’ facial emotion discrimination capabilities.

To ensure that variations in low-level visual characteristics do not confound the behavioral effects observed in our experiments, we controlled the low-level properties of our images using the SHINE toolbox. The toolbox standardizes parameters such as luminance, contrast, and spatial frequency across an image set, so that differences between stimuli are attributable to high-level features, such as the emotional expression itself, rather than to extraneous visual factors. This standardization is critical for the validity of our findings in facial emotion discrimination tasks.

Willenbockel, V., Sadr, J., Fiset, D. et al. Controlling low-level image properties: The SHINE toolbox. Behavior Research Methods 42, 671–684 (2010). https://doi.org/10.3758/BRM.42.3.671
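The SHINE toolbox itself is distributed as MATLAB code. Purely as an illustration of the kind of normalization it performs, the following minimal Python/NumPy sketch (our own, not part of SHINE) matches mean luminance and RMS contrast across a set of grayscale images; SHINE additionally offers histogram matching and spatial-frequency equalization, which are not shown here.

import numpy as np

def match_luminance_contrast(images, target_mean=None, target_std=None):
    # Equalize mean luminance and RMS contrast (standard deviation) across
    # grayscale images given as float arrays scaled to [0, 1].
    stack = [np.asarray(im, dtype=float) for im in images]
    if target_mean is None:
        target_mean = np.mean([im.mean() for im in stack])  # grand mean luminance
    if target_std is None:
        target_std = np.mean([im.std() for im in stack])    # average contrast
    matched = []
    for im in stack:
        z = (im - im.mean()) / (im.std() + 1e-12)  # zero mean, unit contrast
        matched.append(np.clip(z * target_std + target_mean, 0.0, 1.0))
    return matched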


For further reference, the images can be accessed here, and the corresponding metadata is available here.


In conjunction with the MSFDE, our experimental setup uses a home-cage kiosk system. The system enables fully automated in-cage testing, minimizing human interference while maximizing the precision of stimulus presentation and response recording. It provides a naturalistic testing environment for the subjects and enhances the reliability and repeatability of the behavioral measurements. Detailed information about the system can be found in the article Low-cost, portable, easy-to-use kiosks to facilitate home-cage testing of nonhuman primates during vision-based behavioral tasks (Ramezanpour, Giverin, & Kar, 2024).

Together, the MSFDE image set and the home-cage kiosk system form a robust foundation for our study, enabling us to systematically explore and understand the mechanisms underlying facial emotion discrimination.
Before start
The animals need to be trained on two-alternative forced-choice tasks in the form of object or shape discrimination (see Rajalingham et al., 2018) before they can be started on this protocol.
Overview
The training is divided into sequential seasons (phases), each increasing in difficulty and complexity. The monkey will learn to discriminate human facial expressions by being gradually introduced to more categories (emotions), identities, and subtle intensity variations. Training typically occurs 5 days per week, with sessions of ~1000-2000 trials each, depending on the monkey’s motivation. The monkey progresses to the next season after meeting performance criteria (e.g., ≥80% correct over two consecutive sessions) on the current tasks. Throughout all seasons, we use positive reinforcement (water reward for correct answers) and a brief timeout for incorrect answers to guide learning.
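As a concrete illustration of the promotion rule described above, the sketch below (our own bookkeeping code, with illustrative names) checks whether the two most recent sessions both meet the accuracy criterion:

def ready_for_next_season(session_accuracies, criterion=0.80, n_consecutive=2):
    # session_accuracies: list of per-session proportion-correct values,
    # oldest first. Advance only if the last n_consecutive sessions all
    # reach the criterion (e.g., >= 80% correct).
    if len(session_accuracies) < n_consecutive:
        return False
    return all(acc >= criterion for acc in session_accuracies[-n_consecutive:])

# Example: the last two daily sessions were 83% and 85% correct -> advance.
print(ready_for_next_season([0.74, 0.81, 0.83, 0.85]))  # True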
Task Setup: Use a two-alternative forced-choice (2AFC) paradigm with a match-to-sample structure (a trial-flow sketch follows the steps below):
  1. Each trial begins with the monkey pressing on a central dot.
  2. Sample stimulus: A single face image is displayed at the center of the screen (e.g., for 500 ms). The image typically subtends about 8 degrees of visual angle. The sample is either a neutral face or an emotional face, chosen randomly.
  3. Blank delay: The face disappears, and a blank screen is shown briefly (e.g., 100 ms).
  4. Choice screen: Two choice images appear: one is the full-intensity image of the target expression (matching the identity of the sample face), and the other is the full-intensity image of a distractor expression (also matching the identity of the sample face). The choice screen remains on for up to 15 seconds.

In early training, to simplify the choice screen, you can present the exact image that was shown as the sample on one side and a different image (from the same identity) on the other side. For example, if the sample was a happy face of Identity A, the choices might be the same happy face of Identity A (correct match) or a 100%-intensity sad face of Identity A (non-match).

5. Monkey’s Response: The monkey indicates its choice by touching and holding one of the two images for a minimum hold duration.
6. Outcome: If the monkey chooses the correct option (matching the sample’s category), a reward (e.g., water) is delivered immediately. If the choice is incorrect, no reward is given. Instead, typically, there is a timeout where the screen goes blank for a short period (e.g., 5 seconds), and no trials can be initiated.
7. Inter-trial interval (ITI): After a correct trial and reward consumption, impose a short ITI (e.g., 500 milliseconds) before the next initiation point appears to start a new trial. After an incorrect trial’s timeout, the subsequent trial begins once the timeout is over, and the monkey re-engages.
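The trial flow above can be summarized programmatically. The sketch below is a simplified Python outline using the example timings from the list; the display, touch, and reward functions are placeholders for whatever presentation software drives the kiosk, and the 200 ms minimum hold is an assumed value (the protocol only requires a minimum touch duration).

import random
import time

# Placeholder I/O functions: stand-ins for the kiosk's presentation software.
def show(item, duration_s=None): pass                     # draw a stimulus or a blank screen
def wait_for_touch(targets, timeout_s, min_hold_s): pass  # return the touched target (or None)
def deliver_reward(): pass                                # trigger the water reward

def run_trial(sample, match_img, distractor_img,
              sample_dur=0.5, blank_dur=0.1, choice_timeout=15.0,
              min_hold=0.2, timeout_dur=5.0, iti=0.5):
    show("central_dot")
    wait_for_touch(["central_dot"], timeout_s=None, min_hold_s=0.0)   # 1. trial initiation
    show(sample, duration_s=sample_dur)                               # 2. sample (~500 ms, ~8 deg)
    show("blank", duration_s=blank_dur)                               # 3. blank delay (~100 ms)
    choices = [match_img, distractor_img]
    random.shuffle(choices)                                           # 4. choice screen, sides randomized
    choice = wait_for_touch(choices, timeout_s=choice_timeout,
                            min_hold_s=min_hold)                      # 5. touch-and-hold response
    if choice == match_img:                                           # 6. outcome
        deliver_reward()
        time.sleep(iti)                                               # 7. short ITI after a correct trial
    else:
        show("blank", duration_s=timeout_dur)                         # timeout after an error or omission
    return choice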
Seasons
Season 1:

  • Goal: Teach the monkey to distinguish one extreme emotional facial expression from another. In this first phase, the expressions are shown at full intensity to make the discrimination obvious.
  • Stimuli: Use a small set of images from the MSFDE. For example, choose two individual identities (models) and three emotion categories (100% intensity) – 6 images total.
  • Procedure: During Season 1, the monkey learns by trial and error that choosing the correct category (an exact image match) yields a reward. With the identical-image matching method, the task can initially be solved by simple picture matching (see the sketch after this list).
  • Performance Criterion: Continue daily sessions (e.g., 1500 trials/day) until the monkey achieves a high accuracy (e.g., ≥80% correct) in discriminating the full-intensity expressions. Once the criterion is met on two consecutive days, proceed to Season 2.
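A minimal sketch of how a Season 1 trial can be assembled under the identical-image matching scheme is shown below; the identity and emotion labels and the file-name pattern are illustrative placeholders rather than actual MSFDE file names (the protocol does not specify which three emotions are used first).

import random

IDENTITIES = ["A", "B"]                 # two models (illustrative labels)
EMOTIONS = ["joy", "anger", "fear"]     # three full-intensity expressions (illustrative)
STIMULI = {(i, e): f"{i}_{e}_100.png" for i in IDENTITIES for e in EMOTIONS}  # 6 images total

def season1_trial():
    # Correct choice = the sample image itself; distractor = a different
    # 100%-intensity expression of the same identity.
    identity = random.choice(IDENTITIES)
    target = random.choice(EMOTIONS)
    distractor = random.choice([e for e in EMOTIONS if e != target])
    sample = STIMULI[(identity, target)]
    return {"sample": sample, "match": sample,
            "distractor": STIMULI[(identity, distractor)]}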
Season 2:

  • Goal: Having learned the basic distinction between two extreme emotional faces, the monkey is now probed with one additional identity to test whether the previously learned performance generalizes to another identity.
  • Stimuli: Add another identity with 100% intensity for the three chosen expressions.
  • Criterion: Continue daily sessions (e.g., 1500 trials/day) until the monkey achieves a high accuracy (e.g., ≥80% correct) in discriminating the full-intensity expressions.
Season 3
In Season 3, the task is expanded to challenge the monkey's ability to discriminate between facial expressions by introducing a greater variety of emotional categories and identities, as well as a second intensity level. In this phase, the stimulus set comprises four distinct emotions (Joy, Fear, Shame, and Anger) presented across three identities (models 1, 12, and 5) and two intensity levels (100% and 80%). A total of 24 images are used, each presented for 800 milliseconds, and the two choice images on each trial are drawn from a pool of 12 full-intensity images (one per emotion-identity combination). This season pushes the animal beyond simple identical-image matching, requiring it to start generalizing across subtle differences in both emotion and intensity.
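For bookkeeping, the Season 3 stimulus set can be enumerated directly from its factors. The short sketch below (with illustrative labels, not MSFDE file names) confirms the 4 × 3 × 2 = 24 image count:

from itertools import product

emotions = ["joy", "fear", "shame", "anger"]
identities = [1, 12, 5]
intensities = [100, 80]

# One entry per emotion-identity-intensity combination.
season3_images = [f"id{i}_{e}_{k}" for i, e, k in product(identities, emotions, intensities)]
assert len(season3_images) == 24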
Season 4

Season 4 further increases the complexity of the task by adding another emotional category. The stimulus set now includes five emotions (Joy, Fear, Shame, Anger, and Sadness) while still employing the same three identities (1, 12, and 5) and the two intensity levels (100% and 80%). With 30 images, each presented for 800 milliseconds, and a choice pool of 15 full-intensity images, the monkey must refine its discrimination skills to accurately differentiate among these more nuanced emotional expressions.
Season 5

In Season 5, the full range of emotional expressions is introduced. The task now features all six emotions (Joy, Fear, Shame, Anger, Sadness, and Disgust) while maintaining three identities and two intensities (100% and 80%). The stimulus set is expanded to 36 images, and each image is still displayed for 800 milliseconds. With 18 choice images, this season tests the monkey's ability to discriminate a complete array of emotional signals, enhancing its ability to parse subtle differences in facial expression.
Season 6

Season 6 raises the challenge by adding another level of intensity to the stimulus set. Although the same six emotions and three identities are used, the images now vary across three intensity levels—100%, 80%, and 60%. This modification increases the total number of images to 54, while the presentation duration remains at 800 milliseconds and the number of choice images stays at 18. This phase is designed to evaluate whether the monkey can generalize its emotion discrimination abilities across a broader range of intensity variations.
Season 7

In Season 7, the task incorporates the full range of intensity levels. The stimulus set includes all six emotions and three identities (1, 12, and 5) presented across all five intensity levels (100%, 80%, 60%, 40%, and 20%), for a total of 90 images. Each image is presented for 800 milliseconds, and the choice pool contains 18 full-intensity images. This season challenges the monkey to make fine discriminations across a wide dynamic range of facial expressions.
Season 8

Building on the previous phase, Season 8 retains the same stimulus set of all six emotions, three identities, and five intensity levels but decreases the image presentation duration to 600 milliseconds. The total number of images remains at 90, with an 18-image choice pool. The shorter presentation time demands quicker processing and decision-making from the monkey, further testing its rapid perceptual discrimination abilities.
Season 9

Season 9 introduces an additional identity to the task. Now, the stimulus set features all six emotions and five intensity levels, but is expanded to include four identities (numbers 1, 12, 5, and 4). This adjustment increases the total number of images to 120, while the presentation duration stays at 600 milliseconds. The number of choice images is increased to 24, requiring the monkey to generalize its learned discriminations across a broader array of faces.
Season 10

In Season 10, the identity variable is further expanded. The stimulus set now includes five different identities (numbers 1, 12, 5, 4, and 9) alongside all six emotions and five intensity levels. With 150 images in total and a presentation duration of 600 milliseconds, the number of choice images rises to 30. This phase assesses whether the monkey’s discrimination performance continues to generalize as the variability in the facial stimuli increases.
Season 11

Season 11 further scales up the task by incorporating a sixth identity (adding model 7 to the existing set). The stimulus set therefore comprises six identities, all six emotions, and all five intensity levels. The total image count reaches 180, with a presentation duration of 600 milliseconds and a 36-image choice pool. This season substantially increases task complexity and examines the robustness of the monkey’s discrimination abilities under increased stimulus variability.
Season 12

In Season 12, the stimulus set remains the same as in Season 11, with six identities, six emotions, and five intensity levels, but the image presentation duration is reduced to 400 milliseconds. With 180 images and a 36-image choice pool, this phase imposes an additional time constraint, compelling the monkey to process and respond to the stimuli more rapidly and thereby testing the limits of its perceptual speed and accuracy.
Season 13

Season 13 further intensifies the challenge by reducing the image presentation duration to 200 milliseconds. The stimulus set continues to feature all six emotions, six identities, and five intensity levels, amounting to 180 images, with a 36-image choice pool. The dramatically shortened exposure time, similar to natural fixation durations and likely restricting the task to feedforward visual processing, forces the monkey to extract and act upon critical expression cues under rapid time pressure, further refining its rapid discrimination skills.
Season 14

In the final phase, Season 14, the task reaches its maximum complexity. The stimulus set now includes all six emotions and all five intensity levels across 12 different identities (numbers 1, 12, 5, 4, 9, 7, 8, 2, 6, 3, 10, and 11). This results in a substantial increase in the total number of images to 360. Despite the increased stimulus variability, the presentation duration remains at a brief 200 milliseconds, and the number of choice images is elevated to 72. This final season is designed to rigorously evaluate the monkey's facial emotion discrimination performance under conditions of maximal stimulus variability and minimal (core recognition consistent) processing time.
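The parametric progression of Seasons 3 through 14 described above can be collected into a single schedule. The sketch below encodes it as a Python structure (our own bookkeeping format) and sanity-checks that each season's image count equals emotions × identities × intensities:

# season: (n_emotions, n_identities, n_intensities, presentation_ms, n_images, n_choice_images)
SEASONS = {
    3:  (4,  3, 2, 800,  24, 12),
    4:  (5,  3, 2, 800,  30, 15),
    5:  (6,  3, 2, 800,  36, 18),
    6:  (6,  3, 3, 800,  54, 18),
    7:  (6,  3, 5, 800,  90, 18),
    8:  (6,  3, 5, 600,  90, 18),
    9:  (6,  4, 5, 600, 120, 24),
    10: (6,  5, 5, 600, 150, 30),
    11: (6,  6, 5, 600, 180, 36),
    12: (6,  6, 5, 400, 180, 36),
    13: (6,  6, 5, 200, 180, 36),
    14: (6, 12, 5, 200, 360, 72),
}

for season, (emo, idn, inten, _, n_images, _) in SEASONS.items():
    assert emo * idn * inten == n_images, f"count mismatch in season {season}"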
Protocol references
Ramezanpour, H., Giverin, C., & Kar, K. (2024). Low-cost, portable, easy-to-use kiosks to facilitate home-cage testing of nonhuman primates during vision-based behavioral tasks. Journal of Neurophysiology, 132(3), 666–677.

Beaupré, M. G., & Hess, U. (2005). Cross-cultural emotion recognition among Canadian ethnic groups. Journal of Cross-Cultural Psychology, 36, 355–370. https://doi.org/10.1177/0022022104273656

Rajalingham, R., Issa, E. B., Bashivan, P., Kar, K., Schmidt, K., & DiCarlo, J. J. (2018). Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks. Journal of Neuroscience, 38(33), 7255–7269.

Willenbockel, V., Sadr, J., Fiset, D., et al. (2010). Controlling low-level image properties: The SHINE toolbox. Behavior Research Methods, 42, 671–684. https://doi.org/10.3758/BRM.42.3.671