High-Level Concepts for Affective Understanding of Images



Appeared in WACV 2017

This paper aims to bridge the affective gap between image content and the emotional response it elicits in the viewer by using High-Level Concepts (HLCs). In contrast to previous work that relied solely on low-level features or used convolutional neural networks (CNNs) as black boxes, we use HLCs generated by pre-trained CNNs in an explicit way to investigate the associations between these HLCs and a (small) set of Ekman's emotional classes. As a proof of concept, we first propose a linear admixture model for modeling these relations; the resulting computational framework allows us to determine the associations between each emotion class and certain HLCs (objects and places). This linear model is further extended to a nonlinear model using support vector regression (SVR) that predicts the viewer's emotional response using both low-level image features (LLFs) and HLCs extracted from images. These class-specific regressors are then assembled into a regressor ensemble that provides a flexible and effective predictor of viewers' emotional responses to images. Experimental results demonstrate that our results are comparable to those of existing methods, while offering a clear view of the association between HLCs and emotion classes that is largely missing in most existing work.


Afsheen Rafaqat Ali, Usman Shahid, Mohsen Ali, Jeffrey Ho



@inproceedings{ali2017high,
  title={High-Level Concepts for Affective Understanding of Images},
  author={Afsheen Rafaqat Ali and Usman Shahid and Mohsen Ali and Jeffrey Ho},
  booktitle={IEEE Winter Conference on Applications of Computer Vision},
  year={2017}
}



We have explored the associations between high-level concepts (objects and places) and emotion classes. In particular, we found that different emotion classes require different sets of high-level concepts for affective classification.

High-level concepts and corresponding sample images from the Emotion6 dataset [1]
Our approach offers the following two important advantages over state-of-the-art methods.

    • Mapping HLCs to emotion distributions allows us to decouple affective computation from the image feature extraction step, which allows for shorter training times with more precise and diverse training inputs.
    • HLCs are easily recognized by humans; our approach can therefore be readily applied to tasks such as image retrieval and caption generation with an affective component.
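As an illustration of how such HLC inputs could be formed, the sketch below builds a per-image HLC occurrence vector by keeping only the top-scoring object and scene concepts from pre-trained classifier outputs. The function name, the top-k selection rule, and the toy probabilities are our own assumptions, not the paper's exact pipeline.

```python
import numpy as np

def hlc_vector(object_probs, scene_probs, top_k=5):
    """Build a sparse HLC occurrence vector by keeping only the
    top-k object and scene concepts predicted for an image.
    (Illustrative; the paper's exact selection rule may differ.)"""
    v = np.concatenate([object_probs, scene_probs])
    out = np.zeros_like(v)
    top = np.argsort(v)[-top_k:]        # indices of the strongest concepts
    out[top] = v[top]                   # keep their confidence scores
    return out / (out.sum() or 1.0)     # normalize for comparability

# Toy example: 4 object classes (e.g. from ImageNet) + 3 scene classes
# (e.g. from Places205).
obj = np.array([0.1, 0.6, 0.05, 0.25])
scn = np.array([0.7, 0.2, 0.1])
print(hlc_vector(obj, scn, top_k=3))
```

The resulting vector is sparse and interpretable: each nonzero entry names a concept a human can recognize, which is what makes the later association analysis readable.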

Qualitative Results

We proposed a linear admixture model relating occurrences of HLCs to emotion distributions.
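A minimal sketch of the linear admixture idea, on synthetic data rather than Emotion6: each image's emotion distribution is modeled as a mixture of per-HLC emotion distributions, and the association matrix is recovered by least squares followed by projection back onto the simplex. The variable names and the fitting procedure here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: H holds per-image HLC weights (rows sum to 1),
# E holds the per-image emotion distributions (rows sum to 1).
n_images, n_hlc, n_emotions = 200, 10, 6
A_true = rng.dirichlet(np.ones(n_emotions), size=n_hlc)  # hidden associations
H = rng.dirichlet(np.ones(n_hlc), size=n_images)
E = H @ A_true

# Least-squares fit of the association matrix A in E ≈ H A,
# then clip negatives and renormalize each row to a distribution.
A, *_ = np.linalg.lstsq(H, E, rcond=None)
A = np.clip(A, 0.0, None)
A = A / A.sum(axis=1, keepdims=True)

print(np.abs(A - A_true).max())  # small residual on noise-free toy data
```

Row k of the recovered matrix can then be read directly as "the emotion distribution associated with HLC k", which is what the heat maps below visualize.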

Heat map showing the association between ImageNet HLCs and six emotion classes; stronger associations are shown in darker shades.
Heat map showing the association between Places205 HLCs and six emotion classes

As shown above, the Disgust emotion assigns high weight to HLCs such as 'garbage dump', 'ant', 'ashcan', 'bathtub' and 'fly', while the Joy emotion emphasizes HLCs related to flowers, greenery and open spaces ('daisy', 'lakeside', 'wheat field', 'orchard').

Quantitative Results

For quantitative evaluation, we designed an ensemble called the Hybrid Model (HM). To create the ensemble, we trained seven regressors (each on a different subset of features) for each emotion in Emotion6.
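The ensemble idea can be sketched as follows, with one caveat: the paper's regressors are SVRs, but to keep the example self-contained we substitute closed-form ridge regression, and the feature subsets and toy data are our own assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam=1e-2):
    """Closed-form ridge regression (a stand-in for the paper's SVR)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)

def ridge_predict(X, w):
    return np.hstack([X, np.ones((len(X), 1))]) @ w

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 12))                         # toy image features
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=300)  # toy emotion score

# Seven overlapping feature subsets (standing in for e.g. LLF-only,
# HLC-only, and combined feature groups); chosen arbitrarily here.
subsets = [list(range(i, i + 6)) for i in range(7)]
models = [(s, ridge_fit(X[:, s], y)) for s in subsets]

# Ensemble prediction for one emotion: average the seven regressors.
pred = np.mean([ridge_predict(X[:, s], w) for s, w in models], axis=0)
print(np.corrcoef(pred, y)[0, 1])
```

In the paper one such ensemble is trained per emotion class, so each class can draw on the feature subsets most informative for it.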

Emotion prediction using Hybrid Model

Using the HM on the Emotion6 dataset, we see considerable improvement in classification accuracy over the state of the art. The results validate our hypothesis that each emotion class relies on a different set of high-level concepts (HLCs). Since categorical and dimensional models cannot fully replace each other, we have also trained a Valence-Arousal hybrid model to predict VA values for the Emotion6 dataset.



[1] K.-C. Peng, T. Chen, A. Sadovnik, and A. Gallagher. A Mixed Bag of Emotions: Model, Predict, and Transfer Emotion Distributions. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 860–868. IEEE, June 2015.