We investigate the influence of low-level image features on aesthetics prediction. We show that the aesthetic quality of a photograph depends on its context: image features learned from a specific image category are not necessarily the same as features learned from a generic image collection. Experiments conducted on specific image categories show that category-specific features obtain statistically significantly better results than generic ones.
Dataset & Code
We use the AVA dataset, which can be found here.
The code for feature extraction, feature selection and feature comparison can be downloaded below.
For any questions regarding the code or the results, please contact Florian Simond:
florian [dot] simond [at] alumni.epfl.ch
We run our experiments on five datasets:
- The generic dataset, containing the 50k highest- and lowest-ranked images (see Fig. 1).
- The animal dataset, containing the animal photos from the generic dataset.
- The portrait dataset, containing the portraits from the generic dataset.
- The nature dataset, containing the nature photos from the generic dataset.
- The city dataset, containing the cityscapes from the generic dataset.
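The construction of these five datasets can be sketched as follows. This is a minimal illustration on toy data; the column names (`image_id`, `mean_score`, `tag`) and the tiny subset sizes are assumptions for the sketch, not the actual AVA file layout or the real 50k cutoff.

```python
import pandas as pd

# Hypothetical AVA-style metadata: one row per image with its mean
# aesthetic score and a semantic tag (column names are assumptions).
meta = pd.DataFrame({
    "image_id":   [1, 2, 3, 4, 5, 6],
    "mean_score": [7.8, 2.1, 6.9, 3.0, 8.2, 2.5],
    "tag": ["animal", "portrait", "nature", "city", "animal", "nature"],
})

# Generic dataset: keep the highest- and lowest-ranked images
# (here 2 of each instead of 50k, to keep the toy example small).
ranked = meta.sort_values("mean_score")
generic = pd.concat([ranked.head(2), ranked.tail(2)])

# Specific datasets: filter the generic dataset by semantic tag.
animal = generic[generic["tag"] == "animal"]
print(sorted(animal["image_id"]))  # ids of the animal photos kept
```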
We run the Sequential Forward Floating Selection (SFFS) algorithm to select the best subset of 7 features. The result of this procedure is shown in Figure 2.
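SFFS alternates a forward step (greedily add the feature that most improves a score) with a floating backward step (drop features again whenever that beats the best score previously seen for the smaller subset size). A minimal sketch of the algorithm, with a generic `score` callback standing in for the actual cross-validated model evaluation used here:

```python
def sffs(score, n_features, k):
    """Sequential Forward Floating Selection.

    score      -- callable mapping a list of feature indices to a number
    n_features -- total number of candidate features
    k          -- target subset size (7 in our experiments)
    """
    selected = []
    best = {}  # best score recorded for each subset size
    while len(selected) < k:
        # Forward step: add the single feature that most improves the score.
        remaining = [f for f in range(n_features) if f not in selected]
        f_add = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(f_add)
        best[len(selected)] = score(selected)
        # Floating step: conditionally remove features while doing so
        # beats the best score seen for the reduced subset size.
        while len(selected) > 2:
            f_drop = max(selected,
                         key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != f_drop]
            s = score(reduced)
            if s > best.get(len(reduced), float("-inf")):
                selected, best[len(reduced)] = reduced, s
            else:
                break
    return selected

# Toy usage: with an additive score, SFFS picks the k largest weights.
weights = list(range(10))
print(sorted(sffs(lambda sub: sum(weights[f] for f in sub), 10, 3)))
```

In the real experiments the score would be a model's cross-validated aesthetics-prediction performance on the (generic or category-specific) training images rather than this toy additive function.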
For each of the 4 specific datasets (animal, portrait, nature and city), we compare the generic features (the ones selected on the generic dataset) with the specific features selected on that category.
For the animal dataset, for example, we split it into a training and a test set, train two models (one with the generic features, one with the animal features), and compare them on the test set. We repeat this operation several hundred times and compile the results in Figure 3.
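Because the two models are evaluated on the same sequence of random splits, their per-split scores are paired, which is why a Wilcoxon signed-rank test is the appropriate significance test. A sketch of this comparison on synthetic accuracies (the numbers below are placeholders, not our results; in the real experiment each pair of values comes from the two trained models):

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)

# Hypothetical stand-in: paired accuracies of the two models over
# repeated random train/test splits. Here the "specific" model is
# simulated as slightly better on average, purely for illustration.
n_splits = 200
acc_generic = rng.normal(0.68, 0.02, n_splits)
acc_specific = acc_generic + rng.normal(0.01, 0.01, n_splits)

# Paired, non-parametric comparison of the per-split scores.
stat, p = wilcoxon(acc_specific, acc_generic)
print(f"generic={acc_generic.mean():.3f}  "
      f"specific={acc_specific.mean():.3f}  p={p:.2e}")
```

A low p-value indicates that the per-split difference between the two feature sets is systematic rather than random split-to-split noise.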
| Generic features | Specific features | p-value* |
|---|---|---|
Figure 3: Comparison of the specific features with the generic ones.
*p-value of the Wilcoxon signed-rank test