Tagged Media-Aware Multimodal Content Annotation



Ivan Ivanov


Prof. Touradj Ebrahimi




Approaches to multimedia content access that rely only on content analysis have not delivered widely accepted solutions. User activities in social networks, such as tagging, annotating, and rating multimedia content, offer an entirely new view on how to solve the multimedia content access problem. The goal of this thesis is to find new models of interaction between automatic multimedia content analysis and social tagging. The project takes services and products such as Flickr, Facebook, YouTube, and MySpace as successful examples. Our research will address the challenge of efficiently managing and organizing image collections by enriching images with semantic context.

Over the last few years, social network systems have greatly increased users’ involvement in online content creation and annotation. Annotations and their association with images provide a powerful cue for grouping and indexing those images. The success of Flickr and Facebook shows that users are willing to provide semantic context through manual annotations. However, tagging many photos by hand is time-consuming. Users typically tag only a small number of their shared photos, leaving most with incomplete metadata. The main novelty of this thesis lies in an interactive service that reduces the tedious, time-consuming manual annotation effort required of users. We propose an interactive online platform capable of semi-automatic image annotation and tag recommendation for an extensive online database of images containing various object classes.

Tag propagation

In the past few years, sharing photos within social networks has become very popular. To make these huge collections easier to explore, images are usually tagged with representative keywords describing people, events, objects, and locations. To speed up the time-consuming tagging process, tags can be propagated based on the similarity of image content and context. We proposed an interactive online platform capable of semi-automatic image annotation and tag recommendation for an extensive online database. First, when the user marks a specific object in an image, the system performs object duplicate detection and returns search results consisting of images that contain similar objects. The object can then be annotated in two ways: (1) In the tag recommendation process, the system recommends tags associated with the object in the search result images, and the user can accept some of these tags for the object in the given image. (2) In the tag propagation process, when the user enters a tag for the object, the tag is propagated to the images in the search results. Different techniques for speeding up indexing and retrieval were explored, and their effectiveness was demonstrated through a set of experiments covering various classes of objects.
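The two annotation modes can be illustrated with a minimal Python sketch. It assumes that object duplicate detection has already run and returned a list of result images, each with a similarity score and its existing tags. The `SearchResult` type, the similarity-weighted ranking, and the `min_similarity` threshold are illustrative assumptions and are not taken from the actual system.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SearchResult:
    """One image returned by the (assumed) object duplicate detector."""
    image_id: str
    similarity: float                      # hypothetical match score in [0, 1]
    tags: set = field(default_factory=set)


def recommend_tags(results, top_k=3):
    """Tag recommendation: rank tags from the results by summed similarity."""
    scores = defaultdict(float)
    for result in results:
        for tag in result.tags:
            scores[tag] += result.similarity
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_k]]


def propagate_tag(results, tag, min_similarity=0.5):
    """Tag propagation: add the user's tag to sufficiently similar results.

    Returns the ids of the images that received the tag.
    """
    updated = []
    for result in results:
        if result.similarity >= min_similarity and tag not in result.tags:
            result.tags.add(tag)
            updated.append(result.image_id)
    return updated


# Hypothetical search results for an object marked by the user.
results = [
    SearchResult("img_a", 0.9, {"eiffel tower", "paris"}),
    SearchResult("img_b", 0.7, {"paris"}),
    SearchResult("img_c", 0.3, {"tower"}),
]
suggested = recommend_tags(results)            # tags the user may accept
tagged_ids = propagate_tag(results, "landmark")  # images that got the new tag
```

In this sketch the user stays in the loop in both modes: recommended tags are only suggestions to accept or reject, and propagation is triggered by a tag the user typed.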

User trust modeling for tag propagation

Because the initial tags are provided by humans, they cannot be assumed to be always correct: a user may assign wrong tags by mistake or even on purpose. Consequently, when users collaboratively attach free-form, in principle unrestricted keywords (tags) to multimedia content, there is a danger that wrong or irrelevant tags are propagated to other photos and deprive users of the benefits of annotated photos. Therefore, we proposed to take into account user trust information, derived from users’ tagging behavior, during tag propagation. We considered a system for efficient geotag propagation based on a combination of object duplicate detection and user trust modeling. Geotags are propagated by training a graph-based object model for each landmark on a small tagged image set and finding its duplicates within a large untagged image set. Based on the correspondences established between these two image sets and on the reliability of the user, tags are propagated from the tagged to the untagged images. User trust modeling reduces the risk of propagating wrong tags caused by spamming or faulty annotation. The effectiveness of the proposed method was demonstrated through a set of experiments on an image database containing various landmarks.
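As a rough illustration of how user reliability can gate propagation, the Python sketch below scores each user by the fraction of their geotags that agree with a reference annotation and propagates only the tags of users above a threshold. The trust formula, the `min_trust` threshold, and the data layout are assumptions made for this example; the graph-based object model is replaced by a precomputed `matches` table.

```python
def user_trust(user_tags, reference_tags):
    """Fraction of a user's geotags that agree with a reference annotation.

    Both arguments map image_id -> geotag. Only images present in both are
    counted; a user with no overlapping images gets a trust of 0.0.
    """
    shared = [image_id for image_id in user_tags if image_id in reference_tags]
    if not shared:
        return 0.0
    correct = sum(user_tags[i] == reference_tags[i] for i in shared)
    return correct / len(shared)


def propagate_geotags(tagged, matches, trust, min_trust=0.5):
    """Copy geotags to matched untagged images, from trusted users only.

    tagged:  image_id -> (user, geotag) for the small tagged set
    matches: tagged image_id -> list of untagged image_ids found as duplicates
    trust:   user -> trust score in [0, 1]
    """
    propagated = {}
    for image_id, (user, geotag) in tagged.items():
        if trust.get(user, 0.0) < min_trust:
            continue  # skip tags from users considered unreliable
        for target in matches.get(image_id, []):
            # Keep the first geotag assigned to a target (a simplification).
            propagated.setdefault(target, geotag)
    return propagated


# Hypothetical data: alice tags correctly, bob does not.
reference = {"r1": "Eiffel Tower", "r2": "Big Ben"}
trust = {
    "alice": user_trust({"r1": "Eiffel Tower", "r2": "Big Ben"}, reference),
    "bob": user_trust({"r1": "Colosseum", "r2": "Colosseum"}, reference),
}
tagged = {"t1": ("alice", "Eiffel Tower"), "t2": ("bob", "Colosseum")}
matches = {"t1": ["u1", "u2"], "t2": ["u3"]}
new_geotags = propagate_geotags(tagged, matches, trust)
```

With these inputs only alice's geotag reaches the untagged images, while bob's tag is held back because his trust score falls below the threshold.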

Photo album summarization

When people share their photos, they usually organize them in albums according to events or places. To tell the story of important events in one’s life, it is desirable to have an efficient summarization tool that gives a quick overview of an album containing a huge number of photos. We proposed an approach to photo album summarization through a novel social game for mobile phones, “Epitome”. The approach consists of two games: “Select the Best!” and “Split it!”. In the first game, the user selects the most representative photo from a reduced set of images; in the second, the user splits the reduced set into two distinct parts. Because looking through a huge collection of photos on a mobile phone can be time-consuming, it is more enjoyable to view only a limited number of images that fit on one mobile screen. The results obtained in these games are combined to produce a summarization, which is then compared with the results of other users. As a final result, a unique summarization sequence of photos is determined. This sequence can be used to create a collage or a cover for the album. A proof of concept of the proposed method was demonstrated through a set of experiments on several photo collections. Moreover, we compared the results obtained with this game against automatic image selection based on different state-of-the-art visual and temporal features.
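The description above does not specify how the game results are combined. As one simple possibility, the Python sketch below aggregates only the outcomes of “Select the Best!” rounds by counting how often each photo was picked. The round format, the vote-counting rule, and the function name are assumptions for illustration; the “Split it!” results and the comparison between users are not modeled.

```python
from collections import Counter


def summarize_by_votes(rounds, length=3):
    """Order photos by how often players picked them as most representative.

    rounds: iterable of (shown_photos, picked_photo) pairs, one per round
            of a "Select the Best!"-style game.
    Returns up to `length` photo ids, most frequently picked first.
    """
    votes = Counter()
    for shown, picked in rounds:
        if picked in shown:          # ignore rounds with an invalid pick
            votes[picked] += 1
    # Most votes first; ties are broken by photo id for a stable order.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [photo for photo, _ in ranked[:length]]


# Hypothetical rounds played by several users on one album.
rounds = [
    (("p1", "p2", "p3"), "p2"),
    (("p2", "p4", "p5"), "p2"),
    (("p1", "p4", "p6"), "p4"),
    (("p3", "p5", "p6"), "p5"),
    (("p1", "p4", "p5"), "p4"),
]
summary = summarize_by_votes(rounds)
```

The resulting sequence could serve as the ordered list of photos for a collage or album cover, with the first entry as the cover candidate.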


Social networks, image annotation, tag propagation, tag recommendation, object duplicate detection, geotags, user trust model, social game, photo summarization, collage, mobile game.


Tag propagation

  • Tag propagation system [demo]
  • Semi-automatic image annotation via tag propagation (presented at ACM MIR’10) [paper] [poster]
  • Geotag propagation (presented at SPIE’10) [paper] [slides]

User trust modeling for tag propagation

  • Geotag propagation based on user trust modeling (published in MTA) [paper] [slides]

Photo album summarization

  • Social game “Epitome” [demo]
  • Photo album summarization through social game “Epitome” (presented at ACM MM’10) [paper] [poster] [slides]
  • Comparison between social game “Epitome” and automatic visual analysis (presented at ACM MM’10) [paper]