Available Projects – Spring 2021

Description:

A watermark printed with daylight fluorescent inks and hidden in a color background can be recovered under UV light. The goal is to adapt an already developed Android software package that acquires images embedding hidden watermarks. Recovering the watermark requires on-the-fly image acquisition, real-time sharpness evaluation, and appropriate image processing algorithms.
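For illustration, a minimal sketch of the real-time sharpness evaluation step, using the variance of the Laplacian (a common focus measure). The threshold is an assumption that would have to be tuned on the actual acquisition pipeline:

```python
# Minimal frame-sharpness check: variance of the Laplacian.
# Low variance means few strong edges, i.e. a blurry frame.
import cv2

def is_sharp_enough(frame_bgr, threshold=100.0):  # threshold: assumed value
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var() > threshold
```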

Reference:

  1. R. Rossier, R.D. Hersch, “Hiding patterns with daylight fluorescent inks,” Proc. IS&T/SID’s 19th Color Imaging Conference, San Jose, CA, USA, November 7-11, 2011, pp. 223-228, see http://lsp.epfl.ch/colorpublications

Deliverables: Report and running prototype (Matlab and/or Android).

Prerequisites:

– knowledge of image processing / computer vision
– coding skills in Matlab (and possibly Java Android)

Level: MS semester project

Supervisor:
Dr Romain Rossier, Innoview Sàrl, [email protected], tel 078 664 36 44

Description

This project is concerned with the classification of degraded images. More specifically, you will work with (1) distribution-learning networks and (2) a novel generative restoration framework, integrating both with image classifiers.
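As a rough illustration of the integration point (the actual networks are internal to the lab, so the restoration module below is a placeholder), the restoration framework can be chained in front of a standard classifier:

```python
# Sketch of restoration-then-classification; `restorer` is a placeholder
# for the lab's (unpublished) restoration network.
import torch.nn as nn
from torchvision.models import resnet18

class RestoreThenClassify(nn.Module):
    def __init__(self, restorer: nn.Module, num_classes: int = 10):
        super().__init__()
        self.restorer = restorer                  # degraded -> restored image
        self.classifier = resnet18(num_classes=num_classes)

    def forward(self, degraded):
        return self.classifier(self.restorer(degraded))
```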

Deliverables

  • Code with thorough results analysis.

Prerequisites

  • Experience with PyTorch for deep learning. Experience with classification and restoration networks is a plus.

Type of work

 50% research, 50% implementation.

References

The references for this project are internal to the lab and not yet published.

Level

MS

Supervisor(s)

 Majed El Helou

Description (Master Semester Project open to EPFL students)

In this project, you will review the existing literature on weakly-supervised or unsupervised monocular depth estimation and build a model for estimating depth maps for images in a new, unseen domain. Many single-view depth estimation techniques have been applied to real-world images and found to work considerably well. However, achieving the same results is challenging in other image domains such as comics and cartoons. A possible solution is to develop a weakly-supervised technique for depth estimation in the unseen domain via domain adaptation. A better solution would be to perform zero-shot learning to predict relative depths.
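As a possible starting point, the model from [1] (MiDaS) can be run zero-shot on images from the new domain. A minimal inference sketch, assuming the intel-isl/MiDaS torch.hub entry points and a placeholder input file:

```python
# Zero-shot relative depth with MiDaS [1] via torch.hub.
import cv2
import torch

midas = torch.hub.load("intel-isl/MiDaS", "MiDaS")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").default_transform

img = cv2.cvtColor(cv2.imread("comic_page.png"), cv2.COLOR_BGR2RGB)  # placeholder
with torch.no_grad():
    depth = midas(transform(img))  # relative (inverse) depth map, batch of 1
```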

You may contact the supervisor at any time should you want to discuss the idea further.

References

[1] “Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer”; René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, Vladlen Koltun.

[2] “Digging Into Self-Supervised Monocular Depth Estimation”; Clément Godard, Oisin Mac Aodha, Michael Firman, Gabriel Brostow.

Type of Work (e.g., theory, programming)

50% research, 50% development and testing

Prerequisites

Experience in deep learning and computer vision; experience in Python and PyTorch; experience in statistical analysis.

Models will run on Kubernetes. (We will show you how to use Kubernetes; no prior knowledge required.)

Supervisor(s)

Deblina BHATTACHARJEE ([email protected])

 

Description: In this project, you will work on improving machine learning models for predicting TV series performance on streaming platforms. Based on our database of ~200,000 series, and using our deep learning model for automatic video and text genre detection, you will develop a solution that predicts series performance by combining video/text analysis with features extracted from network representations of the Internet Movie Database.
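A minimal sketch of the feature-fusion idea, with random placeholder arrays standing in for the real video/text embeddings and IMDb graph features:

```python
# Fuse content features with IMDb network features and cross-validate a
# simple regressor. All arrays below are random placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_series = 500
video_text_embeddings = rng.normal(size=(n_series, 64))  # placeholder embeddings
imdb_graph_features = rng.normal(size=(n_series, 16))    # e.g. cast centrality
performance_scores = rng.normal(size=n_series)           # placeholder target

X = np.hstack([video_text_embeddings, imdb_graph_features])
print(cross_val_score(GradientBoostingRegressor(), X, performance_scores,
                      cv=5, scoring="r2"))
```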

Tasks:
– Analyse the literature and add missing data useful for prediction.
– Implement and test different machine learning approaches for series performance prediction.
– Test the model on case studies.

Deliverables: At the end of the semester, the student should have implemented and tested machine learning models for series performance prediction.

Prerequisites: Experience in deep learning and machine learning, experience in Python. Experience in statistical analysis.

Type of work: 40% research, 60% development and testing

References:
– Jeon, Hongjun, et al. “Hybrid machine learning approach for popularity prediction of newly released contents of online video streaming services.” Technological Forecasting and Social Change 161 (2020): 120303.
– Fukushima, Yusuke, Toshihiko Yamasaki, and Kiyoharu Aizawa. “Audience ratings prediction of tv dramas based on the cast and their popularity.” 2016 IEEE Second International Conference on Multimedia Big Data (BigMM). IEEE, 2016.
– Ghiassi, Manoochehr, David Lio, and Brian Moon. “Pre-production forecasting of movie revenues with a dynamic artificial neural network.” Expert Systems with Applications 42.6 (2015): 3176-3193.

Level: Master

Supervisor: Ilaria Lauzana ([email protected])

 

Description: In this project, you will first work on extracting character information from scripts. You will then train sentiment analysis models to predict the personality and likeability of characters in a script.
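A minimal sketch of the first step, assuming the common screenplay convention of an upper-case character cue followed by dialogue lines; real scripts will need a more robust parser:

```python
# Group dialogue lines by speaker, assuming upper-case character cues.
import re
from collections import defaultdict

CUE = re.compile(r"^\s*([A-Z][A-Z .\-']+)\s*$")

def dialogue_by_character(script_text):
    speech = defaultdict(list)
    current = None
    for line in script_text.splitlines():
        m = CUE.match(line)
        if m:
            current = m.group(1).strip()       # new speaker
        elif line.strip() and current:
            speech[current].append(line.strip())
        elif not line.strip():
            current = None                     # blank line ends the speech
    return speech
```

Each character's collected dialogue can then be fed to the sentiment and personality models.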

Tasks:
– Understand the literature and our framework.
– Improve the script parsing framework to extract necessary character information.
– Train NLP-based models for sentiment analysis and personality prediction.
– Test the model on case studies.

Deliverables: At the end of the semester, the student should provide a framework for predicting character likeability from a script.

Prerequisites: Experience in deep learning and NLP, experience in Python, experience in Keras/TensorFlow.

Type of work: 50% research, 50% development and testing

References:
– Flekova, Lucie, and Iryna Gurevych. “Personality profiling of fictional characters using sense-level links between lexical resources.” Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015.
– Jacobs, Arthur M. “Sentiment analysis for words and fiction characters from the perspective of computational (Neuro-) poetics.” Frontiers in Robotics and AI 6 (2019): 53.
– Konijn, Elly A., and Johan F. Hoorn. “Some like it bad: Testing a model for perceiving and experiencing fictional characters.” Media psychology 7.2 (2005): 107-144.

Level: Master

Supervisor: Ilaria Lauzana ([email protected])

Description:

Startup company Innoview Sàrl has developed software to recover, with a smartphone, a watermark hidden in a grayscale image that uses halftones to display simple graphical elements such as a logo. The software has now been extended to hide the watermark within the graphical elements themselves. Adapt this software to run on an Android smartphone, and tune and optimize the available parameters.

Deliverables: Report and running prototype (Matlab and/or Android).

Prerequisites:

– knowledge of image processing / computer vision

– basic coding skills in Matlab and Java Android

Level: BS or MS semester project or possibly master project

Supervisors:

Dr Romain Rossier, Innoview Sàrl, [email protected], tel 078 664 36 44

Prof. Roger D. Hersch, INM034, [email protected], cell: 077 406 27 09

Description:

Startup company Innoview Sàrl has developed software to recover, with a smartphone, a hidden watermark printed on a desktop Epson printer. Special Epson P50 printer driver software enables printing the hidden watermark. The Epson P50 printer has now been replaced by new types of Epson printers that require modified driver software. The project consists of understanding the previous driver software and modifying it so as to drive the new Epson printers. Reverse engineering will be necessary to obtain some of the new, undocumented driver codes.

Deliverables: Report and running prototype (C, C++ or Matlab).

Prerequisites:

– knowledge of image processing

– basic coding skills in C, C++ or Matlab

Level: BS or MS semester project

Supervisors:

Dr Romain Rossier, Innoview Sàrl, [email protected], tel 078 664 36 44

Prof. Roger D. Hersch, INM034, [email protected], cell: 077 406 27 09

Description: 

Face detection is the task of identifying human faces in natural images. Convolutional and deep neural networks have proved effective at detecting faces. However, the performance of these approaches drops significantly on artistic images such as drawings, paintings and illustrations, due to the limited training data in these domains.

In this project, we will perform face detection on comics characters. These faces differ from natural human faces due to the artistic interpretation of the authors and the fantastic nature of the characters. Therefore, we will use transfer learning and domain adaptation techniques to extract and translate facial information between different domains.
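A minimal transfer-learning sketch: start from a torchvision Faster R-CNN pretrained on natural images and replace its box head with a single face class, to be fine-tuned on annotated comic pages (the comics data itself is not shown):

```python
# Re-head a pretrained detector for "face vs. background" fine-tuning.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)
```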

Tasks:

– Understand the literature and the state of the art

– Test several face detection algorithms on comics

– Develop a method to detect the faces of different characters across multiple artistic styles

– Compare the performance of existing state-of-the-art face detection algorithms with our method

Prerequisites:

Experience in machine learning and computer vision, experience in Python, experience in deep learning frameworks

Deliverables:

At the end of the semester, the student should provide a face detection framework and a report on the work.

Level:

MS semester or thesis project

Type of work:

65% research, 35% development and testing

References:

[1] X. Qin, Y. Zhou, Z. He, Y. Wang and Z. Tang, “A Faster R-CNN Based Method for Comic Characters Face Detection,” 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 1074-1080, doi: 10.1109/ICDAR.2017.178.

[2] N. Inoue, R. Furuta, T. Yamasaki, K. Aizawa, Cross-domain weakly-supervised object detection through progressive domain adaptation, arXiv:1803.11365 (2018).

[3] W. Sun, J. Burie, J. Ogier and K. Kise, “Specific Comic Character Detection Using Local Feature Matching,” 2013 12th International Conference on Document Analysis and Recognition, Washington, DC, 2013, pp. 275-279, doi: 10.1109/ICDAR.2013.62.

Supervisor: Bahar Aydemir ([email protected])

Description:

The comics domain lacks the annotations required for training and evaluating machine learning models. Pixel-wise annotations are crucial for the segmentation task. Our aim is to build an annotation tool for instance masks in comics and to create a comics segmentation dataset. Previous work on superpixels and foreground extraction can make the annotation process more efficient.
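A minimal sketch of how [1] and [2] could be combined to bootstrap an instance mask from a single annotator-drawn box; the file name and box coordinates are placeholders:

```python
# GrabCut [2] from a bounding box, then snap the mask to SLIC [1] superpixels.
import cv2
import numpy as np
from skimage.segmentation import slic

img = cv2.imread("comic_panel.png")     # placeholder page/panel
rect = (50, 50, 200, 200)               # placeholder annotator box (x, y, w, h)

mask = np.zeros(img.shape[:2], np.uint8)
bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
cv2.grabCut(img, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
fg = np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD))

# Keep a superpixel if the majority of its pixels are foreground.
segments = slic(cv2.cvtColor(img, cv2.COLOR_BGR2RGB), n_segments=400)
refined = np.zeros_like(fg)
for s in np.unique(segments):
    inside = segments == s
    refined[inside] = fg[inside].mean() > 0.5
```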

References:
[1] Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Süsstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).

[2] Rother, C., Kolmogorov, V., & Blake, A. (2004). “GrabCut”: Interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics (TOG).

Deliverables:

Report, annotations and running prototype (Python or Matlab).

Prerequisites:

– Knowledge of image processing
– Basic coding skills in Python and basic deep learning knowledge

Level:

BS semester project

Supervisor: Baran Ozaydin (baran.ozaydin@epfl.ch)

Description:

Traditional methods for instance-level image segmentation deal poorly with other imaging domains such as comics, due to the lack of annotated data in these domains. In this project, we will implement state-of-the-art methods for this task and apply them to comics datasets. In addition, we will propose a weakly- or unsupervised instance-level image segmentation method that leverages a domain adaptation technique.
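A minimal baseline sketch: running a COCO-pretrained Mask R-CNN from torchvision on a comic page (placeholder file name) exposes how poorly natural-image models transfer, which is the gap the domain adaptation method should close:

```python
# Off-the-shelf Mask R-CNN inference as a comics baseline.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

img = to_tensor(Image.open("comic_page.png").convert("RGB"))  # placeholder
with torch.no_grad():
    out = model([img])[0]  # dict with "boxes", "labels", "scores", "masks"
```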

References:
[1] P. O. Pinheiro, R. Collobert, and P. Dollar, “Learning to segment object candidates,” NIPS, 2015.
[2] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, “Learning Deep Features for Discriminative Localization,” CVPR, 2016.
[3] A. Rozantsev, M. Salzmann, and P. Fua, “Residual parameter transfer for deep domain adaptation,” CoRR, 2017.

Deliverables: Report and reproducible implementations

Prerequisites: Experience with deep learning, Pytorch, computer vision

Level: MS semester project

Type of work: 60% research, 40% implementation

Supervisors: Baran Ozaydin (baran.ozaydin@epfl.ch)

Description: 

Visual saliency refers to the parts of a scene that capture our attention. Current approaches to saliency estimation construct ground truth from eye tracking data on natural images. In our project, however, we will perform eye tracking on comics pages instead of natural images. We will then use the collected data to estimate saliency in the comics domain. In this project, you will run an eye tracking experiment with mobile eye tracking glasses.
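For background, a minimal sketch of how recorded fixations are typically turned into a ground-truth saliency map (as in datasets like [1]); the Gaussian width is an assumption that should match roughly one degree of visual angle in the final setup:

```python
# Fixation points -> continuous saliency map via Gaussian blurring.
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_map(fixations, height, width, sigma=30.0):  # sigma: assumed value
    m = np.zeros((height, width))
    for x, y in fixations:               # fixation coordinates in page pixels
        m[int(y), int(x)] += 1.0
    m = gaussian_filter(m, sigma)
    return m / m.max() if m.max() > 0 else m
```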

Tasks:
– Understand the key points of an eye tracking experiment and our setup.

– Conduct an eye tracking experiment according to given instructions. 

Deliverables: At the end of the semester, the student should provide the collected data and a report of the work.

Type of work: 20% research, 80% development and testing

References:

[1] A. Borji and L. Itti, “CAT2000: A large scale fixation dataset for boosting saliency research,” CVPR 2015 Workshop on “Future of Datasets”, 2015.

[2] K. Kunze, Y. Utsumi, Y. Shiga, K. Kise, and A. Bulling, “I know what you are reading: recognition of document types using mobile eye tracking,” Proceedings of the 2013 International Symposium on Wearable Computers, September 8-12, 2013, Zurich, Switzerland.

[3] K. Khetarpal and E. Jain, “A preliminary benchmark of four saliency algorithms on comic art,” 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), Seattle, WA.

Level: BS semester project

Supervisor: Bahar Aydemir ([email protected])

 Description:

Recent studies have demonstrated that current deep neural networks (DNNs) are vulnerable to crafted perturbations that can cause misclassification. Such perturbations can be generated by norm-bounded adversarial attacks, which limit the Lp distortion. However, a critical limitation of existing adversarial attacks is that they merely try to get DNNs to make incorrect predictions, neglecting the semantic properties of images. In recent years, several semantic adversarial attacks have been proposed, such as EdgeFool [1] and ColorFool [2]. They craft content-based perturbations by mimicking the effect of traditional image processing filters.

In this project, we will analyze the characteristics of semantic adversarial examples generated by content-based adversarial attacks, compared to normal images processed by traditional image processing filters and to norm-bounded adversarial examples. Based on this investigation, we will be able to better understand how DNNs perceive image data.
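For the norm-bounded side of the comparison, a minimal sketch of the classic FGSM attack; the epsilon value is an assumption, and the model and inputs are placeholders:

```python
# One-step L-infinity-bounded attack (FGSM): perturb each pixel by
# +/- eps in the direction that increases the classification loss.
import torch
import torch.nn.functional as F

def fgsm(model, x, label, eps=8 / 255):   # eps: assumed budget
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), label).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```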

Tasks

  • Understand the literature and our framework.
  • Implement an existing state-of-the-art (SOTA) semantic adversarial attack.
  • Investigate the difference between the semantic adversarial examples, normal images processed by traditional image processing filters, and norm-bounded adversarial examples.

Deliverables

  • Project report
  • Reproducible code

Prerequisites

  • Knowledge of image processing and computer vision
  • Basic coding skills in Python for deep learning

Type of work

40% research, 60% development and testing

References

[1] A. S. Shamsabadi et al., “EdgeFool: An adversarial image enhancement filter,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.

[2] A. S. Shamsabadi et al., “ColorFool: Semantic Adversarial Colorization,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

Level

BS Semester Project (Spring 2021)

Supervisor(s)

Hakgu Kim ([email protected])

Description (Master Semester Project open to EPFL students)

“Recent studies suggest that deep neural networks can take advantage of contextual representation for the estimation of a depth map for a given image. Therefore, focusing on the scene context can be beneficial for successful depth estimation.” [1]

In this project, you will review the existing literature on weakly-supervised or unsupervised monocular depth estimation and build a model that incorporates contextual information to estimate depth maps for images. A possible solution is to use [3] and incorporate the context relationships between objects into a pre-existing state-of-the-art depth estimation network.
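For illustration, a minimal sketch of a context-aggregation block in the spirit of [2], where every spatial location attends to all others so that depth features can draw on scene-level context; channel sizes are assumptions:

```python
# Self-attention over spatial positions of a depth-network feature map.
import torch
import torch.nn as nn

class ContextBlock(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.q = nn.Conv2d(channels, channels // 8, 1)
        self.k = nn.Conv2d(channels, channels // 8, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, feat):
        b, c, h, w = feat.shape
        q = self.q(feat).flatten(2).transpose(1, 2)        # (b, hw, c')
        k = self.k(feat).flatten(2)                        # (b, c', hw)
        attn = torch.softmax(q @ k / k.shape[1] ** 0.5, dim=-1)
        v = self.v(feat).flatten(2).transpose(1, 2)        # (b, hw, c)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return feat + out                                  # residual
```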

You may contact the supervisor at any time should you want to discuss the idea further.

References

[1] D. Kim, S. Lee, J. Lee and J. Kim, “Leveraging Contextual Information for Monocular Depth Estimation,” in IEEE Access, vol. 8, pp. 147808-147817, 2020, doi: 10.1109/ACCESS.2020.3016008.

[2] Yuru Chen, Haitao Zhao, Zhengwei Hu, “Attention-based Context Aggregation Network for Monocular Depth Estimation”, https://arxiv.org/abs/1901.10137

[3] Chenhan Jiang, Hang Xu, Xiaodan Liang, Liang Lin, “Hybrid Knowledge Routed Network for Large-scale Object Detection”, https://github.com/chanyn/HKRM

Type of Work (e.g., theory, programming)

50% research, 50% development and testing

Prerequisites

Experience in deep learning and computer vision; experience in Python and PyTorch; experience in statistical analysis.

Models will run on Kubernetes. (We will show you how to use Kubernetes; no prior knowledge required.)

Supervisor(s)

Deblina BHATTACHARJEE ([email protected])

Description

Deep neural networks (DNNs) have achieved great success in various vision applications, yet recent studies have shown that DNNs are vulnerable to adversarial examples: manipulated images designed to mislead DNNs into making incorrect predictions. Currently, most such adversarial examples are generated with subtle additive perturbations constrained to Lp-norm balls.

In this project, we will propose a new framework for generating natural-looking adversarial examples via a GAN-based face image editing model. Based on a user's sketch and color input, the proposed method will synthesize adversarial face examples that look natural to human perception but are misrecognized by DNNs (i.e., machine perception).
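As a rough illustration of the kind of objective involved (the handles and weighting below are assumptions, not the project's final design), an attack term pushes the classifier away from the true label while a fidelity term keeps the result close to the benign edit:

```python
# Combined objective: fool the classifier, stay visually natural.
import torch.nn.functional as F

def adversarial_edit_loss(edited, reference, classifier, true_label, lam=10.0):
    attack = -F.cross_entropy(classifier(edited), true_label)  # untargeted
    fidelity = F.l1_loss(edited, reference)   # closeness to the benign edit
    return attack + lam * fidelity            # lam: assumed trade-off weight
```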

Tasks

  • Understand the literature and our framework.
  • Implement an existing state-of-the-art (SOTA) GAN-based image editing model.
  • Develop a method to generate natural looking adversarial examples which can fool the deep learning classifier by designing a novel objective function.
  • Measure the visual quality and attack success rate of the generated adversarial examples.

Deliverables

  • Project report
  • Reproducible code

Prerequisites

  • Experience and knowledge of deep learning and computer vision
  • Experience with TensorFlow and PyTorch for deep learning

Type of work

50% research, 50% development and testing

References

[1] Y. Jo and J. Park, “SC-FEGAN: Face Editing Generative Adversarial Network with User’s Sketch and Color,” in IEEE International Conference on Computer Vision (ICCV), 2019.

[2] H. Qiu et al., “Semanticadv: Generating adversarial examples via attribute-conditional image editing” in European Conference on Computer Vision (ECCV), 2020.

Level

MS Semester Project (Spring 2021)

Supervisor(s)

Hakgu Kim ([email protected])

Description:

Modern deep neural networks are the state-of-the-art technique for many applications such as computer vision and natural language processing, but they are vulnerable to adversarial attacks. They are also over-parameterized: millions, or even billions, of parameters make them difficult to deploy on memory-constrained devices such as mobile phones.

In this project, we will combine model robustness with compression, especially model pruning. The first part of the project is to reproduce state-of-the-art pruning methods for adversarially robust neural networks. Based on that, we will explore methods to either improve the performance or implement network pruning under more difficult settings. For example, related to the popular Lottery Ticket Hypothesis, some recent work finds that randomly weighted networks contain subnetworks of competitive performance even without any training. Despite the many interesting phenomena in network pruning, most have been observed only in vanilla settings, i.e., without adversarial attacks. We would like to explore whether these phenomena still hold when we consider adversarial attacks.
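For illustration, a minimal sketch of global magnitude pruning with PyTorch's built-in pruning utilities; in the project, pruning would be interleaved with adversarial training rather than applied post hoc as here:

```python
# Globally remove the 80% of conv/linear weights with smallest magnitude.
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision.models import resnet18

model = resnet18()
params = [(m, "weight") for m in model.modules()
          if isinstance(m, (nn.Conv2d, nn.Linear))]
prune.global_unstructured(params, pruning_method=prune.L1Unstructured,
                          amount=0.8)
```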

More details and references: Full project description

Deliverables:

Report. Reproducible code. Possible paper submission.

Prerequisites:

Mathematical foundations (calculus, linear algebra, probability). Optimization. Deep learning.

Level:

MS semester project. (Spring 2021)

Type of work:

20% literature review, 50% research, 30% development and testing.

Supervisor:

Chen Liu ([email protected])