Available Projects – Fall 2020

Description:

Startup company Innoview Sàrl has developed software to recover, with a smartphone, a watermark hidden in a grayscale image that uses halftones to display simple graphical elements such as a logo. The software has now been extended to hide the watermark within the graphical elements themselves. The project consists in adapting this software to run on an Android smartphone, and in tuning and optimizing the available parameters.

Deliverables:

Report and running prototype (Matlab and/or Android).

Prerequisites:

– Knowledge of image processing / computer vision

– Basic coding skills in Matlab and Java Android

Level:

BS or MS semester project or possibly master project

Supervisor:

Dr Romain Rossier, Innoview Sàrl, [email protected], tel 078 664 36 44

Prof. Roger D. Hersch, INM034, [email protected], cell: 077 406 27 09

Description:

Startup company Innoview Sàrl has developed an interactive interface for selecting one code-hiding method among several proposed methods. The interface also allows specifying the parameters of the created graphic or image elements. The project aims at further extending the interface to support new code-hiding methods and to test the influence of the different parameters.

Prerequisites:

– knowledge of image processing / computer vision

– basic coding skills in Matlab and C#

Level: BS or MS semester project

Supervisors:

Dr Romain Rossier, Innoview Sàrl, [email protected], tel 078 664 36 44

Prof. Roger D. Hersch, INM034, [email protected], cell: 077 406 27 09

Synopsis:

A watermark printed with daylight fluorescent inks and hidden in a color background can be recovered under UV light. The goal is to adapt an already developed Android software package that acquires images embedding such hidden watermarks. Recovering the watermark requires on-the-fly image acquisition, real-time sharpness evaluation, and appropriate image processing algorithms.
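As a concrete illustration of the sharpness evaluation step, a common generic focus measure is the variance of the Laplacian; the Python/OpenCV sketch below is an illustration, not necessarily the measure used in the actual software, and the threshold value is an assumption.

    # Simple focus measure: variance of the Laplacian (higher = sharper).
    import cv2

    def sharpness_score(frame_bgr):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        return cv2.Laplacian(gray, cv2.CV_64F).var()

    # Keep only frames above an empirically chosen threshold:
    # if sharpness_score(frame) > 100.0:   # threshold value is an assumption
    #     decode_watermark(frame)          # hypothetical downstream step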

Reference:

  1. R. Rossier, R.D. Hersch, Hiding patterns with daylight fluorescent inks, Proc. IS&T/SID’s 19th Color Imaging Conference, San Jose, CA, USA, November 7-11, 2011, pp. 223-228, see http://lsp.epfl.ch/colorpublications

Deliverables: Report and running prototype (Matlab and/or Android).

Prerequisites:

– knowledge of image processing / computer vision
– coding skills in Matlab (and possibly Java Android)

Level: MS semester project


Supervisor:
Dr Romain Rossier, Innoview Sàrl, [email protected], tel 078 664 36 44

Description:

Modern deep learning systems are known to be vulnerable to adversarial attacks: small, well-designed adversarial perturbations can make a state-of-the-art model predict a wrong label with very high confidence. Adversarial training based on the Fast Gradient Sign Method (FGSM) [1] and on Projected Gradient Descent (PGD) [2] are two effective methods to obtain robust models against adversarial attacks. In this project, we study the validity and strength of FGSM-based and PGD-based adversarial training. Furthermore, we will take a look at the loss landscape of the training objective in normal training, FGSM-based training and PGD-based training. (Full description is available in this document.)
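For reference, a minimal PyTorch sketch of the PGD attack of [2] follows; the epsilon, step size and iteration count are illustrative values, not the project's settings. FGSM [1] is the single-step special case (steps=1, alpha=eps, no random start).

    # PGD adversarial example inside an L-infinity ball of radius eps.
    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()                    # ascend the loss
                x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project into the ball
                x_adv = x_adv.clamp(0, 1)                              # stay a valid image
        return x_adv.detach()

Adversarial training then minimizes the classification loss on these perturbed inputs instead of the clean ones.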

References:

[1] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. ICLR 2015.

[2] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. ICLR 2018.

Deliverables:

Report. Reproducible code. Visualization of loss landscape studied.

Prerequisites:

Mathematical foundations (calculus, linear algebra). Gradient descent. Deep learning.

Level:

BS semester project. (Spring 2020)

Type of work:

20% literature review, 50% research, 30% development and testing.

Supervisor: Chen Liu

Description:

This project will combine model robustness with parameter binarization. We will investigate the robustness of a specific kind of network where all parameters are binary. How to train binary networks in a non-adversarial environment has been well studied in recent years [1, 2]. For non-binary networks, Projected Gradient Descent (PGD) [3] is a straightforward but empirically effective method to obtain robust models. In this project, we will study the robustness properties of binary networks. We will further design algorithms to train robust binary networks. (Full description is available in this document.)
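As an illustration of the binarization building block in the spirit of [1, 2], the PyTorch sketch below binarizes weights in the forward pass and passes gradients through with a straight-through estimator; it is a minimal example, not the project's training code.

    # Binary-weight layer with a straight-through estimator (STE).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BinarizeSTE(torch.autograd.Function):
        @staticmethod
        def forward(ctx, w):
            ctx.save_for_backward(w)
            return w.sign()                              # forward sees +1 / -1

        @staticmethod
        def backward(ctx, grad_out):
            w, = ctx.saved_tensors
            return grad_out * (w.abs() <= 1).float()     # clip gradient outside [-1, 1]

    class BinaryLinear(nn.Linear):
        def forward(self, x):
            return F.linear(x, BinarizeSTE.apply(self.weight), self.bias)

The optimizer keeps updating the real-valued latent weights; only the forward pass uses their binarized values.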

References:

[1] Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. BinaryConnect: Training deep neural networks with binary weights during propagations. NIPS 2015.

[2] Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1. arXiv preprint, 2016.

[3] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. ICLR 2018.

Deliverables:

Report. Reproducible code. Possible paper submission.

Prerequisites:

Mathematical foundations (calculus, linear algebra). Optimization (gradient descent, primal-dual method). Deep learning.

Level:

MS semester project. (Spring 2020)

Type of work:

20% literature review, 50% research, 30% development and testing.

Supervisor: Chen Liu

Synopsis:

Traditional methods for instance-level image segmentation have provided limited ability to deal with other imaging domains such as comics, due to the lack of annotated data in these domains. In this project, we will implement state-of-the-art methods for this task and apply them to comics datasets. In addition, we will propose a weakly- or un-supervised instance-level image segmentation method that leverages a domain adaptation technique.
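As a possible starting point, the sketch below runs a COCO-pretrained Mask R-CNN from torchvision on a single comics panel to obtain baseline instance masks before any domain adaptation; the file name and confidence threshold are placeholder assumptions.

    # Baseline: COCO-pretrained Mask R-CNN applied to a comics panel.
    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor
    from PIL import Image

    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
    model.eval()

    img = to_tensor(Image.open("comic_panel.png").convert("RGB"))  # hypothetical file
    with torch.no_grad():
        out = model([img])[0]       # dict with boxes, labels, scores, masks

    keep = out["scores"] > 0.5      # confidence threshold: an assumption
    masks = out["masks"][keep]      # (N, 1, H, W) soft instance masks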
 
References:
[1] P. O. Pinheiro, R. Collobert, and P. Dollar, “Learning to segment object candidates,” NIPS, 2015.
[2] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, “Learning Deep Features for Discriminative Localization,” CVPR, 2016.
[3] A. Rozantsev, M. Salzmann, and P. Fua, “Residual parameter transfer for deep domain adaptation,” CoRR, 2017.
 
Deliverables: Report and reproducible implementations

Prerequisites: Experience with deep learning in PyTorch or another framework, computer vision

Level: MS semester project

Type of work: 60% research, 40% implementation

Supervisor: Baran Ozaydin ([email protected])

Description: 

Startup company Innoview Sàrl has developed software to recover, with a smartphone, a hidden watermark printed on a desktop Epson printer. Special Epson P50 printer driver software enables printing the hidden watermark. That Epson P50 printer has now been replaced by new types of Epson printers that require modified driver software. The project consists in understanding the previous driver software and in modifying it so as to be able to drive the new Epson printers. Possibly, reverse engineering will be necessary to obtain some of the new undocumented driver codes.

Deliverables: Report and running prototype (C, C++ or Matlab).

Prerequisites:

– knowledge of image processing

– basic coding skills in C, C++ or Matlab

Level: BS or MS semester project

Supervisors:

Dr Romain Rossier, Innoview Sàrl, [email protected], tel 078 664 36 44

Prof. Roger D. Hersch, INM034, [email protected], cell: 077 406 27 09

Description:

Startup company “Global ID” has developed a 3D finger vein biometric identification system. To improve and verify the performance of their biometric identification system, they need a large number of finger vein images. However, it is cumbersome, time consuming, and labor intensive to collect a large-scale dataset through experiments with human subjects. To address this, in this project we will generate a large number of finger vein images from a small number of real finger vein images using deep generative models. For this purpose, we will study generative adversarial networks (GANs). Furthermore, we will design a GAN-based data augmentation algorithm for synthesising realistic finger vein images.
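For orientation, the sketch below shows a single vanilla GAN training step in PyTorch with toy fully-connected networks; the actual architectures and losses for vein image synthesis are part of the project and will differ.

    # One vanilla GAN step: D learns real-vs-fake, G learns to fool D.
    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 64 * 64), nn.Tanh())
    D = nn.Sequential(nn.Linear(64 * 64, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    def train_step(real):  # real: (B, 64*64) vein images scaled to [-1, 1]
        b = real.size(0)
        fake = G(torch.randn(b, 100))
        # Discriminator: push real toward 1, fake toward 0.
        loss_d = bce(D(real), torch.ones(b, 1)) + bce(D(fake.detach()), torch.zeros(b, 1))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        # Generator: make D believe the fakes are real.
        loss_g = bce(D(fake), torch.ones(b, 1))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()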

Prerequisites:

  • Knowledge of image processing and computer vision
  • Experience with PyTorch for deep learning

Deliverables:

  • Report
  • Reproducible code with the augmented dataset

Level:

MS Semester Project 

Type of work:

40% research, 60% development and testing

Supervisor: Hakgu Kim ([email protected])

Description:

Visual saliency refers to the parts of a scene that capture our attention. Conventionally, many saliency detection techniques, including approaches based on deep convolutional neural networks (CNNs), try to predict where people look without considering temporal information. However, the temporal ordering of fixations carries important cues about which parts of the image may be more prominent. In this project, you will work with time-weighted saliency maps or saliency volumes given the eye fixations, ground-truth saliency maps and images. In experiments, you will investigate how much saliency estimation performance is boosted by incorporating temporal information.
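As a minimal illustration of a time-weighted saliency map, the Python sketch below accumulates fixations with a weight that decays with fixation order and then blurs the result; the 1/(t+1) weighting and the Gaussian sigma are illustrative assumptions, not a method taken from the references.

    # Build a time-weighted saliency map: earlier fixations weigh more.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def time_weighted_saliency(fixations, height, width, sigma=25.0):
        """fixations: list of (x, y) pixel coordinates in temporal order."""
        sal = np.zeros((height, width))
        for t, (x, y) in enumerate(fixations):
            sal[int(y), int(x)] += 1.0 / (t + 1)   # weight decays over time
        sal = gaussian_filter(sal, sigma)          # spread fixations spatially
        return sal / sal.max() if sal.max() > 0 else sal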

References:

[1] M. Assens Reina, X. Giro-i-Nieto, K. McGuinness, and N. E. O’Connor, “SaltiNet: Scan-path prediction on 360 degree images using saliency volumes,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2331–2338.

[2] Fosco, C., Newman, A., Sukhum, P., Zhang, Y. B., Oliva, A., & Bylinskii, Z. (2019). How Many Glances? Modeling Multi-duration Saliency. In SVRHM Workshop at NeurIPS.

Tasks:

– Understand the literature and the state of the art

– Implement several temporal saliency prediction algorithms

– Develop a method to estimate saliency by incorporating temporal information

– Compare the performance of existing state-of-the-art saliency algorithms and temporal algorithms on spatial saliency maps

Prerequisites:

Experience in machine learning and computer vision, experience in Python, experience in deep learning frameworks

Deliverables:

At the end of the semester, the student should provide a framework for saliency prediction and a report of the work.

Level:

MS semester or thesis project

Type of work:

60% research, 40% development and testing

Supervisor: Bahar Aydemir ([email protected])

Description:

This project is ideal for future PhD applicants, as we expect it to turn into a research publication at the end of the semester (conditioned on the quality of results). It revolves around classification neural networks and how they can deal with degraded images. You would be working with the most recent deep-learning based image processing and classification methods, guided by a clear research plan. If interested, reach out to the supervisors for further details and to discuss whether the project would fit you.

Deliverables: Code with thorough results analysis.

Prerequisites: Experience with PyTorch for deep learning. Experience with classification networks and restoration networks is a plus.

Type of work: 50% research, 50% implementation

Level: MS

Supervisor(s): Majed El Helou, Deblina Bhattacharjee

Description:

In this project, you will work on the recommendation system of the short film streaming platform Sofy.tv. Sofy.tv uses a recommendation system based on the “recipes” of the movies, found through our multi-channel deep learning system. The goal of this project is to improve the current recommendations by considering users’ preferences in a hybrid recommender system.
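For intuition, a hybrid recommender can be as simple as a weighted blend of the content-based (“recipe”) similarity and a collaborative-filtering score, in the spirit of the references below; in this sketch the blending weight alpha is an assumption to be tuned on held-out user interactions.

    # Weighted hybrid of content-based and collaborative-filtering scores.
    import numpy as np

    def hybrid_scores(content_sim, cf_scores, alpha=0.5):
        """Per-film scores for one user; both arrays are min-max normalized."""
        c = (content_sim - content_sim.min()) / (content_sim.max() - content_sim.min() + 1e-9)
        f = (cf_scores - cf_scores.min()) / (cf_scores.max() - cf_scores.min() + 1e-9)
        return alpha * c + (1 - alpha) * f

    # Top-10 recommendations for the user:
    # top10 = np.argsort(hybrid_scores(content_sim, cf_scores))[::-1][:10]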

Tasks:
– Understand the literature and our framework.
– Revise our taste clustering system.
– Revise our matchmaking system between users and films.
– Test the revised model.

Deliverables: At the end of the semester, the student should provide an enhanced recommendation framework.

Prerequisites: Experience in machine learning/deep learning and computer vision, experience in Python, experience in Keras, Theano, or TensorFlow. Basic experience in web programming.

Type of work: 50% research, 50% development and testing.

References:
– Y. Hu, Y. Koren and C. Volinsky, “Collaborative Filtering for Implicit Feedback Datasets”, 2008 Eighth IEEE International Conference on Data Mining, 2008, pp. 263-272.
– R. Burke, “Hybrid Recommender Systems: Survey and Experiments”, User Model User-Adap Inter 12, 2002, pp. 331–370.

Level: Master

Supervisor: Ilaria Lauzana ([email protected])

Description: In this project, you will work with models for facial recognition and similarity between images to improve our character casting predictions (currently based on our movie-cast matching magic). In particular, we want to make casting propositions based on how similar candidates are to a chosen actor/actress, to propose appropriate actors for family relations, time jumps, etc., or to propose physically similar people to replace a proposed actor.
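A minimal version of such a similarity search ranks candidates by cosine similarity of face embeddings; in the sketch below, the embeddings are assumed to come from some pretrained face network, which is left unspecified here.

    # Rank candidate actors by cosine similarity of face embeddings.
    import numpy as np

    def cosine_sim(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    def most_similar(query_emb, candidate_embs, names, top_k=5):
        """Return the top_k candidate names most similar to the query face."""
        scores = [cosine_sim(query_emb, e) for e in candidate_embs]
        order = np.argsort(scores)[::-1][:top_k]
        return [(names[i], scores[i]) for i in order]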

Tasks:
– Understand the literature and our framework for character-actor matching.
– Implement models for facial similarity, with possible age gaps.
– Test the model on case studies.

Deliverables: At the end of the semester, the student should provide a framework for similarity predictions of faces based on multiple .

Prerequisites: Experience in deep learning and computer vision, experience in Python, experience in Keras, Theano, or TensorFlow.

Type of work: 50% research, 50% development and testing

References:
– S. Chopra, R. Hadsell and Y. LeCun, “Learning a similarity metric discriminatively, with application to face verification,” 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 2005, pp. 539-546 vol. 1.
– Li L., Feng X., Wu X., Xia Z., Hadid A. (2016) Kinship Verification from Faces via Similarity Metric Based Convolutional Neural Network. In: Campilho A., Karray F. (eds) Image Analysis and Recognition. ICIAR 2016. Lecture Notes in Computer Science, vol 9730. Springer, Cham
– Chen BC., Chen CS., Hsu W.H. (2014) Cross-Age Reference Coding for Age-Invariant Face Recognition and Retrieval. In: Fleet D., Pajdla T., Schiele B., Tuytelaars T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8694. Springer, Cham

Level: Master

Supervisor: Ilaria Lauzana ([email protected])

Description:

In this project, you will work on the development of machine learning models for the prediction of TV series performance on streaming platforms. Based on our database of ~200,000 series and using our deep learning model for automatic video and text genre detection, you will develop a solution to predict the performance of series by combining video/text analysis with features extracted from network representations of the Internet Movie Database.
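One simple baseline is to concatenate the genre-model features with graph features extracted from the IMDb network and fit a standard regressor; the sketch below uses random placeholder data, since the actual features and performance target are not public.

    # Baseline: concatenated features + gradient boosting regressor.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X_genre = rng.normal(size=(1000, 32))   # placeholder genre-model features
    X_graph = rng.normal(size=(1000, 8))    # placeholder IMDb-graph features
    y = rng.normal(size=1000)               # placeholder performance target

    X = np.hstack([X_genre, X_graph])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    reg = GradientBoostingRegressor().fit(X_tr, y_tr)
    print("Held-out R^2:", reg.score(X_te, y_te))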

Tasks:
– Understand the literature and our framework.
– Perform an in-depth statistical analysis of our series database.
– Implement and test different machine learning approaches for series performance prediction.
– Test the models on case studies.

Deliverables: At the end of the semester, the student should have implemented and tested machine learning models for series performance prediction.

Prerequisites: Experience in deep learning and machine learning, experience in Python. Experience in statistical analysis.

Type of work: 50% research, 50% development and testing

References:
– M. Ghiassi, David Lio, Brian Moon, Pre-production forecasting of movie revenues with a dynamic artificial neural network, Expert Systems with Applications, Volume 42, Issue 6, 2015, Pages 3176-3193.
– Simonoff, J. S. and Sparrow, I. R. Predicting movie grosses: Winners and losers, blockbusters and sleepers. In Chance, 2000.

Level: Master

Supervisor: Ilaria Lauzana ([email protected])

Description: 

Visual saliency refers to the parts of a scene that capture our attention. Current approaches for saliency estimation construct ground truth from eye tracking data on natural images. In our project, however, we will perform eye tracking on comics pages instead of natural images. Later, we will use the collected data to estimate saliency in the comics domain. In this project, you will work on an eye tracking experiment with mobile eye tracking glasses.

Tasks:
– Understand the key points of an eye tracking experiment and our setup.

– Conduct an eye tracking experiment according to given instructions. 

Deliverables: At the end of the semester, the student should provide the collected data and a report of the work.

Type of work: 20% research, 80% development and testing

References:

[1] A. Borji and L. Itti, “CAT2000: A large scale fixation dataset for boosting saliency research,” CVPR 2015 workshop on “Future of Datasets”, 2015.

[2] K. Kunze, Y. Utsumi, Y. Shiga, K. Kise, and A. Bulling, “I know what you are reading: recognition of document types using mobile eye tracking,” Proceedings of the 2013 International Symposium on Wearable Computers, Zurich, Switzerland, September 8-12, 2013.

[3] K. Khetarpal and E. Jain, “A preliminary benchmark of four saliency algorithms on comic art,” 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), Seattle, WA, 2016.

Level: BS semester project

Supervisor: Bahar Aydemir ([email protected])

Description: 

Visual saliency refers to the parts of a scene that capture our attention. Current approaches for saliency estimation construct ground truth from eye tracking data on natural images. In our project, however, we will perform eye tracking on comics pages instead of natural images. Later, we will use the collected data to estimate saliency in the comics domain. In this project, you will analyse the data collected from an eye tracking experiment.

Tasks:

– Perform a detailed analysis of the collected data by producing heatmaps, scanpaths and histograms.

– Evaluate a state-of-the-art saliency estimation model on the collected data and compare its performance with existing results on natural images (a minimal metric sketch follows this list).
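One standard metric for the evaluation task above is Normalized Scanpath Saliency (NSS): z-score the predicted saliency map and average it at the recorded fixation locations. A minimal Python sketch, assuming fixations are given as pixel coordinates:

    # Normalized Scanpath Saliency (NSS) between a predicted map and fixations.
    import numpy as np

    def nss(sal_map, fixations):
        """sal_map: 2D array; fixations: iterable of (x, y) pixel coordinates."""
        s = (sal_map - sal_map.mean()) / (sal_map.std() + 1e-9)  # z-score the map
        return float(np.mean([s[int(y), int(x)] for x, y in fixations]))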

Deliverables: At the end of the semester, the student should provide the analysis on the data and a report of the work.

Type of work: 20% research, 80% development and testing

References:

[1] A. Borji and L. Itti, “CAT2000: A large scale fixation dataset for boosting saliency research,” CVPR 2015 workshop on “Future of Datasets”, 2015.

[2] K. Kunze, Y. Utsumi, Y. Shiga, K. Kise, and A. Bulling, “I know what you are reading: recognition of document types using mobile eye tracking,” Proceedings of the 2013 International Symposium on Wearable Computers, Zurich, Switzerland, September 8-12, 2013.

[3] K. Khetarpal and E. Jain, “A preliminary benchmark of four saliency algorithms on comic art,” 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), Seattle, WA, 2016.

Level: BS semester project

Supervisor: Bahar Aydemir ([email protected])

Description:

The goal is to annotate the relative depth of objects in natural scenes and to come up with a model for relative depth ordering.

Deliverables:

Report and running prototype (Python or Matlab).

Prerequisites:

– Knowledge of image processing
– Basic coding skills in Python and basic deep learning knowledge

Level:

BS semester project

Supervisor: Deblina Bhattacharjee ([email protected])

Description: In this project, you will research the existing literature on weakly-supervised or unsupervised depth estimation and build a model for estimating depth maps in comics images. Traditionally, a myriad of depth estimation techniques have been applied to real-world images and found to work considerably well. However, it is challenging to achieve the same results in other image domains such as comics. A possible solution to this problem is domain adaptation, where you may take a model pretrained on a natural image dataset and transfer it to a comics dataset. A better solution is to develop a weakly-supervised technique for depth estimation in the comics domain. A good starting point is [3].

In this project, you will propose a framework for translating natural images to the comics domain [1-2] and a depth estimation network for the comics domain trained in a weakly-supervised manner. You may contact any of the supervisors at any time should you want to discuss the idea further.
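The image-to-image translation step of [1] rests on a cycle-consistency objective; the PyTorch sketch below states that loss for a natural/comics generator pair, with placeholder networks and an illustrative weight.

    # CycleGAN-style cycle-consistency loss for natural <-> comics translation.
    import torch.nn.functional as F

    def cycle_consistency_loss(G_nat2com, G_com2nat, x_nat, x_com, lam=10.0):
        """L1 cycle loss: translating there and back should reconstruct the input."""
        rec_nat = G_com2nat(G_nat2com(x_nat))   # natural -> comics -> natural
        rec_com = G_nat2com(G_com2nat(x_com))   # comics -> natural -> comics
        return lam * (F.l1_loss(rec_nat, x_nat) + F.l1_loss(rec_com, x_com))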

Tasks:
– Understand the literature and our framework.
– Implement an existing state-of-the-art (SOTA) depth estimation model trained on natural images.
– Develop a method to translate natural images to comics images using the depth maps of natural images.
– Compare the performance of existing SOTA models on natural images, generated images and comics images.

Deliverables: At the end of the semester, the student should provide a framework for depth estimation in the comics domain, along with a project report based on this work.

Prerequisites: Experience in deep learning and computer vision, experience in Python, experience in Keras, Theano, or TensorFlow. Experience in statistical analysis.

Type of work: 50% research, 50% development and testing

References:
[1] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”, in IEEE International Conference on Computer Vision (ICCV), 2017. 
[2] Youssef A. Mejjati, Christian Richardt, James Tompkin, Darren Cosker, and Kwang In Kim, “Unsupervised Attention-guided Image-to-Image Translation”, in Advances in  Neural Information Processing Systems (NIPS), 2018.
[3] Andrea Pilzer, Dan Xu, Mihai Marian Puscas, Elisa Ricci and Nicu Sebe, “Unsupervised Adversarial Depth Estimation using Cycled Generative Networks”, in Proceedings of the 6th International Conference on 3D Vision (3DV 2018). IEEE, 2018.
[4] Ziyu Zhang, Alexander G. Schwing, Sanja Fidler, and Raquel Urtasun. “Monocular Object Instance Segmentation and Depth Ordering with CNNs”, in IEEE International Conference on Computer Vision (ICCV), 2015. 

Level: Master

Supervisor: Deblina Bhattacharjee ([email protected]), Seungryong Kim ([email protected])