Deep learning for urban modeling from street-view images

The population of Africa will grow to 4.2 billion people by 2100, with urbanization proceeding at an unprecedented rate and in a largely unstructured way. The lack of up-to-date geospatial information prevents efficient decision making, makes cities less resilient, and leaves the population more exposed to natural disasters and to the effects of climate change. Several private and public organizations are currently acquiring street-view coverage of ever larger areas of Africa. Automated methods for image interpretation are, however, needed to extract synthetic, geo-referenced information on specific urban features relevant for urban planning and decision making.

Within this project, the student will implement a system to geo-reference (i.e., assign precise geographic coordinates on a map to) objects such as vehicles or buildings detected in 360-degree street-view images, following an approach recently proposed in [1]. This approach has been successfully employed to automatically build a catalog of all the trees in a city from Google Street View images.
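As an illustration of the geo-referencing step, the sketch below shows how a detection in an equirectangular 360-degree panorama can be converted into a compass bearing, and how two such bearings observed from two camera positions with known GPS coordinates can be intersected to place the object on the map. This is a minimal, simplified sketch under a planar approximation, not the actual pipeline of [1]; the function names (pixel_to_bearing, triangulate) and all parameters are assumptions made here for illustration.

```python
# Minimal geo-referencing sketch (illustrative only, not the method of [1]).
import math

def pixel_to_bearing(px_x: float, image_width: int, camera_heading_deg: float) -> float:
    """Convert the horizontal pixel coordinate of a detection in an
    equirectangular 360-degree panorama into a compass bearing (degrees),
    assuming the panorama spans 360 degrees and its centre column points
    along the camera heading."""
    offset_deg = (px_x / image_width - 0.5) * 360.0
    return (camera_heading_deg + offset_deg) % 360.0

def triangulate(lat1, lon1, bearing1_deg, lat2, lon2, bearing2_deg):
    """Intersect two bearing rays on a local planar approximation (acceptable
    for the short baselines typical of street-view captures) and return the
    estimated (lat, lon) of the observed object."""
    earth_r = 6371000.0  # mean Earth radius in metres
    to_rad = math.radians
    # Position of camera 2 in a local east/north metric frame centred on camera 1.
    dx = earth_r * to_rad(lon2 - lon1) * math.cos(to_rad(lat1))  # east
    dy = earth_r * to_rad(lat2 - lat1)                           # north
    # Unit ray directions (east, north) from compass bearings.
    d1 = (math.sin(to_rad(bearing1_deg)), math.cos(to_rad(bearing1_deg)))
    d2 = (math.sin(to_rad(bearing2_deg)), math.cos(to_rad(bearing2_deg)))
    # Solve t1 * d1 = (dx, dy) + t2 * d2 for t1 via Cramer's rule.
    denom = d1[0] * (-d2[1]) - d1[1] * (-d2[0])
    if abs(denom) < 1e-9:
        raise ValueError("Rays are (nearly) parallel, cannot triangulate")
    t1 = (dx * (-d2[1]) - dy * (-d2[0])) / denom
    px, py = t1 * d1[0], t1 * d1[1]
    # Back to geographic coordinates.
    lat = lat1 + math.degrees(py / earth_r)
    lon = lon1 + math.degrees(px / (earth_r * math.cos(to_rad(lat1))))
    return lat, lon

# Example usage (hypothetical values): the same object seen from two capture
# points a few metres apart.
# lat, lon = triangulate(46.2044, 6.1432, 45.0, 46.2045, 6.1433, 315.0)
```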

The student will learn the methods employed by the original authors and attempt to reproduce their results, targeting specific classes of objects and a unique dataset in Africa.

A tentative work plan follows:

  • Literature review: understand the methodology and the code employed in [1]. Explore and understand the structure of each neural network, its outputs, and the training data that was employed.
  • Reproduce the results in [1] on a different image dataset, using the same trained models for the object detection network and the graph neural network.
  • Re-train the object detection network to detect a specific class of objects (see the sketch after this list).
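For the last two steps, the sketch below shows one plausible way to re-train an off-the-shelf detector on a single target class by starting from pretrained weights and replacing the classification head. It uses torchvision's Faster R-CNN purely as an example; the actual detector used in [1], the dataset/data loader, and the hyper-parameters shown are assumptions, not part of the original project description.

```python
# Hedged fine-tuning sketch: single-class object detection with torchvision.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_single_class_detector():
    # Start from COCO-pretrained weights and replace the box predictor so it
    # outputs background + one target class (e.g., "building").
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)
    return model

def train(model, data_loader, num_epochs=10, lr=0.005, device="cuda"):
    # data_loader is assumed to yield (images, targets) in the standard
    # torchvision detection format: lists of image tensors and of dicts with
    # "boxes" and "labels" keys (the student would implement this dataset).
    model.to(device).train()
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=lr, momentum=0.9, weight_decay=5e-4)
    for epoch in range(num_epochs):
        for images, targets in data_loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            loss_dict = model(images, targets)  # detector returns a dict of losses in train mode
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch}: last batch loss {loss.item():.3f}")
```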

Recommended type of project:

Work breakdown:

Prerequisites:

Keywords:

Deep learning, graph neural networks, computer vision, urbanization, mapping

Contacts:

Davide Antonio Cucci

References:

[1] Nassar, Ahmed Samy, Sébastien Lefèvre, and Jan D. Wegner. arXiv preprint arXiv:2003.10151 (2020).