Instance Segmentation and Classification of Point Cloud Data – A Reinforcement/Semi-Supervised Deep Learning Approach in Forests

Airborne laser scanning (ALS) is a widely adopted remote sensing technology, renowned for its efficient and precise modeling of forests. This is attributed to its capability to accurately describe the geometric features of trees within a forest. However, automating the identification of individual trees and their species from ALS data poses a formidable challenge. Traditional closed-form clustering algorithms yield inaccurate segmentation results, and deep learning-based methods demand substantial amounts of labeled training data, which is impractical to establish manually.

This project aims to tackle the challenges associated with object labeling and accuracy by employing unsupervised and self-supervised approaches. Unsupervised methods are used to obtain a preliminary segmentation of the ALS data. These roughly segmented tree examples will then be hand-labeled and used to train a classifier that identifies well-segmented tree individuals. In the final step, these labels will be used to calibrate and refine state-of-the-art segmentation and classification algorithms in a semi-supervised approach.

Task Description and Methodology

The focus of this phase of the project is to develop a semi-supervised/reinforcement learning framework to enhance a point cloud segmentation task through an iterative approach. The goal is to improve the model’s performance with minimal labeled data, focusing on instances that are challenging or uncertain.

In the context of point cloud segmentation, this approach aims to enhance the accuracy and efficiency of segmentation models by strategically selecting points from the point cloud for annotation. Instance segmentation involves assigning a label to each point in the cloud, indicating which unique group (in this case, a tree individual) it belongs to among all the points in the point cloud.
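To make the per-point labeling concrete, here is a minimal sketch in Python with NumPy. The coordinates, the array names, and the convention that -1 marks points not assigned to any tree are all illustrative assumptions, not part of the project data specification.

```python
import numpy as np

# Hypothetical toy point cloud: one row per point, columns are x, y, z in metres.
points = np.array([
    [0.0, 0.0, 1.0],
    [0.2, 0.1, 5.0],
    [10.0, 4.0, 2.0],
    [10.1, 4.2, 7.5],
    [5.0, 5.0, 0.1],
])

# One instance label per point. Assumed convention: -1 means "no tree".
instance_ids = np.array([0, 0, 1, 1, -1])


def split_instances(points, instance_ids):
    """Group points by instance label, skipping unassigned points (label < 0)."""
    return {
        int(i): points[instance_ids == i]
        for i in np.unique(instance_ids)
        if i >= 0
    }


trees = split_instances(points, instance_ids)
for tree_id, tree_points in trees.items():
    print(tree_id, tree_points.shape)
```

Each entry of `trees` is then the point cloud of a single tree individual, which is the unit that a species classifier would later operate on.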

The approach will be applied to the point cloud instance segmentation task as follows:

  1. Initial Model Training:

    • Train an initial segmentation model on a small labeled dataset derived via an unsupervised method.
    • Use this model to make predictions on the unlabeled data.
  2. Uncertainty Estimation:

    • Employ uncertainty estimation techniques to identify points where the model is uncertain or likely to make errors. This uncertainty can arise from ambiguous shapes, occlusions, or other challenging scenarios.
  3. Query Strategy:

    • Develop a query strategy to select points that contribute the most to reducing model uncertainty. Common strategies include choosing points with high prediction entropy or low-confidence predictions.
  4. Annotation and Model Update:

    • The selected points will be relabeled by hand.
    • Update the model with the newly annotated data and repeat the training process.
  5. Iteration:

    • Repeat the process iteratively, selecting new points for annotation based on the updated model or classification results.
    • Over time, the model should become more accurate with fewer labeled examples.
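
Steps 2 and 3 above can be sketched as follows. This is a minimal illustration, not the project's method: it assumes the model outputs a per-point probability vector over some set of classes, uses Shannon entropy as the uncertainty measure, and selects the highest-entropy points up to a fixed annotation budget. The function names `prediction_entropy` and `select_queries` and the `budget` parameter are hypothetical.

```python
import numpy as np


def prediction_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of an (N, C) probability array."""
    p = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


def select_queries(probs, budget):
    """Return the indices of the `budget` highest-entropy points, most uncertain first."""
    entropy = prediction_entropy(probs)
    order = np.argsort(-entropy, kind="stable")
    return order[:budget]


# Hypothetical model output for four points and three classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.60, 0.30, 0.10],
    [0.50, 0.50, 0.00],
])

queried = select_queries(probs, budget=2)
print(queried)  # indices of the points to send for hand labeling
```

In the iterative loop, the queried points would be relabeled by hand, added to the training set, and the model retrained before the next round of selection. Low-confidence selection, the other strategy mentioned above, would replace the entropy with one minus the maximum class probability.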

Available Data

  • 10 cm GSD airborne orthophoto / LiDAR point cloud (~20 pts/m²)
  • ~700 in-situ localized tree species observations
  • 40,000 pre-classified tree-instance point clouds

Deliverables

  • Report summarizing findings of the investigation
  • Code implementation published to lab GitLab account
  • Output data prepared in a format readable by standard GIS software

Requirements

  • Experience with Python, deep learning concepts, common deep learning frameworks (PyTorch, TensorFlow, etc.), and computer vision.




Interested candidates are kindly asked to email us their CV/GitHub profile and a short motivation statement.

Jesse Lahaye, Laurent Jospin